Improper use of Generics could breach type safety

I was going through Java discussion forums and came across an interesting discussion about Generics. It was about a loophole in the Java Generics feature that allows a method to throw a checked exception without declaring it in its throws clause.

This is definitely freaky, and I wouldn’t encourage you to take advantage of it in your development. Initially I thought this could be a loophole in the compiler implementation, but I later realized that the behavior is inherent in the Generics design.

public class RethrowCheckedException {
    public static void rethrow(Throwable t) {//No throws clause
        RethrowCheckedException.<RuntimeException>throwing(t);
    }

    public static <T extends Throwable> void throwing(Throwable t) throws T {
        throw (T)t;
    }
    public static void main(String[] args) {
        try {
            throw new Exception("dummy");
        } catch (Exception e) {
            rethrow(e);
        }
    }
}

Using the rethrow() method, we can make a java.lang.Exception escape from main() without declaring it in the throws clause of main(). The type <RuntimeException> specified while invoking the throwing() method is known as an “explicit type argument” (the counterpart of the type inference discussed below).

Type Inference

To understand how the above works, we need to get an idea of type inference and explicit type arguments. Take a look at the following example:

public class ArrayStore {
    public static <T> void store(T[] array, T item) {
        if (array.length > 0)
            array[0] = item;
    }
    public static void main(String[] args) {
        store(new String[1], "abc"); //ok
        store(new String[1], new Integer(10)); //??
    }
}

Surprisingly, both store() invocations compile. The compiler infers T to be String for the first invocation, as both arguments are Strings; this process is known as “type inference”. We would expect the second one to fail, as T cannot be both a String and an Integer. But the compiler instead infers T to be a common supertype of the two. As String and Integer both implement Serializable and Comparable, the compiler deduces T to be a type that is Serializable and Comparable and allows compilation. The result is an ArrayStoreException at runtime, defying the whole point of Generics and type safety.

How do we resolve this problem? We can use the “explicit type argument” in such cases.

ArrayStore.<String>store(new String[1], new Integer(10));//compilation error

Explicitly specifying the type argument as String forces the compiler to evaluate T to String (overriding the compiler’s type inference mechanism), and hence compilation fails.
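To watch the inferred call fail at runtime, here is a minimal sketch. The class name ArrayStoreDemo and the helper storeMismatched() are made up for this illustration, and Integer.valueOf() stands in for the older new Integer() constructor.

```java
// Illustrative demo; ArrayStoreDemo and storeMismatched() are hypothetical names.
final class ArrayStoreDemo {

    static <T> void store(T[] array, T item) {
        if (array.length > 0) {
            array[0] = item;
        }
    }

    // Reports what happens when an Integer is stored through the inferred call.
    static String storeMismatched() {
        try {
            // Compiles: T is inferred as a common supertype of String and Integer.
            store(new String[1], Integer.valueOf(10));
            return "stored";
        } catch (ArrayStoreException e) {
            // The array is really a String[], so the JVM rejects the Integer.
            return "ArrayStoreException";
        }
    }

    public static void main(String[] args) {
        System.out.println(storeMismatched());
        // With an explicit type argument the mistake is caught by the compiler instead:
        // ArrayStoreDemo.<String>store(new String[1], Integer.valueOf(10)); // does not compile
    }
}
```

The commented-out line is the explicit type argument form; uncommenting it should turn the runtime failure into a compile-time error.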

Now that we have the hang of type inference, let’s get back to the loophole.

public static <T extends Throwable> void throwing(Throwable t) throws T {
    throw (T)t;
}

The throwing() method defines the type parameter <T extends Throwable>, which is declared in its throws clause and is in no way related to the argument passed. The rethrow() method explicitly specifies the type argument as RuntimeException, making the signature of the throwing() method appear as shown below at compile time.

public static void throwing(Throwable t) throws RuntimeException {
    throw (RuntimeException)t;
}

The compiler evaluates the method signature by replacing the type variable T with its inferred or explicitly specified type, which is RuntimeException in our case. As a result, the compiler doesn’t force rethrow() to handle the exception or declare it. The cast to T is unchecked, so nothing verifies it at runtime either. After compilation and type erasure, all the generic type information is erased, as shown below.

public static void throwing(Throwable t) throws Throwable {
    throw (Throwable)t;
}
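To see the effect at runtime, here is a small sketch built on the same two methods. The class name SneakyThrowDemo and the helper escapedType() are invented for this illustration, and IOException is just an arbitrary checked exception.

```java
import java.io.IOException;

// Illustrative demo; SneakyThrowDemo and escapedType() are hypothetical names.
final class SneakyThrowDemo {

    // T is chosen by the caller and is unrelated to the argument t.
    @SuppressWarnings("unchecked")
    static <T extends Throwable> void throwing(Throwable t) throws T {
        throw (T) t; // unchecked cast: erased to (Throwable) t, so it never fails
    }

    // No throws clause, yet a checked exception can leave this method.
    static void rethrow(Throwable t) {
        SneakyThrowDemo.<RuntimeException>throwing(t);
    }

    // Returns the class name of whatever actually escaped from rethrow().
    static String escapedType() {
        try {
            rethrow(new IOException("dummy"));
            return "nothing thrown";
        } catch (Throwable caught) {
            return caught.getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(escapedType());
    }
}
```

The catch block uses Throwable on purpose: the compiler believes rethrow() cannot throw an IOException, so a catch (IOException e) around that call would itself be rejected as unreachable.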

Some thoughts
Type inference or an explicit type argument can do some damage when the generic type information is not captured properly. In the loophole case, since the exception argument passed to the throwing() method is not related to the generic type (T), the compiler was blindfolded. If we modify the method signature as shown below, the method invocation fails to compile.

public static <T extends Throwable> void throwing(T t) throws T {
    throw (T)t;
}

Similarly, in the ArrayStore example we cannot expect the programmer to explicitly specify the type at every call to the store method. A better way of capturing the generic type information in this case is to establish a relationship between the type variables, so that the expected compile-time checks are enforced.

public static <T, S extends T> void store(T[] array, S item) { ... }

And interestingly, the following generic type declarations work as expected unlike the ArrayStore example.

public static  setMap(Map map) { ... }
public static <T> void addToList(List<T> list, T element) { ...}
addToList(new ArrayList<String>(), new Integer(10)); //fails to compile
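A parameterized type such as List<T> pins T down exactly, which is why this kind of declaration behaves better than the array version. The sketch below assumes addToList is declared as <T> void addToList(List<T> list, T element); the class name AddToListDemo and the helper numbers() are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative demo; AddToListDemo and numbers() are hypothetical names.
final class AddToListDemo {

    static <T> void addToList(List<T> list, T element) {
        list.add(element);
    }

    static List<Number> numbers() {
        List<Number> nums = new ArrayList<Number>();
        addToList(nums, Integer.valueOf(10)); // T = Number, and an Integer is a Number
        addToList(nums, Double.valueOf(2.5)); // T = Number again
        return nums;
    }

    public static void main(String[] args) {
        System.out.println(numbers());
        // List<String> fixes T to String, so an Integer element is rejected:
        // addToList(new ArrayList<String>(), Integer.valueOf(10)); // does not compile
    }
}
```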

The Java Language Specification (JLS) 3.0, section 15.12.2.7, states: “The process of type inference is inherently complex.”

Yes, type inference is indeed complex to understand. All this calls for a better understanding of the type inference mechanism and how it applies. You can probably get more information from the FAQ by Angelika Langer.

Understanding Generics with Collections

In Java (prior to 5.0), you are often compelled to downcast an object to a more specific type. For example, when you add a String to a List and later want to retrieve it, you need to downcast.

List myList = new ArrayList();
myList.add("abc");
String str = (String)myList.get(0);

The downcast is inevitable. Moreover, objects of any type can be added to the list, and the developer is responsible for remembering the type of each object and performing the appropriate downcast when retrieving it. This opens the door to type safety problems in Java, as every downcast in the code is a potential source of the wicked ClassCastException. Generics were introduced to rescue us from these situations: they let you mark a collection as containing elements of a particular type only, say a List of Strings. The syntax for a List of Strings looks like this. (Note: ‘type’ refers to any class or interface definition in Java, e.g. String, Integer, Collection, MyClass etc. are all types.)

List<String> myList = new ArrayList<String>();
myList.add("abc");
String str = myList.get(0);

The syntax is fairly simple: you specify the element ‘type’ in angle brackets following the Collection type. Such types are known as Generic types or Parameterized types. We’ll get to know more about defining collection types and substitution in the following sections.

With the above syntax, adding objects of any type other than String to the List, or retrieving its elements as any other type, is not allowed, and doing so results in a compile-time error. This is much better than a ClassCastException at runtime and would definitely save a lot of your development time, wouldn’t it? More importantly, the downcast becomes unnecessary, as the type of the elements within the collection is explicitly communicated to the compiler through the Generic syntax.
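The contrast between the two styles can be sketched side by side. The class and method names below (RawVersusGenericDemo, readFromRawList, readFromGenericList) are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative demo; all names here are hypothetical.
final class RawVersusGenericDemo {

    // Pre-Java-5 style: the mistake only shows up when the element is read back.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String readFromRawList() {
        List raw = new ArrayList();
        raw.add(Integer.valueOf(42)); // nothing stops a non-String from going in
        try {
            return (String) raw.get(0); // the downcast fails at runtime
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    // Generic style: the element type is checked by the compiler.
    static String readFromGenericList() {
        List<String> strings = new ArrayList<String>();
        strings.add("abc");
        // strings.add(Integer.valueOf(42)); // does not compile
        return strings.get(0); // no cast needed
    }

    public static void main(String[] args) {
        System.out.println(readFromRawList());
        System.out.println(readFromGenericList());
    }
}
```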

Purpose of Generics
Generics make your program well formed: they enable the compiler to perform type checks based on the static type information provided, which avoids unexpected type errors that could otherwise occur at runtime. Let us get into more practical matters.

Collections and Substitution rules
Collections are the primary motivation for Generics in Java.

Let us take a look at some substitution rules. The following are some legal assignments:

List<Integer> li = new ArrayList<Integer>();
Collection<Integer> ci = new LinkedList<Integer>();
Collection<String> cs = new HashSet<String>();
List lst = new ArrayList<String>();
List<Integer> li2 = new ArrayList();//warning

Here is the substitution principle for collections: (Rule 1) the RHS of the assignment should contain a Collection implementation compatible with that of the LHS and parameterized with the same type as that of the LHS. The last two assignments are also valid; they are allowed in order to keep non-generic (prior to Java 5) code compatible with the new generic approach and vice versa. But if you compile your code with the -Xlint:unchecked option, the last assignment results in an unchecked conversion warning. (Rule 2) Do not ignore such compilation warnings, as they indicate that your code is unsafe (it could break at runtime with a ClassCastException).
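Here is a minimal sketch of why Rule 2 matters: an unchecked conversion lets a wrongly typed element slip into a parameterized list, and the failure surfaces later when the element is read. The names UncheckedWarningDemo and readPollutedList() are made up, and the @SuppressWarnings annotation is there only to keep the demo quiet; it is exactly what you should not do in real code.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative demo; UncheckedWarningDemo and readPollutedList() are hypothetical names.
final class UncheckedWarningDemo {

    @SuppressWarnings({"rawtypes", "unchecked"})
    static String readPollutedList() {
        List raw = new ArrayList();
        raw.add("not a number");   // a String goes into the raw list
        List<Integer> li = raw;    // unchecked conversion: a warning, not an error
        try {
            Integer first = li.get(0); // the compiler-inserted cast fails here
            return String.valueOf(first);
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(readPollutedList());
    }
}
```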

List<String> ls = new ArrayList<String>();//1
ls.add("abc");//2
ls.add("xyz");//3
Iterator its = ls.listIterator();//4
while(its.hasNext()) {//5
    String s = its.next();//6
    System.out.println("Element: "+s);
}

Does this compile? No. The Iterator is declared as a raw type, so its next() method returns Object, and the assignment to the String ‘s’ (line 6) fails with a compilation error. Basically, in line 4 we lost the type information while obtaining the iterator, so we would need an explicit cast here. To avoid the cast, parameterize the Iterator with the String type.

Iterator<String> its = ls.listIterator();//4

(Rule 3) When you get an iterator, keySet, entrySet or values from a collection, assign it to an appropriately parameterized type as shown above. These methods have been modified to return their corresponding generic types so that your code can do without casts. Most Java 5 aware IDEs can do this job for you automatically; rely on them.
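As a sketch of Rule 3 applied to a Map, the views below are all assigned to parameterized types, so no casts are needed. The class name MapViewsDemo, its methods and the sample data are invented for this illustration.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative demo; all names and sample values here are hypothetical.
final class MapViewsDemo {

    static Map<String, Integer> sampleStock() {
        Map<String, Integer> stock = new LinkedHashMap<String, Integer>();
        stock.put("apple", 3);
        stock.put("pear", 5);
        return stock;
    }

    // Walks entrySet() through a parameterized Iterator.
    static String describe(Map<String, Integer> stock) {
        StringBuilder sb = new StringBuilder();
        Iterator<Map.Entry<String, Integer>> it = stock.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next(); // no cast
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(entry.getKey()).append('=').append(entry.getValue());
        }
        return sb.toString();
    }

    // values() is a Collection<Integer>, so the loop variable is typed.
    static int total(Map<String, Integer> stock) {
        int sum = 0;
        for (Integer count : stock.values()) {
            sum += count;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(describe(sampleStock()));
        System.out.println(total(sampleStock()));
    }
}
```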

The following assignments are invalid:

Set<String> ss = new HashSet<Integer>();//Incompatible Types
List<Object> lo = new ArrayList<String>();//compile-time error

Though String is a subtype of Object, the second assignment is not allowed. A Collection of Objects is a bigger set comprising elements of various types (Strings, Integers, Cats, Dogs etc.), but a Collection of Strings strictly contains Strings, and the two cannot be equated (Rule 4). In programmatic terms, if this were allowed, we could end up adding objects of any type to a List of Strings, defying the purpose of generics, and hence it is not allowed.

Well, with the above restriction, how would you implement a method that accepts a collection of any type, iterates over it and prints the elements? For such purposes, wildcards were introduced for generic types to represent collections of an unknown element type.

Wildcards
We know that Object[] is the supertype of all arrays of reference types; similarly, Collection<?>, pronounced “Collection of unknown”, is the supertype of all generic collections. (Note: a Collection<?> reference can point to a List<?>, ArrayList<?>, HashSet<?> etc. The wildcard can only appear in a reference type and not in an instantiation, i.e. new ArrayList<?>() or new HashSet<?>() is not allowed.) (Rule 5) Collections parameterized with wildcards cannot be instantiated.

Using Collection<?> we can implement the iterate and print method as shown below.

public void printElements(Collection<?> c) {
    for(Object o : c) {
        System.out.println(o);
    }
}
...
List<String> ls = new ArrayList<String>();
List<Cat> lc = new ArrayList<Cat>();
...
printElements(ls);
printElements(lc);

Is Collection<?> the same as a plain old Collection? No, there are a lot of differences between the plain old Collection, Collection<?> and Collection<Object>.

The following are the differences between them:

  • Collection<?> is a homogeneous collection that represents a family of generic instantiations of Collection (i.e. Collection<String>, Collection<Integer> etc.)
  • Collection<Object> is a heterogeneous collection or a mixed bag that contains elements of all types, close to the plain old Collection but not the same
  • Collection<?> ensures that you don’t add arbitrary objects, as we do not know the element type of the collection (Rule 6)
  • Collection<?> cannot be treated as a read-only collection, as it allows remove() and clear() operations
  • You can assign a Collection<String> or Collection<Number> to a Collection<?> reference type, but not to a Collection<Object> (Refer to Rule 4)

List<String> list = new ArrayList<String>();
list.add("Tiger");
list.add("Mustang");
Collection<?> c = list;
Object first = c.iterator().next(); //returns "Tiger" as an Object
c.contains("Tiger"); //returns true
Iterator<?> itr = c.iterator();
while(itr.hasNext()) {
    Object o = itr.next();
    System.out.println(o);
}
c.remove("Mustang"); //removes "Mustang" from the List
c.add("Dolphin"); //compile-time error (as per Rule 6)

Collection<?> appears very restrictive, as you do not know the element type. When you obtain elements from such a collection you have to work with Objects and would sometimes end up with an explicit cast. So use Collection<?> only when you need no type-specific operations (Rule 7). In practice there are very few such use cases; more frequently you need to perform operations on a base interface without caring about the implementation type. In such cases you can benefit from ‘bounded wildcards’.

Bounded wildcards
List<? extends Number> is an example of a bounded wildcard. It represents a homogeneous List whose elements are of some unknown type that is Number or a subtype of Number.

public void addInteger(List<? extends Number> lnum) {
    Number num = lnum.get(0);
    byte b = num.byteValue();
    lnum.add(new Integer(10));//not allowed, compile-time error
}

So, we can obtain elements from the collection, treating each of them as a Number. But we are not allowed to add anything to the collection, as we do not know which subtype of Number the collection contains.
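A typical use of the upper bound is a method that only reads from the list. The sketch below is one such reader; the class name UpperBoundDemo and the method sum() are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative demo; UpperBoundDemo and sum() are hypothetical names.
final class UpperBoundDemo {

    // Accepts List<Integer>, List<Double>, List<Number> and so on.
    static double sum(List<? extends Number> numbers) {
        double total = 0;
        for (Number n : numbers) {   // every element is at least a Number
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Double> doubles = Arrays.asList(1.5, 2.5);
        System.out.println(sum(ints));
        System.out.println(sum(doubles));
    }
}
```

Had the parameter been declared as List<Number>, neither call in main() would compile (Rule 4).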

Differences between List<Number> and List<? extends Number>:

  • List<Number> is a heterogeneous collection of Number objects (i.e. it can contain instances of Integer, Float, Long, etc.)
  • List<? extends Number> represents a homogeneous collection of Number or one of its subtypes. It can refer to any of List<Number>, List<Integer>, List<Float> etc.

“? extends Type” specifies an upper bound. We also have “? super Type”, which specifies a lower bound: the unknown type is the specified Type or one of its supertypes. This is rarely useful with collections but could come in handy when we define our own generic types.
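To make the lower bound concrete, here is a small sketch of a method that only writes into a list. The class name LowerBoundDemo and the methods addIntegers() and numbers() are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative demo; all names here are hypothetical.
final class LowerBoundDemo {

    // Accepts List<Integer>, List<Number> or List<Object>.
    static void addIntegers(List<? super Integer> sink, int count) {
        for (int i = 1; i <= count; i++) {
            sink.add(Integer.valueOf(i)); // adding an Integer is always safe
        }
        // Reading is only possible as Object, since the element type is unknown:
        // Integer first = sink.get(0); // does not compile
    }

    static List<Number> numbers() {
        List<Number> nums = new ArrayList<Number>();
        nums.add(Double.valueOf(0.5));
        addIntegers(nums, 3); // appends 1, 2 and 3
        return nums;
    }

    public static void main(String[] args) {
        System.out.println(numbers());
    }
}
```

This is the mirror image of the upper bound: with “? extends” you can read but not add, and with “? super” you can add but only read back Objects.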

We’ll see more about defining generic types, generic methods and type erasure semantics in my next posts.