The Haskell type system is complicated and much of the complication
seems to result from supporting type inference.
Already we have a variety of situations where explicit type
signatures are required. For example:
1. Because of the monomorphism restriction, a type signature is
sometimes required. (See page 40 of the Haskell Report, Version 1.2.)
2. To resolve ambiguous overloading, a type signature is sometimes
required. (See page 34 of the Haskell Report.)
3. A type signature may be used to give a variable a less general type
than would be inferred.
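As a concrete sketch of cases 1 and 2 (the definitions below are mine,
not from the Report):

```haskell
-- Case 1: without the signature, the monomorphism restriction
-- applies to this simple pattern binding, so its type would be
-- monomorphized (and defaulted) rather than kept overloaded:
n :: Num a => a
n = 5 + 5

-- Case 2: in 'show (read s)' nothing determines the intermediate
-- type, so the overloading is ambiguous until a signature fixes it:
roundTrip :: String -> String
roundTrip s = show (read s :: Int)
```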
Recently, Simon Peyton Jones has proposed that polymorphic
recursion be allowed in Haskell 1.3, with an explicit type signature.
An example Simon gives is:
> data Foo a = A | B (Foo [a])
As written this is not permitted in Haskell 1.2, and in any case it
can't be implemented without an explicit type signature. However,
Lennart has shown that this type can be defined in Haskell 1.2 in a
roundabout way using mutually recursive modules, where one needs a hand
written interface file for one of the modules. It appears that the
signature in the hand written interface file is the explicit signature
that is required.
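For concreteness, here is a sketch of a function over this type (the
name "size" is mine). The recursive call is at type Foo [a], not
Foo a, so ordinary inference, which gives every recursive call the
same monomorphic type as the function being defined, cannot type it;
only the explicit signature makes it go through:

```haskell
data Foo a = A | B (Foo [a])

-- 'size' calls itself at type Foo [a], not Foo a.  This is
-- polymorphic recursion: it typechecks only because the explicit
-- signature tells the checker the fully general type up front.
size :: Foo a -> Int
size A     = 0
size (B f) = 1 + size f
```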
Other generalizations of the type system have been discussed.
While some of these do not require explicit type signatures, other
problems relating to type inference arise. John Hughes commented on one
of these proposals as follows:
> The biggest worry for me in using graph unification for type
> checking is the potential effect on error messages from the
> type checker. While dropping the restriction makes some
> sensible programs well-typed, it also makes some senseless
> programs well-typed, thus delaying the moment at which type
> errors are discovered.
>
> For example, consider
>
> iterate 0 f x = x
> iterate (n+1) f x = f (iterate n x)
>
>
> At present this produces an occurrence type error. With
> recursive type synonyms, it would be well-typed with the
> type
>
> iterate :: Num a => a -> b -> b
> where b = (b->b) -> b
>
> The error can only be discovered at a call of iterate.
>
>
> We already have a problem understanding type errors.
> Consider a very simple example:
> \f -> f 1 + f 1 2
> A Pascal compiler, if Pascal had lambda-expressions, could
> report a missing parameter. An ML compiler objects that
> Int->*a and Int cannot be unified. Haskell is worse: hbc
> says there is no instance Num (a->b). Not easy for the naive
> user to diagnose.
>
>
> I think it would be wise to implement graph unification and
> get some experience with it first, before adding it to
> Haskell. Personally, I get `occurs check' messages from the
> type checker fairly often, and they are almost always caused
> by mistakes I am happy to find. I am not at all convinced
> that, on balance, graph unification would make Haskell more
> useable.
These problems would be eliminated if explicit type signatures were
provided.
Type classes are, in my opinion, a very important feature in
Haskell, and a step towards providing some of the advantages of object
oriented programming (whatever that is) in a functional language.
However, with type classes, the value of an expression may depend on the
types of variables used in the expression. This leads to some subtle
problems. In particular, equational reasoning is no longer valid!
Consider the following example (which is a variation of an example
I have given in an earlier message):
> module Main (main) where
> main = prints result "\n"
> w = const [] y
> x = const [] z
> result = (y, z)
> y = 'a' : w
> z = True : x
This is a valid program that gives the expected result. By
equational reasoning, we can change
> w = const [] y
to
> w = []
and/or
> x = const [] z
to
> x = []
The program remains valid after either or both of these changes are
made. Neither of these changes alters the value or even the type of
anything. Having carefully checked this out, we will now leave our
program with one of these changes in place but not the other. Since the
change made no difference, even to the type of w or x (both of which
remain [a]), we will simply forget about this issue and go on to other
things.
Now suppose we change
> result = (y, z)
to
> result = (w ++ x)
If we made the first change above (w = []), the program will now print
"[]". If we made the second change (x = []) then the program will print
"" (i.e. nothing). Figure out why! Of course, with either working
version of the program, changing the definitions of y and z can also
change the value of result. Now quickly figure out what changing the
definitions of w and x to
> w = const [] z
> x = []
will produce.
This example involves dependency analysis and the monomorphism
restriction, but is similar to an earlier example of mine that involved
neither. The problem is that the type of a variable, and hence the
value produced by an overloaded operator, is determined by something
that may be textually distant and appears not to relate to the answer.
Another variation of the above example is:
> module Main (main) where
> main = print (show x == show y)
> x = 5
> y = 5
> z = x + 2.3 -- used for side effect only
This program prints "False". The default mechanism is a factor here:
the definition of z forces x to have type Double, so show x yields
"5.0", while y defaults to Int and show y yields "5".
I contend that type inference is no longer making things simpler on
balance. So what is the solution?
There are several possible solutions. The simplest is to make a
practice of writing type signatures for most named objects. Many people
already do this with functions and other nontrivial objects, since it
helps in locating errors and is good documentation. Hence, the status
quo isn't really too bad!
Some minor changes to Haskell could make this easier. These might
include:
(1) Make it possible to provide a type signature for any named object.
Pages 39 and 40 of the Haskell Report give the example,
> f x = let g y z = ([x,y], z) in
where it is not possible to give a type signature for g. The signature
> g :: a -> b -> ([a],b)
is too general, because x and y must have the same type. Perhaps this
problem could be solved using a TypeOf operator, permitting
> g :: TypeOf x -> b -> ([TypeOf x], b)
(2) Allow types to be specified for lambda bound variables. For
example,
> (\ n :: Int -> n + 1)
(3) Change the default mechanism so that the form of a literal
determines its type without regard to its use, with no overloading of
literals.
For example, with the default default, which is
> default (Int, Double)
in the example
> x = 5
> y = 5
> z = x + 2.3 -- used for side effect only
x would have type Int and the definition of z would be illegal.
Of course the most drastic change would be to eliminate type
inference completely. This would make it possible to extend the Haskell
type system in various ways, as discussed above, and might result in a
simpler type system. The main disadvantage of this solution is that,
when diehard C programmers are forced to start using Haskell because it
outperforms C on future parallel machines, the subtleties of type
inference may continue to make life worthwhile for them. Seriously, I
don't mind if a language has features I don't like (e.g. the goto in
Pascal, too much type inference, etc.) as long as I can avoid using
these features.
Along these lines, some simple forms of type inference can't do much
harm and will reduce unnecessary clutter. For example, if the type of a
variable can be determined from its definition alone (not its use), and
the type is unique (e.g. contains no type variables) then type inference
should not cause any problems as far as I can see. This would probably
provide as much type inference as most people are likely to really want.
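As a sketch of the distinction (the definitions below are mine):

```haskell
-- Both types here are forced by the definitions alone and contain
-- no type variables, so inferring them can surprise no one:
greeting = 'H' : "ello"      -- necessarily [Char]
greetLen = length greeting   -- length yields Int, so greetLen :: Int
```

By contrast, a binding like "n = 5" gets its type from how n is used
elsewhere, or from defaulting, which is exactly where the surprises
above arise.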
To satisfy both the puritans and the hackers, Haskell could allow a
programmer to specify the degree of type inference he or she would like
to have, using a mechanism similar to the current default mechanism.
How useful do you find type inference to be in practice?
Warren Burton