The Haskell type system is complicated and much of the complication
seems to result from supporting type inference.
Already we have a variety of situations where explicit type
signatures are required. For example:
1. Because of the monomorphism restriction, a type signature is
sometimes required. (See page 40 of the Haskell Report, Version 1.2.)
2. To resolve ambiguous overloading, a type signature is sometimes
required. (See page 34 of the Haskell Report.)
3. A type signature may be used to give a variable a less general type
than would be inferred.
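As a concrete sketch of cases 1 and 2 (the definitions below are mine,
not from the Report):

```haskell
-- Case 1: without the signature, the monomorphism restriction
-- applies to this simple pattern binding, so its type would be
-- monomorphized (and defaulted) rather than kept overloaded:
n :: Num a => a
n = 5 + 5

-- Case 2: in 'show (read s)' nothing determines the intermediate
-- type, so the overloading is ambiguous until a signature fixes it:
roundTrip :: String -> String
roundTrip s = show (read s :: Int)
```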
Recently, Simon Peyton Jones has proposed that polymorphic
recursion be allowed in Haskell 1.3, with an explicit type signature.
An example Simon gives is:
> data Foo a = A | B (Foo [a])
As written this is not permitted in Haskell 1.2, and in any case it
can't be implemented without an explicit type signature. However,
Lennart has shown that this type can be defined in Haskell 1.2 in a
roundabout way using mutually recursive modules, where one needs a hand
written interface file for one of the modules. It appears that the
signature in the hand written interface file is the explicit signature
that is required.
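For concreteness, here is a sketch of a function over this type (the
name "size" is mine). The recursive call is at type Foo [a], not
Foo a, so ordinary inference, which gives every recursive call the
same monomorphic type as the function being defined, cannot type it;
only the explicit signature makes it go through:

```haskell
data Foo a = A | B (Foo [a])

-- 'size' calls itself at type Foo [a], not Foo a.  This is
-- polymorphic recursion: it typechecks only because the explicit
-- signature tells the checker the fully general type up front.
size :: Foo a -> Int
size A     = 0
size (B f) = 1 + size f
```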
Other generalizations of the type system have been discussed.
While some of these do not require explicit type signatures, other
problems relating to type inference arise. John Hughes commented on one
of these proposals as follows:
> The biggest worry for me in using graph unification for type
> checking is the potential effect on error messages from the
> type checker. While dropping the restriction makes some
> sensible programs well-typed, it also makes some senseless
> programs well-typed, thus delaying the moment at which type
> errors are discovered.
>
> For example, consider
>
> iterate 0 f x = x
> iterate (n+1) f x = f (iterate n x)
>
>
> At present this produces an occurrence type error. With
> recursive type synonyms, it would be well-typed with the
> type
>
> iterate :: Num a => a -> b -> b
> where b = (b->b) -> b
>
> The error can only be discovered at a call of iterate.
>
>
> We already have a problem understanding type errors.
> Consider a very simple example:
> \f -> f 1 + f 1 2
> A Pascal compiler, if Pascal had lambda-expressions, could
> report a missing parameter. An ML compiler objects that
> Int->*a and Int cannot be unified. Haskell is worse: hbc
> says there is no instance Num (a->b). Not easy for the naive
> user to diagnose.
>
>
> I think it would be wise to implement graph unification and
> get some experience with it first, before adding it to
> Haskell. Personally, I get `occurs check' messages from the
> type checker fairly often, and they are almost always caused
> by mistakes I am happy to find. I am not at all convinced
> that, on balance, graph unification would make Haskell more
> useable.
These problems would be eliminated if explicit type signatures were
provided.
Type classes are, in my opinion, a very important feature in
Haskell, and a step towards providing some of the advantages of object
oriented programming (whatever that is) in a functional language.
However, with type classes, the value of an expression may depend on the
types of variables used in the expression. This leads to some subtle
problems. In particular, equational reasoning is no longer valid!
Consider the following example (which is a variation of an example
I have given in an earlier message):
> module Main (main) where
> main = prints result "\n"
> w = const [] y
> x = const [] z
> result = (y, z)
> y = 'a' : w
> z = True : x
This is a valid program that gives the expected result. By
equational reasoning, we can change
> w = const [] y
to
> w = []
and/or
> x = const [] z
to
> x = []
The program remains valid after either or both of these changes are
made. Neither of these changes alters the value or even the type of
anything. Having carefully checked this out, we will now leave our
program with one of these changes in place but not the other. Since the
change made no difference, even to the type of w or x (both of which
remain [a]), we will simply forget about this issue and go on to other
things.
Now suppose we change
> result = (y, z)
to
> result = (w ++ x)
If we made the first change above (w = []), the program will now print
"[]". If we made the second change (x = []) then the program will print
"" (i.e. nothing). Figure out why! Of course, with either working
version of the program, changing the definitions of y and z can also
change the value of result. Now quickly figure out what changing the
definitions of w and x to
> w = const [] z
> x = []
will produce.
This example involves dependency analysis and the monomorphism
restriction, but is similar to an earlier example of mine that involved
neither. The problem is that the type of a variable, and hence the
value produced by an overloaded operator, is determined by something
that may be textually distant and appears not to relate to the answer.
Another variation of the above example is:
> module Main (main) where
> main = print (show x == show y)
> x = 5
> y = 5
> z = x + 2.3 -- used for side effect only
This program prints "False". The default mechanism is a factor here:
the definition of z forces x to have type Double, so show x yields
"5.0", while y defaults to Int and show y yields "5".
I contend that type inference is no longer making things simpler on
balance. So what is the solution?
There are several possible solutions. The simplest is to make a
practice of writing type signatures for most named objects. Many people
already do this with functions and other nontrivial objects, since it
helps in locating errors and is good documentation. Hence, the status
quo isn't really too bad!
Some minor changes to Haskell could make this easier. These might
include:
(1) Make it possible to provide a type signature for any named object.
Pages 39 and 40 of the Haskell Report give the example,
> f x = let g y z = ([x,y], z) in
where it is not possible to give a type signature for g. The signature
> g :: a -> b -> ([a],b)
is too general, because x and y must have the same type. Perhaps this
problem could be solved using a TypeOf operator, permitting
> g :: TypeOf x -> b -> ([TypeOf x], b)
(2) Allow types to be specified for lambda bound variables. For
example,
> (\ n :: Int -> n + 1)
(3) Change the default mechanism so that the form of a literal
determines its type without regard to its use, with no overloading of
literals.
For example, with the default default, which is
> default (Int, Double)
in the example
> x = 5
> y = 5
> z = x + 2.3 -- used for side effect only
x would have type Int and the definition of z would be illegal.
Of course the most drastic change would be to eliminate type
inference completely. This would make it possible to extend the Haskell
type system in various ways, as discussed above, and might result in a
simpler type system. The main disadvantage of this solution is that,
when diehard C programmers are forced to start using Haskell because it
outperforms C on future parallel machines, the subtleties of type
inference may continue to make life worthwhile for them. Seriously, I
don't mind if a language has features I don't like (e.g. the goto in
Pascal, too much type inference, etc.) as long as I can avoid using
these features.
Along these lines, some simple forms of type inference can't do much
harm and will reduce unnecessary clutter. For example, if the type of a
variable can be determined from its definition alone (not its use), and
the type is unique (e.g. contains no type variables) then type inference
should not cause any problems as far as I can see. This would probably
provide as much type inference as most people are likely to really want.
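As a sketch of the distinction (the definitions below are mine):

```haskell
-- Both types here are forced by the definitions alone and contain
-- no type variables, so inferring them can surprise no one:
greeting = 'H' : "ello"      -- necessarily [Char]
greetLen = length greeting   -- length yields Int, so greetLen :: Int
```

By contrast, a binding like "n = 5" gets its type from how n is used
elsewhere, or from defaulting, which is exactly where the surprises
above arise.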
To satisfy both the puritans and the hackers, Haskell could allow a
programmer to specify the degree of type inference he or she would like
to have, using a mechanism similar to the current default mechanism.
How useful do you find type inference to be in practice?
Warren Burton