Implementation of type classes

Heribert Schuetz Sat, 11 Sep 1999 23:32:58 +0200 (MEST)
Hello,

I'm trying to get a bit deeper understanding about how type classes are
(or could be) implemented.

At least in simple cases it seems possible to eliminate class and
instance declarations from Haskell programs. See the example below.

The idea is that for every class assertion in the type of a variable,
the variable gets an additional parameter that will be instantiated by a
value representing the appropriate instance declaration. These values
are tuples (let's call them "instance tuples") containing
- one component for every superclass (viz. an instance tuple) and
- one component for every class method (viz. a function with an
  additional argument expecting the instance tuple)

A class declaration is converted into a declaration of a type of
instance tuples.

A simple instance declaration is converted into the definition of such a
value.

An instance declaration with constraints (e.g., the Eq or the Ord
instance for lists below) is converted into a function taking one
instance tuple for each constraint and returning an instance tuple.

Now here are the questions I have:

- Does this work completely as intended or have I missed something, such
  as strictness problems or the like?

- Does this still work if more complex features of the type/class system
  are used?
  - default definitions for class methods
    (This will probably work without too much effort.)
  - constructor classes
    (Here I would guess the same.)
  - multi-parameter type classes
  - overlapping instances
  - existential types
  - other extensions of the class system that are, e.g., implemented in
    GHC or Hugs
  - dependent types
    (Perhaps here the functions returning instance tuples generated from
    instance declarations must accept "normal" values and not only
    instance tuples.)
  - ...

- Can the transformed program be evaluated as efficiently as the
  original one? (The original program gets an advantage if some class
  constraints are resolved at compile time. But perhaps a similar effect
  for the transformed program can be achieved with GHC's transformation
  rules.)

- Do type classes provide an advantage with respect to expressiveness of
  types? For example, can the types of an API be formulated in a more
  flexible way with classes as compared to the transformed style?

- Unless I missed something, the Haskell report lacks a description of
  the Haskell kernel except for the implicit "every language construct
  for which there is no translation given". And according to this I
  think that classes and instances belong to the kernel. With a
  translation like the one given by example below, could the need for
  classes and instances in the kernel be removed?

Most of this is probably well-known stuff and written down in papers.
Which ones? The Haskell report concentrates on the static semantics of
classes and instances. It looks like the dynamic semantics is expected
to be already understood by the reader or intuitively clear. Are the
references [6] (Jones: A system of constructor classes: overloading and
implicit higher-order polymorphism) and [11] (Wadler/Blott: How to make
ad hoc polymorphism less ad hoc) of the report available on the Web?


Thanks in advance for your hints,

Heribert.

======================================================================

Now for the example. It consists of some simplified declarations and
definitions from the prelude (marked with <) and their transformed
versions (marked with >):

----------------------------------------------------------------------
< class Eq a where
<   (==) :: a -> a -> Bool
<
< (/=) :: Eq a => a -> a -> Bool
< x /= y = not (x==y)

> data Eq_ a = Eq_                -- could be a newtype;
>   { (==-) :: a -> a -> Bool
>   }
>
> (/=-) :: Eq_ a -> a -> a -> Bool
> (/=-) eq_a x y  =  not ((==-) eq_a x y)

In order to avoid systematic name clashes with the "standard"
definitions, "-" or "_" has been appended to certain identifiers.

(BTW: Wouldn't it be nice if we could write

  x `(/=-) eq_` y  =  not (x `(==-) eq_` y)

That is, backquotes can not only be applied to varids but to arbitrary
expressions or patterns.)
----------------------------------------------------------------------
< class (Eq a) => Ord a where
<   compare :: a -> a -> Ordering
<
< (<=) :: Ord a => a -> a -> Bool
< x <= y  =  compare x y /= GT
<
< min  :: Ord a => a -> a -> a
< min x y | x <= y     =  x
<         | otherwise  =  y

> data Ord_ a = Ord_
>   { eq_Ord :: Eq_ a
>   , compare_ :: a -> a -> Ordering
>   }
>
> (<=-) :: Ord_ a -> a -> a -> Bool
> (<=-) ord_a x y  =  (/=-) eq_Ordering (compare_ ord_a x y) GT
>
> min_ :: Ord_ a -> a -> a -> a
> min_ ord_a x y | (<=-) ord_a x y  =  x
>                | otherwise        =  y

----------------------------------------------------------------------
< data Bool = False | True      deriving (Eq, Ord)

> eq_Bool :: Eq_ Bool
> eq_Bool = Eq_
>   { (==-)  =  (==)                      -- (*)
>   }
>
> ord_Bool :: Ord_ Bool
> ord_Bool = Ord_
>   { eq_Ord  =  eq_Bool
>   , compare_  =  compare                -- (*)
>   }

In the bindings marked with (*) the right-hand sides refer to standard
functions with a constrained type variable, which looks a bit like
cheating. But note that this is done only because there is no access to
the low-level primitives generated by the "deriving" clause. The
functions are used monomorphically (i.e., with the constrained type
variable instantiated), as if we directly used the underlying
unrestricted primitive.
----------------------------------------------------------------------
< data Ordering = LT | EQ | GT  deriving (Eq, Ord)

> eq_Ordering :: Eq_ Ordering
> eq_Ordering = Eq_
>   { (==-)  =  (==)                      -- (*)
>   }
>
> ord_Ordering :: Ord_ Ordering
> ord_Ordering = Ord_
>   { eq_Ord  =  eq_Ordering
>   , compare_  =  compare                -- (*)
>   }

----------------------------------------------------------------------
The prelude says:

< data  [a]  =  [] | a : [a]  deriving (Eq, Ord)

But for demonstration purposes we consider explicit instance
declarations:

< instance Eq a => Eq [a] where
<     []     == []      =  True
<     (x:xs) == (y:ys)  =  x==y && xs==ys
<     _      == _       =  False
<
< instance Ord a => Ord [a] where
<     compare []     (_:_)   =  LT
<     compare []     []      =  EQ
<     compare (_:_)  []      =  GT
<     compare (x:xs) (y:ys)  =  case compare x y of
<                               EQ -> compare xs ys
<                               o  -> o

> eq_List :: Eq_ a -> Eq_ [a]
> eq_List eq_a = Eq_
>   { (==-)  =  \v1 v2 -> case (v1, v2) of
>                           ([]  , []  ) -> True
>                           (x:xs, y:ys) -> (==-) eq_a           x  y
>                                        && (==-) (eq_List eq_a) xs ys
>                           (_   , _   ) -> False
>   }
>
> ord_List :: Ord_ a -> Ord_ [a]
> ord_List ord_a = Ord_
>   { eq_Ord  =  eq_List (eq_Ord ord_a)
>   , compare_  =  \v1 v2 -> case (v1, v2) of
>                          ([]  , _:_ ) -> LT
>                          ([]  , []  ) -> EQ
>                          (_:_ , []  ) -> GT
>                          (x:xs, y:ys) -> case compare_ ord_a x y of
>                                            EQ -> compare_ (ord_List ord_a)
>                                                           xs ys
>                                            o  -> o
>   }

(BTW: Wouldn't it be nice if we could write these definitions of (==-)
and compare_ in a way syntactically similar to the definition of (==)
and compare above, i.e., with separate equations for the 3 or 4
top-level cases and with full patterns to the left of "="?)

We might have introduced a local name for (eq_List eq_a) above:

  eq_List :: Eq_ a -> Eq_ [a]
  eq_List eq_a  =  eq_List_a
    where
    eq_List_a  =  Eq_
      { (==-)  =  \v1 v2 -> case (v1, v2) of
                              ([]  , []  ) -> True
                              (x:xs, y:ys) -> (==-) eq_a      x  y
                                           && (==-) eq_List_a xs ys
                              (_   , _   ) -> False
      }

This also works with (ord_List ord_a). Does it affect efficiency or
semantics?
----------------------------------------------------------------------

Now we can see that it works:

< test  =  min [True, False] [True]

> test  =  min_ (ord_List ord_Bool) [True, False] [True]
Implementation of type classes

Reply via email to