[Haskell] class assosiated types, via GADTs.

2005-02-14 Thread John Meacham
I believe there is a realationship between GADTs and class assosiated
types which hints at  a much simpler implementation of class assosiated
types (CATs) and possibly a lessoning of the constraints mentioned in
the paper.  Assuming you have already implemented GADTs the translation
is straightforward, which is why I think it might be relevant to ghc
developers.

An example is worth a thousand words so here it is:

Take the example from the paper:

> class MapKey k where
> data Map k v 
> lookup :: k -> Map k v -> Maybe v
> empty :: Map k v
> 
> instance MapKey Int where
> data Map k v = MapInt (IntMap v) 
> lookup k (MapInt m) = Data.IntMap.lookup k m 
> empty = MapInt Data.IntMap.empty 
> 
> instance MapKey () where
> data Map k v = MapUnit (Maybe v)
> lookup () (MapUnit x) = x
> empty = MapUnit Nothing

now here is the translation: 

> class MapKey k where
> lookup :: k -> Map k v -> Maybe v
> empty :: Map k v
> 
> data MapKey k => Map k v where
> MapUnit :: Maybe v -> Map () v
> MapInt :: IntMap v -> Map Int v
> 
> instance MapKey Int where
> lookup k (MapInt m) =  Data.IntMap.lookup k m 
> empty = MapInt Data.IntMap.empty 
> 
> instance MapKey () where
> lookup () (MapUnit x) =  x
> empty = MapUnit Nothing

The idea should be clear, each CAT data declaration is lifted to the
top level. each instance declares a new constructor for said class with
an explicit GADT type fixing the type which the instance is being
declared for. 

I have not worked out any theory behind it, but it seems to work in
manual translation testing and has a certain intuitiveness about it. One
might wonder whether we are paying a price for turning Map into a
multi-construtor type, but we are just moving an implicit branch on an
unknown function in the dictonary into an explicit case. (which is a win
in my books)  Since a 'Map Int v' cannot come about except via the
MapInt constructor, we are guarenteed that the pattern matching in the
instance declarations cannot fail if the program typechecks.

The only caveats I can think of to actually implementing this in ghc are

separate compilation, the Map GADT must be marked somehow as 'open'
since later instances might create new constructors for it. Hopefully
this won't interfere with optimization too much. 
Perhaps some internally generated rules to turn 
lookup :: Int -> Map Int v  -> Maybe v 
into 
lookupInt :: Int -> IntMap v -> Maybe v 
would suffice. (this transformation is valid because we know that if the
type of Map is Map Int v then its constructor has to be MapInt, despite
the 'open' nature of the Map datatype)
Once a rule fires, we no longer have any reference to the open GADT so
can optimize like normal.

* The front end will still have to be modified to type CATs similarly to
as described in the paper, if nothing else for sane user error messages.
(however, I believe some of the GADT machinery may be reusable)
After typechecking, the translation can be carried out even before (and
independently of) the desugaring of typeclasses.

* It might just not work with multiparameter type classes, I have not
thought about it at all, but then again CATs obviate much of the need
for multi-parameter type classes, so having them mutually exclusive (at
first) doesn't seem like too much of a restriction.

I may be completely wrong about this, or perhaps it is already well
known. or perhaps it is what the paper actually says to do and I am just
misreading it :) I thought I'd share because I have really wanted the
class assosiated types in various situations and I wasn't sure if I'd have
time (or knowledge) to explore this seriously too much further.

John



-- 
John Meacham - ârepetae.netâjohnâ 
___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell


Re: [Haskell] class assosiated types, via GADTs.

2005-02-15 Thread Keean Schupke
Perhaps i'm being dumb, but I dont see the need for either GADTs or 
class-associated-types to do this... I am pretty sure it can be done 
using fundeps,
using the techniques from the HList paper... of course I haven't coded 
it yet so there
might be some problem I haven't considered.

By the way Oleg has already shown that there is an equivalence between 
GADTs and the type-class encoding used for HLists.

   Keean.
I believe there is a realationship between GADTs and class assosiated
types which hints at  a much simpler implementation of class assosiated
types (CATs) and possibly a lessoning of the constraints mentioned in
the paper.  Assuming you have already implemented GADTs the translation
is straightforward, which is why I think it might be relevant to ghc
developers.
An example is worth a thousand words so here it is:
Take the example from the paper:
 

class MapKey k where
   data Map k v 
   lookup :: k -> Map k v -> Maybe v
   empty :: Map k v

instance MapKey Int where
   data Map k v = MapInt (IntMap v) 
   lookup k (MapInt m) = Data.IntMap.lookup k m 
   empty = MapInt Data.IntMap.empty 

instance MapKey () where
   data Map k v = MapUnit (Maybe v)
   lookup () (MapUnit x) = x
   empty = MapUnit Nothing
   

now here is the translation: 

 

class MapKey k where
   lookup :: k -> Map k v -> Maybe v
   empty :: Map k v
data MapKey k => Map k v where
   MapUnit :: Maybe v -> Map () v
   MapInt :: IntMap v -> Map Int v
instance MapKey Int where
   lookup k (MapInt m) =  Data.IntMap.lookup k m 
   empty = MapInt Data.IntMap.empty 

instance MapKey () where
   lookup () (MapUnit x) =  x
   empty = MapUnit Nothing
   

   
The idea should be clear, each CAT data declaration is lifted to the
top level. each instance declares a new constructor for said class with
an explicit GADT type fixing the type which the instance is being
declared for. 

I have not worked out any theory behind it, but it seems to work in
manual translation testing and has a certain intuitiveness about it. One
might wonder whether we are paying a price for turning Map into a
multi-construtor type, but we are just moving an implicit branch on an
unknown function in the dictonary into an explicit case. (which is a win
in my books)  Since a 'Map Int v' cannot come about except via the
MapInt constructor, we are guarenteed that the pattern matching in the
instance declarations cannot fail if the program typechecks.
The only caveats I can think of to actually implementing this in ghc are
separate compilation, the Map GADT must be marked somehow as 'open'
since later instances might create new constructors for it. Hopefully
this won't interfere with optimization too much. 
Perhaps some internally generated rules to turn 
lookup :: Int -> Map Int v  -> Maybe v 
into 
lookupInt :: Int -> IntMap v -> Maybe v 
would suffice. (this transformation is valid because we know that if the
type of Map is Map Int v then its constructor has to be MapInt, despite
the 'open' nature of the Map datatype)
Once a rule fires, we no longer have any reference to the open GADT so
can optimize like normal.

* The front end will still have to be modified to type CATs similarly to
as described in the paper, if nothing else for sane user error messages.
(however, I believe some of the GADT machinery may be reusable)
After typechecking, the translation can be carried out even before (and
independently of) the desugaring of typeclasses.
* It might just not work with multiparameter type classes, I have not
thought about it at all, but then again CATs obviate much of the need
for multi-parameter type classes, so having them mutually exclusive (at
first) doesn't seem like too much of a restriction.
I may be completely wrong about this, or perhaps it is already well
known. or perhaps it is what the paper actually says to do and I am just
misreading it :) I thought I'd share because I have really wanted the
class assosiated types in various situations and I wasn't sure if I'd have
time (or knowledge) to explore this seriously too much further.
   John

 

___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell


Re: [Haskell] class assosiated types, via GADTs.

2005-02-15 Thread John Meacham
On Tue, Feb 15, 2005 at 10:16:49AM +, Keean Schupke wrote:
> Perhaps i'm being dumb, but I dont see the need for either GADTs or 
> class-associated-types to do this... I am pretty sure it can be done 
> using fundeps,
> using the techniques from the HList paper... of course I haven't coded 
> it yet so there
> might be some problem I haven't considered.

Yeah, in the CAT paper, the relationship between functional dependencies
and CATs is explored. (see section 5). An encoding of just this example but 
using
functional dependencies is given and some of their properties are
explored. 

 http://research.microsoft.com/Users/simonpj/papers/assoc-types/index.htm

However CATs are a much clearer expression of ones intent (IMHO) and
have other nice properties in that useful functions can be typed via
CATs but not via fundeps. The case for them is pretty compelling.  


> By the way Oleg has already shown that there is an equivalence between 
> GADTs and the type-class encoding used for HLists.

It would not surprise me. However, just because there is an
equivalance, it doesn't mean it is useful for what one wants. I
encoded (a certain) CAT before via a GADT, but this encoding is not
directly useful to users because it would require all instances to be
known a priori, thus breaking one of the main properties of type
classes, their extensibility. However, the compiler could pull off this
translation internally (I hope) thus levaraging the work put into GADTs.

The main advantage of this translation over the one in the paper is that
it is not intertwined with the dictionary generation and typeclass
desugaring code, which is pretty hairy to begin with. Rather it is an
orthogonal transformation so hopefully will be easier to implement
without touching too much of ghcs internals.

John


-- 
John Meacham - ârepetae.netâjohnâ 
___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell


Re: [Haskell] class assosiated types, via GADTs.

2005-02-15 Thread Keean Schupke
John Meacham wrote:
The main advantage of this translation over the one in the paper is that
it is not intertwined with the dictionary generation and typeclass
desugaring code, which is pretty hairy to begin with. Rather it is an
orthogonal transformation so hopefully will be easier to implement
without touching too much of ghcs internals.
   John
 

Yes, however my worry is that we are heading towards the proliferation 
of not very well thought out overlaping extensions (in the sense that 
how the extensions fit with other proposed and existing extensions does 
not seem well thought out). If the functionality duplicates something 
already achievable using functional dependancies, then what is the point 
of yet another extension to the language. I would rather have a small 
set of powerful tools, than an expansive set of single purpose gadgets.

In my opinion a lot of these things could be implemented as syntactic 
sugar ontop of fundeps (making it easier to read). Template-Haskell 
could be used to generate the required classes from a simplified syntax 
without altering the compiler at all.

   Keean.
___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell


Re: [Haskell] class assosiated types, via GADTs or FDs

2005-02-15 Thread Manuel M T Chakravarty
On Tue, 2005-02-15 at 10:16 +, Keean Schupke wrote:
> Perhaps i'm being dumb, but I dont see the need for either GADTs or 
> class-associated-types to do this... I am pretty sure it can be done 
> using fundeps,

Yes, you can encode this example with functional dependencies (see
Section 3.4 of [1]), but functional dependencies have three drawbacks,
which are discussed in the associated types paper [2].  In case you
don't want to read the paper, the problems are the following.  Your
class declaration with functional dependencies becomes

  class MapKey k fm | k -> fm where
lookup :: k -> fm v -> Maybe v
empty :: fm v
  -- NB: The signature of empty doesn't mention k, which
  -- is already problematic, but you can fix this by
  -- adding it as a dummy argument to fm.

Drawback #1:

If you want to add an instance for keys of pair type, you need an
instance of the form

  data MapKeyPair fm1 fm2 v = ...
  class (MapKey k1 fm1, MapKey k2 fm2) => 
MapKey (k1, k2) (MapKeyPair fm1 fm2) where
...

If you look into Section 6.1 of the functional dependencies paper [3],
you'll see that this definition is actually not admitted.  GHC allows it
nonetheless.  As a result, the type checker will on some programs, that
it ought to reject, simply not terminate - that's been pointed out by
[4].

Drawback #2:

There's been an interesting study about the support for a certain form
of generic programming in six different programming languages (four OO
languages and two FP languages) [5].  In that study the not quite
satisfactory support of associated types in Haskell via functional
dependencies is cited as the *only* shortcoming of Haskell in the
summary of Table 1 (Haskell gets the best results of all six languages
in that table, btw).  The main complaint cited by the authors is that
the representation type (fm in the example above) needs to be included
in the argument list of the class.  For larger examples, with more
associated types, such as the graph library studied in [5], this is
unwieldy.

Drawback #3:

The functional dependency encoding prevents you from ever making the
representation of your tries/maps (in the MapKey class example)
abstract.  It does not only prevent you from declaring it as an abstract
data type, but it actually forces users of a library built in this way
to understand the representation types and manually encode them in their
application code.  For example, assume we have an instance as follows

  instance MapKey Int (Data.IntMap v) where
lookup k m = Data.IntMap.lookup k m 
empty = Data.IntMap.empty

The full type of the class method lookup is

lookup :: MapKey k fm => k -> fm v -> Maybe v

Now suppose a user wants to define a function

  lookupInt :: MapKey Int fm => k -> fm v -> Maybe v
  lookupInt = lookup

then they will find that the compiler rejects this definition.  (The
reason lies in when the type checker uses Mark Jones' improvement rule
[3].)

In fact, the only valid definition is

  lookupInt :: k -> Data.IntMap v -> Maybe v
  lookupInt = lookup

That's quite unsatisfactory, as a user of a such a tries library will
have to know how maps of Int keys are represented.  Moreover, if the
tries library ever changes that representation, all user code will have
to be changed.  This problem is aggravated by that representation types
for tries get quite complicated once you use more involved types as
keys.

This third drawback makes functional dependencies IMHO rather unsuitable
for encoding associated types in anything, but toy code.

Manuel

[1] Ralf Hinze, Johan Jeuring, and Andres Löh. Type-indexed data types. 
Science of Computer Programming, MPC Special Issue, 51:117-151, 
2004.
http://www.informatik.uni-bonn.de/~ralf/publications/SCP2004.pdf

[2] Manuel M. T. Chakravarty, Gabriele Keller, Simon Peyton Jones, and 
Simon Marlow. Associated Types with Class.  POPL'05.
http://www.cse.unsw.edu.au/~chak/papers/papers.html#assoc

[3] Mark P. Jones. Type Classes with Functional Dependencies.
Proceedings of the 9th European Symposium on Programming Languages
and Systems, LNCS 1782, 2000.
http://www.cse.ogi.edu/~mpj/pubs/fundeps-esop2000.pdf

[4] Gregory J. Duck, Simon Peyton Jones, Peter J. Stuckey, and Martin 
Sulzmann.  Sound and Decidable Type Inference for Functional 
Dependencies.  ESOP'04.
http://research.microsoft.com/Users/simonpj/Papers/fd-chr/

[5] Ronald Garcia, Jaakko Järvi, Andrew Lumsdaine, Jeremy G. Siek, and 
Jeremiah Willcock.  A Comparative Study of Language Support for 
Generic Programming. In Proceedings of the 2003 ACM SIGPLAN 
conference on Object-oriented programming, systems, languages, and 
applications (OOPSLA'03), 2003.

http://www.osl.iu.edu/publications/pubs/2003/comparing_generic_programming03.pdf


___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/list

Re: [Haskell] class assosiated types, via GADTs or FDs

2005-02-16 Thread Martin Sulzmann
Hi,

you mention three drawbacks FDs have against CATs
I believe that all three drawbacks can be addressed.

*** Short summary ***
(I'm using Manuel's numbering scheme, see his original email below)

Drawback #2: "FDs introduce clutter"

True, I think CATs are a great idea but it's also important
to note that FDs are strictly more powerful. I further claim that
CATs can be viewed as syntactic sugar for a special class of FDs.

Drawback #1: "CAT to FD encoding introduces (potentially)
non-terminating instances"

I claim that there exists a translation from CAT to FD
such that FD instances generated are terminating.

Drawback #3: "Lack of abstraction"

There's well-typed translation scheme which retains abstraction.

Hence, I conclude that CATs can be translated to FDs
(retaining termination and abstraction) such that the
resulting FD program is accepted by GHC.

Here are more details.


*** Drawback #2: "FDs introduce clutter" ***


FDs are strictly more powerful.
We can express more complicated type relations such as

class Foo a b c | a c -> b, b c -> a

Anyway, I agree there are styles of programming where we'd like to
hide all "functionally defined" parameters. I claim that rather than 
developing CATs as a new type class extension we should view them
as syntactic sugar for (a special class of) FDs. 
So, let's see how to address drawbacks 1 and 3.

*
*** Drawback #1: "CAT to FD encoding introduces (potentially) ***
***non-terminating instances" ***
*

Short summary: There exists a translation from CAT to FD
   such that FD instances generated are terminating.

In [2] you show how to encode the Array example.

Array CAT example:

class ArrayElem e where
  data Array e
  index :: Array e -> Int -> e

instance (ArrayElem a, ArrayElem b) =>
 ArrayElem (a,b) where
  data Array (a,b) = PairArray (Array a) (Array b)

I repeat your FD encoding:

class ArrayElem a b | a->b where
index :: b -> Int -> a

instance (ArrayElem a a2, ArrayElem b b2) => ArrayElem (a,b) (a2,b2)

In [4] it is shown that such instances may lead to non-termination
(of type inference). Assume we encounter the constraint
ArrayElem (a,b) a, then the process of type improvement and context
reduction will not terminate.
E.g., 

ArrayElem (a,b) a 
--> ArrayElem ((c,d),b) (c,d), a=(c,d)
--> ArrayElem (c,d) c, ArrayElem b d, a=(c,d)
--> ...

I claim that this is not the "proper" way to encode CATs via FDs.

Proper FD encoding:

class Array a b | a->b where
   index :: b->Int->a

-- Note that in the internal translation of CATs to System F,
-- a new data type is introduced for each instance.
-- Let's simply do that for the FD encoding as well.

data ArrayPair a b ar br = PairArray ar br
  ^

instance (Array a ar, Array b br) => (*)
 Array (a,b) (ArrayPair a b ar br)


The underlined (phantom) parameters seem somewhat redundant
but are the main trick to ensure termination.

MP Jones states in [3] the following "termination" condition:
(For simplicity, I only show the case for two-parameter
type classes)

Assume class F a b | a->b, then for each 
instance ... => F t1 t2 we have that fv(t2) subsetof fv(t1).

In [4] it is shown that this condition guarantees termination.
Note that instance (*) doesn't satisfy this condition.

For the proper CAT via FD encoding we need a new termination
condition.

Variation of termination condition:

Assume class F a b | a->b, then for each 
instance ... => F t1 t2 we have that fv(t1) subsetof fv(t2).
 ^^

Note that t1 and t2 have switched their roles.
Instances (*) satisfies this new condition.

The short story is that this is another condition which
guarantees termination. Let's reconsider the "critical"
example from above. We find that

ArrayElem (a,b) a 
--> ArrayElem ((c,d),b) (c,d), a=ArrayPair a b (c,d)
--> fail!  ^
   occurs check fails

Obviously, I've only shown you one example.
My experience with CATs and FDs tells me this is
the way to go. It's fairly straightforward to
generalize the above encoding principle.
If somebody is interesting in fully formalizing
the translation from CAT to a decidable
fragment of FDs pl feel free to contact me.



*** Drawback #3: Lack of abstraction ***
 

I repeat the MapKey example.

 > Now suppose a user wants to define a function
 > 
 >   lookupInt :: MapKey Int fm => k -> fm v -> Maybe v
 >   lookupInt = lookup
 > 
 > then they will find that the compiler rejects this definition.  (The
 > reason 

Re: [Haskell] class assosiated types, via GADTs or FDs

2005-02-17 Thread Keean Schupke
I am working (with ralf & oleg) on using template-haskell to introduce 
syntactic sugar for the kinds of tricks used in the HList and OOHaskell 
papers, to make these
programming styles more user friendly. It might be nice to incorporate 
CATs as well, although I am not sure what an "unlifted" CAT looks like. 
Its not going to happen real soon now as there is still a lot of work to 
do on the translation, but I am certainly interested in the idea.

   Keean.
Martin Sulzmann wrote:
Hi,
you mention three drawbacks FDs have against CATs
I believe that all three drawbacks can be addressed.
*** Short summary ***
(I'm using Manuel's numbering scheme, see his original email below)
Drawback #2: "FDs introduce clutter"
True, I think CATs are a great idea but it's also important
to note that FDs are strictly more powerful. I further claim that
CATs can be viewed as syntactic sugar for a special class of FDs.
Drawback #1: "CAT to FD encoding introduces (potentially)
   non-terminating instances"
I claim that there exists a translation from CAT to FD
such that FD instances generated are terminating.
Drawback #3: "Lack of abstraction"
There's well-typed translation scheme which retains abstraction.
Hence, I conclude that CATs can be translated to FDs
(retaining termination and abstraction) such that the
resulting FD program is accepted by GHC.
Here are more details.

*** Drawback #2: "FDs introduce clutter" ***

FDs are strictly more powerful.
We can express more complicated type relations such as
class Foo a b c | a c -> b, b c -> a
Anyway, I agree there are styles of programming where we'd like to
hide all "functionally defined" parameters. I claim that rather than 
developing CATs as a new type class extension we should view them
as syntactic sugar for (a special class of) FDs. 
So, let's see how to address drawbacks 1 and 3.

*
*** Drawback #1: "CAT to FD encoding introduces (potentially) ***
***non-terminating instances" ***
*
Short summary: There exists a translation from CAT to FD
  such that FD instances generated are terminating.
In [2] you show how to encode the Array example.
Array CAT example:
class ArrayElem e where
 data Array e
 index :: Array e -> Int -> e
instance (ArrayElem a, ArrayElem b) =>
ArrayElem (a,b) where
 data Array (a,b) = PairArray (Array a) (Array b)
I repeat your FD encoding:
class ArrayElem a b | a->b where
   index :: b -> Int -> a
instance (ArrayElem a a2, ArrayElem b b2) => ArrayElem (a,b) (a2,b2)
In [4] it is shown that such instances may lead to non-termination
(of type inference). Assume we encounter the constraint
ArrayElem (a,b) a, then the process of type improvement and context
reduction will not terminate.
E.g., 

   ArrayElem (a,b) a 
--> ArrayElem ((c,d),b) (c,d), a=(c,d)
--> ArrayElem (c,d) c, ArrayElem b d, a=(c,d)
--> ...

I claim that this is not the "proper" way to encode CATs via FDs.
Proper FD encoding:
class Array a b | a->b where
  index :: b->Int->a
-- Note that in the internal translation of CATs to System F,
-- a new data type is introduced for each instance.
-- Let's simply do that for the FD encoding as well.
data ArrayPair a b ar br = PairArray ar br
 ^
instance (Array a ar, Array b br) => (*)
Array (a,b) (ArrayPair a b ar br)
   
The underlined (phantom) parameters seem somewhat redundant
but are the main trick to ensure termination.
MP Jones states in [3] the following "termination" condition:
(For simplicity, I only show the case for two-parameter
type classes)
Assume class F a b | a->b, then for each 
instance ... => F t1 t2 we have that fv(t2) subsetof fv(t1).

In [4] it is shown that this condition guarantees termination.
Note that instance (*) doesn't satisfy this condition.
For the proper CAT via FD encoding we need a new termination
condition.
Variation of termination condition:
Assume class F a b | a->b, then for each 
instance ... => F t1 t2 we have that fv(t1) subsetof fv(t2).
^^

Note that t1 and t2 have switched their roles.
Instances (*) satisfies this new condition.
The short story is that this is another condition which
guarantees termination. Let's reconsider the "critical"
example from above. We find that
   ArrayElem (a,b) a 
--> ArrayElem ((c,d),b) (c,d), a=ArrayPair a b (c,d)
--> fail!  ^
  occurs check fails

Obviously, I've only shown you one example.
My experience with CATs and FDs tells me this is
the way to go. It's fairly straightforward to
generalize the above encoding principle.
If somebody is interesting in fully formalizing
the translation from CAT to a decidable
fragment