[Haskell] class assosiated types, via GADTs.
I believe there is a realationship between GADTs and class assosiated types which hints at a much simpler implementation of class assosiated types (CATs) and possibly a lessoning of the constraints mentioned in the paper. Assuming you have already implemented GADTs the translation is straightforward, which is why I think it might be relevant to ghc developers. An example is worth a thousand words so here it is: Take the example from the paper: > class MapKey k where > data Map k v > lookup :: k -> Map k v -> Maybe v > empty :: Map k v > > instance MapKey Int where > data Map k v = MapInt (IntMap v) > lookup k (MapInt m) = Data.IntMap.lookup k m > empty = MapInt Data.IntMap.empty > > instance MapKey () where > data Map k v = MapUnit (Maybe v) > lookup () (MapUnit x) = x > empty = MapUnit Nothing now here is the translation: > class MapKey k where > lookup :: k -> Map k v -> Maybe v > empty :: Map k v > > data MapKey k => Map k v where > MapUnit :: Maybe v -> Map () v > MapInt :: IntMap v -> Map Int v > > instance MapKey Int where > lookup k (MapInt m) = Data.IntMap.lookup k m > empty = MapInt Data.IntMap.empty > > instance MapKey () where > lookup () (MapUnit x) = x > empty = MapUnit Nothing The idea should be clear, each CAT data declaration is lifted to the top level. each instance declares a new constructor for said class with an explicit GADT type fixing the type which the instance is being declared for. I have not worked out any theory behind it, but it seems to work in manual translation testing and has a certain intuitiveness about it. One might wonder whether we are paying a price for turning Map into a multi-construtor type, but we are just moving an implicit branch on an unknown function in the dictonary into an explicit case. (which is a win in my books) Since a 'Map Int v' cannot come about except via the MapInt constructor, we are guarenteed that the pattern matching in the instance declarations cannot fail if the program typechecks. The only caveats I can think of to actually implementing this in ghc are separate compilation, the Map GADT must be marked somehow as 'open' since later instances might create new constructors for it. Hopefully this won't interfere with optimization too much. Perhaps some internally generated rules to turn lookup :: Int -> Map Int v -> Maybe v into lookupInt :: Int -> IntMap v -> Maybe v would suffice. (this transformation is valid because we know that if the type of Map is Map Int v then its constructor has to be MapInt, despite the 'open' nature of the Map datatype) Once a rule fires, we no longer have any reference to the open GADT so can optimize like normal. * The front end will still have to be modified to type CATs similarly to as described in the paper, if nothing else for sane user error messages. (however, I believe some of the GADT machinery may be reusable) After typechecking, the translation can be carried out even before (and independently of) the desugaring of typeclasses. * It might just not work with multiparameter type classes, I have not thought about it at all, but then again CATs obviate much of the need for multi-parameter type classes, so having them mutually exclusive (at first) doesn't seem like too much of a restriction. I may be completely wrong about this, or perhaps it is already well known. or perhaps it is what the paper actually says to do and I am just misreading it :) I thought I'd share because I have really wanted the class assosiated types in various situations and I wasn't sure if I'd have time (or knowledge) to explore this seriously too much further. John -- John Meacham - ârepetae.netâjohnâ ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] class assosiated types, via GADTs.
Perhaps i'm being dumb, but I dont see the need for either GADTs or class-associated-types to do this... I am pretty sure it can be done using fundeps, using the techniques from the HList paper... of course I haven't coded it yet so there might be some problem I haven't considered. By the way Oleg has already shown that there is an equivalence between GADTs and the type-class encoding used for HLists. Keean. I believe there is a realationship between GADTs and class assosiated types which hints at a much simpler implementation of class assosiated types (CATs) and possibly a lessoning of the constraints mentioned in the paper. Assuming you have already implemented GADTs the translation is straightforward, which is why I think it might be relevant to ghc developers. An example is worth a thousand words so here it is: Take the example from the paper: class MapKey k where data Map k v lookup :: k -> Map k v -> Maybe v empty :: Map k v instance MapKey Int where data Map k v = MapInt (IntMap v) lookup k (MapInt m) = Data.IntMap.lookup k m empty = MapInt Data.IntMap.empty instance MapKey () where data Map k v = MapUnit (Maybe v) lookup () (MapUnit x) = x empty = MapUnit Nothing now here is the translation: class MapKey k where lookup :: k -> Map k v -> Maybe v empty :: Map k v data MapKey k => Map k v where MapUnit :: Maybe v -> Map () v MapInt :: IntMap v -> Map Int v instance MapKey Int where lookup k (MapInt m) = Data.IntMap.lookup k m empty = MapInt Data.IntMap.empty instance MapKey () where lookup () (MapUnit x) = x empty = MapUnit Nothing The idea should be clear, each CAT data declaration is lifted to the top level. each instance declares a new constructor for said class with an explicit GADT type fixing the type which the instance is being declared for. I have not worked out any theory behind it, but it seems to work in manual translation testing and has a certain intuitiveness about it. One might wonder whether we are paying a price for turning Map into a multi-construtor type, but we are just moving an implicit branch on an unknown function in the dictonary into an explicit case. (which is a win in my books) Since a 'Map Int v' cannot come about except via the MapInt constructor, we are guarenteed that the pattern matching in the instance declarations cannot fail if the program typechecks. The only caveats I can think of to actually implementing this in ghc are separate compilation, the Map GADT must be marked somehow as 'open' since later instances might create new constructors for it. Hopefully this won't interfere with optimization too much. Perhaps some internally generated rules to turn lookup :: Int -> Map Int v -> Maybe v into lookupInt :: Int -> IntMap v -> Maybe v would suffice. (this transformation is valid because we know that if the type of Map is Map Int v then its constructor has to be MapInt, despite the 'open' nature of the Map datatype) Once a rule fires, we no longer have any reference to the open GADT so can optimize like normal. * The front end will still have to be modified to type CATs similarly to as described in the paper, if nothing else for sane user error messages. (however, I believe some of the GADT machinery may be reusable) After typechecking, the translation can be carried out even before (and independently of) the desugaring of typeclasses. * It might just not work with multiparameter type classes, I have not thought about it at all, but then again CATs obviate much of the need for multi-parameter type classes, so having them mutually exclusive (at first) doesn't seem like too much of a restriction. I may be completely wrong about this, or perhaps it is already well known. or perhaps it is what the paper actually says to do and I am just misreading it :) I thought I'd share because I have really wanted the class assosiated types in various situations and I wasn't sure if I'd have time (or knowledge) to explore this seriously too much further. John ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] class assosiated types, via GADTs.
On Tue, Feb 15, 2005 at 10:16:49AM +, Keean Schupke wrote: > Perhaps i'm being dumb, but I dont see the need for either GADTs or > class-associated-types to do this... I am pretty sure it can be done > using fundeps, > using the techniques from the HList paper... of course I haven't coded > it yet so there > might be some problem I haven't considered. Yeah, in the CAT paper, the relationship between functional dependencies and CATs is explored. (see section 5). An encoding of just this example but using functional dependencies is given and some of their properties are explored. http://research.microsoft.com/Users/simonpj/papers/assoc-types/index.htm However CATs are a much clearer expression of ones intent (IMHO) and have other nice properties in that useful functions can be typed via CATs but not via fundeps. The case for them is pretty compelling. > By the way Oleg has already shown that there is an equivalence between > GADTs and the type-class encoding used for HLists. It would not surprise me. However, just because there is an equivalance, it doesn't mean it is useful for what one wants. I encoded (a certain) CAT before via a GADT, but this encoding is not directly useful to users because it would require all instances to be known a priori, thus breaking one of the main properties of type classes, their extensibility. However, the compiler could pull off this translation internally (I hope) thus levaraging the work put into GADTs. The main advantage of this translation over the one in the paper is that it is not intertwined with the dictionary generation and typeclass desugaring code, which is pretty hairy to begin with. Rather it is an orthogonal transformation so hopefully will be easier to implement without touching too much of ghcs internals. John -- John Meacham - ârepetae.netâjohnâ ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] class assosiated types, via GADTs.
John Meacham wrote: The main advantage of this translation over the one in the paper is that it is not intertwined with the dictionary generation and typeclass desugaring code, which is pretty hairy to begin with. Rather it is an orthogonal transformation so hopefully will be easier to implement without touching too much of ghcs internals. John Yes, however my worry is that we are heading towards the proliferation of not very well thought out overlaping extensions (in the sense that how the extensions fit with other proposed and existing extensions does not seem well thought out). If the functionality duplicates something already achievable using functional dependancies, then what is the point of yet another extension to the language. I would rather have a small set of powerful tools, than an expansive set of single purpose gadgets. In my opinion a lot of these things could be implemented as syntactic sugar ontop of fundeps (making it easier to read). Template-Haskell could be used to generate the required classes from a simplified syntax without altering the compiler at all. Keean. ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] class assosiated types, via GADTs or FDs
On Tue, 2005-02-15 at 10:16 +, Keean Schupke wrote: > Perhaps i'm being dumb, but I dont see the need for either GADTs or > class-associated-types to do this... I am pretty sure it can be done > using fundeps, Yes, you can encode this example with functional dependencies (see Section 3.4 of [1]), but functional dependencies have three drawbacks, which are discussed in the associated types paper [2]. In case you don't want to read the paper, the problems are the following. Your class declaration with functional dependencies becomes class MapKey k fm | k -> fm where lookup :: k -> fm v -> Maybe v empty :: fm v -- NB: The signature of empty doesn't mention k, which -- is already problematic, but you can fix this by -- adding it as a dummy argument to fm. Drawback #1: If you want to add an instance for keys of pair type, you need an instance of the form data MapKeyPair fm1 fm2 v = ... class (MapKey k1 fm1, MapKey k2 fm2) => MapKey (k1, k2) (MapKeyPair fm1 fm2) where ... If you look into Section 6.1 of the functional dependencies paper [3], you'll see that this definition is actually not admitted. GHC allows it nonetheless. As a result, the type checker will on some programs, that it ought to reject, simply not terminate - that's been pointed out by [4]. Drawback #2: There's been an interesting study about the support for a certain form of generic programming in six different programming languages (four OO languages and two FP languages) [5]. In that study the not quite satisfactory support of associated types in Haskell via functional dependencies is cited as the *only* shortcoming of Haskell in the summary of Table 1 (Haskell gets the best results of all six languages in that table, btw). The main complaint cited by the authors is that the representation type (fm in the example above) needs to be included in the argument list of the class. For larger examples, with more associated types, such as the graph library studied in [5], this is unwieldy. Drawback #3: The functional dependency encoding prevents you from ever making the representation of your tries/maps (in the MapKey class example) abstract. It does not only prevent you from declaring it as an abstract data type, but it actually forces users of a library built in this way to understand the representation types and manually encode them in their application code. For example, assume we have an instance as follows instance MapKey Int (Data.IntMap v) where lookup k m = Data.IntMap.lookup k m empty = Data.IntMap.empty The full type of the class method lookup is lookup :: MapKey k fm => k -> fm v -> Maybe v Now suppose a user wants to define a function lookupInt :: MapKey Int fm => k -> fm v -> Maybe v lookupInt = lookup then they will find that the compiler rejects this definition. (The reason lies in when the type checker uses Mark Jones' improvement rule [3].) In fact, the only valid definition is lookupInt :: k -> Data.IntMap v -> Maybe v lookupInt = lookup That's quite unsatisfactory, as a user of a such a tries library will have to know how maps of Int keys are represented. Moreover, if the tries library ever changes that representation, all user code will have to be changed. This problem is aggravated by that representation types for tries get quite complicated once you use more involved types as keys. This third drawback makes functional dependencies IMHO rather unsuitable for encoding associated types in anything, but toy code. Manuel [1] Ralf Hinze, Johan Jeuring, and Andres Löh. Type-indexed data types. Science of Computer Programming, MPC Special Issue, 51:117-151, 2004. http://www.informatik.uni-bonn.de/~ralf/publications/SCP2004.pdf [2] Manuel M. T. Chakravarty, Gabriele Keller, Simon Peyton Jones, and Simon Marlow. Associated Types with Class. POPL'05. http://www.cse.unsw.edu.au/~chak/papers/papers.html#assoc [3] Mark P. Jones. Type Classes with Functional Dependencies. Proceedings of the 9th European Symposium on Programming Languages and Systems, LNCS 1782, 2000. http://www.cse.ogi.edu/~mpj/pubs/fundeps-esop2000.pdf [4] Gregory J. Duck, Simon Peyton Jones, Peter J. Stuckey, and Martin Sulzmann. Sound and Decidable Type Inference for Functional Dependencies. ESOP'04. http://research.microsoft.com/Users/simonpj/Papers/fd-chr/ [5] Ronald Garcia, Jaakko Järvi, Andrew Lumsdaine, Jeremy G. Siek, and Jeremiah Willcock. A Comparative Study of Language Support for Generic Programming. In Proceedings of the 2003 ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications (OOPSLA'03), 2003. http://www.osl.iu.edu/publications/pubs/2003/comparing_generic_programming03.pdf ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/list
Re: [Haskell] class assosiated types, via GADTs or FDs
Hi, you mention three drawbacks FDs have against CATs I believe that all three drawbacks can be addressed. *** Short summary *** (I'm using Manuel's numbering scheme, see his original email below) Drawback #2: "FDs introduce clutter" True, I think CATs are a great idea but it's also important to note that FDs are strictly more powerful. I further claim that CATs can be viewed as syntactic sugar for a special class of FDs. Drawback #1: "CAT to FD encoding introduces (potentially) non-terminating instances" I claim that there exists a translation from CAT to FD such that FD instances generated are terminating. Drawback #3: "Lack of abstraction" There's well-typed translation scheme which retains abstraction. Hence, I conclude that CATs can be translated to FDs (retaining termination and abstraction) such that the resulting FD program is accepted by GHC. Here are more details. *** Drawback #2: "FDs introduce clutter" *** FDs are strictly more powerful. We can express more complicated type relations such as class Foo a b c | a c -> b, b c -> a Anyway, I agree there are styles of programming where we'd like to hide all "functionally defined" parameters. I claim that rather than developing CATs as a new type class extension we should view them as syntactic sugar for (a special class of) FDs. So, let's see how to address drawbacks 1 and 3. * *** Drawback #1: "CAT to FD encoding introduces (potentially) *** ***non-terminating instances" *** * Short summary: There exists a translation from CAT to FD such that FD instances generated are terminating. In [2] you show how to encode the Array example. Array CAT example: class ArrayElem e where data Array e index :: Array e -> Int -> e instance (ArrayElem a, ArrayElem b) => ArrayElem (a,b) where data Array (a,b) = PairArray (Array a) (Array b) I repeat your FD encoding: class ArrayElem a b | a->b where index :: b -> Int -> a instance (ArrayElem a a2, ArrayElem b b2) => ArrayElem (a,b) (a2,b2) In [4] it is shown that such instances may lead to non-termination (of type inference). Assume we encounter the constraint ArrayElem (a,b) a, then the process of type improvement and context reduction will not terminate. E.g., ArrayElem (a,b) a --> ArrayElem ((c,d),b) (c,d), a=(c,d) --> ArrayElem (c,d) c, ArrayElem b d, a=(c,d) --> ... I claim that this is not the "proper" way to encode CATs via FDs. Proper FD encoding: class Array a b | a->b where index :: b->Int->a -- Note that in the internal translation of CATs to System F, -- a new data type is introduced for each instance. -- Let's simply do that for the FD encoding as well. data ArrayPair a b ar br = PairArray ar br ^ instance (Array a ar, Array b br) => (*) Array (a,b) (ArrayPair a b ar br) The underlined (phantom) parameters seem somewhat redundant but are the main trick to ensure termination. MP Jones states in [3] the following "termination" condition: (For simplicity, I only show the case for two-parameter type classes) Assume class F a b | a->b, then for each instance ... => F t1 t2 we have that fv(t2) subsetof fv(t1). In [4] it is shown that this condition guarantees termination. Note that instance (*) doesn't satisfy this condition. For the proper CAT via FD encoding we need a new termination condition. Variation of termination condition: Assume class F a b | a->b, then for each instance ... => F t1 t2 we have that fv(t1) subsetof fv(t2). ^^ Note that t1 and t2 have switched their roles. Instances (*) satisfies this new condition. The short story is that this is another condition which guarantees termination. Let's reconsider the "critical" example from above. We find that ArrayElem (a,b) a --> ArrayElem ((c,d),b) (c,d), a=ArrayPair a b (c,d) --> fail! ^ occurs check fails Obviously, I've only shown you one example. My experience with CATs and FDs tells me this is the way to go. It's fairly straightforward to generalize the above encoding principle. If somebody is interesting in fully formalizing the translation from CAT to a decidable fragment of FDs pl feel free to contact me. *** Drawback #3: Lack of abstraction *** I repeat the MapKey example. > Now suppose a user wants to define a function > > lookupInt :: MapKey Int fm => k -> fm v -> Maybe v > lookupInt = lookup > > then they will find that the compiler rejects this definition. (The > reason
Re: [Haskell] class assosiated types, via GADTs or FDs
I am working (with ralf & oleg) on using template-haskell to introduce syntactic sugar for the kinds of tricks used in the HList and OOHaskell papers, to make these programming styles more user friendly. It might be nice to incorporate CATs as well, although I am not sure what an "unlifted" CAT looks like. Its not going to happen real soon now as there is still a lot of work to do on the translation, but I am certainly interested in the idea. Keean. Martin Sulzmann wrote: Hi, you mention three drawbacks FDs have against CATs I believe that all three drawbacks can be addressed. *** Short summary *** (I'm using Manuel's numbering scheme, see his original email below) Drawback #2: "FDs introduce clutter" True, I think CATs are a great idea but it's also important to note that FDs are strictly more powerful. I further claim that CATs can be viewed as syntactic sugar for a special class of FDs. Drawback #1: "CAT to FD encoding introduces (potentially) non-terminating instances" I claim that there exists a translation from CAT to FD such that FD instances generated are terminating. Drawback #3: "Lack of abstraction" There's well-typed translation scheme which retains abstraction. Hence, I conclude that CATs can be translated to FDs (retaining termination and abstraction) such that the resulting FD program is accepted by GHC. Here are more details. *** Drawback #2: "FDs introduce clutter" *** FDs are strictly more powerful. We can express more complicated type relations such as class Foo a b c | a c -> b, b c -> a Anyway, I agree there are styles of programming where we'd like to hide all "functionally defined" parameters. I claim that rather than developing CATs as a new type class extension we should view them as syntactic sugar for (a special class of) FDs. So, let's see how to address drawbacks 1 and 3. * *** Drawback #1: "CAT to FD encoding introduces (potentially) *** ***non-terminating instances" *** * Short summary: There exists a translation from CAT to FD such that FD instances generated are terminating. In [2] you show how to encode the Array example. Array CAT example: class ArrayElem e where data Array e index :: Array e -> Int -> e instance (ArrayElem a, ArrayElem b) => ArrayElem (a,b) where data Array (a,b) = PairArray (Array a) (Array b) I repeat your FD encoding: class ArrayElem a b | a->b where index :: b -> Int -> a instance (ArrayElem a a2, ArrayElem b b2) => ArrayElem (a,b) (a2,b2) In [4] it is shown that such instances may lead to non-termination (of type inference). Assume we encounter the constraint ArrayElem (a,b) a, then the process of type improvement and context reduction will not terminate. E.g., ArrayElem (a,b) a --> ArrayElem ((c,d),b) (c,d), a=(c,d) --> ArrayElem (c,d) c, ArrayElem b d, a=(c,d) --> ... I claim that this is not the "proper" way to encode CATs via FDs. Proper FD encoding: class Array a b | a->b where index :: b->Int->a -- Note that in the internal translation of CATs to System F, -- a new data type is introduced for each instance. -- Let's simply do that for the FD encoding as well. data ArrayPair a b ar br = PairArray ar br ^ instance (Array a ar, Array b br) => (*) Array (a,b) (ArrayPair a b ar br) The underlined (phantom) parameters seem somewhat redundant but are the main trick to ensure termination. MP Jones states in [3] the following "termination" condition: (For simplicity, I only show the case for two-parameter type classes) Assume class F a b | a->b, then for each instance ... => F t1 t2 we have that fv(t2) subsetof fv(t1). In [4] it is shown that this condition guarantees termination. Note that instance (*) doesn't satisfy this condition. For the proper CAT via FD encoding we need a new termination condition. Variation of termination condition: Assume class F a b | a->b, then for each instance ... => F t1 t2 we have that fv(t1) subsetof fv(t2). ^^ Note that t1 and t2 have switched their roles. Instances (*) satisfies this new condition. The short story is that this is another condition which guarantees termination. Let's reconsider the "critical" example from above. We find that ArrayElem (a,b) a --> ArrayElem ((c,d),b) (c,d), a=ArrayPair a b (c,d) --> fail! ^ occurs check fails Obviously, I've only shown you one example. My experience with CATs and FDs tells me this is the way to go. It's fairly straightforward to generalize the above encoding principle. If somebody is interesting in fully formalizing the translation from CAT to a decidable fragment