Re: *safe* coerce: four methods compared
This message illustrates how safe casting with multiple universes can be extended to new user-defined, polymorphic datatypes. We show a _portable_ mapping of polymorphic types to integers. Different instances of a polymorphic type map to different integers. Phantom types can be either disregarded or accounted for -- as the user wishes. Furthermore, if two different applications running on two different machines agree on the same type heap, then they agree on the type encoding. An application can use multiple typeheaps at the same time. It is therefore easy to dedicate one particular typeheap to the portable encoding of types across several computers.

Incidentally, our encoding of types may take into account the _values_ of some of the components of a polymorphic type. For example, given a value of a type 'Foo Int a', the encoding may use the _value_ of the first component and the _type_ of the second component. When we do the cast, we can check not only for the desired type but also for the desired values. We can thus approach dependent types.

This message hopefully replies to Ralf Laemmel's comment:

] You should make very clear that there is
] not just a single safeCoerce, but the TI/init_typeseq argument has to
] be constructed and supplied by the programmer in a way that (s)he
] decides what array of types can be handled. So if you wanted to use
] your approach to scrap boilerplate [1], say deal with many datatypes,
] this becomes quite a burden. Think of actually building initial type
] sequences. Think of how combinators need to be parameterised to take
] type sequences.

with which I agree. Below I try to make it very explicit what depends on TI/init_typeseq and what doesn't, and how much work is involved in adding new datatypes and extending type heaps. I don't know if the proposed approach is better than the others in many circumstances. It's certainly different.

Ralf Laemmel also wrote:

] Software-engineering-wise your approach suffers from an important
] weakness: a closed world assumption. The programmer has to maintain
] your "TI" and pass it on in all kinds of contexts for the array of
] types to be handled.

I thought that was a feature: a programmer can import some type heap, partially apply the needed function to it, and re-export the latter. The example below demonstrates what is involved when a new datatype is added. It seems not that much.

] I didn't need undecidable not even overlapping instances.

I don't actually need the overlapping-instances extension either. An earlier message on the subject of MRefs used similar type heaps without any need for -fallow-overlapping-instances. However, I had to use numeral types such as Succ (Succ ... Zero) rather than Int for type indices. The current approach looks more elegant. Besides, you seem to be in favor of giving overlapping instances more acceptance, support and legitimacy.

] Is it obvious to see that fetching stuff from the type sequences would
] be indeed efficient for long PLists?

The sequence of 'cdr' operations needed for fetching stuff is known statically. A compiler might therefore do something intelligent.

This whole message is self-contained, and can be loaded as it is in GHCi, given the flags
    -fglasgow-exts -fallow-undecidable-instances -fallow-overlapping-instances

We start with the boilerplate, which has changed a little (for example, the class PList now has a member function pllen).
> data Nil t r = Nil
> data Cons t r = Cons t r
>
> class PList ntype vtype cdrtype where
>     cdr::   ntype vtype cdrtype -> cdrtype
>     empty:: ntype vtype cdrtype -> Bool
>     value:: ntype vtype cdrtype -> vtype
>     pllen:: ntype vtype cdrtype -> Int
>
> instance PList Nil vtype cdrtype where
>     empty = const True
>     pllen = const 0
>
> instance (PList n v r) => PList Cons v' (n v r) where
>     empty = const False
>     value (Cons v r) = v
>     cdr   (Cons v r) = r
>     pllen (Cons v r) = 1 + pllen r
>
> class TypeSeq t s where
>     type_index:: t -> s -> Int
>     fetch:: t -> s -> t
>     alter:: t -> s -> s
>
> instance (PList Cons t r) => TypeSeq t (Cons t r) where
>     type_index _ _ = 0
>     fetch _ (Cons v _) = v
>     alter newv (Cons v r) = Cons newv r
>
> instance (PList Cons t' r', TypeSeq t r') => TypeSeq t (Cons t' r') where
>     type_index v s = 1 + (type_index v $ cdr s)
>     fetch v s = fetch v $ cdr s
>     alter newv (Cons v' r') = Cons v' $ alter newv r'

The initial typesequence:

> init_typeseq = Cons (undefined::Char) $
>                Cons (undefined::Int) $
>                Cons (undefined::Bool) $
>                Cons (undefined::String) $
>                (Nil::Nil () ())

and its type. See the previous message for more discussion of the latter.

> type TI = Cons Char (Cons Int (Cons Bool (Cons String (Nil () ()))))

Because we will be dealing with existential types, we need to extend the compile-time indexing into run-time:

> class (TypeSeq t u) => Type
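To substantiate the claim above that adding a new datatype is "not that much" work, here is a minimal sketch of mine (not part of the original message; the datatype MyDT and the names my_typeseq, ts_test1 and ts_test2 are hypothetical). Extending a type heap is a single Cons: no new class instances, and no changes to any existing declaration.

> -- A hypothetical user-defined, polymorphic datatype
> data MyDT a = MyDT a
>
> -- Extending the heap: one Cons onto the existing init_typeseq
> my_typeseq = Cons (undefined::MyDT Int) init_typeseq
>
> -- The new instance of the type gets index 0; the old indices shift
> ts_test1 = type_index (undefined::MyDT Int) my_typeseq   -- 0
> ts_test2 = type_index (undefined::Bool) my_typeseq       -- 3

Note that MyDT Int and MyDT Bool would receive different indices if both were consed onto the heap, which is the promised mapping of different instances of a polymorphic type to different integers.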
Re: *safe* coerce: four methods compared
[EMAIL PROTECTED] wrote:

> This is a "Related Work" section of the previous message.
> ... again cunning stuff omitted ...

I buy most of this, but IMHO you should make very clear that there is not just a single safeCoerce, but the TI/init_typeseq argument has to be constructed and supplied by the programmer in a way that (s)he decides what array of types can be handled. So if you wanted to use your approach to scrap boilerplate [1], say deal with many datatypes, this becomes quite a burden. Think of actually building initial type sequences. Think of how combinators need to be parameterised to take type sequences. (That's what I called a CWA yesterday.)

On the other hand, you mention this duality between type classes vs. type heaps. Yes, I would say that type classes and type case are somewhat dual. You provide a type case. What I like about your type case vs. the approach taken in [1] is that your type case will be very precise. That is, you don't say one can just try anything that is Typeable; you rather restrict questions to the types in the supplied initial type sequence. This is certainly beneficial for applications other than scraping boilerplate.

Ralf

[1] "Scrap your boilerplate: a practical design pattern for generic programming" by Ralf Lämmel and Simon Peyton-Jones, appeared in Proceedings of TLDI 2003, ACM Press
    http://www.cs.vu.nl/boilerplate/#paper

--
Ralf Laemmel
VU & CWI, Amsterdam, The Netherlands
http://www.cs.vu.nl/~ralf/
http://www.cwi.nl/~ralf/
*safe* coerce: four methods compared
This is a "Related Work" section of the previous message. We compare three main methods of achieving safe casts. It seems that the method proposed in the earlier message is quite different -- especially in terms of extensibility. In this message, we compare the extensibility of four techniques. Stephanie Weirich ICFP'00 paper points out another solution, which relies on mutable IORefs. Since that technique can only be used with IO monad, we do not consider it here. Some of the methods below require type classes and algebraic datatype declarations. Some require only an algebraic datatype, or only a typeclass. In either case, we run into an extensibility problem: to add support for a new datatype, we must either add an instance declaration, or a new alternative to the datatype declaration. These are non-trivial, non-modular extensions. For example, when we add a new alternative to a datatype declaration, we must physically update the corresponding file. We must then re-compile all dependent modules. Surprisingly, the solution in the earlier message is free from these drawbacks. We can extend the type heap in a modular fashion. We do not need to alter type or data declarations. It seems our type heaps are sort of reified lists of instances. There appear to be a duality between typeclasses and our type heaps. Only our type heaps are first-class. The idea behind all type-safe casts is simple: to cast a value of a type 'a' into a target type 'b', we inject the value into some universe and then project it to the target type 'b'. To illustrate the differences in implementations of that idea, we will be using James Cheney and Ralf Hinze's example: a generic comparison function. To avoid unnecessary complications, we limit ourselves to built-in and scalar types. Extensions to products, exponential, recursive and polymorphic types are possible, but too messy. We also will not consider existential types, so we avoid introducing type classes if they are not needed in the static case. This whole message is self-contained, and can be loaded as it is in GHCi, given the flags -fglasgow-exts -fallow-undecidable-instances -fallow-overlapping-instances Approach 1: Tcl approach The universal type is a string. The values to cast must belong to the class Show and the class Read. The injection and projection functions are trivial: > sh_inj x = show x > sh_prj x = read x Generic equality and the cast functions are trivial as well: > sh_gequal x y = sh_inj x == sh_inj y > sh_cast x = sh_prj $ sh_inj x Here's the test: > sh_test1 = [sh_gequal 1 2, sh_gequal True True, sh_gequal 'a' 'b'] To add support for a new datatype, we have to place that datatype in the class Show and the class Read. That is, we have to add the corresponding instance declarations _and_ we have to implement methods 'show' and 'read'. Functions sh_gequal and sh_cast do not have to be modified. The Tcl approach is also the most generous with respect to type equivalence: for example, Int and Integer are considered equivalent, and sh_cast may cast between them. Incidentally, when GHC can derive Binary, this approach becomes far more appealing. 
Approach 2: The universe is the tagged union

> data TU = TChar Char | TBool Bool | TInt Int
>
> class TURepr t where
>     tu_inj:: t -> TU
>     tu_prj:: TU -> t
>
> instance TURepr Char where
>     tu_inj = TChar
>     tu_prj (TChar x) = x
>
> instance TURepr Bool where
>     tu_inj = TBool
>     tu_prj (TBool x) = x
>
> instance TURepr Int where
>     tu_inj = TInt
>     tu_prj (TInt x) = x
>
> tu_gequal x y = cmp (tu_inj x) (tu_inj y)
>   where
>     cmp (TChar x) (TChar y) = x == y
>     cmp (TBool x) (TBool y) = x == y
>     cmp (TInt x)  (TInt y)  = x == y
>
> tu_cast x = tu_prj $ tu_inj x
>
> tu_test1 = [tu_gequal (1::Int) (2::Int), tu_gequal True True,
>             tu_gequal 'a' 'b']

To add support for a new datatype, we have to add a new alternative to the declaration of the datatype TU, and we have to add a new instance of the class TURepr (with implementations of the tu_inj and tu_prj methods). We also have to add another clause to the tu_gequal function. Clearly this is the least extensible approach.

Approach 3: Cheney and Hinze's

The universe is a set of inject/project pairs:

> data IPP a = IPPInt  (a->Int)  (Int->a)
>            | IPPChar (a->Char) (Char->a)
>            | IPPBool (a->Bool) (Bool->a)
>
> ipp_gequal (IPPInt  prj inj) x y = prj x == prj y
> ipp_gequal (IPPChar prj inj) x y = prj x == prj y
> ipp_gequal (IPPBool prj inj) x y = prj x == prj y
>
> ipp_cast (IPPInt xprj xinj) x (IPPInt yprj yinj) = yinj $ xprj x
> -- more should follow...
>
> ipp_test1 = [ipp_gequal (IPPInt id id) (1::Int) (2::Int),
>              ipp_gequal (IPPBool id id) True True,
>              ipp_gequal (IPPChar id id) 'a' 'b']

To add a new primitive datatype, we have to modify the declaration of the datatype IPP and add a new alternative. We also need to add clauses to the functions ipp_gequal and ipp_cast.
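To make the extensibility burden concrete, here is a sketch of my own (not part of the original message) of what the "more should follow" clauses of ipp_cast would presumably look like. It is written as a separate function, ipp_cast2, so as not to disturb the original definition; the name and the test are hypothetical.

> -- A hypothetical completion: one clause per matching pair of
> -- alternatives; mismatched pairs fall through to a pattern-match
> -- failure, reflecting the partiality of the approach
> ipp_cast2 (IPPInt  xprj _) x (IPPInt  _ yinj) = yinj $ xprj x
> ipp_cast2 (IPPChar xprj _) x (IPPChar _ yinj) = yinj $ xprj x
> ipp_cast2 (IPPBool xprj _) x (IPPBool _ yinj) = yinj $ xprj x
>
> -- Casts an Int to an Int through the universe; yields 1
> ipp_test2 = ipp_cast2 (IPPInt id id) (1::Int) (IPPInt id id)

Every new alternative of IPP thus costs one more clause in each of ipp_gequal and ipp_cast, in addition to the change to the data declaration itself.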