This is a "Related Work" section of the previous message. We compare three main methods of achieving safe casts. It seems that the method proposed in the earlier message is quite different -- especially in terms of extensibility. In this message, we compare the extensibility of four techniques. Stephanie Weirich ICFP'00 paper points out another solution, which relies on mutable IORefs. Since that technique can only be used with IO monad, we do not consider it here.
Some of the methods below require type classes and algebraic datatype declarations. Some require only an algebraic datatype, or only a typeclass. In either case, we run into an extensibility problem: to add support for a new datatype, we must either add an instance declaration, or a new alternative to the datatype declaration. These are non-trivial, non-modular extensions. For example, when we add a new alternative to a datatype declaration, we must physically update the corresponding file. We must then re-compile all dependent modules. Surprisingly, the solution in the earlier message is free from these drawbacks. We can extend the type heap in a modular fashion. We do not need to alter type or data declarations. It seems our type heaps are sort of reified lists of instances. There appear to be a duality between typeclasses and our type heaps. Only our type heaps are first-class. The idea behind all type-safe casts is simple: to cast a value of a type 'a' into a target type 'b', we inject the value into some universe and then project it to the target type 'b'. To illustrate the differences in implementations of that idea, we will be using James Cheney and Ralf Hinze's example: a generic comparison function. To avoid unnecessary complications, we limit ourselves to built-in and scalar types. Extensions to products, exponential, recursive and polymorphic types are possible, but too messy. We also will not consider existential types, so we avoid introducing type classes if they are not needed in the static case. This whole message is self-contained, and can be loaded as it is in GHCi, given the flags -fglasgow-exts -fallow-undecidable-instances -fallow-overlapping-instances Approach 1: Tcl approach The universal type is a string. The values to cast must belong to the class Show and the class Read. The injection and projection functions are trivial: > sh_inj x = show x > sh_prj x = read x Generic equality and the cast functions are trivial as well: > sh_gequal x y = sh_inj x == sh_inj y > sh_cast x = sh_prj $ sh_inj x Here's the test: > sh_test1 = [sh_gequal 1 2, sh_gequal True True, sh_gequal 'a' 'b'] To add support for a new datatype, we have to place that datatype in the class Show and the class Read. That is, we have to add the corresponding instance declarations _and_ we have to implement methods 'show' and 'read'. Functions sh_gequal and sh_cast do not have to be modified. The Tcl approach is also the most generous with respect to type equivalence: for example, Int and Integer are considered equivalent, and sh_cast may cast between them. Incidentally, when GHC can derive Binary, this approach becomes far more appealing. Approach 2: The universe is the tagged union > data TU = TChar Char | TBool Bool | TInt Int > > class TURepr t where > tu_inj:: t -> TU > tu_prj:: TU -> t > > instance TURepr Char where > tu_inj = TChar > tu_prj (TChar x) = x > > instance TURepr Bool where > tu_inj = TBool > tu_prj (TBool x) = x > > instance TURepr Int where > tu_inj = TInt > tu_prj (TInt x) = x > > tu_gequal x y = cmp (tu_inj x) (tu_inj y) > where > cmp (TChar x) (TChar y) = x == y > cmp (TBool x) (TBool y) = x == y > cmp (TInt x) (TInt y) = x == y > > tu_cast x = tu_prj $ tu_inj x > > tu_test1 = [tu_gequal (1::Int) (2::Int), tu_gequal True True, > tu_gequal 'a' 'b'] To add support for a new datatype, we have to add a new alternative to the declaration of the datatype TU and we have to add a new instance for the class TURepr (with the implementation of the tu_inj and tu_proj methods). We also have to add another clause to the tu_gequal function. Clearly this is the least extensible approach. Approach 3: by Cheney and Ralf Hinze's The universe is a set of inject/project pairs > data IPP a = IPPInt (a->Int) (Int->a) > | IPPChar (a->Char) (Char->a) > | IPPBool (a->Bool) (Bool->a) > > ipp_gequal (IPPInt prj inj) x y = prj x == prj y > ipp_gequal (IPPChar prj inj) x y = prj x == prj y > ipp_gequal (IPPBool prj inj) x y = prj x == prj y > > ipp_cast (IPPInt xprj xinj) x (IPPInt yprj yinj) = yinj $ xprj x > -- more should follow... > > ipp_test1 = [ipp_gequal (IPPInt id id) (1::Int) (2::Int), > ipp_gequal (IPPBool id id) True True, > ipp_gequal (IPPChar id id) 'a' 'b'] To add a new primitive datatype, we should modify the declaration of the datatype IPP and add a new alternative. We also need to add clauses to ipp_gequal and ipp_cast. Incidentally, (IPPInt id id) specifies that the type Int is equivalent only to Int (that is, for type Int equality is the same as identity). However, we can be more generous: for example, we can cast between any enumerable type and Int: *> *Main> ipp_cast (IPPInt fromEnum toEnum) () (IPPInt id id) *> 0 *> *Main> ipp_cast (IPPInt fromEnum toEnum) True (IPPInt id id) *> 1 Approach 4: the approach of an earlier message. We can have as many universes as we wish. For historical reasons, injection is called 'alter' and projection 'fetch'. > data Nil t r = Nil > data Cons t r = Cons t r > > class PList ntype vtype cdrtype where > cdr:: ntype vtype cdrtype -> cdrtype > empty:: ntype vtype cdrtype -> Bool > value:: ntype vtype cdrtype -> vtype > > instance PList Nil vtype cdrtype where > empty = const True > > instance (PList n v r) => PList Cons v' (n v r) where > empty = const False > value (Cons v r) = v > cdr (Cons v r) = r > > class TypeSeq t s where > type_index:: t -> s -> Int > fetch:: t -> s -> t > alter:: t -> s -> s > > instance (PList Cons t r) => TypeSeq t (Cons t r) where > type_index _ _ = 0 > fetch _ (Cons v _) = v > alter newv (Cons v r) = Cons newv r > > instance (PList Cons t' r', TypeSeq t r') => TypeSeq t (Cons t' r') where > type_index v s = 1 + (type_index v $ cdr s) > fetch v s = fetch v $ cdr s > alter newv (Cons v' r') = Cons v' $ alter newv r' The initial type heap (the initial universe). > th_init = Cons 'a' $ Cons True $ Cons (1::Int) $ Nil > > th_gequal tenv x y | type_index (undefined::Char) tenv == type_index x tenv > = let t1 = alter x tenv > t2 = alter y tenv > in fetch (undefined::Char) t1 == fetch (undefined::Char) t2 > th_gequal tenv x y | type_index (undefined::Int) tenv == type_index x tenv > = let t1 = alter x tenv > t2 = alter y tenv > in fetch (undefined::Int) t1 == fetch (undefined::Int) t2 > th_gequal tenv x y | type_index (undefined::Bool) tenv == type_index x tenv > = let t1 = alter x tenv > t2 = alter y tenv > in fetch (undefined::Bool) t1 == fetch (undefined::Bool) t2 > > th_cast tenv x :: y = fetch (undefined::y) $ alter x tenv > > th_test1 = [th_gequal th_init (1::Int) (2::Int), > th_gequal th_init True True, > th_gequal th_init 'a' 'b'] Let us see what is involved in adding a new datatype. Let us add Float: First, we introduce an extended universe: > th_heap2 = Cons (1.0::Float) $ th_init Then we extend the function th_gequal. We do _not_ need to modify the code of the latter: > th_gequal2 tenv x y | type_index (undefined::Float) tenv == type_index x tenv > = let t1 = alter x tenv > t2 = alter y tenv > in fetch (undefined::Float) t1 == fetch (undefined::Float) t2 > -- delegate the rest to the old th_gequal > th_gequal2 tenv x y = th_gequal tenv x y > > -- th_cast doesn't need to change > > th_test2 = [th_gequal2 th_heap2 (1.0::Float) (2.0::Float), > th_gequal2 th_heap2 (3.14::Float) (3.14::Float), > th_gequal2 th_heap2 (1::Int) (2::Int), > th_gequal2 th_heap2 True True, > th_gequal th_heap2 'a' 'b'] -- old th_gequal used here! *> *Main> th_test2 *> [False,True,False,True,False] We should emphasize the modularity of the latter approach. No _declarations_ need to be changed. In the present, static case, there are no type class instances to add. Furthermore, no code needs to be altered either. The function th_gequal can still be used with the extended heap. _______________________________________________ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell