That's really nice. Very interesting. Thank you. On Fri, Jun 25, 2010 at 9:55 PM, Edward Kmett <ekm...@gmail.com> wrote:
> I've obtained permission to repost Gershom's slides on how to deserialize > GADTs by typechecking them yourself, which are actually a literate haskell > source file, that he was rendering to slides. Consequently, I've just pasted > the content below as a literate email. > > -Edward Kmett > > - > Deserializing strongly typed values > > (four easy pieces about typechecking) > > Gershom Bazerman > > - > prior art: > Oleg (of course) > http://okmij.org/ftp/tagless-final/course/course.html > > ...but also > Stephanie Weirich > http://www.comlab.ox.ac.uk/projects/gip/school/tc.hs > > = > Ahem... > \begin{code} > {-# LANGUAGE DeriveDataTypeable, > ExistentialQuantification, > FlexibleContexts, > FlexibleInstances, > FunctionalDependencies, > GADTs, > RankNTypes, > ScopedTypeVariables > #-} > \end{code} > = > ahem. > \begin{code} > import Data.Typeable > import Data.Maybe > import Control.Monad.Error > import Control.Applicative > import qualified Data.Map as M > import Unsafe.Coerce > \end{code} > = > A simple ADT. > > \begin{code} > data SimpleExpr = SOpBi String SimpleExpr SimpleExpr > | SOpUn String SimpleExpr > | SDbl Double > | SBool Bool deriving (Read, Show, Typeable) > > \end{code} > Yawn. > = > An awesome GADT! > > \begin{code} > data Expr a where > EDbl :: Double -> Expr Double > EBool :: Bool -> Expr Bool > EBoolOpBi :: BoolOpBi -> Expr Bool -> Expr Bool -> Expr Bool > ENumOpBi :: NumOpBi -> Expr Double -> Expr Double -> Expr Double > ENumOpUn :: NumOpUn -> Expr Double -> Expr Double > deriving Typeable > > data NumOpBi = Add | Sub deriving (Eq, Show) > data NumOpUn = Log | Exp deriving (Eq, Show) > data BoolOpBi = And | Or deriving (Eq, Show) > \end{code} > > The GADT is well typed. It cannot go wrong. > - > It also cannot derive show. > = > But we can write show... > > \begin{code} > showIt :: Expr a -> String > showIt (EDbl d) = show d > showIt (EBool b) = show b > showIt (EBoolOpBi op x y) = "(" ++ show op > ++ " " ++ showIt x > ++ " " ++ showIt y ++ ")" > showIt (ENumOpBi op x y) = "(" ++ show op > ++ " " ++ showIt x > ++ " " ++ showIt y ++ ")" > showIt (ENumOpUn op x) = show op ++ "(" ++ showIt x ++ ")" > \end{code} > = > And eval is *much nicer*. > It cannot go wrong --> no runtime typechecks. > > \begin{code} > evalIt :: Expr a -> a > evalIt (EDbl x) = x > evalIt (EBool x) = x > evalIt (EBoolOpBi op expr1 expr2) > | op == And = evalIt expr1 && evalIt expr2 > | op == Or = evalIt expr2 || evalIt expr2 > > evalIt (ENumOpBi op expr1 expr2) > | op == Add = evalIt expr1 + evalIt expr2 > | op == Sub = evalIt expr1 - evalIt expr2 > \end{code} > = > But how do we write read!? > > read "EBool False" = Expr Bool > read "EDbl 12" = Expr Double > > The type being read depends on the content of the string. > > Even worse, we want to read not from a string that looks obvious > to Haskell (i.e. a standard showlike instance) but from > something that looks pretty to the user -- we want to *parse*. > > So we parse into our simple ADT. > > Then we turn our simple ADT into our nice GADT. > - > But how? > > How do we go from untyped... to typed? > > [And in general -- not just into an arbitrary GADT, > but an arbitrary inhabitant of a typeclass.] > > [i.e. tagless final, etc] > > = > Take 1: > Even if we do not know what type we are creating, > we eventually will do something with it. > > So we paramaterize our typechecking function over > an arbitrary continuation. > > \begin{code} > mkExpr :: (forall a. (Show a, Typeable a) => Expr a -> r) -> SimpleExpr -> > r > mkExpr k expr = case expr of > SDbl d -> k $ EDbl d > SBool b -> k $ EBool b > SOpUn op expr1 -> case op of > "log" -> k $ mkExpr' (ENumOpUn Log) expr1 > "exp" -> k $ mkExpr' (ENumOpUn Exp) expr1 > _ -> error "bad unary op" > SOpBi op expr1 expr2 -> case op of > "add" -> k $ mkExprBi (ENumOpBi Add) expr1 expr2 > "sub" -> k $ mkExprBi (ENumOpBi Sub) expr1 expr2 > \end{code} > = > Where's the typechecking? > > \begin{code} > mkExpr' k expr = mkExpr (appCast $ k) expr > > > mkExprBi k expr1 expr2 = mkExpr' (mkExpr' k expr1) expr2 > > > appCast :: forall a b c r. (Typeable a, Typeable b) => (c a -> r) -> c b -> > r > appCast f x = maybe err f $ gcast x > where err = error $ "Type error. Expected: " ++ show (typeOf > (undefined::a)) > ++ ", Inferred: " ++ show (typeOf (undefined::b)) > \end{code} > > ... We let Haskell do all the work! > = > Hmmm... the continuation can be anything. > So let's just make it an existential constructor. > > \begin{code} > data ExprBox = forall a. Typeable a => ExprBox (Expr a) > > appExprBox :: (forall a. Expr a -> res) -> ExprBox -> res > appExprBox f (ExprBox x) = f x > > tcCast :: forall a b c. (Typeable a, Typeable b) => Expr a -> Either String > (Expr b) > tcCast x = maybe err Right $ gcast x > where err = Left $ "Type error. Expected: " ++ show (typeOf > (undefined::a)) > ++ ", Inferred: " ++ show (typeOf (undefined::b)) > \end{code} > > Now we can delay deciding what to apply until later. > Typecheck once, execute repeatedly! > > = > One more trick -- monadic notation lets us > extend the context of unpacked existentials > to the end of the do block > > \begin{code} > retBox x = return (ExprBox $ x, typeOf x) > > typeCheck :: SimpleExpr -> Either String (ExprBox, TypeRep) > typeCheck (SDbl d) = retBox (EDbl d) > typeCheck (SBool b) = retBox (EBool b) > typeCheck (SOpBi op s1 s2) = case op of > "add" -> tcBiOp (ENumOpBi Add) > "sub" -> tcBiOp (ENumOpBi Sub) > "and" -> tcBiOp (EBoolOpBi And) > "or" -> tcBiOp (EBoolOpBi Or) > where > tcBiOp constr = do > (ExprBox e1, _) <- typeCheck s1 > (ExprBox e2, _) <- typeCheck s2 > e1' <- tcCast e1 > e2' <- tcCast e2 > retBox $ constr e1' e2' > \end{code} > = > So that's fine for *very* simple expressions. > How does it work for interesting GADTs? > (like, for example, HOAS)? > > (The prior art doesn't demonstrate HOAS -- > it uses DeBruijn.) > = > Our simple world > > \begin{code} > type Ident = String > type TypeStr = String > > data STerm = SNum Double > | SApp STerm STerm > | SVar Ident > | SLam Ident TypeRep STerm > \end{code} > > Note.. terms are Church style -- each var introduced > has a definite type. > > Determining this type is left as an exercise. > > = > Over the rainbow in well-typed land. > > \begin{code} > data Term a where > TNum :: Double -> Term Double > TApp :: Term (a -> b) -> Term a -> Term b > TLam :: Typeable a => (Term a -> Term b) -> Term (a -> b) > TVar :: Typeable a => Int -> Term a > deriving Typeable > \end{code} > > Wait! DeBrujin (TVar) *and* HOAS (TLam)! The worst of both worlds. > > Don't worry. In the final product all TVars are eliminated by construction. > > Exercise to audience: rewrite the code so that TVar can be eliminated > from the Term type. > = > Show and eval... > > \begin{code} > showT :: Int -> Term a -> String > showT c (TNum d) = show d > showT c (TApp f x) = "App (" ++ showT c f > ++ ") (" ++ showT c x ++ ")" > showT c (TLam f) = "Lam " ++ ("a"++show c) > ++ " -> " ++ (showT (succ c) $ f (TVar c)) > showT c (TVar i) = "a"++show i > > runT :: Term a -> Term a > runT (TNum d) = (TNum d) > runT (TLam f) = (TLam f) > runT (TApp f x) = case runT f of TLam f' -> runT (f' x) > runT (TVar i) = error (show i) > \end{code} > = > The existential > > \begin{code} > data TermBox = forall a. Typeable a => TermBox (Term a) > appTermBox :: (forall a. Typeable a => Term a -> res) -> TermBox -> res > appTermBox f (TermBox x) = f x > \end{code} > = > The typechecker returns a box *and* a typeRep. > > Cast is the usual trick. > > \begin{code} > retTBox :: Typeable a => Term a -> Either String (TermBox, TypeRep) > retTBox x = return (TermBox $ x, typeOf x) > > type Env = M.Map Ident (TermBox, TypeRep) > > trmCast :: forall a b c. (Typeable a, Typeable b) => > Term a -> Either String (Term b) > trmCast x = maybe err Right $ gcast x > where err = Left $ "Type error. Expected: " ++ > show (typeOf (undefined::a)) > ++ ", Inferred: " ++ > show (typeOf (undefined::b)) > \end{code} > > = > \begin{code} > typeCheck' :: STerm -> Env -> Either String (TermBox, TypeRep) > typeCheck' t env = go t env 0 > where > go (SNum d) _ idx = retTBox (TNum d) > go (SVar i) env idx = do > (TermBox t, _) <- maybe (fail $ "not in scope: " ++ i) > return $ M.lookup i env > retTBox $ t > \end{code} > > Nums and vars are easy. > = > App and Lam... less so. > > \begin{code} > go (SApp s1 s2) env idx = do > (TermBox e1, tr1) <- go s1 env idx > (TermBox e2, _) <- go s2 env idx > TermBox rt <- return $ mkTerm $ head $ tail $ > typeRepArgs $ head $ typeRepArgs $ tr1 > -- TypeReps have their... drawbacks. > e1' <- trmCast e1 > retTBox $ TApp e1' e2 `asTypeOf` rt > go (SLam i tr s) env idx = do > TermBox et <- return $ mkTerm tr > (TermBox e, _) <- go s > (M.insert i > (TermBox (TVar idx `asTypeOf` et),tr) > env > ) > (idx + 1) > retTBox $ TLam (\x -> subst (x `asTypeOf` et) idx e) > \end{code} > = > How does mkTerm work? > > \begin{code} > mkTerm :: TypeRep -> TermBox > mkTerm tr = go tr TermBox > where > go :: TypeRep -> (forall a. (Typeable a) => Term a -> res) -> res > go tr k > | tr == typeOf (0::Double) = k (TNum 0) > | typeRepTyCon tr == arrCon = > go (head $ typeRepArgs tr) > $ \xt -> go (head $ tail $ typeRepArgs tr) > $ \y -> k (TLam $ \x -> const y (x `asTypeOf` xt)) > > arrCon = typeRepTyCon $ typeOf (undefined::Int -> String) > \end{code} > > Same principle -- but can build arrows directly. > > Doing so requires staying cps... I think. > = > And this is how we get rid of the DeBruijn terms. > > \begin{code} > subst :: (Typeable a) => Term a -> Int -> Term b -> Term b > subst t i trm = go trm > where > go :: Term c -> Term c > go (TNum d) = (TNum d) > go (TApp f x) = TApp (go f) (go x) > go (TLam f) = TLam (\a -> go (f a)) > go (TVar i') > | i == i' = either error id $ trmCast t > | otherwise = (TVar i') > \end{code} > > Q: Now you see why DeBruijn is handy -- substitution > is otherwise a pain. > > = > But note -- all functions must be monotyped. > This is the simply typed lambda calculus. > > How do we represent TLam (\a -> a)? > > The masses demand HM polymorphism. > - > > Take 4: > > A damn dirty hack. > > = > All hacks begin with Nats. > > \begin{code} > data Z = Z deriving (Show, Typeable) > data S a = S a deriving (Show, Typeable) > \end{code} > = > typeCheck is almost the same. > > \begin{code} > typeCheck'' :: STerm -> Env -> Either String (TermBox, TypeRep) > typeCheck'' t env = go t env 0 > where > go :: STerm -> Env -> Int -> Either String (TermBox, TypeRep) > go (SNum d) _ idx = retTBox (TNum d) > go (SVar i) env idx = do > (TermBox t, _) <- maybe (fail $ "not in scope: " ++ i) > return $ M.lookup i env > retTBox $ t > \end{code} > = > \begin{code} > go (SApp s1 s2) env idx = do > (TermBox e1, tr1) <- go s1 env idx > (TermBox e2, _) <- go s2 env idx > TermBox rt <- unifyAppRet e1 e2 > e1' <- unifyAppFun e1 e2 > retTBox $ TApp e1' e2 `asTypeOf` rt > go (SLam i tr s) env idx = do > TermBox et <- return $ mkTerm' $ tr > (TermBox e, _) <- go s (M.insert i > (TermBox (TVar idx `asTypeOf` et),tr) > env) > (idx + 1) > retTBox $ TLam (\x -> subst (x `asTypeOf` et) idx e) > \end{code} > > It looks like we just factored on the nasty arrow code. > = > mkTerm is almost the same... we just extended it to deal with Nats. > \begin{code} > mkTerm' :: TypeRep -> TermBox > mkTerm' tr = go tr TermBox > where > go :: TypeRep -> (forall a. (Typeable a) => Term a -> res) -> res > go tr k > | tr == typeOf (0::Double) = k (TNum 0) > | tr == typeOf Z = k (TVar 0 :: Term Z) > | typeRepTyCon tr == succCon = go (head $ typeRepArgs tr) > $ \t -> k $ succTerm t > | isArr tr = > go (head $ typeRepArgs tr) > $ \xt -> go (head $ tail $ typeRepArgs tr) > $ \y -> k (TLam $ \x -> const y (x `asTypeOf` xt)) > | otherwise = error $ show tr > > succCon = typeRepTyCon $ typeOf (S Z) > succTerm :: Typeable b => Term b -> Term (S b) > succTerm _ = TVar 0 > \end{code} > = > Some utilities > \begin{code} > isArr :: TypeRep -> Bool > isArr x = typeRepTyCon x == (typeRepTyCon $ typeOf (undefined::Int -> > String)) > > splitArrCon :: TypeRep -> Either String (TypeRep, TypeRep) > splitArrCon x > | isArr x = case typeRepArgs x of > [a,b] -> Right (a,b) > _ -> Left $ "Expected function, inferred: " ++ show x > | otherwise = Left $ "Expected function, inferred: " ++ show x > > \end{code} > = > Give an arrow term we unify it with its argument... > and return a witness. > \begin{code} > unifyAppRet :: forall a b. (Typeable a, Typeable b) => > Term a -> Term b -> Either String TermBox > unifyAppRet x y = do > tr <- unifyAppTyps > (head $ typeRepArgs $ typeOf x) > (head $ typeRepArgs $ typeOf y) > return $ mkTerm' tr > > \end{code} > = > Yes. Actual unification. > (although this is unification of a type template, > so at least it is local) > \begin{code} > unifyAppTyps :: TypeRep -> TypeRep -> Either String TypeRep > unifyAppTyps trf trx = do > (fl,fr) <- splitArrCon trf > env <- go M.empty fl trx > subIt env fr > where > -- go yields a substitution environment. > go :: M.Map String TypeRep -> TypeRep -> > TypeRep -> Either String (M.Map String TypeRep) > go env x y > | isFree x = case M.lookup (show x) env of > Just x' -> if x' == y then return env else Left > (error "a") > Nothing -> return $ M.insert (show x) y env > | isArr x = do > (lh,rh) <- splitArrCon x > (lh',rh') <- splitArrCon y > env' <- go env lh lh' > go env' rh rh' > | otherwise = if x == y then return env else Left (error "b") > \end{code} > = > \begin{code} > -- subIt applies it > subIt :: M.Map String TypeRep -> TypeRep -> Either String TypeRep > subIt env x > | isFree x = case M.lookup (show x) env of > Just x' -> return x' > Nothing -> Left (error "c") > | isArr x = do > (lh,rh) <- splitArrCon x > lh' <- subIt env lh > rh' <- subIt env rh > return $ mkTyConApp arrCon [lh',rh'] > | otherwise = return x > succCon = typeRepTyCon $ typeOf (S Z) > zCon = typeRepTyCon $ typeOf Z > isFree x = typeRepTyCon x `elem` [zCon, succCon] > arrCon = (typeRepTyCon $ typeOf (undefined::Int -> String)) > \end{code} > > = > And now, we just have to convince GHC that they unify! > > \begin{code} > unifyAppFun :: forall a b c. (Typeable a, Typeable b, Typeable c) > => Term a -> Term b -> Either String (Term c) > unifyAppFun x y = do > unifyAppTyps (head $ typeRepArgs $ typeOf x) (head $ typeRepArgs $ typeOf > y) > return $ unsafeCoerce x > \end{code} > = > Problem solved. > > > On Fri, Jun 25, 2010 at 2:03 PM, Edward Kmett <ekm...@gmail.com> wrote: > >> It turns out that defining Read is somewhat tricky to do for a GADT. >> >> Gershom Bazerman has some very nice slides on how to survive the process >> by manual typechecking (much in the spirit of Oleg's meta-typechecking code >> referenced by Stephen's follow up below) >> >> He presented them at hac-phi this time around. >> >> I will check with him to see if I can get permission to host them >> somewhere and post a link to them here. >> >> -Edward Kmett >> >> >> On Fri, Jun 25, 2010 at 5:04 AM, <corentin.dup...@ext.mpsa.com> wrote: >> >>> >>> Hello Haskellers, >>> >>> I'm having trouble writing a Read Instance for my GATD. >>> Arg this GATD!! It causes me more problems that it solves ;) >>> Especially with no automatic deriving, it adds a lot of burden to my >>> code. >>> >>> >data Obs a where >>> > ProposedBy :: Obs Int -- The player that proposed the tested >>> rule >>> > Turn :: Obs Turn -- The current turn >>> > Official :: Obs Bool -- whereas the tested rule is official >>> > Equ :: (Eq a, Show a, Typeable a) => Obs a -> Obs a -> Obs >>> Bool >>> > Plus :: (Num a) => Obs a -> Obs a -> Obs a >>> > Time :: (Num a) => Obs a -> Obs a -> Obs a >>> > Minus :: (Num a) => Obs a -> Obs a -> Obs a >>> > And :: Obs Bool -> Obs Bool -> Obs Bool >>> > Or :: Obs Bool -> Obs Bool -> Obs Bool >>> > Not :: Obs Bool -> Obs Bool >>> > Konst :: a -> Obs a >>> >>> >>> > instance Read a => Read (Obs a) where >>> > readPrec = (prec 10 $ do >>> > Ident "ProposedBy" <- lexP >>> > return (ProposedBy)) >>> > +++ >>> > (prec 10 $ do >>> > Ident "Official" <- lexP >>> > return (Official)) >>> > (etc...) >>> >>> Observable.lhs:120:8: >>> Couldn't match expected type `Int' against inferred type `Bool' >>> Expected type: ReadPrec (Obs Int) >>> Inferred type: ReadPrec (Obs Bool) >>> >>> >>> Indeed "ProposedBy" does not have the same type that "Official". >>> Mmh how to make it all gently mix altogether? >>> >>> >>> Best, >>> Corentin >>> >>> >>> _______________________________________________ >>> Haskell-Cafe mailing list >>> Haskell-Cafe@haskell.org >>> http://www.haskell.org/mailman/listinfo/haskell-cafe >>> >> >> > > _______________________________________________ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe > >
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe