On Nov 11, 2010, at 1:42 PM, Stephen Tetley wrote:

[*] In case anyone looks up MacGuffin on Wikipedia, I don't think the
description there is strictly accurate. A MacGuffin doesn't drive the
plot so much as throw the viewer off the scent.

I think Hitchcock might disagree with you.

In any case, serializing functions is as easy as you want it to be. But, there is a big caveat: You are basically embedding a compiler into your run-time. It can be pretty minimal, relying only on facts known about recursively enumerable functions:

class Serialize a where
      serialize   :: a -> ByteString
      unSerialize :: ByteString -> Maybe a  -- Parsers can fail

instance (Serialize a) => Serialize [a] where ...
instance (Serialize a, Serialize b) => Serialize (a, b) where ...
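
To make the class concrete, here is a minimal sketch of what a base-case instance might look like; it assumes Data.ByteString.Char8 and uses show/read as a stand-in for a real binary encoding:

import qualified Data.ByteString.Char8 as B

instance Serialize Bool where
      serialize      = B.pack . show
      unSerialize bs = case reads (B.unpack bs) of
                           [(x, "")] -> Just x    -- fully parsed
                           _         -> Nothing   -- parse failure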

-- We can conclude that a and b must be enumerable from the requirement that
-- f is recursively enumerable:
instance (Serialize a, Enum a, Bounded a, Serialize b, Enum b) => Serialize (a -> b) where
        serialize f = serialize (zip         [minBound .. maxBound]
                                     (fmap f [minBound .. maxBound]))
-- A Map instance could be better: we trade some serialization time for more
-- deserialization time.  (Assumes: import Data.Map (Map, fromList, (!)).)
instance (Ord a, Serialize a, Serialize b) => Serialize (Map a b) where ...

instance (Serialize a, Enum a, Bounded a, Ord a, Serialize b) => Serialize (a -> b) where
        serialize f = serialize . fromList $ zip [minBound .. maxBound]
                                                 (fmap f [minBound .. maxBound])
        unSerialize bytes = case unSerialize bytes of        -- decode the Map first
            Just table -> Just (\x -> table ! x)             -- total: the table covers the whole domain
            Nothing    -> Nothing
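
In case the fragments above are hard to piece together, here is a self-contained sketch of the whole idea. It uses String and show/read as a stand-in encoding, and the names (Dir, opposite, serializeFun, unSerializeFun) are mine, not from any library:

import qualified Data.Map as M

data Dir = North | South | East | West
      deriving (Show, Read, Eq, Ord, Enum, Bounded)

-- Tabulate f over its whole (finite) domain and encode the table.
serializeFun :: (Enum a, Bounded a, Show a, Show b) => (a -> b) -> String
serializeFun f = show [ (x, f x) | x <- [minBound .. maxBound] ]

-- Decode the table into a Map and answer queries by lookup.
unSerializeFun :: (Read a, Read b, Ord a) => String -> Maybe (a -> Maybe b)
unSerializeFun s = case reads s of
      [(table, "")] -> Just (\x -> M.lookup x (M.fromList table))
      _             -> Nothing

opposite :: Dir -> Dir
opposite North = South
opposite South = North
opposite East  = West
opposite West  = East

-- unSerializeFun (serializeFun opposite) :: Maybe (Dir -> Maybe Dir)
-- recovers a function that agrees with opposite on every Dir.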

There are potentially big problems with this approach:

(i) Many data types are not instances of Enum (though they really should be. Deriving enumerations is not that hard. I am willing to take one for the team to get GHC to figure out how to derive Enum on arbitrary types, but I'm not sure where to start. Little help?)

(ii) Time and memory. We're not encoding a series of instructions for computing pure functions; we're precomputing the results and saving them all for later. This is at least O(size of the domain) * O(cost of the function on each element). Not big in theoretical terms, but something like that could easily cause a factor of [1, \infty) slowdown in real code.

(iii) Serialized objects must be pure. In particular, you can't serialize general IO actions. I see this as a plus. It is still easy to write an algebra of serializable tokens and a non-serializable interpreter that generates IO actions from the tokens (see the sketch after this list). We do this kind of thing all the time -- we just don't usually serialize the tokens.
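
For (iii), here is a sketch of what an "algebra of serializable tokens plus a non-serializable interpreter" might look like; Command, runCommand, and the constructors are made-up names for illustration:

data Command
      = SayHello String
      | ReadFileNamed FilePath
      | Quit
      deriving (Show, Read, Eq)   -- Show/Read give a cheap serialization for the tokens

-- The interpreter lives outside the serializable world and turns
-- tokens into IO actions.
runCommand :: Command -> IO ()
runCommand (SayHello name)      = putStrLn ("Hello, " ++ name)
runCommand (ReadFileNamed path) = readFile path >>= putStrLn
runCommand Quit                 = putStrLn "Bye."

-- The tokens can be stored or sent over the wire:
--   show (SayHello "world")                 -- serialize a token
--   runCommand (read "SayHello \"world\"")  -- deserialize and interpret it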

I think (ii) is the biggest problem. And it is a big one. We basically need something like Template Haskell for runtime systems, so we can do quasi-quoting and compilation at run-time and avoid reifying the domain and its image under f.

The only thing that can serialize an (IO a) is the Haskell runtime, and it does it by running the action (and so putting its sub-steps in a series).