I think this is a great idea and should become a top priority. I would probably start by switching to a type-class-based seq, after which perhaps the next step forward would become more clear.
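For concreteness, here is a minimal sketch of what a type-class-based seq might look like. The class and method names (`Seqable`, `seq'`) are hypothetical, not from the thread; the point is only that each type gets its own operational story, with the function-space instance housing the machinery Dan describes below.

```haskell
-- Hypothetical class replacing today's fully polymorphic, magic seq.
class Seqable a where
  seq' :: a -> b -> b

-- Ordinary data types keep the current behaviour: force to WHNF, return b.
instance Seqable Int where
  seq' n b = n `seq` b

instance Seqable [a] where
  seq' xs b = case xs of
    []    -> b
    _ : _ -> b

-- An instance for (a -> b) would perform the parallel bottom-search over an
-- enumeration of a, rather than pattern-matching; omitted here.

main :: IO ()
main = putStrLn (seq' (3 :: Int) "forced 3, then printed this")
```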
John L.

On Apr 1, 2014 2:54 AM, "Dan Doel" <dan.d...@gmail.com> wrote:

> In the past year or two, there have been multiple performance problems in
> various areas related to the fact that lambda abstraction is not free,
> though we tend to think of it as so. A major example of this was deriving
> of Functor. If we were to derive Functor for lists, we would end up with
> something like:
>
>     instance Functor [] where
>       fmap _ [] = []
>       fmap f (x:xs) = f x : fmap (\y -> f y) xs
>
> This definition is O(n^2) when fully evaluated, because it causes O(n) eta
> expansions of f, so we spend time following indirections proportional to
> the depth of the element in the list. This has been fixed in 7.8, but there
> are other examples. I believe lens [1], for instance, has some stuff in it
> that works very hard to avoid this sort of cost; and it's not always as
> easy to avoid as the above example. Composing with a newtype wrapper, for
> instance, causes an eta expansion that can only be seen as such at the core
> level.
>
> The obvious solution is: do eta reduction. However, this is not
> operationally sound currently. The problem is that seq is capable of
> telling the difference between the following two expressions:
>
>     undefined
>     \x -> undefined x
>
> The former causes seq to throw an exception, while the latter is considered
> defined enough to not do so. So, if we eta reduce, we can cause terminating
> programs to diverge if they make use of this feature.
>
> Luckily, there is a solution.
>
> Semantically one would usually identify the above two expressions. While I
> do believe one could construct a semantics that does distinguish them, it
> is not the usual practice. This suggests that there is a way to not
> distinguish them, perhaps even including seq. After all, the specification
> of seq is monotone and continuous regardless of whether we unify ⊥ with
> \x -> ⊥ x or insert an extra element for the latter.
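The distinction between those two expressions is easy to observe directly. The following small experiment (my own, not from the quoted message; `survivesSeq` is a hypothetical helper) catches the exception that seq raises for bottom but not for its eta expansion:

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Force `v `seq` ()` and report whether it converged.
survivesSeq :: a -> IO Bool
survivesSeq v = do
  r <- try (evaluate (v `seq` ())) :: IO (Either SomeException ())
  pure (either (const False) (const True) r)

main :: IO ()
main = do
  etaOk <- survivesSeq (\x -> undefined x)  -- the lambda is already in WHNF
  botOk <- survivesSeq undefined            -- forcing bottom itself throws
  print (etaOk, botOk)                      -- prints (True,False)
```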
> The currently problematic case is function spaces, so I'll focus on it.
> How should:
>
>     seq :: (a -> b) -> c -> c
>
> act? Well, other than an obvious bottom, we need to emit bottom whenever
> our given function is itself bottom at every input. This may at first seem
> like a problem, but it is actually quite simple. Without loss of
> generality, let us assume that we can enumerate the type a. Then, we can
> feed these values to the function, and check their results for bottom.
> Conal Elliott has prior art for this sort of thing with his unamb [2]
> package. For each value x :: a, simply compute 'f x `seq` ()' in parallel,
> and look for any successes. If we ever find one, we know the function is
> non-bottom, and we can return our value of c. If we never finish searching,
> then the function must be bottom, and seq should not terminate, so we have
> satisfied the specification.
>
> Now, some may complain about the enumeration above. But this, too, is a
> simple matter. It is well known that Haskell programs are denumerable. So
> it is quite easy to enumerate all Haskell programs that produce a value,
> check whether that value has the type we're interested in, and compute said
> value. All of this can be done in Haskell. Thus, every Haskell type is
> programmatically enumerable in Haskell, and we can use said enumeration in
> our implementation of seq for function types. I have discussed this with
> Russell O'Connor [3], and he assures me that this argument should be
> sufficient even if we consider semantic models of Haskell that contain
> values not denoted by any Haskell program, so we should be safe there.
>
> The one possible caveat is that this implementation of seq is not
> operationally uniform across all types, so the current fully polymorphic
> type of seq may not make sense.
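The parallel search described above could be sketched roughly as follows. This is only the shape of the search, under simplifying assumptions of my own: the enumeration is finite and passed in explicitly, and `seqFun` (a hypothetical name) lives in IO rather than using unamb-style unamb(iguous) choice. If f is bottom at every input, `takeMVar` blocks forever, so the call diverges, matching the specification.

```haskell
import Control.Concurrent (forkIO, killThread)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, evaluate, try)

-- Race `f x `seq` ()` for every x in the enumeration; return c as soon as
-- any one of them converges.
seqFun :: [a] -> (a -> b) -> c -> IO c
seqFun enumA f c = do
  done <- newEmptyMVar
  tids <- mapM (\x -> forkIO $ do
            r <- try (evaluate (f x `seq` ())) :: IO (Either SomeException ())
            case r of
              Right () -> putMVar done ()   -- f x converged: success
              Left _   -> pure ())          -- f x was bottom here: keep looking
          enumA
  takeMVar done                             -- block until some f x converges
  mapM_ killThread tids                     -- tidy up the remaining searches
  pure c

main :: IO ()
main = seqFun [False, True] not "not is defined somewhere" >>= putStrLn
```

A real implementation would of course need the denumerable enumeration of arbitrary types described in the quoted text; the fixed list here only stands in for it.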
> But moving back to a type-class-based approach isn't so bad, and this time
> we will have a semantically sound backing, instead of just having a class
> with the equivalent of the current magic function in it.
>
> Once this machinery is in place, we can eta reduce to our hearts' content,
> and not have to worry about breaking semantics. We'd no longer put the
> burden on programmers to use potentially unsafe hacks to avoid eta
> expansions. I apologize for not sketching an implementation of the above
> algorithm, but I'm sure it should be elementary enough to make it into GHC
> in the next couple of versions. Everyone learns about this type of thing in
> university computer science programs, no?
>
> Thoughts? Comments? Questions?
>
> Cheers,
> -- Dan
>
> [1] http://hackage.haskell.org/package/lens
> [2] http://hackage.haskell.org/package/unamb
> [3] http://r6.ca/
>
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users@haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users