[Haskell-cafe] Space usage and CSE in Haskell

Melissa O'Neill Tue, 24 Jul 2007 15:36:39 -0700

When advocating functional languages like Haskell, one of the claims I've tended to make is that referential transparency allows the language to be much more aggressive about things like common subexpression elimination (CSE) than traditional imperative languages (which need to worry about preserving proper side-effect sequencing).

But a recent example has left me thinking that maybe I've gone too far in my claims.


First, let's consider a simple consumer program, such as:

printEveryNth c l n  = do    print (c', x)
                             printEveryNth c' xs n
                       where c'   = c+n
                             x:xs = drop (n-1) l

Note that we can pass this function an infinite list, such as [1..], and it won't retain the whole list as it prints out every nth element of the list.
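To make the consumer easier to experiment with, here is a minimal sketch of a pure counterpart. The name everyNth is hypothetical and not part of the program above. Unlike printEveryNth, it returns the (count, element) pairs instead of printing them, and it stops on a finite list rather than failing the x:xs pattern match.

```haskell
module Main where

-- Hypothetical pure counterpart of printEveryNth (not from the original
-- program): pairs every nth element with its 1-based position.
everyNth :: Int -> [a] -> [(Int, a)]
everyNth n = go 0
  where
    go c l = case drop (n - 1) l of
      []     -> []                        -- finite input ran out: stop
      x : xs -> let c' = c + n
                in (c', x) : go c' xs     -- same stepping as printEveryNth

main :: IO ()
main = mapM_ print (take 3 (everyNth 1000 [1 :: Integer ..]))
-- prints (1000,1000), (2000,2000), (3000,3000), one per line
```

Because the result is produced lazily, take 3 only forces the first 3000 elements of the infinite input, and nothing here holds on to the prefix that has already been dropped.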

Now let's consider two possible infinite lists we might pass to our consumer function. We'll use a list of primes (inspired by the recent discussion of primes, but you can ignore the exact function being computed). Here's the first version:

primes = 2 : [x | x <- [3,5..], all (\p -> x `mod` p > 0) (factorsToTry x)]
    where
        factorsToTry x = takeWhile (\p -> p*p <= x) primes

As you might expect, at the point where we print the nth prime from our infinite list, we will be retaining a list that requires O(n) space.

But this simple modification allows us to use only O(sqrt(n)) space at the point we print the nth prime:

primes =
    2 : [x | x <- [3,5..], all (\p -> x `mod` p > 0) (factorsToTry x)]
    where
        slowerPrimes =
            2 : [x | x <- [3,5..], all (\p -> x `mod` p > 0) (factorsToTry x)]
        factorsToTry x = takeWhile (\p -> p*p <= x) slowerPrimes

Notice the gigantic common subexpression -- both primes and slowerPrimes define exactly the same list, but at the point where we're examining the nth element of primes, we'll only have advanced to the sqrt(n)th element of slowerPrimes.

Clearly, "simplifying" the second version of primes into the first by performing CSE actually makes the code much *worse*. This "CSE-makes-it-worse" property strikes me as "interesting".
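For anyone who wants to try the two definitions side by side, here is a minimal sketch that puts them in one module and checks that they agree on a prefix. The names primesShared and primesSplit and the [Int] signatures are assumptions added for illustration. The sketch only confirms that the lists are equal; it does not measure space, and since both are top-level bindings here, their retention may differ from what happens when a list is passed straight to a consumer. Whether the compiler merges the duplicated expression depends on the compiler version and optimization settings, which this sketch does not test (GHC does have a -fno-cse flag for turning its CSE pass off).

```haskell
module Main where

-- First version: the list is defined in terms of itself, so the trial
-- divisors come from the very list being consumed.
primesShared :: [Int]
primesShared =
    2 : [x | x <- [3,5..], all (\p -> x `mod` p > 0) (factorsToTry x)]
  where
    factorsToTry x = takeWhile (\p -> p*p <= x) primesShared

-- Second version: the trial divisors come from a separate, textually
-- identical list (slowerPrimes) that is forced much less far.
primesSplit :: [Int]
primesSplit =
    2 : [x | x <- [3,5..], all (\p -> x `mod` p > 0) (factorsToTry x)]
  where
    slowerPrimes =
        2 : [x | x <- [3,5..], all (\p -> x `mod` p > 0) (factorsToTry x)]
    factorsToTry x = takeWhile (\p -> p*p <= x) slowerPrimes

main :: IO ()
main = do
    print (take 10 primesShared)
    print (take 10 primesShared == take 10 primesSplit)
```

To compare actual heap usage, one option is to feed each list to the consumer in a separate run and look at the runtime's statistics (for example with +RTS -s).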

So, is it "interesting"...? Has anyone worked on characterizing CSE space leaks (and avoiding CSE in those cases)? FWIW, it looks like others have run into the same problem, since bug #947 in GHC (from October 2006) seems to be along similar lines.


    Melissa.

P.S. These issues do make a massive difference in practice. There is a huge difference between taking O(n) and O(sqrt(n)) space -- the difference between a couple of megabytes for the heap and tens or hundreds of megabytes.

