On Dec 29, 2006, at 6:32 PM, Matthew Brecknell wrote:

breakUp s
  | L.null s  = []
  | otherwise = h : breakUp r
  where (h, r) = L.splitAt 72 s

Running this on the 2G file blows up the stack pretty quickly. Taking just
the first 1 million records (there are 20M of them) with a big stack
parameter gives about 25% productivity, with GC taking the other 75%.

My understanding is that while this looks tail-recursive, because of
laziness it really isn't.  I've tried throwing seq operators around,
but they don't seem to do much to help the efficiency.

This function by itself doesn't really have any particular behaviour
with respect to stack and heap usage, since it is just a linear mapping
from one lazy sequence to another. To understand the stack blowout,
you'll need to look at what you are doing with the result of this
function.
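
For illustration, a minimal streaming consumer (a sketch only, assuming L is
Data.ByteString.Lazy.Char8 and a made-up file name): each 72-byte chunk is
printed and then dropped before the next one is demanded, so it runs in
constant space regardless of the file size.

import qualified Data.ByteString.Lazy.Char8 as L

breakUp :: L.ByteString -> [L.ByteString]
breakUp s
  | L.null s  = []
  | otherwise = h : breakUp r
  where (h, r) = L.splitAt 72 s

main :: IO ()
main = do
  contents <- L.readFile "records.dat"   -- hypothetical input file
  -- Streaming consumer: prints each chunk as it is produced and discards it,
  -- so neither the stack nor the heap grows with the input.
  mapM_ L.putStrLn (breakUp contents)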

For example, a foldl over the result might be building a big thunk on
the heap, which could blow the stack when evaluated*. If this is the
case, you might avoid building the big thunk by using foldl', a version
of foldl which evaluates as it folds.
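
For instance, a self-contained sketch of the difference (plain lists here
rather than the 2G file; foldl' comes from Data.List):

import Data.List (foldl')

-- foldl delays the additions, building the nested thunk (((0+1)+2)+3)+...;
-- forcing that thunk at the end descends element by element and can blow the stack.
sumLazy :: Integer -> Integer
sumLazy n = foldl (+) 0 [1 .. n]

-- foldl' forces the accumulator at every step, so it folds in constant space.
sumStrict :: Integer -> Integer
sumStrict n = foldl' (+) 0 [1 .. n]

main :: IO ()
main = print (sumStrict 10000000)   -- sumLazy 10000000 may overflow the stack without optimisation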

I guess the consumer really is important (that didn't even occur to me; I was concentrating on how I was generating the list).
In an attempt to de-lazy the list, I did the following:

bs = [...]
recs' = take 1000000 (breakUp bs)
recs = foldr seq recs' recs'
print $ length recs

Would the fold be blowing the stack?
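
(For comparison, a strict-consumer sketch of what I think I'm after, assuming
the breakUp above, L = Data.ByteString.Lazy.Char8 and a made-up file name:
counting the chunks with foldl' forces the spine as it goes instead of
building one big thunk.)

import Data.List (foldl')
import qualified Data.ByteString.Lazy.Char8 as L

breakUp :: L.ByteString -> [L.ByteString]
breakUp s
  | L.null s  = []
  | otherwise = h : breakUp r
  where (h, r) = L.splitAt 72 s

main :: IO ()
main = do
  bs <- L.readFile "records.dat"   -- hypothetical input file
  -- Strict left fold: each chunk is forced and counted, then dropped,
  -- so counting the first million records needs only constant space.
  print (foldl' (\n c -> c `seq` n + 1) (0 :: Int) (take 1000000 (breakUp bs)))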

Ranjan

