On Oct 9, 2005, at 11:32 PM, Ben Lippmeier wrote:
This sounds like a text-book space-leak caused by lazy evaluation.
In a lazy language, function evaluation is driven by the need to
perform IO. Because the first version of your program never prints
the parsed structure, the parser doesn't completely parse it. This
implies that the system needs to hang onto all the input data (as
well as all the partially evaluated calls to the parser) incase it
needs it later on.
I naively believed (or hoped) that, so long as I the programmer took
some care, the evaluator would be smart enough to discard unaccessed
computations and values and thus avoid a space leak. Hence the thing
I worried about was explicit references to data structures (say, the
head of very long lists) that prevent objects from being reclaimed.
Now it seems I need to also worry about the hidden aspects of the
internal implementation of lazy evaluation.
What intuition should be my guide about the "proper way" of employing
lazy evaluation? It's not yet clear to me what you mean by "the
system needs to hang onto all the input data ...". It seems
counterintuitive for the evaluator to hang onto partially evaluated
functions and input data when it can be proven (through static
analysis) that references to that data can no longer exist in the
remainder of the program execution; for example, consider
case head $ drop 500000 $ parseArts ... of
Right x -> hPutArts stderr x
in which the first 500,000 elements in the result of parseArts can't
ever be referenced after 'drop' executes, and yet the space leak
still happens (presumably while trying to skip over those 500,000
elements). Have I understood you correctly that this is the
unfortunate behavior of lazy evaluation?
The default string representation is Haskell is pretty abysmal, and
having it use hundreds of megs to represent, say a 10 meg file is
not too surprising.
Each file I process is 70MB, so inefficient representation could be a
problem if input data is buffered.
My advice is that if you don't want to fool around with it, just
swallow hard, then change fixArts to do a hPutArts to /dev/null.
Either that or
1) go read about the DeepSeq library.
2) add strictness annotations to your structure definition.
Thanks for the advice; I'll look into both.
--Young
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe