I am wondering what the current state of the art is of the research
of space leaks in lazy functional languages.
I am interested in this because space leaks and the various fixes
proposed in specific situations seems to me to largely nullify the
purpose of referential transparency. At least in the sense described
by John Huges in his paper "Why functional Programming matters" (originally
1984):
Since expressions can be evaluated at any time, one can
freely replace variables by their values and vice versa - that is,
programs are "referentially transparent".
This seems not be true in the presence of space leaks, according to
several examples I have seen.
Wadler has proposed a solution to a particular space leak in
"Fixing some space leaks with a garbage collector" (1987)
He says initially:
Most distrurbingly, John Hughes has pointed out that unless a
parallel evaluator is used, some programs may be inherently "leaky".
The concrete example used by Wadler and Hughes is the break function:
break (x:xs)
| x == '\n' = ([], xs)
| otherwise = x : (fst b), snd b) where
b = break xs
Wadler proposes a way to plug this space leak in the garbage collector.
He thereby introduces a kind of parallell evaluation.
This may be to help to many programs but there are also many other
sources of space leaks.
In fact it seems to me that a very simple example is the following.
(This may seem trivial to all that knows how lists in Haskell are
implemented. But for other users, perhaps with a mathematical or
some other scientific background, it may not be that obvious.)
let N be BIG in
sum [1..N]
The above runs in constant space in most implementations. But
let N be BIG in
reverse $ sum [1..N]
This requires space proportional to BIG.
As well as the following.
let N be BIG in
reverse $ reverse $ sum [1..N]
Given that Haskell is referential transparent, one would think
that it would be easy for the compiler to optimize these expressions
so they would run into constant space. However, according to what
I am told, compilers tend to take great care of PRESERVING the
original structures of program, just to _not_ inadvertently
introduce any new space leaks. I don't know if that applies to
the above examples. But it seems to apply to common subexpression
coalescion. And that may be worse enough.
The people I might introduce functional programming to might not
be very interested in the implementations. They might take
my word for that Referential Transparencey is a Good Thing.
But then, if it doesn't work in practice, both them and I will
be in Big Trouble.
I am sure these things can be resolved in one way or the other.
But how?
Sverker
ps. Wadler wrote the following in his paper (1987 already !):
A theory that allows one to reason about the space and time
efficiency of functional programs is long overdue. One reason
that such a theory has been late in appearing is that garbage-
collection and lazy evaluation can make it difficult to predict
the run-time behaviour of functional programs. In particular,
currently only _ad hoc_ methods are available to determine whether
a program suffers from space leaks. The method suggested here
does not make this problem any harder, but it does not make it
much easier, either.