Re: Why is there a space leak here?

2001-06-11 Thread Carl R. Witty

S. Alexander Jacobson [EMAIL PROTECTED] writes:

 On 6 Jun 2001, Carl R. Witty wrote:
 
  S. Alexander Jacobson [EMAIL PROTECTED] writes:
 
   For example w/ foldl:
  
   foldl + 0 [1..1]
   foldl (+) ((+) 0 1) [2..1]
   foldl (+) ((+) ((+) 0 1) 2) [3..1]
  
   Can't the implementation notice that each iteration leads to a
   larger closure and, if it is running out of space, go ahead and just
   evaluate (+) 0 1?
 
  It's complicated.  You can't (in general) know whether application of
  a function will increase or decrease the space used.  If you were
  running out of space, would you just search the whole unevaluated
  program graph for reductions which somehow seemed likely to reduce
  the space used?  Would you add such reduction nodes to some global
  list at the time they were created?
 
 I'm not clear why you can't in general notice that you are using
 more space after function application than before.  I find it hard to see why a
 program couldn't do the analysis I just did on foldl.

I wasn't worried about foldl; you assumed that (+) 0 1 got smaller if
you carried out the application.  Even for (+) on Integer, this is not
guaranteed (for large integers, if something else happens to be
holding on to the summands, evaluating the addition can increase total
space usage).
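
A small illustration of Carl's point (this code is mine, not from the
thread, and the names are made up):

    -- Both summands stay live because the pair still references xs, so
    -- forcing the sum only *adds* a large Integer to the heap.
    demo :: (Integer, [Integer])
    demo = (sum xs, xs)
      where
        xs = [2 ^ 100000, 3 ^ 100000]

Evaluating the first component does not let either summand be collected,
so the reduction makes total space usage go up, not down.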

 You could accumulate statistics on functions that increase/decrease space
 used at runtime and evaluate those that do reduce space used...

Right, that's the sort of thing I meant by "likely" above.  But how
do you find such function applications in the global program graph, if
you seem to be running low on space?  (And you also need to realize
that some functions might usually have small outputs, and sometimes
have large outputs.)

  One portable way to implement a memoizing function in Haskell (if the
  domain of the function is countable) is to lazily build a data
  structure that contains the results of the function on every possible
  argument.  Then you evaluate the portions of the data structure that
  you need; the result on each argument is only evaluated once.  This
  probably would count as a growing expression, and it's certainly
  possible that the function on some arguments would be bottom.
 
 I don't think I understood this.  Can you clarify?

Let me know if JCAB's response wasn't enough here.

Carl Witty




Re: Why is there a space leak here?

2001-06-08 Thread S. Alexander Jacobson

On 6 Jun 2001, Carl R. Witty wrote:

 S. Alexander Jacobson [EMAIL PROTECTED] writes:

  For example w/ foldl:
 
  foldl + 0 [1..1]
  foldl (+) ((+) 0 1) [2..1]
  foldl (+) ((+) ((+) 0 1) 2) [3..1]
 
  Can't the implementation notice that each iteration leads to a
  larger closure and, if it is running out of space, go ahead and just
  evaluate (+) 0 1?

 It's complicated.  You can't (in general) know whether application of
 a function will increase or decrease the space used.  If you were
 running out of space, would you just search the whole unevaluated
 program graph for reductions which somehow seemed likely to reduce
 the space used?  Would you add such reduction nodes to some global
 list at the time they were created?

I'm not clear why you can't in general notice that you are using
more space after function application than before.  I find it hard to see why a
program couldn't do the analysis I just did on foldl.

You could accumulate statistics on functions that increase/decrease space
used at runtime and evaluate those that do reduce space used...

  I realize that there is a risk of evaluating _|_ unnecessarily, but if you
  are otherwise going to run out of memory, you might as well give it a
  shot.
 
  In practice, how often do you expect to see growing expressions that cover
  a _|_ that are not actually an error in any case?

 It's certainly possible.

You are trading off the likelihood that an exploding expression contains a
bottom against the likelihood that the programmer would prefer the
exploding expression not to explode.  Much of this type of work can be
done as test-time warnings.

 One portable way to implement a memoizing function in Haskell (if the
 domain of the function is countable) is to lazily build a data
 structure that contains the results of the function on every possible
 argument.  Then you evaluate the portions of the data structure that
 you need; the result on each argument is only evaluated once.  This
 probably would count as a growing expression, and it's certainly
 possible that the function on some arguments would be bottom.

I don't think I understood this.  Can you clarify?


-Alex-

___
S. Alexander Jacobson   Shop.Com
1-646-638-2300 voiceThe Easiest Way To Shop (sm)





Re: Why is there a space leak here? (fwd)

2001-06-07 Thread Malcolm Wallace

Hi Claus,

I've had a look at this example in GHood, and you are right, nhc98
does seem to create several copies of v-in.

 import Observe

 foo1 m
   = take m (observe "v-out" v)
   where
     v = 1 : concat (map triple (observe "v-in" v))
     triple x = [x,x,x]

 main = printO $ (foo1 100::[Int])

 Perhaps Malcolm can explain what nhc98 does with this example?

At a guess, I think the answer lies in lambda-lifting.

   nhc98 -c Leak.hs +CTS -lift -CTS

shows this code for `v' (reformatted a little to make it easier to read):

  Main.Prelude.173.v = \ v223 v224 ->
Prelude.:
  (Prelude._apply1  (Prelude.fromInteger v223)  1L)
  (Prelude._apply1
(Prelude.concat)
(Prelude.map (Main.Prelude.174.triple)
 (Observe.observe (Observe.Observable.Prelude.[]  v224)
  (LAMBDA225)
  (Prelude._apply2
(Main.Prelude.173.v) v223  v224))))

v takes two arguments; v223 represents the numeric dictionary, and
v224 the Observer dictionary.  The recursive reference to v is not
a cyclic pointer to a constant structure, but actually a function call.

I believe the real culprit is that nhc98 does not implement the
monomorphism restriction.  IIRC, the DMR applies to every group
of simple pattern bindings at the same let-bound scope, not just
the top-level, so really we ought to default-away the dictionaries,
which would solve this particular space oddity.
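
A hedged sketch of the workaround this suggests (my illustration, not
Malcolm's code): give the binding an explicit monomorphic type, so the
recursive reference is a plain cyclic pointer instead of a call that
rebuilds the list for each pair of dictionaries.

    import Observe

    foo1 :: Int -> [Int]
    foo1 m
      = take m (observe "v-out" v)
      where
        v :: [Int]    -- monomorphic, hence a single shared structure
        v = 1 : concat (map triple (observe "v-in" v))
        triple x = [x,x,x]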

Regards,
Malcolm




Re: Why is there a space leak here?

2001-06-06 Thread Carl R. Witty

S. Alexander Jacobson [EMAIL PROTECTED] writes:

 For example w/ foldl:
 
 foldl + 0 [1..1]
 foldl (+) ((+) 0 1) [2..1]
 foldl (+) ((+) ((+) 0 1) 2) [3..1]
 
 Can't the implementation notice that each iteration leads to a
 larger closure and, if it is running out of space, go ahead and just
 evaluate (+) 0 1?

It's complicated.  You can't (in general) know whether application of
a function will increase or decrease the space used.  If you were
running out of space, would you just search the whole unevaluated
program graph for reductions which somehow seemed likely to reduce
the space used?  Would you add such reduction nodes to some global
list at the time they were created?

 I realize that there is a risk of evaluating _|_ unnecessarily, but if you
 are otherwise going to run out of memory, you might as well give it a
 shot.
 
 In practice, how often do you expect to see growing expressions that cover
 a _|_ that are not actually an error in any case?

It's certainly possible.

One portable way to implement a memoizing function in Haskell (if the
domain of the function is countable) is to lazily build a data
structure that contains the results of the function on every possible
argument.  Then you evaluate the portions of the data structure that
you need; the result on each argument is only evaluated once.  This
probably would count as a growing expression, and it's certainly
possible that the function on some arguments would be bottom.
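
A minimal sketch of that idea, in case it helps (my code, not Carl's;
the names are made up, and the domain is the non-negative Ints):

    -- A lazy list holding f applied to every non-negative Int.
    -- Indexing into it evaluates each result at most once.
    memoList :: (Int -> a) -> [a]
    memoList f = map f [0..]

    memoFib :: Int -> Integer
    memoFib = (table !!)
      where
        table = memoList fib
        fib 0 = 0
        fib 1 = 1
        fib n = memoFib (n - 1) + memoFib (n - 2)

The table is exactly the kind of growing expression mentioned above: it
stays live as long as memoFib is reachable, and indexing past an
argument on which the function is bottom would diverge.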

 Hunting down memory leaks is already so obscure, that you might as well
 take a shot at solving the problem automatically...
 
 Alternatively, is there some magical way of warning about leaky
 expressions at compile time?  You don't have to ban them, but it would be
 nice if the programmer were aware of which parts of the code are likely to
 grow...

In general, this problem is uncomputable.  It might be possible to
come up with some useful approximation, but I bet that's a very
difficult research problem.

Carl Witty




Re: Why is there a space leak here?

2001-06-05 Thread Tom Moertel

Alastair David Reid wrote:
 
 Executive summary: David's program has an incredibly subtle space leak
 in it (or I'm being incredibly dumb).  I encourage the honchos (and
 would be honchos) to have a look.  Users of other compilers might give
 it a shot too.

 David Bakin wrote:
 
 Why is there a space leak in foo1 but not in foo2?

The reason that foo1 leaks space is because the middle of v grows
faster than its head.  So taking elements from v causes its in-memory
footprint to grow.  To see why this is the case, evaluate foo1 by hand:

 -- This has a space leak, e.g., when reducing (length (foo1 100))
 foo1 m
   = take m v
 where
   v = 1 : flatten (map triple v)
   triple x = [x,x,x]

Focusing on just v for now, and letting f = flatten for notation
purposes, we have

(1) v = 1 : f (map triple v)

(2)   = { unwrap v }
1 : f (map triple (1 : f (map triple v)))

(3)   = { eval map }
1 : f (triple 1 : map triple (f (map triple v)))

(4)   = { eval triple }
1 : f ([1,1,1] : map triple (f (map triple v)))

(5)   = { eval f (= flatten = foldr (++) []) }
1 : 1 : 1 : 1 : f (map triple (f (map triple v)))

In order to expose elements 2-4 of v, we had to evaluate v to the extent
that the overall expression held in memory *grew*.  Notice how in (1) we
had a single (f (map triple ...)) expression in the tail of v but in (5)
there are two such expressions, nested.

Continuing further, if we want to expose the 5th-7th elements of v, we
have to expand the expression yet even more.   Noticing that the (f (map
triple v)) subexpression in (5) is identical to the tail of (1), we can
apply the same expansion that we derived in (1)-(5) to yield

(6)   = { repeat (1)-(5) for f (map triple v) in (5) }
1 : 1 : 1 : 1 :
f (map triple (1 : 1 : 1 :
f (map triple (
f (map triple v)))))

(7)   = { eval map }
1 : 1 : 1 : 1 :
f (triple 1 : map triple (
f (map triple (
f (map triple v)))))

(8)   = { eval triple }
1 : 1 : 1 : 1 :
f ([1,1,1] : map triple (
f (map triple (
f (map triple v)))))

(9)   = { eval f }
1 : 1 : 1 : 1 : 1 : 1 : 1 :
f (map triple (
f (map triple (
f (map triple v)))))

Notice how in (9) we have three nested (f (map triple (...)))
expressions in the tail of v whereas in (5) we had only two and in (1)
we had but one?

Now you can see why foo1 has a space leak:  In order to take the Nth
element of v, v's definition must be expanded to the point where there
are 1+(N+1)/3 (f (map triple (...))) subexpressions in the tail of v
*that will never be reached*.  In other words, v's middle grows faster
than its head, ensuring that take will never consume the tail.  Taking
elements from the head only makes the middle grow larger.  The more you
take, the larger it grows.

So the problem isn't Hugs but rather the definition of v, which grows
faster than it can be consumed.

Cheers,
Tom




Re: Why is there a space leak here?

2001-06-05 Thread S. Alexander Jacobson

This whole discussion seems strange...
Is laziness an operational or a semantic issue?
Why can't Haskell implementations reduce some expressions to save space?
In particular, why can't Haskell mark expressions that grow after
evaluation, and reduce them if too much space is being consumed?

For example w/ foldl:

foldl + 0 [1..1]
foldl (+) ((+) 0 1) [2..1]
foldl (+) ((+) ((+) 0 1) 2) [3..1]

Can't the implementation notice that each iteration leads to a
larger closure and, if it is running out of space, go ahead and just
evaluate (+) 0 1?
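
An aside on the example itself, not part of the original mail: with
today's libraries the usual manual fix is a strict left fold, which
forces the accumulator at each step instead of building the chain of
(+) closures shown above.

    import Data.List (foldl')

    -- Sums [1..n] in constant space: foldl' evaluates the running
    -- total before recursing, so no nested (+) thunks pile up.
    sumTo :: Integer -> Integer
    sumTo n = foldl' (+) 0 [1..n]

foldl' also comes up later in this thread.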

I realize that there is a risk of evaluating _|_ unnecessarily, but if you
are otherwise going to run out of memory, you might as well give it a
shot.

In practice, how often do you expect to see growing expressions that cover
a _|_ that are not actually an error in any case?

Hunting down memory leaks is already so obscure, that you might as well
take a shot at solving the problem automatically...

Alternatively, is there some magical way of warning about leaky
expressions at compile time?  You don't have to ban them, but it would be
nice if the programmer were aware of which parts of the code are likely to
grow...

-Alex-




On Tue, 5 Jun 2001, Tom Moertel wrote:

 Alastair David Reid wrote:
 
  Executive summary: David's program has an incredibly subtle space leak
  in it (or I'm being incredibly dumb).  I encourage the honchos (and
  would be honchos) to have a look.  Users of other compilers might give
  it a shot too.

  David Bakin wrote:
 
  Why is there a space leak in foo1 but not in foo2?

 The reason that foo1 leaks space is because the middle of v grows
 faster than its head.  So taking elements from v causes its in-memory
 footprint to grow.  To see why this is the case, evaluate foo1 by hand:

  -- This has a space leak, e.g., when reducing (length (foo1 100))
  foo1 m
= take m v
  where
v = 1 : flatten (map triple v)
triple x = [x,x,x]

 Focusing on just v for now, and letting f = flatten for notation
 purposes, we have

 (1) v = 1 : f (map triple v)

 (2)   = { unwrap v }
 1 : f (map triple (1 : f (map triple v)))

 (3)   = { eval map }
 1 : f (triple 1 : map triple (f (map triple v)))

 (4)   = { eval triple }
 1 : f ([1,1,1] : map triple (f (map triple v)))

 (5)   = { eval f (= flatten = foldr (++) []) }
 1 : 1 : 1 : 1 : f (map triple (f (map triple v)))

 In order to expose elements 2-4 of v, we had to evaluate v to the extent
 that the overall expression held in memory *grew*.  Notice how in (1) we
 had a single (f (map triple ...)) expression in the tail of v but in (5)
 there are two such expressions, nested.

 Continuing further, if we want to expose the 5th-7th elements of v, we
 have to expand the expression yet even more.   Noticing that the (f (map
 triple v)) subexpression in (5) is identical to the tail of (1), we can
 apply the same expansion that we derived in (1)-(5) to yield

 (6)   = { repeat (1)-(5) for f (map triple v) in (5) }
 1 : 1 : 1 : 1 :
 f (map triple (1 : 1 : 1 :
 f (map triple (
 f (map triple v)))))

 (7)   = { eval map }
 1 : 1 : 1 : 1 :
 f (triple 1 : map triple (
 f (map triple (
 f (map triple v)))))

 (8)   = { eval triple }
 1 : 1 : 1 : 1 :
 f ([1,1,1] : map triple (
 f (map triple (
 f (map triple v)))))

 (9)   = { eval f }
 1 : 1 : 1 : 1 : 1 : 1 : 1 :
 f (map triple (
 f (map triple (
 f (map triple v)))))

 Notice how in (9) we have three nested (f (map triple (...)))
 expressions in the tail of v whereas in (5) we had only two and in (1)
 we had but one?

 Now you can see why foo1 has a space leak:  In order to take the Nth
 element of v, v's definition must be expanded to the point where there
 are 1+(N+1)/3 (f (map triple (...))) subexpressions in the tail of v
 *that will never be reached*.  In other words, v's middle grows faster
 than its head, ensuring that take will never consume the tail.  Taking
 elements from the head only makes the middle grow larger.  The more you
 take, the larger it grows.

 So the problem isn't Hugs but rather the definition of v, which grows
 faster than it can be consumed.

 Cheers,
 Tom



___
S. Alexander Jacobson   Shop.Com
1-646-638-2300 voiceThe Easiest Way To Shop (sm)





Re: Why is there a space leak here?

2001-06-05 Thread Wojciech Moczydlowski, Jr

On Tue, 5 Jun 2001, Tom Moertel wrote:

 The reason that foo1 leaks space is because the middle of v grows
 faster than its head.  So taking elements from v causes its in-memory
 footprint to grow.  To see why this is the case, evaluate foo1 by hand:
 
 So the problem isn't Hugs but rather the definition of v, which grows
 faster than it can be consumed.

 Tom

How come, then, that the very same program compiled under nhc98 evaluates without
any problem, with memory usage below 1M during its execution?

Wojciech Moczydlowski, Jr





Re: Why is there a space leak here?

2001-06-05 Thread Tom Moertel

Wojciech Moczydlowski, Jr wrote:
 
 How come, then, that the very same program compiled under nhc98 evaluates without
 any problem, with memory usage below 1M during its execution?

My claim was that v (as defined) grew faster than it could be consumed,
not that (length (foo1 n)) couldn't be evaluated in constant space.

Even so, your results suggest that nhc98 is doing something
interesting.  Does the memory usage remain constant even if you take 10
or 100 times the number of elements?  If so, perhaps nhc98 is smart
enough to know that

length (take n x) = n

for all infinite lists x.  It might apply those smarts to optimize out
the expansion of v in foo1 when foo1's result is used as the argument of
length.  Out of curiosity, what happens if you consume those elements
with foldl' (+) 0 rather than length?  

Alternatively, if nhc98 were smart enough to prove that

foo1 n = replicate n 1

it could do away with v altogether, which would also explain the
interesting behavior.  And if nhc does *that*, my hat's off to the nhc98
folks.

Or, if your constants are hard-coded, perhaps nhc98 is evaluating the
(length (foo1 100)) expression at compile time.  What happens to
memory consumption if foo1's argument is supplied at run time?

Or maybe I'm mistaken about v.  Wouldn't be the first time I've done
something boneheaded.  ;-)

In any case, I am curious about what nhc98 is doing internally.  Any
ideas?

Cheers,
Tom




Re: Why is there a space leak here?

2001-06-05 Thread Mark Tullsen

Tom,

I noticed this post after I had just posted my own response.

You have to realize that Alastair Reid is one of the truly
great Haskell programmers on planet earth.  I'm serious.  
So, when he says "incredibly subtle space leak" I wouldn't
expect the solution to be simple.  As far as I can tell, your 
argument would also apply to foo2, which doesn't have a space leak.

I'd be happy to be proven wrong, but I think this space leak
really /is/ subtle, and seeing the problem seems to
require some /tedious/ hand-reductions, taking into account both 
the sharing and the strictness properties.  See my recent posting
for a very brute-force analysis.

- Mark

Tom Moertel wrote:
 
 Alastair David Reid wrote:
 
  Executive summary: David's program has an incredibly subtle space leak
  in it (or I'm being incredibly dumb).  I encourage the honchos (and
  would be honchos) to have a look.  Users of other compilers might give
  it a shot too.
 
  David Bakin wrote:
 
  Why is there a space leak in foo1 but not in foo2?
 
 The reason that foo1 leaks space is because the middle of v grows
 faster than its head.  So taking elements from v causes its in-memory
 footprint to grow.  To see why this is the case, evaluate foo1 by hand:
 
  -- This has a space leak, e.g., when reducing (length (foo1 100))
  foo1 m
= take m v
  where
v = 1 : flatten (map triple v)
triple x = [x,x,x]
 
 Focusing on just v for now, and letting f = flatten for notation
 purposes, we have
 
 (1) v = 1 : f (map triple v)
 
 (2)   = { unwrap v }
 1 : f (map triple (1 : f (map triple v)))
 
 (3)   = { eval map }
 1 : f (triple 1 : map triple (f (map triple v)))
 
 (4)   = { eval triple }
 1 : f ([1,1,1] : map triple (f (map triple v)))
 
 (5)   = { eval f (= flatten = foldr (++) []) }
 1 : 1 : 1 : 1 : f (map triple (f (map triple v)))
 
 In order to expose elements 2-4 of v, we had to evaluate v to the extent
 that the overall expression held in memory *grew*.  Notice how in (1) we
 had a single (f (map triple ...)) expression in the tail of v but in (5)
 there are two such expressions, nested.
 
 Continuing further, if we want to expose the 5th-7th elements of v, we
 have to expand the expression yet even more.   Noticing that the (f (map
 triple v)) subexpression in (5) is identical to the tail of (1), we can
 apply the same expansion that we derived in (1)-(5) to yield
 
 (6)   = { repeat (1)-(5) for f (map triple v) in (5) }
 1 : 1 : 1 : 1 :
 f (map triple (1 : 1 : 1 :
 f (map triple (
 f (map triple v)))))
 
 (7)   = { eval map }
 1 : 1 : 1 : 1 :
 f (triple 1 : map triple (
 f (map triple (
 f (map triple v)))))
 
 (8)   = { eval triple }
 1 : 1 : 1 : 1 :
 f ([1,1,1] : map triple (
 f (map triple (
 f (map triple v)))))
 
 (9)   = { eval f }
 1 : 1 : 1 : 1 : 1 : 1 : 1 :
 f (map triple (
 f (map triple (
 f (map triple v)))))
 
 Notice how in (9) we have three nested (f (map triple (...)))
 expressions in the tail of v whereas in (5) we had only two and in (1)
 we had but one?
 
 Now you can see why foo1 has a space leak:  In order to take the Nth
 element of v, v's definition must be expanded to the point where there
 are 1+(N+1)/3 (f (map triple (...))) subexpressions in the tail of v
 *that will never be reached*.  In other words, v's middle grows faster
 than its head, ensuring that take will never consume the tail.  Taking
 elements from the head only makes the middle grow larger.  The more you
 take, the larger it grows.
 
 So the problem isn't Hugs but rather the definition of v, which grows
 faster than it can be consumed.
 
 Cheers,
 Tom
 




Re: Why is there a space leak here?

2001-06-05 Thread Steinitz, Dominic J

I'd love it if someone could write a tutorial paper on space leaks. Even with the 
explanations that have been provided, I find it difficult to understand why 
expressions get evaluated in a particular order and why garbage collections happen at 
a given point.

Dominic.





Re: Why is there a space leak here?

2001-06-05 Thread Claus Reinke

 Alastair David Reid wrote:
  Executive summary: David's program has an incredibly subtle space leak
  in it (or I'm being incredibly dumb).  I encourage the honchos (and
  would be honchos) to have a look.  Users of other compilers might give
  it a shot too.

I think there have been several good explanations already, or
is there anything wrong with them?

As Olaf pointed out, one might use GHood for the job:
http://www.cs.ukc.ac.uk/people/staff/cr3/toolbox/haskell/GHood/

An example of how to add observations in this case, and
how to interpret the results, might be helpful to those who
haven't used GHood.

1. Code:

import Observe

foo1 m
  = take m (observe "v-out" v)
  where
    v = 1 : concat (map triple (observe "v-in" v))
    triple x = [x,x,x]

main = printO $ (foo1 100::[Int])

2. Interpretation:

Using Hugs to generate the observations, you should see the two
views of v evolving just as other mails in this thread have explained,
i.e., v-out grows at three times the rate of v-in. Remembering
that these are two views of the same structure, the problem is that
generating v-out depends on v-in, which is v-out;-) which means 
that the program execution should hold on to whatever v-out 
delivers until observers of v-in are through with it. In simpler words: 
 
"v-out" to "v-in": "you ain't seen nothing yet!"

3. Variation:

Wojciech suggested that nhc's behaviour seems to differ slightly,
which prompted me to try whether this would be visible in GHood.

For explanation: Hood is portable, in that it works with most Haskell 
implementations (special versions make use of more features, when
available, and cater for renamed IOExtras...), but as it instruments
those Haskell implementations to do its work, its observations can 
actually depend on what the implementation does.

As GHood shows you observations in more detail, you'll see even
more differences (such as: evaluation order of additions in nhc98
seems to depend on the type;-).
 
Trying the code above with nhc98-1.02 and the matching variant
of Observe.lhs, you'll see something odd: instead of two views of
v evolving in parallel, further copies of the v-in-view are created.
So, every three elements in v-out, we need another element of
v-in(1), every three elements in v-in(1), we seem to need another
element of v-in(2), etc. 

Perhaps Malcolm can explain what nhc98 does with this example?

Oh, and for all the honchos Alastair referred to: I seem to
remember that the work on preserving cycles with lazy memo
functions also had some comments about avoiding unnecessary
growth of cyclic structures. Can anyone figure out how to apply
that to this example (or tell me that it is impossible)?

Hth,
Claus

PS: Getting new email in before sending this off, I see that
  some explainers now refer to themselves as foolish, but
  I'll send this off anyway, at the risk of adding myself to
  that foolish Haskell programmers club:-)






Re: Why is there a space leak here?

2001-06-05 Thread Mark Tullsen

Tom Moertel wrote:
 
 Mark Tullsen wrote:
 
  You have to realize that Alastair Reid is one of the truly
  great Haskell programmers on planet earth.  I'm serious.
   So, when he says "incredibly subtle space leak" I wouldn't
  expect the solution to be simple.
 
 Whoops.  Now don't I feel foolish.
 
  As far as I can tell, your argument would also apply to foo2,
  which doesn't have a space leak.
 
 Hmmm...  Let's see.
 
 foo2 m
   = take m v
 where
   v = 1 : flatten (map single v)
   single x = [x]
 
 v = 1 : flatten (map single v)
   = 1 : flatten (map single (1 : flatten (map single v)))
   = 1 : flatten (single 1 : map single (flatten (map single v)))
   = 1 : flatten ([1] : map single (flatten (map single v)))
   = 1 : 1 : flatten (map single (flatten (map single v)))
   = Aaaarrh!  You're right.
 
 Now don't I feel double foolish.  :P
 
 Okay, then, what is the *right* way to reason about these things?
 
 Cheers,
 Tom
 
Tom,

I don't know if this approach is the *right* way but it's one
way.  This approach is very brute force, and I'm sure there are experts 
out there who can think and reason at a much higher level than
this.  But the brute force approach is this:

  Start evaluating your program symbolically.  You can do this at the
  source level using a CBN (call-by-need) calculus.

  If the program (the program in CBN includes the heap) starts
  growing in size faster than expected then you have a space leak.

Simple, but a bit tedious.  It would be great if we had a tool that
could output such a trace.

- Mark




Re: Why is there a space leak here?

2001-05-29 Thread David Bakin

That's a very nice visualization - exactly the kind of thing I was hoping
for.  I grabbed your papers and will look over them for more information,
thanks very much for taking the trouble! The animations you sent me - and
the ones on your page - are really nice; it would be nice to have a system
like HOPS available with GHC/GHCi (I understand it is more than the
visualization system, but that's a really useful part of it).

(I also found out that Hood didn't help for this particular purpose - though
now that I see how easy it is to use I'll be using it all the time.  But it
is carefully designed to show you (observe) exactly what has been
evaluated at a given point in the program.  Thus you can't use it (as far as
I can tell) to show the data structures that are accumulating that haven't
been processed yet - which is what you need to know to find a space
problem.)

-- Dave


- Original Message -
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Tuesday, May 29, 2001 12:49 AM
Subject: Re: Why is there a space leak here?



 David Bakin [EMAIL PROTECTED] writes:

   Which of the various tools built-in or added to Hugs, GHC, NHC, etc. would
   help me visualize what's actually going on here?  I think Hood would (using
   a newer Hugs, of course, I'm going to try it).  What else?

 I just used my old ghc-4.06 add-in ``middle-end'' ``MHA'' to generate
 a HOPS module from David's message (slightly massaged, appended below),
 and then used HOPS to generate two animations:

 http://ist.unibw-muenchen.de/kahl/MHA/Bakin_foo1_20.ps.gz
 http://ist.unibw-muenchen.de/kahl/MHA/Bakin_foo2_20.ps.gz

 Hold down the space key in ghostview to get the animation effect!

 The left ``fan'', present in both examples, is the result list,
 and only takes up space in reality as long as it is used.
 The right ``fan'', visible only in foo1, contains the cycle
 of the definition of v, and represents the space leak.
 The ``take'' copies cons nodes away from the cycle.

 The HOPS input was generated automatically by an unreleased
 GHC ``middle end'' that is still stuck at ghc-4.06.

 The homepage of my term graph programming system HOPS is:

 http://ist.unibw-muenchen.de/kahl/HOPS/


 Wolfram




  module Bakin where

 -- This has a space leak, e.g., when reducing (length (foo1 100))

  foo1 :: Int -> [Int]
  foo1 m
= take m v
  where
v = 1 : flatten (map triple v)
triple x = [x,x,x]

 -- This has no space leak, e.g., when reducing (length (foo2 100))

  foo2 :: Int -> [Int]
  foo2 m
= take m v
  where
v = 1 : flatten (map single v)
single x = [x]

 -- flatten a list-of-lists

  flatten :: [[a]] -> [a]
  flatten [] = []
  flatten ([]:xxs)   = flatten xxs
  flatten ((x':xs'):xxs) = x' : flatten' xs' xxs
  flatten' [] xxs= flatten xxs
  flatten' (x':xs') xxs  = x': flatten' xs' xxs






Re: Why is there a space leak here?

2001-05-29 Thread Olaf Chitil

David Bakin wrote:
 
 That's a very nice visualization - exactly the kind of thing I was hoping
 for.  I grabbed your papers and will look over them for more information,
 thanks very much for taking the trouble! The animations you sent me - and
 the ones on your page - are really nice; it would be nice to have a system
 like HOPS available with GHC/GHCi (I understand it is more than the
 visualization system, but that's a really useful part of it).
 
 (I also found out that Hood didn't help for this particular purpose - though
 now that I see how easy it is to use I'll be using it all the time.  But it
 is carefully designed to show you (observe) exactly what has been
 evaluated at a given point in the program.  Thus you can't use it (as far as
 I can tell) to show the data structures that are accumulating that haven't
 been processed yet - which is what you need to know to find a space
 problem.)

You can use GHood,
http://www.cs.ukc.ac.uk/people/staff/cr3/toolbox/haskell/GHood/ .
It animates the evaluation process at a given point in the program.
It doesn't show unevaluated expressions, but for most purposes you don't
need to see them. Most important is to see when each subexpression is
evaluated.
(an older version of Hood, which is distributed with nhc, also has an
animation facility)

At some time we also hope to provide such an animation viewer for our
Haskell tracer Hat. The trace already contains all the necessary
information.

Ciao,
Olaf

-- 
OLAF CHITIL, 
 Dept. of Computer Science, University of York, York YO10 5DD, UK. 
 URL: http://www.cs.york.ac.uk/~olaf/
 Tel: +44 1904 434756; Fax: +44 1904 432767




Why is there a space leak here?

2001-05-28 Thread David Bakin



Why is there a space leak in foo1 but not in foo2? 
(I.e., in Hugs Nov '99) foo1 eats cells (and eventually runs out) where foo2 
doesn't. That is, if I do (length (foo1 100)) I eventually run 
out of cells but (length (foo2 100)) runs fine (every GC returns basically 
the same amount of space). Something must be wrong in flatten but it 
follows the pattern of many functions in the prelude (which I'm trying to learn 
from).

I have been puzzling over this for nearly a full day (getting 
this reduced version from my own code which wasn't working). In general, 
how can I either a) analyze code looking for a space leak or b) experiment 
(e.g., using Hugs) to find a space leak? Thanks! -- 
Dave

-- This has a space leak, e.g., when reducing (length (foo1 100))
foo1 m
  = take m v
  where
    v = 1 : flatten (map triple v)
    triple x = [x,x,x]

-- This has no space leak, e.g., when reducing (length (foo2 100))
foo2 m
  = take m v
  where
    v = 1 : flatten (map single v)
    single x = [x]

-- flatten a list-of-lists
flatten :: [[a]] -> [a]
flatten []             = []
flatten ([]:xxs)       = flatten xxs
flatten ((x':xs'):xxs) = x' : flatten' xs' xxs
flatten' [] xxs        = flatten xxs
flatten' (x':xs') xxs  = x' : flatten' xs' xxs




Why is there a space leak here?

2001-05-28 Thread Tom Pledger

David Bakin writes:
 :
 | I have been puzzling over this for nearly a full day (getting this
 | reduced version from my own code which wasn't working).  In
 | general, how can I either a) analyze code looking for a space leak
 | or b) experiment (e.g., using Hugs) to find a space leak?  Thanks!
 | -- Dave

a) Look at how much of the list needs to exist at any one time.

 | -- This has a space leak, e.g., when reducing (length (foo1 100))
 | foo1 m 
 |   = take m v
 | where
 |   v = 1 : flatten (map triple v)
 |   triple x = [x,x,x]

When you consume the (3N)th cell of v, you can't yet garbage collect
the Nth cell because it will be needed for generating the (3N+1)th,
(3N+2)th and (3N+3)th.

So, as you proceed along the list, about two thirds of it must be
retained in memory.

 | -- This has no space leak, e.g., when reducing (length (foo2 100))
 | foo2 m 
 |   = take m v
 | where
 |   v = 1 : flatten (map single v)
 |   single x = [x]

By contrast, when you consume the (N+1)th cell of this v, you free up
the Nth, so foo2 runs in constant space.

 | -- flatten a list-of-lists
 | flatten :: [[a]] -> [a]
 :

Rather like concat?

Regards,
Tom




Re: Why is there a space leak here?

2001-05-28 Thread Michal Gajda

On Tue, 29 May 2001, Tom Pledger wrote:

 David Bakin writes:

 a) Look at how much of the list needs to exist at any one time.
 
  | -- This has a space leak, e.g., when reducing (length (foo1 100))
  | foo1 m 
  |   = take m v
  | where
  |   v = 1 : flatten (map triple v)
  |   triple x = [x,x,x]
 
 When you consume the (3N)th cell of v, you can't yet garbage collect
 the Nth cell because it will be needed for generating the (3N+1)th,
 (3N+2)th and (3N+3)th.
 
 So, as you proceed along the list, about two thirds of it must be
 retained in memory.

Last sentence seems false. You free up Nth cell of v when you finish with
3Nth cell of result.

  | -- This has no space leak, e.g., when reducing (length (foo2 100))
  | foo2 m
(...the only difference below:)
  |   single x = [x]

Greetings :-)
Michal Gajda
[EMAIL PROTECTED]
*knowledge-hungry student*







Re: Why is there a space leak here?

2001-05-28 Thread Tom Pledger

Michal Gajda writes:
 | On Tue, 29 May 2001, Tom Pledger wrote:
 :
 |  When you consume the (3N)th cell of v, you can't yet garbage collect
 |  the Nth cell because it will be needed for generating the (3N+1)th,
 |  (3N+2)th and (3N+3)th.
 |  
 |  So, as you proceed along the list, about two thirds of it must be
 |  retained in memory.
 | 
 | Last sentence seems false. You free up Nth cell of v when you
 | finish with 3Nth cell of result.

I counted from 0.  Scouts' honour.  Call (!!) as a witness.

;-)




Re: Why is there a space leak here?

2001-05-28 Thread David Bakin

Ah, thank you for pointing out concat to me.  (Oddly, without knowing about
concat, I had tried foldr1 (++) and also foldl1 (++) but got the same space
problem and so tried to 'factor it out'.)

OK, now I see what's going on - your explanation is good, thanks.

Which of the various tools built-in or added to Hugs, GHC, NHC, etc. would
help me visualize what's actually going on here?  I think Hood would (using
a newer Hugs, of course, I'm going to try it).  What else?

-- Dave


- Original Message -
From: Tom Pledger [EMAIL PROTECTED]
To: David Bakin [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, May 28, 2001 1:59 PM
Subject: Why is there a space leak here?


 David Bakin writes:
  :
  | I have been puzzling over this for nearly a full day (getting this
  | reduced version from my own code which wasn't working).  In
  | general, how can I either a) analyze code looking for a space leak
  | or b) experiment (e.g., using Hugs) to find a space leak?  Thanks!
  | -- Dave

 a) Look at how much of the list needs to exist at any one time.

  | -- This has a space leak, e.g., when reducing (length (foo1 100))
  | foo1 m
  |   = take m v
  | where
  |   v = 1 : flatten (map triple v)
  |   triple x = [x,x,x]

 When you consume the (3N)th cell of v, you can't yet garbage collect
 the Nth cell because it will be needed for generating the (3N+1)th,
 (3N+2)th and (3N+3)th.

 So, as you proceed along the list, about two thirds of it must be
 retained in memory.

  | -- This has no space leak, e.g., when reducing (length (foo2 100))
  | foo2 m
  |   = take m v
  | where
  |   v = 1 : flatten (map single v)
  |   single x = [x]

 By contrast, when you consume the (N+1)th cell of this v, you free up
 the Nth, so foo2 runs in constant space.

  | -- flatten a list-of-lists
  | flatten :: [[a]] -> [a]
  :

 Rather like concat?

 Regards,
 Tom






RE: Test case Re: Is there a space leak here?

2000-02-28 Thread Simon Peyton-Jones

Joe

I've not really looked at your code in detail, but this does
occur to me:

| -- deserialize builds a tree from a list of events;
| -- (deserialize . serialize) = id
| --
| -- This is the problematic function.
| --
| deserialize :: [Event a] -> Tree a
| deserialize events = head (fst (build events)) where
| build :: [Event a] -> ([Tree a], [Event a])
| build [] = ([],[])
| build (e:es) = case e of
|   Data a ->
|   let (siblings, rest) = build es
|   in  (Tree a [] : siblings, rest)
|   StartTag a ->
|   let (children,es')  = build es
|   (siblings,rest) = build es'
|   in (Tree a children : siblings, rest)
|   EndTag -> ([],es)

Consider the Data a branch. It'll be compiled to

let t        = build es
    siblings = fst t
    rest     = snd t
in ...

In many implementations, until you evaluate 'rest', you'll
keep the whole of 't'; which in turn contains 'siblings'.
So there's a danger of a leak here. 

This is a well known problem.  You can force t early by saying

case build es of
  (siblings, rest) -> ...

but that may give you a different sort of space leak: you are
trying to do everything incrementally.
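
A self-contained sketch of that rewrite applied to Joe's build function
(my transcription of the suggestion, not code from this message):

    data Tree a  = Tree a [Tree a]
    data Event a = StartTag a | Data a | EndTag

    deserialize :: [Event a] -> Tree a
    deserialize events = head (fst (build events)) where
        build :: [Event a] -> ([Tree a], [Event a])
        build []     = ([], [])
        build (e:es) = case e of
            Data a ->
                -- the case forces the pair, so 'rest' no longer keeps
                -- the whole tuple (and hence 'siblings') alive
                case build es of
                  (siblings, rest) -> (Tree a [] : siblings, rest)
            StartTag a ->
                case build es of
                  (children, es') ->
                    case build es' of
                      (siblings, rest) -> (Tree a children : siblings, rest)
            EndTag -> ([], es)

As the caveat above says, the price is that each branch now demands the
recursive result before returning, so the tree is no longer produced
incrementally.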

GHC does a standard trick, which is to perform the 'snd'
in the garbage collector, so the original form shouldn't
leak.  I don't think Hugs does.  (But it will when we release STG
Hugs.)


I may be miles astray here.

Simon



Re: Test case Re: Is there a space leak here?

2000-02-28 Thread Joe English


Simon Peyton-Jones [EMAIL PROTECTED] wrote:

 Consider the Data a branch. It'll be compiled to
   let t = build es
   siblings = fst t
   rest = snd t
   in ...
 In many implementations, until you evaluate 'rest', you'll
 keep the whole of 't'; which in turn contains 'siblings'.
 So there's a danger of a leak here.

That's more or less what I suspected...

 GHC does a standard trick, which is to perform the 'snd'
 in the garbage collector, so the original form shouldn't
 leak.  I don't think Hugs does.  (But it will when we release STG
 Hugs.)

That must be it!

I just tried the developer snapshot of STG Hugs, and (since
I couldn't get the profiler to work) ran it with a heap size
just big enough to hold the Prelude and my test case, plus ~10K
words to spare.

After fixing the *other* space leak that Malcolm Wallace noted,
(too much laziness in 'length' and 'sum') all the tests ran without
a problem.  That was using breadth=12 and depth=6, which makes Hugs98
run out of room even with the default heap size.

Problem solved!  Thanks!


--Joe English

  [EMAIL PROTECTED]



Re: Test case Re: Is there a space leak here?

2000-02-28 Thread Joe English


I wrote:  [...]
 I just tried the developer snapshot of STG Hugs, and [...]
 After fixing the *other* space leak that Malcolm Wallace noted,
 all the tests ran without a problem. [...]
 Problem solved!  Thanks!

Forgot to mention: my "real" program -- the XML parser --
also runs without a space problem under STG Hugs.


--Joe English

  [EMAIL PROTECTED]



Fwd: Re: Is there a space leak here?

2000-02-27 Thread Bill Halchin


Where is the XML stuff??

Bill Halchin


From: Joe English [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: Is there a space leak here?
Date: Sat, 26 Feb 2000 16:27:47 -0800

"Mark P Jones" [EMAIL PROTECTED] wrote:

  Joe: As you've observed, the space behavior of Haskell
  programs is often very subtle, and hard to understand.
  I glanced quickly over your program but didn't see any
  immediate signs of problems.  My first suggestion would
  be that you try using the rudimentary heap profiler that
  Hugs provides to see if this gives some insight into the
  source of the leak.

Ah!  I didn't even know Hugs had this.  Very useful!

This has turned up some interesting results... here's
what I've found so far:

Running the parser by itself (parseInstance :: String ->
[XMLEvent]) yields a nice flat space profile (actually
it's rather spiky, but it definitely runs in bounded space).
So 'parseInstance' by itself doesn't seem to have a space leak.
But when I feed its output to 'preorderTree . treeBuild' (where
treeBuild :: [XMLEvent] -> Tree XMLNode and preorderTree ::
Tree a -> [a]), the space usage grows linearly, with a sharp
dropoff very near the end.

Further investigation shows that the problem is *definitely*
in the tree builder.  It's not specific to Hugs either,
GHC behaves the same way.

  Failing that, it might be worth trying to put together
  a complete example (program and data) that demonstrates
  the problem.  I find it rather hard to think about examples
  like this in the abstract.  Having code that I can actually
  run, can make a big difference in situations like this.

I've boiled it down to a short test case; will post that
here presently.

Some more background on what I'm working on...  There are
two traditional approaches to processing SGML and XML:
the event-driven approach, where the application responds
to start-tag, end-tag, and data events; and the tree-based
approach, where the application has access to the entire
document tree.  The tree-based approach tends to be easier
to use and more flexible, but common wisdom has it that the
event-driven approach is more space-efficient.  I thought:
wouldn't it be neat if you could write programs
in a tree-based style, and automatically get good space
behaviour through the magic of lazy evaluation?

There are a lot of common XML processing tasks that
are naturally expressed as a catamorphism, tree homomorphism,
or downwards accumulation (validation, namespace processing,
many simple translations to other document types,  etc.),
all of which should run in space bounded by the depth of the tree.


--Joe English

   [EMAIL PROTECTED]






Fwd: Re: Is there a space leak here?

2000-02-27 Thread Bill Halchin


Oops! Never mind. I found!

Thnaks,

Bill

From: Joe English [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: Is there a space leak here?
Date: Sat, 26 Feb 2000 16:27:47 -0800

"Mark P Jones" [EMAIL PROTECTED] wrote:

  Joe: As you've observed, the space behavior of Haskell
  programs is often very subtle, and hard to understand.
  I glanced quickly over your program but didn't see any
  immediate signs of problems.  My first suggestion would
  be that you try using the rudimentary heap profiler that
  Hugs provides to see if this gives some insight into the
  source of the leak.

Ah!  I didn't even know Hugs had this.  Very useful!

This has turned up some interesting results... here's
what I've found so far:

Running the parser by itself (parseInstance :: String ->
[XMLEvent]) yields a nice flat space profile (actually
it's rather spiky, but it definitely runs in bounded space).
So 'parseInstance' by itself doesn't seem to have a space leak.
But when I feed its output to 'preorderTree . treeBuild' (where
treeBuild :: [XMLEvent] -> Tree XMLNode and preorderTree ::
Tree a -> [a]), the space usage grows linearly, with a sharp
dropoff very near the end.

Further investigation shows that the problem is *definitely*
in the tree builder.  It's not specific to Hugs either,
GHC behaves the same way.

  Failing that, it might be worth trying to put together
  a complete example (program and data) that demonstrates
  the problem.  I find it rather hard to think about examples
  like this in the abstract.  Having code that I can actually
  run, can make a big difference in situations like this.

I've boiled it down to a short test case; will post that
here presently.

Some more background on what I'm working on...  There are
two traditional approaches to processing SGML and XML:
the event-driven approach, where the application responds
to start-tag, end-tag, and data events; and the tree-based
approach, where the application has access to the entire
document tree.  The tree-based approach tends to be easier
to use and more flexible, but common wisdom has it that the
event-driven approach is more space-efficient.  I thought:
wouldn't it be neat if you could write programs
in a tree-based style, and automatically get good space
behaviour through the magic of lazy evaluation?

There are a lot of common XML processing tasks that
are naturally expressed as a catamorphism, tree homomorphism,
or downwards accumulation (validation, namespace processing,
many simple translations to other document types,  etc.),
all of which should run in space bounded by the depth of the tree.


--Joe English

   [EMAIL PROTECTED]






Re: Is there a space leak here?

2000-02-26 Thread Joe English


"Mark P Jones" [EMAIL PROTECTED] wrote:

 Joe: As you've observed, the space behavior of Haskell
 programs is often very subtle, and hard to understand.
 I glanced quickly over your program but didn't see any
 immediate signs of problems.  My first suggestion would
 be that you try using the rudimentary heap profiler that
 Hugs provides to see if this gives some insight into the
 source of the leak.

Ah!  I didn't even know Hugs had this.  Very useful!

This has turned up some interesting results... here's
what I've found so far:

Running the parser by itself (parseInstance :: String ->
[XMLEvent]) yields a nice flat space profile (actually
it's rather spiky, but it definitely runs in bounded space).
So 'parseInstance' by itself doesn't seem to have a space leak.
But when I feed its output to 'preorderTree . treeBuild' (where
treeBuild :: [XMLEvent] -> Tree XMLNode and preorderTree ::
Tree a -> [a]), the space usage grows linearly, with a sharp
dropoff very near the end.

Further investigation shows that the problem is *definitely*
in the tree builder.  It's not specific to Hugs either,
GHC behaves the same way.

 Failing that, it might be worth trying to put together
 a complete example (program and data) that demonstrates
 the problem.  I find it rather hard to think about examples
 like this in the abstract.  Having code that I can actually
 run, can make a big difference in situations like this.

I've boiled it down to a short test case; will post that
here presently.

Some more background on what I'm working on...  There are
two traditional approaches to processing SGML and XML:
the event-driven approach, where the application responds
to start-tag, end-tag, and data events; and the tree-based
approach, where the application has access to the entire
document tree.  The tree-based approach tends to be easier
to use and more flexible, but common wisdom has it that the
event-driven approach is more space-efficient.  I thought:
wouldn't it be neat if you could write programs
in a tree-based style, and automatically get good space
behaviour through the magic of lazy evaluation?

There are a lot of common XML processing tasks that
are naturally expressed as a catamorphism, tree homomorphism,
or downwards accumulation (validation, namespace processing,
many simple translations to other document types,  etc.),
all of which should run in space bounded by the depth of the tree.


--Joe English

  [EMAIL PROTECTED]



Test case Re: Is there a space leak here?

2000-02-26 Thread Joe English


Here's the test case...

The space profile for tests 2 and 3 look interesting;
there are n triangular "spikes" (where 'n' is the breadth
of the tree) that drop off sharply.

My hypothesis is that 'deserialize' (the problematic function,
called 'buildTree' in my earlier message) is building up
a long chain of suspensions of the form

snd . snd . snd . ... build 

that aren't getting reduced... not sure about this though.

It takes a really large tree before Hugs runs out of space
with this test case (breadth=11, depth=6 or so).  In my
real program though there's much more data stored at each
node and it fails on modestly-sized inputs.

Thanks in advance for any advice...

--Joe English

  [EMAIL PROTECTED]

--

module SpaceLeak where

data Tree a = Tree a [Tree a]
deriving (Show, Eq)

--
-- a few of the usual polytypic functions...
--
mapTree :: (a -> b) -> Tree a -> Tree b
mapTree f (Tree a c)=  Tree (f a) (map (mapTree f) c)

type TreeF a b  =  (a, [b])
cataTree:: (TreeF a b -> b) -> Tree a -> b
anaTree :: (b -> TreeF a b) -> b -> Tree a
cataTree f (Tree a c)   =  f (a,map (cataTree f) c)
anaTree g b =  let (a,bs) = g b in Tree a (map (anaTree g) bs)

--
-- and a few useful utilies...
--
preorderTree :: Tree a -> [a]
preorderTree t = traverse t [] where
traverse (Tree a c) k   = a : travlist c k
travlist (c:cs) k   = traverse c (travlist cs k)
travlist [] k   = k

sizeTree :: Tree a -> Integer
sizeTree = cataTree (\(_,l) -> 1 + sum l)

treeChildren (Tree n c) =  c

--
-- A datatype for tree serialization:
-- This is similar to the tokens returned by an XML parser.
--
data Event a =
StartTag a
  | Data a
  | EndTag
deriving (Show,Eq)

--
-- serialize turns a tree into a list of start/data/end events.
--
serialize   :: Tree a -> [Event a]
serialize t = stree t [] where
stree (Tree x []) k = Data x : k
stree (Tree x l) k  = StartTag x : slist l (EndTag : k)
slist [] k  = k
slist (t:ts) k  = stree t (slist ts k)

--
-- deserialize builds a tree from a list of events;
-- (deserialize . serialize) = id
--
-- This is the problematic function.
--
deserialize :: [Event a] -> Tree a
deserialize events = head (fst (build events)) where
    build :: [Event a] -> ([Tree a], [Event a])
    build [] = ([],[])
    build (e:es) = case e of
        Data a ->
            let (siblings, rest) = build es
            in  (Tree a [] : siblings, rest)
        StartTag a ->
            let (children,es')  = build es
                (siblings,rest) = build es'
            in (Tree a children : siblings, rest)
        EndTag -> ([],es)

--
-- 'sampleTree breadth depth' generates a tree of the specified depth,
-- where each non-leaf node node has 'breadth' children.
--
testTree breadth depth =  anaTree testCOAlg depth  where
    testCOAlg n = (n, if n > 1 then take breadth $ repeat (n-1) else [])

--
-- Quick sanity check to make sure 'deserialize' works as specified:
--
test0 n m =  testTree n m == (deserialize . serialize) (testTree n m) -- True

--
-- The following all run in bounded space:
-- try with ':set -d1000' in Hugs, n=4, m=6 or so...
-- In particular,  serialize $ testTree n m behaves nicely.
--
test1a n m = sizeTree $ testTree n m
test1b n m = length   $ preorderTree $ testTree n m
test1c n m = length   $ serialize$ testTree n m

--
-- These seem to leak space:
--
test2a n m = sizeTree $ deserialize $ serialize $ testTree n m
test2b n m = length   $ preorderTree $ deserialize $ serialize $ testTree n m

test3a n m = deserialize $ serialize $ testTree n m
test3b n m = preorderTree $ deserialize $ serialize $ testTree n m

-- This does not:
test4a n m = length $ treeChildren $ deserialize $ serialize $ testTree n m
-- But this does:
test4b n m = map (length . treeChildren) $ treeChildren
 $ deserialize $ serialize $ testTree n m

-- *EOF* --



Help! Is there a space leak here?

2000-02-23 Thread Tom Pledger

Hi.

Joe English writes:
  [...] ought to run in space bounded by the depth of the tree
  (which for XML documents is typically not very deep).
  
  This turns out not to be the case; testing with Hugs
  invariably fails with a "Garbage collection fails to
  reclaim sufficient space" on even moderately sized
  documents (5000 nodes or so).
  
  Is there something obviously wrong with the above formulation?
  If so, what can I do to fix it?

Beyond the two static errors ("Node a:siblings" should be "Node a
[]:siblings", and "(Tree a c)" should be "(Node a c)"), it's fine for
space use.  I tested it with Hugs 98 (Nov 1999), and didn't find any
sign of lurking strictness:

   Main> :t eventList
   eventList :: Int -> [Event Char]
   Main> :set +s
   Main> length . fst . build . eventList $ 100
   10
   (22700039 reductions, 38600096 cells, 161 garbage collections)
   Main> length . preorderTree . Node '*' . fst . build . eventList $ 100
   71
   (29900051 reductions, 47600115 cells, 198 garbage collections)
   Main> :set
   ...
   Current settings: +sfewui -tgl.qk -h25 -p"%s " -r$$ -c40
   ...
   Compatibility   : Haskell 98 (+98)

(The eventList function generates an Event list of the given length,
 which will result in a very shallow tree.)

  I haven't by any means ruled out the possibility that the
  space leak is elsewhere in the code [...]

AFAICS it must be.

Regards,
Tom





Re: Help! Is there a space leak here?

2000-02-22 Thread Jan Skibinski



On Sun, 6 Feb 2000, Joe English wrote:

 This turns out not to be the case; testing with Hugs
 invariably fails with a "Garbage collection fails to
 reclaim sufficient space" on even moderately sized
 documents (5000 nodes or so).

If I remember correctly, one of the past postings
explained that the current version of Hugs does not
properly handle the tail optimizations. You will
probably get a good explanation of this from someone
else.

The reason I am answering your post is that
I ran into similar problems when executing one
of the examples from the Hawk distribution. It deals
with the concrete and symbolic simulation of one
of the intel microprocessors. It supposed to simulate
an evaluation of a sequence of instructions, such
as the implementation of multiplications via additions.

In the concrete case it boils down to evaluation
of "7 * n", in a symbolic case it is "i * n",
where "n" is a running counter. Although it
looks quite simplistic here, the implementation
of those simulations is highly recursive.

Both simulations run fine for small values of
counter "n" but fail badly when "n" becomes big,
40,000 say. I happened to have a presentation
on the subject of Hawk about two months ago, and
the audience was not much impressed when they saw
Hugs failing on such simple (in their view)
examples. I knew beforehand what would happen for
large "n" and I tried to restrict my presentation
to small n's, but unfortunately the audience was
very inquisitive. You see, those people are accustomed
to running their tests for hours at a time, so it
was natural for them to ask for some big values
of n.

Not a good publicity for Hugs, unfortunately.


Jan
 




RE: Help! Is there a space leak here?

2000-02-22 Thread Mark P Jones

|   Both simulations run fine for small values of
|   counter "n" but fail badly when "n" becomes big,
|   40,000 say. I happened to have a presentation
|   on the subject of Hawk about two months ago, and
|   the audience was not much impressed when they saw
|   Hugs failing on such simple (in their view)
|   examples. I knew beforehand what would happen for
|   large "n" and I tried to restrict my presentation
|   to small n's, but unfortunately the audience was
|   very inquisitive. You see, those people are accustomed
|   to running their tests for hours a time, so it
|   was natural for them to ask for some big values
|   of n.
| 
|   Not a good publicity for Hugs, unfortunately.

Jan: I don't think you're being very fair.  Do you really
know that the failures you demonstrated were due to a bug
in Hugs?  Is it possible that they might have been caused
by bugs in the program that you were running?  Or perhaps
you simply didn't have Hugs configured with a big enough
heap for the size of the bigger problems that you tried?
If it truly was a bug in Hugs, I hope that you reported
it so that the Hugs maintainers could do something to fix
it?

Joe: As you've observed, the space behavior of Haskell
programs is often very subtle, and hard to understand.
I glanced quickly over your program but didn't see any
immediate signs of problems.  My first suggestion would
be that you try using the rudimentary heap profiler that
Hugs provides to see if this gives some insight into the
source of the leak.  (You'll probably need to be able
to recompile Hugs with profiling enabled for this.)
Failing that, it might be worth trying to put together
a complete example (program and data) that demonstrates
the problem.  I find it rather hard to think about examples
like this in the abstract.  Having code that I can actually
run, can make a big difference in situations like this.  (I'm
not actually promising that I'll have time to investigate
it myself, but I might, and so might some other readers on
this list.)

All the best,
Mark




RE: Help! Is there a space leak here?

2000-02-22 Thread Jan Skibinski


[Prompted by Sergey's message about the strange dates:
 The mess in my headers is entirely my fault.
 I have not had a chance to properly finish the
 upgrades to this machine: internationalization and
 the rest. Please forgive me for this mess]
  
On Tue, 22 Feb 2000, Mark P Jones wrote:

 Jan: I don't think you're being very fair.  

I stand corrected. But my remarks were out of
love for Hugs, not otherwise, you know! :-)

 Do you really
 know that the failures you demonstrated were due to a bug
 in Hugs?  Is it possible that they might have been caused
 by bugs in the program that you were running?  

No I don't know it. I had barely enough time to put
all those pieces together in a nice tutorial fashion,
so I could not examine them properly before the
demonstration I was giving. But this program came from
a reputable source, akin to Hugs itself, after all! :-)
I did not have any reason to suspect major problems
there.

 Or perhaps
 you simply didn't have Hugs configured with a big enough
 heap for the size of the bigger problems that you tried?

I ran the examples first on one of my resource-poor
machines here, and fiddling with the heap size did
not help. Later, the demo was run on a 256Mb machine
where I had a comfort to set a heap size to quite
a large value. Yet, it did not help.

 If it truly was a bug in Hugs, I hope that you reported
 it so that the Hugs maintainers could do something to fix
 it?

No Mark, I did not. I got caught up in some other
problems and forgot all about it.
One of these days I am going to re-examine it again.

This particular program is quite big and I remember
one other curious thing about it: evaluation of
simple expressions (2*3, say) under the Hawk prompt
was significantly slower than under the Hugs prompt.

Jan