[Haskell-cafe] Re: Leaves of a Tree

2007-02-22 Thread apfelmus
Tom Hawkins wrote:
 Any recommendations for speeding up extracting the set of leaves from a
 tree?
 
 data Tree = Branch Tree Tree | Leaf Int deriving (Eq, Ord)
 
 My slow, naive function:
 
 leaves :: Tree - Set Int
 leaves (Leaf n) = singleton n
 leaves (Branch left right) = union (leaves left) (leaves right)
 
 In my case, many of the branches in the tree are the same.  I suspect
 the fixed point operation mentioned in the thread speeding up
 fibonacci with memoizing is applicable, but I'm having a tough time
 making the connection.

What do you mean? The 'leaves' function itself cannot be made faster for
it has to touch every leaf of the tree at least once. How large is your
tree?

But your tree could in reality be a DAG (directed acyclic graph), i.e.
there are many shared subtrees. In this case, you'd have to fuse the
tree generation and the tree destruction together. In the resulting
function, you have a chance to exploit sharing. Here's an example with
the fibonacci numbers:

  -- unfused
  fibtree 0 = Leaf 1
  fibtree 1 = Leaf 1
  fibtree n = Branch (fibtree (n-1)) (fibtree (n-2))

  sumtree (Leaf x) = x
  sumtree (Branch x y) = sumtree x + sumtree y

  fib = sumtree . fibtree

Here, 'sumtree' is equivalent to your 'leaves' and it's impossible to
speed 'fib' up by changing 'sumtree' alone. Fusing the concatenation by
equational reasoning yields

  fib' 0 = sumtree (fibtree 0) = sumtree (Leaf 1)
 = 1
  fib' 1 = sumtree (fibtree 1) = sumtree (Leaf 1)
 = 1
  fib' n = sumtree (fibtree n)
 = sumtree $ Branch (fibtree (n-1)) (fibtree (n-2))
 = (sumtree $ fibtree (n-1)) + (sumtree $ fibtree (n-2))
 = fib' (n-1) + fib' (n-2)

Now, you can optimize the result by memoizing fib'.


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Leaves of a Tree

2007-02-21 Thread Chad Scherrer

Tom,

I think inserting elements would be a lot faster than multiple unions.
I would try:

leafList :: Tree - [Int]
leafList (Leaf n) = [n]
leafList (Branch left right) = leafList left ++ leafList right

leaves = fromList . leafList

If you're writing many functions on Trees (or maybe even if you're
not), you might consider writing a fold function and putting leafList
in terms of this. Do you have experience with folds?

-Chad


Hello,

Any recommendations for speeding up extracting the set of leaves from a tree?

data Tree = Branch Tree Tree | Leaf Int deriving (Eq, Ord)

My slow, naive function:

leaves :: Tree - Set Int
leaves (Leaf n) = singleton n
leaves (Branch left right) = union (leaves left) (leaves right)

In my case, many of the branches in the tree are the same.  I suspect
the fixed point operation mentioned in the thread speeding up
fibonacci with memoizing is applicable, but I'm having a tough time
making the connection.

-Tom

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Leaves of a Tree

2007-02-21 Thread Tom Hawkins

On 2/21/07, Chad Scherrer [EMAIL PROTECTED] wrote:

Tom,

I think inserting elements would be a lot faster than multiple unions.
I would try:

leafList :: Tree - [Int]
leafList (Leaf n) = [n]
leafList (Branch left right) = leafList left ++ leafList right

leaves = fromList . leafList

If you're writing many functions on Trees (or maybe even if you're
not), you might consider writing a fold function and putting leafList
in terms of this. Do you have experience with folds?


Folding was my first approach:

leaves :: Tree - Set Int
leaves tree = accumLeaves Set.empty tree

accumLeaves :: Set Int - Tree - Set Int
accumLeaves set (Leaf n) = insert n set
accumLeaves set (Branch l r) = foldl accumLeaves set [l,r]

However, with this approach I quickly ran out of stack space.  I found
this odd, since I thought this program was tail recursive and
shouldn't be using the stack.

My next attempt was to use some form of memorization.  Since many of
the branches in my trees are equal, memorization should prevent
following branches that have already been calculated.  My question is,
what is the best way to transform the original function to incorporate
memorization?

-Tom



 My slow, naive function:

 leaves :: Tree - Set Int
 leaves (Leaf n) = singleton n
 leaves (Branch left right) = union (leaves left) (leaves right)

 In my case, many of the branches in the tree are the same.  I suspect
 the fixed point operation mentioned in the thread speeding up
 fibonacci with memoizing is applicable, but I'm having a tough time
 making the connection.

 -Tom


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Leaves of a Tree

2007-02-21 Thread Chad Scherrer

Neil,

I think this idea is better than what I had suggested, but as it
stands it doesn't typecheck. Did you mean something like this?

leaves :: Tree - [Int]
leaves = f []
 where
 f rest (Leaf n) = n : rest
 f rest (Branch l r) = f (f rest r) l


-Chad
---
(from Neil Mitchell)

Hi Tom


data Tree = Branch Tree Tree | Leaf Int deriving (Eq, Ord)



leaves :: Tree - Set Int
leaves (Leaf n) = singleton n
leaves (Branch left right) = union (leaves left) (leaves right)


The standard method for a traversal over leaves with accumulation is:

leaves :: Tree - Set Int
leaves x = f []
 where
 f (Leaf n) rest = n : rest
 f (Branch l r) rest = f l (f r rest)

This makes the construction of the list quite cheap.

Then you can do the fromList trick, and that might give you a speed up.

Thanks

Neil
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Leaves of a Tree

2007-02-21 Thread Chad Scherrer

Hi Tom,

Tom Hawkins wrote:

Folding was my first approach:

leaves :: Tree - Set Int
leaves tree = accumLeaves Set.empty tree

accumLeaves :: Set Int - Tree - Set Int
accumLeaves set (Leaf n) = insert n set
accumLeaves set (Branch l r) = foldl accumLeaves set [l,r]

However, with this approach I quickly ran out of stack space.  I found
this odd, since I thought this program was tail recursive and
shouldn't be using the stack.


This is a problem of tail recursion and lazy evaluation not playing
nicely. See this page:

http://www.haskell.org/haskellwiki/Stack_overflow


My next attempt was to use some form of memorization.  Since many of
the branches in my trees are equal, memorization should prevent
following branches that have already been calculated.  My question is,
what is the best way to transform the original function to incorporate
memorization?


I think something like this could be done, if there's some invariants
maintained by the data structure. Is there any additional structure
you're imposing beyond that required for the data line?

-Chad
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe