[Haskell-cafe] Re: Leaves of a Tree
Tom Hawkins wrote: Any recommendations for speeding up extracting the set of leaves from a tree? data Tree = Branch Tree Tree | Leaf Int deriving (Eq, Ord) My slow, naive function: leaves :: Tree - Set Int leaves (Leaf n) = singleton n leaves (Branch left right) = union (leaves left) (leaves right) In my case, many of the branches in the tree are the same. I suspect the fixed point operation mentioned in the thread speeding up fibonacci with memoizing is applicable, but I'm having a tough time making the connection. What do you mean? The 'leaves' function itself cannot be made faster for it has to touch every leaf of the tree at least once. How large is your tree? But your tree could in reality be a DAG (directed acyclic graph), i.e. there are many shared subtrees. In this case, you'd have to fuse the tree generation and the tree destruction together. In the resulting function, you have a chance to exploit sharing. Here's an example with the fibonacci numbers: -- unfused fibtree 0 = Leaf 1 fibtree 1 = Leaf 1 fibtree n = Branch (fibtree (n-1)) (fibtree (n-2)) sumtree (Leaf x) = x sumtree (Branch x y) = sumtree x + sumtree y fib = sumtree . fibtree Here, 'sumtree' is equivalent to your 'leaves' and it's impossible to speed 'fib' up by changing 'sumtree' alone. Fusing the concatenation by equational reasoning yields fib' 0 = sumtree (fibtree 0) = sumtree (Leaf 1) = 1 fib' 1 = sumtree (fibtree 1) = sumtree (Leaf 1) = 1 fib' n = sumtree (fibtree n) = sumtree $ Branch (fibtree (n-1)) (fibtree (n-2)) = (sumtree $ fibtree (n-1)) + (sumtree $ fibtree (n-2)) = fib' (n-1) + fib' (n-2) Now, you can optimize the result by memoizing fib'. Regards, apfelmus ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Leaves of a Tree
Tom, I think inserting elements would be a lot faster than multiple unions. I would try: leafList :: Tree - [Int] leafList (Leaf n) = [n] leafList (Branch left right) = leafList left ++ leafList right leaves = fromList . leafList If you're writing many functions on Trees (or maybe even if you're not), you might consider writing a fold function and putting leafList in terms of this. Do you have experience with folds? -Chad Hello, Any recommendations for speeding up extracting the set of leaves from a tree? data Tree = Branch Tree Tree | Leaf Int deriving (Eq, Ord) My slow, naive function: leaves :: Tree - Set Int leaves (Leaf n) = singleton n leaves (Branch left right) = union (leaves left) (leaves right) In my case, many of the branches in the tree are the same. I suspect the fixed point operation mentioned in the thread speeding up fibonacci with memoizing is applicable, but I'm having a tough time making the connection. -Tom ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Leaves of a Tree
On 2/21/07, Chad Scherrer [EMAIL PROTECTED] wrote: Tom, I think inserting elements would be a lot faster than multiple unions. I would try: leafList :: Tree - [Int] leafList (Leaf n) = [n] leafList (Branch left right) = leafList left ++ leafList right leaves = fromList . leafList If you're writing many functions on Trees (or maybe even if you're not), you might consider writing a fold function and putting leafList in terms of this. Do you have experience with folds? Folding was my first approach: leaves :: Tree - Set Int leaves tree = accumLeaves Set.empty tree accumLeaves :: Set Int - Tree - Set Int accumLeaves set (Leaf n) = insert n set accumLeaves set (Branch l r) = foldl accumLeaves set [l,r] However, with this approach I quickly ran out of stack space. I found this odd, since I thought this program was tail recursive and shouldn't be using the stack. My next attempt was to use some form of memorization. Since many of the branches in my trees are equal, memorization should prevent following branches that have already been calculated. My question is, what is the best way to transform the original function to incorporate memorization? -Tom My slow, naive function: leaves :: Tree - Set Int leaves (Leaf n) = singleton n leaves (Branch left right) = union (leaves left) (leaves right) In my case, many of the branches in the tree are the same. I suspect the fixed point operation mentioned in the thread speeding up fibonacci with memoizing is applicable, but I'm having a tough time making the connection. -Tom ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Leaves of a Tree
Neil, I think this idea is better than what I had suggested, but as it stands it doesn't typecheck. Did you mean something like this? leaves :: Tree - [Int] leaves = f [] where f rest (Leaf n) = n : rest f rest (Branch l r) = f (f rest r) l -Chad --- (from Neil Mitchell) Hi Tom data Tree = Branch Tree Tree | Leaf Int deriving (Eq, Ord) leaves :: Tree - Set Int leaves (Leaf n) = singleton n leaves (Branch left right) = union (leaves left) (leaves right) The standard method for a traversal over leaves with accumulation is: leaves :: Tree - Set Int leaves x = f [] where f (Leaf n) rest = n : rest f (Branch l r) rest = f l (f r rest) This makes the construction of the list quite cheap. Then you can do the fromList trick, and that might give you a speed up. Thanks Neil ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Leaves of a Tree
Hi Tom, Tom Hawkins wrote: Folding was my first approach: leaves :: Tree - Set Int leaves tree = accumLeaves Set.empty tree accumLeaves :: Set Int - Tree - Set Int accumLeaves set (Leaf n) = insert n set accumLeaves set (Branch l r) = foldl accumLeaves set [l,r] However, with this approach I quickly ran out of stack space. I found this odd, since I thought this program was tail recursive and shouldn't be using the stack. This is a problem of tail recursion and lazy evaluation not playing nicely. See this page: http://www.haskell.org/haskellwiki/Stack_overflow My next attempt was to use some form of memorization. Since many of the branches in my trees are equal, memorization should prevent following branches that have already been calculated. My question is, what is the best way to transform the original function to incorporate memorization? I think something like this could be done, if there's some invariants maintained by the data structure. Is there any additional structure you're imposing beyond that required for the data line? -Chad ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe