Hello,
I'm trying to construct a Tree out of a list of XML parser events,
with something similar to the following:
data Event a =
StartTag a -- start-tag, <a>
| Data a -- character data or empty tag <a/>
| EndTag -- end-tag, </a>
data Tree a = Node a [Tree a]
buildTree :: a -> [Event a] -> Tree a
buildTree root events = Node root (fst (build events)) -- [*NOTE]
where
build :: [Event a] -> ([Tree a], [Event a])
build [] = ([],[])
build (e:es) = case e of
Data a ->
let (siblings,rest) = build es
in (Node a:siblings,rest)
StartTag a ->
let (children,es') = build es
(siblings,rest) = build es'
in (Node a children : siblings, rest)
EndTag -> ([],es)
-- [*NOTE] : Assumes the input is well-formed XML,
-- so 'snd (build events)' will be [].
My intuition (which is shaky at best when it comes to the
space usage behaviour of Haskell programs) tells me that
(preorderTree . buildTree root)
where
preorderTree :: Tree a -> [a]
preorderTree t = traverse t [] where
traverse (Tree a c) k = a : travlist c k
travlist (c:cs) k = traverse c (travlist cs k)
travlist [] k = k
ought to run in space bounded by the depth of the tree
(which for XML documents is typically not very deep).
This turns out not to be the case; testing with Hugs
invariably fails with a "Garbage collection fails to
reclaim sufficient space" on even moderately sized
documents (5000 nodes or so).
Is there something obviously wrong with the above formulation?
If so, what can I do to fix it?
I haven't by any means ruled out the possibility that the
space leak is elsewhere in the code (although I'm building the
list of parser events with 'unfoldr' which -- again according
to my unreliable intuition -- should behave nicely space-wise).
Thanks for any advice,
--Joe English
[EMAIL PROTECTED]