Hello,

I'm trying to construct a Tree out of a list of XML parser events,
with something similar to the following:

        data Event a =
            StartTag a          -- start-tag, <a>
          | Data a              -- character data or empty tag <a/>
          | EndTag              -- end-tag, </a>

        data Tree a = Node a [Tree a]

        buildTree :: a -> [Event a] -> Tree a
        buildTree root events = Node root (fst (build events)) -- [*NOTE]
        where
            build :: [Event a] -> ([Tree a], [Event a])
            build [] = ([],[])
            build (e:es) = case e of
                Data a ->
                    let (siblings,rest) = build es
                    in  (Node a:siblings,rest)
                StartTag a ->
                    let (children,es') = build es
                        (siblings,rest) = build es'
                    in (Node a children : siblings, rest)
                EndTag -> ([],es)

        -- [*NOTE] : Assumes the input is well-formed XML,
        -- so 'snd (build events)' will be [].


My intuition (which is shaky at best when it comes to the
space usage behaviour of Haskell programs) tells me that

            (preorderTree . buildTree root)

where

            preorderTree :: Tree a -> [a]
            preorderTree t = traverse t [] where
                traverse (Tree a c) k   = a : travlist c k
                travlist (c:cs) k       = traverse c (travlist cs k)
                travlist [] k           = k

ought to run in space bounded by the depth of the tree
(which for XML documents is typically not very deep).

This turns out not to be the case; testing with Hugs
invariably fails with a "Garbage collection fails to
reclaim sufficient space" on even moderately sized
documents (5000 nodes or so).

Is there something obviously wrong with the above formulation?
If so, what can I do to fix it?

I haven't by any means ruled out the possibility that the
space leak is elsewhere in the code (although I'm building the
list of parser events with 'unfoldr' which -- again according
to my unreliable intuition -- should behave nicely space-wise).

Thanks for any advice,


--Joe English

  [EMAIL PROTECTED]



Reply via email to