"Rene de Visser" <[EMAIL PROTECTED]> wrote:

> Even if you replace parsec, HXT is itself not
> incremental.  (It stores the whole XML document in memory as a tree,
> and the tree is not memory efficient.)

If the usage pattern of the tree is search-and-discard, then only enough
of the tree to satisfy the search needs to be stored in memory at once.
Everything from the root to the first node of interest can easily be
pruned by the garbage collector.
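
To make the search-and-discard point concrete, here is a minimal sketch.
The Elem type and findAll are hypothetical names invented for this
illustration; they are not the HXT or HaXml data types.  The tail of the
document is deliberately an error thunk, so the program can only succeed
if the search never demands anything past the first match.

```haskell
import Data.Maybe (listToMaybe)

-- Hypothetical minimal element tree (an assumption for illustration,
-- not the tree type used by HXT or HaXml).
data Elem = Elem { tag :: String, children :: [Elem] }

-- All elements with the given tag, depth-first in document order.
-- The result list is produced lazily, one match at a time.
findAll :: String -> Elem -> [Elem]
findAll name e =
  [e | tag e == name] ++ concatMap (findAll name) (children e)

-- A document whose tail is never built: demanding anything after the
-- <item> element would raise the error below.
doc :: Elem
doc = Elem "root"
  ( Elem "header" []
  : Elem "item" [Elem "name" []]
  : error "rest of document was forced" )

-- Only the path from the root to the first <item> is ever evaluated.
firstItem :: Maybe Elem
firstItem = listToMaybe (findAll "item" doc)

main :: IO ()
main = print (fmap (map tag . children) firstItem)
-- prints: Just ["name"]
```

This shows only the demand side: nothing beyond the first match is ever
constructed.  Whether the already-visited prefix is actually reclaimed
depends on no other live reference holding on to the root (here the
top-level doc binding would keep it alive).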

A paper describing the lazy parsing technique, using XML parsing as
its motivating example, is available at
    http://www.cs.york.ac.uk/~malcolm/partialparse.html
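
As a rough illustration of the difference between strict and lazy
parsing (a toy of my own for this message, not the combinators from the
paper or from polyparse): a parser that reports success or failure in
its result type cannot return anything until it has checked the whole
input, whereas a parser that commits to success can hand back results
as it goes, and only complains if and when malformed input is actually
reached.  The names strictNumbers and lazyNumbers are made up for the
example.

```haskell
import Data.Char (isDigit, isSpace)

-- Strict style: 'Just' cannot be returned until the whole input has
-- been checked, so no result is available before the end of input.
strictNumbers :: String -> Maybe [Int]
strictNumbers s = case dropWhile isSpace s of
  "" -> Just []
  s' -> case span isDigit s' of
    ("", _)    -> Nothing
    (ds, rest) -> (read ds :) <$> strictNumbers rest

-- Lazy style: commit to success and emit each number as soon as it has
-- been read.  Malformed input becomes an exception, raised only if the
-- consumer demands the result at that point.
lazyNumbers :: String -> [Int]
lazyNumbers s = case dropWhile isSpace s of
  "" -> []
  s' -> case span isDigit s' of
    ("", _)    -> error "lazyNumbers: expected a digit"
    (ds, rest) -> read ds : lazyNumbers rest

main :: IO ()
main = do
  -- The lazy parser yields results even from an endless input.
  print (take 4 (lazyNumbers (cycle "12 34 56 ")))
  -- The strict parser needs the complete (finite) input first.
  print (strictNumbers "1 2 3")
  print (strictNumbers "1 x 3")
```

The price of the lazy version is that failure is no longer visible in
the type; it surfaces as an exception partway through consuming the
result.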

> >> haxml offers the choice of non-incremental parsers and incremental
> >> parsers.

Indeed.  This lazy incremental parser for XML is available in the
development version of HaXml:
    http://www.cs.york.ac.uk/fp/HaXml-devel

The source code for partial parsing is available in a separate package:
    http://www.cs.york.ac.uk/fp/polyparse

These lazy parser combinators are roughly 2x to 5x faster than
Parsec on large inputs (although the strict variation is about 2x slower
than Parsec).

Regards,
    Malcolm
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
