Re: Finalizing D2

Michel Fortin Sun, 24 May 2009 04:50:29 -0700

On 2009-05-24 03:22:47 -0400, Daniel Keep <daniel.keep.li...@gmail.com> said:

Callbacks are "easier" to set up, but are incredibly complicated for any
sort of structured parsing.  The problem is that you can't easily change
the behaviour of the parser once it's started.

I had to write a SAX parser for a structured data format a few years
ago.  I swear that 90% of the code (and it's a monstrously huge module)
was just boilerplate to work around the bloody callback system.  I've
come to the conclusion that the SAX API is about the worst POSSIBLE way
of parsing anything more complex than a flat file that shouldn't have
been XML in the first place.

A callback API isn't necessarily SAX. A callback API doesn't necessarily have to parse everything until completion; it could parse only the next token and call the appropriate callback.

If I can construct a range class/struct over my callback API I'll be happy. And if I can recursively call the parser API inside a callback handler so I can reuse the call stack while parsing then I'll be very happy.

Something like Tango's PullParser is the superior API because although
it's more verbose up-front, that's as bad as it gets.  Plus, you can
actually do stuff like call subroutines.

All that is needed really is a callback system that parses only one token. Then the callback can update the PullParser state, or the token-range state, run in a loop to produce a SAX-like API, or directly do what you want to do, which may include parsing more tokens using different callbacks until you reach a closing tag.
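
To make that concrete, here is a rough sketch of the idea in Python. Everything in it is hypothetical: OneTokenParser, parse_one, parse_all, tokens and parse_element are names made up for illustration, not Tango's or Phobos's API, and the toy tokenizer only understands bare <tag>, </tag> and text. The single primitive parses one token and calls one callback; the SAX-like loop, the pull-style token range and the recursive handler are all built on top of it.

```python
import re

# Toy grammar (an assumption for this sketch): </name>, <name>, or a run of text.
_TOKEN = re.compile(r"</(?P<close>\w+)>|<(?P<open>\w+)>|(?P<text>[^<]+)")


class OneTokenParser:
    """Hypothetical parser whose only operation is: parse one token, call one callback."""

    def __init__(self, source):
        self.source = source
        self.pos = 0

    def parse_one(self, on_open=None, on_close=None, on_text=None):
        """Parse exactly one token and dispatch it.

        Returns False when no further token can be read (end of input, or
        input this toy grammar does not understand).
        """
        m = _TOKEN.match(self.source, self.pos)
        if m is None:
            return False
        self.pos = m.end()
        kind = m.lastgroup  # name of the alternative that matched
        handler = {"open": on_open, "close": on_close, "text": on_text}[kind]
        if handler is not None:
            handler(m.group(kind))
        return True


def parse_all(parser, **handlers):
    """SAX-like API: just run the one-token primitive in a loop."""
    while parser.parse_one(**handlers):
        pass


def tokens(parser):
    """Pull-style token range: yields (kind, value) pairs one at a time."""
    while True:
        out = []
        more = parser.parse_one(
            on_open=lambda name: out.append(("open", name)),
            on_close=lambda name: out.append(("close", name)),
            on_text=lambda text: out.append(("text", text)),
        )
        if not more:
            return
        yield out[0]


def parse_element(parser, name):
    """Read the children of <name> until its closing tag, reusing the call stack."""
    children = []
    closed = []

    def on_open(child):
        # Recursive call into the parser from inside a callback.
        children.append(parse_element(parser, child))

    while not closed and parser.parse_one(on_open, closed.append, children.append):
        pass
    return (name, children)


def parse_document(source):
    """Parse a single root element into a nested (name, children) tuple."""
    parser = OneTokenParser(source)
    root = []
    parser.parse_one(on_open=lambda name: root.append(parse_element(parser, name)))
    return root[0]
```

With this, parse_document("<a>hi<b>x</b></a>") builds a nested tuple by recursing inside the open-tag callback, while list(tokens(OneTokenParser(...))) gives the flat pull-style view of the same input. Handlers can be swapped on every call, which is what lets the parsing behaviour change once the parser has started.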



--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/
