On 2009-05-24 03:22:47 -0400, Daniel Keep <daniel.keep.li...@gmail.com> said:

Callbacks are "easier" to set up, but are incredibly complicated for any
sort of structured parsing.  The problem is that you can't easily change
the behaviour of the parser once it's started.

I had to write a SAX parser for a structured data format a few years
ago.  I swear that 90% of the code (and it's a monstrously huge module)
was just boilerplate to work around the bloody callback system.  I've
come to the conclusion that the SAX API is about the worst POSSIBLE way
of parsing anything more complex than a flat file that shouldn't have
been XML in the first place.

A callback API isn't necessarily SAX. It doesn't have to parse everything to completion; it could parse only the next token and call the appropriate callback.

If I can construct a range class/struct over my callback API, I'll be happy. And if I can recursively call the parser API inside a callback handler, so that I can reuse the call stack while parsing, then I'll be very happy.
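
To make the first half of that concrete, here is a rough sketch, in Python rather than D only to keep it short. StepParser, Handler and tokens are made-up names, not any existing library, and the regex tokenizer is a toy that only understands <tag>, </tag> and plain text. The point is just that a parser which fires one callback per call is enough to build a range (here, a generator) on top of it.

```python
import re

# Toy token pattern (an assumption for this sketch): <tag>, </tag>, or text.
_TOKEN = re.compile(r"<(/?)(\w+)>|([^<]+)")

class Handler:
    """Hypothetical callback set; override only what you need."""
    def on_open(self, name): pass
    def on_close(self, name): pass
    def on_text(self, text): pass

class StepParser:
    """Hypothetical callback parser that handles one token per call."""
    def __init__(self, source):
        self.source = source
        self.pos = 0

    def parse_next(self, handler):
        # Parse exactly one token and fire exactly one callback.
        # Returns False at end of input (or on input the toy pattern rejects).
        m = _TOKEN.match(self.source, self.pos)
        if m is None:
            return False
        self.pos = m.end()
        slash, name, text = m.groups()
        if text is not None:
            handler.on_text(text)
        elif slash:
            handler.on_close(name)
        else:
            handler.on_open(name)
        return True

class _Capture(Handler):
    """Stores the most recent token so it can be handed out by the range."""
    def __init__(self):
        self.token = None
    def on_open(self, name):
        self.token = ("open", name)
    def on_close(self, name):
        self.token = ("close", name)
    def on_text(self, text):
        self.token = ("text", text)

def tokens(parser):
    # The "range over the callback API": one parse_next call per element.
    capture = _Capture()
    while parser.parse_next(capture):
        yield capture.token
```

With this, list(tokens(StepParser("<a>hi</a>"))) walks the input lazily, and the consumer can stop pulling at any point without the parser having run ahead.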


Something like Tango's PullParser is the superior API because although
it's more verbose up-front, that's as bad as it gets.  Plus, you can
actually do stuff like call subroutines.

All that's really needed is a callback system that parses only one token at a time. The callback can then update the PullParser state or the token-range state, run in a loop to produce a SAX-like API, or directly do what you want to do, which may include parsing more tokens using different callbacks until you reach a closing tag.
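
A rough sketch of that last case, again in Python rather than D for brevity. OneTokenParser, step, parse_element and parse_document are names invented for the example, and the tokenizer is a toy with no attributes, entities or error handling. Each call to step parses one token and fires one of the callbacks passed to that call, so a handler can install different callbacks and keep parsing until it sees its own closing tag, using the call stack to track nesting.

```python
import re

# Toy token pattern (an assumption for this sketch): <tag>, </tag>, or text.
_TOKEN = re.compile(r"<(/?)(\w+)>|([^<]+)")

class OneTokenParser:
    """Hypothetical parser: each step() consumes one token, fires one callback."""
    def __init__(self, source):
        self.source = source
        self.pos = 0

    def step(self, on_open, on_close, on_text):
        m = _TOKEN.match(self.source, self.pos)
        if m is None:
            return False  # end of input, or something the toy pattern rejects
        # Advance before calling back, so a callback may call step() itself.
        self.pos = m.end()
        slash, name, text = m.groups()
        if text is not None:
            on_text(text)
        elif slash:
            on_close(name)
        else:
            on_open(name)
        return True

def parse_element(parser, name):
    """Collect the children of `name` until a closing tag is reached."""
    children = []
    closed = False

    def on_open(child):
        # Recursive use of the parser from inside a callback.
        children.append(parse_element(parser, child))

    def on_close(_closing):
        nonlocal closed
        closed = True  # the toy version does not check that the names match

    def on_text(text):
        children.append(text)

    while not closed and parser.step(on_open, on_close, on_text):
        pass
    return (name, children)

def parse_document(source):
    # SAX-like outer loop: keep stepping until the input runs out.
    parser = OneTokenParser(source)
    roots = []
    ignore = lambda _value: None
    while parser.step(lambda name: roots.append(parse_element(parser, name)),
                      ignore, ignore):
        pass
    return roots
```

Here the outer loop in parse_document plays the role of the SAX-style driver, while parse_element is the "do what you want directly" case: it swaps in its own callbacks for the duration of one element and returns once that element is closed.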


--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/
