Re: XML API

Michel Fortin Tue, 26 May 2009 05:25:14 -0700

On 2009-05-24 20:31:05 -0400, Daniel Keep <daniel.keep.li...@gmail.com> said:

Michel Fortin wrote:

On 2009-05-24 12:51:43 -0400, Daniel Keep <daniel.keep.li...@gmail.com>
said:


(Cutting us mostly going back-and-forth on what a callback api would
look like.)

...

Like I said, this seems like a lot of work to bolt a callback interface
onto something a pull api is designed for.

...

Except of course that you now can't easily control the loop, nor can
you do fall-through on the cases.


Again, my definition of a callback API doesn't include an implicit loop,
just a callback. And I intend the callback to be a template argument so
it can be dispatched using function overloading and/or function
templates. So you'll have this instead:

bool more = true;  // "continue" is a reserved word in D, so use another name
do
    more = pp.readNext!(callback)();
while (more);

void callback(OpenElementToken t) { blah(t.name); }
void callback(CloseElementToken t) { ... }
void callback(CharacterDataToken t) { ... }
...

No switch statement and no inversion of control.


Except that you can't define overloads of a function inside a function.


I didn't know that. Interesting point.

Perhaps that's just a bug in the compiler that we could get fixed though. Any clue on that? I notice it also happens if you want to specialize a nested template function.

Which means you have to stuff all of your code in a set of increasingly
obtusely-named globals or private members.  Like elemAStart, elemAData,
elemAAttr, elemAClose, elemBStart, elemBData, elemBAttr, ...

But when inside a function you can still dispatch using a nested function template:


        void callback(T)(T t)
        {
                static if (is(T : OpenElementToken))
                {
                        blah(t.name);
                }
                static if (is(T : CloseElementToken))
                {
                        ...
                }
        }

It sure is a little less elegant, but you still skip a switch.
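
For illustration, here is a minimal, self-contained sketch of that nested-template dispatch. Everything in it is an assumption made up for the example: the token structs only borrow the names used in this thread, ToyParser replays a fixed token sequence instead of parsing XML, and readNext is written as a module-level function template (called through UFCS) so that a nested function can be passed as its alias argument.

```d
// Hypothetical token types; the thread only names them.
struct OpenElementToken   { string name; }
struct CloseElementToken  { string name; }
struct CharacterDataToken { string data; }

// Stand-in for the parser: it replays three fixed tokens, no real XML parsing.
struct ToyParser
{
    size_t pos; // index of the next token to deliver
}

// Delivers at most one token to `callback`; returns false once nothing is left.
// Module-level template, so a nested function can be bound to the alias.
bool readNext(alias callback)(ref ToyParser pp)
{
    immutable step = pp.pos++;
    if (step == 0)      callback(OpenElementToken("a"));
    else if (step == 1) callback(CharacterDataToken("hello"));
    else if (step == 2) callback(CloseElementToken("a"));
    else return false;
    return true;
}

string[] traceDocument()
{
    string[] log;

    // Nested function template: one body, with the branch picked at
    // compile time for each token type it is instantiated with.
    void callback(T)(T t)
    {
        static if (is(T : OpenElementToken))
            log ~= "open " ~ t.name;
        else static if (is(T : CloseElementToken))
            log ~= "close " ~ t.name;
        else
            log ~= "other";
    }

    auto pp = ToyParser();
    bool more = true;
    do
        more = pp.readNext!(callback)();
    while (more);
    return log;
}
```

Calling traceDocument() walks the three toy tokens; the caller still owns the loop, and the callback can touch local state such as log without any globals.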

...
And at that point, I've just reinvented SAX.  Well, almost.  I have
control over the loop.  I still can't simply break out of it; I've got
to mess around with flags to get that done.

Meanwhile, if I write that code with a PullParser, it's just a
collection of normal functions, one per element type with all the
related code together in one place.  Or, if I don't want them all
bundled together, I can dispatch to smaller functions.

There's no way I'm not including a pull API, most likely implemented as a range.

I have a feeling you're going to head down this path irrespective, so
I'll just hope you can figure out a way to make the api not suck.

I want to offer at least two API options (so you can choose the most appropriate parser API for what you do), and I want all of them to share the same underlying parser (so I don't write two or three parsers) with no compromise on speed.

I'm now realizing that an inversion of control can increase the performance of the parser by not having to rebranch on the current state each time you ask for a new token. I don't want to force inversion of control on anyone, but surely an API with inversion of control should be possible at full speed, and it can't be built on top of a pull parser.

So basically, the way I see it, you'd have two APIs: the inversion of control callback parser (for which you can specify a stop criterion so that it saves its state and releases control) and the range parser. The range is built on top of the inversion of control parser with a stop criterion making it stop and save its state after each token. With inlining, both APIs should run at optimal speed.
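
A rough sketch of that layering, under loud assumptions: Token, PushParser, run and TokenRange are hypothetical names invented here, whitespace-separated words stand in for XML tokens, and the stop criterion is simply the callback's boolean return value. It only shows the shape of the idea and says nothing about actual performance.

```d
// Hypothetical types, used only to illustrate the layering.
struct Token { string text; }

struct PushParser
{
    string input; // remaining unparsed input: this is the saved state
}

// Inversion-of-control entry point: feeds tokens to `sink` until the input
// ends or `sink` returns true (the stop criterion). Returns true when it
// stopped early; `p.input` then holds the rest of the input.
bool run(alias sink)(ref PushParser p)
{
    while (p.input.length)
    {
        size_t end = 0;
        while (end < p.input.length && p.input[end] != ' ')
            ++end;
        auto tok = Token(p.input[0 .. end]);
        // Consume the word and one separator before calling out, so the
        // saved state is consistent if the callback asks us to stop.
        p.input = p.input[(end < p.input.length ? end + 1 : end) .. $];
        if (sink(tok))
            return true;
    }
    return false; // input exhausted
}

// Pull API: an input range that runs the push parser one token at a time.
struct TokenRange
{
    private PushParser parser;
    private Token current;
    private bool done;

    this(string input) { parser = PushParser(input); popFront(); }

    @property bool empty() { return done; }
    @property Token front() { return current; }

    void popFront()
    {
        // Stop criterion: keep the token and hand control back right away.
        bool keep(Token t) { current = t; return true; }
        done = !run!(keep)(parser);
    }
}

// Range-style use: the caller drives the loop.
string[] collect(string input)
{
    string[] words;
    foreach (t; TokenRange(input))
        words ~= t.text;
    return words;
}

// Callback-style use: the parser drives the loop and never stops early.
size_t countTokens(string input)
{
    size_t n;
    bool counter(Token t) { ++n; return false; }
    auto p = PushParser(input);
    run!(counter)(p);
    return n;
}
```

Both entry points share the single run loop; the range is just a caller whose callback always asks the parser to stop after one token.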

Perhaps you'll say that it's complicated, but if you have a better idea capable of extracting a maximum of performance for both parser APIs, then I'd like to know.


--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/
