That said, I would *not* recommend rewriting any such thing. It is a *lot* of work, and as such quite unrelated to Boost's goals.
Would mapping an implementation's structure to an internal C++ structure also require quite a bit of work?
Yes! As I said, parsing is only a small part of it all. Once you hold a DOM tree, you want to manipulate it. If the tree is implemented in C++, all of this needs to be implemented anew, too.
What I did was provide a *thin* wrapper around the internal C structs used by libxml2, so every DOM manipulation call can be delegated down to libxml2. Take XPath lookup, for example: I call libxml2's XPath API, which returns a C structure (possibly) holding a node set, i.e. a list of C nodes. I just need to map these C structs back to my C++ wrapper objects and I'm done. (Luckily for me, libxml2 provides all the hooks to make that lookup very efficient...)
Imagine that with a C++ tree: you would throw away the 'implementation structure' as soon as the C++ structure is built, i.e. right after parsing has finished. So everything but parsing has to be rewritten to fit your C++ data types.
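To make the thin-wrapper idea above concrete, here is a minimal sketch. It does not use libxml2 itself: `c_node` and `c_find_by_name` are hypothetical stand-ins for a C library's node struct and for a C-level query such as an XPath call, and `node` and `find_by_name` are illustrative names, not the API of any existing wrapper. The point is only that the C++ side stores a pointer and delegates, so mapping a C result set back to C++ objects is cheap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for the C library's node struct (hypothetical; not libxml2's xmlNode).
struct c_node {
    const char* name;
    c_node*     next;  // next sibling, or nullptr
};

// Stand-in for a C-level query such as an XPath call (hypothetical).
// Stores up to `cap` siblings named `name` in `out`; returns how many were stored.
inline std::size_t c_find_by_name(c_node* first, const char* name,
                                  c_node** out, std::size_t cap) {
    std::size_t n = 0;
    for (c_node* p = first; p != nullptr && n < cap; p = p->next)
        if (std::strcmp(p->name, name) == 0) out[n++] = p;
    return n;
}

// Thin C++ handle: stores only a pointer to the C node and owns nothing.
class node {
public:
    explicit node(c_node* impl) : impl_(impl) { assert(impl_ != nullptr); }
    const char* name() const { return impl_->name; }  // inline delegation
    c_node* impl() const { return impl_; }
private:
    c_node* impl_;
};

// The C++ query delegates to the C function, then maps each C node back to a handle.
inline std::vector<node> find_by_name(node start, const char* name) {
    c_node* raw[16];  // fixed capacity keeps the sketch short
    const std::size_t count = c_find_by_name(start.impl(), name, raw, 16);
    std::vector<node> result;
    for (std::size_t i = 0; i < count; ++i) result.push_back(node(raw[i]));
    return result;
}
```

Because the handle is just a pointer, no second tree is ever built; all manipulation stays in the C structures, which is what avoids rewriting everything but the parser.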
OPTION 2: If you intend to wrap an implementation like libxml2 in a C++ interface, you would give up control over how the data is represented internally, and you would pay a slight performance penalty for the wrappers (less so if you use inlined functions). This approach would not suffer the loading penalties described above.
Right, and it is what I chose. It's working wonderfully!
OPTION 3: Writing a Boost XML/XPath parser would allow the internal structure to be optimised for C++-specific bindings, without suffering from either wrapper performance penalties or document loading/SAX parsing penalties.
Well, that would mean writing 'yet another XML library'.
What I had (and still have) in mind is a C++ interface to an existing implementation (libxml2 actually).
What if the user wants an interface to another implementation? Is it possible to standardize access to other parsers?
Good question. The API Boost exposes should of course be implementation-independent. But providing a different binding may in practice be very hard, i.e. require a lot of work. Again, don't focus on parsers only. There is *much* more...
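As a rough illustration of what an implementation-independent API could look like, here is a minimal sketch. All names (`document_backend`, `document`, `in_memory_backend`) are hypothetical and chosen for this example only; the toy backend stands in for a real binding such as one built on libxml2.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical backend interface: everything the public API needs from an implementation.
class document_backend {
public:
    virtual ~document_backend() = default;
    virtual std::string root_name() const = 0;
    virtual void rename_root(const std::string& name) = 0;
};

// Public class: user code sees only this type, never a concrete backend.
class document {
public:
    explicit document(std::unique_ptr<document_backend> backend)
        : backend_(std::move(backend)) { assert(backend_ != nullptr); }
    std::string root_name() const { return backend_->root_name(); }
    void rename_root(const std::string& name) { backend_->rename_root(name); }
private:
    std::unique_ptr<document_backend> backend_;
};

// Toy backend used only to exercise the interface (assumption: a real one
// would delegate to an existing XML library instead of holding a string).
class in_memory_backend : public document_backend {
public:
    explicit in_memory_backend(std::string root) : root_(std::move(root)) {}
    std::string root_name() const override { return root_; }
    void rename_root(const std::string& name) override { root_ = name; }
private:
    std::string root_;
};
```

Note that every operation the public API offers (tree manipulation, XPath, validation, and so on) adds a member to the backend interface that each binding must then implement, which is where the "lot of work" for a second binding comes from.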
Best regards, Stefan