Storing character positions in DOM

Cyberthymia Thu, 22 Nov 2001 03:07:22 -0800

I'm currently in the process of adding an XML parser to a project I'm working on, using the DOM parser in XercesC.

The problem I've got at the moment is that I need to be able to tie the nodes in the DOM tree back to the original XML source being processed, i.e. store the start and end character positions for each node (including elements, attributes and text data).
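
To make the requirement concrete, here is a minimal sketch of the kind of per-node record I have in mind. The names SourceSpan and sliceSource are hypothetical and are not part of Xerces, and I am assuming zero-based, half-open [start, end) character offsets into the source text.

```cpp
#include <cstddef>
#include <string>

// Hypothetical record (not a Xerces type): where a node sits in the source.
// Offsets are assumed to be zero-based and half-open: [start, end).
struct SourceSpan {
    std::size_t start;
    std::size_t end;
};

// Return the slice of the original XML text covered by a span.
std::string sliceSource(const std::string& source, const SourceSpan& span) {
    return source.substr(span.start, span.end - span.start);
}
```

For the source text `<a id="1">hi</a>`, the element would carry the span {0, 16}, the attribute {3, 9} and the text data {10, 12}, so that slicing the source with a node's span gives back exactly the characters that node came from.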

(As a bit of background info, I did originally look at using the SAX parser, but I couldn't get this to work, as it doesn't throw events for the attributes. Besides, I need to end up with a DOM interface, and didn't want to write my own to sit on top when Xerces gives you a perfectly good one already.)

I've had a play around with Xerces and have reached the conclusion that there is no way of doing this without modifying the parser code itself, and adding new methods to the DOM classes for getting / setting the positions.

Before I start though, I wanted to see if anyone knows of a better way of doing this, as any changes I make to the parser are going to have to be carefully recorded and reimplemented every time I pick up a newer version of Xerces.

Thanks for any advice / help,

Richard Jinks
