http://xerces.apache.org/xerces-c/apiDocs/functions_0x67.html provides a list of methods implemented by Xerces-C; if you look at it, you'll find getDocument() is a method of AbstractDOMParser. Click the method name and you'll find a brief description, including the fact that it returns a DOMDocument pointer. Click DOMDocument and you'll find that it has a getDocumentElement() method "that allows direct access to the child node that is the root element of the document." Given this node, you can use getChildNodes(), getFirstChild(), getNextSibling(), and so on to directly navigate the DOM.
Alternatively, you can use getElementsByTagName() to obtain a list of elements with a given name, or getElementById() to get an element with a unique ID. Or use DOMTreeWalker to work with a subset of your document. I've never done that, so it's left as an exercise for the student.

-----Original Message-----
From: Javier Gálvez Guerrero [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 27, 2007 10:46 AM
To: [email protected]
Subject: Re: How to parse using DOM

Thank you all very much, especially to Sven, who shared his own effort.

However, I have looked into the samples on the repositories site and I can't find how to "extract" the data itself from the DOM tree. If you don't mind, I would like to ask some simple questions, so that with your answers I can start typing code.

    //get the DOM representation
    DOMNode *doc = parser->getDocument();

I cannot find the getDocument method description in the provided documentation and I am quite confused about it. Anyway, *doc is supposed to be the DOM representation.

So, what I need to do is to extract many elements (with their children and attributes) from an XML file, which is supposed to be represented by *doc once it has been parsed. Then, how can I get, let's say, the value of the "nickname" element inside its parent element "user"?

Does getDocument return the root node? If so, I guess I can ask it for its children and then "move" through the tree with methods of the DOMNode API, for example getNodeValue(), getChildNodes() and so on. Is that OK? Is there any other way to extract data from the DOM representation, or is this the one to use?

Thank you all very much again, and sorry for the inconvenience. I am really interested in using Xerces in the application I am developing, so that's why I would like to know how to use it properly.

Cheers,
Javi

2007/11/27, David Bertoni <[EMAIL PROTECTED]>:
>
> Sven Bauhan wrote:
> > Hi Javi,
> >
> > the Xerces interface is not really intuitive. A short description can be
> > found in the DOM programming guide:
> > http://xerces.apache.org/xerces-c/program.html
> >
> > In the Xerces documentation it is often suggested to use an extra class for
> > the conversion between std::string and XMLCh*. I have written such a class.
> > As it is quite short, I attach it here.
>
> Your class uses XMLString::transcode(), which transcodes to the local code
> page. This will result in data loss in cases where content contains
> Unicode characters that are not representable in the local code page. A
> better choice would be to transcode to UTF-8, which is compatible with
> char* APIs, and has the advantage that it can represent any Unicode
> character.
>
> There are many postings in the archives that will illustrate why using
> XMLString::transcode() is a bad idea. I wish we would actually modify the
> analogous class in our samples so it doesn't do local code page
> transcoding, as it's providing a bad example.
>
> Dave
