RE: How to parse using DOM

Jesse Pelton Tue, 27 Nov 2007 08:28:59 -0800

http://xerces.apache.org/xerces-c/apiDocs/functions_0x67.html provides a list 
of methods implemented by Xerces-C; if you look at it, you'll find 
getDocument() is a method of AbstractDOMParser.  Click the method name and 
you'll find a brief description, including the fact that it returns a 
DOMDocument pointer. Click DOMDocument and you'll find that it has a 
getDocumentElement() method "that allows direct access to the child node that 
is the root element of the document."  Given this node, you can use 
getChildNodes(), getFirstChild(), getNextSibling(), and so on to directly 
navigate the DOM.

Alternately, you can use getElementsByTagName() to obtain a list of elements 
with a given name or getElementById() to get an element with a unique ID.  Or 
use DOMTreeWalker to work with a subset of your document.  I've never done 
that, so it's left as an exercise for the student. 

-----Original Message-----
From: Javier Gálvez Guerrero [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, November 27, 2007 10:46 AM
To: [email protected]
Subject: Re: How to parse using DOM

Thank you all very much, especially Sven, who shared his own work.

However, I have looked through the samples on the repository site and I can't
find how to "extract" the data itself from the DOM tree. If you don't mind, I
would like to ask a few simple questions; with your answers I hope I can
start writing code.

//get the DOM representation
DOMNode *doc = parser->getDocument();

I cannot find a description of the getDocument method in the provided
documentation, and I am quite confused about it. Anyway, *doc is supposed to
be the DOM representation. What I need to do is extract many elements
(with their children and attributes) from an XML file, which is supposed to
be represented by *doc once it has been parsed and retrieved. How, then, can
I obtain, let's say, the value of the "nickname" element inside its parent
element "user"? Does getDocument return the root node? If so, I guess I can
ask it for its children and then "move" through the tree with the methods of
the DOMNode API, for example getNodeValue(), getChildNodes() and so on.

Is that right? Is there any other way to extract data from the DOM
representation, or is this the one to use?

Thank you all very much again, and sorry for the inconvenience. I am really
interested in using Xerces in the application I am developing, so that's
why I would like to know how to use it properly.

Cheers,
Javi

2007/11/27, David Bertoni <[EMAIL PROTECTED]>:
>
> Sven Bauhan wrote:
> > Hi Javi,
> >
> > the Xerces interface is not really intuitive. A short description can be
> > found in the DOM programming guide:
> > http://xerces.apache.org/xerces-c/program.html
> >
> > The Xerces documentation often suggests using an extra class for the
> > conversion between std::string and XMLCh*. I have written such a class.
> > As it is quite short, I attach it here.
> Your class uses XMLString::transcode(), which transcodes to the local code
> page.  This will result in data loss in cases where content contains
> Unicode characters that are not representable in the local code page.  A
> better choice would be to transcode to UTF-8, which is compatible with
> char* APIs, and has the advantage that it can represent any Unicode
> character.
>
> There are many postings in the archives that will illustrate why using
> XMLString::transcode() is a bad idea.  I wish we would actually modify the
> analogous class in our samples so it doesn't do local code page
> transcoding, as it's providing a bad example.
>
> Dave
>
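
To make Dave's point about UTF-8 concrete, here is a minimal, library-independent sketch of the conversion. It assumes the XMLCh data (UTF-16 code units in Xerces-C) has already been copied into a std::u16string; the function name utf16ToUtf8 is made up for illustration. I believe Xerces-C 3.x also ships a TranscodeToStr helper that does this job, which would be the more natural choice in real code.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Encode UTF-16 text as UTF-8. Unlike a local code page, UTF-8 can represent
// every Unicode character, so nothing is lost in the conversion.
// Unpaired surrogates (malformed input) are replaced with U+FFFD.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate without a preceding high surrogate
        }

        if (cp < 0x80) {            // 1 byte
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {    // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {  // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                    // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Plain ASCII passes through unchanged, so existing char* code keeps working, while a name such as "Gálvez" survives intact instead of depending on the local code page.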
