Off the top of my head, I don't know about null nodes, but text,
comments, and processing instructions are a few of the other types of
nodes that one might encounter in a DOM.  See
http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/core.html#ID-1590626202
for more.

The most common surprise people run into with the DOM comes when they
deserialize a document like the following:

<root>
  <child />
</root>

People typically assume the whitespace is insignificant and are
surprised when it turns out that the first child of the <root> element
is a text node composed of a newline and some blanks.  (It's possible to
ignore whitespace, but you have to take steps to make it so.  Parsers
can't blindly assume that whitespace is unimportant, so they hang on to
it unless they have some way to determine what can be ignored.)
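To make the node counts concrete, here is a minimal sketch using a toy node model. This is not the Xerces-C API: the names Node, NodeType, buildSample, and elementChildren are made up for illustration. It builds the tree a whitespace-preserving parser would hand back for the <root>/<child /> document above, then filters the children down to elements, which is the same idea as comparing getNodeType() against DOMNode::ELEMENT_NODE.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a DOM node. Hypothetical types, NOT the Xerces-C classes.
enum class NodeType { Element, Text, Comment };

struct Node {
    NodeType type;
    std::string value;           // tag name for elements, content otherwise
    std::vector<Node> children;  // child nodes in document order
};

// Builds the tree a whitespace-preserving parser would produce for:
//   <root>
//     <child />
//   </root>
Node buildSample() {
    Node root{NodeType::Element, "root", {}};
    root.children.push_back({NodeType::Text, "\n  ", {}});      // newline + indent
    root.children.push_back({NodeType::Element, "child", {}});  // the visible element
    root.children.push_back({NodeType::Text, "\n", {}});        // newline before </root>
    return root;
}

// Returns pointers to the element children only, skipping text and comments.
std::vector<const Node*> elementChildren(const Node& parent) {
    std::vector<const Node*> result;
    for (const Node& n : parent.children) {
        if (n.type == NodeType::Element) {
            result.push_back(&n);
        }
    }
    return result;
}
```

In this model, buildSample().children.size() is 3 even though only one element is visible in the file, and elementChildren() returns just that one. By the same arithmetic, a parent with two indented element children has five child nodes: text, element, text, element, text. As far as I know, the DOM node type constants start at 1 (ELEMENT_NODE is 1), so a test like !getNodeType() should never be true; a null check would have to test the node pointer itself.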

-----Original Message-----
From: Adrian Schubert [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 18, 2008 1:43 PM
To: [email protected]
Subject: What are 'null' and 'non-element' children?

Hi people
 
I've been studying and using some pieces of the sample code delivered 
with xerces.
There's something I don't understand about the DOM parsing. When I'm 
traversing the DOM, looking for a particular node with a given name, 
each time I have to filter out the nodes that are NOT elements, and NOT 
null.
In other words, the children returned by DOMNodeList* getChildNodes() 
are not just the nodes I can see when looking at my XML file, but some 
further mysterious null and non-element nodes. Aren't nodes simply the 
elements we can see in the XML file, for example <imageDescription> or 
<fileSize>?
Here's how I get the children of a node:

   DOMElement* currentElement = elementRoot; // initialize to root
   children = currentElement->getChildNodes();
   nChildren = children->getLength(); // this number includes null- and non-elements!

Even when my parent element contains only 2 children that I know of, I 
get, for example, 5 children back according to children->getLength.
To get what I really want I have to test to see which ones are the 
"regular" elements by doing:

        if(!currentNode->getNodeType() ||  // is NULL
            currentNode->getNodeType() != DOMNode::ELEMENT_NODE ) // isn't an element
        {
           continue; // check next node
        }

Why is this? What are these mysterious invisible children in the XML 
file? Is there a better way to find one or more elements that have a 
known location in the DOM for an XML file?

Thanks, Adrian
