Off the top of my head, I don't know about null nodes, but text, comments, and processing instructions are a few of the other types of nodes that one might encounter in a DOM. See http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/core.html#ID-1590626202 for more.
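
To make the node types a bit more concrete, here is a rough sketch in plain C++. It does not use Xerces at all: `ToyNode`, `sampleChildren`, and `elementChildren` are hypothetical names made up for illustration, and the child list is built by hand rather than parsed. The numeric type codes are the ones DOM Level 2 Core assigns. The sketch assumes a parent with two child elements, each on its own indented line, so the whitespace between the tags shows up as text nodes.

```cpp
#include <string>
#include <vector>

// Node type codes as defined by DOM Level 2 Core (only a few are listed).
// Note that none of the defined codes is 0.
enum NodeType : unsigned short {
    ELEMENT_NODE = 1,
    TEXT_NODE = 3,
    PROCESSING_INSTRUCTION_NODE = 7,
    COMMENT_NODE = 8
};

// Hypothetical stand-in for a DOM node; this is NOT the Xerces DOMNode class.
struct ToyNode {
    NodeType type;
    std::string value;  // tag name for elements, character data for text
};

// Hand-built child list of <root> for this document (assumed layout):
//   <root>
//     <a/>
//     <b/>
//   </root>
std::vector<ToyNode> sampleChildren() {
    return {
        {TEXT_NODE, "\n  "},   // newline + indent before <a/>
        {ELEMENT_NODE, "a"},
        {TEXT_NODE, "\n  "},   // newline + indent before <b/>
        {ELEMENT_NODE, "b"},
        {TEXT_NODE, "\n"}      // newline before </root>
    };
}

// Keep only the element children, skipping text and any other node types.
std::vector<ToyNode> elementChildren(const std::vector<ToyNode>& children) {
    std::vector<ToyNode> out;
    for (const ToyNode& n : children) {
        if (n.type == ELEMENT_NODE) {
            out.push_back(n);
        }
    }
    return out;
}
```

In this model `sampleChildren()` holds five nodes, of which `elementChildren()` keeps two. With Xerces the corresponding test is `node->getNodeType() == DOMNode::ELEMENT_NODE`; a separate check for a node type of zero should not be needed, since no node type has that value.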
The most common problem people have with the DOM is when they deserialize a document like the following:

    <root>
      <child />
    </root>

People typically assume the whitespace is insignificant and are surprised when it turns out that the first child of the <root> element is a text node composed of a newline and some blanks. (It's possible to ignore whitespace, but you have to take steps to make it so. Parsers can't blindly assume that whitespace is unimportant, so they hang on to it unless they have some way to determine what can be ignored.)

-----Original Message-----
From: Adrian Schubert [mailto:[EMAIL PROTECTED]
Sent: Monday, February 18, 2008 1:43 PM
To: [email protected]
Subject: What are 'null' and 'non-element' children?

Hi people

I've been studying and using some pieces of the sample code delivered with xerces. There's something I don't understand about the DOM parsing.

When I'm traversing the DOM, looking for a particular node with a given name, each time I have to filter out the nodes that are NOT elements, and NOT null. In other words, the children returned by DOMNodeList* getChildNodes() are not just the nodes I can see when looking at my XML file, but some further mysterious null and non-element nodes. Aren't nodes simply the elements we can see in the XML file, for example <imageDescription> or <fileSize>?

Here's how I get the children of a node:

    DOMElement* currentElement = elementRoot; // initialize to root
    children = currentElement->getChildNodes();
    nChildren = children->getLength(); // this number includes null- and non-elements!

Even when my parent element contains only 2 children that I know of, I get, for example, 5 children back according to children->getLength. To get what I really want I have to test to see which ones are the "regular" elements by doing:

    if(!currentNode->getNodeType() || // is NULL
       currentNode->getNodeType() != DOMNode::ELEMENT_NODE ) // isn't an element
    {
        continue; // check next node
    }

Why is this? What are these mysterious invisible children in the XML file? Is there a better way to find one or more elements that have a known location in the DOM for an XML file?

Thanks,
Adrian
