Hi there, I have an XML document which contains a mixture of structural nodes (called 'section', each with a unique 'id' attribute) and non-structural nodes (called anything else). The structural elements ('section' elements) can contain other structural elements as well as non-structural elements. I'm working with this document using the Python DOM and have got stuck on something.
I want to be able to get all the non-structural elements which are children of a given 'section' element (identified by its 'id' attribute) but not children of any child 'section' elements of the given 'section'. E.g.:

    <section id="a">
        <foo>bar</foo>
    </section>
    <section id="b">
        <foo>baz</foo>
        <section id="c">
            <bar>foo</bar>
        </section>
    </section>

Given this document, a working function would return "<foo>baz</foo>" for id='b' and "<bar>foo</bar>" for id='c'.

Normally, recursion is used for DOM traversals. I've tried this function, which uses recursion with a generator (can the two be mixed?):

    def content_elements(node):
        if node.hasChildNodes():
            node = node.firstChild
            if not page_node(node):
                yield node
                for e in self.content_elements(node):
                    yield e
            node = node.nextSibling

which didn't work. So I tried it without using a generator:

    def content_elements(node, elements):
        if node.hasChildNodes():
            node = node.firstChild
            if node.nodeType == Node.ELEMENT_NODE:
                print node.tagName
            if not page_node(node):
                elements.append(node)
                self.content_elements(node, elements)
            node = node.nextSibling
        return elements

However, I got exactly the same problem: each time I use this function I just get a DOM Text node with a few whitespace characters (tabs and returns) in it. I guess this is the indentation in my source document? But why do I not get the proper element nodes?

Cheers,
Richard
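For what it's worth, both versions above look at node.firstChild exactly once and never loop over the remaining siblings, so the only node they ever see is the leading whitespace Text node. Recursion and generators can be mixed without trouble. Below is a minimal sketch of one possible fix, assuming xml.dom.minidom and Python 3. The helpers is_section (a stand-in for page_node, whose definition is not shown) and find_section are hypothetical names, and the <doc> root element is added only so that the sample parses as a single well-formed document.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Sample from the post, wrapped in a hypothetical <doc> root so it parses.
SAMPLE = """<doc>
  <section id="a">
    <foo>bar</foo>
  </section>
  <section id="b">
    <foo>baz</foo>
    <section id="c">
      <bar>foo</bar>
    </section>
  </section>
</doc>"""


def is_section(node):
    """Hypothetical stand-in for page_node: true for 'section' elements."""
    return node.nodeType == Node.ELEMENT_NODE and node.tagName == "section"


def find_section(doc, section_id):
    """Return the 'section' element with the given id, or None if absent."""
    for sec in doc.getElementsByTagName("section"):
        if sec.getAttribute("id") == section_id:
            return sec
    return None


def content_elements(node):
    """Yield non-structural elements below node, skipping nested sections."""
    child = node.firstChild
    while child is not None:  # walk every sibling, not just the first one
        # Text nodes (the indentation whitespace) and nested sections are
        # skipped; everything else is a non-structural element.
        if child.nodeType == Node.ELEMENT_NODE and not is_section(child):
            yield child
            # A recursive generator: re-yield whatever the child yields.
            yield from content_elements(child)
        child = child.nextSibling


doc = minidom.parseString(SAMPLE)
for sid in ("a", "b", "c"):
    found = [e.toxml() for e in content_elements(find_section(doc, sid))]
    print(sid, found)
```

With the sample above, this yields <foo>bar</foo> for id='a', <foo>baz</foo> for id='b' and <bar>foo</bar> for id='c'. If only direct children are wanted, the recursive "yield from" line can be dropped.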