Steven Schveighoffer napisał: > > The design I'm thinking is that the node iterator will own a buffer. One > > consequence is that the fields of the current node will point to the > > buffer akin to foreach(line; File.byLine), so in order to lift the input > > the user will have to dup (or process the node in-place). As new nodes > > will be overwritten on the same piece of memory, an important trait of > > the design emerges: cache intensity. Because of XML namespaces I think > > it is necessary for the buffer to contain the current node plus all its > > parents. > > That might not scale well. For instance, if you are accessing the 1500th > child element of a parent, doesn't that mean that the buffer must contain > the full text for the previous 1499 elements in order to also contain the > parent? > > Maybe I'm misunderstanding what you mean.
Let's talk on an example: <a name="value"> <b> Some Text 1 <c2> <!-- HERE --> Some text 2 </c2> Some Text 3 </b> </a> The buffer of the iterator positioned HERE would be: [Node a | Node b | Node c2] Node c2 and all its parents are available for inspection. Node a's attribute is stored in the buffer, but not b's "Some text 1" as it is c2's sibling; "Some text 1" was available in the previous iteration, now it's overwritten by c2. To get to "Some text 2" let's advance the iterator in depth to get: [Node a | Node b | Node c2 | Text node "Some text 2"] Advancing it once more we get to: [Node a | Node b | Text node "Some text 3"] So "Some text 3" is written where c2 and the text node 2 used to be. The element type of the range would always be the child, parents available through pointers: foreach (node; xmlRange) { doStuff(node); if (Node* parent = node.parent) doOtherStuff(parent); } Having no access to siblings is quite limiting but the iterator can form an efficient (zero-allocation) basis on which more convenient schemes are built upon. It's still just brain-storming, though. I fear there's something that'll make the whole idea crash & burn. > I would start out with a non-compliant parser, but one that allocates > nothing beyond the I/O buffer, one that simply parses lazily and can be > used as well as a SAX parser. Then see how much extra allocations we need > to get it to be compliant. Then, one can choose the compliancy level > based on what performance penalties one is willing to incur. Yeah, 100% compliance is a long way. -- Tomek