On Tue, 08 Feb 2011 19:16:37 -0500, Tomek Sowiński <j...@ask.me> wrote:

Steven Schveighoffer wrote:

> The design I'm thinking is that the node iterator will own a buffer. One
> consequence is that the fields of the current node will point to the
> buffer akin to foreach(line; File.byLine), so in order to lift the input
> the user will have to dup (or process the node in-place). As new nodes
> will be overwritten on the same piece of memory, an important trait of
> the design emerges: cache intensity. Because of XML namespaces I think
> it is necessary for the buffer to contain the current node plus all its
> parents.

That might not scale well. For instance, if you are accessing the 1500th child element of a parent, doesn't that mean that the buffer must contain the full text for the previous 1499 elements in order to also contain the parent?

Maybe I'm misunderstanding what you mean.

Let's work through an example:

<a name="value">
        <b>
                Some Text 1
                <c2>      <!-- HERE -->
                Some text 2
                </c2>
                Some Text 3
        </b>
</a>

The buffer of the iterator positioned HERE would be:

[Node a | Node b | Node c2]
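
To make the buffer layout above concrete, here is a minimal sketch in Python rather than D. It is only an illustration, not the proposed design: it uses the standard library's `xml.etree.ElementTree.iterparse`, and the helper name `ancestor_stacks` is made up for this example. The iterator keeps only the chain of currently open elements, so already-closed siblings are never retained.

```python
import io
import xml.etree.ElementTree as ET

# The example document from this message (the comment marker is omitted).
XML = b"""<a name="value">
    <b>
        Some Text 1
        <c2>
        Some text 2
        </c2>
        Some Text 3
    </b>
</a>"""

def ancestor_stacks(source):
    """Yield (tag, open_elements) each time a start tag is seen.

    Hypothetical helper: the stack holds the current node plus all of
    its parents, and nothing else.
    """
    stack = []
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            # Copy before yielding: the stack itself is reused for the
            # next node, much like the iterator's buffer would be.
            yield elem.tag, list(stack)
        else:
            stack.pop()

stacks = dict(ancestor_stacks(io.BytesIO(XML)))
print(stacks["c2"])  # ['a', 'b', 'c2']
```

Note that the stack depth follows the nesting depth of the document, not the number of siblings already visited.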

OK, so you mean a buffer other than the I/O buffer. This means double buffering data. I was thinking of a solution that allows simply using the I/O buffer for parsing. I think this is one of the keys to Tango's xml performance.
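
As a rough illustration of parsing directly in the I/O buffer (a Python sketch with made-up variable names, not how Tango or any D library actually does it): a slice into the buffer costs no copy, but it stops being valid once the buffer is refilled, so anything the user wants to keep must be copied out first.

```python
# Pretend this bytearray is the I/O buffer that a read call just filled.
buf = bytearray(b"<c2>Some text 2</c2>")
view = memoryview(buf)

end = buf.index(b">")
name = view[1:end]   # zero-copy slice: points into the I/O buffer
kept = bytes(name)   # explicit copy, the equivalent of dup

# Simulate the next read overwriting the same memory in place.
buf[1:end] = b"zz"

print(bytes(name))   # b'zz' -- the slice now shows the new contents
print(kept)          # b'c2' -- the copy is unaffected
```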

-Steve
