Hi again all,

In this installment of the neverending NodeIterator-question series, I'll
discuss the tantilizing details of bug 
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=5152.

The path "/a/b[@name = 'foo']/c/text()" compiles to (mostly) form (the / at
the front adds a root AbsoluteIterator - I'll ignore that at my peril):

             SI1
            /   \
         SI2     TCI4
        /   \
     SI3    TCI3
    /   \
TCI1   NVI   legend:
       /     SI - StepIterator
     SI4     TCI - TypedChildIterator
    /   \    NVI - NodeValueIterator
TCI2    TAI  TAI - TypedAttributeIterator

(for StepIterators / is the _source and \ is the _iterator)

Consider the following code from bug#5152:

par<xsl:value-of select="$par"/>par
var<xsl:value-of select="$var"/>var

As reported in the bug, $par will generate output whereas $var will not.

In the $par case, BasisLibrary.stringF(Object, DOM) is passed an uncloned
iterator tree, resets it, and uses next() to grab the first value (see code
below):

return dom.getNodeValue(((NodeIterator)obj).reset().next());

In the $var case, a (different) uncloned iterator tree is reset, cloned and
next() is used to grab the first value (see code below):

int i_5_ = nodeiterator_0_.reset().cloneIterator().next();
String string = dom_6_.getNodeValue(i_5_);

It would appear that cloneIterator is causing some problems. First, let me
relate how understand what the thing does: cloneIterator serves to make a
throw-away copy of an iterator. All it's objects and all their states are
copied so that the iterator can then be used (and in the course of being
used, modified).

cloneIterator works by cloning 'this', then calling 'clone.reset()': reset()
is called after each subtree is cloned. I think I can summarise the
cloning/resetting of the iterator tree in bug #5152 as objects are cloned
(@) and then each clone is reset (#). This stuff happens in the following
order: SI1@ SI2@ SI3@ TCI1@ TCI1# NVI@ SI4@ TCI2@ TCI2# TAI@ TAI# SI4# NVI#
SI3# TCI3@ TCI3# SI2# TCI4@ TCI4@ SI1#.

Notice that after TAI# we are unwinding from the deepest part of the
callstack. Notice also that as we unwind from SI4# we then execute NVI#.
>From looking at the code, we see that NodeValueIterator.cloneIterator() is
(mostly) implemented as (_source is object SI4):

NodeValueIterator clone = (NodeValueIterator)super.clone();
clone._source = _source.cloneIterator();
return clone.reset();

and that even-though _source.cloneIterator performs a reset, since
NodeValueIterator.reset is implemented as:

_source.reset();
return resetPosition();

clone.reset() will call SI4's reset function again. Thus it appears that
every cloned object reset()s itself and then if it's not the root of the
iterator tree, when it's parent is cloned it's reset() called again
(setStartNode(int) would be called if the object is an _iterator of a
StepIterator).

Looking at the example above, it appears that by the time the root is cloned
and told to reset(), TCI2 has been told to reset/set-its-starting-node as
many times as the tree is deep minus one (5 times).

Besides being a performance issue, I'll wager that this mutltipe-resetting
behaviour is the source of problem for bug #5152. I'll claim that the
iterator tree above appears to break during the _second_ initialization of
TCI3 (ie: during SI1#). Starting from SI4#, I'll outline the codepath:

1. During SI4#, TAI._attribute is correctly set but TCI2.next moves to 0
2. SI2# calls SI2.reset (by way of clone.reset)
3. SI2.reset calls SI3.next to initialize TCI3 (and succeeds), and during
the call TCI1.next moves to 0 and TAI._attribute is consumed (it is now 0 as
well).
4. SI1# calls SI2.reset (by way of SI1.reset)
5. SI2.reset calls SI3.next to initialize TCI3
6. SI3.next fails to get a node from either _iterator (by way of SI4.next
failing - itself failing from TAI and TCI2 being 0) or _source (TCI1 has no
more nodes)

Since SI3.next returned 0, TCI3 is set to 0, SI2.next returns 0, TCI4 is set
to 0 and SI1.next returns 0 thereafter.

Where I see things falling apart is in SI3.next (StepIterator.javaV1.9L140)
during SI2#. Could a fix be to check (during cloning only) if _source came
to END then _source could be reset and _iterator was reinitialized? This
would be a pretty ugly patch.

Perhaps another way would be to have a final method in NodeIteratorBase that
would start the cloning of the iterator tree and then reset it - once only.
Like the way SyntaxTreeNode.parseContents/Children works - uh, only
different. If the main entrypoint was still called cloneIterator then only
the names of the iterator methods that perform the cloning would need to
change - nothing in the compiler-proper.

Thoughts? Please straighten me out if things look crooked,

john

Reply via email to