[jira] Created: (XALANJ-2493) BasisLibrary.nodeList2Iterator broken

Martin von Gagern (JIRA) Mon, 23 Mar 2009 06:46:14 -0700

BasisLibrary.nodeList2Iterator broken
-------------------------------------


                 Key: XALANJ-2493
                 URL: https://issues.apache.org/jira/browse/XALANJ-2493
             Project: XalanJ2
          Issue Type: Bug
      Security Level: No security risk; visible to anyone (Ordinary problems in 
Xalan projects.  Anybody can view the issue.)
          Components: XSLTC
    Affects Versions: The Latest Development Code
            Reporter: Martin von Gagern


The current implementation of nodeList2Iterator is broken, because it can not 
deal with attribute nodes. It relies on copyNodes which in turn tries to add 
attribute nodes as children of some top level node. Attributes don't live on 
the children axis, though, so this is against DOM and causes a DOM exception in 
the Xerces DOM implementation and probably most other implementations. The 
resulting HIERARCHY_REQUEST_ERR was noted e.g. in XALANJ-2424.

Furthermore, the implementation is inefficient, because it manually copies each 
and every node from the source document to a new DTM to some new DTM. The time 
overhead for the copying as well as the memory overhead for the additional DOM 
can be avoided in cases where the nodes come from some input document, as 
opposed to a document completely loaded within some extension function.

A comment in the related XALANJ-2425 suggests returning DTMIterator from 
extension functions and avoiding the re-import for those. I don't like this 
idea because it exposes a lot of Xalans internals to extension functions, and 
because the returned node list might be newly created, while at least some of 
thenodes might still be from the same document. So instead of special cases for 
the list type, I implemented special cases for every node of the list. If it is 
a proxy node of the same (Multi)DOM, we simply use its handle.

If not, we add it to some w3c DOM and turn that into a DTM, pretty much like 
the current implementation does. However, I dropped copyNodes in favor of 
Document.importNode, to avoid code duplication and rely on supposedly more 
heavily tested code. I also added another level of elements, so that there is 
one such dummy node for every item of the source list, with always a single 
child or element. A few assertions help ensure this single child policy. This 
is especially important in the new implementation, because otherwise it would 
become difficult to get the proxied nodes and the newly DTMified nodes into 
correct order.

Right now, the import of DOM nodes is only implemented for those nodes I expect 
to turn up in the DTM in pretty much the same form as they do in the w3c DOM. 
For all other nodes, an internal error is thrown. This especially concerns 
document fragment nodes. At least in w3c DOM, a document fragment can never be 
a child, so if DTM behaves the same, we would need to import document fragments 
seperately, or expand them to the list of their children instead. I'm not sure 
what correct behaviour would be here, so I'd rather throw an exception than 
implement wrong behaviour.

I also noticed that 
org.apache.xml.dtm.ref.DTMManagerDefault.getDTMHandleFromNode would in theory 
provide an implementation for turning w3c nodes into DTM handles - exactly what 
we need here. That method seems to start importing from the topmost ancestor of 
a node, giving as much context as possible, in contrast to both current and my 
suggested XSLTC approach, which destroys ancestor references. That method also 
seems to employ caches in order to avoid importing a document repeatedly. 
Sadly, actually using that method throws a ClassCastException as it expects a 
DTM generate from a DOM source to be a DOM2DTM, which SAXImpl is not. A comment 
inside that method also indicates that future implementations might drop 
auto-importing and instead leave it to the caller to import a DOM if it hasn't 
been imported before.

I left my own attempt at an nodeList2Iterator implementation using 
getDTMHandleFromNode in place, renamed to nodeList2IteratorUsingHandleFromNode 
and made private. So it's there, it even gets compiled, but it won't get used. 
If that method gets fixed in XSLTCDTMManager or its ancestor, then this method 
might be used instead, giving a much simpler and cleaner implementation. If 
some of my code would be useful for such an implementation as well, like the 
check for "is same DOM", feel free to copy or move my code to those classes as 
well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] Created: (XALANJ-2493) BasisLibrary.nodeList2Iterator broken

Reply via email to