[
https://issues.apache.org/jira/browse/XERCESJ-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stephen White updated XERCESJ-1622:
-----------------------------------
Attachment: Bug.java
Demonstraton of the bug. The output relates to walking the tree in a
depth-first manner, but prior to this the code iterates through a NodeList and
then accesses it again in a manner that results in the same NodeListCache being
added to the free list twice.
$ java -cp xerces-2_11_0/xercesImpl.jar:xerces-2_11_0/xml-apis.jar:. Bug
[root: null](0) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](0) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](1) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](2) using org.apache.xerces.dom.NodeListCache@1a0a87c
[root: null](1) using null
[p: null](0) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](1) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](2) using org.apache.xerces.dom.NodeListCache@1a0a87c
[root: null](2) using null
[p: null](0) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](1) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](2) using org.apache.xerces.dom.NodeListCache@1a0a87c
[root: null](3) using null
[p: null](0) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](1) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](2) using org.apache.xerces.dom.NodeListCache@1a0a87c
As you can see the calls to item(2) and item(3) on the root don't use a cache.
Each time a call to item(...) on the root Node claims the cache it looses it
again straight away, as the same cache is given to the iteration over the <p>
element children.
The expected output is:
$ java -cp fixedXercesImpl.jar:xerces-2_11_0/xml-apis.jar:. Bug
[root: null](0) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](0) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](1) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](2) using org.apache.xerces.dom.NodeListCache@e44d85
[root: null](1) using null
[p: null](0) using org.apache.xerces.dom.NodeListCache@9d93be
[p: null](1) using org.apache.xerces.dom.NodeListCache@9d93be
[p: null](2) using org.apache.xerces.dom.NodeListCache@9d93be
[root: null](2) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](0) using org.apache.xerces.dom.NodeListCache@9d93be
[p: null](1) using org.apache.xerces.dom.NodeListCache@9d93be
[p: null](2) using org.apache.xerces.dom.NodeListCache@9d93be
[root: null](3) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](0) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](1) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](2) using org.apache.xerces.dom.NodeListCache@e44d85
Here the NodeListCache used for iterating over the children of <root> is lost
between root.item(0) and root.item(1), exactly the same as in the first case.
This is because it has been added to the free list, as Iteration is expected to
have been completed. The difference here is that once root.item(1) has
reclaimed it, it stays claimed. Calls to item(...) on the <p> elements get
given a new cache.
> CoreDocumentImpl.freeNodeListCache can be called multiple times and results
> in free list corruption
> ---------------------------------------------------------------------------------------------------
>
> Key: XERCESJ-1622
> URL: https://issues.apache.org/jira/browse/XERCESJ-1622
> Project: Xerces2-J
> Issue Type: Bug
> Components: DOM (Level 3 Core)
> Affects Versions: 2.11.0
> Environment: Tested on Linux using Java 1.7.0_25 and 1.7.0_45
> (x86_64), but I don't think this issue is constrained to a particular
> environment
> Reporter: Stephen White
> Attachments: Bug.java
>
>
> Iterating through the children of a ParentNode using indexes (as in the code
> below) rather than the nextSibling pointers would be very slow (O(n^2) in the
> number of nodes) if the list is long. Xerces speeds this up by using a
> cache, fNodeListCache.
> NodeList l = doc.getDocumentElement().getChildNodes();
> for (int i = 0; i < l.getLength(); i++) {
> l.item(i);
> }
> ParentNode.nodeListItem() contains some code that allows a NodeListCache to
> be freed when iteration is complete.
> Freeing a NodeListCache is done inside CoreDocumentImpl.freeNodeListCache
> which adds the NodeListCache to a list of free caches ready for re-use by a
> different ParentNode, avoiding object allocations. A NodeListCache can still
> be used after it is freed, as long as it hasn't been claimed by anything else
> yet.
> If you iterate through the same child list twice then the NodeListCache will
> be freed multiple times (during the second iteration firstAccess will be
> false and, according to the logic in nodeListItem(), it'll get freed again).
> Unfortunately CoreDocumentImpl.freeNodeListCache adds NodeListCaches to the
> free list unconditionally, so if it is called multiple times on the same
> NodeListCache then the cache is added to the list multiple times. This
> creates a cycle in the linked list. When this happens subsequent calls to
> getNodeListCache always return the same cache. This means that in a
> depth-first traversal of the DOM the same cache gets used for children as for
> their parents, and as the children use it the state needed for the parent is
> lost. The result is that things run as slowly as if the cache wasn't present
> at all.
> We've got real code that triggers this situation when trying to build a Saxon
> TinyTree from a Xerces DOM, as the Saxon DOMSender iterates using item
> indexes rather than nextSibling pointers.
> Example code to follow.
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]