[
https://issues.apache.org/jira/browse/XERCESJ-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stephen White updated XERCESJ-1622:
-----------------------------------
Attachment: nodelistcache-free-fix.patch
Simple patch, with a potential performance concern. If a large number of
NodeListCaches are in the free list then this code becomes slow.
> CoreDocumentImpl.freeNodeListCache can be called multiple times and results
> in free list corruption
> ---------------------------------------------------------------------------------------------------
>
> Key: XERCESJ-1622
> URL: https://issues.apache.org/jira/browse/XERCESJ-1622
> Project: Xerces2-J
> Issue Type: Bug
> Components: DOM (Level 3 Core)
> Affects Versions: 2.11.0
> Environment: Tested on Linux using Java 1.7.0_25 and 1.7.0_45
> (x86_64), but I don't think this issue is constrained to a particular
> environment
> Reporter: Stephen White
> Attachments: Bug.java, nodelistcache-free-fix.patch
>
>
> Iterating through the children of a ParentNode using indexes (as in the code
> below) rather than the nextSibling pointers would be very slow (O(n^2) in the
> number of nodes) if the list is long. Xerces speeds this up by using a
> cache, fNodeListCache.
> NodeList l = doc.getDocumentElement().getChildNodes();
> for (int i = 0; i < l.getLength(); i++) {
> l.item(i);
> }
> ParentNode.nodeListItem() contains some code that allows a NodeListCache to
> be freed when iteration is complete.
> Freeing a NodeListCache is done inside CoreDocumentImpl.freeNodeListCache
> which adds the NodeListCache to a list of free caches ready for re-use by a
> different ParentNode, avoiding object allocations. A NodeListCache can still
> be used after it is freed, as long as it hasn't been claimed by anything else
> yet.
> If you iterate through the same child list twice then the NodeListCache will
> be freed multiple times (during the second iteration firstAccess will be
> false and, according to the logic in nodeListItem(), it'll get freed again).
> Unfortunately CoreDocumentImpl.freeNodeListCache adds NodeListCaches to the
> free list unconditionally, so if it is called multiple times on the same
> NodeListCache then the cache is added to the list multiple times. This
> creates a cycle in the linked list. When this happens subsequent calls to
> getNodeListCache always return the same cache. This means that in a
> depth-first traversal of the DOM the same cache gets used for children as for
> their parents, and as the children use it the state needed for the parent is
> lost. The result is that things run as slowly as if the cache wasn't present
> at all.
> We've got real code that triggers this situation when trying to build a Saxon
> TinyTree from a Xerces DOM, as the Saxon DOMSender iterates using item
> indexes rather than nextSibling pointers.
> Example code to follow.
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]