[ 
https://issues.apache.org/jira/browse/XERCESJ-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen White updated XERCESJ-1622:
-----------------------------------

    Attachment: nodelistcache-free-fix.patch

Simple patch, with a potential performance concern.  If a large number of 
NodeListCaches are in the free list then this code becomes slow.

> CoreDocumentImpl.freeNodeListCache can be called multiple times and results 
> in free list corruption
> ---------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1622
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1622
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 3 Core)
>    Affects Versions: 2.11.0
>         Environment: Tested on Linux using Java 1.7.0_25 and 1.7.0_45 
> (x86_64), but I don't think this issue is constrained to a particular 
> environment
>            Reporter: Stephen White
>         Attachments: Bug.java, nodelistcache-free-fix.patch
>
>
> Iterating through the children of a ParentNode using indexes (as in the code 
> below) rather than the nextSibling pointers would be very slow (O(n^2) in the 
> number of nodes) if the list is long.  Xerces speeds this up by using a 
> cache, fNodeListCache.
> NodeList l = doc.getDocumentElement().getChildNodes();
> for (int i = 0; i < l.getLength(); i++) {
>     l.item(i);
> }
> ParentNode.nodeListItem() contains some code that allows a NodeListCache to 
> be freed when iteration is complete.  
> Freeing a NodeListCache is done inside CoreDocumentImpl.freeNodeListCache 
> which adds the NodeListCache to a list of free caches ready for re-use by a 
> different ParentNode, avoiding object allocations.  A NodeListCache can still 
> be used after it is freed, as long as it hasn't been claimed by anything else 
> yet.
> If you iterate through the same child list twice then the NodeListCache will 
> be freed multiple times (during the second iteration firstAccess will be 
> false and, according to the logic in nodeListItem(), it'll get freed again).
> Unfortunately CoreDocumentImpl.freeNodeListCache adds NodeListCaches to the 
> free list unconditionally, so if it is called multiple times on the same 
> NodeListCache then the cache is added to the list multiple times.  This 
> creates a cycle in the linked list.  When this happens subsequent calls to 
> getNodeListCache always return the same cache.  This means that in a 
> depth-first traversal of the DOM the same cache gets used for children as for 
> their parents, and as the children use it the state needed for the parent is 
> lost.  The result is that things run as slowly as if the cache wasn't present 
> at all.
> We've got real code that triggers this situation when trying to build a Saxon 
> TinyTree from a Xerces DOM, as the Saxon DOMSender iterates using item 
> indexes rather than nextSibling pointers.
> Example code to follow.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to