DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://nagoya.apache.org/bugzilla/show_bug.cgi?id=8612>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=8612

Performance Enhancement to Xalan distinct function

           Summary: Performance Enhancement to Xalan distinct function
           Product: XalanJ2
           Version: 2.3
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Enhancement
          Priority: Other
         Component: org.apache.xalan.lib
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]

In Extensions.java, the distinct function uses a Hashtable to track unique
nodes. Hashtable synchronizes all access to its instances. In Xalan 2.3.1, the
current code is as follows:

    public static NodeSet distinct(ExpressionContext myContext, NodeIterator ni)
            throws javax.xml.transform.TransformerException
    {
      // Set up our resulting NodeSet and the hashtable we use to keep track of
      // duplicate strings.
      NodeSet dist = new NodeSet();
      dist.setShouldCacheNodes(true);

      Hashtable stringTable = new Hashtable();

      Node currNode = ni.nextNode();
      while (currNode != null)
      {
        String key = myContext.toString(currNode);
        if (!stringTable.containsKey(key))
        {
          stringTable.put(key, currNode);
          dist.addElement(currNode);
        }
        currNode = ni.nextNode();
      }

      return dist;
    }

Since the Hashtable instance is used only locally within the method, there is
no need for an object that synchronizes access to itself. To improve
performance, a HashSet should be used instead. HashSet.add returns false when
the key is already present, so a single call replaces the containsKey/put
pair. Furthermore, it is a good idea to clear the HashSet at the end of the
method so that its contents can be garbage collected promptly. The enhanced
code is as follows:

    public static NodeSet distinct(ExpressionContext myContext, NodeIterator ni)
            throws javax.xml.transform.TransformerException
    {
      // Set up our resulting NodeSet and the set we use to keep track of
      // duplicate strings.
      NodeSet dist = new NodeSet();
      dist.setShouldCacheNodes(true);

      HashSet stringSet = new HashSet();

      Node currNode = ni.nextNode();
      while (currNode != null)
      {
        String key = myContext.toString(currNode);
        if (stringSet.add(key))
        {
          dist.addElement(currNode);
        }
        currNode = ni.nextNode();
      }

      stringSet.clear();
      return dist;
    }

If you want to "completely" ensure the HashSet is cleared and can be garbage
collected (even when a TransformerException is thrown), the following code
could be used instead of the enhanced code above:

    public static NodeSet distinct(ExpressionContext myContext, NodeIterator ni)
            throws javax.xml.transform.TransformerException
    {
      // Set up our resulting NodeSet and the set we use to keep track of
      // duplicate strings.
      NodeSet dist = new NodeSet();
      dist.setShouldCacheNodes(true);

      HashSet stringSet = new HashSet();

      try
      {
        Node currNode = ni.nextNode();
        while (currNode != null)
        {
          String key = myContext.toString(currNode);
          if (stringSet.add(key))
          {
            dist.addElement(currNode);
          }
          currNode = ni.nextNode();
        }
      }
      finally
      {
        stringSet.clear();
      }

      return dist;
    }
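
The HashSet-based de-duplication proposed above can be tried outside Xalan with a minimal stand-alone sketch. The class name DistinctSketch, the method distinctByKey, and the keyOf parameter are hypothetical and are not part of Xalan. The sketch also assumes a modern JDK (generics and java.util.function.Function), which postdates Xalan 2.3.1. It only illustrates how the boolean result of Set.add can stand in for the containsKey/put pair:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical stand-alone sketch; not Xalan code.
class DistinctSketch {

    // Keeps the first item seen for each string key, preserving input order.
    static <T> List<T> distinctByKey(List<T> items, Function<T, String> keyOf) {
        List<T> result = new ArrayList<>();
        // Local, unsynchronized set of keys already encountered.
        Set<String> seen = new HashSet<>();
        for (T item : items) {
            // Set.add returns false if the key was already present,
            // so one call both tests for and records the key.
            if (seen.add(keyOf.apply(item))) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "a", "c", "b");
        System.out.println(distinctByKey(input, s -> s)); // prints [a, b, c]
    }
}
```

Here keyOf plays the role that myContext.toString(currNode) plays in the code above: it maps each item to the string value used to decide whether two items count as duplicates.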
