[
https://issues.apache.org/jira/browse/XALANJ-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050520#comment-13050520
]
Murray Williams commented on XALANJ-2540:
-----------------------------------------
I get a similar improvement with my app when setting
-Dcom.sun.org.apache.xml.internal.dtm.DTMManager=com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault
- Thanks!
> Very inefficient default behaviour for looking up DTMManager
> ------------------------------------------------------------
>
> Key: XALANJ-2540
> URL: https://issues.apache.org/jira/browse/XALANJ-2540
> Project: XalanJ2
> Issue Type: Improvement
> Security Level: No security risk; visible to anyone(Ordinary problems in
> Xalan projects. Anybody can view the issue.)
> Components: DTM, XPath
> Affects Versions: 2.7.1, 2.7
> Reporter: Lukas Eder
>
> I have analysed an issue that has been bothering me for some time. When
> executing XPath evaluations, it looks like a very significant amount of time
> is spent in the initialisation of the XPathContext. I have asked this
> question on Stack Overflow and answered it myself:
> http://stackoverflow.com/questions/6340802/java-xpath-apache-jaxp-implementation-performance
> I think the default behaviour of
> org.apache.xml.dtm.ObjectFactory.lookUpFactoryClassName() is quite
> sub-optimal and should be improved, statically. I imagine, it is unlikely
> that this configuration is going to change once classes have been loaded.
> Hence, the fallback lookup of META-INF/service/org.apache.xml.dtm.DTMManager
> should only be done once.
> For reference, here's the question and answer again in JIRA:
> ----
> I have come to an astonishing conclusion that this:
> Element e = (Element)
> document.getElementsByTagName("SomeElementName").item(0);
> String result = ((Element) e).getTextContent();
> Seems to be an incredible 100x faster than this:
> // Accounts for 30%, can be cached
> XPathFactory factory = XPathFactory.newInstance();
> // Negligible
> XPath xpath = factory.newXPath();
> // Accounts for 70% (caching a compiled expression doesn't change much...)
> String result = (String) xpath.evaluate(
> "//SomeElementName", document, XPathConstants.STRING);
> I'm using the JVM's default implementation of JAXP:
> org.apache.xpath.jaxp.XPathFactoryImpl
> org.apache.xpath.jaxp.XPathImpl
> I'm really confused, because it's easy to see how JAXP could optimise the
> above XPath query to actually execute a simple getElementsByTagName()
> instead. But it doesn't seem to do that. This problem is limited to around
> 5-6 frequently used XPath calls, that are abstracted and hidden by an API.
> Those queries involve simple paths (e.g. /a/b/c, no variables, conditions)
> against an always available DOM Document only. So, if an optimisation can be
> done, it will be quite easy to achieve.
> ----
> I have debugged and profiled my test-case and Xalan/JAXP in general. I
> managed to identify the big major problem in
> org.apache.xml.dtm.ObjectFactory.lookUpFactoryClassName()
> It can be seen that every one of the 10k test XPath evaluations led to the
> classloader trying to lookup the DTMManager instance in some sort of default
> configuration. This configuration is not loaded into memory but accessed
> every time. Furthermore, this access seems to be protected by a lock on the
> ObjectFactory.class itself. When the access fails (by default), then the
> configuration is loaded from the xalan.jar file's
> META-INF/service/org.apache.xml.dtm.DTMManager
> configuration file. Every time!:
> Fortunately, this behaviour can be overridden by specifying a JVM parameter
> like this:
> -Dorg.apache.xml.dtm.DTMManager=
> org.apache.xml.dtm.ref.DTMManagerDefault
> or
> -Dcom.sun.org.apache.xml.internal.dtm.DTMManager=
> com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault
> So here's a performance improvement overview for 10k consecutive XPath
> evaluations of //SomeNodeName against a 90k XML file (measured with
> System.nanoTime():
> measured library : Xalan 2.7.0 | Xalan 2.7.1 | Saxon-HE 9.3 | jaxen
> 1.1.3
> --------------------------------------------------------------------------------
> without optimisation : 10400ms | 4717ms | |
> 25500ms
> reusing XPathFactory : 5995ms | 2829ms | |
> reusing XPath : 5900ms | 2890ms | |
> reusing XPathExpression : 5800ms | 2915ms | 16000ms |
> 25000ms
> adding the JVM param : 1163ms | 761ms | n/a |
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]