[
https://issues.apache.org/jira/browse/XERCESJ-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Simmons updated XERCESJ-1276:
-----------------------------------
Attachment: xerces-value-store.txt
Here's another patch. I took a different approach: I created a Value class to
encapsulate the components being stored and put the instances in a hash set.
This fixes the bottleneck without requiring values to implement an ordering
(and it's probably quicker anyway).
There is a subtle difference that is presumably a bug, either in my patch or in
the original code. See the TODO in the patch (line 3826). The old code copied
the values but not the value types, whereas now they are inevitably paired
together. They can't both be right. Was this deliberate or an oversight?
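To make the idea concrete, here is a minimal sketch of the approach described
above. It is not the code from the attached patch: the names ValueTuple and
HashedValueStore are hypothetical, and representing value types as a short[]
is an assumption made only for illustration. It shows a tuple of values paired
with its types, with equals() and hashCode() defined over both, so that a
duplicate check becomes a hash lookup instead of a scan over every stored
tuple.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the patch's Value class: field values plus their types. */
final class ValueTuple {
    private final Object[] values;
    private final short[] types; // assumed representation of the value types

    ValueTuple(Object[] values, short[] types) {
        // Defensive copies, so later changes to the caller's arrays cannot alter the key.
        this.values = values.clone();
        this.types = types.clone();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ValueTuple)) return false;
        ValueTuple other = (ValueTuple) o;
        // Values and types are compared together, so they are always paired.
        return Arrays.equals(values, other.values) && Arrays.equals(types, other.types);
    }

    @Override
    public int hashCode() {
        return 31 * Arrays.hashCode(values) + Arrays.hashCode(types);
    }
}

/** Hypothetical store that answers contains() with a hash lookup. */
final class HashedValueStore {
    private final Set<ValueTuple> seen = new HashSet<ValueTuple>();

    /** Returns false if an equal tuple was already stored (a duplicate). */
    boolean add(Object[] values, short[] types) {
        return seen.add(new ValueTuple(values, types));
    }

    boolean contains(Object[] values, short[] types) {
        return seen.contains(new ValueTuple(values, types));
    }
}
```

With a reasonable hash distribution, each lookup takes roughly constant time on
average, whereas a linear scan grows with the number of stored tuples. Note
that two tuples with equal values but different types are treated as distinct
here, which is exactly the pairing behaviour questioned above.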
> Improve performance of XML Schema Identity-constraint validation ---
> XMLSchemaValidator$ValueStoreBase.contains() is painfully slow.
> ------------------------------------------------------------------------------------------------------------------------------------
>
> Key: XERCESJ-1276
> URL: https://issues.apache.org/jira/browse/XERCESJ-1276
> Project: Xerces2-J
> Issue Type: Bug
> Components: XML Schema 1.0 Structures
> Affects Versions: 2.6.2, 2.9.1
> Reporter: Kenny MacLeod
> Labels: gsoc, gsoc2013, mentor
> Attachments: Xerces-J-src.2.11.0_patch1276.txt,
> xerces-value-store.txt, XMLSchemaValidator.java
>
>
> Under certain conditions, the contains() method in
> XMLSchemaValidator$ValueStoreBase can cripple the performance of parsing and
> validation.
> I'm not sure what those conditions are, but as a guideline figure I was using
> JAXB2 to deserialize a 22 MB XML file. Without schema validation, it took 5
> seconds. With validation, it took over 3 minutes (JDK 1.5.0_10 on win32). My
> profiler pointed the finger squarely at that method in XMLSchemaValidator.
> Suspicions were aroused further when seeing this comment in the source:
> public boolean contains() {
> // REVISIT: we can improve performance by using hash codes, instead of
> // traversing global vector that could be quite large.
> This is present in the Xerces 2.6.2 bundled with JDK 1.5.0_10, and also in the
> source for 2.9.1.