[
https://issues.apache.org/jira/browse/XERCESJ-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Antti S. Lankila updated XERCESJ-1276:
--------------------------------------
Attachment: xerces-fast-unique-check.diff
> Improve performance of XML Schema Identity-constraint validation ---
> XMLSchemaValidator$ValueStoreBase.contains() is painfully slow.
> ------------------------------------------------------------------------------------------------------------------------------------
>
> Key: XERCESJ-1276
> URL: https://issues.apache.org/jira/browse/XERCESJ-1276
> Project: Xerces2-J
> Issue Type: Bug
> Components: XML Schema 1.0 Structures
> Affects Versions: 2.6.2, 2.9.1
> Reporter: Kenny MacLeod
> Priority: Major
> Labels: gsoc, gsoc2013, mentor
> Attachments: XMLSchemaValidator.java,
> Xerces-J-src.2.11.0_patch1276.txt, xerces-binaries-patched-over-2.11.0.zip,
> xerces-fast-unique-check.diff, xerces-value-store.txt
>
>
> Under certain conditions, the contains() method in
> XMLSchemaValidator$ValueStoreBase can cripple the performance of parsing and
> validation.
> I'm not sure exactly what those conditions are, but as a guideline figure, I
> was using JAXB2 to deserialize a 22 MB XML file. Without schema validation, it
> took 5 seconds. With validation, it took over 3 minutes (JDK 1.5.0_10 on
> win32). My profiler pointed the finger squarely at that method in
> XMLSchemaValidator.
> My suspicions were aroused further when I saw this comment in the source:
>     public boolean contains() {
>         // REVISIT: we can improve performance by using hash codes, instead of
>         // traversing global vector that could be quite large.
> This comment is present in the Xerces 2.6.2 bundled with JDK 1.5.0_10, and
> also in the source for 2.9.1.
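
To illustrate the REVISIT comment quoted above, here is a minimal sketch of the
two lookup strategies. It is not the Xerces code and not the attached patch:
the class names (ValueStoreSketch, LinearStore, HashedStore) are hypothetical,
and it assumes each identity-constraint entry can be modeled as a tuple of
strings, whereas a real validator would compare typed schema values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Xerces.
class ValueStoreSketch {

    /** Linear scan: each lookup walks every stored tuple, so N inserts cost O(N^2). */
    static final class LinearStore {
        private final List<List<String>> tuples = new ArrayList<>();

        boolean contains(List<String> tuple) {
            for (List<String> stored : tuples) {
                if (stored.equals(tuple)) {
                    return true;
                }
            }
            return false;
        }

        /** Returns false when the tuple is a duplicate and was not stored. */
        boolean add(List<String> tuple) {
            if (contains(tuple)) {
                return false;
            }
            tuples.add(new ArrayList<>(tuple));
            return true;
        }
    }

    /** Hash-based lookup: List.hashCode/equals give expected O(1) per lookup. */
    static final class HashedStore {
        private final Set<List<String>> tuples = new HashSet<>();

        boolean contains(List<String> tuple) {
            return tuples.contains(tuple);
        }

        /** Returns false when the tuple is a duplicate and was not stored. */
        boolean add(List<String> tuple) {
            // Copy defensively so later changes by the caller cannot corrupt the set.
            return tuples.add(new ArrayList<>(tuple));
        }
    }

    public static void main(String[] args) {
        HashedStore store = new HashedStore();
        System.out.println(store.add(Arrays.asList("order", "1"))); // first occurrence
        System.out.println(store.add(Arrays.asList("order", "1"))); // duplicate key
    }
}
```

Both stores report the same duplicates; only the cost per lookup differs. With
the linear store, a document holding many key values pays for a full scan on
every new value, which is consistent with the slowdown described in the report.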
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)