[ https://issues.apache.org/jira/browse/XERCESJ-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14901176#comment-14901176 ]
Matthias Reischenbacher commented on XERCESJ-970:
-------------------------------------------------
Nice fix. It also affects large attribute values, e.g. embedded bitmaps in SVG
files:
<image xlink:href="data:image/jpeg;....
XMLStringBuffer seems to be used there too.
Hopefully there will be a new release that includes the fix, because the
performance improvement is quite noticeable (even for relatively small
attribute values of around 5 MB).
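For anyone who wants to check this against their own parser build, here is a
minimal timing sketch. It assumes only the standard JAXP SAX API and times
whichever parser JAXP resolves on the classpath. The class name, the element
names, the filler payload (a run of 'A' characters, not a real data URI) and
the default size are made up for illustration and are not taken from the
issue's comments.txt attachment.

```java
import java.io.StringReader;
import java.util.Arrays;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of Xerces or the issue.
class LargeValueParseTiming {

    // Returns n filler characters. 'A' is legal inside both an XML comment
    // and an attribute value, so the generated documents stay well-formed.
    static String filler(int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, 'A');
        return new String(chars);
    }

    // Document whose only content is one comment of n characters.
    static String withLargeComment(int n) {
        return "<root><!--" + filler(n) + "--></root>";
    }

    // Document with a single attribute value of n characters
    // (placeholder payload standing in for an embedded bitmap).
    static String withLargeAttribute(int n) {
        return "<root><image href=\"" + filler(n) + "\"/></root>";
    }

    // Parses the document with a no-op SAX handler and returns the elapsed
    // wall-clock time in milliseconds (single run, no JIT warm-up).
    static long parseMillis(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        long start = System.nanoTime();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        // Size in characters; the 1 MiB default is an arbitrary choice.
        int size = args.length > 0 ? Integer.parseInt(args[0]) : 1024 * 1024;
        System.out.println("comment:   " + parseMillis(withLargeComment(size)) + " ms");
        System.out.println("attribute: " + parseMillis(withLargeAttribute(size)) + " ms");
    }
}
```

Running it once with an affected Xerces jar on the classpath and once with a
patched build should show whether the difference is visible for a given size.
The numbers are only indicative, since each document is parsed a single time.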
> Large comments are extremely slow to parse
> ------------------------------------------
>
> Key: XERCESJ-970
> URL: https://issues.apache.org/jira/browse/XERCESJ-970
> Project: Xerces2-J
> Issue Type: Bug
> Components: XNI
> Affects Versions: 2.2.0, 2.2.1, 2.3.0, 2.4.0, 2.5.0, 2.6.0, 2.6.1, 2.6.2
> Environment: Windows XP running Java 1.4.2
> Reporter: Sean Griffin
> Priority: Minor
> Attachments: comments.txt
>
>
> Very large comments drastically increase the parsing time for both SAX and
> DOM implementations. Running the sax.Counter and dom.Counter samples with a
> 410KB file where the entire thing is uncommented results in parse times in
> the 100ms to 300ms range. However, if I comment out 95% of the file and run
> the same samples the parse times jump to between 40 and 50 seconds. I ran
> the same samples using the Aelfred parser shipped with Saxon 7.9 and, while
> the file with the large comment was slower than without the comment, it
> jumped by only 100ms or so.
> I briefly compared the code between the two parsers, and they don't look
> significantly different when it comes to handling comments. The only main
> difference I noticed was around low/high byte character checks. I suspect it
> is an inefficiency in the XMLStringBuffer class, but I'm not seeing anything.