http://nagoya.apache.org/bugzilla/show_bug.cgi?id=11831

Extremely slow with long attribute values

------- Additional Comments From [EMAIL PROTECTED] 2002-09-11 20:21 -------

Neil,

I'm not used to sending bug reports, so sorry about the diff stuff.

The contrived data for the parser is a large (e.g. 130K) file which looks like this:

<a a="
aaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaa
...
aaaaaaaaaaaaaaaaaaaaaaaaa
"/>

In the current version of XERCES, the first time a given parser instance parses this file, the parsing takes a lot of time. Once an instance has processed the file, each subsequent parse by that instance is much faster.

There is another problem associated with that, though. Each parser instance that has parsed the file holds on to a 130K buffer. In a multithreaded environment in which each thread holds its own parser instance (the recommended approach, see javax.xml.parsers.DocumentBuilder), this can lead to running out of memory. The patch I submitted does not resolve that problem, but it lets a parser instance release the buffer after each parse without real performance degradation.

As to your questions: the data has already been shown above. In each run I parsed the file only once. The time dropped from 20s to 0.5s (including class loading). I also noticed some improvement (6%) when parsing the XNI-CONFIG.XML file 100 times using a new parser instance each time (5.540ms->5.189ms). On the other hand, I also noticed minor performance degradation on some other files. So the patch only shows the approach to resolving the problem and can surely be optimized further.

As to the patch itself: do not be worried about the method calls. Since the methods are private, modern VMs can inline them all. The loop might look a bit suspicious :) But in fact it can execute at most 30 times during the lifetime of a parser instance, so it's more like an if statement.
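
To illustrate why such a loop runs so rarely, here is a minimal sketch of a doubling growth loop. It is not the code from the patch: the class name BufferGrowth, the method names, the initial size of 32 and the doubling factor are all assumptions made for the example. Because the capacity doubles on every pass, even growing from 1 up to Integer.MAX_VALUE takes only about 30 passes.

```java
// Hypothetical sketch; names and sizes are illustrative, not taken from the Xerces patch.
final class BufferGrowth {

    // Returns a capacity >= needed, reached by repeatedly doubling `current`.
    static int grownCapacity(int current, int needed) {
        int capacity = Math.max(current, 1);
        while (capacity < needed) {
            // Clamp instead of overflowing when doubling would exceed the int range.
            capacity = (capacity > Integer.MAX_VALUE / 2) ? Integer.MAX_VALUE : capacity * 2;
        }
        return capacity;
    }

    // Counts how many passes the loop above makes for the same arguments.
    static int doublingsNeeded(int current, int needed) {
        int steps = 0;
        int capacity = Math.max(current, 1);
        while (capacity < needed) {
            capacity = (capacity > Integer.MAX_VALUE / 2) ? Integer.MAX_VALUE : capacity * 2;
            steps++;
        }
        return steps;
    }

    // Returns the same array if it is already large enough; otherwise returns a
    // larger array holding the first `length` chars of the old one.
    static char[] ensureCapacity(char[] buffer, int length, int needed) {
        if (needed <= buffer.length) {
            return buffer;
        }
        char[] bigger = new char[grownCapacity(buffer.length, needed)];
        System.arraycopy(buffer, 0, bigger, 0, length);
        return bigger;
    }

    public static void main(String[] args) {
        int needed = 130 * 1024; // roughly the size of the test attribute value (assumed)
        System.out.println("capacity:  " + grownCapacity(32, needed));
        System.out.println("doublings: " + doublingsNeeded(32, needed));
    }
}
```
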
I used a standard approach to extending an array. Instead of extending it in constant-size chunks (which is an O(n^2) process), it's better to extend the array by multiplying its size (which is an O(n) process, no matter what the multiplier is, as long as it is greater than 1).

Hope this helps.

Regards
-Andrzej
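
The difference between the two growth strategies can be checked with a small counting experiment. The sketch below does not touch Xerces at all. It only counts how many elements would be copied while appending n items one at a time, and the class name GrowthCost, the chunk size of 32 and the value of n are assumptions chosen for the example.

```java
// Hypothetical counting experiment; it models buffer growth and is not Xerces code.
final class GrowthCost {

    // Elements copied while appending n items, growing by a fixed chunk when full.
    static long copiesWithFixedChunk(int n, int chunk) {
        long copied = 0;
        long capacity = chunk;
        for (int length = 0; length < n; length++) {
            if (length == capacity) {
                copied += length;   // existing contents move to the new array
                capacity += chunk;
            }
        }
        return copied;
    }

    // Elements copied while appending n items, doubling the capacity when full.
    static long copiesWithDoubling(int n, int initial) {
        long copied = 0;
        long capacity = initial;
        for (int length = 0; length < n; length++) {
            if (length == capacity) {
                copied += length;   // existing contents move to the new array
                capacity *= 2;
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        int n = 130_000; // about the size of the long attribute value (assumed)
        System.out.println("fixed chunk of 32: " + copiesWithFixedChunk(n, 32));
        System.out.println("doubling from 32:  " + copiesWithDoubling(n, 32));
    }
}
```

With fixed chunks the total grows roughly with n^2 / (2 * chunk), while with doubling it stays below 2n, which is why the choice of multiplier matters much less than the choice to multiply at all.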
