http://nagoya.apache.org/bugzilla/show_bug.cgi?id=13695

Performance problem with large text nodes and XMLFormatter.cpp

           Summary: Performance problem with large text nodes and
                    XMLFormatter.cpp
           Product: Xerces-C++
           Version: 2.1.0
          Platform: PC
        OS/Version: Windows NT/2K
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: Non-Validating Parser
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


I found a performance problem with large text nodes in
XMLFormatter.cpp::formatBuff(). My node is actually 6MB of base64-encoded
binary data. The code searches the buffer for escape sequences and doesn't
find any (since it is base64 data). It then enters an if statement that is
supposed to pass all the data it just checked through the transcoder. The
problem is that it only transcodes one buffer's worth (about 16K), then loops
around and starts over, checking the remaining 6MB - 16K for escape sequences
again, and so on for every 16K chunk.

I added a while statement inside the if statement, and performance improved
by an order of magnitude (easily). Here is the patch:

--- XMLFormatter.cpp.old	2002-10-16 10:47:38.000000000 -0400
+++ XMLFormatter.cpp	2002-10-16 10:47:49.000000000 -0400
@@ -358,38 +358,42 @@
         //
         if (tmpPtr > srcPtr)
         {
-            const unsigned int srcCount = tmpPtr - srcPtr;
-            const unsigned srcChars = srcCount > kTmpBufSize ?
-                                      kTmpBufSize : srcCount;
+
+            while ( tmpPtr > srcPtr )
+            {
+                const unsigned int srcCount = tmpPtr - srcPtr;
+                const unsigned srcChars = srcCount > kTmpBufSize ?
+                                          kTmpBufSize : srcCount;
 
-            const unsigned int outBytes = fXCoder->transcodeTo
-            (
-                srcPtr
-                , srcChars
-                , fTmpBuf
-                , kTmpBufSize
-                , charsEaten
-                , unRepOpts
-            );
+                const unsigned int outBytes = fXCoder->transcodeTo
+                (
+                    srcPtr
+                    , srcChars
+                    , fTmpBuf
+                    , kTmpBufSize
+                    , charsEaten
+                    , unRepOpts
+                );
 
-            #if defined(XML_DEBUG)
-            if ((outBytes > kTmpBufSize)
-            ||  (charsEaten > srcCount))
-            {
-                // <TBD> The transcoder is freakin out maaaannn
-            }
-            #endif
+                #if defined(XML_DEBUG)
+                if ((outBytes > kTmpBufSize)
+                ||  (charsEaten > srcCount))
+                {
+                    // <TBD> The transcoder is freakin out maaaannn
+                }
+                #endif
 
-            // If we get any bytes out, then write them
-            if (outBytes)
-            {
-                fTmpBuf[outBytes] = 0; fTmpBuf[outBytes + 1] = 0;
-                fTmpBuf[outBytes + 2] = 0; fTmpBuf[outBytes + 3] = 0;
-                fTarget->writeChars(fTmpBuf, outBytes, this);
-            }
+                // If we get any bytes out, then write them
+                if (outBytes)
+                {
+                    fTmpBuf[outBytes] = 0; fTmpBuf[outBytes + 1] = 0;
+                    fTmpBuf[outBytes + 2] = 0; fTmpBuf[outBytes + 3] = 0;
+                    fTarget->writeChars(fTmpBuf, outBytes, this);
+                }
 
-            // And bump up our pointer
-            srcPtr += charsEaten;
+                // And bump up our pointer
+                srcPtr += charsEaten;
+            }
         }
         else if (tmpPtr < endPtr)
         {
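
To illustrate why the rescan is so costly, here is a minimal standalone sketch of the two loop shapes. It is not the Xerces code: kChunk, needsEscape, Stats and formatRun are hypothetical names, the "transcode and write" step is reduced to a counter, and the chunk size is assumed to be exactly 16K characters. It only counts how many characters the escape scan examines in each variant.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical stand-in for kTmpBufSize (the report says "about 16K").
constexpr std::size_t kChunk = 16 * 1024;

// Hypothetical escape test: characters an XML text node would need escaped.
inline bool needsEscape(char c) { return c == '&' || c == '<' || c == '>'; }

struct Stats {
    std::size_t scanned = 0;  // non-escape characters examined by the scan
    std::size_t written = 0;  // characters handed to the output
};

// Emits text in chunks of at most kChunk characters.
// drainRun == false: emit one chunk per scan, then rescan (behaviour
//                    described in the report).
// drainRun == true:  keep emitting until the scanned run is used up
//                    (the shape of the patch's while loop).
Stats formatRun(const std::string& text, bool drainRun) {
    Stats s;
    std::size_t src = 0;
    const std::size_t end = text.size();

    while (src < end) {
        // Scan forward to the next character that needs an escape.
        std::size_t tmp = src;
        while (tmp < end && !needsEscape(text[tmp])) {
            ++tmp;
            ++s.scanned;
        }

        if (tmp > src) {
            do {
                const std::size_t n = std::min(tmp - src, kChunk);
                s.written += n;  // stands in for transcodeTo + writeChars
                src += n;
            } while (drainRun && tmp > src);
        } else {
            // Escape character: a real formatter would emit an entity here.
            ++s.written;
            ++src;
        }
    }
    return s;
}
```

With drainRun set to false, an escape-free input of N chunks is scanned N + (N-1) + ... + 1 times the chunk size; with drainRun set to true it is scanned once. Under the assumptions above, a 6MB run of 16K chunks is 384 chunks, so the unpatched shape performs 384 * 385 / 2 = 73920 chunk-sized scans where the patched shape performs 384.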