[ https://issues.apache.org/jira/browse/XERCESC-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12690140#action_12690140 ]
David Bertoni commented on XERCESC-1858:
----------------------------------------

I'm not sure if you tested with a zero-length string, but in that case, the transcoder does not write to charsRead, so it contains a garbage value, leading to an endless loop.

> Using TranscodeToStr with strings shorter than two characters corrupts memory
> ------------------------------------------------------------------------------
>
>                 Key: XERCESC-1858
>                 URL: https://issues.apache.org/jira/browse/XERCESC-1858
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 3.0.0
>         Environment: Windows XP Pro (32-bit), compiling with MS VisualStudio.Net 2003 with debugging symbols.
>            Reporter: George Compton
>            Assignee: John Snelson
>            Priority: Minor
>         Attachments: XERCESC-1858.patch
>
>
> I have observed this problem when using TranscodeToStr to transcode ""
> (empty string) and "1" (the numeral one) to US-ASCII. Both caused the
> program to crash, apparently because the debug heap noticed that it had
> been corrupted.
> TranscodeToStr::transcode will overrun the allocated string fString when
> called with an input string that is less than two characters long. When
> determining whether it needs to allocate additional space for terminating
> null characters, it uses the expression:
>     if(fBytesWritten > (allocSize - 4)) {
> allocSize is of type XMLSize_t, which is unsigned, assuming I followed all
> the typedefs correctly. So, when the input string contains exactly one
> non-null character, allocSize is one times the size of XMLCh. That's two
> bytes on Windows. (2 - 4) in unsigned arithmetic wraps around to a large
> number, so the conditional is false, and additional memory for the
> terminator is not allocated. The four terminating characters are then
> written to bytes 2, 3, 4, and 5 of a two-byte array, corrupting whatever
> lies after it.
> Something similar happens with empty strings.
> I haven't followed that all the way through a debugger, but I think there
> may be an additional problem there. The method's while(true) loop will
> transcode at least one character, because it doesn't check the input length
> until halfway through the first iteration. Now, it's possible the allocator
> always allocates at least one byte, or it's possible that the transcoder
> won't write anything for a NULL character. I haven't checked into either of
> those possibilities, but it seems risky to rely on them, even if they are
> the case.
> I have not had a chance to try fixing either problem yet. For the first,
> just changing it to (fBytesWritten + 4 > allocSize) should probably work.
> The second will probably require moving the length check or adding a second
> length check outside the loop.

--
This message is automatically generated by JIRA.
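For anyone following along, the unsigned wrap-around described in the report can be reproduced without Xerces-C++ at all. The sketch below is not the library's code: std::size_t stands in for XMLSize_t (the report only says that type is unsigned), the two function names are made up for illustration, and the byte counts are illustrative values chosen to match the one-character case in the report.

```cpp
#include <cstddef>

// Assumption: std::size_t stands in for XMLSize_t, which the report
// describes as an unsigned type.
typedef std::size_t Size;

// Hypothetical helper mirroring the check quoted in the report.
// When allocSize < 4, (allocSize - 4) wraps around to a huge value,
// so the comparison is false and no extra space is requested.
bool needsTerminatorSpaceOriginal(Size bytesWritten, Size allocSize) {
    return bytesWritten > (allocSize - 4);
}

// Hypothetical helper using the rearrangement suggested in the report.
// There is no subtraction, so small allocSize values cannot wrap.
bool needsTerminatorSpaceFixed(Size bytesWritten, Size allocSize) {
    return bytesWritten + 4 > allocSize;
}
```

With a two-byte buffer that is already full (bytesWritten = 2, allocSize = 2), the original form returns false and the rearranged form returns true. For buffers of four bytes or more the two forms agree, so the rearrangement should only change behaviour in the cases that were overrunning.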