[ 
https://issues.apache.org/jira/browse/XERCESC-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007374#comment-13007374
 ] 

Lee Doron commented on XERCESC-1947:
------------------------------------

I've attached a patch that addresses this bug and another one I discovered 
along the way. First, with regard to the issue at hand, it seems to me that an 
empty string (len == 0) *should* be "transcoded", with the result being another 
zero-terminated empty string. Otherwise the caller bears the undue burden of 
checking the string before attempting to transcode it. Also, the Throw at line 
624 is warranted for the case where the input XMLCh string is malformed (in my 
book, that includes having a premature zero before len characters). So I avoid 
an early exit. Instead, I add enough space to allocSize for the 4 terminating 
zeroes, which has two beneficial effects: in some cases it avoids a 
reallocation, and it also guarantees enough space for at least one UTF-8 
transcoded character, so we can safely keep the Throw. If the input string is 
empty, however, we simply skip the call to transcodeTo().
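
To make the intended behavior concrete, here is a minimal standalone sketch. It 
is not the code from the patch: transcodeToUtf8Sketch, kTerminators, and the 
XMLCh typedef are hypothetical stand-ins, it uses std::vector instead of the 
Xerces memory manager, and it rejects surrogates rather than combining them. It 
only shows the three properties argued for above: an empty input yields an 
empty terminated result, a premature zero is still an error, and the terminator 
space is reserved up front.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Stand-in for Xerces-C++'s XMLCh (a 16-bit code unit); an assumption of this sketch.
typedef std::uint16_t XMLCh;

// Hypothetical helper: encodes `len` UTF-16 code units (BMP only) as UTF-8,
// followed by four terminating zero bytes.
std::vector<unsigned char> transcodeToUtf8Sketch(const XMLCh* src, std::size_t len)
{
    const std::size_t kTerminators = 4;

    // Reserve the terminator space up front, so even a one-unit input starts
    // with room for a multi-byte sequence.
    std::vector<unsigned char> out;
    out.reserve(len + kTerminators);

    // For len == 0 the loop body never runs, which mirrors skipping the
    // encoder call for an empty string.
    for (std::size_t i = 0; i < len; ++i) {
        const XMLCh c = src[i];
        if (c == 0)
            throw std::invalid_argument("premature zero before len characters");
        if (c >= 0xD800 && c <= 0xDFFF)
            throw std::invalid_argument("surrogates are not handled in this sketch");

        if (c < 0x80) {
            out.push_back(static_cast<unsigned char>(c));
        } else if (c < 0x800) {
            out.push_back(static_cast<unsigned char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<unsigned char>(0x80 | (c & 0x3F)));
        } else {
            out.push_back(static_cast<unsigned char>(0xE0 | (c >> 12)));
            out.push_back(static_cast<unsigned char>(0x80 | ((c >> 6) & 0x3F)));
            out.push_back(static_cast<unsigned char>(0x80 | (c & 0x3F)));
        }
    }

    // Always terminate, including for the empty string.
    out.insert(out.end(), kTerminators, static_cast<unsigned char>(0));
    return out;
}
```

With this sketch, a zero-length input returns just the four zero bytes, and the 
single character 0x254B from the report encodes to three bytes plus the 
terminators.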

I applied a similar fix to TranscodeFromStr::transcode(), and that's where I 
found an entirely different bug. When it needs to reallocate, it calls 
memcpy(newBuf, fString, fCharsWritten) to copy the existing partial string to 
the new, larger buffer. However, memcpy() takes its count in bytes, while 
fCharsWritten is a count of XMLCh units, so only part of the string gets 
copied. The call should be memcpy(newBuf, fString, fCharsWritten * 
sizeof(XMLCh)).
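
The effect of the missing sizeof factor is easy to see in isolation. The 
following is a small sketch, not the Xerces source: growBuffer and the XMLCh 
typedef are hypothetical stand-ins, and plain new[] replaces the memory 
manager. It shows the corrected copy, with the unit count scaled to bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for Xerces-C++'s XMLCh (a 16-bit code unit); an assumption of this sketch.
typedef std::uint16_t XMLCh;

// Hypothetical helper: copies `charsWritten` XMLCh units from `oldBuf` into a
// new zero-initialized buffer of `newCapacity` units. The caller owns the
// result and must delete[] it. Requires charsWritten <= newCapacity.
XMLCh* growBuffer(const XMLCh* oldBuf, std::size_t charsWritten, std::size_t newCapacity)
{
    XMLCh* newBuf = new XMLCh[newCapacity]();

    // memcpy counts bytes, so the unit count must be scaled by sizeof(XMLCh).
    // Passing charsWritten alone would copy only part of the existing data.
    std::memcpy(newBuf, oldBuf, charsWritten * sizeof(XMLCh));
    return newBuf;
}
```

With a 2-byte XMLCh, the unscaled call would copy only the first half of the 
units already written, leaving the rest of the new buffer's prefix 
uninitialized in the real code (zero in this sketch).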

I also made a couple of other minor changes to improve readability and 
performance.

> XMLUTF8Transcoder::transcodeTo  fails with an exception when transcoding 
> single characters that require 3 or more bytes as UTF8.
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESC-1947
>                 URL: https://issues.apache.org/jira/browse/XERCESC-1947
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 3.1.0, 3.1.1
>         Environment: Tested on mac os and debian linux. The failure is only 
> manifest on v3.1.x
>            Reporter: Ben Griffin
>            Priority: Minor
>             Fix For: 3.1.2, 3.2.0
>
>         Attachments: TransService.cpp.patch, TransService.patch, transtest.cpp
>
>
> This can be demonstrated with the following code.
>       // U+254B BOX DRAWINGS HEAVY VERTICAL AND HORIZONTAL (needs 3 bytes in UTF-8)
>       const XMLCh uval[] = { 0x254B, 0x0000 };
>       char* uc = (char*)TranscodeToStr(uval, "UTF-8").adopt(); // faulty exception
>       cout << uc << endl << flush;
>       XMLString::release(&uc);
> The error is: "terminate called after throwing an instance of 
> 'xercesc_3_1::TranscodingException'"

