[ 
https://issues.apache.org/jira/browse/XERCESC-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919245#action_12919245
 ] 

Ben Griffin commented on XERCESC-1947:
--------------------------------------

XMLUTF8Transcoder::transcodeTo() names its fifth parameter 'charsEaten', and 
uses it to report how many characters were successfully transcoded 
before the output buffer filled up.
However, TranscodeToStr::transcode() passes a variable named 'charsRead' for 
that parameter, and expects a value greater than zero.  This is clearly a 
mistake: a single character that needs more bytes than the buffer has left 
will not be eaten, even though it was 'read'.
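
To make the 'eaten' contract concrete, here is a minimal standalone sketch. It is not the Xerces-C++ source: encodeUtf8Chunk is a hypothetical stand-in for a transcodeTo()-style call, it handles BMP characters only (no surrogate pairs), and its parameter names are my own.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a transcodeTo()-style call (NOT the Xerces code).
// Encodes BMP code units as UTF-8 and stops before any character that would
// not fit completely into the output buffer.
// Returns the number of bytes written; charsEaten receives the number of
// source characters that were fully consumed.
std::size_t encodeUtf8Chunk(const char16_t* src, std::size_t srcCount,
                            unsigned char* out, std::size_t maxBytes,
                            std::size_t& charsEaten)
{
    assert(src != nullptr && out != nullptr);
    std::size_t written = 0;
    charsEaten = 0;
    while (charsEaten < srcCount) {
        const char16_t ch = src[charsEaten];
        const std::size_t need = ch < 0x80 ? 1 : (ch < 0x800 ? 2 : 3);
        if (written + need > maxBytes)
            break;  // out of room: not an error, the caller may retry
        if (need == 1) {
            out[written++] = static_cast<unsigned char>(ch);
        } else if (need == 2) {
            out[written++] = static_cast<unsigned char>(0xC0 | (ch >> 6));
            out[written++] = static_cast<unsigned char>(0x80 | (ch & 0x3F));
        } else {
            out[written++] = static_cast<unsigned char>(0xE0 | (ch >> 12));
            out[written++] = static_cast<unsigned char>(0x80 | ((ch >> 6) & 0x3F));
            out[written++] = static_cast<unsigned char>(0x80 | (ch & 0x3F));
        }
        ++charsEaten;
    }
    return written;
}
```

Given the single character 0x254B and a 2-byte output buffer, this sketch returns 0 and leaves charsEaten at 0 without anything having gone wrong. With a 3-byte buffer it eats one character and writes E2 95 8B.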

As I see it, the Throw at line 624 of TransService.cpp is unnecessary.  There 
are obviously cases where there are no characters 'read' because there isn't 
enough memory to read them yet. The TransService should be able to rely upon 
the transcodeTo() method to handle exceptions, rather than just be 'surprised' 
at what it gets back.

Therefore, there is no error when no bytes are 'read' - instead the memory 
should be increased and the transcoding should continue.

OTOH, it seems reasonable for transcode() to test for a zero-length string 
before calling transcodeTo().
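
A rough sketch of the behaviour I have in mind, written as standalone C++ rather than as a patch to TransService.cpp. The names transcodeGrowing and EncodeChunkFn are hypothetical, the encoder is passed in as a callable with a transcodeTo()-like shape (returns bytes written, sets charsEaten), and the 4-byte limit assumes the target encoding is UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical callable shaped like a transcodeTo()-style call:
// (src, srcCount, out, maxBytes, charsEaten) -> bytes written.
using EncodeChunkFn = std::function<std::size_t(
    const char16_t*, std::size_t, unsigned char*, std::size_t, std::size_t&)>;

std::string transcodeGrowing(const std::u16string& src,
                             const EncodeChunkFn& encodeChunk,
                             std::size_t initialCapacity = 2)
{
    std::string result;
    if (src.empty())
        return result;  // zero-length input: never call the encoder at all

    // Assumption: UTF-8 output, where one character needs at most 4 bytes.
    const std::size_t kMaxBytesPerChar = 4;
    std::vector<unsigned char> buf(initialCapacity > 0 ? initialCapacity : 1);
    std::size_t pos = 0;

    while (pos < src.size()) {
        std::size_t eaten = 0;
        const std::size_t written = encodeChunk(
            src.data() + pos, src.size() - pos, buf.data(), buf.size(), eaten);
        assert(eaten <= src.size() - pos);

        if (eaten == 0) {
            // Nothing eaten is only an error if the buffer could already
            // hold any single character; otherwise grow it and try again.
            if (buf.size() >= kMaxBytesPerChar)
                throw std::runtime_error("encoder made no progress");
            buf.resize(buf.size() * 2);
            continue;
        }
        result.append(reinterpret_cast<const char*>(buf.data()), written);
        pos += eaten;
    }
    return result;
}
```

The point of the design is that a zero 'eaten' count triggers a throw only once the buffer is provably big enough for one character. Before that, it is treated as "need more room".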

Just my humble 2¢.

> XMLUTF8Transcoder::transcodeTo  fails with an exception when transcoding 
> single characters that require 3 or more bytes as UTF8.
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESC-1947
>                 URL: https://issues.apache.org/jira/browse/XERCESC-1947
>             Project: Xerces-C++
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.1.1
>         Environment: Tested on mac os and debian linux. The failure is only 
> manifest on v3.1.x
>            Reporter: Ben Griffin
>            Priority: Critical
>         Attachments: transtest.cpp
>
>
> This can be demonstrated with the following 2 lines of code.
>       const XMLCh uval[] = { 0x254B, 0x0000 }; // BOX DRAWINGS HEAVY VERTICAL AND HORIZONTAL (needs 3 bytes for utf-8)
>       char* uc = (char*)TranscodeToStr(uval, "UTF-8").adopt(); // faulty exception
>       cout << uc << endl << flush;
>       XMLString::release(&uc);
> The error is: "terminate called after throwing an instance of 
> 'xercesc_3_1::TranscodingException'"

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

