[
https://issues.apache.org/jira/browse/XERCESC-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Scott Cantor closed XERCESC-1984.
---------------------------------
Applied to 3.1 branch, r1662885
> TranscodeToStr::transcode throws an exception when transcoding to UTF-8
> -----------------------------------------------------------------------
>
> Key: XERCESC-1984
> URL: https://issues.apache.org/jira/browse/XERCESC-1984
> Project: Xerces-C++
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 3.2.0, 4.0.0
> Environment: Bug reproducible on a Red Hat 5 based platform. The bug
> doesn't seem to be platform specific though.
> Reporter: Dan PV
> Assignee: Alberto Massari
> Labels: exception, transcode
> Fix For: 3.1.2, 3.2.0
>
> Attachments: transtest2.cpp
>
>
> This issue relates to the bug fix for issue XERCESC-1947. There are still
> cases where the method throws an exception instead of returning the
> transcoded string. See the attached "transtest2.cpp" to reproduce the
> issue.
> The cause seems to be the condition
> "if((allocSize - fBytesWritten) < (len - charsDone))" that was added to
> "TranscodeToStr::transcode". In my test case I have a string composed of 6
> Japanese characters (i.e. "絞り込み検索"), each of which takes 3 bytes in
> UTF-8. After the first call to "XMLUTF8Transcoder::transcodeTo",
> "charsRead" reports that 5 XMLCh were read. Since the initial buffer
> allocated for this string was 16 bytes, the condition evaluates as
> "if((16 - 15) < (6 - 5))", which is false, so no larger buffer is
> allocated for the UTF-8 encoded version of the string.
> Since the reallocation doesn't take place, the code calls
> "XMLUTF8Transcoder::transcodeTo" again, but this time "charsRead" is set
> to 0 because the 1 remaining byte cannot hold the last character, and this
> triggers an exception of type "Trans_BadSrcSeq".
> I suppose the goal of the added condition was to avoid an unnecessary
> buffer reallocation, but unfortunately it doesn’t work when transcoding
> to a variable-length encoding like UTF-8. The solution is probably to simply
> replace the condition with "if(charsDone < len)".
> Regards,
> Dan
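
The arithmetic in the report can be checked with a small standalone sketch.
This is not the Xerces-C++ source: encodeSome, transcodeSketch and GrowPolicy
are hypothetical names, the encoder only handles BMP code units, and doubling
the buffer is an assumed growth strategy. Only the two conditions and the
16-byte starting size are taken from the report.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for XMLUTF8Transcoder::transcodeTo (BMP code units
// only). Encodes whole characters while they fit in `room` bytes, reports the
// number consumed in charsRead, and returns the number of bytes written.
static std::size_t encodeSome(const char16_t* src, std::size_t count,
                              unsigned char* dst, std::size_t room,
                              std::size_t& charsRead)
{
    std::size_t written = 0;
    charsRead = 0;
    while (charsRead < count) {
        const unsigned c = src[charsRead];
        const std::size_t need = c < 0x80 ? 1 : (c < 0x800 ? 2 : 3);
        if (written + need > room) break;   // next character does not fit
        if (need == 1) {
            dst[written++] = static_cast<unsigned char>(c);
        } else if (need == 2) {
            dst[written++] = static_cast<unsigned char>(0xC0 | (c >> 6));
            dst[written++] = static_cast<unsigned char>(0x80 | (c & 0x3F));
        } else {
            dst[written++] = static_cast<unsigned char>(0xE0 | (c >> 12));
            dst[written++] = static_cast<unsigned char>(0x80 | ((c >> 6) & 0x3F));
            dst[written++] = static_cast<unsigned char>(0x80 | (c & 0x3F));
        }
        ++charsRead;
    }
    return written;
}

// Reported: the condition quoted in the report. Proposed: the suggested fix.
enum class GrowPolicy { Reported, Proposed };

// Sketch of the loop structure described in the report (an assumption, not
// the library code): encode, fail if nothing was consumed, then maybe grow.
static std::string transcodeSketch(const std::u16string& src,
                                   std::size_t allocSize, GrowPolicy policy)
{
    assert(allocSize > 0);
    std::vector<unsigned char> buf(allocSize);
    std::size_t bytesWritten = 0;
    std::size_t charsDone = 0;
    const std::size_t len = src.size();

    while (charsDone < len) {
        std::size_t charsRead = 0;
        bytesWritten += encodeSome(src.data() + charsDone, len - charsDone,
                                   buf.data() + bytesWritten,
                                   allocSize - bytesWritten, charsRead);
        if (charsRead == 0)
            throw std::runtime_error("Trans_BadSrcSeq");
        charsDone += charsRead;

        const bool grow = (policy == GrowPolicy::Reported)
            ? (allocSize - bytesWritten) < (len - charsDone)  // compares bytes to chars
            : charsDone < len;                                // grow while input remains
        if (grow) {
            allocSize *= 2;          // assumed growth strategy
            buf.resize(allocSize);
        }
    }
    return std::string(buf.begin(), buf.begin() + bytesWritten);
}
```

With the six-character string from the report and a 16-byte buffer, the
Reported policy compares 1 free byte against 1 remaining character, skips the
reallocation, and the second pass consumes nothing. The Proposed policy grows
the buffer whenever input remains, so the last 3-byte character fits.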