[ https://issues.apache.org/jira/browse/XERCESC-2158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Scott Cantor resolved XERCESC-2158.
-----------------------------------
Resolution: Fixed
Holding my breath a little but I do think this is a correct fix.
r1872119
> XMLUTF8Transcoder: One multibyte UTF8 character is swallowed from the srcData
> when the resulting surrogate pair does not fit in toFill at the end
> -------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: XERCESC-2158
> URL: https://issues.apache.org/jira/browse/XERCESC-2158
> Project: Xerces-C++
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 3.2.0, 3.1.4, 3.2.1, 3.2.2
> Environment: OS independent: Linux (RedHat 7.5)/Windows 10
> Compiler independent
> Reporter: Johannes Willnecker
> Assignee: Scott Cantor
> Priority: Major
> Fix For: 3.2.3
>
> Attachments: UTF8.xml, xerces.patch
>
>
> *Bug found in Xerces-C++ version 3.1.4* (based on code review, newer
> versions are also affected)
>
> *How to reproduce:* Run SAX2Print on the attached UTF8.xml file ("SAX2Print
> UTF8.xml").
> One Chinese character is missing from the name attribute of the
> second-to-last Instance element.
> *Fix:* The fix for this bug is included in the xerces.patch file.
> In XMLUTF8Transcoder.cpp a check for this case already exists, but its
> assumption that the bytes read are only updated at the end of the loop
> is wrong.
> The bytes-read count (bytesEaten) is calculated from srcPtr, which has
> already been advanced when the check is made.
> Therefore srcPtr needs to be moved back when the surrogate pair does not
> fit into the toFill buffer.
>
> *Contributor related:*
> Author Name of the code being contributed: Johannes Willnecker
> Employer: Siemens AG
> I have the right to grant the copyright licenses for the contribution.
> My employer has rights to the code that I have written. My employer gave me
> permission to contribute this code on its behalf.
> I am not aware of any third-party license or other restrictions.
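
The failure mode described in the report can be illustrated with a small standalone sketch. This is not the Xerces-C++ code or the attached patch: transcodeChunk, transcodeAll, and the repositionOnOverflow flag are hypothetical names invented for this illustration, and the decoder assumes well-formed UTF-8. It only shows why computing bytesEaten from an already-advanced srcPtr drops a character when the surrogate pair does not fit, and how moving srcPtr back avoids that.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified chunked UTF-8 -> UTF-16 transcoder (NOT the Xerces API).
// Assumes well-formed UTF-8. Returns the number of UTF-16 units written to toFill
// and reports the number of source bytes consumed through bytesEaten.
std::size_t transcodeChunk(const unsigned char* src, std::size_t srcCount,
                           char16_t* toFill, std::size_t maxChars,
                           std::size_t& bytesEaten, bool repositionOnOverflow)
{
    const unsigned char* srcPtr = src;
    const unsigned char* srcEnd = src + srcCount;
    char16_t* outPtr = toFill;
    char16_t* outEnd = toFill + maxChars;

    while (srcPtr < srcEnd && outPtr < outEnd) {
        const unsigned char first = *srcPtr;
        const std::size_t len = first < 0x80 ? 1 : first < 0xE0 ? 2 : first < 0xF0 ? 3 : 4;
        if (srcPtr + len > srcEnd)
            break;  // incomplete sequence: leave it for the next call

        // Decode one code point from the lead byte and its continuation bytes.
        std::uint32_t cp = (len == 1) ? first : (first & (0xFFu >> (len + 1)));
        for (std::size_t i = 1; i < len; ++i)
            cp = (cp << 6) | (srcPtr[i] & 0x3Fu);

        srcPtr += len;  // srcPtr is advanced BEFORE the capacity check below

        if (cp > 0xFFFF) {
            if (outPtr + 1 >= outEnd) {
                // No room for both surrogates. Without repositioning, the bytes
                // of this character are still counted in bytesEaten and are lost.
                if (repositionOnOverflow)
                    srcPtr -= len;
                break;
            }
            cp -= 0x10000;
            *outPtr++ = static_cast<char16_t>(0xD800 + (cp >> 10));
            *outPtr++ = static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        } else {
            *outPtr++ = static_cast<char16_t>(cp);
        }
    }

    bytesEaten = static_cast<std::size_t>(srcPtr - src);  // derived from srcPtr
    return static_cast<std::size_t>(outPtr - toFill);
}

// Hypothetical driver: transcodes a whole string through a small output buffer.
std::u16string transcodeAll(const std::string& utf8, std::size_t bufChars,
                            bool repositionOnOverflow)
{
    std::u16string out;
    std::vector<char16_t> buf(bufChars);
    const unsigned char* data = reinterpret_cast<const unsigned char*>(utf8.data());
    std::size_t pos = 0;
    while (pos < utf8.size()) {
        std::size_t eaten = 0;
        const std::size_t made = transcodeChunk(data + pos, utf8.size() - pos,
                                                buf.data(), bufChars, eaten,
                                                repositionOnOverflow);
        if (eaten == 0)
            break;  // no progress possible with this buffer size
        out.append(buf.data(), made);
        pos += eaten;
    }
    return out;
}
```

With a three-unit output buffer and the input "ab" followed by U+20000 and "c", the supplementary character lands exactly where only one unit of space is left. Calling transcodeAll with repositionOnOverflow set to false yields "abc" with the character swallowed, while setting it to true carries the character over to the next chunk and preserves it.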