[jira] [Updated] (XERCESC-2158) XMLUTF8Transcoder: One multibyte UTF8 character is swallowed from the srcData when the resulting surrogate pair does not fit in toFill at the end

2019-12-09 Thread Scott Cantor (Jira)


 [ 
https://issues.apache.org/jira/browse/XERCESC-2158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Cantor updated XERCESC-2158:
--
Affects Version/s: 3.2.0, 3.2.1

> XMLUTF8Transcoder: One multibyte UTF8 character is swallowed from the srcData 
> when the resulting surrogate pair does not fit in toFill at the end
> -
>
> Key: XERCESC-2158
> URL: https://issues.apache.org/jira/browse/XERCESC-2158
> Project: Xerces-C++
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 3.2.0, 3.1.4, 3.2.1, 3.2.2
> Environment: OS independent: Linux (RedHat 7.5)/Windows 10; compiler independent
> Reporter: Johannes Willnecker
> Priority: Major
> Fix For: 3.2.3
>
> Attachments: UTF8.xml, xerces.patch
>
>
> *Bug found in Xerces-C++ version 3.1.4* (code review indicates that newer 
> versions are also affected)
>  
> *How to reproduce:* Run SAX2Print on the attached UTF8.xml file: "SAX2Print 
> UTF8.xml".
> One Chinese character is missing from the name attribute of the 
> second-to-last Instance element.
> *Fix:* The fix for this bug is included in the xerces.patch file.
> XMLUTF8Transcoder.cpp already contains a check for this case, but it relies 
> on the incorrect assumption that the bytes read are updated at the end of 
> the loop.
> The bytes-read count (bytesEaten) is calculated from srcPtr, which has 
> already been advanced by the time the check is made.
> Therefore srcPtr needs to be repositioned to the start of that character 
> when the surrogate pair does not fit into the toFill buffer.
>  
> *Contributor related:*
> Author Name of the code being contributed: Johannes Willnecker
> Employer: Siemens AG
> I have the right to grant the copyright licenses for the contribution.
> My employer has rights to the code that I have written. My employer gave me 
> permission to contribute this code on its behalf.
> I am not aware of any third-party license or other restrictions.
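
The failure mode described in the quoted report can be illustrated with a small, self-contained sketch. This is not the Xerces-C++ source and not the contents of xerces.patch: transcodeChunk, transcodeAll, and the rewindOnOverflow flag are hypothetical names made up for this illustration, and the decoder assumes well-formed UTF-8. It only mirrors the pattern the report describes: srcPtr is advanced before the "does the surrogate pair fit?" check, and bytesEaten is later derived from srcPtr.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a UTF-8 -> UTF-16 transcoding loop.
// Assumption: the input is well-formed UTF-8 (no validation is done here).
std::size_t transcodeChunk(const unsigned char* src, std::size_t srcCount,
                           char16_t* toFill, std::size_t maxChars,
                           std::size_t& bytesEaten, bool rewindOnOverflow)
{
    const unsigned char* srcPtr = src;
    const unsigned char* const srcEnd = src + srcCount;
    char16_t* outPtr = toFill;
    char16_t* const outEnd = toFill + maxChars;

    while (srcPtr < srcEnd && outPtr < outEnd) {
        const unsigned char first = *srcPtr;
        const std::size_t len =
            first < 0x80 ? 1 : first < 0xE0 ? 2 : first < 0xF0 ? 3 : 4;
        if (srcPtr + len > srcEnd)
            break; // incomplete sequence: leave it for the next call

        // Decode one code point from its lead byte and continuation bytes.
        std::uint32_t cp = (len == 1) ? first : (first & (0xFF >> (len + 1)));
        for (std::size_t i = 1; i < len; ++i)
            cp = (cp << 6) | (srcPtr[i] & 0x3F);
        srcPtr += len; // the source pointer moves BEFORE the output check

        if (cp < 0x10000) {
            *outPtr++ = static_cast<char16_t>(cp);
        } else {
            if (outPtr + 1 >= outEnd) {
                // Only one output slot is left, so the pair does not fit.
                // Without the rewind, bytesEaten below already counts the
                // bytes of this character and the caller never sees it again.
                if (rewindOnOverflow)
                    srcPtr -= len;
                break;
            }
            cp -= 0x10000;
            *outPtr++ = static_cast<char16_t>(0xD800 + (cp >> 10));
            *outPtr++ = static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    bytesEaten = static_cast<std::size_t>(srcPtr - src); // derived from srcPtr
    return static_cast<std::size_t>(outPtr - toFill);
}

// Hypothetical caller that transcodes a whole string through a small buffer,
// resuming each call at the offset reported through bytesEaten.
std::u16string transcodeAll(const std::string& utf8, std::size_t bufChars,
                            bool rewindOnOverflow)
{
    std::u16string result;
    std::vector<char16_t> buf(bufChars);
    std::size_t offset = 0;
    while (offset < utf8.size()) {
        std::size_t eaten = 0;
        const std::size_t made = transcodeChunk(
            reinterpret_cast<const unsigned char*>(utf8.data()) + offset,
            utf8.size() - offset, buf.data(), buf.size(), eaten,
            rewindOnOverflow);
        if (eaten == 0)
            break; // no progress possible (e.g. buffer too small for a pair)
        result.append(buf.data(), made);
        offset += eaten;
    }
    return result;
}
```

With a three-unit output buffer and the input "ab", U+20000, "c", calling transcodeAll with rewindOnOverflow set to false loses the supplementary character, while setting it to true carries the character over to the next chunk intact.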



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: c-dev-unsubscr...@xerces.apache.org
For additional commands, e-mail: c-dev-h...@xerces.apache.org



[jira] [Updated] (XERCESC-2158) XMLUTF8Transcoder: One multibyte UTF8 character is swallowed from the srcData when the resulting surrogate pair does not fit in toFill at the end

2019-12-09 Thread Scott Cantor (Jira)


 [ 
https://issues.apache.org/jira/browse/XERCESC-2158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Cantor updated XERCESC-2158:
--
Fix Version/s: 3.2.3



