[ http://issues.apache.org/jira/browse/XERCESC-1654?page=all ]
Alberto Massari resolved XERCESC-1654.
--------------------------------------

    Resolution: Fixed

Hi Boris,
I have changed the code to use the same computation provided by the Unicode web site; however, the current code is not wrong, as masking with 0x3FF will in any case discard the bits that the (ch - 0x10000) operation changed.

> Bug in surrogate handling in internal UCS4 transcoder
> -----------------------------------------------------
>
>                 Key: XERCESC-1654
>                 URL: http://issues.apache.org/jira/browse/XERCESC-1654
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 2.7.0
>         Environment: any
>            Reporter: Boris Kolpackov
>
> In util/XMLUCSTranscoder.cpp there is the following code which handles
> surrogates (line 110):
>
> const XMLCh ch1 = XMLCh(((nextVal - 0x10000) >> 10) + 0xD800);
> const XMLCh ch2 = XMLCh(((nextVal - 0x10000) & 0x3FF) + 0xDC00);
>
> I believe the second line should be:
>
> const XMLCh ch2 = XMLCh((nextVal & 0x3FF) + 0xDC00);
>
> See http://unicode.org/unicode/faq/utf_bom.html#35 for confirmation. Also,
> while at it, I would suggest renaming XMLUCSTranscoder.cpp to
> XMLUCS4Transcoder.cpp so that it is consistent with XMLUCS4Transcoder.hpp.
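
For reference, the equivalence discussed in this issue can be checked with a minimal sketch. Subtracting 0x10000 only touches bit 16 and above, and the 0x3FF mask keeps bits 0 to 9, so both forms of the low surrogate should agree. The helper names below are hypothetical and not part of Xerces-C++, and XMLCh is assumed here to be a plain 16-bit unsigned type standing in for the library's own typedef.

```cpp
#include <cstdint>

// Stand-in for the Xerces-C++ XMLCh type; assumed to be a 16-bit unit here.
using XMLCh = std::uint16_t;

// High (lead) surrogate, as in the code quoted in the report.
// Valid only for supplementary code points 0x10000..0x10FFFF.
constexpr XMLCh highSurrogate(std::uint32_t nextVal) {
    return static_cast<XMLCh>(((nextVal - 0x10000) >> 10) + 0xD800);
}

// Low (trail) surrogate, form used by the code quoted in the report.
constexpr XMLCh lowSurrogateSubtract(std::uint32_t nextVal) {
    return static_cast<XMLCh>(((nextVal - 0x10000) & 0x3FF) + 0xDC00);
}

// Low (trail) surrogate, form proposed by the reporter.
constexpr XMLCh lowSurrogateDirect(std::uint32_t nextVal) {
    return static_cast<XMLCh>((nextVal & 0x3FF) + 0xDC00);
}

// Hypothetical check: compare both low-surrogate forms for every
// supplementary code point and report whether they always match.
inline bool lowSurrogateFormsAgree() {
    for (std::uint32_t cp = 0x10000; cp <= 0x10FFFF; ++cp) {
        if (lowSurrogateSubtract(cp) != lowSurrogateDirect(cp)) {
            return false;
        }
    }
    return true;
}
```

As a spot check, U+1D11E should come out as the pair 0xD834, 0xDD1E with either form of the low surrogate.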