http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25498

Win32Transcoder does not properly transcode ISO-8859-2 and other encodings

           Summary: Win32Transcoder does not properly transcode ISO-8859-2
                    and other encodings
           Product: Xerces-C++
           Version: 2.4.0
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: Major
          Priority: Other
         Component: Utilities
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]
                CC: [EMAIL PROTECTED]

Win32TransService scans the Windows registry for supported charsets and
reads the "Codepage" and "InternetEncoding" values. For many charsets these
values are equal, but not for all. When a Win32Transcoder object is created
for a given charset, the "Codepage" value is stored in the fWinCP member and
the "InternetEncoding" value in the fIECP member.

Win32Transcoder methods use the fWinCP value and pass it to Windows API
functions such as ::MultiByteToWideChar. This is wrong. The fIECP value
should be used instead.

For example, when transcoding from the ISO-8859-2 encoding, fWinCP is 1250
and fIECP is 28592. Win32Transcoder::transcodeFrom(...) calls
::MultiByteToWideChar(1250, ...). This transcodes from the Windows-1250 code
page, not from ISO-8859-2, and the result is wrong.

The proposed patch: replace fWinCP with fIECP in all calls of Windows API
functions in all Win32Transcoder methods.

In Win32Transcoder::transcodeFrom:

    ...............
    const unsigned int toEat = ::IsDBCSLeadByteEx(fIECP, *inPtr) ? 2 : 1;

    // Make sure a whole char is in the source
    if (inPtr + toEat > inEnd)
        break;

    // Try to translate this next char and check for an error
    const unsigned int converted = ::MultiByteToWideChar
    (
        fIECP,
        MB_PRECOMPOSED | MB_ERR_INVALID_CHARS,
        (const char*)inPtr,
        toEat,
        outPtr,
        1
    );
    ...............

In Win32Transcoder::transcodeTo:

    ...............
    const unsigned int bytesStored = ::WideCharToMultiByte
    (
        fIECP,
        WC_COMPOSITECHECK | WC_SEPCHARS,
        srcPtr,
        1,
        (char*)outPtr,
        outEnd - outPtr,
        0,
        &usedDef
    );
    ...............
In Win32Transcoder::canTranscodeTo:

    ...............
    const unsigned int bytesStored = ::WideCharToMultiByte
    (
        fIECP,
        WC_COMPOSITECHECK | WC_SEPCHARS,
        srcBuf,
        srcCount,
        tmpBuf,
        64,
        0,
        &usedDef
    );
    ...............
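To make the effect of the code page mix-up concrete, here is a small portable sketch that does not call the Windows API at all. It hard-codes a handful of byte values where Windows-1250 and ISO-8859-2 map to different Unicode code points, and then decodes the same byte once with the "Codepage" value and once with the "InternetEncoding" value. The names CharsetEntry, kLatin2, decodeByte, decodeAsReported and decodeAsPatched are hypothetical and exist only for this illustration; they are not part of Xerces-C++. The tables are a tiny excerpt, not complete mappings.

```cpp
// Hypothetical stand-in for the two registry values described in the report.
struct CharsetEntry {
    unsigned int winCP; // "Codepage" value, stored in fWinCP
    unsigned int ieCP;  // "InternetEncoding" value, stored in fIECP
};

// Values quoted in the report for the ISO-8859-2 charset.
const CharsetEntry kLatin2 = {1250, 28592};

// Decode one byte using a tiny excerpt of each mapping table.
// Only a few bytes that differ between the two code pages are listed;
// anything else above 0x7F yields U+FFFD in this sketch.
char32_t decodeByte(unsigned int codePage, unsigned char b) {
    if (b < 0x80)
        return b; // ASCII range is identical in both code pages

    if (codePage == 28592) { // ISO-8859-2
        switch (b) {
            case 0xA1: return 0x0104; // LATIN CAPITAL LETTER A WITH OGONEK
            case 0xA5: return 0x013D; // LATIN CAPITAL LETTER L WITH CARON
            case 0xA9: return 0x0160; // LATIN CAPITAL LETTER S WITH CARON
            case 0xB1: return 0x0105; // LATIN SMALL LETTER A WITH OGONEK
            case 0xB9: return 0x0161; // LATIN SMALL LETTER S WITH CARON
        }
    } else if (codePage == 1250) { // Windows-1250
        switch (b) {
            case 0xA1: return 0x02C7; // CARON
            case 0xA5: return 0x0104; // LATIN CAPITAL LETTER A WITH OGONEK
            case 0xA9: return 0x00A9; // COPYRIGHT SIGN
            case 0xB1: return 0x00B1; // PLUS-MINUS SIGN
            case 0xB9: return 0x0105; // LATIN SMALL LETTER A WITH OGONEK
        }
    }
    return 0xFFFD; // not covered by this excerpt
}

// Behaviour described in the report: the "Codepage" value is used.
char32_t decodeAsReported(const CharsetEntry& e, unsigned char b) {
    return decodeByte(e.winCP, b);
}

// Behaviour after the proposed patch: the "InternetEncoding" value is used.
char32_t decodeAsPatched(const CharsetEntry& e, unsigned char b) {
    return decodeByte(e.ieCP, b);
}
```

With these definitions, decodeAsReported(kLatin2, 0xB1) yields U+00B1 (plus-minus sign), while decodeAsPatched(kLatin2, 0xB1) yields U+0105 (a with ogonek), which is what an ISO-8859-2 document intends. ASCII input decodes identically either way, which may explain why the problem goes unnoticed with plain English text.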