Mingye Wang added the comment:

> Codecs are strict by default in Python. Call MultiByteToWideChar() with the
> MB_ERR_INVALID_CHARS flag as Python does.
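For reference, the strict default mentioned in the quote can be seen from Python itself, without touching the Windows API. This is only a minimal sketch: the helper names `try_decode` and `try_encode` are made up for illustration, and it exercises Python's own cp1252 table rather than MultiByteToWideChar().

```python
def try_decode(data, encoding, errors="strict"):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return data.decode(encoding, errors)
    except UnicodeDecodeError:
        return None


def try_encode(text, encoding, errors="strict"):
    """Return the encoded bytes, or None if the codec rejects the text."""
    try:
        return text.encode(encoding, errors)
    except UnicodeEncodeError:
        return None


# Byte 0x81 is unassigned in Python's cp1252 table, so strict decoding fails.
strict_decoded = try_decode(b"\x81", "cp1252")
# A non-strict handler has to be requested explicitly.
lenient_decoded = try_decode(b"\x81", "cp1252", "replace")

# U+0100 (LATIN CAPITAL LETTER A WITH MACRON) is not in cp1252 either.
# Python refuses to encode it by default instead of guessing a look-alike.
strict_encoded = try_encode("\u0100", "cp1252")
lenient_encoded = try_encode("\u0100", "cp1252", "replace")

print(strict_decoded, repr(lenient_decoded), strict_encoded, lenient_encoded)
```

In both directions the default is to raise, and any substitution only happens when an error handler other than 'strict' is passed.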
Great catch. Without MB_ERR_INVALID_CHARS (for MultiByteToWideChar()) or WC_NO_BEST_FIT_CHARS (for WideCharToMultiByte()), Windows performs the "best fit" behavior described in the BestFit files. Those files do not mark best-fit entries explicitly (there is no '<< Best Fit Mapping' annotation like the one shown in the readme), so identifying one requires checking whether a reverse mapping exists[1]. When MB_ERR_INVALID_CHARS is set, Windows performs a strict check.

[2]: http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/readme.txt

By the way, will there be a 'mbcsbestfitreplace' error handler on Windows to invoke the "best fit" behavior? It might be useful for interoperating with common Windows programs and users. (An implementation for other platforms could be constructed from the WindowsBestFit charts, but it might be too large relative to its usefulness.)

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue28712>
_______________________________________