Mingye Wang added the comment:

> Codecs are strict by default in Python. Call MultiByteToWideChar() with the 
> MB_ERR_INVALID_CHARS flag as Python does.

Great catch. Without MB_ERR_INVALID_CHARS or WC_NO_BEST_FIT_CHARS, Windows 
performs the "best fit" behavior described in the BestFit files. Those files 
do not mark best-fit mappings explicitly (they didn't add '<< Best Fit 
Mapping' as the readme does), so identifying one requires checking whether a 
reverse mapping exists[1]. When MB_ERR_INVALID_CHARS is set, Windows performs 
a strict check.

  [1]: http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/readme.txt

By the way, will there be an 'mbcsbestfitreplace' error handler on Windows to 
invoke the "best fit" behavior? It might be useful for interoperating with 
common Windows programs and users. (An implementation for other platforms 
could be constructed from the WindowsBestFit charts, but the tables might be 
too large relative to their usefulness.)
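
A rough sketch of how such a handler could look in pure Python, assuming it 
is built on codecs.register_error(). The handler name 
'bestfitreplace_sketch' and the two-entry BEST_FIT table are hypothetical 
and purely illustrative; a real table would have to be generated from the 
WindowsBestFit chart for the target code page.

```python
import codecs

# Hypothetical, hand-written excerpt. These entries are illustrative only
# and have not been generated from the WindowsBestFit charts.
BEST_FIT = {
    "\u0100": "A",  # LATIN CAPITAL LETTER A WITH MACRON
    "\u0101": "a",  # LATIN SMALL LETTER A WITH MACRON
}


def bestfit_replace(exc):
    """Substitute a look-alike character if one is known, else '?'."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    unencodable = exc.object[exc.start:exc.end]
    replacement = "".join(BEST_FIT.get(ch, "?") for ch in unencodable)
    # Return the replacement text and the position at which to resume.
    return replacement, exc.end


# Hypothetical handler name; not part of the standard library.
codecs.register_error("bestfitreplace_sketch", bestfit_replace)

encoded = "\u0100bc\u4e2d".encode("cp1252", errors="bestfitreplace_sketch")
```

Here U+0100 falls back to "A" through the table, while U+4E2D has no entry 
and becomes "?". This only covers encoding; it says nothing about how 
Windows itself resolves best-fit mappings.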

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue28712>
_______________________________________