[issue45120] Windows cp encodings "UNDEFINED" entries update

Marc-Andre Lemburg Fri, 17 Sep 2021 00:35:07 -0700


Marc-Andre Lemburg <[email protected]> added the comment:


Just to be clear: The Python code page encodings are (mostly) taken from the 
unicode.org set of mappings (ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/). 
This is our standards body for such mappings, where possible. In some cases, 
the Unicode consortium does not provide such mappings and we resort to other 
standards (ISO, commonly used mapping files in OSes, Wikipedia, etc.).
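
As a rough illustration of what the "UNDEFINED" entries mean in practice, the sketch below probes every byte value against Python's cp1252 codec and collects the ones that fail to decode. It assumes a CPython build whose cp1252 table still follows the unicode.org mapping file; the variable name `undefined` is only illustrative.

```python
# Probe each byte value and record those the codec treats as undefined.
undefined = []
for byte in range(256):
    try:
        bytes([byte]).decode("cp1252")
    except UnicodeDecodeError:
        # The mapping table has no entry for this byte.
        undefined.append(byte)

print([hex(b) for b in undefined])
```

Bytes reported here are the positions the vendor mapping leaves unassigned, which is why strict decoding rejects them.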

Changes to the existing mapping codecs should only be made when the standards 
bodies apply corrections to the mappings published under those names.

If you want to add variants such as the best fit ones from MS, we'd have to add 
them under a different name, e.g. bestfit1252 (see 
ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/).

Otherwise, interop with other systems would no longer work.
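
A minimal sketch of how a variant could live under its own name without touching cp1252 itself. The codec name `bestfit1252_sketch` is hypothetical, and the table is only an assumption for illustration: it copies the stdlib cp1252 table and maps each undefined byte to the C1 control character with the same value. It is not the actual content of the MS best fit file.

```python
import codecs
from encodings import cp1252

# Charmap decoding tables use U+FFFE to mark undefined byte positions.
UNDEFINED = "\ufffe"

# Assumption for illustration only: fill each undefined position with the
# code point equal to the byte value (a C1 control character).
decoding_table = "".join(
    chr(i) if ch == UNDEFINED else ch
    for i, ch in enumerate(cp1252.decoding_table)
)
encoding_table = codecs.charmap_build(decoding_table)


def _encode(text, errors="strict"):
    return codecs.charmap_encode(text, errors, encoding_table)


def _decode(data, errors="strict"):
    return codecs.charmap_decode(data, errors, decoding_table)


def _search(name):
    # Hypothetical codec name; the stdlib "cp1252" entry is left untouched.
    if name == "bestfit1252_sketch":
        return codecs.CodecInfo(name=name, encode=_encode, decode=_decode)
    return None


codecs.register(_search)

print(repr(b"\x81".decode("bestfit1252_sketch")))
```

Because the variant is registered under a separate name, code that asks for "cp1252" keeps the standard mapping, and only callers that opt in get the extended table.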

From Eryk's description it sounds like we should always pass 
WC_NO_BEST_FIT_CHARS as a flag to WideCharToMultiByte() in order to make 
sure it doesn't use best fit variants unless explicitly requested.

----------

_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue45120>
_______________________________________