Eryk Sun added the comment:
The issue isn't quite the same for 3.5+. The new CRT uses Windows Vista locale
APIs. In this case it uses LOCALE_SENGLISHLANGUAGENAME instead of the old
LOCALE_SENGLANGUAGE. This maps "Norwegian" to simply "Norwegian" instead of
"Norwegian Bokmål":
    >>> locale.setlocale(locale.LC_TIME, 'norwegian')
    'Norwegian_Norway.1252'
The "Norwegian Bokmål" language name has to be requested explicitly to see the
same problem:
    >>> try: locale.setlocale(locale.LC_TIME, 'Norwegian Bokmål')
    ... except Exception as e: print(e)
    ...
    unsupported locale setting
The fix for 3.4 would be to encode the locale string using
PyUnicode_AsMBCSString (ANSI). It's too late, however, since 3.4 is no longer
getting bug fixes.
For 3.5+, setlocale could either switch to using _wsetlocale on Windows or call
setlocale with the string encoded via Py_EncodeLocale (wcstombs). Encoding the
string via wcstombs is required because the new CRT roundtrips the conversion
via mbstowcs before forwarding the call to _wsetlocale. This means that success
depends on the current LC_CTYPE, unless Python switches to calling _wsetlocale
directly.
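To illustrate why success depends on the active codepage, here is a rough sketch in Python. It is only an analogy: Python codecs stand in for the CRT's wcstombs/mbstowcs pair, which can differ in details, and the helper name survives_roundtrip is made up for this example.

```python
def survives_roundtrip(name, encoding):
    """Return True if `name` encodes to `encoding` and decodes back unchanged.

    Hypothetical helper; Python codecs approximate the CRT conversion here.
    """
    try:
        return name.encode(encoding).decode(encoding) == name
    except UnicodeError:
        # The codepage has no representation for at least one character.
        return False


# "å" exists in the Western European codepage 1252 ...
print(survives_roundtrip('Norwegian Bokmål', 'cp1252'))
# ... but not in plain ASCII or in the Cyrillic codepage 1251.
print(survives_roundtrip('Norwegian Bokmål', 'ascii'))
print(survives_roundtrip('Norwegian Bokmål', 'cp1251'))
```

A name that fails this kind of roundtrip under the current LC_CTYPE codepage is the case where the non-ASCII locale name would not reach _wsetlocale intact.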
As a workaround for 3.5+, the new CRT also supports RFC 4646 language-tag
locales when running on Vista or later. For example, "Norwegian Bokmål" is
simply "nb".
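A minimal sketch of how calling code might apply this workaround: try the full language name first, then fall back to the language tag. The helper name set_first_supported_locale is an assumption for this example, not something in the locale module.

```python
import locale


def set_first_supported_locale(category, candidates):
    """Try each locale name in order and return the string setlocale accepted.

    Hypothetical helper: raises locale.Error if no candidate is supported.
    """
    for name in candidates:
        try:
            return locale.setlocale(category, name)
        except locale.Error:
            # This name is not supported here; try the next one.
            continue
    raise locale.Error('no supported locale among %r' % (list(candidates),))


# Intended use on Windows with 3.5+ (not run here):
# set_first_supported_locale(locale.LC_TIME, ['Norwegian Bokmål', 'nb'])
```

Which candidate succeeds depends on the platform and CRT version, so the order of the list is a matter of preference.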
Language-tag locales differ from POSIX locales. Superficially, they use "-"
instead of "_" as the delimiter. More importantly, they don't allow explicitly
setting the codeset. Instead of a ".codeset" suffix, they can include an
ISO 15924 script code.
Specifying a script may select a different ANSI codepage. It depends on whether
there's an NLS definition for the language-script combination. For example,
Bosnian can be written using either Latin or Cyrillic. Thus the "bs-BA" and
"bs-Latn-BA" locales use the Central Europe codepage 1250, but "bs-Cyrl-BA"
uses the Cyrillic codepage 1251. On the other hand, "en-Cyrl-US" still uses the
Latin codepage 1252.
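For reference, the shape of these names can be sketched with a small splitter. This is a simplified illustration, not a full RFC 4646 parser: it only recognizes a language, an optional 4-letter script, and an optional region, and the function name split_language_tag is made up for this example.

```python
def split_language_tag(tag):
    """Split a tag such as 'bs-Cyrl-BA' into (language, script, region).

    Simplified sketch: other subtags (variants, extensions) are ignored.
    """
    parts = tag.split('-')
    language, script, region = parts[0], None, None
    for part in parts[1:]:
        if script is None and region is None and len(part) == 4 and part.isalpha():
            # A 4-letter subtag directly after the language is a script code.
            script = part
        elif region is None and (
            (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit())
        ):
            # A 2-letter or 3-digit subtag is a region code.
            region = part
    return language, script, region


print(split_language_tag('bs-Cyrl-BA'))  # ('bs', 'Cyrl', 'BA')
print(split_language_tag('nb-NO'))       # ('nb', None, 'NO')
print(split_language_tag('nb'))          # ('nb', None, None)
```

Note that there is no slot for a codeset: the codepage follows from whatever NLS definition Windows has for the language-script combination.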
As a separate issue, language-tag locales break the parsing in locale.getlocale:
    >>> locale.setlocale(locale.LC_TIME, 'nb-NO')
    'nb-NO'
    >>> try: locale.getlocale(locale.LC_TIME)
    ... except Exception as e: print(e)
    ...
    unknown locale: nb-NO

    >>> locale.setlocale(locale.LC_CTYPE, 'bs-Cyrl-BA')
    'bs-Cyrl-BA'
    >>> try: locale.getlocale(locale.LC_CTYPE)
    ... except Exception as e: print(e)
    ...
    unknown locale: bs-Cyrl-BA
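Until getlocale handles these names, one way to read back the current setting is to query setlocale directly, since calling it without a locale argument returns the raw string and does no parsing. A minimal sketch follows; the wrapper name current_locale_name is an assumption for this example.

```python
import locale


def current_locale_name(category):
    """Return the raw locale string for `category` without parsing it.

    Hypothetical wrapper: setlocale with no locale argument only queries
    the current setting and does not change it.
    """
    return locale.setlocale(category)


# Example using the portable "C" locale, which every platform supports.
locale.setlocale(locale.LC_TIME, 'C')
print(current_locale_name(locale.LC_TIME))
```

The result is an opaque string rather than a (language, encoding) tuple, so code that needs the encoding has to get it some other way.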
----------
resolution: -> third party
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue26024>
_______________________________________