On 13 March 2017 at 18:37, INADA Naoki <songofaca...@gmail.com> wrote:
> But locale coercing works nice on platforms like android. > So how about simplified version of PEP 538? Just adding configure > option for locale coercing > which is disabled by default. No envvar options and no warnings. > That doesn't solve my original Linux distro problem, where locale misconfiguration problems show up as "Python 2 works, Python 3 doesn't work" behaviour and bug reports. The problem is that where Python 2 was largely locale-independent by default (just passing raw bytes through) such that you'd only get immediate encoding or decoding errors if you had a Unicode literal or a decode() call somewhere in your code and would otherwise pass data corruption problems further down the chain, Python 3 is locale-*aware* by default, and eagerly decodes: - command line parameters - environment variables - responses from operating system API calls - standard stream input - file contents You *can* still write locale-independent Python 3 applications, but they involve sprinkling liberal doses of "b" prefixes and suffixes and mode settings and "surrogateescape" error handler declarations in various places - you can't just run python-modernize over a pre-existing Python 2 application and expect it to behave the same way in the C locale as it did before. Once implemented, PEP 540 will partially solve the problem by introducing a locale independent UTF-8 mode, but that still leaves the inconsistency with other locale-aware components that are needing to deal with Python 3 API calls that accept or return Unicode objects where Python 2 allowed the use of 8-bit strings. Folks that really want the old behaviour back will be able to set PYTHONCOERCECLOCALE=0 (as that no longer emits any warnings), or else build their own CPython from source using `--without-c-locale-coercion` and ``--without-c-locale-warning`. However, they'll also get the explicit support notification from PEP 11 that any Unicode handling bugs they run into in those configurations are entirely their own problem - we won't fix them, because we consider those configurations unsupportable in the general case. That puts the additional self-support burden on folks doing something unusual (i.e. insisting on running an ASCII-only environment in 2017), rather than on those with a more conventional use case (i.e. running an up to date \*nix OS using UTF-8 or another universal encoding for both local and remote interfaces). Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
_______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com