New submission from STINNER Victor:

When LANG=C is used to get English messages (which is a mistake; LC_MESSAGES=C 
should be used instead) or when Python is started with an empty environment (no 
environment variables), Python gets the POSIX locale (aka the "C locale") for 
the LC_CTYPE (encoding) category.

Standard streams use the locale encoding, which is usually ASCII with the POSIX 
locale on most platforms (except on AIX: ISO 8859-1). In this case, data read 
from the OS (environment variables, command line arguments, filenames, etc.) 
may contain surrogate characters, because undecodable bytes are mapped to lone 
surrogates by the surrogateescape error handler that Python uses internally 
(see PEP 383 for the rationale).
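
To make the mechanism concrete, here is a minimal sketch of the PEP 383 round 
trip. The filename bytes are a made-up example (Latin-1 encoded), not data from 
this report.

```python
# Hypothetical filename bytes: "caf" + 0xE9 (Latin-1 e-acute) + ".txt".
raw = b"caf\xe9.txt"

# Decode as Python decodes OS data when the locale encoding is ASCII.
# The undecodable byte 0xE9 is mapped to the lone surrogate U+DCE9.
name = raw.decode("ascii", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
roundtrip = name.encode("ascii", errors="surrogateescape")
```

The decoded string is only safe to encode again with the same error handler, 
which is why a mismatch between the OS-data decoder and the stdout encoder 
matters.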

The problem is that standard output uses the strict error handler, so print() 
raises UnicodeEncodeError when asked to display OS data, such as filenames, 
that contains surrogate characters.
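
The failure can be reproduced without touching the real locale. The sketch 
below assumes an in-memory ASCII text stream as a stand-in for sys.stdout; the 
helper name and the filename are hypothetical.

```python
import io

# Hypothetical filename, decoded as under an ASCII locale (PEP 383).
name = b"caf\xe9.txt".decode("ascii", errors="surrogateescape")

def print_strict(text):
    """Print text to an in-memory ASCII stream that uses the strict
    error handler. Return the bytes written, or None if encoding failed."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", errors="strict",
                              newline="\n")
    try:
        print(text, file=stream)
        stream.flush()
    except UnicodeEncodeError:
        # The lone surrogate cannot be encoded by the strict handler.
        return None
    return raw.getvalue()

plain_result = print_strict("cafe.txt")   # pure ASCII name: succeeds
surrogate_result = print_strict(name)     # name with a surrogate: fails
```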

Example: an "ls" command written in Python:
---
import os
for name in sorted(os.listdir()): print(name)
---

Try it with "LANG=C python ls.py" in a directory containing filenames with 
non-ASCII characters and you will get a UnicodeEncodeError.

Issues #19846 and #19847 are examples of this annoyance.

I propose to also use the surrogateescape error handler for sys.stdout if the 
POSIX locale is used for LC_CTYPE at startup. The attached patch implements 
this idea.
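
For readers who want to try the behaviour without the patch, here is a rough 
user-level approximation. It is not the attached patch: it only rewraps an 
existing text stream, and the helper name is hypothetical.

```python
import io

def rewrap_with_surrogateescape(stream):
    """Return a new text stream over the same binary buffer as `stream`,
    using the surrogateescape error handler (hypothetical helper).

    The original stream object is left untouched and must stay alive,
    since both wrappers share one underlying buffer."""
    return io.TextIOWrapper(
        stream.buffer,
        encoding=stream.encoding,
        errors="surrogateescape",
        line_buffering=stream.line_buffering,
    )
```

A script could then call out = rewrap_with_surrogateescape(sys.stdout) and use 
print(name, file=out), so that escaped bytes are written back unchanged.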

With the patch, "LANG=C python ls.py" almost works: filenames and stdout then 
behave like byte streams, even though the Unicode string type is used.

----------
components: Unicode
files: c_locale_surrogateescape.patch
keywords: patch
messages: 206111
nosy: Sworddragon, a.badger, ezio.melotti, haypo, loewis, ncoghlan
priority: normal
severity: normal
status: open
title: Use "surrogateescape" error handler for sys.stdout on UNIX for the C 
locale
versions: Python 3.4
Added file: http://bugs.python.org/file33122/c_locale_surrogateescape.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19977>
_______________________________________