Eryk Sun added the comment:

There is no right encoding as far as I can see. 

If cmd.exe is attached to a console (i.e. conhost.exe), then it uses the
console's output codepage when writing to a pipe or file, which is the scenario
that your patch attempts to address. But if you pass
creationflags=CREATE_NO_WINDOW, then the new console (created without a window)
uses the OEM codepage, CP_OEMCP. And if you pass creationflags=DETACHED_PROCESS
(i.e. no console), cmd uses the ANSI codepage, CP_ACP. There's also a "/u"
option to force cmd to use the native Unicode encoding on Windows, UTF-16LE.
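
To illustrate why guessing matters, here is a minimal sketch that decodes the
same byte with two of these codepages. The concrete codepages (cp850 for OEM,
cp1252 for ANSI) are assumptions for a Western European system; the real
values depend on the system locale.

```python
# Assumed codepages for a Western European Windows setup; other locales differ.
OEM = "cp850"     # stands in for CP_OEMCP
ANSI = "cp1252"   # stands in for CP_ACP

text = "\u00e9"                    # LATIN SMALL LETTER E WITH ACUTE
raw = text.encode(OEM)             # bytes a program writing in OEM would emit

as_oem = raw.decode(OEM)           # right guess: round-trips
as_ansi = raw.decode(ANSI)         # wrong guess: silent mojibake, no exception
utf16 = text.encode("utf-16-le")   # byte layout that "/u"-style output uses

print(repr(raw), repr(as_oem), repr(as_ansi), repr(utf16))
```

The wrong guess doesn't raise an error, so the mismatch goes unnoticed unless
the text is inspected.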

Note that the above only considers cmd.exe. Its child processes can write 
output using any encoding. You may end up with several different encodings 
present in the same stream. Many, if not most, programs don't use the console's 
current codepage when writing to a pipe or file. Commonly they default to OEM, 
ANSI, UTF-8, or UTF-16LE. For example, Windows Python uses ANSI for standard 
I/O that's not a console, unless you set PYTHONIOENCODING. 
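
As a rough sketch of the PYTHONIOENCODING case, the following runs a child
Python with the variable set to UTF-8 and decodes the pipe output accordingly.
The child script and variable names are only illustrative.

```python
import os
import subprocess
import sys

# The child source uses an escape so the command line itself stays ASCII.
child_code = 'print("\\u00e9")'

# Force the child's standard I/O encoding instead of relying on its default.
env = dict(os.environ, PYTHONIOENCODING="utf-8")

proc = subprocess.run(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    env=env,
)

# The parent knows the encoding only because it chose it.
decoded = proc.stdout.decode("utf-8").strip()
print(repr(decoded))
```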

Even if a called program cares about the console output codepage, your patch 
doesn't implement this robustly. It uses sys.stdout and sys.stderr, but those 
can be reassigned. Even sys.__stdout__ and sys.__stderr__ may be irrelevant. 
Python could be run via pythonw.exe, for which the latter are None (unless it's
started with non-NULL standard handles). Or python.exe could be run with
standard I/O redirected to pipes or files, defaulting to ANSI. Also, the 
current program or called program could change the console encoding via 
chcp.com, which is just an indirect way of calling the WinAPI functions 
SetConsoleCP and SetConsoleOutputCP. 
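
For reference, the console codepages can be queried directly with ctypes
rather than inferred from sys.stdout. This is only a sketch: the helper name
console_codepages is made up here, and the result is a snapshot that either
process can invalidate afterwards by changing the codepage.

```python
import ctypes
import sys


def console_codepages():
    """Return (input_cp, output_cp), or None off Windows or without a console."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    input_cp = kernel32.GetConsoleCP()
    output_cp = kernel32.GetConsoleOutputCP()
    # Both functions return 0 on failure, e.g. when no console is attached.
    if not input_cp or not output_cp:
        return None
    return input_cp, output_cp


result = console_codepages()
print(result)
```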

There's no common default encoding for standard I/O on Windows, especially not
a common UTF encoding, so universal_newlines=True, getoutput, and
getstatusoutput may be of limited use. Preferably, the calling program sets an
option such as cmd's "/u" or Python's PYTHONIOENCODING to force a Unicode
encoding, and then decodes the output itself by wrapping stdout/stderr in
instances of io.TextIOWrapper. It would help if subprocess.Popen had parameters
for encoding and errors.
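
A minimal sketch of that manual approach, assuming a Python child process
whose encoding is forced with PYTHONIOENCODING (the child script is just a
placeholder):

```python
import io
import os
import subprocess
import sys

child_code = 'print("caf\\u00e9")'
env = dict(os.environ, PYTHONIOENCODING="utf-8")

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    env=env,
)

# Wrap the binary pipe with an explicit encoding and error handler.
# The default newline=None also translates "\r\n" to "\n" when reading.
reader = io.TextIOWrapper(proc.stdout, encoding="utf-8", errors="replace")
output = reader.read()
reader.close()   # also closes the underlying pipe
proc.wait()

print(repr(output))
```

With errors="replace", undecodable bytes become U+FFFD instead of raising,
which may be preferable when the child's encoding can't be fully controlled.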

----------
nosy: +eryksun

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27179>
_______________________________________