[issue37275] GetConsole(Output)CP is used even when stdin/stdout is redirected

2019-06-14 Thread Inada Naoki

Inada Naoki  added the comment:

On Sat, Jun 15, 2019 at 2:43 AM Eryk Sun  wrote:
>
> Eryk Sun  added the comment:
>
> > # Power Shell 6 (use cp65001 by default)
> > PS C:¥> python3 -c "print('おはよう')" > ps.txt
>
> PowerShell standard I/O redirection is different from any shell I've ever 
> used. In this case, it runs Python with StandardOutput set to a handle for a 
> pipe instead of a handle for the file. It decodes Python's output using 
> whatever encoding is configured for input and re-encodes it with whatever 
> encoding is configured for output.

I'm sorry, I mixed up my assumptions with what I had actually verified.  I
checked `os.device_encoding()` in cmd, but forgot to check it in PowerShell.
Everything I said about PowerShell was just an assumption, and it was wrong.
Thank you for clarifying.

I confirmed that writing UTF-8 to the pipe causes mojibake, because PowerShell
decodes it using cp932.

```
PS C:\Users\inada-n> python3 -Xutf8 -c "import os,sys;
print(os.device_encoding(1), sys.stdout.encoding, file=sys.stderr);
print('こんにちは')" >x
None utf-8
PS C:\Users\inada-n> type x
```
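
The same garbling can be reproduced without PowerShell. This is only a
minimal sketch, assuming the reader of the pipe decodes the bytes as cp932
as described above; the variable names are illustrative.

```python
# Sketch: Python writes UTF-8 bytes, but the reader decodes them as cp932.
text = "こんにちは"

# What python3 -Xutf8 would write to the pipe.
utf8_bytes = text.encode("utf-8")

# What a cp932 reader would see; errors="replace" keeps the sketch from
# raising if a byte sequence is not valid cp932.
garbled = utf8_bytes.decode("cp932", errors="replace")

# The decoded text no longer matches what was written.
print(garbled != text)
```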

Hmm, how can I tell PowerShell that python3 is using
UTF-8 for stdout?
It seems cmd.exe with chcp 65001 and PYTHONUTF8=1 works better
than PowerShell when I want to use UTF-8 on Windows.

Anyway, nothing is wrong with Python.  I just didn't understand
PowerShell at all.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37275] GetConsole(Output)CP is used even when stdin/stdout is redirected

2019-06-14 Thread Inada Naoki


Change by Inada Naoki :


--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed




[issue37275] GetConsole(Output)CP is used even when stdin/stdout is redirected

2019-06-14 Thread Eryk Sun

Eryk Sun  added the comment:

> # Power Shell 6 (use cp65001 by default)
> PS C:¥> python3 -c "print('おはよう')" > ps.txt

PowerShell standard I/O redirection is different from any shell I've ever used. 
In this case, it runs Python with StandardOutput set to a handle for a pipe 
instead of a handle for the file. It decodes Python's output using whatever 
encoding is configured for input and re-encodes it with whatever encoding is 
configured for output. 

To see what Python is actually writing, try using Start-Process with 
StandardOutput redirected to the file. For example:

Start-Process -FilePath python3.exe -ArgumentList "-c `"print('おはよう')`"" -NoNewWindow -Wait -RedirectStandardOutput ps.txt

> # cmd.exe
> C:¥> chcp 65001
> C:¥> python3 -c "print('おはよう')" > cmd.txt

CMD uses simple redirection, like every other shell I've ever used. It runs 
python3.exe with a handle for the file as its StandardOutput. So "cmd.txt" 
contains exactly what Python writes.
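
Simple redirection can be imitated from Python itself by handing the child a
file handle as its stdout. This is only a sketch: the helper name
run_redirected is hypothetical, and PYTHONIOENCODING is set so the child's
encoding is known in advance.

```python
import os
import subprocess
import sys
import tempfile


def run_redirected(code, encoding):
    """Run a child Python with stdout bound directly to a file.

    Returns the raw bytes that ended up in the file.
    """
    env = dict(os.environ, PYTHONIOENCODING=encoding)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "out.txt")
        with open(path, "wb") as f:
            # The child writes straight to the file; nothing decodes or
            # re-encodes the bytes in between.
            subprocess.run([sys.executable, "-c", code],
                           stdout=f, env=env, check=True)
        with open(path, "rb") as f:
            return f.read()


# Unicode escapes keep the command line itself ASCII-only.
raw = run_redirected("print('\\u304a\\u306f\\u3088\\u3046')", "utf-8")
print(raw.rstrip(b"\r\n") == "おはよう".encode("utf-8"))
```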

> * TextIOWrapper tries `os.device_encoding(1)`
> * `os.device_encoding(1)` use GetConsoleOutputCP() without checking stdout is 
> console

No, _Py_device_encoding returns None if isatty(fd) is false, i.e. for a 
pipe or disk file. In this case, TextIOWrapper defaults to the encoding from 
locale.getpreferredencoding().
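
This is easy to check from Python. A minimal sketch, assuming only the
standard library: the child's stdout is a pipe, so isatty(1) is false in the
child.

```python
import subprocess
import sys

# The child reports what os.device_encoding(1) returns for its own stdout.
CHILD = "import os; print(os.device_encoding(1))"

result = subprocess.run([sys.executable, "-c", CHILD],
                        stdout=subprocess.PIPE, check=True)
reported = result.stdout.decode("ascii").strip()

# stdout is a pipe, not a console, so no device encoding is reported.
print(reported)  # None
```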

The current preferred encoding is the system ANSI codepage from GetACP(). 
Changing the default to UTF-8 would be disruptive. You can use UTF-8 mode (i.e. 
-X utf8). Or, to override just stdin, stdout, and stderr, set the environment 
variable "PYTHONIOENCODING=utf-8".
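
Both overrides can be compared with a small sketch. The helper name
child_stdout_encoding is hypothetical, and -X utf8 is assumed to be
available (Python 3.7 or later).

```python
import os
import subprocess
import sys

CHILD = "import sys; print(sys.stdout.encoding)"


def child_stdout_encoding(extra_args=(), extra_env=None):
    """Return sys.stdout.encoding as seen by a child whose stdout is a pipe."""
    env = dict(os.environ)
    # Start from a clean slate so only the override under test applies.
    env.pop("PYTHONIOENCODING", None)
    env.pop("PYTHONUTF8", None)
    env.update(extra_env or {})
    out = subprocess.run([sys.executable, *extra_args, "-c", CHILD],
                         stdout=subprocess.PIPE, env=env, check=True).stdout
    return out.decode("ascii").strip().lower()


with_flag = child_stdout_encoding(extra_args=["-X", "utf8"])
with_env = child_stdout_encoding(extra_env={"PYTHONIOENCODING": "utf-8"})
print(with_flag, with_env)
```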

> In the example above, a console is attached when python is called from 
> Power Shell 6, but it is not attached when python is called from 
> cmd.exe.

In both cases the Python process inherits the console of the parent shell. The 
only way to run python.exe without a console is to use the CreateProcess 
creation flag DETACHED_PROCESS.

> There is a relating issue: UTF-8 mode doesn't override 
> stdin,stdout,stderr encoding when console is attached.

It works for me. For example, testing with stdout redirected to a pipe:

C:\>python -c "import sys;print(sys.stdout.encoding)" | more
cp1252

C:\>python -X utf8 -c "import sys;print(sys.stdout.encoding)" | more
utf-8




[issue37275] GetConsole(Output)CP is used even when stdin/stdout is redirected

2019-06-14 Thread Steve Dower


Steve Dower  added the comment:

Isn't the point that device_encoding(FD) gets the encoding of the specified 
file? In this case stdout?

It seems odd that chcp doesn't actually update the console code page here, as 
that is its entire purpose. Perhaps TextIOWrapper is actually getting ACP 
rather than the console encoding through some other path?




[issue37275] GetConsole(Output)CP is used even when stdin/stdout is redirected

2019-06-14 Thread Karthikeyan Singaravelan


Change by Karthikeyan Singaravelan :


--
nosy: +eryksun




[issue37275] GetConsole(Output)CP is used even when stdin/stdout is redirected

2019-06-13 Thread Inada Naoki

New submission from Inada Naoki :

When stdout is redirected to a file and cp65001 is used, the stdout encoding 
is unexpected:

  # Power Shell 6 (use cp65001 by default)
  PS C:¥> python3 -c "print('おはよう')" > ps.txt

  # cmd.exe
  C:¥> chcp 65001
  C:¥> python3 -c "print('おはよう')" > cmd.txt

Now, ps.txt is encoded in UTF-8, but cmd.txt is encoded in cp932 (ACP).


This is because:

* TextIOWrapper tries `os.device_encoding(1)`
* `os.device_encoding(1)` uses GetConsoleOutputCP() without checking whether 
stdout is a console

In the example above, a console is attached when python is called from Power 
Shell 6, but it is not attached when python is called from cmd.exe.

I think using GetConsoleOutputCP() for a non-console handle is an abuse.

---

There is a related issue: UTF-8 mode doesn't override the stdin, stdout, and 
stderr encodings when a console is attached.

On Unix, os.device_encoding() uses locale encoding and UTF-8 mode overrides 
locale encoding.  Good.

But on Windows, os.device_encoding() uses GetConsole(Output)CP().  UTF-8 mode 
doesn't override it.

If we stop abusing GetConsoleOutputCP(), this issue is fixed automatically.
But if we keep using GetConsoleOutputCP() for stdout which is not a console, 
UTF-8 mode should override it.

--
components: Windows
messages: 345551
nosy: inada.naoki, paul.moore, steve.dower, tim.golden, vstinner, zach.ware
priority: normal
severity: normal
status: open
title: GetConsole(Output)CP is used even when stdin/stdout is redirected
type: behavior
versions: Python 3.8, Python 3.9
