Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-13 Thread Paul Moore
On 5 September 2016 at 21:19, Paul Moore wrote: > > The code I'm looking at doesn't use the raw stream (I think). The > problem I had (and the reason I was concerned) is that the code does > some rather messy things, and without tracing back through the full > code path, I'm not 100% sure *what* l

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-07 Thread Guido van Rossum
Congrats Steve! I'm provisionally accepting PEP 528. You can mark it as provisionally accepted in the PEP, preferably with a link to the mail.python.org archival copy of this message. Good luck with the implementation. -- --Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-06 Thread Random832
On Tue, Sep 6, 2016, at 06:34, Martin Panter wrote: > Yes, that was basically it. Though I had only thought as far as simple > encodings like ASCII, where one byte corresponds to one character. I > wonder if you really need UTF-8 support. Are the encoding values > currently encountered for Windows

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-06 Thread Martin Panter
On 5 September 2016 at 21:40, eryk sun wrote: > On Mon, Sep 5, 2016 at 7:54 PM, Steve Dower wrote: >> On 05Sep2016 1234, eryk sun wrote: >>> It would probably be simpler to use UTF-16 in the main pipeline and >>> implement Martin's suggestion to mix in a UTF-8 buffer. The UTF-16 >>> buffer could

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread eryk sun
On Mon, Sep 5, 2016 at 9:45 PM, Steve Dower wrote: > > So it works, though the behaviour is a little strange when you do it from > the interactive prompt: > sys.stdin.buffer.raw.read(1) > ɒprint('hi') > b'\xc9' hi sys.stdin.buffer.raw.read(1) > b'\x92' > > What happens here is
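
The two results in the quoted session match the UTF-8 encoding of the character typed at the prompt. A minimal sketch of that arithmetic, with no console involved; the variable names are illustrative only:

```python
# U+0252 is the character typed before print('hi') in the quoted session.
ch = "\u0252"

# UTF-8 needs two bytes for this code point.
encoded = ch.encode("utf-8")

# Two successive one-byte reads would therefore see these values in turn.
first, second = encoded[:1], encoded[1:2]
print(encoded, first, second)
```

Neither byte is a complete character on its own, which is why a one-byte raw read can leave the rest of the line for the interpreter to pick up.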

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread Steve Dower
On 05Sep2016 1308, Paul Moore wrote: On 5 September 2016 at 20:30, Steve Dower wrote: The only case we can reasonably handle at the raw layer is "n / 4" is zero but n != 0, in which case we can read and cache up to 4 bytes (one wchar_t) and then return those in future calls. If we try to cache
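
The caching idea for very small reads can be sketched in pure Python. This is only an illustration of the approach described above, not the CPython implementation: `SmallReadCache` and the `read_char` callable are hypothetical names, and the sketch assumes the source hands back one whole character at a time.

```python
class SmallReadCache:
    """Hypothetical helper: serve reads smaller than one encoded character."""

    def __init__(self, read_char):
        # read_char is an assumed callable returning one str character,
        # or "" at end of input.
        self._read_char = read_char
        self._pending = b""

    def read(self, n):
        if n <= 0:
            return b""
        if not self._pending:
            ch = self._read_char()
            if not ch:
                return b""
            # One character encodes to at most 4 bytes of UTF-8.
            self._pending = ch.encode("utf-8")
        # Hand out at most n bytes and keep the remainder for later calls.
        out, self._pending = self._pending[:n], self._pending[n:]
        return out


chars = iter("\u0252\u20ac")
reader = SmallReadCache(lambda: next(chars, ""))
print(reader.read(1), reader.read(1), reader.read(2), reader.read(2))
```

The cache never holds more than the tail of a single character, which is the limit the message argues for.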

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread eryk sun
On Mon, Sep 5, 2016 at 7:54 PM, Steve Dower wrote: > On 05Sep2016 1234, eryk sun wrote: >> >> Also, the console is UCS-2, which can't be transcoded between UTF-16 >> and UTF-8. Supporting UCS-2 in the console would integrate nicely with >> the filesystem PEP. It makes it always possible to print >
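
The UCS-2 concern comes down to lone surrogates, which a strict UTF-8 codec refuses. A small sketch of the difference, using Python's standard `surrogatepass` error handler; whether the console classes should use it is the open question in the thread, not something this example settles:

```python
lone = "\ud800"  # an unpaired high surrogate, legal in UCS-2 data

try:
    lone.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# With 'surrogatepass' the code unit is encoded like any other code point.
passed = lone.encode("utf-8", "surrogatepass")
round_trip = passed.decode("utf-8", "surrogatepass")
print(strict_ok, passed, round_trip == lone)
```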

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread Paul Moore
On 5 September 2016 at 20:34, eryk sun wrote: > Paul, do you have example code that uses the 'raw' stream? Using the > buffer should behave as it always has -- at least in this regard. > sys.stdin.buffer requests a large block, such as 8 KB. But since the > console defaults to a cooked mode (i.e.

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread Paul Moore
On 5 September 2016 at 20:30, Steve Dower wrote: > The only case we can reasonably handle at the raw layer is "n / 4" is zero > but n != 0, in which case we can read and cache up to 4 bytes (one wchar_t) > and then return those in future calls. If we try to cache any more than that > we're substit

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread Steve Dower
On 05Sep2016 1234, eryk sun wrote: Also, the console is UCS-2, which can't be transcoded between UTF-16 and UTF-8. Supporting UCS-2 in the console would integrate nicely with the filesystem PEP. It makes it always possible to print os.listdir('.'), copy and paste, and read it back without data lo

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread eryk sun
I have some suggestions. With ReadConsoleW, CPython can use the pInputControl parameter to set a CtrlWakeup mask. This enables a Unix-style Ctrl+D for ending a read without having to press enter. For example: >>> CTRL_MASK = 1 << 4 >>> inctrl = (ctypes.c_ulong * 4)(16, 0, CTRL_MASK, 0)
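
The four-element array in the example above appears to correspond to the Windows `CONSOLE_READCONSOLE_CONTROL` structure. A hedged sketch of the same values with named fields follows; the field names are taken from my reading of the Windows documentation and should be checked there, and nothing here calls `ReadConsoleW`, so it runs on any platform:

```python
import ctypes


class CONSOLE_READCONSOLE_CONTROL(ctypes.Structure):
    # ULONG is 32 bits on Windows, so c_uint32 is used to keep the layout
    # identical when this sketch is run elsewhere.
    _fields_ = [
        ("nLength", ctypes.c_uint32),
        ("nInitialChars", ctypes.c_uint32),
        ("dwCtrlWakeupMask", ctypes.c_uint32),
        ("dwControlKeyState", ctypes.c_uint32),
    ]


# Ctrl+D is control code 4, so its bit in the wakeup mask is 1 << 4.
CTRL_D = ord("D") - ord("@")
CTRL_MASK = 1 << CTRL_D

inctrl = CONSOLE_READCONSOLE_CONTROL(
    nLength=ctypes.sizeof(CONSOLE_READCONSOLE_CONTROL),
    nInitialChars=0,
    dwCtrlWakeupMask=CTRL_MASK,
    dwControlKeyState=0,
)
print(inctrl.nLength, inctrl.dwCtrlWakeupMask)
```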

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread Steve Dower
On 05Sep2016 1110, Paul Moore wrote: On 5 September 2016 at 18:38, Steve Dower wrote: Can you provide an example of how I'd rewrite the code that I quoted previously to follow this advice? Note - this is not theoretical, I expect to have to provide a PR to fix exactly this code should this chan

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread Paul Moore
On 5 September 2016 at 18:38, Steve Dower wrote: >> Can you provide an example of how I'd rewrite the code that I quoted >> previously to follow this advice? Note - this is not theoretical, I >> expect to have to provide a PR to fix exactly this code should this >> change go in. At the moment I ca

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread Steve Dower
On 05Sep2016 0941, Paul Moore wrote: On 5 September 2016 at 14:36, Steve Dower wrote: The best fix is to use a buffered reader, which will read all the available bytes and then let you .read(1), even if it happens to be an incomplete character. But this is sys.stdin.buffer.raw, we're talking
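
The buffered-reader advice can be illustrated without a console. In this sketch an `io.BytesIO` object stands in for the raw stream (an assumption made so the example runs anywhere), and an incremental decoder shows how a caller can cope with `.read(1)` returning part of a character:

```python
import codecs
import io

# Stand-in for the raw console stream; the real object is platform specific.
raw = io.BytesIO("\u0252print('hi')\n".encode("utf-8"))
buffered = io.BufferedReader(raw)

decoder = codecs.getincrementaldecoder("utf-8")()

first = buffered.read(1)                    # only half of the first character
text = decoder.decode(first)                # "" because the character is incomplete
text += decoder.decode(buffered.read(1))    # now the character is complete
rest = buffered.readline()                  # remaining bytes stay in the buffer
print(first, text, rest)
```

The buffer holds whatever was read from the underlying stream, so single-byte reads do not lose the bytes that follow.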

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread Paul Moore
On 5 September 2016 at 14:36, Steve Dower wrote: > The best fix is to use a buffered reader, which will read all the available > bytes and then let you .read(1), even if it happens to be an incomplete > character. But this is sys.stdin.buffer.raw, we're talking about. People can't really layer an

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread Steve Dower
severely complicates things and the advice to use a buffered reader is good advice anyway. Top-posted from my Windows Phone -Original Message- From: "Paul Moore" Sent: 9/5/2016 3:23 To: "Martin Panter" Cc: "Python Dev" Subject: Re: [Python-Dev] P

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread Paul Moore
On 5 September 2016 at 10:37, Martin Panter wrote: > On 5 September 2016 at 09:10, Paul Moore wrote: >> On 5 September 2016 at 06:54, Steve Dower wrote: >>> +Using the raw object with small buffers >>> +--- >>> + >>> +Code that uses the raw IO object and attem

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread Martin Panter
On 5 September 2016 at 09:10, Paul Moore wrote: > On 5 September 2016 at 06:54, Steve Dower wrote: >> +Using the raw object with small buffers >> +--- >> + >> +Code that uses the raw IO object and attempts to read less than four >> characters >> +will now rece

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-05 Thread Paul Moore
On 5 September 2016 at 06:54, Steve Dower wrote: > +Using the raw object with small buffers > +--- > + > +Code that uses the raw IO object and attempts to read less than four > characters > +will now receive an error. Because it's possible that any single chara

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-04 Thread Steve Dower
I posted a minor update to PEP 528 at https://github.com/python/peps/blob/master/pep-0528.txt and a diff below. While there are likely to be technical and compatibility issues to resolve after the changes are applied, I don't believe they impact the decision to accept the change at the PEP-lev

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-03 Thread Martin Panter
On 1 September 2016 at 23:28, Random832 wrote: > On Thu, Sep 1, 2016, at 18:28, Steve Dower wrote: >> This is a raw (bytes) IO class that requires text to be passed encoded >> with utf-8, which will be decoded to utf-16-le and passed to the Windows >> APIs. >> Similarly, bytes read from the class

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-03 Thread Adam Bartoš
> > The use of an ASCII compatible encoding is required to maintain > compatibility with code that bypasses the TextIOWrapper and directly > writes ASCII bytes to the standard streams (for example, > [process_stdinreader.py] > ). >
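
The ASCII-compatibility requirement can be checked directly: pure ASCII bytes decode identically as ASCII and as UTF-8, whereas UTF-16-LE uses a different byte sequence for the same text. A short sketch with an arbitrary payload:

```python
payload = b"plain ASCII written straight to the stream\n"

# ASCII bytes are already valid UTF-8 and mean the same thing.
same_text = payload.decode("ascii") == payload.decode("utf-8")

# The same text in UTF-16-LE is a different, longer byte string.
utf16 = payload.decode("ascii").encode("utf-16-le")
print(same_text, len(payload), len(utf16))
```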

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-03 Thread Adam Bartoš
Steve Dower (steve.dower at python.org) on Thu Sep 1 18:28:53 EDT 2016 wrote I'm about to be offline for a few days, so I wanted to get my current > draft PEPs out so people can read and review. > > I don't believe there is a lot of change as a result of either PEP, but > the impact of what chang

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-03 Thread Adam Bartoš
Paul Moore (p.f.moore at gmail.com) on Fri Sep 2 05:23:04 EDT 2016 wrote: > On 2 September 2016 at 03:35, Steve Dower wrote: >> I'd need to test to be sure, but writing an incomplete code point should >> just truncate to before that poi

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-02 Thread Paul Moore
On 2 September 2016 at 03:35, Steve Dower wrote: > I'd need to test to be sure, but writing an incomplete code point should > just truncate to before that point. It may currently raise OSError if that > truncated to zero length, as I believe that's not currently distinguished > from an error. What
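
Truncating a write "to before that point" amounts to finding the longest prefix that ends on a code point boundary. The helper below is a hypothetical sketch of that idea, assuming the input is otherwise valid UTF-8; it is not taken from the PEP's implementation:

```python
def complete_utf8_prefix(data: bytes) -> bytes:
    """Return the longest prefix of data that ends on a code point boundary."""
    # A UTF-8 sequence is at most 4 bytes, so look back no further than that.
    for back in range(1, min(4, len(data)) + 1):
        byte = data[-back]
        if byte & 0xC0 == 0x80:
            continue  # continuation byte, keep looking for the lead byte
        if byte >= 0xF0:
            need = 4
        elif byte >= 0xE0:
            need = 3
        elif byte >= 0xC0:
            need = 2
        else:
            need = 1
        # Keep everything if the last sequence is complete, else drop it.
        return data if back >= need else data[:-back]
    return data


print(complete_utf8_prefix(b"a\xc9"), complete_utf8_prefix(b"\xe2\x82"))
```

The second call returns an empty result, which is the zero-length case the message says may currently be reported as an error.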

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-01 Thread Steve Dower
buffer from a single read, as there is no simple mapping between length-as-utf8 and length-as-utf16 for an arbitrary string. Top-posted from my Windows Phone -Original Message- From: "Random832" Sent: 9/1/2016 16:31 To: "python-dev@python.org" Subject: Re: [Python
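
The lack of a simple length mapping is easy to see with a few sample characters. This sketch compares the UTF-8 byte count with the number of UTF-16 code units for each one; the sample set is arbitrary:

```python
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]

# (UTF-8 bytes, UTF-16 code units) for each sample character
lengths = [
    (len(s.encode("utf-8")), len(s.encode("utf-16-le")) // 2)
    for s in samples
]
print(lengths)
```

The ratio ranges from 1:1 to 3:1 and back to 2:1, so the size of one buffer cannot be derived from the size of the other without inspecting the text.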

Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-01 Thread Random832
On Thu, Sep 1, 2016, at 18:28, Steve Dower wrote: > This is a raw (bytes) IO class that requires text to be passed encoded > with utf-8, which will be decoded to utf-16-le and passed to the Windows APIs. > Similarly, bytes read from the class will be provided by the operating > system as utf-16-le
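
The transcoding step described in the quoted text can be sketched with the standard codecs. The function names are hypothetical, and the read direction is assumed to mirror the write direction described above:

```python
def utf8_to_utf16le(data: bytes) -> bytes:
    """Write path: UTF-8 bytes from Python become UTF-16-LE for the OS."""
    return data.decode("utf-8").encode("utf-16-le")


def utf16le_to_utf8(data: bytes) -> bytes:
    """Read path (assumed mirror image): UTF-16-LE from the OS becomes UTF-8."""
    return data.decode("utf-16-le").encode("utf-8")


wide = utf8_to_utf16le("h\u00e9".encode("utf-8"))
print(wide, utf16le_to_utf8(wide))
```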

[Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-01 Thread Steve Dower
I'm about to be offline for a few days, so I wanted to get my current draft PEPs out so people can read and review. I don't believe there is a lot of change as a result of either PEP, but the impact of what change there is needs to be weighed against the benefits. If anything, I'm likely to