[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Matt Wozniski
On Sun, Jan 24, 2021 at 9:53 AM <2qdxy4rzwzuui...@potatochowder.com> wrote: > On 2021-01-25 at 00:29:41 +1100, > Steven D'Aprano wrote: > > > On Sat, Jan 23, 2021 at 03:24:12PM +, Barry Scott wrote: > > > First problem I see is that the file may be a pipe and then you will > block > > > until

[Python-ideas] Changing the default text encoding of pathlib

2021-01-24 Thread Inada Naoki
My previous thread was hijacked by the "auto guessing" idea, so I am splitting off this thread for pathlib. Path.open() was added in Python 3.4. Path.read_text() and Path.write_text() were added in Python 3.5. Their history is shorter than that of the built-in open(). Changing its default encoding should be easier than bui
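
For readers skimming the archive, here is a minimal sketch of what passing an explicit encoding to the pathlib helpers looks like today. The file name and sample text are made up for illustration; `encoding=` is an existing keyword of `Path.write_text()` and `Path.read_text()`.

```python
import tempfile
from pathlib import Path

# Hypothetical file name and content, used only for this illustration.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "greeting.txt"

    # Stating the encoding explicitly avoids depending on the locale default.
    path.write_text("こんにちは", encoding="utf-8")
    text = path.read_text(encoding="utf-8")

    # The bytes on disk are plain UTF-8, whatever the system encoding is.
    raw = path.read_bytes()
```

Without `encoding=`, both calls fall back to the same locale-dependent default as the built-in `open()`, which is the behaviour this thread proposes to change.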

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Random832
On Sun, Jan 24, 2021, at 13:18, MRAB wrote: > Well, if you see patterns like b'\x00H\x00e\x00l\x00l\x00o' then it's > probably UTF16-BE and if you see patterns like > b'H\x00e\x00l\x00l\x00o\x00' then it's probably UTF16-LE. > > You could also look for, say, sequences of Latin characters and >
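
The NUL-byte pattern MRAB describes can be sketched roughly as follows. This is only an illustration of the heuristic, assuming mostly-Latin text; the function name and the "more than half the pairs" threshold are invented here and are not taken from the thread.

```python
from typing import Optional


def guess_utf16_byte_order(data: bytes) -> Optional[str]:
    """Guess the UTF-16 byte order from where NUL bytes fall (hypothetical helper)."""
    if len(data) < 2:
        return None
    even_nuls = data[0::2].count(0)  # NULs at offsets 0, 2, 4, ...
    odd_nuls = data[1::2].count(0)   # NULs at offsets 1, 3, 5, ...
    pairs = len(data) // 2
    # Latin text in UTF-16-BE puts the NUL high byte first in each pair.
    if even_nuls > pairs // 2 and odd_nuls == 0:
        return "utf-16-be"
    # In UTF-16-LE the NUL high byte comes second.
    if odd_nuls > pairs // 2 and even_nuls == 0:
        return "utf-16-le"
    return None  # No clear pattern; the heuristic declines to guess.


big_endian = guess_utf16_byte_order(b"\x00H\x00e\x00l\x00l\x00o")
little_endian = guess_utf16_byte_order(b"H\x00e\x00l\x00l\x00o\x00")
plain_ascii = guess_utf16_byte_order(b"Hello")
```

A guess like this only works for text dominated by characters below U+0100, which is one reason other posters in the thread are wary of auto-detection.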

[Python-ideas] Re: dataclass: __init__ kwargs and Optional[type]

2021-01-24 Thread Paul Bryan via Python-ideas
The main benefits of this proposal: - the order of fields (those with defaults, those without) is irrelevant - don't need to pedantically add default = None for Optional values On Sun, 2021-01-24 at 19:46 +, Paul Bryan via Python-ideas wrote: > I've created a helper class in my own library th

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Richard Damon
On 1/24/21 1:18 PM, MRAB wrote: > On 2021-01-24 17:04, Chris Angelico wrote: >> On Mon, Jan 25, 2021 at 3:55 AM Stephen J. Turnbull >> wrote: >>> >>> Chris Angelico writes: >>>  > Right, but as long as there's only one system encoding, that's not >>>  > our problem. If you're on a Greek system and

[Python-ideas] dataclass: __init__ kwargs and Optional[type]

2021-01-24 Thread Paul Bryan via Python-ideas
I've created a helper class in my own library that enhances the existing dataclass: a) __init__ accepts keyword-only arguments, b) Optional[...] attribute without a specified default value would default to None in __init__. I think this could be useful in stdlib. I'm thinking a dataclass decorato

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread MRAB
On 2021-01-24 17:04, Chris Angelico wrote: On Mon, Jan 25, 2021 at 3:55 AM Stephen J. Turnbull wrote: Chris Angelico writes: > Right, but as long as there's only one system encoding, that's not > our problem. If you're on a Greek system and you want to decode > ISO-8859-9 text, you have to

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Chris Angelico
On Mon, Jan 25, 2021 at 3:55 AM Stephen J. Turnbull wrote: > > Chris Angelico writes: > > Right, but as long as there's only one system encoding, that's not > > our problem. If you're on a Greek system and you want to decode > > ISO-8859-9 text, you have to state that explicitly. For the > > s

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Stephen J. Turnbull
Chris Angelico writes: > Can anyone give an example of a current system encoding (ie one that > is likely to be the default currently used by open()) that can have > byte values below 128 which do NOT mean what they would mean in ASCII? > In other words, is it possible to read in a section of

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Richard Damon
On 1/24/21 6:00 AM, Chris Angelico wrote: > Sorry, let me clarify. > > Can anyone give an example of a current system encoding (ie one that > is likely to be the default currently used by open()) that can have > byte values below 128 which do NOT mean what they would mean in ASCII? > In other words

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread 2QdxY4RzWzUUiLuE
On 2021-01-25 at 00:29:41 +1100, Steven D'Aprano wrote: > On Sat, Jan 23, 2021 at 03:24:12PM +, Barry Scott wrote: > > > I think that you are going to create a bug magnet if you attempt to auto > > detect the encoding. > > > > First problem I see is that the file may be a pipe and then you

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Chris Angelico
On Mon, Jan 25, 2021 at 12:33 AM Steven D'Aprano wrote: > > On Sat, Jan 23, 2021 at 03:24:12PM +, Barry Scott wrote: > > > I think that you are going to create a bug magnet if you attempt to auto > > detect the encoding. > > > > First problem I see is that the file may be a pipe and then you w

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Steven D'Aprano
On Sat, Jan 23, 2021 at 03:24:12PM +, Barry Scott wrote: > I think that you are going to create a bug magnet if you attempt to auto > detect the encoding. > > First problem I see is that the file may be a pipe and then you will block > until you have enough data to do the auto detect. Can yo

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Random832
On Sat, Jan 23, 2021, at 22:43, Matt Wozniski wrote: > 1. Deprecate calling `open` for text mode (the default) unless an > `encoding=` is specified, I have a suggestion, if this is going to be done: If the third positional argument to open is a string, accept it as encoding instead of buffering
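
As a rough illustration of this suggestion, a wrapper around the built-in `open()` could treat a string in the third positional slot as the encoding. The wrapper name `open_encoding_third` is hypothetical; this is a sketch of the idea, not a proposed patch to `open()` itself.

```python
import os
import tempfile


def open_encoding_third(file, mode="r", buffering=-1, encoding=None, **kwargs):
    """Hypothetical wrapper: a str in the buffering slot is taken as the encoding."""
    if isinstance(buffering, str):
        if encoding is not None:
            raise TypeError("encoding given both positionally and by keyword")
        buffering, encoding = -1, buffering  # restore the default buffering policy
    return open(file, mode, buffering, encoding=encoding, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    with open_encoding_third(path, "w", "utf-8") as f:
        f.write("naïve café")
    with open_encoding_third(path, "r", "utf-8") as f:
        round_trip = f.read()
    with open(path, "rb") as f:
        raw = f.read()
```

An integer in the third position still reaches `open()` as `buffering`, so existing calls keep their meaning.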

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Steven D'Aprano
On Sun, Jan 24, 2021 at 10:00:47PM +1100, Chris Angelico wrote: > On Sun, Jan 24, 2021 at 9:13 PM Stephen J. Turnbull > wrote: > > > > Chris Angelico writes: > > > > > Can anyone give an example of a current in-use system encoding that > > > would have [ASCII bytes in non-ASCII text]? > > > > Sh

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Chris Angelico
On Sun, Jan 24, 2021 at 9:13 PM Stephen J. Turnbull wrote: > > Chris Angelico writes: > > > Can anyone give an example of a current in-use system encoding that > > would have [ASCII bytes in non-ASCII text]? > > Shift JIS, Big5. (Both can have bytes < 128 inside multibyte > characters.) I don'

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Stephen J. Turnbull
Chris Angelico writes: > Can anyone give an example of a current in-use system encoding that > would have [ASCII bytes in non-ASCII text]? Shift JIS, Big5. (Both can have bytes < 128 inside multibyte characters.) I don't know if Big5 is still in use as the default encoding anywhere, but Shift
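
The Shift JIS case can be checked directly with Python's bundled codecs. In the sketch below the helper name is invented for illustration; the sample characters are well-known examples whose second byte is 0x5C, the ASCII backslash.

```python
def chars_with_ascii_range_bytes(text: str, encoding: str) -> dict:
    """Map each multibyte character to its bytes if any byte is below 0x80."""
    found = {}
    for ch in text:
        raw = ch.encode(encoding)
        if len(raw) > 1 and any(b < 0x80 for b in raw):
            found[ch] = raw
    return found


# Both kana/kanji below end in 0x5C in Shift JIS; "A" is a single byte and is skipped.
shift_jis_hits = chars_with_ascii_range_bytes("ソ表A", "shift_jis")

# UTF-8 never reuses bytes below 0x80 inside a multibyte sequence.
utf8_hits = chars_with_ascii_range_bytes("ソ表A", "utf-8")
```

Here `shift_jis_hits["ソ"]` is `b"\x83\\"`, so a byte-level search for a backslash would match in the middle of a character. The same helper can be pointed at `"big5"` to look for similar cases.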

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Stephen J. Turnbull
Matt Wozniski writes: > Rather than introducing a new `open_utf8` function, I'd suggest the > following: > > 1. Deprecate calling `open` for text mode (the default) unless an > `encoding=` is specified, For that, we should have a sentinel for "system default encoding" (as you acknowledge, b

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Stephen J. Turnbull
Cameron Simpson writes: > I thought I'd seen [UTF-16 BOM] on Windows text files within the > last year or so (I don't use Windows often, so this is happenstance > from receiving some data, not an observation of the Windows > ecosystem; my recollection is that it was a UTF16 CSV file.) OK; my

[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

2021-01-24 Thread Stephen J. Turnbull
Guido van Rossum writes: > I have definitely seen BOMs written by Notepad on Windows 10. I'm not clear on what circumstances we care if a UTF-8 file has or doesn't have a UTF-8 signature. Most software doesn't care, it just reads it and spits it back out if it's there and hasn't been edited out
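
For reference, this is how Python's existing codecs treat a UTF-8 signature. The sample bytes are made up; `codecs.BOM_UTF8` and the `utf-8-sig` codec are part of the standard library.

```python
import codecs

# Bytes as an editor such as Notepad might write them: signature, then the text.
with_bom = codecs.BOM_UTF8 + "hello".encode("utf-8")
without_bom = "hello".encode("utf-8")

# Plain utf-8 keeps the signature as a leading U+FEFF character.
kept = with_bom.decode("utf-8")

# utf-8-sig strips the signature if present and accepts input without one.
stripped = with_bom.decode("utf-8-sig")
unchanged = without_bom.decode("utf-8-sig")
```

So software that decodes with plain `utf-8` passes the signature through as an invisible first character, while `utf-8-sig` reads both variants the same way.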