On Sun, Jan 24, 2021 at 9:53 AM <2qdxy4rzwzuui...@potatochowder.com> wrote:
> On 2021-01-25 at 00:29:41 +1100,
> Steven D'Aprano wrote:
>
> > On Sat, Jan 23, 2021 at 03:24:12PM +, Barry Scott wrote:
> > > First problem I see is that the file may be a pipe and then you will block
> > > until
My previous thread was hijacked by the "auto guessing" idea, so I am
splitting off this thread for pathlib.
Path.open() was added in Python 3.4. Path.read_text() and
Path.write_text() were added in Python 3.5.
Their history is shorter than that of built-in open(). Changing their default
encoding should be easier than bui
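For reference, a minimal sketch of how the three pathlib methods named above already accept an explicit encoding today. The helper name `roundtrip_utf8` and the file name are made up for illustration; nothing here reflects a changed default.

```python
import tempfile
from pathlib import Path


def roundtrip_utf8(text: str) -> str:
    """Write and read back text, always passing encoding= explicitly."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example.txt"  # hypothetical file name
        # Path.write_text() takes the same encoding= keyword as open().
        path.write_text(text, encoding="utf-8")
        # Path.open() forwards encoding= to the built-in open().
        with path.open(encoding="utf-8") as f:
            via_open = f.read()
        # Path.read_text() should agree with the Path.open() result.
        assert via_open == path.read_text(encoding="utf-8")
        return via_open
```

Passing `encoding="utf-8"` everywhere keeps the result independent of the system default encoding.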
On Sun, Jan 24, 2021, at 13:18, MRAB wrote:
> Well, if you see patterns like b'\x00H\x00e\x00l\x00l\x00o' then it's
> probably UTF16-BE and if you see patterns like
> b'H\x00e\x00l\x00l\x00o\x00' then it's probably UTF16-LE.
>
> You could also look for, say, sequences of Latin characters and
>
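The null-byte pattern described in the quoted message can be sketched roughly as follows. This is only an illustration of that heuristic, assuming mostly Latin-range text; the function name `guess_utf16_byte_order` is hypothetical and the check is far from a robust detector.

```python
from typing import Optional


def guess_utf16_byte_order(data: bytes) -> Optional[str]:
    """Guess UTF-16 byte order from where the zero bytes fall.

    Assumption: the text is mostly ASCII-range characters, so one byte of
    each 2-byte code unit is 0x00.
    """
    nulls_at_even = data[0::2].count(0)  # b'\x00H\x00e...' puts zeros here
    nulls_at_odd = data[1::2].count(0)   # b'H\x00e\x00...' puts zeros here
    if nulls_at_even > nulls_at_odd:
        return "utf-16-be"
    if nulls_at_odd > nulls_at_even:
        return "utf-16-le"
    return None  # no evidence either way
```

Text outside the Latin range, or very short input, gives this check little or nothing to work with.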
The main benefits of this proposal:
- the order of fields (those with defaults, those without) is
irrelevant
- don't need to pedantically add default = None for Optional values
On Sun, 2021-01-24 at 19:46 +, Paul Bryan via Python-ideas wrote:
> I've created a helper class in my own library th
On 1/24/21 1:18 PM, MRAB wrote:
> On 2021-01-24 17:04, Chris Angelico wrote:
>> On Mon, Jan 25, 2021 at 3:55 AM Stephen J. Turnbull
>> wrote:
>>>
>>> Chris Angelico writes:
>>> > Right, but as long as there's only one system encoding, that's not
>>> > our problem. If you're on a Greek system and
I've created a helper class in my own library that enhances the
existing dataclass:
a) __init__ accepts keyword-only arguments,
b) Optional[...] attribute without a specified default value would
default to None in __init__.
I think this could be useful in stdlib. I'm thinking a dataclass
decorato
On 2021-01-24 17:04, Chris Angelico wrote:
On Mon, Jan 25, 2021 at 3:55 AM Stephen J. Turnbull
wrote:
Chris Angelico writes:
> Right, but as long as there's only one system encoding, that's not
> our problem. If you're on a Greek system and you want to decode
> ISO-8859-9 text, you have to
On Mon, Jan 25, 2021 at 3:55 AM Stephen J. Turnbull
wrote:
>
> Chris Angelico writes:
> > Right, but as long as there's only one system encoding, that's not
> > our problem. If you're on a Greek system and you want to decode
> > ISO-8859-9 text, you have to state that explicitly. For the
> > s
Chris Angelico writes:
> Can anyone give an example of a current system encoding (ie one that
> is likely to be the default currently used by open()) that can have
> byte values below 128 which do NOT mean what they would mean in ASCII?
> In other words, is it possible to read in a section of
On 1/24/21 6:00 AM, Chris Angelico wrote:
> Sorry, let me clarify.
>
> Can anyone give an example of a current system encoding (ie one that
> is likely to be the default currently used by open()) that can have
> byte values below 128 which do NOT mean what they would mean in ASCII?
> In other words
On 2021-01-25 at 00:29:41 +1100,
Steven D'Aprano wrote:
> On Sat, Jan 23, 2021 at 03:24:12PM +, Barry Scott wrote:
>
> > I think that you are going to create a bug magnet if you attempt to auto
> > detect the encoding.
> >
> > First problem I see is that the file may be a pipe and then you
On Mon, Jan 25, 2021 at 12:33 AM Steven D'Aprano wrote:
>
> On Sat, Jan 23, 2021 at 03:24:12PM +, Barry Scott wrote:
>
> > I think that you are going to create a bug magnet if you attempt to auto
> > detect the encoding.
> >
> > First problem I see is that the file may be a pipe and then you w
On Sat, Jan 23, 2021 at 03:24:12PM +, Barry Scott wrote:
> I think that you are going to create a bug magnet if you attempt to auto
> detect the encoding.
>
> First problem I see is that the file may be a pipe and then you will block
> until you have enough data to do the auto detect.
Can yo
On Sat, Jan 23, 2021, at 22:43, Matt Wozniski wrote:
> 1. Deprecate calling `open` for text mode (the default) unless an
> `encoding=` is specified,
I have a suggestion, if this is going to be done:
If the third positional argument to open is a string, accept it as encoding
instead of buffering
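The suggestion above can be illustrated with a thin wrapper around the built-in open(). This is only a sketch of the idea; `open_compat` and `demo` are hypothetical names, and the built-in open() does not behave this way.

```python
import os
import tempfile


def open_compat(file, mode="r", buffering=-1, encoding=None, **kwargs):
    """Hypothetical wrapper: a str in the third position means encoding."""
    if isinstance(buffering, str):
        if encoding is not None:
            raise TypeError("encoding given both positionally and by keyword")
        # Reinterpret the string as the encoding; restore default buffering.
        buffering, encoding = -1, buffering
    return open(file, mode, buffering, encoding, **kwargs)


def demo() -> str:
    """Write and read a file, passing the encoding as the third argument."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")  # hypothetical file name
        with open_compat(path, "w", "utf-8") as f:
            f.write("ソ")
        with open_compat(path, "r", "utf-8") as f:
            return f.read()
```

An integer in the third position still reaches open() as `buffering`, so existing calls keep their meaning.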
On Sun, Jan 24, 2021 at 10:00:47PM +1100, Chris Angelico wrote:
> On Sun, Jan 24, 2021 at 9:13 PM Stephen J. Turnbull
> wrote:
> >
> > Chris Angelico writes:
> >
> > > Can anyone give an example of a current in-use system encoding that
> > > would have [ASCII bytes in non-ASCII text]?
> >
> > Sh
On Sun, Jan 24, 2021 at 9:13 PM Stephen J. Turnbull
wrote:
>
> Chris Angelico writes:
>
> > Can anyone give an example of a current in-use system encoding that
> > would have [ASCII bytes in non-ASCII text]?
>
> Shift JIS, Big5. (Both can have bytes < 128 inside multibyte
> characters.) I don'
Chris Angelico writes:
> Can anyone give an example of a current in-use system encoding that
> would have [ASCII bytes in non-ASCII text]?
Shift JIS, Big5. (Both can have bytes < 128 inside multibyte
characters.) I don't know if Big5 is still in use as the default
encoding anywhere, but Shift
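The Shift JIS claim can be checked with Python's own codec. A small sketch; the helper name `chars_with_ascii_range_bytes` is made up for illustration.

```python
def chars_with_ascii_range_bytes(text: str, encoding: str):
    """Return (char, encoded) pairs where a multibyte character contains a
    byte below 0x80, i.e. a byte that would be read as ASCII on its own."""
    hits = []
    for ch in text:
        encoded = ch.encode(encoding)
        if len(encoded) > 1 and any(b < 0x80 for b in encoded):
            hits.append((ch, encoded))
    return hits


# In Shift JIS both characters end in 0x5C, the ASCII code for a backslash.
sjis_hits = chars_with_ascii_range_bytes("ソ表a", "shift_jis")
# UTF-8 multibyte sequences use only bytes >= 0x80, so nothing is reported.
utf8_hits = chars_with_ascii_range_bytes("ソ表a", "utf-8")
```

Here `sjis_hits` lists both Japanese characters and `utf8_hits` is empty.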
Matt Wozniski writes:
> Rather than introducing a new `open_utf8` function, I'd suggest the
> following:
>
> 1. Deprecate calling `open` for text mode (the default) unless an
> `encoding=` is specified,
For that, we should have a sentinel for "system default encoding" (as
you acknowledge, b
Cameron Simpson writes:
> I thought I'd seen [UTF-16 BOM] on Windows text files within the
> last year or so (I don't use Windows often, so this is happenstance
> from receiving some data, not an observation of the Windows
> ecosystem; my recollection is that it was a UTF16 CSV file.)
OK; my
Guido van Rossum writes:
> I have definitely seen BOMs written by Notepad on Windows 10.
I'm not clear on what circumstances we care if a UTF-8 file has or
doesn't have a UTF-8 signature. Most software doesn't care, it just
reads it and spits it back out if it's there and hasn't been edited
out
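For context, a short sketch of how Python's codecs treat the UTF-8 signature. It shows only the standard `utf-8` and `utf-8-sig` codecs and says nothing about what Notepad or other tools do.

```python
import codecs

# Bytes as a signature-writing editor might save them (assumed sample data).
data = codecs.BOM_UTF8 + "hello".encode("utf-8")

# The plain utf-8 codec keeps the signature as a leading U+FEFF character.
as_utf8 = data.decode("utf-8")

# utf-8-sig strips the signature if present and accepts files without one.
as_sig = data.decode("utf-8-sig")
without_bom = "hello".encode("utf-8").decode("utf-8-sig")

# When encoding, utf-8-sig writes the signature.
written = "hello".encode("utf-8-sig")
```

Reading with `utf-8-sig` therefore handles both kinds of file, while writing with it always adds the signature.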