Chris Angelico writes:

 > Isn't that what file objects have attributes for?

You're absolutely right.  Not sure what I was thinking.  (Note: not an
excuse for my brain bubble, but Path.read_text and Path.read_bytes do
have this problem because they return str and bytes respectively.)
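To illustrate the distinction, here is a minimal sketch (the helper
name read_with_encoding is made up for this example): a text file
object reports the encoding it was opened with, while the str returned
by Path.read_text carries no such information.

```python
import os
import tempfile
from pathlib import Path


def read_with_encoding(path, encoding=None):
    """Return (text, encoding) using a file object, which records its encoding.

    Hypothetical helper for illustration only.
    """
    with open(path, encoding=encoding) as f:
        return f.read(), f.encoding


# Create a small scratch file to read back.
fd, name = tempfile.mkstemp(suffix=".txt")
os.close(fd)
path = Path(name)
path.write_text("hello", encoding="utf-8")

# The file object knows which encoding was used ...
text, used = read_with_encoding(path, encoding="utf-8")

# ... but the plain str from Path.read_text has no encoding attribute.
plain = path.read_text(encoding="utf-8")
str_has_encoding = hasattr(plain, "encoding")

path.unlink()
```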

 > Do you get files that lack the BOM?

As I wrote earlier, I don't get UTF-16 text files at all.  You'll have
to ask somebody else.  I'm just pointing out that if BOM-less UTF-16
files exist, then for some languages it's pretty likely that some of
those files can't be distinguished from ASCII without a (fragile)
statistical analysis of byte frequencies.

Do you actually face the problem of receiving data that should be
decoded one way but Python does something different by default?  Or
are you just tired of hearing about the problems of people who can't
"just assume UTF-8 and wish Python would, too"?

 > so IMO it's not unreasonable to assert that all files that don't
 > start either b"\xFF\xFE" or b"\xFE\xFF" should be decoded using the
 > ASCII-compatible detection method.
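For concreteness, the rule quoted above might be sketched as follows.
This is only an illustration, not a proposal: the function name
decode_guess and the UTF-8 fallback are assumptions of the example, and
the sketch ignores that b"\xFF\xFE\x00\x00" is also the UTF-32-LE BOM.

```python
import codecs


def decode_guess(data: bytes, fallback: str = "utf-8") -> str:
    """Decode as UTF-16 if a UTF-16 BOM is present, else use the fallback.

    Hypothetical helper; 'fallback' stands in for whatever
    ASCII-compatible detection method is used.
    """
    if data[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE):
        # The "utf-16" codec reads the BOM to pick the byte order
        # and strips it from the result.
        return data.decode("utf-16")
    return data.decode(fallback)


with_bom = "hi".encode("utf-16")        # BOM is prepended by the codec
without_bom = "hi".encode("utf-16-le")  # no BOM: b"h\x00i\x00"
```

Note that without_bom decodes "successfully" under the fallback, but
to a string with embedded NULs rather than "hi": the BOM-less case
fails silently rather than loudly.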

As I've said before, I think Naoki's suggestion is aimed at something
different: the user for whom getpreferredencoding normally DTRTs but who
has streams that they know are UTF-8 and wants a simple, obvious way to
read and write them.  That is the usual case in my experience.  As of
now, Guido and Naoki have agreed to document "encoding='utf-8'" and
drop 'open_text', so I think the discussion is moot, unless somebody
really wants to push autodetection of encodings.
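A minimal sketch of that usage, for a user who knows a particular
stream is UTF-8 regardless of the locale default (the helper name
roundtrip_utf8 and the scratch file are made up for the example):

```python
import locale
import os
import tempfile


def roundtrip_utf8(path, text):
    """Write text as UTF-8 and read it back, ignoring the locale default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, encoding="utf-8") as f:
        return f.read()


# What open() would use if no encoding were given; varies by platform.
default_encoding = locale.getpreferredencoding(False)

fd, scratch = tempfile.mkstemp(suffix=".txt")
os.close(fd)
result = roundtrip_utf8(scratch, "日本語")
os.remove(scratch)
```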

If somebody has a different experience, I'd like to hear about it.
But note that my experience (and Naoki's) is special: in Japan we
encounter at least three different encodings of Japanese daily in
plain text (ISO-2022-JP in mail, UTF-8 and Shift-JIS in local files).
So if anybody is likely to experience the need, I believe we are.

Steve

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FIHZYB3W5ZXYFMOQSNPYB3SAE7DHD44I/
Code of Conduct: http://python.org/psf/codeofconduct/
