Guido van Rossum writes:

 > I have definitely seen BOMs written by Notepad on Windows 10.

I'm not clear on the circumstances in which we care whether a UTF-8
file has a UTF-8 signature.  Most software doesn't care: it just reads
the signature and spits it back out if it's there and hasn't been
edited out.
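
To make "reads it and spits it back out" concrete, here is a minimal
sketch using only the standard codecs.  The sample string is made up
for illustration.

```python
import codecs

text = "héllo"
with_sig = codecs.BOM_UTF8 + text.encode("utf-8")

# Plain 'utf-8' keeps the signature as a leading U+FEFF, so a
# read-then-write round trip reproduces the original bytes.
decoded = with_sig.decode("utf-8")
round_tripped = decoded.encode("utf-8")

# 'utf-8-sig' strips the signature on input if it is present, and
# accepts input that never had one.
stripped = with_sig.decode("utf-8-sig")
```

Only code that inspects the first character of the text has to notice
the difference.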

If people are seeing UTF-16 BOMs, that may be worth detecting,
depending on how often and how much trouble it is to deal with them.
I'm just saying that I never see them.  I was pretty careful about
saying that my sample is quite restricted.
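
If detecting them did turn out to be worthwhile, it need not be much
trouble.  A rough sketch, assuming we only care about the UTF-8 and
UTF-16 signatures (sniff_bom is a hypothetical helper name, not an
existing API, and UTF-32 is deliberately ignored):

```python
import codecs

# (BOM bytes, codec that consumes that BOM when decoding)
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def sniff_bom(head: bytes):
    """Return a codec name if head starts with a known BOM, else None."""
    for bom, codec in _BOMS:
        if head.startswith(bom):
            return codec
    return None
```

A None result means "no signature seen", which says nothing about what
the encoding actually is.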

However ...

 > Why can’t the future be that open() in text mode guesses the
 > encoding?

The medium-term future is UTF-8 in all UIs and public APIs, except for
archivists.  I think we all agree on that.

There are two issues with encoding guessing.  The statistically
unimportant one (at least for UTFs) is that guessing is guessing: it
will sometimes get it wrong, and the people who want guessing are
mostly the people who will be hurt most by wrong guesses.

Second, and a real issue for design AFAICS: if you introduce detection
of other encodings to 'open', the programmer may need to (1) discover
the detected encoding in order to match it on output ('open' does not
return it), or (2) choose the correct encoding on output, which may or
may not be the detected one, depending on what the next software in
the pipeline expects.  At that point "in the face of ambiguity" really
does bind, "although practicality" notwithstanding.  I'm not sure that
putting detection into 'open' solves any problems; it just pushes them
into other parts of the code.
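
A sketch of what that pushing looks like, assuming detection lives in
a helper instead of in 'open'.  Both function names are hypothetical,
and the BOM check merely stands in for whatever detection is proposed:

```python
import codecs

def read_text_guessing(path, fallback="utf-8"):
    """Return (text, codec) so the caller can see what was guessed."""
    with open(path, "rb") as f:
        raw = f.read()
    if raw.startswith(codecs.BOM_UTF8):
        codec = "utf-8-sig"
    elif raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        codec = "utf-16"
    else:
        codec = fallback  # no signature: this is an assumption, not a fact
    return raw.decode(codec), codec

def write_text(path, text, codec):
    # The caller has to decide here: reuse the guessed codec, or pick
    # whatever the next program in the pipeline expects.
    with open(path, "w", encoding=codec, newline="") as f:
        f.write(text)
```

Even in this toy version the caller has to carry the codec from the
input side to the output side, and writing with "utf-16" uses the
platform's byte order, which need not match the input file's.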

Remark: As I understand it, Naoki's proposal is about the casual coder
in a monolingual environment, where either defaulting to
getpreferredencoding does the right thing or they need UTF-8 because
some engineer decided "UTF-8 is the future, and in my project the
future is now!"  I don't think it's intended to be more general than
that, but you'll have to ask him about that.
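
The two cases in that remark, as a sketch.  The prefer_utf8 switch is
hypothetical and exists only for this illustration; it is not a
parameter of 'open' or part of any proposal text I'm quoting:

```python
import locale

def default_text_encoding(prefer_utf8=False):
    """Pick the codec for a casual coder who never passes encoding=."""
    if prefer_utf8:
        # "In my project the future is now."
        return "utf-8"
    # Otherwise trust the locale, as a monolingual environment usually can.
    return locale.getpreferredencoding(False)
```

The result of the second branch depends on the machine it runs on,
which is exactly why it only works in a monolingual environment.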

Steve
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZRUF34M5QWQKCDCMEMJOAIIONISCMZIJ/
Code of Conduct: http://python.org/psf/codeofconduct/
