On 16Aug2016 1603, Victor Stinner wrote:
2016-08-16 17:56 GMT+02:00 Steve Dower <steve.do...@python.org>:
2. Windows file system encoding is *always* UTF-16. There's no "assuming
mbcs" or "assuming ACP" or "assuming UTF-8" or "asking the OS what encoding
it is". We know exactly what the encoding is on every supported version of
Windows. UTF-16.
I think that you missed an important issue (or "use case") which is
called the "Makefile problem" by Mercurial developers:
https://www.mercurial-scm.org/wiki/EncodingStrategy#The_.22makefile_problem.22
I already explained it before, but maybe you misunderstood or just
missed it, so here is a more concrete example.
I guess I misunderstood. The concrete example really helps, thank you.
The problem here is that there is an application boundary without a
defined encoding, right where you put the comment.
filenameb = os.listdir(b'.')[0]
# Python 3.5 encodes Unicode (UTF-16) to the ANSI code page
# what if Python 3.7 encodes Unicode (UTF-16) to UTF-8?
print("filename bytes: %a" % filenameb)
proc = subprocess.Popen(['py', '-2', script],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
stdout = proc.communicate(filenameb)[0]
print("File content: %a" % stdout)
If you are defining the encoding as 'mbcs', then you need to check that
sys.getfilesystemencoding() == 'mbcs', and re-encode if it is not.
Alternatively, since this script is the "new" code, you would use
`os.listdir('.')[0].encode('mbcs')`, given that you have explicitly
determined that mbcs is the encoding for the later transfer.
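
The two options above can be sketched roughly as follows. This is a minimal
sketch, not code from the thread: the helper names reencode_for_transfer and
encode_name_for_transfer are hypothetical, and the transfer encoding is passed
in as a parameter so that the agreed encoding ('mbcs' in this discussion, which
is only available on Windows) is stated explicitly rather than assumed.

```python
import os
import sys


def reencode_for_transfer(name_bytes, transfer_encoding):
    """Option 1: start from a bytes name, e.g. from os.listdir(b'.').

    Hypothetical helper. The bytes are assumed to be in the file system
    encoding, which is what the bytes variants of the os functions use.
    """
    fs_encoding = sys.getfilesystemencoding()
    if fs_encoding.lower() == transfer_encoding.lower():
        # Already in the agreed encoding: pass the bytes through unchanged.
        return name_bytes
    # Decode with the file system encoding, then encode explicitly.
    # encode() raises UnicodeEncodeError if the name cannot be represented.
    return os.fsdecode(name_bytes).encode(transfer_encoding)


def encode_name_for_transfer(name, transfer_encoding):
    """Option 2: start from a str name, e.g. from os.listdir('.').

    Hypothetical helper. The file system encoding never enters the picture;
    the only encoding involved is the one agreed for the transfer.
    """
    return name.encode(transfer_encoding)
```

On Windows, with 'mbcs' as the agreed encoding, the second helper does the same
thing as the os.listdir('.')[0].encode('mbcs') call above.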
Essentially, the problem is that this code is relying on a certain
non-guaranteed behaviour of a deprecated API, where using
sys.getfilesystemencoding() as documented would have prevented any issue
(see
https://docs.python.org/3/library/os.html#file-names-command-line-arguments-and-environment-variables).
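
As a rough illustration of the documented approach, os.fsdecode() and
os.fsencode() convert between bytes and str file names using whatever
sys.getfilesystemencoding() reports, so the code never hard-codes an encoding.
The helper names describe_name and roundtrips below are hypothetical.

```python
import os
import sys


def describe_name(name_bytes):
    """Return (text name, encoding name) for a bytes file name.

    os.fsdecode() applies the file system encoding and its error handler,
    so this keeps working if that encoding changes between Python versions.
    """
    text = os.fsdecode(name_bytes)
    return text, sys.getfilesystemencoding()


def roundtrips(name_bytes):
    """Check that decoding and re-encoding gives back the same bytes."""
    return os.fsencode(os.fsdecode(name_bytes)) == name_bytes
```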
In one of the emails I think you missed, I called this out as the only
case where code will break with a change to sys.getfilesystemencoding().
So yes, breaking existing code is something I would never do lightly.
However, I'm very much of the opinion that the only code that will break
is code that is already broken (or at least fragile), that nobody is
forced to take a major upgrade to Python, and that nobody should
necessarily expect 100% compatibility between major versions.
Cheers,
Steve
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org