On 13Aug2016 0523, Random832 wrote:
> On Sat, Aug 13, 2016, at 04:12, Stephen J. Turnbull wrote:
>> Steve Dower writes:
>>> ISTM that changing sys.getfilesystemencoding() on Windows to
>>> "utf-8" and updating path_converter() (Python/posixmodule.c;
>> I think this proposal requires the assumption that strings intended to
>> be interpreted as file names invariably come from the Windows APIs. I
>> don't think that is true: Makefiles and similar, configuration files,
>> all typically contain filenames. Zipfiles (see below).
> And what's going to happen if you shovel those bytes into the
> filesystem without conversion on Linux, or worse, OSX? This problem
> isn't unique to Windows.
Yeah, this is basically my view too. If your path bytes don't come from
the filesystem, you need to know the encoding regardless. But it's very
reasonable to be able to round-trip. Currently, the following two lines
of code can have different behaviour on Windows (i.e. the latter fails
to open the file):
>>> open(os.listdir('.')[-1])
>>> open(os.listdir(b'.')[-1])
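As a rough illustration of why the bytes form can fail, here is a sketch that pushes a name through a code page and back. It assumes cp1252 as the active code page and uses errors="replace" as a stand-in for the lossy substitution on the bytes path; round_trips() and the sample name are made up for this example, not anything in CPython:

```python
def round_trips(name: str, encoding: str) -> bool:
    """Return True if name survives an encode/decode cycle through encoding."""
    # errors="replace" is only an approximation of what the bytes APIs do
    # with characters the code page cannot represent (they become '?').
    encoded = name.encode(encoding, errors="replace")
    return encoded.decode(encoding, errors="replace") == name


# Hypothetical file name mixing Latin-1 and Japanese characters.
name = "r\u00e9sum\u00e9_\u65e5\u672c\u8a9e.txt"

print(round_trips(name, "cp1252"))  # Japanese characters are lost
print(round_trips(name, "utf-8"))   # every code point is representable
```

Once a character has been replaced, the bytes no longer name the file that is actually on disk, so open() has nothing to find.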
On Windows, the filesystem encoding is inherently Unicode, which means
you can't reliably round-trip filenames through the current code page.
Changing all of Python to use the Unicode APIs internally and making the
bytes encoding utf-8 (or utf-16-le, which would save a conversion)
resolves this and doesn't really affect
>> These just aren't under OS control, so the assumption will
>> fail.
>> So I believe bytes-oriented software must expect non-UTF-8 file names
>> in Japan.
Even on Japanese Windows, non-UTF-8 file names must be encodable with
UTF-16 or they cannot exist on the file system. This moves the encoding
boundary into the application, which is where it needed to be anyway for
robust software - "Correct" path handling still requires decoding to
text, and if you know that your source is encoded with the active
code page then byte_path.decode('mbcs', 'surrogateescape') is still valid.
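
A minimal sketch of doing that decoding at the application boundary. The helper name decode_legacy_path and its encoding parameter are hypothetical; the 'mbcs' codec only exists on Windows, so the fallback to sys.getfilesystemencoding() elsewhere is my assumption to keep the example runnable:

```python
import sys
from typing import Optional


def decode_legacy_path(byte_path: bytes, encoding: Optional[str] = None) -> str:
    """Decode path bytes that came from outside the filesystem APIs."""
    if encoding is None:
        # 'mbcs' means "the active code page" and is Windows-only;
        # the non-Windows fallback is an assumption for this sketch.
        encoding = "mbcs" if sys.platform == "win32" else sys.getfilesystemencoding()
    # surrogateescape keeps undecodable bytes as lone surrogates, so the
    # original bytes can be recovered by encoding with the same handler.
    return byte_path.decode(encoding, "surrogateescape")


# Example: a name read from a config file known to be in cp932.
raw = "\u65e5\u672c\u8a9e.txt".encode("cp932")
text_path = decode_legacy_path(raw, "cp932")
```

The point is that the caller states the encoding it knows about, rather than relying on whatever the OS bytes API happens to assume.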
Cheers,
Steve
_______________________________________________
Python-ideas mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/