Re: [Python-Dev] File system path encoding on Windows

Steve Dower Mon, 29 Aug 2016 20:32:19 -0700

On 29Aug2016 1810, Nick Coghlan wrote:

On 30 August 2016 at 08:38, Victor Stinner <[email protected]> wrote:

Hi,


tl;dr: just drop byte support and help developers to use Unicode in
their application!


My view (and Steve's) is that this approach is likely to result in
Linux-centric projects just dropping even nominal native Windows
support, rather than more Python software that handles Unicode on
Windows (/the CLR/the JVM) correctly.

Yeah, this basically sums it up. If I could be sure that the Python developers who are 99% Linux/1% Windows (i.e. run unit tests once and then release) weren't going to see dropping byte support completely as a hostile action, I'd much rather go that way.

But let's definitely take note that platform-specific deprecation warnings are probably not a good idea for cross-platform functionality.

What Steve is proposing here is essentially a way of providing more
*nix like CPython behaviour on Windows

Yep. What actually spurred me into action on this was a Twitter rant from one of Twisted's developers about paths on Windows. So I presume that Twisted is probably okay *now* (and hopefully because they explicitly decode from network traffic into str before accessing the file system...)

Using bytes has essentially always been using an arbitrarily-encoded str on Windows. The active code page is not an equivalent of "give me the path as raw bytes" as it is on POSIX, but my change will make it so that it is. There'll be a performance penalty, but otherwise using bytes for paths will become reliable.
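
A minimal sketch of why an implicitly chosen code page is lossy while UTF-8 is not. The file name and the helper `round_trips` are hypothetical, and the strict encode/decode used here only approximates what Windows does with the active code page (which may silently substitute characters rather than raise):

```python
# Hypothetical file name mixing Latin-1 and Cyrillic characters.
name = "café_файл.txt"


def round_trips(text, encoding):
    """Return True if text survives a strict encode/decode in encoding."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeError:
        # The code page has no representation for some character.
        return False


# Two legacy Windows code pages versus UTF-8.
for enc in ("cp1252", "cp1251", "utf-8"):
    print(enc, round_trips(name, enc))
```

Neither single-byte code page can represent every character in the name, so a bytes path produced under one of them cannot be relied on to name the same file; UTF-8 can represent any str that encodes without error.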

Unfortunately, any implicitly-encoded cross-version interoperability will have to be broken by such a change. There's just no way around it. But I've seen no evidence that it's common, and there are two workarounds available (set the environment variable, or change your code to specify the encoding used).
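
The second workaround (specifying the encoding in code) could look roughly like the sketch below. The helper name `to_text_path` and its default encoding are assumptions for illustration, not anything defined by CPython; the idea is simply to decode bytes with a named encoding at the boundary and hand str to the file system APIs:

```python
import os


def to_text_path(path, encoding="utf-8"):
    """Return path as str, decoding bytes with an explicitly named encoding.

    Hypothetical helper: the caller states which encoding the bytes were
    written in instead of relying on the platform's implicit choice.
    """
    if isinstance(path, bytes):
        return path.decode(encoding)
    # str and os.PathLike objects are converted without any decoding step.
    return os.fspath(path)


# Bytes that were stored by older code under a known legacy code page.
legacy_bytes = "café.txt".encode("cp1252")
print(to_text_path(legacy_bytes, encoding="cp1252"))
```

Code that stored or exchanged bytes paths under a particular code page can keep working by naming that code page here, regardless of what the interpreter assumes by default.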

However, this view is also why I don't agree with being aggressive in
making this behaviour the default on Windows - I think we should make
it readily available as a provisional feature through a single
cross-platform command line switch and environment setting (e.g. "-X
utf8" and "PYTHONASSUMEUTF8") so folks that need it can readily opt in
to it, but we can defer making it the default until 3.7 after folks
have had a full release cycle's worth of experience with it in the
wild.

Given the people who would need to opt-in to the behaviour are merely the recipients of a library written by someone else, I don't think this is the right approach. Stephen Turnbull in an earlier post referred to organisations that fully control their systems in order to ensure that the implicit encodings all match. These are also the people who can apply an environment variable to avoid a behaviour change.

However, someone who just installed an HTTP library that was developed on POSIX and perhaps not even tested on Windows should not have to flick the switch themselves. In contrast, if it is known that 3.6 *definitely* changed something here, we will certainly see more effort applied to making sure libraries are updated. (Compare these two bug reports: "your library breaks on Python 3.6" vs "your library breaks on Python 3.6 when I set this environment variable". The fix for the latter is quite reasonably going to be "don't do that".)

The other discussion about OpenSSL and LTS systems is also interesting. Do we really expect users to take their fully functioning systems and blindly upgrade to a new major version of Python expecting everything to just work? That seems very unlikely to me, and also doesn't match my experience (but I can't quantify that in any useful way, so take it as you wish).


Cheers,
Steve

