Re: [Python-ideas] Allow manual creation of DirEntry objects

2016-08-17 Thread Nick Coghlan
On 17 August 2016 at 09:56, Victor Stinner  wrote:
> 2016-08-17 1:50 GMT+02:00 Guido van Rossum :
>> We could expose the class with a
>> constructor that always fails (the C code could construct instances through
>> a backdoor).
>
> Oh, in fact you cannot create an instance of os.DirEntry, it has no
> (Python) constructor:
>
> $ ./python
> Python 3.6.0a4+ (default:e615718a6455+, Aug 17 2016, 00:12:17)
> >>> import os
> >>> os.DirEntry(1)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: cannot create 'posix.DirEntry' instances
>
> Only os.scandir() can produce such objects.
>
> The question is still if it makes sense to allow to create DirEntry
> objects in Python :-)

I think it does, as it isn't really any different from someone calling
the stat() method on a DirEntry instance created by os.scandir(). It
also prevents folks attempting things like:

def slow_constructor(dirname, entryname):
    for entry in os.scandir(dirname):
        if entry.name == entryname:
            entry.stat()
            return entry

Allowing DirEntry construction from Python further gives us a
straightforward answer to the "stat caching" question: "just use
os.DirEntry instances and call stat() to make the snapshot"

If folks ask why os.DirEntry caches results when pathlib.Path doesn't,
we have the answer that cache invalidation is a hard problem, and
hence we consider it useful in the lower level interface that is
optimised for speed, but problematic in the higher level one that is
more focused on cross-platform correctness of filesystem interactions.
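To make the caching difference concrete, here is a minimal sketch. The helper name cached_vs_fresh_sizes and the file name data.bin are made up for illustration; only os.scandir(), DirEntry.stat() and pathlib.Path.stat() come from the standard library. It takes a stat() snapshot through a DirEntry, changes the file afterwards, and then compares what each interface reports.

```python
import os
import tempfile
from pathlib import Path


def cached_vs_fresh_sizes():
    """Return (size reported by the DirEntry cache, size from a fresh Path.stat())."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "data.bin"
        target.write_bytes(b"abc")

        with os.scandir(tmp) as it:
            entry = next(e for e in it if e.name == "data.bin")
            entry.stat()  # take the snapshot while the file is 3 bytes long

        # Grow the file after the snapshot was taken.
        target.write_bytes(b"abcdefgh")

        # DirEntry answers from its cache; Path asks the OS again.
        return entry.stat().st_size, target.stat().st_size


if __name__ == "__main__":
    print(cached_vs_fresh_sizes())
```

The DirEntry value is stale by design, which is the "snapshot" behaviour described above; whether that is a feature or a hazard depends on which layer the caller is working at.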

I don't know whether it would make sense to allow a pre-existing stat
result to be passed to DirEntry, but it does seem like it might be
useful for adapting existing stat-based backend APIs to a more
user-friendly DirEntry-based front end API.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Fix default encodings on Windows

2016-08-17 Thread Stephen J. Turnbull
Paul Moore writes:
 > On 16 August 2016 at 16:56, Steve Dower  wrote:

 > > This discussion is for the developers who insist on using bytes
 > > for paths within Python, and the question is, "how do we best
 > > represent UTF-16 encoded paths in bytes?"

That's incomplete, AFAICS.  (Paul makes this point somewhat
differently.)  We don't want to represent paths in bytes on Windows if
we can avoid it.  Nor does UTF-16 really enter into it (except for the
technical issue of invalid surrogate pairs).  So a full statement is,
"How do we best represent Windows file system paths in bytes for
interoperability with systems that natively represent paths in bytes?"
("Other systems" refers to both other platforms and existing programs
on Windows.)

BTW, why "surrogate pairs"?  Does Windows validate surrogates to
ensure they come in pairs, but not necessarily in the right order (or
perhaps sometimes they resolve to non-characters such as U+1)?

Paul says:

 > People passing bytes to open() have in my view, already chosen not
 > to follow the standard advice of "decode incoming data at the
 > boundaries of your application". They may have good reasons for
 > that, but it's perfectly reasonable to expect them to take
 > responsibility for manually tracking the encoding of the resulting
 > bytes values flowing through their code.

Abstractly true, but in practice there's no such need for those who
made the choice!  In a properly set up POSIX locale[1], it Just Works by
design, especially if you use UTF-8 as the preferred encoding.  It's
Windows developers and users who suffer, not those who wrote the code,
nor their primary audience which uses POSIX platforms.

 > It is of course, also true that "works for me in my environment" is
 > a viable strategy - but the maintenance cost of this strategy if
 > things change (whether in Python, or in the environment) is on the
 > application developers - they are hoping that cost is minimal, but
 > that's a risk they choose to take.

Nick's point is that the risk is on Windows users and developers for
the Windows platform who did *not* make that choice, but rather had it
made for them by developers on a different platform where it Just
Works.  He argues that we should level the playing field.

It's also relevant that those developers on the originating platform
for the code typically resist complexifying changes to make things
work on other platforms too (cf. Victor's advocacy of removing the
bytes APIs on Windows).  Victor's points are good IMO; he's not just
resisting Windows, there are real resource consequences.

 > Code using Unicode is unaffected, certainly. Ideally that means that
 > only a tiny minority of users should be affected. Are we over-reacting
 > to reports of standard practices in Japan? I've no idea.

AFAIK, India and Southeast Asia have already abandoned their
indigenous standards in favor of Unicode/UTF-8, so it doesn't matter
if they use str or bytes, either way Steve's proposal will Just Work.
I don't know anything about Arabic, Hebrew, Cyrillic, and Eastern
Europeans.  That leaves China, which is like Japan in having had a
practically universal encoding (ie, every script you'll actually see
roundtrips, emoji being the only practical issue) since the 1970s.  So
I suspect Chinese also primarily use their local code page (GB2312 or
GB18030) for plain text documents, possibly including .ini and
Makefiles.

Over-reaction?  I have no idea either.  Just a potentially widespread
risk, both to users and to Python's reputation for maintaining
compatibility.  (I don't think it's "fair", but among my acquaintances
Python has a poor rep -- Steve's argument that if you develop code for
3.5 you should expect to have to modify it to use it with 3.6 cuts no
ice with them.)

 > > If you see an alternative choice to those listed above, feel free
 > > to contribute it. Otherwise, can we focus the discussion on these
 > > (or any new) choices?
 > 
 > Accept that we should have deprecated builtin open and the io module,
 > but didn't do so. Extend the existing deprecation of bytes paths on
 > Windows, to cover *all* APIs, not just the os module. But modify the
 > deprecation to be "use of the Windows CP_ACP code page (via the ...A
 > Win32 APIs) is deprecated and will be replaced with use of UTF-8 as
 > the implied encoding for all bytes paths on Windows starting in Python
 > 3.7". Document and publicise it much more prominently, as it is a
 > breaking change. Then leave it one release for people to prepare for
 > the change.

I like this one!  If my paranoid fears are realized, in practice it
might have to wait two releases, but at least this announcement should
get people who are at risk to speak up.  If they don't, then you can
just call me "Chicken Little" and go ahead!


Footnotes: 
[1]  An oxymoron, but there you go.



Re: [Python-ideas] Fix default encodings on Windows

2016-08-17 Thread eryk sun
On Wed, Aug 17, 2016 at 9:35 AM, Stephen J. Turnbull
 wrote:
> BTW, why "surrogate pairs"?  Does Windows validate surrogates to
> ensure they come in pairs, but not necessarily in the right order (or
> perhaps sometimes they resolve to non-characters such as U+1)?

A program can pass the filesystem a name containing one or more
surrogate codes that isn't in a valid UTF-16 surrogate pair (i.e. a
leading code in the range D800-DBFF followed by a trailing code in the
range DC00-DFFF). In the user-mode runtime library and kernel
executive, nothing up to the filesystem driver checks for a valid
UTF-16 string. Microsoft's filesystems remain compatible with UCS2
from the 90s and don't care that the name isn't legal UTF-16. The same
goes for the in-memory filesystems used for named pipes (NPFS,
\\.\pipe) and mailslots (MSFS, \\.\mailslot). But non-Microsoft
filesystems don't necessarily store names as wide-character strings.
They may use UTF-8, in which case an invalid UTF-16 name will cause
the system call to fail because it's an invalid parameter.

If the filesystem allows creating such a  badly named file or
directory, it can still be accessed using a regular unicode path,
which is how things stand currently. I see that Victor has suggested
using "surrogatepass" in issue 27781. That would allow seamless
operation. The downside is that bytes have a higher chance of leaking
out of Python than strings created by 'surrogateescape' on Unix. But
since it isn't a proper Unicode string on disk, at least nothing has
changed substantively by transcoding to "surrogatepass" UTF-8.
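As a small illustration of the "surrogatepass" idea, the sketch below encodes a name containing a lone surrogate to UTF-8 and back. The sample name and the helper names are invented for the example; the error handler names are the ones Python's codecs module provides. It does not touch the filesystem and says nothing about how issue 27781 would actually be implemented.

```python
# A name with a lone leading surrogate (U+D800) and no trailing surrogate.
LONE_SURROGATE_NAME = "bad\ud800name"


def to_utf8_strict(name):
    """Encode with the default strict handler; return None if that fails."""
    try:
        return name.encode("utf-8")
    except UnicodeEncodeError:
        return None


def to_utf8_surrogatepass(name):
    """Encode, letting lone surrogates through as three-byte sequences."""
    return name.encode("utf-8", "surrogatepass")


def from_utf8_surrogatepass(raw):
    """Decode bytes produced by to_utf8_surrogatepass back to str."""
    return raw.decode("utf-8", "surrogatepass")


if __name__ == "__main__":
    print(to_utf8_strict(LONE_SURROGATE_NAME))
    encoded = to_utf8_surrogatepass(LONE_SURROGATE_NAME)
    print(encoded, from_utf8_surrogatepass(encoded) == LONE_SURROGATE_NAME)
```

The resulting bytes are not valid UTF-8 in the strict sense, which is the "leaking out of Python" concern: any consumer that validates UTF-8 will reject them.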


Re: [Python-ideas] Allow manual creation of DirEntry objects

2016-08-17 Thread Guido van Rossum
Brendan,

The conclusion is that you should just file a bug asking for a working
constructor -- or upload a patch if you want to.

--Guido

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-ideas] Fix default encodings on Windows

2016-08-17 Thread Steve Dower

On 17Aug2016 0235, Stephen J. Turnbull wrote:

> Paul Moore writes:
>  > On 16 August 2016 at 16:56, Steve Dower  wrote:
>
>  > > This discussion is for the developers who insist on using bytes
>  > > for paths within Python, and the question is, "how do we best
>  > > represent UTF-16 encoded paths in bytes?"
>
> That's incomplete, AFAICS.  (Paul makes this point somewhat
> differently.)  We don't want to represent paths in bytes on Windows if
> we can avoid it.  Nor does UTF-16 really enter into it (except for the
> technical issue of invalid surrogate pairs).  So a full statement is,
> "How do we best represent Windows file system paths in bytes for
> interoperability with systems that natively represent paths in bytes?"
> ("Other systems" refers to both other platforms and existing programs
> on Windows.)


That's incorrect, or at least possible to interpret correctly as the 
wrong thing. The goal is "code compatibility with systems ...", not 
interoperability.


Nothing about this will make it easier to take a path from Windows and 
use it on Linux or vice versa, but it will make it easier/more reliable 
to take code that uses paths on Linux and use it on Windows.



> BTW, why "surrogate pairs"?  Does Windows validate surrogates to
> ensure they come in pairs, but not necessarily in the right order (or
> perhaps sometimes they resolve to non-characters such as U+1)?


Eryk answered this better than I would have.


> Paul says:
>
>  > People passing bytes to open() have in my view, already chosen not
>  > to follow the standard advice of "decode incoming data at the
>  > boundaries of your application". They may have good reasons for
>  > that, but it's perfectly reasonable to expect them to take
>  > responsibility for manually tracking the encoding of the resulting
>  > bytes values flowing through their code.
>
> Abstractly true, but in practice there's no such need for those who
> made the choice!  In a properly set up POSIX locale[1], it Just Works by
> design, especially if you use UTF-8 as the preferred encoding.  It's
> Windows developers and users who suffer, not those who wrote the code,
> nor their primary audience which uses POSIX platforms.


You mentioned "locale", "preferred" and "encoding" in the same sentence, 
so I hope you're not thinking of locale.getpreferredencoding()? Changing 
that function is orthogonal to this discussion, despite the fact that in 
most cases it returns the same code page as what is going to be used by 
the file system functions (which in most cases will also be used by the 
encoding returned from sys.getfilesystemencoding()).
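For readers who want to see the two values side by side, here is a minimal sketch. The helper names report_encodings and roundtrip are invented for the example; the calls themselves are standard library functions. The values printed depend entirely on the platform and its configuration, so no particular output is claimed.

```python
import locale
import os
import sys


def report_encodings():
    """Collect the two encodings that are easy to confuse."""
    return {
        # Used for text I/O when no encoding is given (e.g. open() in text mode).
        "locale.getpreferredencoding": locale.getpreferredencoding(False),
        # Used to convert between str and bytes for paths.
        "sys.getfilesystemencoding": sys.getfilesystemencoding(),
    }


def roundtrip(name):
    """Convert a str path to bytes and back using the filesystem encoding."""
    return os.fsdecode(os.fsencode(name))


if __name__ == "__main__":
    for label, value in report_encodings().items():
        print(label, "->", value)
    print(roundtrip("example.txt"))
```

os.fsencode() and os.fsdecode() follow sys.getfilesystemencoding(), not the locale's preferred encoding, which is why the two can be discussed separately.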


When Windows developers and users suffer, I see it as my responsibility 
to reduce that suffering. Changing Python on Windows should do that 
without affecting developers on Linux, even though the Right Way is to 
change all the developers on Linux to use str for paths.



>  > > If you see an alternative choice to those listed above, feel free
>  > > to contribute it. Otherwise, can we focus the discussion on these
>  > > (or any new) choices?
>  >
>  > Accept that we should have deprecated builtin open and the io module,
>  > but didn't do so. Extend the existing deprecation of bytes paths on
>  > Windows, to cover *all* APIs, not just the os module. But modify the
>  > deprecation to be "use of the Windows CP_ACP code page (via the ...A
>  > Win32 APIs) is deprecated and will be replaced with use of UTF-8 as
>  > the implied encoding for all bytes paths on Windows starting in Python
>  > 3.7". Document and publicise it much more prominently, as it is a
>  > breaking change. Then leave it one release for people to prepare for
>  > the change.
>
> I like this one!  If my paranoid fears are realized, in practice it
> might have to wait two releases, but at least this announcement should
> get people who are at risk to speak up.  If they don't, then you can
> just call me "Chicken Little" and go ahead!


I don't think there's any reasonable way to noisily deprecate these 
functions within Python, but certainly the docs can be made clearer. 
People who explicitly encode with sys.getfilesystemencoding() should not 
get the deprecation message, but we can't tell whether they got their 
bytes from the right encoding or a RNG, so there's no way to discriminate.


I'm going to put together a summary post here (hopefully today) and get 
those who have been contributing to basically sign off on it, then I'll 
take it to python-dev. The possible outcomes I'll propose will basically 
be "do we keep the status quo, undeprecate and change the functionality, 
deprecate the deprecation and undeprecate/change in a couple releases, 
or say that it wasn't a real deprecation so we can deprecate and then 
change functionality in a couple releases".


Cheers,
Steve



Re: [Python-ideas] Fix default encodings on Windows

2016-08-17 Thread Nick Coghlan
On 17 August 2016 at 02:06, Chris Barker  wrote:
> Just to make sure this is clear, the Pragmatic logic is thus:
>
> * There are more *nix-centric developers in the Python ecosystem than
> Windows-centric (or even Windows-agnostic) developers.
>
> * The bytes path approach works fine on *nix systems.

For the given value of "works fine" that is "works fine, except when
it doesn't, and then you end up with mojibake".

> * Whatever might be Right and Just -- the reality is that a number of
> projects, including important and widely used libraries and frameworks, use
> the bytes API for working with filenames and paths, etc.
>
> Therefore, there is a lot of code that does not work right on Windows.
>
> Currently, to get it to work right on Windows, you need to write Windows
> specific code, which many folks don't want or know how to do (or just can't
> support one way or the other).
>
> So the Solution is to either:
>
>  (A) get everyone to use Unicode  "properly", which will work on all
> platforms (but only on py3.5 and above?)
>
> or
>
> (B) kludge some *nix-compatible support for byte paths into Windows, that
> will work at least much of the time.
>
> It's clear (to me at least) that (A) is the "Right Thing", but real world
> experience has shown that it's unlikely to happen any time soon.
>
> Practicality beats Purity and all that -- this is a judgment call.
>
> Have I got that right?

Yep, pretty much. Based on Stephen Turnbull's concerns, I wonder if we
could make a whitelist of universal encodings that Python-on-Windows
will use in preference to UTF-8 if they're configured as the current
code page. If we accepted GB18030, GB2312, Shift-JIS, and ISO-2022-*
as overrides, then problems would be significantly less likely.

Another alternative would be to apply a similar solution as we do on
Linux with regards to the "surrogateescape" error handler: there are
some interfaces (like the standard streams) where we only enable that
error handler specifically if the preferred encoding is reported as
ASCII. In 2016, we're *very* skeptical about any properly configured
system actually being ASCII-only (rather than that value showing up
because the POSIX standards mandate it as the default), so we don't
really believe the OS when it tells us that.

The equivalent for Windows would be to disbelieve the configured code
page only when it was reported as "mbcs" - for folks that had
configured their system to use something other than the default,
Python would believe them, just as we do on Linux.
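The "surrogateescape" handler mentioned above can be illustrated in isolation. This sketch only shows what the handler does to bytes that a nominally ASCII system cannot decode; it does not reproduce the logic CPython uses to decide when to enable it on the standard streams. The helper names and the sample bytes are invented for the example.

```python
# UTF-8 bytes for "café"; the last two bytes are outside ASCII.
RAW = b"caf\xc3\xa9"


def decode_ascii_escaped(raw):
    """Decode as ASCII, smuggling undecodable bytes through as lone surrogates."""
    return raw.decode("ascii", "surrogateescape")


def encode_ascii_escaped(text):
    """Reverse decode_ascii_escaped, restoring the original bytes."""
    return text.encode("ascii", "surrogateescape")


if __name__ == "__main__":
    text = decode_ascii_escaped(RAW)
    print(ascii(text))
    print(encode_ascii_escaped(text) == RAW)
```

Each undecodable byte 0xNN becomes the code point U+DCNN, so the original bytes survive a round trip even though the "ASCII" claim was wrong.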

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Fix default encodings on Windows

2016-08-17 Thread Steve Dower

On 17Aug2016 0901, Nick Coghlan wrote:

> On 17 August 2016 at 02:06, Chris Barker  wrote:
>
>> So the Solution is to either:
>>
>>  (A) get everyone to use Unicode  "properly", which will work on all
>> platforms (but only on py3.5 and above?)
>>
>> or
>>
>> (B) kludge some *nix-compatible support for byte paths into Windows, that
>> will work at least much of the time.
>>
>> It's clear (to me at least) that (A) is the "Right Thing", but real world
>> experience has shown that it's unlikely to happen any time soon.
>>
>> Practicality beats Purity and all that -- this is a judgment call.
>>
>> Have I got that right?


> Yep, pretty much. Based on Stephen Turnbull's concerns, I wonder if we
> could make a whitelist of universal encodings that Python-on-Windows
> will use in preference to UTF-8 if they're configured as the current
> code page. If we accepted GB18030, GB2312, Shift-JIS, and ISO-2022-*
> as overrides, then problems would be significantly less likely.
>
> Another alternative would be to apply a similar solution as we do on
> Linux with regards to the "surrogateescape" error handler: there are
> some interfaces (like the standard streams) where we only enable that
> error handler specifically if the preferred encoding is reported as
> ASCII. In 2016, we're *very* skeptical about any properly configured
> system actually being ASCII-only (rather than that value showing up
> because the POSIX standards mandate it as the default), so we don't
> really believe the OS when it tells us that.
>
> The equivalent for Windows would be to disbelieve the configured code
> page only when it was reported as "mbcs" - for folks that had
> configured their system to use something other than the default,
> Python would believe them, just as we do on Linux.


The problem here is that "mbcs" is not configurable - it's a 
meta-encoder that uses whatever is configured as the "language (system 
locale) to use when displaying text in programs that do not support 
Unicode" (quote from the dialog where administrators can configure 
this). So there's nothing to disbelieve here.


And even on machines where the current code page is "reliable", UTF-16 
is still the actual encoding, which means UTF-8 is still a better choice 
for representing the path as a blob of bytes. Currently we have 
inconsistent encoding between different Windows machines and could 
either remove that inconsistency completely or simply reduce it for 
(approx.) English speakers. I would rather an extreme here - either make 
it consistent regardless of user configuration, or make it so broken 
that nobody can use it at all. (And note that the correct way to support 
*some* other FS encodings would be to change the return value from 
sys.getfilesystemencoding(), which breaks people who currently ignore 
that just as badly as changing it to utf-8 would.)
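The inconsistency between machines can be sketched with plain codecs, without touching a real filesystem. The sample name and helper names below are invented for the example, and cp932 and cp1252 merely stand in for "the code page configured on one machine versus another".

```python
# "日本語.txt", written with escapes so the source stays ASCII.
NAME = "\u65e5\u672c\u8a9e.txt"


def encode_or_none(name, encoding):
    """Encode a path name, returning None if the code page cannot represent it."""
    try:
        return name.encode(encoding)
    except UnicodeEncodeError:
        return None


def misread(name, written_as, read_as):
    """Simulate bytes produced under one code page being read under another."""
    return name.encode(written_as).decode(read_as, errors="replace")


if __name__ == "__main__":
    for codec in ("cp932", "cp1252", "utf-8"):
        print(codec, encode_or_none(NAME, codec))
    print(ascii(misread(NAME, "cp932", "cp1252")))
```

With a code-page-dependent encoding, the same name is either unrepresentable or turns into different bytes depending on the machine; with UTF-8 the bytes are the same everywhere, which is the consistency argument made above.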


Cheers,
Steve



Re: [Python-ideas] Allow manual creation of DirEntry objects

2016-08-17 Thread Serhiy Storchaka

On 16.08.16 22:35, Brendan Moloney wrote:

> I have a bunch of functions that operate on DirEntry objects, typically
> doing some sort of filtering to select the paths I actually want to
> process. The overwhelming majority of the time these functions are going
> to be operating on DirEntry objects produced by the scandir function,
> but there are some cases where the user will be supplying the path
> themselves (for example, the root of a directory tree to process). In my
> current code base that uses the scandir package I just wrap these paths
> in a 'GenericDirEntry' object and then pass them through the filter
> functions the same as any results coming from the scandir function.


You can just create an object that duck-types DirEntry. See for example 
_DummyDirEntry in the os module.
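A duck-typed stand-in along those lines might look like the sketch below. The class name PathEntry and the demo() helper are hypothetical, and this is not the code of _DummyDirEntry or of the scandir package's GenericDirEntry; it only mirrors the public DirEntry attributes and methods (name, path, stat, is_dir, is_file, is_symlink, inode) and caches stat results in a similar spirit.

```python
import os
import stat as stat_module
import tempfile


class PathEntry:
    """Hypothetical object exposing the os.DirEntry interface for one path."""

    def __init__(self, path):
        self.path = os.fspath(path)
        self.name = os.path.basename(os.path.normpath(self.path))
        self._stat = None
        self._lstat = None

    def stat(self, *, follow_symlinks=True):
        # Cache each result so repeated calls do not hit the OS again.
        if follow_symlinks:
            if self._stat is None:
                self._stat = os.stat(self.path)
            return self._stat
        if self._lstat is None:
            self._lstat = os.lstat(self.path)
        return self._lstat

    def _mode_matches(self, predicate, follow_symlinks):
        # Simplification: any OSError (not only a missing file) counts as False.
        try:
            mode = self.stat(follow_symlinks=follow_symlinks).st_mode
        except OSError:
            return False
        return predicate(mode)

    def is_dir(self, *, follow_symlinks=True):
        return self._mode_matches(stat_module.S_ISDIR, follow_symlinks)

    def is_file(self, *, follow_symlinks=True):
        return self._mode_matches(stat_module.S_ISREG, follow_symlinks)

    def is_symlink(self):
        return self._mode_matches(stat_module.S_ISLNK, False)

    def inode(self):
        return self.stat(follow_symlinks=False).st_ino

    def __fspath__(self):
        return self.path


def demo():
    """Exercise PathEntry on a temporary directory and return a few facts."""
    with tempfile.TemporaryDirectory() as tmp:
        file_path = os.path.join(tmp, "a.txt")
        with open(file_path, "w") as handle:
            handle.write("hello")
        root = PathEntry(tmp)
        entry = PathEntry(file_path)
        missing = PathEntry(os.path.join(tmp, "missing"))
        return (root.is_dir(), root.is_file(), entry.is_file(), entry.name,
                entry.stat() is entry.stat(), missing.is_file())
```

Filter functions written against DirEntry can then accept a user-supplied root by wrapping it, for example PathEntry(root_dir), alongside entries yielded by os.scandir().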





Re: [Python-ideas] Fix default encodings on Windows

2016-08-17 Thread Stephen J. Turnbull
eryk sun writes:
 > On Wed, Aug 17, 2016 at 9:35 AM, Stephen J. Turnbull
 >  wrote:
 > > BTW, why "surrogate pairs"?  Does Windows validate surrogates to
 > > ensure they come in pairs, but not necessarily in the right order (or
 > > perhaps sometimes they resolve to non-characters such as U+1)?
 > 
 > Microsoft's filesystems remain compatible with UCS2

So it's not just invalid surrogate *pairs*, it's invalid surrogates of
all kinds.  This means that it's theoretically possible (though I
gather that it's unlikely in the extreme) for a real Windows filename
to be indistinguishable from one generated by Python's surrogateescape
handler.
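The collision can be shown with codecs alone. In this sketch the helper names are invented, and the "Windows name" is simulated as raw UTF-16-LE code units rather than read from a real filesystem.

```python
def name_from_posix_bytes(raw):
    """How an undecodable POSIX filename byte reaches str via surrogateescape."""
    return raw.decode("utf-8", "surrogateescape")


def name_from_utf16_units(raw):
    """How a UCS-2 style name with a lone surrogate would reach str."""
    return raw.decode("utf-16-le", "surrogatepass")


# The byte 0xFF is never valid in UTF-8, so surrogateescape maps it to U+DCFF.
ESCAPED = name_from_posix_bytes(b"\xff")

# A single 16-bit unit 0xDCFF (little-endian), i.e. a lone trailing surrogate.
REAL_LONE_SURROGATE = name_from_utf16_units(b"\xff\xdc")


if __name__ == "__main__":
    print(ESCAPED == REAL_LONE_SURROGATE)
    print(ESCAPED.encode("utf-8", "surrogateescape"))
    print(REAL_LONE_SURROGATE.encode("utf-8", "surrogatepass"))
```

Both routes yield the same one-character str, yet the two error handlers turn it back into different bytes, so the str alone does not say which origin it had.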

What happens when Python's directory manipulation functions on Windows
encounter such a filename?  Do they try to write it to the disk
directory?  Do they succeed?  Does that depend on surrogateescape?

Is there a reason in practice to allow surrogateescape at all on names
in Windows filesystems, at least when using the *W API?  You mention
non-Microsoft filesystems; are they common enough to matter?

I admit that as we converge on sanity (UTF-8 for text/* content, some
kind of Unicode for filesystem names) none of this is very likely to
matter, but I'm a worrywart.

Steve


Re: [Python-ideas] Fix default encodings on Windows

2016-08-17 Thread Stephen J. Turnbull
Steve Dower writes:
 > On 17Aug2016 0235, Stephen J. Turnbull wrote:

 > > So a full statement is, "How do we best represent Windows file
 > > system paths in bytes for interoperability with systems that
 > > natively represent paths in bytes?"  ("Other systems" refers to
 > > both other platforms and existing programs on Windows.)
 > 
 > That's incorrect, or at least possible to interpret correctly as
 > the wrong thing. The goal is "code compatibility with systems ...",
 > not interoperability.

You're right, I stated that incorrectly.  I don't have anything to add
to your corrected version.

 > > In a properly set up POSIX locale[1], it Just Works by design,
 > > especially if you use UTF-8 as the preferred encoding.  It's
 > > Windows developers and users who suffer, not those who wrote the
 > > code, nor their primary audience which uses POSIX platforms.
 > 
 > You mentioned "locale", "preferred" and "encoding" in the same sentence, 
 > so I hope you're not thinking of locale.getpreferredencoding()? Changing 
 > that function is orthogonal to this discussion,

You consistently ignore Makefiles, .ini, etc.  It is *not* orthogonal,
it is *the* reason for all opposition to your proposal or request that
it be delayed.  Filesystem names *are* text in part because they are
*used as filenames in text*.

 > When Windows developers and users suffer, I see it as my responsibility 
 > to reduce that suffering. Changing Python on Windows should do that 
 > without affecting developers on Linux, even though the Right Way is to 
 > change all the developers on Linux to use str for paths.

I resent that.  If I were a partisan Linux fanboy, I'd be cheering you
on because I think your proposal is going to hurt an identifiable and
large class of *Windows* users.  I know about and fear this possibility
because they use a language I love (Japanese) and an encoding I hate
but have achieved a state of peaceful coexistence with (Shift JIS).

And on the general principle, *I* don't disagree.  I mentioned earlier
that I use only the str interfaces in my own code on Linux and Mac OS
X, and that I suspect that there are no real efficiency implications
to using str rather than bytes for those interfaces.

On the other hand, the programming convenience of reading the
occasional "text" filename (or other text, such as XML tags) out of a
binary stream and passing it directly to filesystem APIs cannot be
denied.  I think that the kind of usage you propose (a fixed,
universal codec, universally accepted; ie, 'utf-8') is the best way to
handle that in the long run.  But as Grandmaster Lasker said, "Before
the end game, the gods have placed the middle game."  (Lord Keynes
isn't relevant here, Python will outlive all of us. :-)

 > I don't think there's any reasonable way to noisily deprecate these
 > functions within Python, but certainly the docs can be made
 > clearer. People who explicitly encode with
 > sys.getfilesystemencoding() should not get the deprecation message,
 > but we can't tell whether they got their bytes from the right
 > encoding or a RNG, so there's no way to discriminate.

I agree with you within Python; the custom is for DeprecationWarnings
to be silent by default.
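A minimal sketch of why such a deprecation is easy to miss, and how a concerned developer can surface it. The function open_with_bytes_path is a made-up stand-in, not a real API, and the message text is invented; the warnings module calls are standard. The exact default filter for DeprecationWarning varies between Python versions, so the sketch sets the filter explicitly instead of relying on the default.

```python
import warnings


def open_with_bytes_path(path):
    """Hypothetical API that warns when it is handed a bytes path."""
    if isinstance(path, bytes):
        warnings.warn(
            "bytes paths are deprecated here (illustrative message)",
            DeprecationWarning,
            stacklevel=2,
        )
    return path


def collect_warnings(func, *args):
    """Call func with every warning enabled and return the categories raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func(*args)
    return [w.category for w in caught]


if __name__ == "__main__":
    print(collect_warnings(open_with_bytes_path, b"data.txt"))
    print(collect_warnings(open_with_bytes_path, "data.txt"))
```

Running a test suite with `python -W error::DeprecationWarning` has a similar effect without code changes: the warning becomes an exception instead of being filtered out.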

As for "making noise", how about announcing the deprecation as
the top headline for 3.6, postponing the actual change to 3.7, and in
the meantime you and Nick do a keynote duet at PyCon?  (Your partner
could be Guido, too, but Nick has been the most articulate proponent
for this particular aspect of "inclusion".  I think having a
representative from the POSIX world explaining the importance of this
for "all of us" would greatly multiply the impact.)  Perhaps, given my
proposed timing, a discussion at the language summit in '17 and the
keynote in '18 would be the best timing.

(OT, political: I've been strongly influenced in this proposal by
recently reading http://blog.aurynn.com/contempt-culture.  There's not
as much of it in Python as in other communities I'm involved in, but I
think this would be a good symbolic opportunity to express our
opposition to it.  "Inclusion" isn't just about gender and race!)

 > I'm going to put together a summary post here (hopefully today) and get 
 > those who have been contributing to basically sign off on it, then I'll 
 > take it to python-dev. The possible outcomes I'll propose will basically 
 > be "do we keep the status quo, undeprecate and change the functionality, 
 > deprecate the deprecation and undeprecate/change in a couple releases, 
 > or say that it wasn't a real deprecation so we can deprecate and then 
 > change functionality in a couple releases".

FWIW, of those four, I dislike 'status quo' the most, and like 'say it
wasn't real, deprecate and change' the best.  Although I lean toward
phrasing that as "we deprecated it, but we realize that practitioners
are by and large not aware of the deprecation, and nobody expects the
Spanish Inquisition".

@Nick, if you're watching: I wonder if it would be