Re: [Python-ideas] tweaking the file system path protocol

2017-05-29 Thread Wolfgang Maier

On 05/29/2017 09:55 AM, Serhiy Storchaka wrote:
> 29.05.17 00:33, Wolfgang Maier wrote:
>> The path protocol does *not* use __fspath__ as an indicator that an
>> object's str-representation is intended to be used as a path. If you
>> had wanted this, the PEP should have defined __fspath__ not as a
>> method, but as a flag and have the protocol check that flag, then call
>> __str__ if appropriate.
>
> __fspath__ is a method because there is a need to support bytes paths.
> __fspath__() can return a bytes object, str() can't.

That's certainly one reason, but again just shows that calling
str(path_object) to get a path representation is wrong.
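
A minimal sketch of the bytes case, assuming a hypothetical class BytesPath (not from the thread): os.fspath() hands back the bytes path, while str() only produces the default object representation.

```python
import os


class BytesPath:
    """Hypothetical path-like class that stores its path as bytes."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        # The protocol allows returning either str or bytes.
        return self._raw


p = BytesPath(b"/tmp/data.bin")
as_path = os.fspath(p)   # the bytes object returned by __fspath__
as_text = str(p)         # default object repr, not usable as a path
```

Nothing about str(p) here is path-like, which is why the protocol cannot be replaced by a flag plus __str__.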


___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] tweaking the file system path protocol

2017-05-29 Thread Serhiy Storchaka

29.05.17 00:33, Wolfgang Maier wrote:
> The path protocol does
> *not* use __fspath__ as an indicator that an object's str-representation
> is intended to be used as a path. If you had wanted this, the PEP should
> have defined __fspath__ not as a method, but as a flag and have the
> protocol check that flag, then call __str__ if appropriate.

__fspath__ is a method because there is a need to support bytes paths.
__fspath__() can return a bytes object, str() can't.




Re: [Python-ideas] tweaking the file system path protocol

2017-05-28 Thread Juancarlo Añez
On Sun, May 28, 2017 at 5:33 PM, Wolfgang Maier <
wolfgang.ma...@biologie.uni-freiburg.de> wrote:

> With __fspath__ being a method that can return whatever its author sees
> fit, calling str to get a path from an arbitrary object is just as wrong as
> it always was - it will work for pathlib.Path objects and might or might
> not work for some other types. Importantly, this has nothing to do with
> this proposal, but is in the nature of the protocol as it is defined *now*.


+1


-- 
Juancarlo *Añez*


Re: [Python-ideas] tweaking the file system path protocol

2017-05-28 Thread Wolfgang Maier

On 28.05.2017 18:32, Steven D'Aprano wrote:

> On Sun, May 28, 2017 at 05:35:38PM +0300, Koos Zevenhoven wrote:
>
>> Don't get me wrong, I like consistency very much. But regarding the
>> __fspath__ case, there are not that many people *writing*
>> fspath-enabled classes. Instead, there are many many many more people
>> *using* such classes (and dealing with their compatibility issues in
>> different ways).
>
> What sort of compatibility issues are you referring to? os.fspath is new
> in 3.6, and 3.7 isn't out yet, so I'm having trouble understanding what
> compatibility issues you mean.



As far as I'm aware the only such issue people had was with building 
interfaces that could deal with regular strings and pathlib.Path 
(introduced in 3.4 if I remember correctly) instances alike. Because 
calling str on a pathlib.Path instance returns the path as a regular 
string it looked like it could become a (bad) habit to just always call 
str on any received object for "compatibility" with both types of path 
representations. The path protocol is a response to this that provides 
an explicit and safe alternative.
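
A minimal sketch of the difference, assuming a hypothetical helper as_path_string (not from the PEP): os.fspath() accepts plain strings and pathlib objects alike, but rejects objects that are not paths, whereas str() converts anything.

```python
import os
import pathlib


def as_path_string(path):
    """Hypothetical helper: accept str, bytes, or any path-like object."""
    return os.fspath(path)


from_str = as_path_string("data/report.txt")
from_path = as_path_string(pathlib.PurePosixPath("data/report.txt"))

# str() happily converts objects that are not paths at all ...
silently_wrong = str(42)

# ... whereas os.fspath() refuses them with a TypeError.
try:
    as_path_string(42)
    rejected = False
except TypeError:
    rejected = True
```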





>> For those people, the current behavior brings consistency
>
> That's a very unintuitive statement. How is it consistent for fspath to
> call the __fspath__ dunder method for some objects but ignore it for
> others?



The path protocol brings a standard way of dealing with diverse path 
representations, but only if you use it. If people keep using 
str(path_object) as before, then they are doing things wrongly and are 
no better or safer off than they were before! The path protocol does 
*not* use __fspath__ as an indicator that an object's str-representation 
is intended to be used as a path. If you had wanted this, the PEP should 
have defined __fspath__ not as a method, but as a flag and have the 
protocol check that flag, then call __str__ if appropriate.
With __fspath__ being a method that can return whatever its author sees 
fit, calling str to get a path from an arbitrary object is just as wrong 
as it always was - it will work for pathlib.Path objects and might or 
might not work for some other types. Importantly, this has nothing to do 
with this proposal, but is in the nature of the protocol as it is 
defined *now*.
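
A small sketch of such a type, assuming a hypothetical class Attachment whose __str__ is a display label rather than a path:

```python
import os


class Attachment:
    """Hypothetical path-like class whose str() is a display label."""

    def __init__(self, label, location):
        self.label = label
        self.location = location

    def __str__(self):
        # Meant for humans, not for the file system.
        return "Attachment: " + self.label

    def __fspath__(self):
        return self.location


item = Attachment("quarterly report", "/srv/files/q3.pdf")
shown_to_user = str(item)       # 'Attachment: quarterly report'
used_as_path = os.fspath(item)  # '/srv/files/q3.pdf'
```

Code that calls str(item) to obtain a path gets the label, which is the failure mode described above.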





>> ---after all, it was of course designed by thinking about
>> it from all angles and not just based on my or anyone else's own use
>> cases only.
>
> Can you explain the reasoning to us? I don't think it is explained in the
> PEP.






Re: [Python-ideas] tweaking the file system path protocol

2017-05-28 Thread Steven D'Aprano
On Sun, May 28, 2017 at 05:35:38PM +0300, Koos Zevenhoven wrote:

> Don't get me wrong, I like consistency very much. But regarding the
> __fspath__ case, there are not that many people *writing*
> fspath-enabled classes. Instead, there are many many many more people
> *using* such classes (and dealing with their compatibility issues in
> different ways).

What sort of compatibility issues are you referring to? os.fspath is new 
in 3.6, and 3.7 isn't out yet, so I'm having trouble understanding what 
compatibility issues you mean.


> For those people, the current behavior brings consistency

That's a very unintuitive statement. How is it consistent for fspath to 
call the __fspath__ dunder method for some objects but ignore it for 
others?
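
The behaviour in question can be reproduced with a short sketch; the class names PlainPath and StrPath are hypothetical. As far as I can tell, os.fspath() returns str and bytes instances (including subclass instances) unchanged before it ever looks for __fspath__:

```python
import os


class PlainPath:
    """Does not inherit from str: __fspath__ is called."""

    def __init__(self, name):
        self.name = name

    def __fspath__(self):
        return "/plain/" + self.name


class StrPath(str):
    """Inherits from str: __fspath__ is ignored by os.fspath()."""

    def __fspath__(self):
        return "/sub/" + str(self)


plain_result = os.fspath(PlainPath("a.txt"))  # '/plain/a.txt'
sub_result = os.fspath(StrPath("a.txt"))      # 'a.txt', object returned as-is
```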


> ---after all, it was of course designed by thinking about
> it from all angles and not just based on my or anyone else's own use
> cases only.

Can you explain the reasoning to us? I don't think it is explained in the 
PEP.


-- 
Steve


Re: [Python-ideas] tweaking the file system path protocol

2017-05-28 Thread Koos Zevenhoven
On Sun, May 28, 2017 at 9:15 AM, Nick Coghlan wrote:
>
> However, if we *did* make such a change, it should also be made for
> operator.index as well, since that is similarly inconsistent with the
> way the int/float/etc constructor protocols work:
>

Part of this discussion seems to consider consistency as the only
thing that matters, but consistency is only the surface here. I won't
comment on the __index__ issue, and especially not call it a
"misfeature", because I haven't thought about it deeply, and my
comments on it would be very shallow. I might ask about it though,
like the OP did.

Don't get me wrong, I like consistency very much. But regarding the
__fspath__ case, there are not that many people *writing*
fspath-enabled classes. Instead, there are many many many more people
*using* such classes (and dealing with their compatibility issues in
different ways). For those people, the current behavior brings
consistency---after all, it was of course designed by thinking about
it from all angles and not just based on my or anyone else's own use
cases only.

-- Koos

> >>> from operator import index
> >>> class MyInt(int):
> ...     def __int__(self):
> ...         return 5
> ...     def __index__(self):
> ...         return 5
> ...
> >>> int(MyInt(10))
> 5
> >>> index(MyInt(10))
> 10
> >>> class MyFloat(float):
> ...     def __float__(self):
> ...         return 5.0
> ...
> >>> float(MyFloat(10))
> 5.0
> >>> class MyComplex(complex):
> ...     def __complex__(self):
> ...         return 5j
> ...
> >>> complex(MyComplex(10j))
> 5j
> >>> class MyStr(str):
> ...     def __str__(self):
> ...         return "Hello"
> ...
> >>> str(MyStr("Not hello"))
> 'Hello'
> >>> class MyBytes(bytes):
> ...     def __bytes__(self):
> ...         return b"Hello"
> ...
> >>> bytes(MyBytes(b"Not hello"))
> b'Hello'
>
> Regards,
> Nick.
>

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-ideas] tweaking the file system path protocol

2017-05-27 Thread Nick Coghlan
On 28 May 2017 at 15:18, Steven D'Aprano wrote:
> On Fri, May 26, 2017 at 03:58:23PM +0300, Koos Zevenhoven wrote:
> I think this argument about backwards compatibility is a storm in a tea
> cup. We can enumerate all the possibilities:
>
> 1. object that doesn't inherit from str/bytes: behaviour is unchanged;
>
> 2. object that does inherit from str/bytes, but doesn't override
>the __fspath__ method: behaviour is unchanged;
>
> 3. object that inherits from str/bytes, *and* overrides the __fspath__
>method: behaviour is changed.
>
> Okay, the behaviour changes. I doubt that there will be many
> classes that subclass str and override __fspath__ now, because
> that would have been a waste of time up to now. So the main risk is:
>
> - classes created from Python 3.7 onwards;
> - which inherit from str/bytes;
> - and which override __fspath__;
> - and are back-ported to 3.6;
> - without taking into account that __fspath__ will be ignored in 3.6;
> - and the users don't read the docs to learn about the difference.
>
> The danger here is the possibility that the wrong pathname will be used,
> if str(obj) and fspath(obj) return a different string.
>
> Personally I think this is unlikely and not worth worrying about beyond
> a note in the documentation, but if people really feel this is a problem
> we could make this a __future__ import. But that just feels like overkill.

It wouldn't even need to be a __future__ import, as we have a runtime
warning category specifically for this kind of change:
https://docs.python.org/3/library/exceptions.html#FutureWarning

So *if* a change like this was made, the appropriate transition plan would be:

Python 3.7: at *class definition time*, we emit FutureWarning for
subclasses of str and bytes that define __fspath__, saying that it is
currently ignored for such subclasses, but will be called in Python
3.8+
Python 3.8: os.fspath() is changed as Wolfgang proposes, such that
explicit protocol support takes precedence over builtin inheritance
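
The 3.7 step would have to live in the interpreter itself, but the shape of the check can be sketched at user level. The decorator name warn_if_fspath_ignored is hypothetical and only illustrates the condition and the warning category:

```python
import warnings


def warn_if_fspath_ignored(cls):
    """Hypothetical class decorator mimicking the proposed definition-time check."""
    # Only classes that both inherit from str/bytes and define __fspath__
    # in their own body are affected by the proposed change.
    if issubclass(cls, (str, bytes)) and "__fspath__" in vars(cls):
        warnings.warn(
            "__fspath__ is currently ignored for str/bytes subclasses "
            "but would be called after the proposed change",
            FutureWarning,
            stacklevel=2,
        )
    return cls


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    @warn_if_fspath_ignored
    class Affected(str):
        def __fspath__(self):
            return "/base/" + str(self)

    @warn_if_fspath_ignored
    class Unaffected(str):
        pass

categories = [w.category for w in caught]
```

Only the class that overrides __fspath__ triggers the warning; a plain str subclass stays silent.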

However, if we *did* make such a change, it should also be made for
operator.index as well, since that is similarly inconsistent with the
way the int/float/etc constructor protocols work:

>>> from operator import index
>>> class MyInt(int):
...     def __int__(self):
...         return 5
...     def __index__(self):
...         return 5
...
>>> int(MyInt(10))
5
>>> index(MyInt(10))
10
>>> class MyFloat(float):
...     def __float__(self):
...         return 5.0
...
>>> float(MyFloat(10))
5.0
>>> class MyComplex(complex):
...     def __complex__(self):
...         return 5j
...
>>> complex(MyComplex(10j))
5j
>>> class MyStr(str):
...     def __str__(self):
...         return "Hello"
...
>>> str(MyStr("Not hello"))
'Hello'
>>> class MyBytes(bytes):
...     def __bytes__(self):
...         return b"Hello"
...
>>> bytes(MyBytes(b"Not hello"))
b'Hello'

Regards,
Nick.

P.S. I'll also echo Steven's observations that it is entirely
inappropriate to describe the thinking of other posters to the list as
being overly shallow. The entire reason we *have* python-ideas and the
PEP process is because programming language design is a *hard
problem*, especially for a language with as broad a set of use cases
as Python. Rather than trying to somehow survey the entire world of
Python developers, we instead provide them with an open forum where
they can say "This surprises or otherwise causes problems for me" and
describe their perspective. That's neither deep nor shallow thinking,
it's just different people using the same language in different ways,
and hence encountering different pain points.

As far as the specific point at hand goes, I think contrasting the
behaviour of PEP 357 (__index__) and PEP 519 (__fspath__) with the
behaviour of the builtin constructor protocols suggest that this is
better characterised as an oversight in the design of the more recent
protocols, since neither PEP explicitly discusses the problem, both
PEPs were specifically designed to permit the use of objects that
*don't* inherit from the relevant builtin types (since subclasses
already worked), and both PEPs handle the "subclass that also
implements the corresponding protocol" scenario differently from the
way the builtin constructor protocols handle it.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] tweaking the file system path protocol

2017-05-27 Thread Steven D'Aprano
On Fri, May 26, 2017 at 03:58:23PM +0300, Koos Zevenhoven wrote:
> On Wed, May 24, 2017 at 5:52 PM, Wolfgang Maier
>  wrote:
> > On 05/24/2017 02:41 AM, Steven D'Aprano wrote:

[...]

> > This is almost exactly what I have been thinking (just that I couldn't have
> > presented it so clearly)!
> 
> Unfortunately, this thinking is also very shallow compared to what
> went into PEP519.

That is a rather rude comment. How would you feel if Wolfgang or I said 
that the PEP's thinking was "very shallow"? (I see you are listed as 
co-author.)

If you are going to criticise our reasoning, you better give reasons for 
why we are wrong, not just insult us:

"...this thinking is very shallow..."

"This is exactly the kind of code that causes the problems."

"Isn't it great that it doesn't work, so it's not attractive anymore?"

"Yes, this is another way of shooting yourself in the foot."


Let me look at your objections:

> str and bytes subclasses that
> return something different from the str/bytes content should not be
> written.

That's your opinion, other people might disagree. In another post, you 
said it would be "confusing". I think this argument is FUD ("Fear, 
Uncertainty, Doubt"). We can already write confusing code in a million 
other ways, why is this one to be prohibited?

I don't know of any other area of Python where a type isn't permitted to 
override its own dunders:

strings have __str__ and __repr__
floats have __float__
ints have __int__
tuples can override __getitem__ to return whatever they like

etc. This is legal:

py> class ConfusingStr(str):
...     def __getitem__(self, i):
...         return 'x'
...
py> s = ConfusingStr("Nobody expects the Spanish Inquisition!")
py> s[5]
'x'


People have had the ability to write "confusing" strings, floats and 
ints which could return something different from their own value. They 
either don't do it, or if they do, they have a good reason and it 
isn't so confusing.

And if somebody does use it to write a confusing class? So what? 
"consenting adults" applies here. We aren't responsible for every abuse 
of the language that somebody might do. Why is __fspath__ so special 
that we need to protect users from doing something confusing?

What *really is* confusing is to ignore __fspath__ methods in some 
objects but not other objects. If that decision was intentional, I don't 
think it was justified in the PEP. (At least, I didn't see it.)


> > Let's look at a potential usecase for this. Assume that in a package you want
> > to handle several paths to different files and directories that are all
> > located in a common package-specific parent directory. Then using the path
> > protocol you could write this:
> >
> > import os
> >
> > class PackageBase (object):
> >     basepath = '/home/.package'
> >
> > class PackagePath (str, PackageBase):
> >     def __fspath__ (self):
> >         return os.path.join(self.basepath, str(self))
> >
> > config_file = PackagePath('.config')
> > log_file = PackagePath('events.log')
> > data_dir = PackagePath('data')
> >
> > with open(log_file, 'w') as log:
> >     log.write('package paths initialized.\n')
> >
> 
> This is exactly the kind of code that causes the problems. It will do
> the wrong thing when code like open(str(log_file), 'w') is used for
> compatibility.

Then don't do that.

Using open(str(log_file), 'w') is not the right way to emulate the Path 
protocol for backwards compatibility. The whole reason the Path protocol 
exists is because calling str(obj) is the wrong way to convert an 
unknown object to a file system path string.

I think this argument about backwards compatibility is a storm in a tea 
cup. We can enumerate all the possibilities:

1. object that doesn't inherit from str/bytes: behaviour is unchanged;

2. object that does inherit from str/bytes, but doesn't override
   the __fspath__ method: behaviour is unchanged;

3. object that inherits from str/bytes, *and* overrides the __fspath__ 
   method: behaviour is changed.

Okay, the behaviour changes. I doubt that there will be many 
classes that subclass str and override __fspath__ now, because 
that would have been a waste of time up to now. So the main risk is:

- classes created from Python 3.7 onwards;
- which inherit from str/bytes;
- and which override __fspath__;
- and are back-ported to 3.6;
- without taking into account that __fspath__ will be ignored in 3.6;
- and the users don't read the docs to learn about the difference.

The danger here is the possibility that the wrong pathname will be used, 
if str(obj) and fspath(obj) return a different string.

Personally I think this is unlikely and not worth worrying about beyond
a note in the documentation, but if people really feel this is a problem
we could make this a __future__ import. But that just feels like overkill.



-- 
Steve

Re: [Python-ideas] tweaking the file system path protocol

2017-05-26 Thread Koos Zevenhoven
Accidentally sent the email before it was done. Additions / corrections below:

On Fri, May 26, 2017 at 3:58 PM, Koos Zevenhoven wrote:
> On Wed, May 24, 2017 at 5:52 PM, Wolfgang Maier
>>
>> - makes the "path.__fspath__() if hasattr(path, "__fspath__") else path"
>> idiom consistent for subclasses of str and bytes that define __fspath__
>>
>
> One can discuss whether this is the best idiom to use (I did not write
> it, so maybe someone else has comments).
>
> Anyway, some may want to use
>
> path.__fspath__() if hasattr(path, "__fspath__") else str(path)
>
> and some may want
>
> path if isinstance(path, (str, bytes)) else path.__fspath__()
>
> Or others may not be after oneliners like this and instead include the
> full implementation of fspath in their code—or even better, with some
> modifications.
>
> Really, the best thing to use in pre-3.6 might be more like:
>
> import pathlib
>
> def fspath(path):
>     if isinstance(path, (str, bytes)):
>         return path
>     if hasattr(path, '__fspath__'):
>         return path.__fspath__()
>     if type(path).__name__ == 'DirEntry':
>         return path.path
>     if isinstance(path, pathlib.PurePath):
>         return str(path)
>     raise TypeError("Argument cannot be interpreted as a file system path: "
>                     + repr(path))
>

In the above, I have to check type(path).__name__, because DirEntry
was not exposed as os.DirEntry in 3.5 yet.
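
For comparison, on 3.6 and later the entries produced by os.scandir() implement the path protocol themselves and the class is reachable as os.DirEntry, so the name check is only needed on older versions. A small sketch (the file name example.txt is arbitrary):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file so that the directory has an entry to inspect.
    with open(os.path.join(tmp, "example.txt"), "w"):
        pass

    with os.scandir(tmp) as entries:
        entry = next(entries)
        type_name = type(entry).__name__                # 'DirEntry'
        same_path = os.fspath(entry) == entry.path      # protocol agrees with .path
        exposed = isinstance(entry, os.DirEntry)        # available since 3.6
```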

For pre-3.4 Python and for older third-party libraries that do inherit
from str/bytes, one could even use something like:

def fspath(path):
    if isinstance(path, (str, bytes)):
        return path
    if hasattr(type(path), '__fspath__'):
        return type(path).__fspath__(path)
    if type(path).__name__ == 'DirEntry':
        return path.path
    # add whatever known names for path classes (what a hack!)
    if "Path" in type(path).__name__:
        return str(path)
    raise TypeError("Argument cannot be interpreted as a file system path: "
                    + repr(path))
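
Looking up __fspath__ on type(path) rather than on the instance matches how os.fspath() itself resolves the method, as far as I can tell. A sketch with two hypothetical classes, TypeLevel and InstanceOnly:

```python
import os


class TypeLevel:
    def __fspath__(self):
        return "/from/type"


class InstanceOnly:
    pass


obj = InstanceOnly()
# Attach __fspath__ to the instance only, not to the class.
obj.__fspath__ = lambda: "/from/instance"

type_level_result = os.fspath(TypeLevel())

# hasattr() on the instance finds the attribute ...
found_by_hasattr = hasattr(obj, "__fspath__")

# ... but os.fspath() only consults the type, so it rejects the object.
try:
    os.fspath(obj)
    accepted = True
except TypeError:
    accepted = False
```

A backport that uses hasattr(path, '__fspath__') would therefore accept objects that the real os.fspath() refuses.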


—Koos


-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-ideas] tweaking the file system path protocol

2017-05-26 Thread Koos Zevenhoven
On Wed, May 24, 2017 at 5:52 PM, Wolfgang Maier
 wrote:
> On 05/24/2017 02:41 AM, Steven D'Aprano wrote:
>>
>>
>> It would be annoying and inconsistent if int(x) avoided calling __int__
>> on int subclasses. But that's exactly what happens with fspath and str.
>> I see that as a bug, not a feature: I find it hard to believe that we
>> would design an interface for string-like objects (paths) and then
>> intentionally prohibit it from applying to strings.
>>
>> And if we did, surely it's a misfeature. Why *shouldn't* subclasses of
>> str get the same opportunity to customize the result of __fspath__ as
>> they get to customize their __repr__ and __str__?
>>
>> py> class MyStr(str):
>> ...     def __repr__(self):
>> ...         return 'repr'
>> ...     def __str__(self):
>> ...         return 'str'
>> ...
>> py> s = MyStr('abcdef')
>> py> repr(s)
>> 'repr'
>> py> str(s)
>> 'str'
>>
>
> This is almost exactly what I have been thinking (just that I couldn't have
> presented it so clearly)!

Unfortunately, this thinking is also very shallow compared to what
went into PEP519.

>
> Let's look at a potential usecase for this. Assume that in a package you want
> to handle several paths to different files and directories that are all
> located in a common package-specific parent directory. Then using the path
> protocol you could write this:
>
> import os
>
> class PackageBase (object):
>     basepath = '/home/.package'
>
> class PackagePath (str, PackageBase):
>     def __fspath__ (self):
>         return os.path.join(self.basepath, str(self))
>
> config_file = PackagePath('.config')
> log_file = PackagePath('events.log')
> data_dir = PackagePath('data')
>
> with open(log_file, 'w') as log:
>     log.write('package paths initialized.\n')
>

This is exactly the kind of code that causes the problems. It will do
the wrong thing when code like open(str(log_file), 'w') is used for
compatibility.

> Just that this wouldn't currently work because PackagePath inherits from
> str. Of course, there are other ways to achieve the above, but when you
> think about designing a Path-like object class str is just a pretty
> attractive base class to start from.

Isn't it great that it doesn't work, so it's not attractive anymore?

> Now let's look at compatibility of a class like PackagePath under this
> proposal:
>
> - if client code uses e.g. str(config_file) and proceeds to treat the
> resulting object as a path unexpected things will happen and, yes, that's
> bad. However, this is no different from any other Path-like object for which
> __str__ and __fspath__ don't define the same return value.
>

Yes, this is another way of shooting yourself in the foot. Luckily,
this one is probably less attractive.

> - if client code uses the PEP-recommended backwards-compatible way of
> dealing with paths,
>
> path.__fspath__() if hasattr(path, "__fspath__") else path
>
> things will just work. Interestingly, this would *currently* produce an
> unexpected result, namely that it would execute the __fspath__ method of the
> str-subclass
>

So people not testing for 3.6+ might think their code works while it
doesn't. Luckily people not testing with 3.6+ are perhaps unlikely to
try funny tricks with __fspath__.

> - if client code uses instances of PackagePath as paths directly then in
> Python3.6 and below that would lead to unintended outcome, while in
> Python3.7 things would work. This is *really* bad.
>
> But what it means is that, under the proposal, using a str or bytes subclass
> with an __fspath__ method defined makes your code backwards-incompatible and
> the solution would be not to use such a class if you want to be
> backwards-compatible (and that should get documented somewhere). This
> restriction, of course, limits the usefulness of the proposal in the near
> future, but that disadvantage will vanish over time. In 5 years, not
> supporting Python3.6 anymore maybe won't be a big deal anymore (for
> comparison, Python3.2 was released 6 years ago and since last year pip is
> no longer supporting it). As Steven pointed out the proposal is *very*
> unlikely to break existing code.
>
> So to summarize, the proposal
>
> - avoids an up-front isinstance check in the protocol and thereby speeds up
> the processing of exact strings and bytes and of anything that follows the
> path protocol.*

Speedup for things with __fspath__ is the only virtue of this
proposal, and it has not been shown that that speedup matters
anywhere.

> - slows down the processing of instances of regular str and bytes
> subclasses*
>
> - makes the "path.__fspath__() if hasattr(path, "__fspath__") else path"
> idiom consistent for subclasses of str and bytes that define __fspath__
>

One can discuss whether this is the best idiom to use (I did not write
it, so maybe someone else has comments).

Anyway, some may want to use

path.__fspath__() if hasattr(path, "__fspath__") else str(path)

and some may want

path if isinstance(path, (str, bytes)) else path.__fspath__()

Or others may not 

Re: [Python-ideas] tweaking the file system path protocol

2017-05-26 Thread Koos Zevenhoven
On Wed, May 24, 2017 at 3:41 AM, Steven D'Aprano wrote:
> On Wed, May 24, 2017 at 12:18:16AM +0300, Serhiy Storchaka wrote:
>> I don't know a reasonable use case for this feature. The __fspath__
>> method of str or bytes subclasses returning something not equivalent to
>> self looks confusing to me.
>
> I can imagine at least two:
>
> - emulating something like DOS 8.3 versus long file names;
> - case normalisation
>

These are not reasonable use cases because they should not subclass
str or bytes. That would be confusing.

> but what would make this really useful is for debugging. For instance, I
> have used something like this to debug problems with int() being called
> wrongly:
>
> py> class MyInt(int):
> ...     def __int__(self):
> ...         print("__int__ called")
> ...         return super().__int__()
> ...
> py> x = MyInt(23)
> py> int(x)
> __int__ called
> 23
>

You can monkeypatch the stdlib:

import os
from os import fspath as real_fspath
mystr = "23"

def fspath(path):
    if path is mystr:
        print("fspath was called on mystr")
    return real_fspath(path)

os.fspath = fspath

# try_something_with is a placeholder for the code being debugged
try_something_with(mystr)

Having __fspath__ on str and bytes by default would destroy the
ability to distinguish between PathLike and non-PathLike, because all
strings would appear to be PathLike. (Not to mention the important
compatibility issues between different Python versions and different
ways of dealing with pre-PEP519 path objects.)
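
The distinction can be checked with os.PathLike, which (as far as I know) treats any class defining __fspath__ as a virtual subclass. The class Custom below is hypothetical:

```python
import os
import pathlib


class Custom:
    """Hypothetical class that only implements the protocol method."""

    def __fspath__(self):
        return "/custom"


# Plain strings are accepted as paths but are not PathLike.
str_is_pathlike = isinstance("some/file.txt", os.PathLike)

# pathlib objects and any class defining __fspath__ are PathLike.
path_is_pathlike = isinstance(pathlib.PurePosixPath("some/file.txt"), os.PathLike)
custom_is_pathlike = isinstance(Custom(), os.PathLike)
```

If str itself grew an __fspath__ method, the first check would become True as well and the test would no longer separate the two groups.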

—Koos


-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-ideas] tweaking the file system path protocol

2017-05-24 Thread Wolfgang Maier

On 05/24/2017 02:41 AM, Steven D'Aprano wrote:
> On Wed, May 24, 2017 at 12:18:16AM +0300, Serhiy Storchaka wrote:
>
>> It seems to me that the purpose of this proposition is not performance,
>> but the possibility to use __fspath__ in str or bytes subclasses.
>> Currently defining __fspath__ in str or bytes subclasses doesn't have
>> any effect.
>
> That's how I interpreted the proposal, with any performance issue being
> secondary. (I don't expect that converting path-like objects to strings
> would be the bottleneck in any application doing actual disk IO.)
>
>> I don't know a reasonable use case for this feature. The __fspath__
>> method of str or bytes subclasses returning something not equivalent to
>> self looks confusing to me.
>
> I can imagine at least two:
>
> - emulating something like DOS 8.3 versus long file names;
> - case normalisation
>
> but what would make this really useful is for debugging. For instance, I
> have used something like this to debug problems with int() being called
> wrongly:
>
> py> class MyInt(int):
> ...     def __int__(self):
> ...         print("__int__ called")
> ...         return super().__int__()
> ...
> py> x = MyInt(23)
> py> int(x)
> __int__ called
> 23
>
> It would be annoying and inconsistent if int(x) avoided calling __int__
> on int subclasses. But that's exactly what happens with fspath and str.
> I see that as a bug, not a feature: I find it hard to believe that we
> would design an interface for string-like objects (paths) and then
> intentionally prohibit it from applying to strings.
>
> And if we did, surely it's a misfeature. Why *shouldn't* subclasses of
> str get the same opportunity to customize the result of __fspath__ as
> they get to customize their __repr__ and __str__?
>
> py> class MyStr(str):
> ...     def __repr__(self):
> ...         return 'repr'
> ...     def __str__(self):
> ...         return 'str'
> ...
> py> s = MyStr('abcdef')
> py> repr(s)
> 'repr'
> py> str(s)
> 'str'



This is almost exactly what I have been thinking (just that I couldn't 
have presented it so clearly)!


Let's look at a potential usecase for this. Assume that in a package you
want to handle several paths to different files and directories that are
all located in a common package-specific parent directory. Then using
the path protocol you could write this:


import os

class PackageBase (object):
    basepath = '/home/.package'

class PackagePath (str, PackageBase):
    def __fspath__ (self):
        return os.path.join(self.basepath, str(self))

config_file = PackagePath('.config')
log_file = PackagePath('events.log')
data_dir = PackagePath('data')

with open(log_file, 'w') as log:
    log.write('package paths initialized.\n')


Just that this wouldn't currently work because PackagePath inherits from 
str. Of course, there are other ways to achieve the above, but when you 
think about designing a Path-like object class str is just a pretty 
attractive base class to start from.


Now let's look at compatibility of a class like PackagePath under this
proposal:


- if client code uses e.g. str(config_file) and proceeds to treat the 
resulting object as a path unexpected things will happen and, yes, 
that's bad. However, this is no different from any other Path-like 
object for which __str__ and __fspath__ don't define the same return value.


- if client code uses the PEP-recommended backwards-compatible way of 
dealing with paths,


path.__fspath__() if hasattr(path, "__fspath__") else path

things will just work. Interestingly, this would *currently* produce an
unexpected result, namely that it would execute the __fspath__ method of
the str-subclass


- if client code uses instances of PackagePath as paths directly then in 
Python3.6 and below that would lead to unintended outcome, while in 
Python3.7 things would work. This is *really* bad.
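
The mismatch between the one-line idiom and the current os.fspath() can be sketched as follows; PrefixedPath and the wrapper function idiom are hypothetical names used only for illustration:

```python
import os


class PrefixedPath(str):
    """Hypothetical str subclass that prepends a base directory."""

    def __fspath__(self):
        return "/base/" + str(self)


def idiom(path):
    """The backwards-compatible one-liner, wrapped in a function."""
    return path.__fspath__() if hasattr(path, "__fspath__") else path


p = PrefixedPath("events.log")
via_idiom = idiom(p)          # calls the subclass method
via_os_fspath = os.fspath(p)  # returns the str subclass instance unchanged
```

The two results differ for the same object, which is the inconsistency the proposal aims to remove.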


But what it means is that, under the proposal, using a str or bytes 
subclass with an __fspath__ method defined makes your code 
backwards-incompatible and the solution would be not to use such a class 
if you want to be backwards-compatible (and that should get documented 
somewhere). This restriction, of course, limits the usefulness of the 
proposal in the near future, but that disadvantage will vanish over 
time. In 5 years, not supporting Python3.6 anymore maybe won't be a big 
deal anymore (for comparison, Python3.2 was released 6 years ago and
since last year pip is no longer supporting it). As Steven pointed out
the proposal is *very* unlikely to break existing code.


So to summarize, the proposal

- avoids an up-front isinstance check in the protocol and thereby speeds 
up the processing of exact strings and bytes and of anything that 
follows the path protocol.*


- slows down the processing of instances of regular str and bytes 
subclasses*


- makes the "path.__fspath__() if hasattr(path, "__fspath__") else path" 
idiom consistent for subclasses of str and bytes that define __fspath__


- opens up the opportunity to write str/bytes subclasses that represent 
a path other than just their self in the future*

Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Steven D'Aprano
On Wed, May 24, 2017 at 12:18:16AM +0300, Serhiy Storchaka wrote:
> 23.05.17 20:04, Brett Cannon пише:

> >What exactly is the performance issue you are having that is leading to 
> >this proposal?
> 
> It seems to me that the purpose of this proposition is not performance, 
> but the possibility to use __fspath__ in str or bytes subclasses. 
> Currently defining __fspath__ in str or bytes subclasses doesn't have 
> any effect.

That's how I interpreted the proposal, with any performance issue being 
secondary. (I don't expect that converting path-like objects to strings 
would be the bottleneck in any application doing actual disk IO.)

 
> I don't know a reasonable use case for this feature. The __fspath__ 
> method of str or bytes subclasses returning something not equivalent to 
> self looks confusing to me.

I can imagine at least two:

- emulating something like DOS 8.3 versus long file names;
- case normalisation

but what would make this really useful is for debugging. For instance, I 
have used something like this to debug problems with int() being called 
wrongly:

py> class MyInt(int):
...     def __int__(self):
...         print("__int__ called")
...         return super().__int__()
...
py> x = MyInt(23)
py> int(x)
__int__ called
23

It would be annoying and inconsistent if int(x) avoided calling __int__ 
on int subclasses. But that's exactly what happens with fspath and str. 
I see that as a bug, not a feature: I find it hard to believe that we 
would design an interface for string-like objects (paths) and then 
intentionally prohibit it from applying to strings.
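
A minimal sketch of the same debugging trick applied to paths, to show the contrast described above. TracedStr and the calls list are names invented for this illustration. With the current os.fspath the hook on a str subclass is never reached:

```python
import os

calls = []  # records every invocation of the hook


class TracedStr(str):
    """Illustrative str subclass that logs calls to __fspath__."""

    def __fspath__(self):
        calls.append("__fspath__")
        return str(self)


s = TracedStr("data.txt")
result = os.fspath(s)

# The str subclass is returned unchanged and the hook stays silent.
print(calls)
print(result is s)
```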

And if we did, surely it's a misfeature. Why *shouldn't* subclasses of 
str get the same opportunity to customize the result of __fspath__ as 
they get to customize their __repr__ and __str__?

py> class MyStr(str):
...     def __repr__(self):
...         return 'repr'
...     def __str__(self):
...         return 'str'
...
py> s = MyStr('abcdef')
py> repr(s)
'repr'
py> str(s)
'str'

I don't think that backwards compatibility is an issue here. Nobody will 
have had reason to write str subclasses with __fspath__ methods, so 
changing the behaviour to no longer ignore them shouldn't break any 
code. But of course, we should treat this as a new feature, and only 
change the behaviour in 3.7.



-- 
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread tritium-list


> -Original Message-
> From: Python-ideas [mailto:python-ideas-bounces+tritium-
> list=sdamon@python.org] On Behalf Of Koos Zevenhoven
> Sent: Tuesday, May 23, 2017 5:31 PM
> To: Serhiy Storchaka 
> Cc: python-ideas 
> Subject: Re: [Python-ideas] tweaking the file system path protocol
> 
> On Wed, May 24, 2017 at 12:18 AM, Serhiy Storchaka
>  wrote:
> > It seems to me that the purpose of this proposition is not performance,
but
> > the possibility to use __fspath__ in str or bytes subclasses. Currently
> > defining __fspath__ in str or bytes subclasses doesn't have any effect.
> >
> > I don't know a reasonable use case for this feature. The __fspath__
> method
> > of str or bytes subclasses returning something not equivalent to self
looks
> > confusing to me.
> 
> Yes, that would be another reason.
> 
> Only when Python drops support for strings as paths, can
> people start writing such subclasses. I'm sure many
> would now say dropping str/bytes path support won't even happen in
> Python 4.
> 
> -- Koos


It is highly unlikely that Python will ever drop str/bytes support for
dealing with filesystem paths; case in point, they just ADDED bytes support
back for Windows filesystem paths.



Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Koos Zevenhoven
On Wed, May 24, 2017 at 12:18 AM, Serhiy Storchaka  wrote:
> It seems to me that the purpose of this proposition is not performance, but
> the possibility to use __fspath__ in str or bytes subclasses. Currently
> defining __fspath__ in str or bytes subclasses doesn't have any effect.
>
> I don't know a reasonable use case for this feature. The __fspath__ method
> of str or bytes subclasses returning something not equivalent to self looks
> confusing to me.

Yes, that would be another reason.

Only when Python drops support for strings as paths, can
people start writing such subclasses. I'm sure many
would now say dropping str/bytes path support won't even happen in
Python 4.

-- Koos


-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Serhiy Storchaka

23.05.17 20:04, Brett Cannon пише:
On Tue, 23 May 2017 at 03:13 Wolfgang Maier 
> wrote:

My proposal is to change this to:
1) check whether the type of the argument is str or bytes *exactly*; if
so, return the argument unchanged
2) check whether __fspath__ can be called on the type and returns an
instance of str, bytes, or any subclass (just like in the current
version)
3) check whether the type is a subclass of str or bytes and, if so,
return it unchanged

What exactly is the performance issue you are having that is leading to 
this proposal?


It seems to me that the purpose of this proposition is not performance, 
but the possibility to use __fspath__ in str or bytes subclasses. 
Currently defining __fspath__ in str or bytes subclasses doesn't have 
any effect.


I don't know a reasonable use case for this feature. The __fspath__ 
method of str or bytes subclasses returning something not equivalent to 
self looks confusing to me.




Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Koos Zevenhoven
On Tue, May 23, 2017 at 7:53 PM, Wolfgang Maier
 wrote:
>
> Ah, sorry, I misunderstood what you were trying to say, but now I'm getting
> it! subclasses of str and bytes were of course usable as path arguments
> before simply because they were subclasses of them. Now they would be picked
> up based on their __fspath__ method, but old versions of Python executing
> code using them would still use them directly. Have to think about this one
> a bit, but thanks for pointing it out.
>

Yes, this is exactly what I meant. I noticed I had left out some of
the details of the reasoning, sorry. I tried to fix that in my
response to Steven.

— Koos


-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Guido van Rossum
I see no future for this proposal. Sorry Wolfgang! For future reference,
the proposal was especially weak because it gave no concrete examples of
code that was inconvenienced in any way by the current behavior. (And the
performance hack of checking for exact str/bytes can be made without
changing the semantics.)

On Tue, May 23, 2017 at 10:04 AM, Brett Cannon  wrote:

>
>
> On Tue, 23 May 2017 at 03:13 Wolfgang Maier  freiburg.de> wrote:
>
>> What do you think of this idea for a slight modification to os.fspath:
>> the current version checks whether its arg is an instance of str, bytes
>> or any subclass and, if so, returns the arg unchanged. In all other
>> cases it tries to call the type's __fspath__ method to see if it can get
>> str, bytes, or a subclass thereof this way.
>>
>> My proposal is to change this to:
>> 1) check whether the type of the argument is str or bytes *exactly*; if
>> so, return the argument unchanged
>> 2) check whether __fspath__ can be called on the type and returns an
>> instance of str, bytes, or any subclass (just like in the current version)
>> 3) check whether the type is a subclass of str or bytes and, if so,
>> return it unchanged
>>
>> This would have the following implications:
>> a) it would speed up the very common case when the arg is either a str
>> or a bytes instance exactly
>> b) user-defined classes that inherit from str or bytes could control
>> their path representation just like any other class
>> c) subclasses of str/bytes that don't define __fspath__ would still work
>> like they do now, but their processing would be slower
>> d) subclasses of str/bytes that accidentally define a __fspath__ method
>> would change their behavior
>>
>> I think cases c) and d) could be sufficiently rare that the pros
>> outweigh the cons?
>>
>
> What exactly is the performance issue you are having that is leading to
> this proposal? I ask because b) and d) change semantics and so it's not a
> small thing to make this change at this point since Python 3.6 has been
> released. So unless there's a major performance impact I'm reluctant to
> want to change it at this point.
>
>
>


-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Brett Cannon
On Tue, 23 May 2017 at 03:13 Wolfgang Maier <
wolfgang.ma...@biologie.uni-freiburg.de> wrote:

> What do you think of this idea for a slight modification to os.fspath:
> the current version checks whether its arg is an instance of str, bytes
> or any subclass and, if so, returns the arg unchanged. In all other
> cases it tries to call the type's __fspath__ method to see if it can get
> str, bytes, or a subclass thereof this way.
>
> My proposal is to change this to:
> 1) check whether the type of the argument is str or bytes *exactly*; if
> so, return the argument unchanged
> 2) check whether __fspath__ can be called on the type and returns an
> instance of str, bytes, or any subclass (just like in the current version)
> 3) check whether the type is a subclass of str or bytes and, if so,
> return it unchanged
>
> This would have the following implications:
> a) it would speed up the very common case when the arg is either a str
> or a bytes instance exactly
> b) user-defined classes that inherit from str or bytes could control
> their path representation just like any other class
> c) subclasses of str/bytes that don't define __fspath__ would still work
> like they do now, but their processing would be slower
> d) subclasses of str/bytes that accidentally define a __fspath__ method
> would change their behavior
>
> I think cases c) and d) could be sufficiently rare that the pros
> outweigh the cons?
>

What exactly is the performance issue you are having that is leading to
this proposal? I ask because b) and d) change semantics and so it's not a
small thing to make this change at this point since Python 3.6 has been
released. So unless there's a major performance impact I'm reluctant to
want to change it at this point.


Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Wolfgang Maier

On 05/23/2017 06:17 PM, Koos Zevenhoven wrote:

On Tue, May 23, 2017 at 1:12 PM, Wolfgang Maier
 wrote:

What do you think of this idea for a slight modification to os.fspath:
the current version checks whether its arg is an instance of str, bytes or
any subclass and, if so, returns the arg unchanged. In all other cases it
tries to call the type's __fspath__ method to see if it can get str, bytes,
or a subclass thereof this way.

My proposal is to change this to:
1) check whether the type of the argument is str or bytes *exactly*; if so,
return the argument unchanged
2) check whether __fspath__ can be called on the type and returns an instance
of str, bytes, or any subclass (just like in the current version)
3) check whether the type is a subclass of str or bytes and, if so, return
it unchanged




Hi Koos and thanks for your detailed response,


The reason why this was not done was that a str or bytes subclass that
implements __fspath__(self) would work in both pre-3.6 and 3.6+ but
behave differently. This would also be incompatible with existing
code using str(path) for compatibility with the stdlib (the old way,
which people still use for pre-3.6 compatibility even in new code).



I'm not sure that sounds very convincing because that exact problem 
exists, was discussed and accepted in your PEP 519 for all other 
classes. I do not really see why subclasses of str and bytes should 
require special backwards compatibility here. Is there a reason why you 
are thinking they should be treated specially?



This would have the following implications:
a) it would speed up the very common case when the arg is either a str or a
bytes instance exactly


To get the same performance benefit for str and bytes, but without
changing functionality, there could first be the exact type check and
then the isinstance check. This would add some performance penalty for
PathLike objects. Removing the isinstance part of the __fspath__()
return value, which I find less useful, would compensate for that. (3)
would not be necessary in this version.



Right, that was one thing I forgot to mention in my list. My proposal 
would also speed up processing of pathlike objects because it moves the 
__fspath__ call up in front of the isinstance check. Your alternative 
would speed up only str and bytes, but would slow down Path-like classes.
In addition, I'm not sure that removing the isinstance check on the 
return value of __fspath__() is a good idea because that would mean 
giving up the guarantee that os.fspath returns an instance of str or 
bytes and would effectively force library code to do the isinstance 
check anyway even if the function may have performed it already, which 
would worsen performance further.
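
The guarantee mentioned above can be observed directly. In this small sketch BadPath, GoodPath and Label are hypothetical classes invented for illustration: os.fspath rejects an __fspath__ result that is not str or bytes, and accepts a str subclass instance as the result, so callers need no isinstance check of their own.

```python
import os


class BadPath:
    """Illustrative class whose __fspath__ returns the wrong type."""

    def __fspath__(self):
        return 42


class Label(str):
    """Plain str subclass used as a return value."""


class GoodPath:
    def __fspath__(self):
        return Label("dir/file.txt")


try:
    os.fspath(BadPath())
except TypeError as exc:
    message = str(exc)  # os.fspath validated the return value for us
else:
    message = None

good = os.fspath(GoodPath())

print(message)
print(type(good).__name__, good)
```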



Are you asking for other reasons, or because you actually have a use
case where this matters? If this performance really matters somewhere,
the version I describe above could be considered. It would have 100%
backwards compatibility, or a little less (99% ?) if the isinstance
check of the __fspath__() return value is removed for performance
compensation.



That use case question is somewhat difficult to answer. I had this idea 
when working on two bug tracker issues (one concerning fnmatch and a 
follow-up one on os.path.normcase, which is called by fnmatch.filter 
and, in turn, calls os.fspath). fnmatch.filter is a case where performance 
matters and the decision when and where to call the rather expensive 
os.path.normcase->os.fspath there is not entirely straightforward. So, 
yes, I was basically looking at this because of a potential use case, 
but I say potential because I'm far from sure that any speed gain in 
os.fspath will be big enough to be useful for fnmatch.filter in the end.




b) user-defined classes that inherit from str or bytes could control their
path representation just like any other class


Again, this would cause differences in behavior between different
Python versions, and based on whether str(path) is used or not.

—Koos





Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Wolfgang Maier

On 05/23/2017 06:41 PM, Wolfgang Maier wrote:

On 05/23/2017 06:17 PM, Koos Zevenhoven wrote:

On Tue, May 23, 2017 at 1:12 PM, Wolfgang Maier
 wrote:

What do you think of this idea for a slight modification to os.fspath:
the current version checks whether its arg is an instance of str, 
bytes or
any subclass and, if so, returns the arg unchanged. In all other 
cases it
tries to call the type's __fspath__ method to see if it can get str, 
bytes,

or a subclass thereof this way.

My proposal is to change this to:
1) check whether the type of the argument is str or bytes *exactly*; 
if so,

return the argument unchanged
2) check whether __fspath__ can be called on the type and returns an 
instance

of str, bytes, or any subclass (just like in the current version)
3) check whether the type is a subclass of str or bytes and, if so, 
return

it unchanged




Hi Koos and thanks for your detailed response,


The reason why this was not done was that a str or bytes subclass that
implements __fspath__(self) would work in both pre-3.6 and 3.6+ but
behave differently. This would also be incompatible with existing
code using str(path) for compatibility with the stdlib (the old way,
which people still use for pre-3.6 compatibility even in new code).



I'm not sure that sounds very convincing because that exact problem 
exists, was discussed and accepted in your PEP 519 for all other 
classes. I do not really see why subclasses of str and bytes should 
require special backwards compatibility here. Is there a reason why you 
are thinking they should be treated specially?




Ah, sorry, I misunderstood what you were trying to say, but now I'm 
getting it! subclasses of str and bytes were of course usable as path 
arguments before simply because they were subclasses of them. Now they 
would be picked up based on their __fspath__ method, but old versions of 
Python executing code using them would still use them directly. Have to 
think about this one a bit, but thanks for pointing it out.



This would have the following implications:
a) it would speed up the very common case when the arg is either a 
str or a

bytes instance exactly


To get the same performance benefit for str and bytes, but without
changing functionality, there could first be the exact type check and
then the isinstance check. This would add some performance penalty for
PathLike objects. Removing the isinstance part of the __fspath__()
return value, which I find less useful, would compensate for that. (3)
would not be necessary in this version.



Right, that was one thing I forgot to mention in my list. My proposal 
would also speed up processing of pathlike objects because it moves the 
__fspath__ call up in front of the isinstance check. Your alternative 
would speed up only str and bytes, but would slow down Path-like classes.
In addition, I'm not sure that removing the isinstance check on the 
return value of __fspath__() is a good idea because that would mean 
giving up the guarantee that os.fspath returns an instance of str or 
bytes and would effectively force library code to do the isinstance 
check anyway even if the function may have performed it already, which 
would worsen performance further.



Are you asking for other reasons, or because you actually have a use
case where this matters? If this performance really matters somewhere,
the version I describe above could be considered. It would have 100%
backwards compatibility, or a little less (99% ?) if the isinstance
check of the __fspath__() return value is removed for performance
compensation.



That use case question is somewhat difficult to answer. I had this idea 
when working on two bug tracker issues (one concerning fnmatch and a 
follow-up one on os.path.normcase, which is called by fnmatch.filter 
and, in turn, calls os.fspath). fnmatch.filter is a case where performance 
matters and the decision when and where to call the rather expensive 
os.path.normcase->os.fspath there is not entirely straightforward. So, 
yes, I was basically looking at this because of a potential use case, 
but I say potential because I'm far from sure that any speed gain in 
os.fspath will be big enough to be useful for fnmatch.filter in the end.



b) user-defined classes that inherit from str or bytes could control 
their

path representation just like any other class


Again, this would cause differences in behavior between different
Python versions, and based on whether str(path) is used or not.

—Koos





Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Koos Zevenhoven
On Tue, May 23, 2017 at 1:49 PM, Steven D'Aprano  wrote:
>
> How about simplifying the implementation of fspath by giving str and
> bytes a __fspath__ method that returns str(self) or bytes(self)?
>

The compatibility issue I mention in the other email I just sent as a
response to the OP will appear if a subclass returns something other
than str(self) or bytes(self).

—Koos

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Koos Zevenhoven
On Tue, May 23, 2017 at 1:12 PM, Wolfgang Maier
 wrote:
> What do you think of this idea for a slight modification to os.fspath:
> the current version checks whether its arg is an instance of str, bytes or
> any subclass and, if so, returns the arg unchanged. In all other cases it
> tries to call the type's __fspath__ method to see if it can get str, bytes,
> or a subclass thereof this way.
>
> My proposal is to change this to:
> 1) check whether the type of the argument is str or bytes *exactly*; if so,
> return the argument unchanged
> 2) check whether __fspath__ can be called on the type and returns an instance
> of str, bytes, or any subclass (just like in the current version)
> 3) check whether the type is a subclass of str or bytes and, if so, return
> it unchanged

The reason why this was not done was that a str or bytes subclass that
implements __fspath__(self) would work in both pre-3.6 and 3.6+ but
behave differently. This would also be incompatible with existing
code using str(path) for compatibility with the stdlib (the old way,
which people still use for pre-3.6 compatibility even in new code).

> This would have the following implications:
> a) it would speed up the very common case when the arg is either a str or a
> bytes instance exactly

To get the same performance benefit for str and bytes, but without
changing functionality, there could first be the exact type check and
then the isinstance check. This would add some performance penalty for
PathLike objects. Removing the isinstance part of the __fspath__()
return value, which I find less useful, would compensate for that. (3)
would not be necessary in this version.

Are you asking for other reasons, or because you actually have a use
case where this matters? If this performance really matters somewhere,
the version I describe above could be considered. It would have 100%
backwards compatibility, or a little less (99% ?) if the isinstance
check of the __fspath__() return value is removed for performance
compensation.
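
As a rough sketch of the alternative described above (exact type check first, then the isinstance check, then __fspath__), written in the style of the pure Python os._fspath. The function name fspath_exact_first and the Shouty class are made up for this illustration, and the sketch keeps the return-value check rather than dropping it:

```python
import pathlib


def fspath_exact_first(path):
    path_type = type(path)
    # Fast path for the very common case of an exact str or bytes.
    if path_type is str or path_type is bytes:
        return path
    # Unchanged semantics: subclasses are returned as-is and their
    # __fspath__ (if any) is ignored.
    if isinstance(path, (str, bytes)):
        return path
    try:
        path_repr = path_type.__fspath__(path)
    except AttributeError:
        if hasattr(path_type, "__fspath__"):
            raise
        raise TypeError("expected str, bytes or os.PathLike object, "
                        "not " + path_type.__name__) from None
    if isinstance(path_repr, (str, bytes)):
        return path_repr
    raise TypeError("expected {}.__fspath__() to return str or bytes, "
                    "not {}".format(path_type.__name__,
                                    type(path_repr).__name__))


class Shouty(str):
    """Illustrative str subclass; its __fspath__ is ignored here."""

    def __fspath__(self):
        return self.upper()


print(fspath_exact_first("a.txt"))
print(fspath_exact_first(Shouty("a.txt")))
print(fspath_exact_first(pathlib.PurePosixPath("dir/a.txt")))
```

Since the subclass branch still comes before the __fspath__ lookup, behaviour is identical to the current protocol; only the order of the checks for exact str/bytes changes.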

> b) user-defined classes that inherit from str or bytes could control their
> path representation just like any other class

Again, this would cause differences in behavior between different
Python versions, and based on whether str(path) is used or not.

—Koos


-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Steven D'Aprano
On Tue, May 23, 2017 at 12:12:11PM +0200, Wolfgang Maier wrote:

> Here's how the proposal could be implemented in the pure Python version 
> (os._fspath):
> 
> def _fspath(path):
>     path_type = type(path)
>     if path_type is str or path_type is bytes:
>         return path

How about simplifying the implementation of fspath by giving str and 
bytes a __fspath__ method that returns str(self) or bytes(self)?

class str:
    def __fspath__(self):
        return str(self)  # Must be str, not type(self).


(1) We can avoid most of the expensive type checks.

(2) Subclasses of str and bytes don't have to do anything to get a 
useful default behaviour.


def fspath(path):
    try:
        dunder = type(path).__fspath__
    except AttributeError:
        raise TypeError(...) from None
    else:
        if dunder is not None:
            result = dunder(path)
            if type(result) in (str, bytes):
                return result
        raise TypeError('expected a str or bytes, got ...')


The reason for the not None check is to allow subclasses to explicitly 
deny that they can be used for paths by setting __fspath__ to None in 
the subclass.
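
Since the built-in str and bytes types cannot be given new methods from Python code, the idea above can only be emulated. The following is a sketch under that assumption: _builtin_fspath stands in for the hypothetical str.__fspath__ and bytes.__fspath__, and fspath_sketch, Plain and NotAPath are names invented for the illustration.

```python
_MISSING = object()


def _builtin_fspath(obj):
    # Stand-in for the hypothetical str.__fspath__ / bytes.__fspath__:
    # always return an exact str or bytes, not type(obj).
    return str(obj) if isinstance(obj, str) else bytes(obj)


def fspath_sketch(path):
    path_type = type(path)
    dunder = getattr(path_type, "__fspath__", _MISSING)
    if dunder is _MISSING:
        if issubclass(path_type, (str, bytes)):
            dunder = _builtin_fspath  # emulate the inherited default
        else:
            raise TypeError("expected str, bytes or os.PathLike object, "
                            "not " + path_type.__name__)
    if dunder is None:
        # Explicit opt-out by setting __fspath__ = None.
        raise TypeError(path_type.__name__ + " cannot be used as a path")
    result = dunder(path)
    if type(result) in (str, bytes):
        return result
    raise TypeError("expected a str or bytes, got "
                    + type(result).__name__)


class Plain(str):
    """Subclass that does nothing special and gets the default."""


class NotAPath(str):
    """Subclass that denies being usable as a path."""
    __fspath__ = None


print(type(fspath_sketch(Plain("a.txt"))).__name__)
try:
    fspath_sketch(NotAPath("secret"))
except TypeError as exc:
    print("rejected:", exc)
```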



-- 
Steve


[Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Wolfgang Maier

What do you think of this idea for a slight modification to os.fspath:
the current version checks whether its arg is an instance of str, bytes 
or any subclass and, if so, returns the arg unchanged. In all other 
cases it tries to call the type's __fspath__ method to see if it can get 
str, bytes, or a subclass thereof this way.


My proposal is to change this to:
1) check whether the type of the argument is str or bytes *exactly*; if 
so, return the argument unchanged
2) check whether __fspath__ can be called on the type and returns an 
instance of str, bytes, or any subclass (just like in the current version)
3) check whether the type is a subclass of str or bytes and, if so, 
return it unchanged


This would have the following implications:
a) it would speed up the very common case when the arg is either a str 
or a bytes instance exactly
b) user-defined classes that inherit from str or bytes could control 
their path representation just like any other class
c) subclasses of str/bytes that don't define __fspath__ would still work 
like they do now, but their processing would be slower
d) subclasses of str/bytes that accidentally define a __fspath__ method 
would change their behavior


I think cases c) and d) could be sufficiently rare that the pros 
outweigh the cons?
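
For anyone who wants to check points a) and c) on their own machine, here is a rough micro-benchmark sketch. It only compares the two kinds of type check in pure Python, so it says nothing definite about the C implementation of os.fspath. The helper names and the MyStr class are assumptions made for this illustration, and no particular timings are claimed.

```python
import timeit


class MyStr(str):
    """Plain str subclass without __fspath__."""


def exact_check(p):
    # Step 1 of the proposal: exact type comparison.
    t = type(p)
    return t is str or t is bytes


def isinstance_check(p):
    # The check the current version performs up front.
    return isinstance(p, (str, bytes))


samples = {"exact str": "a.txt", "str subclass": MyStr("a.txt")}
timings = {}
for label, value in samples.items():
    for fn in (exact_check, isinstance_check):
        timings[(label, fn.__name__)] = timeit.timeit(
            lambda: fn(value), number=200000)

for key, seconds in timings.items():
    print(key, round(seconds, 4))
```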



Here's how the proposal could be implemented in the pure Python version 
(os._fspath):


def _fspath(path):
    path_type = type(path)
    if path_type is str or path_type is bytes:
        return path

    # Work from the object's type to match method resolution of other magic
    # methods.
    try:
        path_repr = path_type.__fspath__(path)
    except AttributeError:
        if hasattr(path_type, '__fspath__'):
            raise
        elif issubclass(path_type, (str, bytes)):
            return path
        else:
            raise TypeError("expected str, bytes or os.PathLike object, "
                            "not " + path_type.__name__)
    if isinstance(path_repr, (str, bytes)):
        return path_repr
    else:
        raise TypeError("expected {}.__fspath__() to return str or bytes, "
                        "not {}".format(path_type.__name__,
                                        type(path_repr).__name__))
