Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
Nick Coghlan writes:
> On 15 April 2016 at 00:52, Stephen J. Turnbull wrote:
> > Nick Coghlan writes:
> >
> > > The use case for returning bytes from __fspath__ is DirEntry, so you
> > > can write things like this in low level code:
> > >
> > > def myscandir(dirpath):
> > >     for entry in os.scandir(dirpath):
> > >         if entry.is_file():
> > >             with open(entry) as f:
> > >                 # do something
> >
> > Excuse me, but that is *not* a use case for returning bytes from
> > DirEntry.__fspath__. open() is perfectly happy taking str (including
> > surrogate-encoded rawbytes).
>
> That results in a different type for the file object's name:
>
> >>> open("README.md").name
> 'README.md'
> >>> open(b"README.md").name
> b'README.md'
OK, you win, __fspath__ needs to be polymorphic.
But you've just shifted me to -1 on "os.fspath": it's an attractive
nuisance. EIBTI, applications and high-level library functions should
use os.fsdecode or os.fsencode. Functions that take a polymorphic
argument and want to preserve type should invoke __fspath__ on the
argument. That will visually signal that the caller is not merely
low-level, but is explicitly a boundary function. (You could rename
the generic function as "os._fspath", I guess, but I *really* want to
deprecate calling the polymorphic version in user code. _fspath can
be added if experience shows that polymorphic usage is very desirable
outside the stdlib. This remark is in my not-so-Dutch opinion, of
course.)
> The guarantee we want to provide those folks is that if they're
> operating in the binary domain they'll stay there.
Et tu, Nick? "Guarantee"?! You can't guarantee any such thing with
an implicitly invoked polymorphic API like this one -- unless you
consider a crashed program to be in the binary domain. ;-) Note that
the current proposals don't even do that for the binary domain, only
for the text domain!
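The type difference discussed above can be reproduced with a small sketch. It assumes only the standard library; the temporary directory and the variable names are illustrative, not part of the thread. It shows that the file object's name keeps the type of the path passed to open(), and that os.fsdecode is one explicit way to move back to the text domain.

```python
import os
import tempfile

# Work in a throwaway directory so the example does not depend on the
# current working directory containing a README.md.
with tempfile.TemporaryDirectory() as tmpdir:
    text_path = os.path.join(tmpdir, "README.md")
    with open(text_path, "w") as f:
        f.write("hello\n")

    # The same file, addressed in the binary domain.
    bytes_path = os.fsencode(text_path)

    with open(text_path) as f:
        name_from_str = f.name      # str path in, str name out
    with open(bytes_path) as f:
        name_from_bytes = f.name    # bytes path in, bytes name out

    # Explicit (EIBTI-style) conversion back to the text domain.
    normalized = os.fsdecode(name_from_bytes)

print(type(name_from_str).__name__, type(name_from_bytes).__name__)
```

Code that only ever wants one domain can apply os.fsdecode or os.fsencode once at its boundary and use a single type from there on.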
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 16 April 2016 at 12:21, Stephen J. Turnbull wrote:
> OK, you win, __fspath__ needs to be polymorphic.
>
> But you've just shifted me to -1 on "os.fspath": it's an attractive
> nuisance. EIBTI, applications and high-level library functions should
> use os.fsdecode or os.fsencode.

I presume your expectation is that os.fsencode/os.fsdecode will work
with objects supporting the __fspath__ protocol?

So the question for me is, if I'm writing a function that takes a path
argument p (in the most general sense - I want my function to be able
to handle anything the stdlib functions can) then how do I write the
code? There are 4 cases I can think of:

1. I just want to pass the argument on to other functions - just do
   so, stdlib functions will work fine.
2. I need a string - use os.fsdecode(p)
3. I need bytes - use os.fsencode(p)
4. I need a guaranteed pathlib.Path object so that I can use Path
   methods - convert via pathlib.Path(os.fsdecode(p))

I guess there's the possibility that you want to deliberately reject
bytes-like paths, and it's not immediately obvious how you'd do that
without os.fspath or using the __fspath__ protocol directly, but I'm
not sure what anyone gains by doing so (maybe the chance to fail
early? but doesn't using fsdecode mean I never need to fail at all?)

While I don't have any specific reason to object to os.fspath, I'd
appreciate someone describing a concrete use case that needs it (and
isn't covered by any of the options above).

Paul
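Cases 2 to 4 above can be sketched as three small helpers. This assumes Python 3.6 or later, where os.fsdecode and os.fsencode accept objects implementing __fspath__ (the behaviour presumed in the message, which was still a proposal when it was written). The helper names as_text, as_bytes and as_path are hypothetical.

```python
import os
import pathlib


def as_text(p):
    """Case 2: return the path as str, whatever form it arrived in."""
    return os.fsdecode(p)


def as_bytes(p):
    """Case 3: return the path as bytes, whatever form it arrived in."""
    return os.fsencode(p)


def as_path(p):
    """Case 4: return a pathlib.Path so that Path methods are available."""
    return pathlib.Path(os.fsdecode(p))


# The same logical path in three input forms.
inputs = ["data/file.txt", b"data/file.txt", pathlib.PurePosixPath("data/file.txt")]
for p in inputs:
    print(repr(as_text(p)), repr(as_bytes(p)), repr(as_path(p)))
```

Case 1 needs no helper: the argument is handed to the stdlib function unchanged.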
Re: [Python-Dev] PEP 8 updated on whether to break before or after a binary update
Hi,

On 04/15/2016 07:43 PM, Guido van Rossum wrote:
> The update is already serving its real purpose: showing that style is
> debatable and cannot always easily be reduced to fixed rules.
>

As you said, there will always be some personal preference or taste in
style, and one can see from the debate that the current rules are
context dependent.

But I wonder how far that style context/rule (function)
evaluation/application issue could be solved in a machine learning
context.

Regards,
francis
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
Paul Moore writes:
> On 16 April 2016 at 12:21, Stephen J. Turnbull wrote:
> > OK, you win, __fspath__ needs to be polymorphic.
> >
> > But you've just shifted me to -1 on "os.fspath": it's an attractive
> > nuisance. EIBTI, applications and high-level library functions should
> > use os.fsdecode or os.fsencode.
>
> I presume your expectation is that os.fsencode/os.fsdecode will work
> with objects supporting the __fspath__ protocol?

Yes, I've suggested that before, and I think it's TOOWTDI, rather than
insisting on an os.fspath intervening, even if os.fspath is included
after all.

> So the question for me is, if I'm writing a function that takes a path
> argument p:
> 1. I just want to pass the argument on to other functions - just do
> so, stdlib functions will work fine.

I think this is a bad idea unless you *need* polymorphism, but OK,
it's "consenting adults".

> 2. I need a string - use os.fsdecode(p)
> 3. I need bytes - use os.fsencode(p)
> 4. I need a guaranteed pathlib.Path object so that I can use Path
> methods - convert via pathlib.Path(os.fsdecode(p))

LGTM. Applications or user toolkits could provide a derived
IFeelLuckyPath(Path) for symmetry with the os functions.

> I guess there's the possibility that you want to deliberately reject
> bytes-like paths,

I wouldn't put it that way. I think more likely is the possibility
that you want to restrict yourself to a particular type, as all your
code is written in terms of that type and expects that type. Note
that Nick's example shows that in both the bytes domain and the text
domain you can easily end up with a filelike.name of the wrong type.

> and it's not immediately obvious how you'd do that without
> os.fspath or using the __fspath__ protocol directly, but I'm not
> sure what anyone gains by doing so (maybe the chance to fail early?
> but doesn't using fsdecode mean I never need to fail at all?)

Well, wouldn't you like to raise there if your dataflow spec says only
one type should ever be observed?

The reasons that I wouldn't bother are that (1) I suspect it's going
to be very rare to see bytes in a text application, and (2) in
bytes-oriented code I would be fairly likely to either specify
literals as str (a bug, but nobody would ever notice) or import them
from an .ini or other text source (which might very well be in a
non-filesystem encoding in my environment!). In either case it's
probably the filename I want but specified in the wrong form.
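For code that does want to raise when the wrong type shows up, a minimal sketch follows. It assumes the os.fspath that later shipped in Python 3.6, which returns either str or bytes; its exact behaviour was still under debate in this thread. The helper name require_text_path is hypothetical.

```python
import os
import pathlib


def require_text_path(p):
    """Return p as a str path, raising TypeError for bytes-based paths."""
    raw = os.fspath(p)  # str or bytes; TypeError for objects that are not paths
    if not isinstance(raw, str):
        raise TypeError(
            "expected a text path, got {}".format(type(raw).__name__)
        )
    return raw


print(require_text_path(pathlib.PurePosixPath("spam/eggs.txt")))

try:
    require_text_path(b"spam/eggs.txt")
except TypeError as exc:
    print("rejected:", exc)
```

This fails early at the boundary instead of converting silently, which is the trade-off against simply calling os.fsdecode.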
Re: [Python-Dev] PEP 8 updated on whether to break before or after a binary update
Victor Stinner writes:
> Hum.
>
> if (width == 0
>     and height == 0
>     and color == 'red'
>     and emphasis == 'strong'
>     or highlight > 100):
>     raise ValueError("sorry, you lose")
>
> Please remove one space to vertically align "and" operators with the
> opening parenthesis:
>
> if (width == 0
>    and height == 0
>    and color == 'red'
>    and emphasis == 'strong'
>    or highlight > 100):
>     raise ValueError("sorry, you lose")
The RightThang[tm] is to remove "if" and replace it with the Japanese
"moshi":
moshi (width == 0
       and height == 0
       and color == 'red'
       and emphasis == 'strong'
       or highlight > 100):
    raise ValueError("sorry, you lose")
It-works-for-me-ly y'rs,
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 16 April 2016 at 14:46, Stephen J. Turnbull wrote:
> Paul Moore writes:
[...]
> > 1. I just want to pass the argument on to other functions - just do
> > so, stdlib functions will work fine.
>
> I think this is a bad idea unless you *need* polymorphism, but OK,
> it's "consenting adults".

All I'm really saying here is that if you don't need to care about
type checking (and 99% of Python programs rely on duck typing, so this
is pretty much the norm) then everything will be OK. I'm not
suggesting encouraging polymorphism, just pointing out that most code
should simply work and this whole debate is a non-issue for code like
that. (That's the whole point of getting the stdlib functions to
accept Path objects, after all :-))

> > 2. I need a string - use os.fsdecode(p)
> > 3. I need bytes - use os.fsencode(p)
> > 4. I need a guaranteed pathlib.Path object so that I can use Path
> > methods - convert via pathlib.Path(os.fsdecode(p))
>
> LGTM. Applications or user toolkits could provide a derived
> IFeelLuckyPath(Path) for symmetry with the os functions.
>
> > I guess there's the possibility that you want to deliberately reject
> > bytes-like paths,
>
> I wouldn't put it that way. I think more likely is the possibility
> that you want to restrict yourself to a particular type, as all your
> code is written in terms of that type and expects that type. Note
> that Nick's example shows that in both the bytes domain and the text
> domain you can easily end up with a filelike.name of the wrong type.

But within your own code, you do that by convention and good coding
practices, not by explicit type checks (except in boundary code). If
you're writing a library to be used by others, you should be as
permissive as possible - you may not expect your code to be called
with bytes-like paths, but why go out of your way to reject them?
That's not Pythonic, IMO. (On the other hand, documenting that only
text-like path objects are supported by your library is fine).

In my experience, bytes/text safety is about being aware of where the
two different types appear in your program, not about forcing only one
type. So my cases are about keeping the types clear - the output of
(1) is "same as input", of (2) is "string", of (3) is "bytes" and of
(4) is "Path". Call me with whatever you like, I can work with it in
terms I need.

But we're mostly just debating coding style here, I think we agree on
the basic principle.

> > and it's not immediately obvious how you'd do that without
> > os.fspath or using the __fspath__ protocol directly, but I'm not
> > sure what anyone gains by doing so (maybe the chance to fail early?
> > but doesn't using fsdecode mean I never need to fail at all?)
>
> Well, wouldn't you like to raise there if your dataflow spec says only
> one type should ever be observed?

Meh. Maybe asserts, maybe unit tests. But typechecks throughout my
code sounds more like strong typing than Python. But as I say, coding
style - I write scripts, glue code, and general-use libraries. None of
these lend themselves to that sort of rigorous dataflow analysis (this
is the same reason I have little personal use for the new typechecking
stuff).

> The reasons that I wouldn't bother are that (1) I suspect it's going
> to be very rare to see bytes in a text application, and (2) in
> bytes-oriented code I would be fairly likely to either specify
> literals as str (a bug, but nobody would ever notice) or import them
> from an .ini or other text source (which might very well be in a
> non-filesystem encoding in my environment!). In either case it's
> probably the filename I want but specified in the wrong form.

Also, that feels very much like the sort of boundary code that needs
to do the fiddly rigorous stuff so the rest of us don't have to :-)

Paul
Re: [Python-Dev] pathlib - current status of discussions
> On Apr 13, 2016, at 8:31 PM, Nick Coghlan wrote:
>
>>> class Special(bytes):
>>>     def __fspath__(self):
>>>         return 'str-val'
>>> obj = Special('bytes-val', 'utf8')
>>> path_obj = fspath(obj, allow_bytes=True)
>>>
>>> With #2, path_obj == 'bytes-val'. With #3, path_obj == 'str-val'.
>
> In this kind of case, inheritance tends to trump protocol.
Sure, but...
> example, int subclasses can't override operator.index:
...
> The reasons for that behaviour are more pragmatic than philosophical:
> builtins and their subclasses are extensively special-cased for speed
> reasons,
OK, but in this case, purity can beat practicality. If the author
writes an __fspath__ method, presumably it's because it should be
used.
And I can certainly imagine one might want to store a path
representation as bytes, but NOT want the raw bytes passed off to file
handling libs.
(of course you could use composition rather than subclassing if you had to)
-CHB
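The "inheritance trumps protocol" behaviour can be observed directly on CPython. This sketch assumes the os.fspath that later shipped in Python 3.6, which has no allow_bytes parameter (that parameter was only a proposal in this thread). The class Loud is a hypothetical name introduced for the operator.index comparison.

```python
import operator
import os


class Special(bytes):
    def __fspath__(self):
        return 'str-val'


obj = Special('bytes-val', 'utf8')

# os.fspath returns str and bytes instances (including subclasses) as-is,
# so the __fspath__ override on a bytes subclass is never consulted.
path_obj = os.fspath(obj)


class Loud(int):
    def __index__(self):
        return 42


# Same pattern for int subclasses: CPython uses the int value directly
# and does not call the overridden __index__.
idx = operator.index(Loud(1))

print(path_obj, idx)
```

In both cases the value inherited from the builtin base type wins over the protocol method defined on the subclass.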
Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units
The outstanding bug with this patch right now is a regression in line
numbers causing the test for http://bugs.python.org/issue9936 to fail.
I've tried to debug it without success.
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 16 April 2016 at 21:21, Stephen J. Turnbull wrote:
> Nick Coghlan writes:
>
> > On 15 April 2016 at 00:52, Stephen J. Turnbull wrote:
> > > Nick Coghlan writes:
> > >
> > > > The use case for returning bytes from __fspath__ is DirEntry, so you
> > > > can write things like this in low level code:
> > > >
> > > > def myscandir(dirpath):
> > > >     for entry in os.scandir(dirpath):
> > > >         if entry.is_file():
> > > >             with open(entry) as f:
> > > >                 # do something
> > >
> > > Excuse me, but that is *not* a use case for returning bytes from
> > > DirEntry.__fspath__. open() is perfectly happy taking str (including
> > > surrogate-encoded rawbytes).
> >
> > That results in a different type for the file object's name:
> >
> > >>> open("README.md").name
> > 'README.md'
> > >>> open(b"README.md").name
> > b'README.md'
>
> OK, you win, __fspath__ needs to be polymorphic.
>
> But you've just shifted me to -1 on "os.fspath": it's an attractive
> nuisance.
>
> EIBTI, applications and high-level library functions should
> use os.fsdecode or os.fsencode. Functions that take a polymorphic
> argument and want to preserve type should invoke __fspath__ on the
> argument. That will visually signal that the caller is not merely
> low-level, but is explicitly a boundary function.
str and bytes aren't going to implement __fspath__ (since they're only
*sometimes* path objects), so asking people to call the protocol
method directly for any purpose would be a pain.
> (You could rename
> the generic function as "os._fspath", I guess, but I *really* want to
> deprecate calling the polymorphic version in user code. _fspath can
> be added if experience shows that polymorphic usage is very desirable
> outside the stdlib. This remark is in my not-so-Dutch opinion, of
> course.)
You may have missed my email where I agreed os.fspath() itself needs
to ensure the output is a str object and throw an exception otherwise.
The remaining API design debate relates to whether the polymorphic
version should be "os.fspath(obj, allow_bytes=True)" or
"os._raw_fspath(obj)" (with Ethan favouring the former, and me the
latter).
>
> > The guarantee we want to provide those folks is that if they're
> > operating in the binary domain they'll stay there.
>
> Et tu, Nick? "Guarantee"?! You can't guarantee any such thing with
> an implicitly invoked polymorphic API like this one -- unless you
> consider a crashed program to be in the binary domain. ;-)
I do, as one of the core changes in design philosophy between Python 2
and 3 is attempting to remove the implicit level shifting between the
binary and text domains, and instead throw exceptions in those cases.
Pragmatism requires us to keep some of them (e.g. the codecs module is
officially object<->object in both Python 2 and Python 3, and string
formatting codes can still do unexpected things), but a great many of
them are already gone, and we don't want to add any new ones if
alternative designs are available.
> Note that
> the current proposals don't even do that for the binary domain, only
> for the text domain!
Folks that want to ensure they're working in the binary domain can
already do "memoryview(obj)" to ensure they have a bytes-like object
without constraining it to a specific type.
Cheers,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
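The memoryview(obj) check mentioned above can be sketched as follows. The helper name ensure_binary is hypothetical. It relies only on the fact that memoryview accepts any object supporting the buffer protocol and raises TypeError for str.

```python
def ensure_binary(obj):
    """Return a memoryview of obj, or raise TypeError if obj is not bytes-like."""
    return memoryview(obj)


# bytes and bytearray are both accepted without forcing a specific type.
views = [ensure_binary(b"README.md"), ensure_binary(bytearray(b"README.md"))]

# Text is rejected with an exception instead of being converted implicitly.
try:
    ensure_binary("README.md")
except TypeError:
    rejected_text = True
else:
    rejected_text = False

print([bytes(v) for v in views], rejected_text)
```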
Re: [Python-Dev] pathlib - current status of discussions
On 17 April 2016 at 04:47, Chris Barker - NOAA Federal
wrote:
>> On Apr 13, 2016, at 8:31 PM, Nick Coghlan wrote:
>>
>>>> class Special(bytes):
>>>>     def __fspath__(self):
>>>>         return 'str-val'
>>>> obj = Special('bytes-val', 'utf8')
>>>> path_obj = fspath(obj, allow_bytes=True)
>>>>
>>>> With #2, path_obj == 'bytes-val'. With #3, path_obj == 'str-val'.
>>
>> In this kind of case, inheritance tends to trump protocol.
>
> Sure, but...
>
>> example, int subclasses can't override operator.index:
> ...
>> The reasons for that behaviour are more pragmatic than philosophical:
>> builtins and their subclasses are extensively special-cased for speed
>> reasons,
>
> OK, but in this case, purity can beat practicality. If the author
> writes an __fspath__ method, presumably it's because it should be
> used.
>
> And I can certainly imagine one might want to store a path
> representation as bytes, but NOT want the raw bytes passed off to file
> handling libs.
>
> (of course you could use composition rather than subclassing if you had to)
Exactly - inheritance is a really strong relationship that directly
affects the in-memory layout of instances (at least in CPython), and
also the kinds of assumption other code will make about that type (for
example, subclasses are special cased to allow them to override the
behaviour of numeric binary operators when they appear as the right
operand with an instance of the parent type as the left operand, while
with unrelated types, the left operand always gets the first chance to
handle the operation).
When folks don't want to trigger those "this is an " behaviours,
the appropriate design pattern is composition, not inheritance (and
many of the ABCs were introduced to make it easier to implement
particular interfaces without inheriting from the corresponding
builtin types).
Cheers,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
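A minimal sketch of the composition alternative follows, assuming the os.fspath that later shipped in Python 3.6. The class name EncodedPath and its encoding parameter are hypothetical. The object stores bytes but is not a bytes instance, so the protocol method is the only thing os.fspath can use.

```python
import os


class EncodedPath:
    """Stores a path as bytes but exposes it to the os layer as str."""

    def __init__(self, raw, encoding="utf-8"):
        self._raw = raw            # held by composition, not inheritance
        self._encoding = encoding

    def __fspath__(self):
        return self._raw.decode(self._encoding)


obj = EncodedPath(b"bytes-val")

# Because obj is not a bytes instance, os.fspath calls __fspath__.
path_obj = os.fspath(obj)

print(repr(path_obj))
```

The raw bytes never reach file-handling code unless the class chooses to expose them.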
