Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-16 Thread Stephen J. Turnbull
Nick Coghlan writes:

 > On 15 April 2016 at 00:52, Stephen J. Turnbull  wrote:
 > > Nick Coghlan writes:
 > >
 > >  > The use case for returning bytes from __fspath__ is DirEntry, so you
 > >  > can write things like this in low level code:
 > >  >
 > >  > def myscandir(dirpath):
 > >  >     for entry in os.scandir(dirpath):
 > >  >         if entry.is_file():
 > >  >             with open(entry) as f:
 > >  >                 # do something
 > >
 > > Excuse me, but that is *not* a use case for returning bytes from
 > > DirEntry.__fspath__.  open() is perfectly happy taking str (including
 > > surrogate-encoded rawbytes).
 > 
 > That results in a different type for the file object's name:
 > 
 > >>> open("README.md").name
 > 'README.md'
 > >>> open(b"README.md").name
 > b'README.md'

OK, you win, __fspath__ needs to be polymorphic.
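
For the record, the difference is easy to reproduce.  A minimal sketch,
assuming a throwaway temporary file stands in for README.md (the
temporary directory and variable names are only for illustration):

```python
import os
import tempfile

# Create a throwaway file so the example does not depend on README.md existing.
with tempfile.TemporaryDirectory() as tmpdir:
    text_path = os.path.join(tmpdir, "README.md")
    with open(text_path, "w") as f:
        f.write("hello\n")

    # The same file name, expressed in the binary domain.
    bytes_path = os.fsencode(text_path)

    with open(text_path) as f:
        text_name = f.name       # the str that was passed in
    with open(bytes_path) as f:
        bytes_name = f.name      # the bytes that were passed in

print(type(text_name).__name__, type(bytes_name).__name__)
```

The file object simply remembers whatever it was given, so the type of
the argument leaks into .name.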

But you've just shifted me to -1 on "os.fspath": it's an attractive
nuisance.  EIBTI, applications and high-level library functions should
use os.fsdecode or os.fsencode.  Functions that take a polymorphic
argument and want to preserve type should invoke __fspath__ on the
argument.  That will visually signal that the caller is not merely
low-level, but is explicitly a boundary function.  (You could rename
the generic function as "os._fspath", I guess, but I *really* want to
deprecate calling the polymorphic version in user code.  _fspath can
be added if experience shows that polymorphic usage is very desirable
outside the stdlib.  This remark is in my not-so-Dutch opinion, of
course.)
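
To make "explicit" concrete, here is a small sketch of why decoding to
str loses nothing.  It assumes a UTF-8 filesystem encoding with the
surrogateescape handler (typical on POSIX) and spells the codec out by
hand rather than calling os.fsdecode, so the result does not depend on
the platform; the file name is made up:

```python
# A file name containing a byte that is not valid UTF-8 (Latin-1 "e acute").
raw = b"caf\xe9.txt"

# Explicit equivalent of what os.fsdecode does when the filesystem
# encoding is UTF-8 with the surrogateescape error handler.
text = raw.decode("utf-8", "surrogateescape")
assert text == "caf\udce9.txt"   # the undecodable byte became a lone surrogate

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("utf-8", "surrogateescape")
assert round_tripped == raw
```

So an application that picks the text domain can still hand the name
back to the OS unchanged.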

 > The guarantee we want to provide those folks is that if they're
 > operating in the binary domain they'll stay there.

Et tu, Nick?  "Guarantee"?!  You can't guarantee any such thing with
an implicitly invoked polymorphic API like this one -- unless you
consider a crashed program to be in the binary domain. ;-)  Note that
the current proposals don't even do that for the binary domain, only
for the text domain!

___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-16 Thread Paul Moore
On 16 April 2016 at 12:21, Stephen J. Turnbull  wrote:
> OK, you win, __fspath__ needs to be polymorphic.
>
> But you've just shifted me to -1 on "os.fspath": it's an attractive
> nuisance.  EIBTI, applications and high-level library functions should
> use os.fsdecode or os.fsencode.

I presume your expectation is that os.fsencode/os.fsdecode will work
with objects supporting the __fspath__ protocol?

So the question for me is, if I'm writing a function that takes a path
argument p (in the most general sense - I want my function to be able
to handle anything the stdlib functions can) then how do I write the
code? There are 4 cases I can think of:

1. I just want to pass the argument on to other functions - just do
so, stdlib functions will work fine.
2. I need a string - use os.fsdecode(p)
3. I need bytes - use os.fsencode(p)
4. I need a guaranteed pathlib.Path object so that I can use Path
methods - convert via pathlib.Path(os.fsdecode(p))
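
A rough sketch of those four cases, assuming os.fsdecode and
os.fsencode accept __fspath__ objects (which is how the protocol was
eventually released in Python 3.6).  The helper names are made up for
illustration only:

```python
import os
import pathlib

def pass_through(p):
    """Case 1: hand the argument to the stdlib unchanged."""
    return os.path.basename(p)   # bytes in, bytes out; otherwise str

def as_text(p):
    """Case 2: I need a str."""
    return os.fsdecode(p)

def as_bytes(p):
    """Case 3: I need bytes."""
    return os.fsencode(p)

def as_path(p):
    """Case 4: I need a pathlib.Path."""
    return pathlib.Path(os.fsdecode(p))

# The same relative path in three guises.
candidates = (
    "spam/eggs.txt",
    b"spam/eggs.txt",
    pathlib.PurePosixPath("spam/eggs.txt"),
)
for arg in candidates:
    print(type(arg).__name__, as_text(arg), as_bytes(arg), as_path(arg))
```

Whatever the caller passes, each helper returns the one type its name
promises.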

I guess there's the possibility that you want to deliberately reject
bytes-like paths, and it's not immediately obvious how you'd do that
without os.fspath or using the __fspath__ protocol directly, but I'm
not sure what anyone gains by doing so (maybe the chance to fail
early? but doesn't using fsdecode mean I never need to fail at all?)

While I don't have any specific reason to object to os.fspath, I'd
appreciate someone describing a concrete use case that needs it (and
isn't covered by any of the options above).

Paul


Re: [Python-Dev] PEP 8 updated on whether to break before or after a binary update

2016-04-16 Thread francismb
Hi,

On 04/15/2016 07:43 PM, Guido van Rossum wrote:
> The update is already serving its real purpose: showing that style is
> debatable and cannot always easily be reduced to fixed rules.
> 

As you said, there will always be some kind of personal preference or
style taste, and one can see from the debate that the current rules are
context dependent. But I wonder how far that style context/rule
(function) evaluation/application issue could be solved in a machine
learning context.

Regards,
francis


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-16 Thread Stephen J. Turnbull
Paul Moore writes:
 > On 16 April 2016 at 12:21, Stephen J. Turnbull  wrote:
 > > OK, you win, __fspath__ needs to be polymorphic.
 > >
 > > But you've just shifted me to -1 on "os.fspath": it's an attractive
 > > nuisance.  EIBTI, applications and high-level library functions should
 > > use os.fsdecode or os.fsencode.
 > 
 > I presume your expectation is that os.fsencode/os.fsdecode will work
 > with objects supporting the __fspath__ protocol?

Yes, I've suggested that before, and I think it's TOOWTDI, rather than
insisting on an intervening os.fspath call, even if os.fspath is included
after all.

 > So the question for me is, if I'm writing a function that takes a path
 > argument p:

 > 1. I just want to pass the argument on to other functions - just do
 > so, stdlib functions will work fine.

I think this is a bad idea unless you *need* polymorphism, but OK,
it's "consenting adults".

 > 2. I need a string - use os.fsdecode(p)
 > 3. I need bytes - use os.fsencode(p)
 > 4. I need a guaranteed pathlib.Path object so that I can use Path
 > methods - convert via pathlib.Path(os.fsdecode(p))

LGTM.  Applications or user toolkits could provide a derived
IFeelLuckyPath(Path) for symmetry with the os functions.

 > I guess there's the possibility that you want to deliberately reject
 > bytes-like paths,

I wouldn't put it that way.  I think more likely is the possibility
that you want to restrict yourself to a particular type, as all your
code is written in terms of that type and expects that type.  Note
that Nick's example shows that in both the bytes domain and the text
domain you can easily end up with a filelike.name of the wrong type.

 > and it's not immediately obvious how you'd do that without
 > os.fspath or using the __fspath__ protocol directly, but I'm not
 > sure what anyone gains by doing so (maybe the chance to fail early? 
 > but doesn't using fsdecode mean I never need to fail at all?)

Well, wouldn't you like to raise there if your dataflow spec says only
one type should ever be observed?
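
Something like the following is what I mean by raising at the
boundary.  This is only a sketch: require_text_path is a hypothetical
helper, and it leans on os.fspath as it was eventually released in
Python 3.6 (where it returns either str or bytes):

```python
import os
import pathlib

def require_text_path(p):
    """Hypothetical boundary helper: accept str, or an object whose
    __fspath__ yields str, and raise instead of silently decoding bytes."""
    result = os.fspath(p)        # released API: returns str or bytes
    if not isinstance(result, str):
        raise TypeError(
            "expected a text path, got " + type(result).__name__)
    return result

print(require_text_path("spam/eggs.txt"))
print(require_text_path(pathlib.PurePosixPath("spam/eggs.txt")))

try:
    require_text_path(b"spam/eggs.txt")
except TypeError as exc:
    print("rejected:", exc)
```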

The reasons that I wouldn't bother are that (1) I suspect it's going
to be very rare to see bytes in a text application, and (2) in
bytes-oriented code I would be fairly likely to either specify literals
as str (a bug, but nobody would ever notice) or import them from an
.ini or other text source (which might very well be in a
non-filesystem encoding in my environment!).  In either case it's probably
the filename I want but specified in the wrong form.



Re: [Python-Dev] PEP 8 updated on whether to break before or after a binary update

2016-04-16 Thread Stephen J. Turnbull
Victor Stinner writes:
 > Hum.
 > 
 > if (width == 0
 >     and height == 0
 >     and color == 'red'
 >     and emphasis == 'strong'
 >     or highlight > 100):
 >     raise ValueError("sorry, you lose")
 > 
 > Please remove one space to vertically align "and" operators with the
 > opening parenthesis:
 > 
 > if (width == 0
 >    and height == 0
 >    and color == 'red'
 >    and emphasis == 'strong'
 >    or highlight > 100):
 >     raise ValueError("sorry, you lose")

The RightThang[tm] is to remove "if" and replace it with the Japanese
"moshi":

moshi (width == 0
   and height == 0
   and color == 'red'
   and emphasis == 'strong'
   or highlight > 100):
    raise ValueError("sorry, you lose")

It-works-for-me-ly y'rs,



Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-16 Thread Paul Moore
On 16 April 2016 at 14:46, Stephen J. Turnbull  wrote:
> Paul Moore writes:
[...]
>  > 1. I just want to pass the argument on to other functions - just do
>  > so, stdlib functions will work fine.
>
> I think this is a bad idea unless you *need* polymorphism, but OK,
> it's "consenting adults".

All I'm really saying here is that if you don't need to care about
type checking (and 99% of Python programs rely on duck typing, so this
is pretty much the norm) then everything will be OK. I'm not
suggesting encouraging polymorphism, just pointing out that most code
should simply work and this whole debate is a non-issue for code like
that. (That's the whole point of getting the stdlib functions to
accept Path objects, after all :-))

>  > 2. I need a string - use os.fsdecode(p)
>  > 3. I need bytes - use os.fsencode(p)
>  > 4. I need a guaranteed pathlib.Path object so that I can use Path
>  > methods - convert via pathlib.Path(os.fsdecode(p))
>
> LGTM.  Applications or user toolkits could provide a derived
> IFeelLuckyPath(Path) for symmetry with the os functions.
>
>  > I guess there's the possibility that you want to deliberately reject
>  > bytes-like paths,
>
> I wouldn't put it that way.  I think more likely is the possibility
> that you want to restrict yourself to a particular type, as all your
> code is written in terms of that type and expects that type.  Note
> that Nick's example shows that in both the bytes domain and the text
> domain you can easily end up with a filelike.name of the wrong type.

But within your own code, you do that by convention and good coding
practices, not by explicit type checks (except in boundary code). If
you're writing a library to be used by others, you should be as
permissive as possible - you may not expect your code to be called
with bytes-like paths, but why go out of your way to reject it? That's
not Pythonic, IMO. (On the other hand, documenting that only text-like
path objects are supported by your library is fine).

In my experience, bytes/text safety is about being aware of where the
two different types appear in your program, not about forcing only one
type. So my cases are about keeping the types clear - the output of
(1) is "same as input", of (2) is "string", of (3) is "bytes" and of
(4) is "Path". Call me with whatever you like, I can work with it in
terms I need.

But we're mostly just debating coding style here, I think we agree on
the basic principle.

>  > and it's not immediately obvious how you'd do that without
>  > os.fspath or using the __fspath__ protocol directly, but I'm not
>  > sure what anyone gains by doing so (maybe the chance to fail early?
>  > but doesn't using fsdecode mean I never need to fail at all?)
>
> Well, wouldn't you like to raise there if your dataflow spec says only
> one type should ever be observed?

Meh. Maybe asserts, maybe unit tests. But typechecks throughout my
code sounds more like strong typing than Python. But as I say, coding
style - I write scripts, glue code, and general-use libraries. None of
these lend themselves to that sort of rigorous dataflow analysis (this
is the same reason I have little personal use for the new typechecking
stuff).

> The reasons that I wouldn't bother are that (1) I suspect it's going
> to be very rare to see bytes in a text application, and (2) in
> bytes-oriented code I would be fairly likely to either specify literals
> as str (a bug, but nobody would ever notice) or import them from an
> .ini or other text source (which might very well be in a
> non-filesystem encoding in my environment!).  In either case it's probably
> the filename I want but specified in the wrong form.

Also, that feels very much like the sort of boundary code that needs
to do the fiddly rigorous stuff so the rest of us don't have to :-)

Paul


Re: [Python-Dev] pathlib - current status of discussions

2016-04-16 Thread Chris Barker - NOAA Federal
> On Apr 13, 2016, at 8:31 PM, Nick Coghlan  wrote:
>
>>>   class Special(bytes):
>>>       def __fspath__(self):
>>>           return 'str-val'
>>>   obj = Special('bytes-val', 'utf8')
>>>   path_obj = fspath(obj, allow_bytes=True)
>>>
>>> With #2, path_obj == 'bytes-val'. With #3, path_obj == 'str-val'.
>
> In this kind of case, inheritance tends to trump protocol.

Sure, but...

> example, int subclasses can't override operator.index:
...
> The reasons for that behaviour are more pragmatic than philosophical:
> builtins and their subclasses are extensively special-cased for speed
> reasons,

OK, but in this case, purity can beat practicality. If the author
writes an __fspath__ method, presumably it's because it should be
used.

And I can certainly imagine one might want to store a path
representation as bytes, but NOT want the raw bytes passed off to file
handling libs.

(of course you could use composition rather than subclassing if you had to)

-CHB


Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-16 Thread Demur Rumed
The outstanding bug with this patch right now is a regression in line
numbers causing the test for http://bugs.python.org/issue9936 to fail. I've
tried to debug it without success.


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-16 Thread Nick Coghlan
On 16 April 2016 at 21:21, Stephen J. Turnbull  wrote:
> Nick Coghlan writes:
>
>  > On 15 April 2016 at 00:52, Stephen J. Turnbull  wrote:
>  > > Nick Coghlan writes:
>  > >
>  > >  > The use case for returning bytes from __fspath__ is DirEntry, so you
>  > >  > can write things like this in low level code:
>  > >  >
>  > >  > def myscandir(dirpath):
>  > >  >     for entry in os.scandir(dirpath):
>  > >  >         if entry.is_file():
>  > >  >             with open(entry) as f:
>  > >  >                 # do something
>  > >
>  > > Excuse me, but that is *not* a use case for returning bytes from
>  > > DirEntry.__fspath__.  open() is perfectly happy taking str (including
>  > > surrogate-encoded rawbytes).
>  >
>  > That results in a different type for the file object's name:
>  >
>  > >>> open("README.md").name
>  > 'README.md'
>  > >>> open(b"README.md").name
>  > b'README.md'
>
> OK, you win, __fspath__ needs to be polymorphic.
>
> But you've just shifted me to -1 on "os.fspath": it's an attractive
> nuisance.
>
> EIBTI, applications and high-level library functions should
> use os.fsdecode or os.fsencode.  Functions that take a polymorphic
> argument and want to preserve type should invoke __fspath__ on the
> argument. That will visually signal that the caller is not merely
> low-level, but is explicitly a boundary function.

str and bytes aren't going to implement __fspath__ (since they're only
*sometimes* path objects), so asking people to call the protocol
method directly for any purpose would be a pain.
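
To illustrate the pain: a sketch of the wrapper every caller would end
up writing if told to invoke the protocol method directly.  The name
raw_fspath is hypothetical, and the example assumes a pathlib that
implements __fspath__ (Python 3.6 or later):

```python
import pathlib

def raw_fspath(obj):
    """Hypothetical helper: what callers would have to write themselves
    if they were told to invoke the protocol method directly."""
    if isinstance(obj, (str, bytes)):
        return obj               # plain strings have no __fspath__
    return type(obj).__fspath__(obj)

# Neither builtin string type implements the protocol method.
assert not hasattr("README.md", "__fspath__")
assert not hasattr(b"README.md", "__fspath__")

print(raw_fspath("README.md"))
print(raw_fspath(pathlib.PurePosixPath("docs/README.md")))
```

The isinstance check is exactly the boilerplate a helper function in
the os module would save people from repeating.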

>  (You could rename
> the generic function as "os._fspath", I guess, but I *really* want to
> deprecate calling the polymorphic version in user code.  _fspath can
> be added if experience shows that polymorphic usage is very desirable
> outside the stdlib.  This remark is in my not-so-Dutch opinion, of
> course.)

You may have missed my email where I agreed os.fspath() itself needs
to ensure the output is a str object and throw an exception otherwise.
The remaining API design debate relates to whether the polymorphic
version should be "os.fspath(obj, allow_bytes=True)" or
"os._raw_fspath(obj)" (with Ethan favouring the former, and me the
latter).

>
>  > The guarantee we want to provide those folks is that if they're
>  > operating in the binary domain they'll stay there.
>
> Et tu, Nick?  "Guarantee"?!  You can't guarantee any such thing with
> an implicitly invoked polymorphic API like this one -- unless you
> consider a crashed program to be in the binary domain. ;-)

I do, as one of the core changes in design philosophy between Python 2
and 3 is attempting to remove the implicit level shifting between the
binary and text domains, and instead throw exceptions in those cases.
Pragmatism requires us to keep some of them (e.g. the codecs module is
officially object<->object in both Python 2 and Python 3, and string
formatting codes can still do unexpected things), but a great many of
them are already gone, and we don't want to add any new ones if
alternative designs are available.

> Note that
> the current proposals don't even do that for the binary domain, only
> for the text domain!

Folks that want to ensure they're working in the binary domain can
already do "memoryview(obj)" to ensure they have a bytes-like object
without constraining it to a specific type.
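
A minimal sketch of that check (require_binary is a hypothetical name;
memoryview itself is the builtin):

```python
def require_binary(obj):
    """Hypothetical check: return a memoryview if obj supports the buffer
    protocol, otherwise let memoryview's TypeError propagate."""
    return memoryview(obj)

# bytes and bytearray are both accepted without forcing a specific type.
for value in (b"README.md", bytearray(b"README.md")):
    assert bytes(require_binary(value)) == b"README.md"

try:
    require_binary("README.md")      # str is not bytes-like
except TypeError as exc:
    print("rejected:", exc)
```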

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] pathlib - current status of discussions

2016-04-16 Thread Nick Coghlan
On 17 April 2016 at 04:47, Chris Barker - NOAA Federal
 wrote:
>> On Apr 13, 2016, at 8:31 PM, Nick Coghlan  wrote:
>>
>>>>   class Special(bytes):
>>>>       def __fspath__(self):
>>>>           return 'str-val'
>>>>   obj = Special('bytes-val', 'utf8')
>>>>   path_obj = fspath(obj, allow_bytes=True)
>>>>
>>>> With #2, path_obj == 'bytes-val'. With #3, path_obj == 'str-val'.
>>
>> In this kind of case, inheritance tends to trump protocol.
>
> Sure, but...
>
>> example, int subclasses can't override operator.index:
> ...
>> The reasons for that behaviour are more pragmatic than philosophical:
>> builtins and their subclasses are extensively special-cased for speed
>> reasons,
>
> OK, but in this case, purity can beat practicality. If the author
> writes an __fspath__ method, presumably it's because it should be
> used.
>
> And I can certainly imagine one might want to store a path
> representation as bytes, but NOT want the raw bytes passed off to file
> handling libs.
>
> (of course you could use composition rather than subclassing if you had to)

Exactly - inheritance is a really strong relationship that directly
affects the in-memory layout of instances (at least in CPython), and
also the kinds of assumption other code will make about that type (for
example, subclasses are special cased to allow them to override the
behaviour of numeric binary operators when they appear as the right
operand with an instance of the parent type as the left operand, while
with unrelated types, the left operand always gets the first chance to
handle the operation).
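
A small sketch of that special case, using throwaway classes invented
for the illustration:

```python
class Base:
    def __add__(self, other):
        return "Base.__add__"
    def __radd__(self, other):
        return "Base.__radd__"

class Child(Base):
    def __radd__(self, other):       # overrides the reflected method
        return "Child.__radd__"

class Unrelated:
    def __radd__(self, other):
        return "Unrelated.__radd__"

# Right operand is a subclass that overrides __radd__: it goes first.
assert Base() + Child() == "Child.__radd__"
# Right operand is an unrelated type: the left operand goes first.
assert Base() + Unrelated() == "Base.__add__"
```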

When folks don't want to trigger those "this is an " behaviours,
the appropriate design pattern is composition, not inheritance (and
many of the ABCs were introduced to make it easier to implement
particular interfaces without inheriting from the corresponding
builtin types).
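
As a sketch of the difference, here is how the two designs behave with
os.fspath as it was eventually released in Python 3.6.  Special is the
class from earlier in the thread; Wrapped is a hypothetical
composition-based alternative:

```python
import os

class Special(bytes):
    """Inherits from bytes, so path-handling code treats it as bytes."""
    def __fspath__(self):
        return "str-val"

class Wrapped:
    """Composition: holds bytes internally, exposes only the protocol."""
    def __init__(self, raw):
        self._raw = raw
    def __fspath__(self):
        return os.fsdecode(self._raw)

inherited = os.fspath(Special(b"bytes-val"))
composed = os.fspath(Wrapped(b"bytes-val"))
assert inherited == b"bytes-val"     # __fspath__ was not consulted
assert composed == "bytes-val"       # __fspath__ was consulted
```

The subclass is taken at its word as a bytes object, while the wrapper
gets to decide what the rest of the world sees.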

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia