Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Nick Coghlan
On 14 April 2016 at 14:05, Random832  wrote:
> On Wed, Apr 13, 2016, at 23:27, Nick Coghlan wrote:
>> In this kind of case, inheritance tends to trump protocol. For
>> example, int subclasses can't override operator.index:
> ...
>> The reasons for that behaviour are more pragmatic than philosophical:
>> builtins and their subclasses are extensively special-cased for speed
>> reasons, and those shortcuts are encountered before the interpreter
>> even considers using the general protocol.
>>
>> In cases where the magic method return types are polymorphic (so
>> subclasses may want to override them) we'll use more restrictive exact
>> type checks for the shortcuts, but that argument doesn't apply for
>> typechecked protocols where the result is required to be an instance
>> of a particular builtin type (but subclasses are considered
>> acceptable).
>
> Then why aren't we doing it for str? Because "try: path =
> path.__fspath__()" is more idiomatic than the alternative?

The sketches Brett posted will bear little resemblance to the actual
implementation - that will be in C and use similar idioms to those we
use for other abstract protocols (such as shortcuts for instances of
builtin types, and doing the method lookup via the passed in object's
type, rather than on the instance).
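
As a rough Python illustration of that last idiom (the real thing will be C; the RichPath class and its return values are invented for this example), looking the method up on the type means an attribute stuck on the instance is ignored:

```python
class RichPath:
    """Hypothetical path-like class, used only for this illustration."""

    def __fspath__(self):
        return "from-type"


obj = RichPath()
# Shadow the method on the instance; a type-based lookup never sees this.
obj.__fspath__ = lambda: "from-instance"

# Plain attribute access finds the instance attribute first.
via_instance = obj.__fspath__()

# Looking the method up on the type and passing the object explicitly
# mirrors how special methods are normally resolved.
via_type = type(obj).__fspath__(obj)

print(via_instance)
print(via_type)
```

The type-based form is the one that matches the behaviour of the other abstract protocols.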

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Nick Coghlan
On 14 April 2016 at 13:54, Random832  wrote:
> On Wed, Apr 13, 2016, at 23:17, Nick Coghlan wrote:
>
>> - os.fspath -> str (no coercion)
>> - os.fsdecode -> str (with coercion from bytes)
>> - os.fsencode -> bytes (with coercion from str)
>> - os._raw_fspath -> str-or-bytes (no coercion)
>>
>> (with "coercion" referring to how the result of __fspath__ and any
>> directly passed in str or bytes objects are handled)
>>
>> The leading underscore on _raw_fspath would be of the "this is a
>> documented and stable API, but you probably don't want to use it
>> unless you really know what you're doing" variety, rather than the
>> "this is an undocumented and potentially unstable private API"
>> variety.
>
> In this scenario could the protocol return bytes?

Yes, that's desirable to handle DirEntry transparently regardless of type.

> If the protocol can return bytes, then that means that types (DirEntry?
> someone had an alternate path library with a bPath?) which return bytes
> via the protocol will proliferate, and cannot be safely passed to
> anything that uses os.fspath. Numerous copies of "def myfspath(x):
> return os.fsdecode(os._raw_fspath(x))" will proliferate (or they'll just
> monkey-patch os.fspath), and no-one actually uses os.fspath except toy
> examples.

If folks want coercion, they can just use os.fsdecode(x), as that
already has a str -> str passthrough from the input to the output
(unlike codecs.decode) and will presumably be updated to include an
implicit call to os._raw_fspath() on the passed in object.

> Why is it so objectionable for os.fspath to do coercion?

The first problem is that binary paths on Windows basically don't
work, so it's preferable for them to fail fast regardless of platform,
rather than to have them implicitly work on *nix, only to fail for
Windows users using non-ASCII paths later.

The second is that it would make os.fspath and os.fsdecode
functionally equivalent, so we'd have two different spellings for the
same operation.
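
A small sketch of the distinction, assuming a str-only helper along the lines proposed (str_only_path is a made-up name standing in for the proposed os.fspath, and it skips the __fspath__ lookup entirely to keep the example short):

```python
import os


def str_only_path(path):
    """Hypothetical stand-in for a no-coercion, str-only API."""
    if isinstance(path, str):
        return path
    # Fail fast on every platform instead of only misbehaving on Windows.
    raise TypeError("expected str, got " + type(path).__name__)


# os.fsdecode already coerces bytes and passes str through unchanged.
coerced = os.fsdecode(b"spam.txt")
passthrough = os.fsdecode("spam.txt")

rejected = None
try:
    str_only_path(b"spam.txt")
except TypeError as exc:
    rejected = str(exc)

print(coerced, passthrough)
print(rejected)
```

Code that wants coercion calls os.fsdecode; code that wants an early failure on bytes calls the strict variant.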

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Random832
On Wed, Apr 13, 2016, at 23:27, Nick Coghlan wrote:
> In this kind of case, inheritance tends to trump protocol. For
> example, int subclasses can't override operator.index:
...
> The reasons for that behaviour are more pragmatic than philosophical:
> builtins and their subclasses are extensively special-cased for speed
> reasons, and those shortcuts are encountered before the interpreter
> even considers using the general protocol.
> 
> In cases where the magic method return types are polymorphic (so
> subclasses may want to override them) we'll use more restrictive exact
> type checks for the shortcuts, but that argument doesn't apply for
> typechecked protocols where the result is required to be an instance
> of a particular builtin type (but subclasses are considered
> acceptable).

Then why aren't we doing it for str? Because "try: path =
path.__fspath__()" is more idiomatic than the alternative?

If some sort of reasoned decision has been made to require the protocol
to trump the special case for str subclasses, it's unreasonable not to
apply the same decision to bytes subclasses. The decision should be
"always use the protocol first" or "always use the type match first".

In other words, why not this:

    def fspath(path, *, allow_bytes=False):
        # Type match first: str (and bytes, if allowed) pass straight through.
        if isinstance(path, (bytes, str) if allow_bytes else str):
            return path
        # Otherwise fall back to the protocol.
        try:
            m = path.__fspath__
        except AttributeError:
            raise TypeError
        path = m()
        if isinstance(path, (bytes, str) if allow_bytes else str):
            return path
        raise TypeError
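
To make the two orderings concrete, here is a minimal sketch run against Nikolaus's Special example. The function names fspath_type_first and fspath_protocol_first are invented for this comparison, and both always allow bytes; neither is meant as the eventual implementation.

```python
def fspath_type_first(path):
    """Check for str/bytes before consulting the protocol."""
    if isinstance(path, (str, bytes)):
        return path
    try:
        method = type(path).__fspath__
    except AttributeError:
        raise TypeError("not a path: " + type(path).__name__) from None
    return method(path)


def fspath_protocol_first(path):
    """Consult the protocol before falling back to str/bytes."""
    try:
        method = type(path).__fspath__
    except AttributeError:
        if isinstance(path, (str, bytes)):
            return path
        raise TypeError("not a path: " + type(path).__name__) from None
    return method(path)


class Special(bytes):
    def __fspath__(self):
        return 'str-val'


obj = Special('bytes-val', 'utf8')
type_first_result = fspath_type_first(obj)
protocol_first_result = fspath_protocol_first(obj)
print(repr(type_first_result), repr(protocol_first_result))
```

Only a str or bytes subclass that also defines __fspath__ can tell the two apart; for every other input they agree.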


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Random832
On Wed, Apr 13, 2016, at 23:17, Nick Coghlan wrote:

> - os.fspath -> str (no coercion)
> - os.fsdecode -> str (with coercion from bytes)
> - os.fsencode -> bytes (with coercion from str)
> - os._raw_fspath -> str-or-bytes (no coercion)
> 
> (with "coercion" referring to how the result of __fspath__ and any
> directly passed in str or bytes objects are handled)
> 
> The leading underscore on _raw_fspath would be of the "this is a
> documented and stable API, but you probably don't want to use it
> unless you really know what you're doing" variety, rather than the
> "this is an undocumented and potentially unstable private API"
> variety.

In this scenario could the protocol return bytes?

If the protocol cannot return bytes, then _raw_fspath will only return
bytes if directly passed bytes. This limits its utility for the
functions that consume it (presumably path_convert (os.open and friends)
and builtin open), since they already have to act specially based on the
types of their arguments (builtin open can accept an integer;
path_convert has to behave radically differently on str or bytes input)
and there's no reason they couldn't simply accept bytes directly while
they're doing that.

If the protocol can return bytes, then that means that types (DirEntry?
someone had an alternate path library with a bPath?) which return bytes
via the protocol will proliferate, and cannot be safely passed to
anything that uses os.fspath. Numerous copies of "def myfspath(x):
return os.fsdecode(os._raw_fspath(x))" will proliferate (or they'll just
monkey-patch os.fspath), and no-one actually uses os.fspath except toy
examples.

Why is it so objectionable for os.fspath to do coercion?


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Nick Coghlan
On 14 April 2016 at 13:14, Ethan Furman  wrote:
> On 04/13/2016 07:57 PM, Nikolaus Rath wrote:
>> Either I haven't understood your answer, or you haven't understood my
>> question. I'm concerned about this case:
>>
>>    class Special(bytes):
>>        def __fspath__(self):
>>            return 'str-val'
>>    obj = Special('bytes-val', 'utf8')
>>    path_obj = fspath(obj, allow_bytes=True)
>>
>> With #2, path_obj == b'bytes-val'. With #3, path_obj == 'str-val'.
>
> I misunderstood your question.  That is... an interesting case.  ;)

In this kind of case, inheritance tends to trump protocol. For
example, int subclasses can't override operator.index:

>>> from operator import index
>>> class NotAnInt():
...     def __index__(self):
...         return 42
...
>>> index(NotAnInt())
42
>>> class MyInt(int):
...     def __index__(self):
...         return 42
...
>>> index(MyInt(53))
53

The reasons for that behaviour are more pragmatic than philosophical:
builtins and their subclasses are extensively special-cased for speed
reasons, and those shortcuts are encountered before the interpreter
even considers using the general protocol.

In cases where the magic method return types are polymorphic (so
subclasses may want to override them) we'll use more restrictive exact
type checks for the shortcuts, but that argument doesn't apply for
typechecked protocols where the result is required to be an instance
of a particular builtin type (but subclasses are considered
acceptable).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Nick Coghlan
On 14 April 2016 at 12:49, Nick Coghlan  wrote:
> The API could be something like:
>
> - os.fspath -> str-or-bytes
> - os.fsencode -> bytes (with coercion from str)
> - os.fsdecode -> str (with coercion from bytes)
> - os.strpath -> str (no coercion)

There seems to be fairly broad opposition to the idea of defining the
public API in terms of what os and os.path are likely to need, which
reminded me of Koos's suggestion of using a private API for the
str-or-bytes variant. That approach would give us something like:

- os.fspath -> str (no coercion)
- os.fsdecode -> str (with coercion from bytes)
- os.fsencode -> bytes (with coercion from str)
- os._raw_fspath -> str-or-bytes (no coercion)

(with "coercion" referring to how the result of __fspath__ and any
directly passed in str or bytes objects are handled)

The leading underscore on _raw_fspath would be of the "this is a
documented and stable API, but you probably don't want to use it
unless you really know what you're doing" variety, rather than the
"this is an undocumented and potentially unstable private API"
variety.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Ethan Furman

On 04/13/2016 07:57 PM, Nikolaus Rath wrote:
> On Apr 13 2016, Ethan Furman wrote:
>> On 04/13/2016 03:45 PM, Nikolaus Rath wrote:
>>
>>> When passing an object that is of type str and has a __fspath__
>>> attribute, all approaches return the value of __fspath__().
>>>
>>> However, when passing something of type bytes, the second approach
>>> returns the object, while the third returns the value of __fspath__().
>>>
>>> Is this intentional? I think a __fspath__ attribute should always be
>>> preferred.
>>
>> Yes, it is intentional.  The second approach assumes __fspath__ can
>> only contain str, so there is no point in checking it for bytes.
>
> Either I haven't understood your answer, or you haven't understood my
> question. I'm concerned about this case:
>
>    class Special(bytes):
>        def __fspath__(self):
>            return 'str-val'
>    obj = Special('bytes-val', 'utf8')
>    path_obj = fspath(obj, allow_bytes=True)
>
> With #2, path_obj == b'bytes-val'. With #3, path_obj == 'str-val'.

I misunderstood your question.  That is... an interesting case.  ;)

--
~Ethan~



Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-13 Thread Nick Coghlan
On 14 April 2016 at 08:26, Victor Stinner  wrote:
> 2016-04-14 0:11 GMT+02:00 Ryan Gonzalez :
>> So code that depends on iterating through bytecode via HAS_ARG is going to
>> break...
>
> Sure. This change is backward incompatible for applications parsing
> bytecode in C or Python. That's why the patch also has to update the
> dis module.
>
> I don't see how you plan to keep backward compatibility, since the
> argument size changed from 2 bytes to 1 byte. You must update your
> code (written in C or Python or whatever).
>
> Fortunately, the dis module was enhanced in Python 3.4: get_instructions()
> now gives nice Instruction objects rather than only pure text output.
>
> FYI I wrote my own library to decode and encode bytecode. It provides
> abstract bytecode objects to easily modify bytecode:
> https://bytecode.readthedocs.org/
>
> I suggest using such a library (or simply the dis module for simple
> needs) if you have to handle bytecode, rather than writing your own
> code.
>
> I know a few other projects which handle bytecode directly:
>
> * https://pypi.python.org/pypi/codetransformer
> * https://github.com/serprex/byteplay
> * https://pypi.python.org/pypi/coverage
>
> IMHO it's not a big deal to update these projects for the future
> Python 3.6. I can even help them to support the new bytecode format.

+1

We've also had previous discussions on adding a "minimum viable
bytecode editing" API to the standard library, and updating these
third party modules to support wordcode instead of bytecode could
provide a good use-case-driven opportunity for defining that (i.e. it
wouldn't be about providing an end user facing API directly, but
rather about letting CPython take care of the bookkeeping details for
things like lnotab and sorting out jump targets).
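
For readers who currently walk the raw bytes themselves, the dis-based route Victor mentions looks roughly like this (the add function is just a throwaway example, and the exact opcodes printed depend on the Python version):

```python
import dis


def add(a, b):
    return a + b


# get_instructions() yields Instruction objects, so the caller never has
# to know how wide an opcode or its argument is in the underlying format.
instructions = list(dis.get_instructions(add))
opnames = [instr.opname for instr in instructions]

for instr in instructions:
    print(instr.offset, instr.opname, instr.argrepr)
```

Code written this way should not care whether the underlying encoding is bytecode or wordcode.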

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Nikolaus Rath
On Apr 13 2016, Ethan Furman  wrote:
> On 04/13/2016 03:45 PM, Nikolaus Rath wrote:
>
>> When passing an object that is of type str and has a __fspath__
>> attribute, all approaches return the value of __fspath__().
>>
>> However, when passing something of type bytes, the second approach
>> returns the object, while the third returns the value of __fspath__().
>>
>> Is this intentional? I think a __fspath__ attribute should always be
>> preferred.
>
> Yes, it is intentional.  The second approach assumes __fspath__ can
> only contain str, so there is no point in checking it for bytes.

Either I haven't understood your answer, or you haven't understood my
question. I'm concerned about this case:

  class Special(bytes):
      def __fspath__(self):
          return 'str-val'
  obj = Special('bytes-val', 'utf8')
  path_obj = fspath(obj, allow_bytes=True)

With #2, path_obj == b'bytes-val'. With #3, path_obj == 'str-val'.

I would expect that fspath(obj, allow_bytes=True) == 'str-val' (after
all, it's allow_bytes, not require_bytes). Bu


Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Nick Coghlan
On 14 April 2016 at 07:37, Victor Stinner  wrote:
> On Wednesday, 13 April 2016, Brett Cannon wrote:
>>
>> All of this is demonstrated in
>> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 by the
>> various possibilities. In the end it's not a corner case because the
>> definition of __fspath__ will be such that there's no ambiguity in what
>> os.fspath() will accept and what __fspath__ can return and the code will be
>> written to conform to what the PEP dictates (IOW I'm aware that this needs
>> to be considered in the implementation :) .
>
> I'm not a big fan of a flag parameter to change the return type of a
> function. Usually, two functions are preferred. In the os module we have
> getcwd/getcwdb for example. I don't know if it's a good example

It is, as one of the benefits of the "two separate functions" model is
to improve type inference during static analysis - you don't
necessarily know the values of parameters at analysis time, but you do
know which function is being called.
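
A quick sketch of the difference, using the existing getcwd/getcwdb pair next to a hypothetical flag-based equivalent (cwd and its as_bytes parameter are invented here purely for contrast):

```python
import os

# Two functions: the return type is known from the name alone.
text_cwd = os.getcwd()    # always str
raw_cwd = os.getcwdb()    # always bytes


def cwd(as_bytes=False):
    """Hypothetical flag-based variant; not a real os API."""
    # The return type now depends on a runtime value, which an analyser
    # can only resolve when the argument is a literal.
    return os.getcwdb() if as_bytes else os.getcwd()


flag_result = cwd(as_bytes=True)
print(type(text_cwd).__name__, type(raw_cwd).__name__, type(flag_result).__name__)
```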

> Do you know other examples of Python functions taking a (flag) parameter to
> change the result type?

subprocess.Popen has a couple of flags that can do that (more
precisely, they change the return type of some methods on the
resulting object), but that's not an especially pretty API in general.
String based type variations are more common (e.g. file mode flags,
using the codec module registry), but they're still used only
sparingly (since they make the code harder to reason about for both
humans and static analysers).

In terms of types for filesystem path APIs:

1. I assume we'll want a fast path for bytes & str to avoid
performance regressions (especially in os.path, where we may be doing
pure data manipulation without any IO operations)
2. I favour defining __fspath__ and os.fspath() in terms of what the
os and os.path modules need to handle both DirEntry and pathlib (which
I currently expect to be str-or-bytes)
3. For the benefit of higher level cross-platform code like pathlib,
it likely makes sense to also have a str-only API that throws an
exception rather than returning bytes

However, I also suggest deferring a decision on 3 until 2 has been
definitively answered by way of implementing the changes. If I'm right
about 2, then the API could be something like:

- os.fspath -> str-or-bytes
- os.fsencode -> bytes (with coercion from str)
- os.fsdecode -> str (with coercion from bytes)
- os.strpath -> str (no coercion)

It's also worth noting that os.fsencode and os.fsdecode are already
idempotent - their current signatures are "str-or-bytes -> bytes" and
"str-or-bytes -> str". With a str-or-bytes return type on os.fspath,
adapting them to handle rich path objects should just be a matter of
adding an os.fspath call as the first step.
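
Roughly, and only as a sketch: raw_fspath below is a made-up stand-in for a str-or-bytes os.fspath, and the *_rich wrappers and the Entry class are likewise invented for the example.

```python
import os

# Already idempotent today: a second application changes nothing.
assert os.fsdecode(os.fsdecode(b"data.bin")) == "data.bin"
assert os.fsencode(os.fsencode("data.bin")) == b"data.bin"


def raw_fspath(path):
    """Hypothetical str-or-bytes protocol step (no coercion)."""
    if isinstance(path, (str, bytes)):
        return path
    try:
        method = type(path).__fspath__
    except AttributeError:
        raise TypeError("not a path: " + type(path).__name__) from None
    return method(path)


def fsdecode_rich(path):
    # The only change: run the protocol step first.
    return os.fsdecode(raw_fspath(path))


def fsencode_rich(path):
    return os.fsencode(raw_fspath(path))


class Entry:
    """Hypothetical rich path object that may wrap str or bytes."""

    def __init__(self, name):
        self._name = name

    def __fspath__(self):
        return self._name


print(fsdecode_rich(Entry(b"a.txt")), fsencode_rich(Entry("a.txt")))
```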

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Random832

On Apr 13, 2016 19:06, Brett Cannon  wrote:
> On Wed, 13 Apr 2016 at 15:46 Nikolaus Rath  wrote:
>> When passing an object that is of type str and has a __fspath__
>> attribute, all approaches return the value of __fspath__().
>>
>> However, when passing something of type bytes, the second approach
>> returns the object, while the third returns the value of __fspath__().
>>
>> Is this intentional? I think a __fspath__ attribute should always be
>> preferred.
>
>
> It's very much intentional. If we define __fspath__() to only return strings 
> but still want to minimize boilerplate of allowing bytes to simply pass 
> through without checking a path argument to see if it is bytes then approach 
> #2 is warranted. But if __fspath__() can return bytes then approach #3 allows 
> for it. 

Er, the difference comes in when the object passed to os.fspath is a subclass 
of bytes that, itself, has a __fspath__ method (which may return a str). It's 
unlikely to occur in the wild, but is a semantic difference between this case 
and all other objects with __fspath__ methods.


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Random832

On Apr 13, 2016 20:06, Chris Barker  wrote:
>
> In this case, I don't know that we need to be tolerant of buggy
> __fspath__() implementations -- they should be tested outside these
> checks, and not be buggy. So a buggy implementation may raise and may be
> ignored, depending on what Exception the bug triggers -- big deal. The
> only time it would matter is when the implementer is debugging the
> implementation.
>
> -CHB

Yes, but you can often, and can in this case, restrict the contents of
the try block to a single operation - a name access, an attribute, a
subscript - and that sharply limits the risk of such a thing happening.
Sure, the object's __getattr(ibute)__ could still fail from something
deep inside it missing a different attribute, but that's it.


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Ethan Furman

On 04/13/2016 05:06 PM, Chris Barker wrote:

> In this case, I don't know that we need to be tolerant of buggy
> __fspath__() implementations -- they should be tested outside these
> checks, and not be buggy. So a buggy implementation may raise and may be
> ignored, depending on what Exception the bug triggers -- big deal. The
> only time it would matter is when the implementer is debugging the
> implementation.


Yet the idea behind robust exception handling is to test as little as 
possible and only catch what you know how to correct.


This code catches only one thing, only at one place, and we know how to 
deal with it:


  try:
      fsp = obj.__fspath__
  except AttributeError:
      pass
  else:
      fsp = fsp()
Contrarily, this next code catches the same error, but it could happen 
at the one place we know how to deal with it *or* anywhere further down 
the call stack where we have no clue what the proper course is to handle 
the problem... yet we suppress it anyway:


  try:
      fsp = obj.__fspath__()
  except AttributeError:
      pass

Certainly not code I want to see in the stdlib.
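
A runnable sketch of the difference, assuming a deliberately buggy implementation (the Broken class and the narrow/broad helper names are invented for this demonstration):

```python
class Broken:
    """Hypothetical path class whose __fspath__ contains a bug."""

    def __fspath__(self):
        # Bug: this attribute does not exist, so AttributeError is raised
        # from *inside* the method, not from looking the method up.
        return self.missing_attribute


def narrow(obj):
    """Guard only the attribute lookup."""
    try:
        method = obj.__fspath__
    except AttributeError:
        return None
    return method()


def broad(obj):
    """Guard the lookup and the call together."""
    try:
        return obj.__fspath__()
    except AttributeError:
        return None


masked = broad(Broken())  # the bug is silently swallowed

surfaced = False
try:
    narrow(Broken())
except AttributeError:
    surfaced = True       # the bug propagates to the caller

print(masked, surfaced)
```

With the narrow form, an object that simply lacks __fspath__ is still handled quietly, while a bug inside the method shows up where it can be fixed.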

--
~Ethan~


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Chris Barker
On Wed, Apr 13, 2016 at 1:47 PM, Random832  wrote:

> On Wed, Apr 13, 2016, at 16:39, Chris Barker wrote:
> > so are we worried that __fspath__ will exist and be callable, but  might
> > raise an AttributeError somewhere inside itself? if so isn't it broken
> > anyway, so should it be ignored?
>
> Well, if you're going to say "ignore the protocol because it's broken",
> where do you stop? What if it raises some other exception? What if it
> raises SystemExit?


this is pretty much always the case with EAFP coding:

    try:
        something()
    except SomeError:
        do_something_else()

unless SomeError is a custom defined error that you know is never going to
get raised anywhere else, then something() could raise SomeError for the
reason you expect, or some code deep in the call stack could raise
SomeError also, and you wouldn't know that.

I had a student run into this and it took him a good while to debug it. But
that was because the code in something() was pretty darn buggy. If he had
tested something() by itself, there would have been no issue finding the
problem.

In this case, I don't know that we need to be tolerant of buggy
__fspath__() implementations -- they should be tested outside these
checks, and not be buggy. So a buggy implementation may raise and may be
ignored, depending on what Exception the bug triggers -- big deal. The only
time it would matter is when the implementer is debugging the
implementation.

-CHB





-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Brett Cannon
On Wed, 13 Apr 2016 at 15:20 Victor Stinner 
wrote:

> Oh, since others voted, I will also vote and explain my vote.
>
> I like choice 1, str only, because it's very well defined. In Python
> 3, Unicode is simply the native type for text. It's accepted by almost
> all functions. In other emails, I also explained that Unicode is fine
> for storing undecodable filenames on UNIX; it has worked as expected
> for many years (since Python 3.3).
>
> --
>
> If you cannot survive without bytes, I suggest adding two functions:
> one for str only, another which can return str or bytes.
>
> Maybe you want in fact two protocols: __fspath__ (str only) and
> __fspathb__ (bytes only)? os.fspathb() would first try __fspathb__, or
> fall back to os.fsencode(__fspath__). os.fspath() would first try
> __fspath__, or fall back to os.fsdecode(__fspathb__). IMHO it's not
> worth having such complexity while Unicode handles all use cases.
>

Implementing two magic methods for this seems like overkill. Best I would
be willing to do with automatic encode/decode is use
os.fsencode()/os.fsdecode() on the argument or what __fspath__() returned.


>
> Or do you know functions implemented in Python accepting str *and* bytes?
>

On purpose, nothing off the top of my head.


>
> --
>
> The C implementation of the os module has an important
> path_converter() function:
>
>  * path_converter accepts (Unicode) strings and their
>  * subclasses, and bytes and their subclasses.  What
>  * it does with the argument depends on the platform:
>  *
>  *   * On Windows, if we get a (Unicode) string we
>  * extract the wchar_t * and return it; if we get
>  * bytes we extract the char * and return that.
>  *
>  *   * On all other platforms, strings are encoded
>  * to bytes using PyUnicode_FSConverter, then we
>  * extract the char * from the bytes object and
>  * return that.
>
> This function will implement something like os.fspath().
>
> With os.fspath() only accepting str, we will directly return the
> Unicode string on Windows. On UNIX, Unicode will be encoded, as it's
> already done for Unicode strings.
>
> This specific function would benefit from flavor 4 (os.fspath() can
> return str and bytes), but it's more the exception than the rule. It
> would be more a micro-optimization than a good reason to drive the API
> design.
>

Yep, it's interesting to know but Chris and I won't let it drive the
decision (I assume).

-Brett


>
> Victor
>
> On Wednesday, 13 April 2016, Brett Cannon wrote:
> >
> > https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1
> > has the four potential approaches implemented (although it doesn't
> > follow the "separate functions" approach some are proposing and instead
> > goes with the allow_bytes approach I originally proposed).
>


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Brett Cannon
On Wed, 13 Apr 2016 at 15:46 Nikolaus Rath  wrote:

> On Apr 13 2016, Brett Cannon  wrote:
> > On Tue, 12 Apr 2016 at 22:38 Michael Mysinger via Python-Dev
> > <python-dev@python.org> wrote:
> >
> >> Ethan Furman writes:
> >>
> >> > Do we allow bytes to be returned from os.fspath()?  If yes, then do we
> >> > allow bytes from __fspath__()?
> >>
> >> De-lurking. Especially since the ultimate goal is better
> >> interoperability, I feel like an implementation that people can play
> >> with would help guide the few remaining decisions. To help test the
> >> various options you could temporarily add a
> >> _allow_bytes=GLOBAL_CONFIG_OPTION default argument to both
> >> pathlib.__fspath__() and os.fspath(), with distinct configurable
> >> defaults for each.
> >>
> >> In the spirit of Python 3 I feel like bytes might not be needed in
> >> practice, but something like this with defaults of False will allow
> >> people to easily test all the various options.
> >>
> >
> > https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has
> > the four potential approaches implemented (although it doesn't follow the
> > "separate functions" approach some are proposing and instead goes with
> > the allow_bytes approach I originally proposed).
>
>
> When passing an object that is of type str and has a __fspath__
> attribute, all approaches return the value of __fspath__().
>
> However, when passing something of type bytes, the second approach
> returns the object, while the third returns the value of __fspath__().
>
> Is this intentional? I think a __fspath__ attribute should always be
> preferred.
>

It's very much intentional. If we define __fspath__() to only return
strings but still want to minimize boilerplate of allowing bytes to simply
pass through without checking a path argument to see if it is bytes then
approach #2 is warranted. But if __fspath__() can return bytes then
approach #3 allows for it.


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Ethan Furman

On 04/13/2016 03:45 PM, Nikolaus Rath wrote:

> When passing an object that is of type str and has a __fspath__
> attribute, all approaches return the value of __fspath__().
>
> However, when passing something of type bytes, the second approach
> returns the object, while the third returns the value of __fspath__().
>
> Is this intentional? I think a __fspath__ attribute should always be
> preferred.

Yes, it is intentional.  The second approach assumes __fspath__ can only
contain str, so there is no point in checking it for bytes.


--
~Ethan~



Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Nikolaus Rath
On Apr 13 2016, Brett Cannon  wrote:
> On Tue, 12 Apr 2016 at 22:38 Michael Mysinger via Python-Dev
> <python-dev@python.org> wrote:
>
>> Ethan Furman writes:
>>
>> > Do we allow bytes to be returned from os.fspath()?  If yes, then do we
>> > allow bytes from __fspath__()?
>>
>> De-lurking. Especially since the ultimate goal is better
>> interoperability, I feel like an implementation that people can play
>> with would help guide the few remaining decisions. To help test the
>> various options you could temporarily add a
>> _allow_bytes=GLOBAL_CONFIG_OPTION default argument to both
>> pathlib.__fspath__() and os.fspath(), with distinct configurable
>> defaults for each.
>>
>> In the spirit of Python 3 I feel like bytes might not be needed in
>> practice, but something like this with defaults of False will allow
>> people to easily test all the various options.
>>
>
> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has
> the four potential approaches implemented (although it doesn't follow the
> "separate functions" approach some are proposing and instead goes with the
> allow_bytes approach I originally proposed).


When passing an object that is of type str and has a __fspath__
attribute, all approaches return the value of __fspath__().

However, when passing something of type bytes, the second approach
returns the object, while the third returns the value of __fspath__().

Is this intentional? I think a __fspath__ attribute should always be
preferred.


Best,
-Nikolaus


-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«


Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-13 Thread Yury Selivanov



On 2016-04-13 12:24 PM, Victor Stinner wrote:

> Can someone please review the change?


+1 for the change.  I can take a look at the patch in a few days.

Yury


Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-13 Thread Victor Stinner
2016-04-14 0:11 GMT+02:00 Ryan Gonzalez :
> So code that depends on iterating through bytecode via HAS_ARG is going to
> break...

Sure. This change is backward incompatible for applications parsing
bytecode in C or Python. That's why the patch also has to update the
dis module.

I don't see how you plan to keep the backward compatibility, since the
argument size changed from 2 bytes to 1 byte. You must update your
code (written in C or Python or whatever).
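
As an illustration of why such code has to change, here is a minimal sketch of the two iteration loops over raw code bytes. The helper names are hypothetical, the HAVE_ARGUMENT value of 90 is an assumption taken from the CPython 3.5 opcode table, and EXTENDED_ARG folding is left out to keep the sketch short.

```python
HAVE_ARGUMENT = 90  # assumption: threshold in CPython 3.5; opcodes >= this take an argument


def iter_old_bytecode(code):
    """Yield (opcode, arg) from the old layout: 1 byte, or 1 byte plus a 2-byte argument."""
    i = 0
    while i < len(code):
        op = code[i]
        if op >= HAVE_ARGUMENT:
            # 16-bit little-endian argument follows the opcode.
            yield op, code[i + 1] | (code[i + 2] << 8)
            i += 3
        else:
            yield op, None
            i += 1


def iter_wordcode(code):
    """Yield (opcode, arg) from the proposed layout: every instruction is 2 bytes."""
    for i in range(0, len(code), 2):
        yield code[i], code[i + 1]


# Synthetic byte strings (not produced by a real compiler):
# LOAD_FAST 0 (0x7c), UNARY_NOT (0x0c), RETURN_VALUE (0x53) in 3.5 numbering.
old_style = bytes([0x7C, 0x00, 0x00, 0x0C, 0x53])
word_style = bytes([0x7C, 0x00, 0x0C, 0x00, 0x53, 0x00])

print(list(iter_old_bytecode(old_style)))
print(list(iter_wordcode(word_style)))
```

The old loop needs the HAS_ARG-style test to know how far to advance; the wordcode loop does not, which is the simplification discussed in this thread.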

Fortunately, the dis module was enhanced in Python 3.4: get_instructions() now
gives nice Instruction objects rather than only pure text output.
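
For example, a small sketch using only the documented dis API; the function sample() is just a placeholder, and the exact opcode names printed depend on the Python version in use.

```python
import dis


def sample(x):
    # Placeholder function whose bytecode we want to inspect.
    return not x


# get_instructions() yields Instruction objects, so no manual byte parsing
# (and no HAS_ARG-style test) is needed.
instructions = list(dis.get_instructions(sample))
for instr in instructions:
    print(instr.offset, instr.opname, instr.arg, instr.argrepr)
```

Code written this way does not depend on how many bytes an instruction occupies.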

FYI I wrote my own library to encode and decode bytecode. It provides
abstract bytecode objects to easily modify bytecode:
https://bytecode.readthedocs.org/

I suggest using such a library (or simply the dis module for simple
needs) if you have to handle bytecode, rather than writing your own
code.

I know a few other projects which handle bytecode directly:

* https://pypi.python.org/pypi/codetransformer
* https://github.com/serprex/byteplay
* https://pypi.python.org/pypi/coverage

IMHO it's not a big deal to update these projects for the future
Python 3.6. I can even help them to support the new bytecode format.

Victor


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Victor Stinner
Oh, since others voted, I will also vote and explain my vote.

I like choice 1, str only, because it's very well defined. In Python
3, Unicode is simply the native type for text. It's accepted by almost
all functions. In other emails, I also explained that Unicode is fine
for storing undecodable filenames on UNIX; it has worked as expected
for many years (since Python 3.3).

--

If you cannot survive without bytes, I suggest adding two functions:
one for str only, another which can return str or bytes.

Maybe you want in fact two protocols: __fspath__(str only) and
__fspathb__ (bytes only)? os.fspathb() would first try __fspathb__, or
fallback to os.fsencode(__fspath__). os.fspath() would first try
__fspath__, or fallback to os.fsdecode(__fspathb__). IMHO it's not
worth having such complexity while Unicode handles all use cases.
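
A rough sketch of that two-protocol idea, only to show how much machinery it implies. Nothing here is an existing API: __fspathb__, fspath_sketch() and fspathb_sketch() are hypothetical names, and plain str/bytes arguments are deliberately left unhandled because the paragraph above does not say what should happen to them.

```python
import os


def _lookup(path, name):
    # Look the method up on the type, as is usual for protocol methods.
    return getattr(type(path), name, None)


def fspath_sketch(path):
    """str flavour: try __fspath__, else decode the result of __fspathb__."""
    method = _lookup(path, "__fspath__")
    if method is not None:
        return method(path)
    method = _lookup(path, "__fspathb__")
    if method is not None:
        return os.fsdecode(method(path))
    raise TypeError("object implements neither __fspath__ nor __fspathb__")


def fspathb_sketch(path):
    """bytes flavour: try __fspathb__, else encode the result of __fspath__."""
    method = _lookup(path, "__fspathb__")
    if method is not None:
        return method(path)
    method = _lookup(path, "__fspath__")
    if method is not None:
        return os.fsencode(method(path))
    raise TypeError("object implements neither __fspath__ nor __fspathb__")


class StrOnly:
    """Hypothetical path object implementing only the str protocol."""
    def __fspath__(self):
        return "/tmp/a"


class BytesOnly:
    """Hypothetical path object implementing only the bytes protocol."""
    def __fspathb__(self):
        return b"/tmp/b"


print(fspath_sketch(StrOnly()), fspathb_sketch(StrOnly()))
print(fspath_sketch(BytesOnly()), fspathb_sketch(BytesOnly()))
```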

Or do you know functions implemented in Python accepting str *and* bytes?

--

The C implementation of the os module has an important
path_converter() function:

 * path_converter accepts (Unicode) strings and their
 * subclasses, and bytes and their subclasses.  What
 * it does with the argument depends on the platform:
 *
 *   * On Windows, if we get a (Unicode) string we
 * extract the wchar_t * and return it; if we get
 * bytes we extract the char * and return that.
 *
 *   * On all other platforms, strings are encoded
 * to bytes using PyUnicode_FSConverter, then we
 * extract the char * from the bytes object and
 * return that.

This function will implement something like os.fspath().

With os.fspath() only accepting str, we will return the Unicode string
directly on Windows. On UNIX, Unicode will be encoded, as it's
already done for Unicode strings.
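
In Python terms, the behaviour described by that C comment is roughly the following. This is a sketch only: path_converter_sketch() and its windows flag are hypothetical, and the real function works on C pointers (wchar_t * and char *) rather than Python objects.

```python
import os


def path_converter_sketch(path, windows):
    """Mimic the platform split described in the path_converter comment."""
    if not isinstance(path, (str, bytes)):
        raise TypeError("path must be str or bytes")
    if windows:
        # Windows: str is used as wide characters, bytes as narrow characters.
        return path
    # Other platforms: str is encoded with the filesystem encoding;
    # os.fsencode() returns bytes input unchanged.
    return os.fsencode(path)


print(path_converter_sketch("spam.txt", windows=True))
print(path_converter_sketch("spam.txt", windows=False))
print(path_converter_sketch(b"spam.txt", windows=False))
```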

This specific function would benefit from flavor 4 (os.fspath() can
return str and bytes), but it's more an exception than the rule. It
would be more a micro-optimization than a good reason to drive the API
design.

Victor

Le mercredi 13 avril 2016, Brett Cannon  a écrit :
>
> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has the 
> four potential approaches implemented (although it doesn't follow the 
> "separate functions" approach some are proposing and instead goes with the 
> allow_bytes approach I originally proposed).


Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-13 Thread Ryan Gonzalez
So code that depends on iterating through bytecode via HAS_ARG is going to
break...

Darn it. :/

--
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something’s wrong.
http://kirbyfan64.github.io/
On Apr 13, 2016 4:44 PM, "Victor Stinner"  wrote:

> Le mercredi 13 avril 2016, Ryan Gonzalez  a écrit :
>
>> What is the value of HAS_ARG going to be now?
>>
>
> I asked Demur to keep HAS_ARG(). Not really for backward compatibility,
> but for the dis module: to keep a nice assembler. There are also debug
> traces in ceval.c which use it.
>
> For ceval.c, we might use HAS_ARG() to micro-optimize oparg=0 (hardcode 0
> rather than reading the bytecode) for operators with no argument. Or maybe
> it's completely useless :-)
>
> Victor
>


Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-13 Thread Victor Stinner
Le mercredi 13 avril 2016, Ryan Gonzalez  a écrit :

> What is the value of HAS_ARG going to be now?
>

I asked Demur to keep HAS_ARG(). Not really for backward compatibility, but
for the dis module: to keep a nice assembler. There are also debug traces
in ceval.c which use it.

For ceval.c, we might use HAS_ARG() to micro-optimize oparg=0 (hardcode 0
rather than reading the bytecode) for operators with no argument. Or maybe
it's completely useless :-)

Victor


[Python-Dev] Tag-based buildmaster (was: Most 3.x buildbots are green again ... )

2016-04-13 Thread Zachary Ware
(Cross-posting to python-buildbots, discussion is probably best continued there)

On Wed, Apr 13, 2016 at 3:37 PM, Brett Cannon  wrote:
> On Wed, 13 Apr 2016 at 13:17 Zachary Ware 
> wrote:
>> After receiving a suggestion from koobs several months ago, I've been
>> intermittently thinking about completely redoing our buildmaster setup
>> such that instead of a single builder per version on each slave, we
>> instead set up a series of builders with particular 'tags', and each
>> builder attaches to each slave that satisfies the tags (running each
>> build only on the first slave available).  This would allow us to test
>> some of the rarer options (such as --without-threads) significantly
>> more often than 'never', and generally get a lot more
>> customization/flexibility of builds.  I haven't had a chance to sit
>> down and think out all the edge cases of this idea, but what do people
>> generally think of it?  I think the GitHub switchover will be a good
>> time to do this if it's generally seen as a decent idea, since there
>> will need to be some work on the buildmaster to do the switch anyway.
>
> So we have slaves connect to multiple builders who have requirements of what
> they are testing? So the --without-threads master would have all slaves able
> to compile --without-threads connect to it and then do that build? And those
> same slaves may also connect to the gcc and clang masters to do those builds
> as well? So would that mean slaves could potentially do a bunch of builds
> per change? That sounds nice to me as long as the slave maintainers are also
> up to utilizing this by double/triple/quadrupling their builds.

Basically, yes.  I'm unsure as to whether the build would be done on
all matching slaves on each change, or rotate between them (or use the
next available) on each change; that would likely come down to which
scheme we collectively want.  I also have vague ideas about having
'daily' or even 'weekly' tags for builds that are deemed to not need a
build for every changeset, which could alleviate some of the
multiplying.
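
To make the matching rule concrete, a toy sketch follows. It is not buildbot configuration: the builder names, worker names ("workers" standing in for the slaves mentioned above) and tag names are all made up, and the only rule assumed is that a builder can run on any worker whose tags include everything the builder requires.

```python
# Hypothetical builders, each with the set of tags it requires.
builders = {
    "without-threads": {"unix"},
    "gcc": {"gcc"},
    "clang": {"clang"},
}

# Hypothetical workers, each with the set of tags it satisfies.
workers = {
    "ubuntu-1": {"unix", "gcc", "clang"},
    "freebsd-1": {"unix", "clang"},
    "win-1": {"windows", "msvc"},
}


def eligible_workers(required, available):
    """Return the sorted names of workers whose tags cover all required tags."""
    return sorted(name for name, tags in available.items() if required <= tags)


for builder, required in sorted(builders.items()):
    print(builder, "->", eligible_workers(required, workers))
```

Whether a build then runs on all eligible workers or only on the next free one is exactly the open scheduling question above.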

-- 
Zach


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Victor Stinner
Oops sorry, I forgot to add that I have no strong opinion on the type (I
only have a minor preference for str only).

Victor


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Victor Stinner
Le mercredi 13 avril 2016, Brett Cannon  a écrit :
>
> All of this is demonstrated in
> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 by
> the various possibilities. In the end it's not a corner case because the
> definition of __fspath__ will be such that there's no ambiguity in what
> os.fspath() will accept and what __fspath__ can return and the code will be
> written to conform to what the PEP dictates (IOW I'm aware that this needs
> to be considered in the implementation :) .
>

I'm not a big fan of a flag parameter to change the return type of a
function. Usually, two functions are preferred. In the os module we have
getcwd/getcwdb for example. I don't know if it's a good example.

Do you know other examples of Python functions taking a (flag) parameter to
change the result type?
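
For reference, the getcwd/getcwdb pair mentioned above looks like this in use: two functions with one fixed return type each, rather than one function with a type-changing flag.

```python
import os

cwd_text = os.getcwd()    # always returns str
cwd_bytes = os.getcwdb()  # always returns bytes

print(type(cwd_text).__name__, type(cwd_bytes).__name__)
```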

Victor


Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-13 Thread Eric Fahlgren
The EXTENDED_ARG is included in the multibyte ops, I treat it just like any
other operator.  Here's a snippet of my hacked-dis.dis output, which made
it clear to me that I could just count them as an "operator with word
operand."

Line 3000: x = x if x or not x and x is None else x
0001dc83 7c 00 00 LOAD_FAST            x
0001dc86 91 01 00 EXTENDED_ARG         1
0001dc89 70 9f dc JUMP_IF_TRUE_OR_POP  L1dc9f
0001dc8c 7c 00 00 LOAD_FAST            x
0001dc8f 0c       UNARY_NOT
0001dc90 91 01 00 EXTENDED_ARG         1
0001dc93 6f 9f dc JUMP_IF_FALSE_OR_POP L1dc9f
0001dc96 7c 00 00 LOAD_FAST            x
0001dc99 74 01 00 LOAD_GLOBAL          None
0001dc9c 6b 08 00 COMPARE_OP           'is'
  L1dc9f:
0001dc9f 91 01 00 EXTENDED_ARG         1
0001dca2 72 ab dc POP_JUMP_IF_FALSE    L1dcab
0001dca5 7c 00 00 LOAD_FAST            x
0001dca8 6e 03 00 JUMP_FORWARD         L1dcae (+3)
  L1dcab:
0001dcab 7c 00 00 LOAD_FAST            x
  L1dcae:
0001dcae 7d 00 00 STORE_FAST           x


On Wed, Apr 13, 2016 at 2:23 PM, Victor Stinner 
wrote:

> 2016-04-13 23:02 GMT+02:00 Eric Fahlgren :
> > Percentage of 1-byte args = 96.80%
>
> Yeah, I expected such high ratio. Good news that you confirm it.
>
>
> > Non-argument ops =  53,719
> > One-byte args    = 368,787
> > Multi-byte args  =  12,191
>
> Again, only a very few arguments take multiple bytes. Good, the
> bytecode will be smaller.
>
> IMHO it's more a nice side effect than a real goal. The runtime
> performance matters more than the size of the bytecode, it's not like
> a bytecode takes 4 MB. It's probably closer to 1 KB and so can probably
> benefit from the fastest CPU caches.
>
>
> > Just for the record, here's my arithmetic:
> > byteCodeSize = 1*nonArgumentOps + 3*oneByteArgs + 3*multiByteArgs
> > wordCodeSize = 2*nonArgumentOps + 2*oneByteArgs + 4*multiByteArgs
>
> If multiByteArgs means any size > 1 byte, the wordCodeSize formula is
> wrong:
>
> - no parameter: 2 bytes
> - 8-bit parameter: 2 bytes
> - 16-bit parameter: 4 bytes
> - 24-bit parameter: 6 bytes
> - 32-bit parameter: 8 bytes
>
> But you wrote that you didn't see EXTENDED_ARG, so I guess that
> multibyte means 16-bit in your case, and so your formula is correct.
>
> Fortunately, I don't expect 32-bit parameters in the wild, only 24-bit
> parameters for functions with annotations.
>
>
> > (It is interesting to note that I have never encountered an EXTENDED_ARG
> operator in the wild, only in my own synthetic examples.)
>
> As I wrote, EXTENDED_ARG can be seen when MAKE_FUNCTION is used with
> annotations.
>
> Victor
>


Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-13 Thread Ryan Gonzalez
What is the value of HAS_ARG going to be now?

--
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something’s wrong.
http://kirbyfan64.github.io/
On Apr 13, 2016 11:26 AM, "Victor Stinner"  wrote:

> Hi,
>
> In the middle of recent discussions about Python performance, it was
> discussed to change the Python bytecode. Serhiy proposed to reuse
> MicroPython short bytecode to reduce the disk space and reduce the
> memory footprint.
>
> Demur Rumed proposes a different change to use a regular bytecode
> using 16-bit units: an instruction has always one 8-bit argument, it's
> zero if the instruction doesn't have an argument:
>
>   http://bugs.python.org/issue26647
>
> According to benchmarks, it looks faster:
>
>   http://bugs.python.org/issue26647#msg263339
>
> IMHO it's a nice enhancement: it makes the code simpler. The most
> interesting change is made in Python/ceval.c:
>
> -if (HAS_ARG(opcode))
> -oparg = NEXTARG();
> +oparg = NEXTARG();
>
> This code is the very hot loop evaluating Python bytecode. I expect
> that removing a conditional branch here can reduce the CPU branch
> misprediction.
>
> I reviewed first versions of the change, and IMHO it's almost ready to
> be merged. But I would prefer to have a review from at least a second
> core reviewer.
>
> Can someone please review the change?
>
> --
>
> The side effect of wordcode is that arguments in 0..255 now use 2
> bytes per instruction instead of 3, so it also reduces the size of
> bytecode for the most common case.
>
> Larger argument, 16-bit argument (0..65,535), now uses 4 bytes instead
> of 3. Arguments are supported up to 32-bit: 24-bit uses 3 units (6
> bytes), 32-bit uses 4 units (8 bytes). MAKE_FUNCTION uses 16-bit
> argument for keyword defaults and 24-bit argument for annotations.
> Other common instruction known to use large argument are jumps for
> bytecode longer than 256 bytes.
>
> --
>
> Right now, ceval.c still fetches opcode and then oparg with two 8-bit
> instructions. Later, we can discuss if it would be possible to ensure
> that the bytecode is always aligned to 16-bit in memory to fetch the
> two bytes using a uint16_t* pointer.
>
> Maybe we can overallocate 1 byte in codeobject.c and align manually
> the memory block if needed. Or ceval.c should maybe copy the code if
> it's not aligned?
>
> Raymond Hettinger proposes something like that, but it looks like
> there are concerns about non-aligned memory accesses:
>
>   http://bugs.python.org/issue25823
>
> The cost of non-aligned memory accesses depends on the CPU
> architecture, but it can raise a SIGBUS on some arch (MIPS and
> SPARC?).
>
> Victor
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com
>


Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-13 Thread Victor Stinner
2016-04-13 23:02 GMT+02:00 Eric Fahlgren :
> Percentage of 1-byte args = 96.80%

Yeah, I expected such high ratio. Good news that you confirm it.


> Non-argument ops =  53,719
> One-byte args    = 368,787
> Multi-byte args  =  12,191

Again, only a very few arguments take multiple bytes. Good, the
bytecode will be smaller.

IMHO it's more a nice side effect than a real goal. The runtime
performance matters more than the size of the bytecode, it's not like
a bytecode takes 4 MB. It's probably closer to 1 KB and so can probably
benefit from the fastest CPU caches.


> Just for the record, here's my arithmetic:
> byteCodeSize = 1*nonArgumentOps + 3*oneByteArgs + 3*multiByteArgs
> wordCodeSize = 2*nonArgumentOps + 2*oneByteArgs + 4*multiByteArgs

If multiByteArgs means any size > 1 byte, the wordCodeSize formula is wrong:

- no parameter: 2 bytes
- 8-bit parameter: 2 bytes
- 16-bit parameter: 4 bytes
- 24-bit parameter: 6 bytes
- 32-bit parameter: 8 bytes
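
The size table above can be written as a small helper. This is a sketch, not code from the patch: wordcode_size() is a hypothetical name, and it assumes each extra 8 bits of argument costs one more 2-byte unit, as listed.

```python
def wordcode_size(arg):
    """Return the size in bytes of one instruction under the 16-bit-unit scheme.

    arg is None for an instruction without a parameter, otherwise an
    integer in the range 0 .. 2**32 - 1.
    """
    if arg is None:
        return 2  # opcode byte plus a zero argument byte
    if not 0 <= arg < 2 ** 32:
        raise ValueError("argument must fit in 32 bits")
    units = 1
    while arg > 0xFF:
        # Each additional byte of argument needs one more 2-byte unit.
        arg >>= 8
        units += 1
    return 2 * units


for example in (None, 0xFF, 0xFFFF, 0xFFFFFF, 0xFFFFFFFF):
    print(example, wordcode_size(example))
```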

But you wrote that you didn't see EXTENDED_ARG, so I guess that
multibyte means 16-bit in your case, and so your formula is correct.

Fortunately, I don't expect 32-bit parameters in the wild, only 24-bit
parameters for functions with annotations.


> (It is interesting to note that I have never encountered an EXTENDED_ARG 
> operator in the wild, only in my own synthetic examples.)

As I wrote, EXTENDED_ARG can be seen when MAKE_FUNCTION is used with
annotations.

Victor


Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-13 Thread Eric Fahlgren
On Wednesday, April 13, 2016 09:25, Victor Stinner wrote:
> The side effect of wordcode is that arguments in 0..255 now use 2 bytes per
> instruction instead of 3, so it also reduces the size of bytecode for the most
> common case.
> 
> Larger argument, 16-bit argument (0..65,535), now uses 4 bytes instead of 3.
> Arguments are supported up to 32-bit: 24-bit uses 3 units (6 bytes), 32-bit
> uses 4 units (8 bytes). MAKE_FUNCTION uses 16-bit argument for keyword
> defaults and 24-bit argument for annotations.
> Other common instruction known to use large argument are jumps for bytecode
> longer than 256 bytes.

A couple months ago during an earlier discussion of wordcode, I got curious 
enough to instrument dis.dis so that I could calculate the actual size changes 
expected in practice.  I ran it on a large chunk of our product code, here are 
the results (looks best with a fixed font).  I suspect the fairly significant 
reduction in footprint will also give better cache hit characteristics, so we 
might see some "magic" speed ups from that, too.

Code-generating source lines =    70,792
Total bytes                  = 1,196,653
Argument-bearing operators   =   380,978
Operands over 1 byte long    =    12,191
Extended arguments           =         0
Percentage of 1-byte args    =    96.80%

Total operators              =   434,697
Non-argument ops             =    53,719
One-byte args                =   368,787
Multi-byte args              =    12,191
Byte code size               = 1,196,653
Word code size               =   893,776
Word:byte size               =    74.69%

Just for the record, here's my arithmetic:
byteCodeSize = 1*nonArgumentOps + 3*oneByteArgs + 3*multiByteArgs
wordCodeSize = 2*nonArgumentOps + 2*oneByteArgs + 4*multiByteArgs
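
The same arithmetic, spelled out with the counts from the table above so the totals can be re-checked. The variable names are illustrative only; the formulas are the two given above.

```python
non_argument_ops = 53719
one_byte_args = 368787
multi_byte_args = 12191

# Old layout: 1 byte without an argument, 3 bytes with any argument.
byte_code_size = 1 * non_argument_ops + 3 * one_byte_args + 3 * multi_byte_args
# Wordcode: 2 bytes for none or one-byte arguments, 4 bytes for 16-bit arguments.
word_code_size = 2 * non_argument_ops + 2 * one_byte_args + 4 * multi_byte_args

size_ratio = "%.2f%%" % (100.0 * word_code_size / byte_code_size)
one_byte_share = "%.2f%%" % (100.0 * one_byte_args / (one_byte_args + multi_byte_args))

print(byte_code_size, word_code_size, size_ratio, one_byte_share)
```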

(It is interesting to note that I have never encountered an EXTENDED_ARG 
operator in the wild, only in my own synthetic examples.)



Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Random832
On Wed, Apr 13, 2016, at 16:39, Chris Barker wrote:
> so are we worried that __fspath__ will exist and be callable, but  might
> raise an AttributeError somewhere inside itself? if so isn't it broken
> anyway, so should it be ignored?

Well, if you're going to say "ignore the protocol because it's broken",
where do you stop? What if it raises some other exception? What if it
raises SystemExit? 


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Brett Cannon
On Wed, 13 Apr 2016 at 13:40 Chris Barker  wrote:

> so are we worried that __fspath__ will exist and be callable, but  might
> raise an AttributeError somewhere inside itself? if so isn't it broken
> anyway, so should it be ignored?
>

It should propagate instead of swallowing up the exception, otherwise it's
hard to debug why __fspath__ seems to be ignored.
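
A small demonstration of the difference, as a sketch only: BrokenPath, broad() and narrow() are made-up names, and neither function is the proposed implementation. The broad version wraps the call itself in the try block; the narrow version only guards the attribute lookup, as in the variant quoted further down.

```python
class BrokenPath:
    """Hypothetical path object whose __fspath__ contains a bug."""
    def __fspath__(self):
        return self.missing_attribute  # raises AttributeError inside the method


def broad(path):
    # The try block covers the call, so the internal AttributeError is swallowed.
    try:
        return path.__fspath__()
    except AttributeError:
        return path


def narrow(path):
    # Only the lookup is guarded; errors raised by the method propagate.
    try:
        method = type(path).__fspath__
    except AttributeError:
        return path
    return method(path)


broken = BrokenPath()
print(broad(broken) is broken)  # the bug is hidden and the object is returned
try:
    narrow(broken)
except AttributeError as exc:
    print("propagated:", exc)
```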


>
> and I know it's asking permission rather than forgiveness, but what's
> wrong with:
>
> if hasattr(path, "__fspath__"):
>     path = path.__fspath__()
>
> if you really want to check for the existence of the attribute first?
>
>
Nothing.


> or even:
>
> path = path.__fspath__() if hasattr(path, "__fspath__") else path
>
>
That also works.


>
> (OK, really a Pythonic style question now)
>

Yes, this is getting a bit side-tracked over some example code to just get
a concept across.

-Brett


>
> -CHB
>
>
>
> On Wed, Apr 13, 2016 at 12:54 PM, Brett Cannon  wrote:
>
>>
>>
>> On Wed, 13 Apr 2016 at 12:39 Fred Drake  wrote:
>>
>>> On Wed, Apr 13, 2016 at 3:24 PM, Chris Angelico 
>>> wrote:
>>> > Is that the intention, or should the exception catching be narrower? I
>>> > know it's clunky to write it in Python, but AIUI it's less so in C:
>>> >
>>> > try:
>>> >     callme = path.__fspath__
>>> > except AttributeError:
>>> >     pass
>>> > else:
>>> >     path = callme()
>>>
>>> +1 for this variant; I really don't like masking errors inside the
>>> __fspath__ implementation.
>>>
>>
>> Don't read too much into the code in that gist. I just did them quickly
>> to get the point across of the proposals in terms of str/bytes, not what
>> will be proposed in any final patch.
>>
>> ___
>> Python-Dev mailing list
>> Python-Dev@python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>>
> Unsubscribe:
>> https://mail.python.org/mailman/options/python-dev/chris.barker%40noaa.gov
>>
>>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> chris.bar...@noaa.gov
>


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Chris Barker
so are we worried that __fspath__ will exist and be callable, but  might
raise an AttributeError somewhere inside itself? if so isn't it broken
anyway, so should it be ignored?

and I know it's asking permission rather than forgiveness, but what's
wrong with:

if hasattr(path, "__fspath__"):
    path = path.__fspath__()

if you really want to check for the existence of the attribute first?

or even:

path = path.__fspath__() if hasattr(path, "__fspath__") else path


(OK, really a Pythonic style question now)

-CHB



On Wed, Apr 13, 2016 at 12:54 PM, Brett Cannon  wrote:

>
>
> On Wed, 13 Apr 2016 at 12:39 Fred Drake  wrote:
>
>> On Wed, Apr 13, 2016 at 3:24 PM, Chris Angelico  wrote:
>> > Is that the intention, or should the exception catching be narrower? I
>> > know it's clunky to write it in Python, but AIUI it's less so in C:
>> >
>> > try:
>> >     callme = path.__fspath__
>> > except AttributeError:
>> >     pass
>> > else:
>> >     path = callme()
>>
>> +1 for this variant; I really don't like masking errors inside the
>> __fspath__ implementation.
>>
>
> Don't read too much into the code in that gist. I just did them quickly to
> get the point across of the proposals in terms of str/bytes, not what will
> be proposed in any final patch.
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/chris.barker%40noaa.gov
>
>


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!

2016-04-13 Thread Brett Cannon
On Wed, 13 Apr 2016 at 13:17 Zachary Ware 
wrote:

> [SNIP]
> ---
>
> After receiving a suggestion from koobs several months ago, I've been
> intermittently thinking about completely redoing our buildmaster setup
> such that instead of a single builder per version on each slave, we
> instead set up a series of builders with particular 'tags', and each
> builder attaches to each slave that satisfies the tags (running each
> build only on the first slave available).  This would allow us to test
> some of the rarer options (such as --without-threads) significantly
> more often than 'never', and generally get a lot more
> customization/flexibility of builds.  I haven't had a chance to sit
> down and think out all the edge cases of this idea, but what do people
> generally think of it?  I think the GitHub switchover will be a good
> time to do this if it's generally seen as a decent idea, since there
> will need to be some work on the buildmaster to do the switch anyway.
>

So we have slaves connect to multiple builders who have requirements of
what they are testing? So the --without-threads master would have all
slaves able to compile --without-threads connect to it and then do that
build? And those same slaves may also connect to the gcc and clang masters
to do those builds as well? So would that mean slaves could potentially do
a bunch of builds per change? That sounds nice to me as long as the slave
maintainers are also up to utilizing this by double/triple/quadrupling
their builds.


Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!

2016-04-13 Thread Zachary Ware
On Wed, Apr 13, 2016 at 6:40 AM, Victor Stinner
 wrote:
> Hi,
>
> In recent months, most 3.x buildbots failed randomly. Some of them were
> always failing. I spent some time to fix almost all Windows and Linux
> buildbots. There were a lot of different issues.

Thank you for doing this!

> Maybe it's time to move more 3.x buildbots to the "stable" category?
> http://buildbot.python.org/all/waterfall?category=3.x.stable

A few months ago, I put together a list of suggestions for updating
the stable/unstable list, but never got around to implementing it.

> We have many offline buildbots. What's the status of these buildbots?
> Should we expect that they come back soon?

My Windows 8.1 bot is a VM that resides on a machine that has been
disturbingly unstable lately, and it's starting to seem like the
instability is due to that VM.  I hope to have it back up (and stable)
again soon, but have no timetable for it.  My Docs bot was off after
losing power over the weekend, and I just hadn't noticed yet.  It's
back now.

I'll ping the python-buildbots list about other offline bots.

> Or would it be possible to hide them? It would help to check the
> status of all buildbots.

I'm not sure, but that would be a nice feature.

> - the 4 ICC buildbots are failing with stack overflow, segfault, etc.
> Again, I'm not sure that these buildbots are useful since it looks
> like we don't support this compiler yet. Or does it help to work on
> supporting this compiler? Who is working on ICC support?

The Ubuntu ICC bot is generally quite stable.  The OSX ICC bot is
currently offline, but has only a couple of known issues.  The Windows
ICC bot is still a bit experimental, but has inched closer to
producing a working build.  R. David Murray and I have been working
with Intel on ICC support.

> By the way, I'm always surprised by the huge difference of time needed
> to run a build on the different slaves: from a few minutes to more
> than 3 hours. The fastest Windows slave takes 28 minutes (run tests in
> parallel using 4 child processes), whereas the 3 others (which run tests
> sequentially) take between 2 hours and more than 3 hours! Why does
> running tests on Windows take so long?

Most of that is down to debug mode; building Python in debug mode
links with the debug CRT which also enables all manner of extra
checks.  When it's up, the non-debug Windows bot also runs the test
suite in ~28 minutes, running sequentially.

---

After receiving a suggestion from koobs several months ago, I've been
intermittently thinking about completely redoing our buildmaster setup
such that instead of a single builder per version on each slave, we
instead set up a series of builders with particular 'tags', and each
builder attaches to each slave that satisfies the tags (running each
build only on the first slave available).  This would allow us to test
some of the rarer options (such as --without-threads) significantly
more often than 'never', and generally get a lot more
customization/flexibility of builds.  I haven't had a chance to sit
down and think out all the edge cases of this idea, but what do people
generally think of it?  I think the GitHub switchover will be a good
time to do this if it's generally seen as a decent idea, since there
will need to be some work on the buildmaster to do the switch anyway.

-- 
Zach


Re: [Python-Dev] List posting custom [was: current status of discussions]

2016-04-13 Thread Koos Zevenhoven
On Wed, Apr 13, 2016 at 5:56 AM, Stephen J. Turnbull  wrote:
> The following is my opinion, as will become obvious, but it's based on
> over a decade of observing these lists, and other open source
> development lists.  In a context where some core developers have
> unsubscribed from these lists, and others regularly report muting
> threads with a certain air of asperity, I think it's worth the risk of
> seeming arrogant to explain some of the customs (which are complex and
> subtle) around posting to Python developer lists.  I'm posting
> publicly because there are several new developers whose activity and
> fresh perspective is very welcome, but harmony *is* being disturbed,
> IMO unnecessarily.
>

Thank you for this thoughtful post. While none of the quotes you refer
to are mine, I did try to find whether any of the advice is something
I should learn from. While I didn't find a whole lot (please do
correct me if you think otherwise), it is also valuable to hear these
things from someone more experienced, even just to confirm what I may
have thought or guessed. I can't really tell, but possibly some of the
thoughts are interesting even to people significantly more experienced
than me.

I know you are not interested in discussing this further here, but
I'll add some inexperienced points of view inline below, just in case
someone is interested:

> This particular post caught my eye, but it's only an example of one of
> the most unharmonious posting styles that has become common recently.
> Attribution deliberately removed.
>
>  > Sorry for disturbing this thread's harmony.
>
> *sigh*  There is way too much of this on Python-Ideas recently, and
> there shouldn't be any on Python-Dev.  So please don't.  Specifically,
> disagreement with an apparently developing consensus is fine but
> please avoid this:
>
>  > >> Path is an alternative to os.path -- you don't need to use both.
>  >
>  > I agree with that quote of Chris.
>
> It's a waste of time to post *what* you agree with.[1]  Decisions are
> not taken by vote in this community, except for the color of the
> bikeshed, where it is agreed that *what* decision is taken doesn't
> matter, but that some decision should be taken expeditiously.[2]
> Chris already stated this position clearly and it's not a "color", so
> there is no need to reiterate.  It simply wastes others' time to read
> it.  (Whether it was a waste of the poster's time is not for me to
> comment on.)
>
> What matters to the decision is *why* you agree (or disagree).  If you
> think that some of Chris's arguments are bogus (and should be
> disregarded) and others are important, that is valuable information.
> It's even better if you can shed additional light on the matter
> (example below).
>
> Also, expression of agreement is often a prelude to a request for
> information.  "I agree with Z's post.  At least, I have never needed
> X.  *When* do you need X?  Let's look for a better way than X!"
>

That's what I thought too. I remember several times recently that I
have mentioned I agreed about something, then continuing to add more
to it, or even saying I disagree about something else. Part of the
reason to also state that I agree is an attempt to keep the overall
tone more positive. After all, the other person might be a highly
experienced core developer who just did not happen to have gone through
all the same thoughts regarding that specific question recently. I
hope that has not been interpreted as arrogance such as "I know better
than these people".

For me, as one of the (many?) newcomers, especially on -dev, it can
sometimes be difficult to tell whether not getting a reaction means
"Good point, I agree", "I did not understand so I'll just ignore it",
"I don't want to argue with you" or something else. Then again,
someone just saying essentially the same thing without a reference a
few posts later just feels strange. Also, if the only thing people
apparently do is disagree about things, it makes the overall tone of
the discussions at least *seem* very negative. From this point of view
there seems to be some good in positive comments.

> Unsupported (dis)agreement to statements about "needs" also may be
> taken as *rude*, because others may infer your arrogant claim to know
> what *they* do or don't need.  Admittedly there's a difficult
> distinction here between Chris's *idiom* where "you don't need to"
> translates to "In my understanding, it is generally not necessary to",
> and your *unsupported* agreement, which in my dialect of English
> changes the emphasis to imply you know better than those who disagree
> with you and Chris.  And, of course, the position that others are "too
> easily offended" is often reasonable, but you should be aware that
> there will be an impact on your reputation and ability to influence
> development of Python (even if it doesn't come near the point where
> a moderator invokes "Code of Conduct").
>
> "Me too" posts aren't entirely forbidden, but I feel that 

Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Brett Cannon
On Wed, 13 Apr 2016 at 12:39 Fred Drake  wrote:

> On Wed, Apr 13, 2016 at 3:24 PM, Chris Angelico  wrote:
> > Is that the intention, or should the exception catching be narrower? I
> > know it's clunky to write it in Python, but AIUI it's less so in C:
> >
> > try:
> >     callme = path.__fspath__
> > except AttributeError:
> >     pass
> > else:
> >     path = callme()
>
> +1 for this variant; I really don't like masking errors inside the
> __fspath__ implementation.
>

Don't read too much into the code in that gist. I just did them quickly to
get the point across of the proposals in terms of str/bytes, not what will
be proposed in any final patch.


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Chris Angelico
On Thu, Apr 14, 2016 at 5:46 AM, Random832  wrote:
> On Wed, Apr 13, 2016, at 15:24, Chris Angelico wrote:
>> Is that the intention, or should the exception catching be narrower? I
>> know it's clunky to write it in Python, but AIUI it's less so in C:
>
> How is it less so in C? You lose the ability to PyObject_CallMethod.

I might be wrong, then. Wasn't sure how it was all implemented.
Anyway, it's a correctness thing, not a simplicity one, so even if it
is clunkier, it ought to be the case.

And that is the intention, so we're fine.

ChrisA


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Random832
On Wed, Apr 13, 2016, at 15:24, Chris Angelico wrote:
> Is that the intention, or should the exception catching be narrower? I
> know it's clunky to write it in Python, but AIUI it's less so in C:

How is it less so in C? You lose the ability to PyObject_CallMethod.


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Alexander Walters

On 4/13/2016 13:49, Ethan Furman wrote:
> Number 3: it allows bytes, but only when told it's okay to do so.
> Having code get a bytes object when one is not expected is not a
> headache we need to inflict on anyone.


This is an artifact of the other needless restrictions I said I wouldn't 
rant about.  I think it is in the best interest not to perpetuate those 
needless restrictions.



Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Chris Angelico
On Thu, Apr 14, 2016 at 5:30 AM, Brett Cannon  wrote:
>
>
> On Wed, 13 Apr 2016 at 12:25 Chris Angelico  wrote:
>>
>> On Thu, Apr 14, 2016 at 3:10 AM, Brett Cannon  wrote:
>> > https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has
>> > the four potential approaches implemented (although it doesn't follow the
>> > "separate functions" approach some are proposing and instead goes with
>> > the allow_bytes approach I originally proposed).
>>
>> All of them have this construct:
>>
>> try:
>>     path = path.__fspath__()
>> except AttributeError:
>>     pass
>>
>> Is that the intention, or should the exception catching be narrower? I
>> know it's clunky to write it in Python, but AIUI it's less so in C:
>>
>> try:
>>     callme = path.__fspath__
>> except AttributeError:
>>     pass
>> else:
>>     path = callme()
>
>
> I'm assuming the C code will do what you're suggesting. My way is just
> faster to write in 2 minutes of coding. :)

Cool cool. Just checking!

You're already aware that my preference is for the first one,
str-only. I don't think the second one has much value (a path-like
object can only ever return a str, but a bytes can be passed through
unchanged?), and the fourth strikes me as a bad idea (just allowing
bytes any time). So my votes are +1, -0.5, +0, -1.

ChrisA


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Fred Drake
On Wed, Apr 13, 2016 at 3:24 PM, Chris Angelico  wrote:
> Is that the intention, or should the exception catching be narrower? I
> know it's clunky to write it in Python, but AIUI it's less so in C:
>
> try:
>     callme = path.__fspath__
> except AttributeError:
>     pass
> else:
>     path = callme()

+1 for this variant; I really don't like masking errors inside the
__fspath__ implementation.
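
A small sketch of the kind of masking at issue, assuming a hypothetical
BuggyPath class whose __fspath__ trips over an unrelated AttributeError.
The names fspath_broad and fspath_narrow are illustrative only; they mirror
the two constructs quoted above and are not the proposed implementation.

```python
class BuggyPath:
    """Hypothetical path-like object with a bug inside __fspath__."""

    def __fspath__(self):
        # Bug: this attribute is never set, so the lookup raises
        # AttributeError from inside the implementation.
        return self.missing_attribute


def fspath_broad(path):
    """Call __fspath__ inside the try block (the broad construct)."""
    try:
        path = path.__fspath__()
    except AttributeError:
        pass  # also swallows an AttributeError raised inside __fspath__
    return path


def fspath_narrow(path):
    """Guard only the attribute lookup, then call outside the try block."""
    try:
        callme = path.__fspath__
    except AttributeError:
        pass
    else:
        path = callme()  # errors from the implementation propagate
    return path
```

With the broad version, fspath_broad(BuggyPath()) quietly hands back the
BuggyPath instance as if it had no __fspath__ at all; the narrow version
lets the AttributeError from the buggy method surface.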


  -Fred

-- 
Fred L. Drake, Jr.
"A storm broke loose in my mind."  --Albert Einstein


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Brett Cannon
On Wed, 13 Apr 2016 at 12:25 Chris Angelico  wrote:

> On Thu, Apr 14, 2016 at 3:10 AM, Brett Cannon  wrote:
> > https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1
> > has the four potential approaches implemented (although it doesn't follow
> > the "separate functions" approach some are proposing and instead goes with
> > the allow_bytes approach I originally proposed).
>
> All of them have this construct:
>
> try:
>     path = path.__fspath__()
> except AttributeError:
>     pass
>
> Is that the intention, or should the exception catching be narrower? I
> know it's clunky to write it in Python, but AIUI it's less so in C:
>
> try:
>     callme = path.__fspath__
> except AttributeError:
>     pass
> else:
>     path = callme()
>

I'm assuming the C code will do what you're suggesting. My way is just
faster to write in 2 minutes of coding. :)


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Chris Angelico
On Thu, Apr 14, 2016 at 3:10 AM, Brett Cannon  wrote:
> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has the
> four potential approaches implemented (although it doesn't follow the
> "separate functions" approach some are proposing and instead goes with the
> allow_bytes approach I originally proposed).

All of them have this construct:

try:
    path = path.__fspath__()
except AttributeError:
    pass

Is that the intention, or should the exception catching be narrower? I
know it's clunky to write it in Python, but AIUI it's less so in C:

try:
    callme = path.__fspath__
except AttributeError:
    pass
else:
    path = callme()

ChrisA


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Antoine Pitrou
Brett Cannon  python.org> writes:
> In the spirit of Python 3 I feel like bytes might not be needed in practice,
> but something like this with defaults of False will allow people to easily
> test all the various options.
>
> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has
> the four potential approaches implemented (although it doesn't follow the
> "separate functions" approach some are proposing and instead goes with the
> allow_bytes approach I originally proposed).

Either number 1 or number 3 for me (I don't think bytes path-like
objects are useful in Python).

Regards

Antoine.


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Ethan Furman

On 04/13/2016 10:22 AM, Alexander Walters wrote:
> On 4/13/2016 13:10, Brett Cannon wrote:
>> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1
>> has the four potential approaches implemented (although it doesn't
>> follow the "separate functions" approach some are proposing and
>> instead goes with the allow_bytes approach I originally proposed).
>
> Number 4 is my personal favorite - it has a simple control flow path and
> is the least needlessly restrictive.

Number 3: it allows bytes, but only when told it's okay to do so.
Having code get a bytes object when one is not expected is not a
headache we need to inflict on anyone.
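
A rough sketch of what "number 3" amounts to, going only by the descriptions
in this thread (the gist itself is not reproduced here, so the details are
assumptions). The name fspath_sketch, the Demo class and the error message
are illustrative; allow_bytes is keyword-only and defaults to False, and the
method is looked up on the type with only the lookup guarded.

```python
def fspath_sketch(path, *, allow_bytes=False):
    """Return a str path, or bytes only when the caller opts in."""
    # Guard only the attribute lookup, not the call itself.
    try:
        method = type(path).__fspath__
    except AttributeError:
        pass
    else:
        path = method(path)

    if isinstance(path, str):
        return path
    if allow_bytes and isinstance(path, bytes):
        return path
    expected = "str or bytes" if allow_bytes else "str"
    raise TypeError(f"expected {expected}, got {type(path).__name__}")


class Demo:
    """Hypothetical path-like object used to exercise the sketch."""

    def __fspath__(self):
        return "/tmp/demo"
```

Under these assumptions, fspath_sketch(Demo()) gives back a str,
fspath_sketch(b"x", allow_bytes=True) passes the bytes through, and
fspath_sketch(b"x") raises TypeError, so nobody gets bytes by surprise.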


--
~Ethan~



Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Alexander Walters

On 4/13/2016 13:10, Brett Cannon wrote:
> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has
> the four potential approaches implemented (although it doesn't follow
> the "separate functions" approach some are proposing and instead goes
> with the allow_bytes approach I originally proposed).


Number 4 is my personal favorite - it has a simple control flow path and 
is the least needlessly restrictive.


(I could rant about needless restrictions, but I am about a decade late
for that, so I won't bother.)



Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Brett Cannon
On Wed, 13 Apr 2016 at 09:52 Random832  wrote:

> On Wed, Apr 13, 2016, at 11:28, Ethan Furman wrote:
> > On 04/13/2016 08:17 AM, Random832 wrote:
> > > On Wed, Apr 13, 2016, at 10:21, Nick Coghlan wrote:
> >
> > >> I'd expect the main consumers to be os and os.path, and would honestly
> > >> be surprised if we needed many explicit invocations above that layer,
> > >> other than in pathlib itself.
> > >
> > > I made a toy implementation to try this out, and making os.open support
> > > it does not get you builtin open "for free" as I had suspected; builtin
> > > open has its own type checks in _iomodule.c.
> >
> > Yup, it will take some effort to make this work.
>
> A corner case just occurred to me...
>
> For functions that will continue to accept str/bytes (and functions that
> accept some other type such as Number or file-like objects), what should
> be done with an object that is one of these, *and* has an __fspath__
> method, *and* this method returns a value other than the object's own
> value? Basically, should the protocol check be done unconditionally
> (before attempting to use the argument as a string) or only if the
> argument is not a string (there's an efficiency argument for this). Or
> should it be left "unspecified", with the understanding that such
> objects are badly behaved and may not be handled consistently across
> different functions / python implementations / cpython versions?
>
> Also, should the os.fspath (or whatever we call it) function itself
> accept str/bytes, even if these are not going to implement the protocol?
>

All of this is demonstrated in
https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 by the
various possibilities. In the end it's not a corner case because the
definition of __fspath__ will be such that there's no ambiguity in what
os.fspath() will accept and what __fspath__ can return and the code will be
written to conform to what the PEP dictates (IOW I'm aware that this needs
to be considered in the implementation :) .


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Brett Cannon
On Tue, 12 Apr 2016 at 22:38 Michael Mysinger via Python-Dev <
python-dev@python.org> wrote:

> Ethan Furman  stoneleaf.us> writes:
>
> > Do we allow bytes to be returned from os.fspath()?  If yes, then do we
> > allow bytes from __fspath__()?
>
> De-lurking. Especially since the ultimate goal is better interoperability,
> I feel like an implementation that people can play with would help guide
> the few remaining decisions. To help test the various options you could
> temporarily add a _allow_bytes=GLOBAL_CONFIG_OPTION default argument to
> both pathlib.__fspath__() and os.fspath(), with distinct configurable
> defaults for each.
>
> In the spirit of Python 3 I feel like bytes might not be needed in
> practice, but something like this with defaults of False will allow people
> to easily test all the various options.
>

https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has
the four potential approaches implemented (although it doesn't follow the
"separate functions" approach some are proposing and instead goes with the
allow_bytes approach I originally proposed).


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Ethan Furman
On 04/13/2016 09:58 AM, Brett Cannon wrote:
> On Wed, 13 Apr 2016 at 09:19 Fred Drake wrote:


>> I do the same, but... this is one of those cases where a caller will
>> usually be passing a constant directly. If passed as a positional
>> argument, it'll just be confusing ("what's True?" is my usual
>> reaction to a Boolean positional argument).
>
> It would be keyword-only so this isn't even a possibility.
>
>> If passed as a keyword argument
>> with a descriptive name, it'll be longer than I'd like to see:
>>
>>  path_str = os.fspath(path, allow_bytes=True)
>
> I think the expectation that the number of people actually directly
> calling this function with that argument specified is going to be
> rather small, so the common-case will simply be:
>
>  path_str = os.fspath(path)

That is certainly my expectation.  :)

>> Names like os.fspath() and os.fssyspath() seem good to me.

A single function is definitely my preference, but if that's not 
possible then I'm fine with that pair of names.


--
~Ethan~


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Nikolaus Rath
On Apr 13 2016, Ethan Furman  wrote:
> (I'm not very good at keeping similar sounding functions separate --
> what's the difference between shutil.copy and shutil.copy2?  I have to
> look it up every time).

Well, "2" is more than "" (or 1), so copy2() copies *more* than copy() -
it includes the metadata. That always helps me.
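
A quick way to see the difference, as a sketch: backdate a file's
modification time, then copy it both ways. The file names and the timestamp
are arbitrary choices for the demonstration.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.txt")
    with open(src, "w") as f:
        f.write("hello")

    # Backdate the source so the difference is visible.
    old = 1_000_000_000  # an arbitrary timestamp in 2001
    os.utime(src, (old, old))

    # copy() copies the data and permission bits; copy2() also tries to
    # preserve other metadata such as the modification time.
    plain = shutil.copy(src, os.path.join(tmp, "copy.txt"))
    meta = shutil.copy2(src, os.path.join(tmp, "copy2.txt"))

    mtime_plain = os.stat(plain).st_mtime
    mtime_meta = os.stat(meta).st_mtime
```

After this runs, mtime_meta matches the backdated timestamp, while
mtime_plain is simply the time the copy was made.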


Best,
-Nikolaus
-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Brett Cannon
On Wed, 13 Apr 2016 at 09:19 Fred Drake  wrote:

> On Wed, Apr 13, 2016 at 11:09 AM, Ethan Furman  wrote:
> > - a single os.fspath() with an allow_bytes parameter
> >   (mostly True in os and os.path, mostly False everywhere
> >   else)
>
> -0
>
> > - a str-only os.fspathname() and a str/bytes os.fspath()
>
> +1 on using separate functions.
>
> > I'm partial to the first choice as it is simplicity itself to know when
> > looking at it if bytes might be coming back by the presence or absence
> > of a second argument to the call; otherwise one has to keep straight in
> > one's head which is str-only and which might allow bytes (I'm not very
> > good at keeping similar sounding functions separate -- what's the
> > difference between shutil.copy and shutil.copy2?  I have to look it up
> > every time).
>
> I do the same, but... this is one of those cases where a caller will
> usually be passing a constant directly. If passed as a positional
> argument, it'll just be confusing ("what's True?" is my usual reaction
> to a Boolean positional argument).


It would be keyword-only so this isn't even a possibility.


> If passed as a keyword argument
> with a descriptive name, it'll be longer than I'd like to see:
>
> path_str = os.fspath(path, allow_bytes=True)
>

I think the expectation that the number of people actually directly calling
this function with that argument specified is going to be rather small, so
the common-case will simply be:

path_str = os.fspath(path)


>
> Names like os.fspath() and os.fssyspath() seem good to me.
>

-Brett


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Random832
On Wed, Apr 13, 2016, at 11:28, Ethan Furman wrote:
> On 04/13/2016 08:17 AM, Random832 wrote:
> > On Wed, Apr 13, 2016, at 10:21, Nick Coghlan wrote:
> 
> >> I'd expect the main consumers to be os and os.path, and would honestly
> >> be surprised if we needed many explicit invocations above that layer,
> >> other than in pathlib itself.
> >
> > I made a toy implementation to try this out, and making os.open support
> > it does not get you builtin open "for free" as I had suspected; builtin
> > open has its own type checks in _iomodule.c.
> 
> Yup, it will take some effort to make this work.

A corner case just occurred to me...

For functions that will continue to accept str/bytes (and functions that
accept some other type such as Number or file-like objects), what should
be done with an object that is one of these, *and* has an __fspath__
method, *and* this method returns a value other than the object's own
value? Basically, should the protocol check be done unconditionally
(before attempting to use the argument as a string) or only if the
argument is not a string (there's an efficiency argument for this). Or
should it be left "unspecified", with the understanding that such
objects are badly behaved and may not be handled consistently across
different functions / python implementations / cpython versions?

Also, should the os.fspath (or whatever we call it) function itself
accept str/bytes, even if these are not going to implement the protocol?
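
To illustrate the corner case, here is a sketch with a deliberately
badly-behaved object. OddStr, protocol_first and str_first are hypothetical
names made up for this example; they just show how the two checking orders
would disagree.

```python
class OddStr(str):
    """Hypothetical str subclass whose __fspath__ disagrees with its value."""

    def __fspath__(self):
        return "/from/protocol"


def protocol_first(path):
    """Check the protocol unconditionally, before the str shortcut."""
    method = getattr(type(path), "__fspath__", None)
    if method is not None:
        return method(path)
    if isinstance(path, str):
        return path
    raise TypeError("not a path")


def str_first(path):
    """Take the str shortcut first; consult the protocol only afterwards."""
    if isinstance(path, str):
        return path
    method = getattr(type(path), "__fspath__", None)
    if method is not None:
        return method(path)
    raise TypeError("not a path")


odd = OddStr("/from/value")
```

Here protocol_first(odd) yields "/from/protocol" while str_first(odd) yields
"/from/value"; for a plain str both agree, which is why the question only
matters for objects like this one.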


Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!

2016-04-13 Thread Brett Cannon
On Wed, 13 Apr 2016 at 06:14 Stefan Krah  wrote:

> Victor Stinner  gmail.com> writes:
> > Maybe it's time to move more 3.x buildbots to the "stable" category?
> > http://buildbot.python.org/all/waterfall?category=3.x.stable
>
> +1 I think anything that is actually stable should be in that category.
>
>
> > By the way, I don't understand why "AMD64 OpenIndiana 3.x" is
> > considered stable, since it has been failing with multiple issues for
> > many months and nobody is working on these failures. I suggest moving
> > this buildbot back to the unstable category.
>
> +1 The bot was very stable and fast for some time but has been unstable
> for at least a year.
>
>
>
> > - PPC64 AIX 3.x: failing tests: test_httplib, test_httpservers,
> > test_socket, test_distutils, test_asyncio, (...); random timeout
> > failure in test_eintr, etc. I don't have access to AIX and I'm not
> > interested in acquiring an AIX license, nor in installing it. I'm not sure
> > that it's useful to have an AIX buildbot when no core developer has
> > access to AIX and nobody is working on AIX failures. Maybe HP wants
> > to help us to support AIX? (Provide manpower, access to AIX servers,
> > or something like that.)
>
> Well, I think in this case it's the gcc AIX maintainer running it, so...
>
>
> I think we should have a policy to stop reporting issues on unstable
> bots unless someone has a concrete fix OR the bot maintainers are
> known to fix issues fast (but that does not seem to be the case).
>

Official policy per
https://www.python.org/dev/peps/pep-0011/#supporting-platforms states that
there must be a core developer to maintain the compatibility, so if there's
no one helping to keep a particular buildbot green then I agree it should
be marked as unstable and thus not supported.


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Paul Moore
On 13 April 2016 at 17:31, Ethan Furman  wrote:
> On 04/13/2016 09:27 AM, Paul Moore wrote:
>>
>> On 13 April 2016 at 17:18, Fred Drake wrote:
>
>
>>> Names like os.fspath() and os.fssyspath() seem good to me.
>>
>>
>> -1 on fssyspath - the "system" representation is bytes on POSIX, but
>> not on Windows. Let's be explicit and go with fsbytespath().
>
>
> It will be confusing that fsbytespath() can return a string.

Oh, wait, yes fssyspath is for allow_bytes=True which *may* be bytes,
but could still be a string. My mistake. On that basis, I could go
with fssyspath (thinking "sys" = "low level").

Paul


Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!

2016-04-13 Thread Brett Cannon
On Wed, 13 Apr 2016 at 05:57 Tim Golden  wrote:

> On 13/04/2016 12:40, Victor Stinner wrote:
> > In recent months, most 3.x buildbots failed randomly. Some of them were
> > always failing. I spent some time fixing almost all Windows and Linux
> > buildbots. There were a lot of different issues.
>
> Can I state the obvious and offer a huge vote of thanks for this work,
> which is often tedious and unrewarding?
>

Yep, big thanks from me as well!


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Ethan Furman

On 04/13/2016 09:27 AM, Paul Moore wrote:
> On 13 April 2016 at 17:18, Fred Drake wrote:
>> Names like os.fspath() and os.fssyspath() seem good to me.
>
> -1 on fssyspath - the "system" representation is bytes on POSIX, but
> not on Windows. Let's be explicit and go with fsbytespath().

It will be confusing that fsbytespath() can return a string.

--
~Ethan~



Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-13 Thread Guido van Rossum
Nice work. I think that for CPython, speed is much more important than
memory use for the code. Disk space is practically free for anything
smaller than a video. :-)

On Wed, Apr 13, 2016 at 9:24 AM, Victor Stinner
 wrote:
> Hi,
>
> In the middle of recent discussions about Python performance, changing
> the Python bytecode came up. Serhiy proposed reusing the MicroPython
> short bytecode to reduce disk space and the memory footprint.
>
> Demur Rumed proposes a different change: a regular bytecode using
> 16-bit units. An instruction always has one 8-bit argument, which is
> zero if the instruction doesn't take an argument:
>
>http://bugs.python.org/issue26647
>
> According to benchmarks, it looks faster:
>
>   http://bugs.python.org/issue26647#msg263339
>
> IMHO it's a nice enhancement: it makes the code simpler. The most
> interesting change is made in Python/ceval.c:
>
> -        if (HAS_ARG(opcode))
> -            oparg = NEXTARG();
> +        oparg = NEXTARG();
>
> This code is the very hot loop evaluating Python bytecode. I expect
> that removing a conditional branch here can reduce CPU branch
> mispredictions.
>
> I reviewed the first versions of the change, and IMHO it's almost ready
> to be merged. But I would prefer to have a review from at least a second
> core reviewer.
>
> Can someone please review the change?
>
> --
>
> A side effect of wordcode is that arguments in 0..255 now use 2
> bytes per instruction instead of 3, so it also reduces the size of
> bytecode for the most common case.
>
> A larger, 16-bit argument (0..65,535) now uses 4 bytes instead
> of 3. Arguments are supported up to 32-bit: 24-bit uses 3 units (6
> bytes), 32-bit uses 4 units (8 bytes). MAKE_FUNCTION uses a 16-bit
> argument for keyword defaults and a 24-bit argument for annotations.
> Other common instructions known to use large arguments are jumps in
> bytecode longer than 256 bytes.
>
> --
>
> Right now, ceval.c still fetches opcode and then oparg with two 8-bit
> instructions. Later, we can discuss if it would be possible to ensure
> that the bytecode is always aligned to 16-bit in memory to fetch the
> two bytes using a uint16_t* pointer.
>
> Maybe we can overallocate 1 byte in codeobject.c and align manually
> the memory block if needed. Or ceval.c should maybe copy the code if
> it's not aligned?
>
> Raymond Hettinger proposes something like that, but it looks like
> there are concerns about non-aligned memory accesses:
>
>http://bugs.python.org/issue25823
>
> The cost of non-aligned memory accesses depends on the CPU
> architecture, but it can raise a SIGBUS on some arch (MIPS and
> SPARC?).
>
> Victor
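
The instruction sizes Victor gives above can be checked with a little
arithmetic. This is only a sketch: wordcode_size and old_size are
hypothetical helpers, not functions from the patch, and the 1-byte size for
argument-less instructions in the existing format is an assumption based on
how CPython bytecode was laid out at the time.

```python
def wordcode_size(oparg=None):
    """Bytes used by one instruction in the proposed 16-bit-unit format.

    Every unit is 2 bytes (an opcode plus one 8-bit argument); each extra
    8 bits of argument costs one more unit, up to 32 bits.
    """
    if oparg is None:
        return 2  # the argument byte is present but zero
    if not 0 <= oparg < 2 ** 32:
        raise ValueError("oparg must fit in 32 bits")
    units = max(1, (oparg.bit_length() + 7) // 8)
    return 2 * units


def old_size(oparg=None):
    """Bytes used in the existing format, for arguments up to 16 bits."""
    if oparg is None:
        return 1  # opcode only (assumption, see above)
    if not 0 <= oparg < 2 ** 16:
        raise ValueError("this sketch only covers arguments up to 16 bits")
    return 3  # opcode plus a 16-bit argument
```

So an argument in 0..255 shrinks from 3 bytes to 2, a 16-bit argument grows
from 3 to 4, and an instruction without an argument grows from 1 to 2, which
is the trade-off being weighed against the simpler decoding loop.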



-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Ethan Furman

On 04/13/2016 09:18 AM, Fred Drake wrote:
> On Wed, Apr 13, 2016 at 11:09 AM, Ethan Furman wrote:
>> - a single os.fspath() with an allow_bytes parameter
>>   (mostly True in os and os.path, mostly False everywhere
>>   else)
>
> -0
>
>> - a str-only os.fspathname() and a str/bytes os.fspath()
>
> +1 on using separate functions.
>
> Names like os.fspath() and os.fssyspath() seem good to me.

Ooh, I like that!  I could probably keep those names separate in my
head.  :)


--
~Ethan~



Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Fred Drake
On Wed, Apr 13, 2016 at 12:27 PM, Paul Moore  wrote:
> -1 on fssyspath - the "system" representation is bytes on POSIX, but
> not on Windows. Let's be explicit and go with fsbytespath().

Depends on the semantics; if we're expecting it to return
str-or-bytes, os.fssyspath() seems fine.  If only returning bytes (not
sure that ever makes sense on Windows, since I don't use Windows),
then I'd be happy with os.fsbytespath().


  -Fred

-- 
Fred L. Drake, Jr.
"A storm broke loose in my mind."  --Albert Einstein


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Paul Moore
On 13 April 2016 at 17:18, Fred Drake  wrote:
> Names like os.fspath() and os.fssyspath() seem good to me.

-1 on fssyspath - the "system" representation is bytes on POSIX, but
not on Windows. Let's be explicit and go with fsbytespath().

But agreed that always-constant boolean parameters are a bad idea. The
hard bit is good naming of the separate functions (100% agree that
shutil is a good example of how not to do it :-))

Paul


Re: [Python-Dev] Not receiving bug tracker emails

2016-04-13 Thread Brett Cannon
Glad it's working again! And it was a combination of R. David Murray, Ezio
Melotti, Mark Mangoba (see
http://pyfound.blogspot.com/2016/04/the-psf-has-hired-it-manager.html in
case you don't know who Mark is), and myself, along with Upfront (the
b.p.o hosting provider).

On Tue, 12 Apr 2016 at 21:40 Terry Reedy  wrote:

> On 4/4/2016 5:05 PM, Terry Reedy wrote:
>
> For the past few days, I have been getting bug tracker emails again, in
> my Inbox.  I just got a Rietveld review in the Inbox, and I believe it
> went there directly instead of first to Junk.  Thank you to whoever made
> the improvements.
>
> --
> Terry Jan Reedy


[Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-13 Thread Victor Stinner
Hi,

During the recent discussions about Python performance, changing the
Python bytecode came up. Serhiy proposed reusing MicroPython's short
bytecode to reduce disk space and memory footprint.

Demur Rumed proposes a different change: a regular bytecode built from
16-bit units. Every instruction always has one 8-bit argument, which is
zero if the instruction doesn't take an argument:

   http://bugs.python.org/issue26647
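
To make the format concrete, here is a minimal Python sketch of how such
a stream of 16-bit units could be decoded. It is only an illustration,
not the patch itself: the function name iter_wordcode is hypothetical,
the opcode numbers in the usage line are made up, and it assumes that
larger arguments are built from prefix units in the style of CPython's
EXTENDED_ARG.

```python
EXTENDED_ARG = 144  # assumed prefix opcode number (CPython's value at the time)


def iter_wordcode(code):
    """Yield (offset, opcode, oparg) for each instruction in `code`.

    `code` is a bytes object made of 2-byte units: one opcode byte
    followed by one 8-bit argument byte.
    """
    if len(code) % 2:
        raise ValueError("wordcode length must be a multiple of 2 bytes")
    extended = 0
    for offset in range(0, len(code), 2):
        opcode, arg = code[offset], code[offset + 1]
        oparg = extended | arg
        if opcode == EXTENDED_ARG:
            # Prefix unit: keep the bits seen so far, shifted up by one byte.
            extended = oparg << 8
            continue
        extended = 0
        yield offset, opcode, oparg


# Hypothetical opcode 100 with a plain argument, then with a prefixed one.
print(list(iter_wordcode(bytes([100, 5, 144, 1, 100, 2]))))
```

Because every unit has the same shape, the loop never has to ask whether
an opcode takes an argument before reading the next byte.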

According to benchmarks, it looks faster:

  http://bugs.python.org/issue26647#msg263339

IMHO it's a nice enhancement: it makes the code simpler. The most
interesting change is made in Python/ceval.c:

-if (HAS_ARG(opcode))
-oparg = NEXTARG();
+oparg = NEXTARG();

This code is in the very hot loop that evaluates Python bytecode. I
expect that removing a conditional branch here can reduce CPU branch
mispredictions.

I reviewed the first versions of the change, and IMHO it's almost ready
to be merged. But I would prefer to have a review from at least a second
core reviewer.

Can someone please review the change?

--

A side effect of wordcode is that arguments in 0..255 now use 2
bytes per instruction instead of 3, so it also reduces the size of
the bytecode in the most common case.

Larger arguments cost more: a 16-bit argument (0..65,535) now uses 4
bytes instead of 3. Arguments are supported up to 32 bits: a 24-bit
argument uses 3 units (6 bytes), a 32-bit argument uses 4 units (8
bytes). MAKE_FUNCTION uses a 16-bit argument for keyword defaults and a
24-bit argument for annotations. Other common instructions known to use
large arguments are jumps in bytecode longer than 256 bytes.
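
The size figures above follow from one rule: each extra byte of argument
costs one extra 2-byte unit. A small Python sketch of that arithmetic
(the helper name wordcode_size is hypothetical and not part of the
patch):

```python
def wordcode_size(oparg):
    """Return the number of bytes one instruction takes under wordcode."""
    if not 0 <= oparg <= 0xFFFFFFFF:
        raise ValueError("oparg must fit in 32 bits")
    units = 1
    while oparg > 0xFF:
        # Each additional argument byte needs one more 16-bit unit.
        oparg >>= 8
        units += 1
    return 2 * units


for value in (0, 255, 256, 65535, 65536, 0xFFFFFF, 0x1000000):
    print(f"{value:>10} -> {wordcode_size(value)} bytes")
```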

--

Right now, ceval.c still fetches the opcode and then the oparg with two
8-bit instructions. Later, we can discuss whether it would be possible
to ensure that the bytecode is always aligned to 16 bits in memory, so
that the two bytes can be fetched through a uint16_t* pointer.

Maybe we can overallocate 1 byte in codeobject.c and manually align
the memory block if needed. Or maybe ceval.c should copy the code if
it's not aligned?

Raymond Hettinger proposes something like that, but it looks like
there are concerns about non-aligned memory accesses:

   http://bugs.python.org/issue25823

The cost of non-aligned memory accesses depends on the CPU
architecture, but they can raise a SIGBUS on some architectures (MIPS
and SPARC?).

Victor


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Fred Drake
On Wed, Apr 13, 2016 at 11:09 AM, Ethan Furman  wrote:
> - a single os.fspath() with an allow_bytes parameter
>   (mostly True in os and os.path, mostly False everywhere
>   else)

-0

> - a str-only os.fspathname() and a str/bytes os.fspath()

+1 on using separate functions.

> I'm partial to the first choice as it is simplicity itself to know when
> looking at it if bytes might be coming back by the presence or absence of a
> second argument to the call; otherwise one has to keep straight in one's
> head which is str-only and which might allow bytes (I'm not very good at
> keeping similar sounding functions separate -- what's the difference between
> shutil.copy and shutil.copy2?  I have to look it up every time).

I do the same, but... this is one of those cases where a caller will
usually be passing a constant directly. If passed as a positional
argument, it'll just be confusing ("what's True?" is my usual reaction
to a Boolean positional argument). If passed as a keyword argument
with a descriptive name, it'll be longer than I'd like to see:

path_str = os.fspath(path, allow_bytes=True)

Names like os.fspath() and os.fssyspath() seem good to me.


  -Fred

-- 
Fred L. Drake, Jr.
"A storm broke loose in my mind."  --Albert Einstein


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Ethan Furman

On 04/13/2016 08:17 AM, Random832 wrote:
> On Wed, Apr 13, 2016, at 10:21, Nick Coghlan wrote:
>> I'd expect the main consumers to be os and os.path, and would honestly
>> be surprised if we needed many explicit invocations above that layer,
>> other than in pathlib itself.
>
> I made a toy implementation to try this out, and making os.open support
> it does not get you builtin open "for free" as I had suspected; builtin
> open has its own type checks in _iomodule.c.

Yup, it will take some effort to make this work.

> Probably anything not implemented in pure python that deals with
> filenames is going to have to have its type checking revised.

Agreed.

You can see why there was no point in pursuing the conversation unless
someone was willing to do the work.


--
~Ethan~



Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Random832
On Wed, Apr 13, 2016, at 10:21, Nick Coghlan wrote:
> I'd expect the main consumers to be os and os.path, and would honestly
> be surprised if we needed many explicit invocations above that layer,
> other than in pathlib itself.

I made a toy implementation to try this out, and making os.open support
it does not get you builtin open "for free" as I had suspected; builtin
open has its own type checks in _iomodule.c.

Probably anything not implemented in pure python that deals with
filenames is going to have to have its type checking revised.


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Ethan Furman

On 04/13/2016 07:21 AM, Nick Coghlan wrote:
> On 14 April 2016 at 00:11, Paul Moore wrote:
>> On 13 April 2016 at 14:51, Nick Coghlan wrote:
>>> The potential SE-strings only come back when you pass str, and the
>>> operating system data isn't properly encoded according to the nominal
>>> filesystem encoding. They round trip nicely to other operating system
>>> APIs, but can indeed be a problem if they escape to other parts of
>>> your program
>>
>> If the operating system APIs handle SE-strings correctly, is it not
>> acceptable to require the fspath protocol to return strings, and then
>> places like DirEntry or Ethan's module, when they want to return
>> bytes, can just SE-encode the bytes and return those?
>>
>> Or will the fspath protocol be used at a low enough level that it's
>> *below* the point where SE-encoded strings are handled properly?
>
> I'd expect the main consumers to be os and os.path, and would honestly
> be surprised if we needed many explicit invocations above that layer,
> other than in pathlib itself.
>
> That's actually the main factor in my suggesting the two level API
> design - from a protocol consumer perspective, bytes-or-str is a
> natural fit for os and os.path, while str-only is a natural fit for
> pathlib.
>
> I also now believe it makes sense to postpone a final decision on this
> aspect of the design until after a draft implementation has been put
> together, as my and Ethan's assumption that os and os.path will be the
> main consumers is exactly that: an assumption. Putting the draft
> implementation together will let us know whether or not it's an
> accurate one.

Sounds reasonable.

However, there is still one choice that needs to be made:

- a single os.fspath() with an allow_bytes parameter
  (mostly True in os and os.path, mostly False everywhere
  else)

- a str-only os.fspathname() and a str/bytes os.fspath()

I'm partial to the first choice as it is simplicity itself to know when 
looking at it if bytes might be coming back by the presence or absence 
of a second argument to the call; otherwise one has to keep straight in 
one's head which is str-only and which might allow bytes (I'm not very 
good at keeping similar sounding functions separate -- what's the 
difference between shutil.copy and shutil.copy2?  I have to look it up 
every time).
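
For comparison, the two options could be sketched in pure Python roughly
as follows. This is only a sketch under assumptions, not a proposed
implementation: the helpers _lookup_fspath and fspath_with_flag are
hypothetical names, the functions are written as free functions rather
than as members of os, and neither signature has been agreed on.

```python
def _lookup_fspath(path):
    """Return str/bytes unchanged, else call __fspath__ found on the type."""
    if isinstance(path, (str, bytes)):
        return path
    try:
        method = type(path).__fspath__
    except AttributeError:
        raise TypeError(
            f"expected str, bytes or path object, not {type(path).__name__}"
        ) from None
    return method(path)


# Option 1: a single function with an allow_bytes flag.
def fspath_with_flag(path, allow_bytes=False):
    result = _lookup_fspath(path)
    if isinstance(result, str):
        return result
    if allow_bytes and isinstance(result, bytes):
        return result
    raise TypeError(f"unexpected path type: {type(result).__name__}")


# Option 2: two separate functions with fixed behaviour.
def fspath(path):
    """str or bytes."""
    return fspath_with_flag(path, allow_bytes=True)


def fspathname(path):
    """str only."""
    return fspath_with_flag(path, allow_bytes=False)
```

With option 1 the call site shows whether bytes may come back; with
option 2 the function name has to carry that information.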


--
~Ethan~


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Nick Coghlan
On 14 April 2016 at 00:11, Paul Moore  wrote:
> On 13 April 2016 at 14:51, Nick Coghlan  wrote:
>> The potential SE-strings only come back when you pass str, and the
>> operating system data isn't properly encoded according to the nominal
>> filesystem encoding. They round trip nicely to other operating system
>> APIs, but can indeed be a problem if they escape to other parts of
>> your program
>
> If the operating system APIs handle SE-strings correctly, is it not
> acceptable to require the fspath protocol to return strings, and then
> places like DirEntry or Ethan's module, when they want to return
> bytes, can just SE-encode the bytes and return those?
>
> Or will the fspath protocol be used at a low enough level that it's
> *below* the point where SE-encoded strings are handled properly?

I'd expect the main consumers to be os and os.path, and would honestly
be surprised if we needed many explicit invocations above that layer,
other than in pathlib itself.

That's actually the main factor in my suggesting the two level API
design - from a protocol consumer perspective, bytes-or-str is a
natural fit for os and os.path, while str-only is a natural fit for
pathlib.

I also now believe it makes sense to postpone a final decision on this
aspect of the design until after a draft implementation has been put
together, as my and Ethan's assumption that os and os.path will be the
main consumers is exactly that: an assumption. Putting the draft
implementation together will let us know whether or not it's an
accurate one.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Paul Moore
On 13 April 2016 at 14:51, Nick Coghlan  wrote:
> The potential SE-strings only come back when you pass str, and the
> operating system data isn't properly encoded according to the nominal
> filesystem encoding. They round trip nicely to other operating system
> APIs, but can indeed be a problem if they escape to other parts of
> your program

If the operating system APIs handle SE-strings correctly, is it not
acceptable to require the fspath protocol to return strings, and then
places like DirEntry or Ethan's module, when they want to return
bytes, can just SE-encode the bytes and return those?

Or will the fspath protocol be used at a low enough level that it's
*below* the point where SE-encoded strings are handled properly?

Paul


Re: [Python-Dev] pathlib - current status of discussions

2016-04-13 Thread Nick Coghlan
On 13 April 2016 at 02:19, Chris Barker  wrote:
> So: why use strings as the lingua franca of paths? i.e. the basis of the
> path protocol. maybe we should support only two path representations:
>
> 1) A "proper" path object -- i.e. pathlib.Path or anything else that
> supports the path protocol.
>
> 2) the bytes that the OS actually needs.
>
> this would mean that the protocol would be to have a __pathbytes__() method
> that would return the bytes that should be passed off to the OS.

The reason to favour strings over raw bytes for path manipulation is
the same reason to favour them anywhere else: to avoid having to worry
about encodings *while* you're manipulating things, and instead only
worry about the encoding when actually talking to the OS (which may be
UTF-16-LE to talk to a Windows API, or UTF-8 to talk to a *nix API, or
something else entirely if your OS is set up that way, or you're
writing the path to a file or network packet, rather than using it
locally).
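
A small illustration of that split, using only standard library calls.
The path itself is made up, and the two encodings are simply the ones
named above: the manipulation happens on str, and the choice of encoding
is deferred until the path leaves the program.

```python
import posixpath

# Pure str manipulation: no encoding decisions are needed here.
base = "caf\u00e9"
path = posixpath.join(base, "menu.txt")

# Encoding is decided only at the boundary, per consumer.
as_utf8 = path.encode("utf-8")        # e.g. for a *nix API
as_utf16 = path.encode("utf-16-le")   # e.g. for a Windows API

print(ascii(path), len(as_utf8), len(as_utf16))
```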

Regardless of what we decide about os.fspath's return type, that
general principle won't change - if you're manipulating bytes paths
directly, you're doing something relatively specialised (like working
on CPython's own os module).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-13 Thread Nick Coghlan
On 13 April 2016 at 02:15, Ethan Furman  wrote:
> On 04/11/2016 04:43 PM, Victor Stinner wrote:
>>
>> On 11 Apr 2016 11:11 PM, "Ethan Furman" wrote:
>
>
>>> So my concern in such a case is what happens if we pass this SE
>>> string somewhere else: a UTF-8 file, or over a socket, or into a
>>> database? Does this have issues that we wouldn't face if we just used
>>> bytes?
>>
>>
>> "SE strings" have been returned by os.listdir(str), os.walk(str),
>> os.getenv(str), sys.argv[int], ... since Python 3.3. Nothing new under
>> the sun.
>
>
> So when we pass a bytes object in, Python (on posix) converts that to a
> string using surrogateescape, gets back strings from the os, and encodes
> them back to bytes, again using surrogateescape?

On POSIX, if you pass bytes to the os module, it will pass bytes to
the underlying system API, and then pass bytes back to your
application.

The potential SE-strings only come back when you pass str, and the
operating system data isn't properly encoded according to the nominal
filesystem encoding. They round trip nicely to other operating system
APIs, but can indeed be a problem if they escape to other parts of
your program (hence ideas like
http://bugs.python.org/issue18814#msg251694 and the preceding
discussion in that issue)
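
The round trip and the "escape" problem can both be seen with the
surrogateescape error handler directly. This is a minimal sketch that
assumes a UTF-8 filesystem encoding and an invented file name; the os
module applies the same handler on POSIX.

```python
raw = b"caf\xe9.txt"  # Latin-1 bytes, not valid UTF-8

# Decoding smuggles the undecodable byte through as a lone surrogate.
name = raw.decode("utf-8", "surrogateescape")

# Round trip back to the operating system: the original bytes come back.
assert name.encode("utf-8", "surrogateescape") == raw

# Escaping to another part of the program: a strict encode fails.
try:
    name.encode("utf-8")
except UnicodeEncodeError as exc:
    print("strict encode failed:", exc.reason)
```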

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!

2016-04-13 Thread Stefan Krah
Victor Stinner writes:
> Maybe it's time to move more 3.x buildbots to the "stable" category?
> http://buildbot.python.org/all/waterfall?category=3.x.stable

+1 I think anything that is actually stable should be in that category.


> By the way, I don't understand why "AMD64 OpenIndiana 3.x" is
> considered stable, since it has been failing with multiple issues for
> many months and nobody is working on these failures. I suggest moving
> this buildbot back to the unstable category.

+1 The bot was very stable and fast for some time but has been unstable
for at least a year.



> - PPC64 AIX 3.x: failing tests: test_httplib, test_httpservers,
> test_socket, test_distutils, test_asyncio, (...); random timeout
> failure in test_eintr, etc. I don't have access to AIX and I'm not
> interested in acquiring an AIX license, nor in installing it. I'm not
> sure that it's useful to have an AIX buildbot when no core developer has
> access to AIX and nobody is working on AIX failures. Maybe HP wants
> to help us to support AIX? (Provide manpower, access to AIX servers,
> or something like that.)

Well, I think in this case it's the gcc AIX maintainer running it, so...


I think we should have a policy to stop reporting issues on unstable
bots unless someone has a concrete fix OR the bot maintainers are
known to fix issues fast (but that does not seem to be the case).



Stefan Krah


Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!

2016-04-13 Thread Tim Golden
On 13/04/2016 12:40, Victor Stinner wrote:
> Over the last few months, most 3.x buildbots failed randomly. Some of
> them were always failing. I spent some time fixing almost all Windows
> and Linux buildbots. There were a lot of different issues.

Can I state the obvious and offer a huge vote of thanks for this work,
which is often tedious and unrewarding?

Thank you

TJG



Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!

2016-04-13 Thread Eric V. Smith
On 4/13/2016 7:40 AM, Victor Stinner wrote:
> Over the last few months, most 3.x buildbots failed randomly. Some of
> them were always failing. I spent some time fixing almost all Windows
> and Linux buildbots. There were a lot of different issues.

Thanks for all of your work on this, Victor. It's much appreciated.

Eric.



Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!

2016-04-13 Thread Chris Angelico
On Wed, Apr 13, 2016 at 9:40 PM, Victor Stinner wrote:
> Maybe it's time to move more 3.x buildbots to the "stable" category?
> http://buildbot.python.org/all/waterfall?category=3.x.stable

Move the Bruces into stable, perhaps? The AMD64 Debian Root one. Been
fairly consistently green.

ChrisA


[Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!

2016-04-13 Thread Victor Stinner
Hi,

Over the last few months, most 3.x buildbots failed randomly. Some of
them were always failing. I spent some time fixing almost all Windows
and Linux buildbots. There were a lot of different issues.

So please try not to break the buildbots again, and remember to watch
them from time to time:

  http://buildbot.python.org/all/waterfall?category=3.x.stable&category=3.x.unstable

In the coming weeks, I will try to backport some fixes to Python 3.5 (if
needed) to make those buildbots more stable too.

Python 2.7 buildbots are also in a sad state (ex: test_marshal
segfaults on Windows, see issue #25264). But it's not easy to get a
Windows machine with the right compiler to develop Python 2.7 on Windows.

--

Maybe it's time to move more 3.x buildbots to the "stable" category?
http://buildbot.python.org/all/waterfall?category=3.x.stable

By the way, I don't understand why "AMD64 OpenIndiana 3.x" is
considered stable, since it has been failing with multiple issues for
many months and nobody is working on these failures. I suggest moving
this buildbot back to the unstable category.

--

We have many offline buildbots. What's the status of these buildbots?
Should we expect that they come back soon?

Or would it be possible to hide them? It would help to check the
status of all buildbots.

--

Failing buildbots:

- AMD64 FreeBSD CURRENT 3.x: http://bugs.python.org/issue26566 -- I
installed a fresh FreeBSD CURRENT in a VM and I'm unable to reproduce
the failures. Maybe the buildbot slave is outdated and FreeBSD must be
upgraded?

- AMD64 OpenIndiana 3.x, x86 OpenIndiana 3.x: test_socket failures on
sendfile. Sorry, but I'm not really interested in this OS.

- PPC64 AIX 3.x: failing tests: test_httplib, test_httpservers,
test_socket, test_distutils, test_asyncio, (...); random timeout
failure in test_eintr, etc. I don't have access to AIX and I'm not
interested in acquiring an AIX license, nor in installing it. I'm not
sure that it's useful to have an AIX buildbot when no core developer has
access to AIX and nobody is working on AIX failures. Maybe HP wants
to help us to support AIX? (Provide manpower, access to AIX servers,
or something like that.)

- x86 OpenBSD 3.x: 5 tests failed, test_crypt test_socket test_ssl
test_strptime test_time. This OS needs some love ;-)

- the 4 ICC buildbots are failing with stack overflow, segfault, etc.
Again, I'm not sure that these buildbots are useful since it looks
like we don't support this compiler yet. Or does it help to work on
supporting this compiler? Who is working on ICC support?

--

FYI I also made some enhancements on regrtest (our test runner for the
test suite), mostly to debug failures:

- display the duration of tests taking longer than 30 seconds
- new timestamp prefix, used to debug buildbot hangs
- when parallel tests are interrupted, display progress on waiting for
completion
- add timeout to main process when using -jN: it should help to debug
buildbot hang
- "Run tests in parallel using 3 child processes" or "Run tests
sequentially" message which helps to understand how tests are running.
There is the -j1 trap which has no effect: tests are still run
sequentially. By the way, I proposed to really use subprocesses when
-j1 is used: http://bugs.python.org/issue25285

The default timeout changed from 1 hour to 15 min; it's the maximum
duration allowed to run a single test file (ex: test_os.py). On my Linux
box, running the whole test suite in parallel (10 child processes for my
4 CPU cores with hyperthreading) with Python compiled in debug mode
(slow) takes 4 min 37 sec.

Tell me if the default timeout is too low. It can be configured per
buildbot if needed (TESTTIMEOUT env var).

--

By the way, I'm always surprised by the huge difference in the time
needed to run a build on the different slaves: from a few minutes to more
than 3 hours. The fastest Windows slave takes 28 minutes (it runs tests
in parallel using 4 child processes), whereas the 3 others (which run
tests sequentially) take between 2 hours and more than 3 hours! Why does
running tests on Windows take so long?

Maybe we should make sure that no buildbot runs tests sequentially,
because it creates a lot of annoying side effects (even if sometimes
it helps to find tricky bugs, sometimes bugs restricted to the tests
themselves) and because a lot of tests simply wait a few seconds. So
running multiple tests in parallel doesn't burn your CPU, it's just
faster. IMHO the risk of random timeout failures is low compared to
the speedup.

--

The most interesting bug was a deadlock in locale.setlocale() on
Windows 7: the bug made the buildbot hang "sometimes" (randomly).
Jeremy Kloth identified the bug, but Steve Dower pointed out to us that
it's already fixed in Visual Studio 2015 Update 1: so please update VS if
you haven't done so yet. Steve added a post-build test to check whether
the ucrtbase/ucrtbased DLL has the known bug.
=> http://bugs.python.org/issue26624

Victor