Re: [Python-Dev] pathlib - current status of discussions
On 14 April 2016 at 14:05, Random832 wrote:
> On Wed, Apr 13, 2016, at 23:27, Nick Coghlan wrote:
>> In this kind of case, inheritance tends to trump protocol. For
>> example, int subclasses can't override operator.index:
> ...
>> The reasons for that behaviour are more pragmatic than philosophical:
>> builtins and their subclasses are extensively special-cased for speed
>> reasons, and those shortcuts are encountered before the interpreter
>> even considers using the general protocol.
>>
>> In cases where the magic method return types are polymorphic (so
>> subclasses may want to override them) we'll use more restrictive exact
>> type checks for the shortcuts, but that argument doesn't apply for
>> typechecked protocols where the result is required to be an instance
>> of a particular builtin type (but subclasses are considered
>> acceptable).
>
> Then why aren't we doing it for str? Because "try: path =
> path.__fspath__()" is more idiomatic than the alternative?

The sketches Brett posted will bear little resemblance to the actual
implementation - that will be in C and use similar idioms to those we
use for other abstract protocols (such as shortcuts for instances of
builtin types, and doing the method lookup via the passed in object's
type, rather than on the instance).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
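As a rough illustration of the two idioms Nick mentions (a shortcut for instances of builtin types, and looking the method up on the type rather than the instance), here is a pure-Python sketch. The name fspath_sketch is hypothetical, the str-only result check is just one of the options under discussion, and the real implementation was expected to be in C.

```python
def fspath_sketch(path):
    """Hypothetical sketch: builtin shortcut, then type-based method lookup."""
    # Shortcut: str instances (including subclasses) are returned before
    # the protocol is even considered.
    if isinstance(path, str):
        return path
    # Look the method up on the type, not the instance, as is done for
    # other special methods.
    try:
        method = type(path).__fspath__
    except AttributeError:
        raise TypeError("expected str or an object with __fspath__, not "
                        + type(path).__name__) from None
    result = method(path)
    if not isinstance(result, str):
        raise TypeError("__fspath__ must return str")
    return result


class Demo:
    def __fspath__(self):
        return "/tmp/demo"


class Plain:
    pass


instance_only = Plain()
# An attribute set on the instance is ignored by the type-based lookup.
instance_only.__fspath__ = lambda: "/tmp/ignored"
```

With this sketch, fspath_sketch(Demo()) goes through the protocol, while fspath_sketch(instance_only) raises TypeError because the method is not defined on the class.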
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 14 April 2016 at 13:54, Random832 wrote:
> On Wed, Apr 13, 2016, at 23:17, Nick Coghlan wrote:
>
>> - os.fspath -> str (no coercion)
>> - os.fsdecode -> str (with coercion from bytes)
>> - os.fsencode -> bytes (with coercion from str)
>> - os._raw_fspath -> str-or-bytes (no coercion)
>>
>> (with "coercion" referring to how the result of __fspath__ and any
>> directly passed in str or bytes objects are handled)
>>
>> The leading underscore on _raw_fspath would be of the "this is a
>> documented and stable API, but you probably don't want to use it
>> unless you really know what you're doing" variety, rather than the
>> "this is an undocumented and potentially unstable private API"
>> variety.
>
> In this scenario could the protocol return bytes?

Yes, that's desirable to handle DirEntry transparently regardless of type.

> If the protocol can return bytes, then that means that types (DirEntry?
> someone had an alternate path library with a bPath?) which return bytes
> via the protocol will proliferate, and cannot be safely passed to
> anything that uses os.fspath. Numerous copies of "def myfspath(x):
> return os.fsdecode(os._raw_fspath(x))" will proliferate (or they'll just
> monkey-patch os.fspath), and no-one actually uses os.fspath except toy
> examples.

If folks want coercion, they can just use os.fsdecode(x), as that
already has a str -> str passthrough from the input to the output
(unlike codecs.decode) and will presumably be updated to include an
implicit call to os._raw_fspath() on the passed in object.

> Why is it so objectionable for os.fspath to do coercion?

The first problem is that binary paths on Windows basically don't
work, so it's preferable for them to fail fast regardless of platform,
rather than to have them implicitly work on *nix, only to fail for
Windows users using non-ASCII paths later.

The second is that it would make os.fspath and os.fsdecode
functionally equivalent, so we'd have two different spellings for the
same operation.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] pathlib - current status of discussions
On Wed, Apr 13, 2016, at 23:27, Nick Coghlan wrote:
> In this kind of case, inheritance tends to trump protocol. For
> example, int subclasses can't override operator.index:
...
> The reasons for that behaviour are more pragmatic than philosophical:
> builtins and their subclasses are extensively special-cased for speed
> reasons, and those shortcuts are encountered before the interpreter
> even considers using the general protocol.
>
> In cases where the magic method return types are polymorphic (so
> subclasses may want to override them) we'll use more restrictive exact
> type checks for the shortcuts, but that argument doesn't apply for
> typechecked protocols where the result is required to be an instance
> of a particular builtin type (but subclasses are considered
> acceptable).

Then why aren't we doing it for str? Because "try: path =
path.__fspath__()" is more idiomatic than the alternative?

If some sort of reasoned decision has been made to require the protocol
to trump the special case for str subclasses, it's unreasonable not to
apply the same decision to bytes subclasses. The decision should be
"always use the protocol first" or "always use the type match first".

In other words, why not this:

    def fspath(path, *, allow_bytes=False):
        # Type match first: str (and bytes, if allowed) pass straight through.
        if isinstance(path, (bytes, str) if allow_bytes else str):
            return path
        try:
            m = path.__fspath__
        except AttributeError:
            raise TypeError
        path = m()
        # The protocol result is held to the same type rule.
        if isinstance(path, (bytes, str) if allow_bytes else str):
            return path
        raise TypeError
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On Wed, Apr 13, 2016, at 23:17, Nick Coghlan wrote:
> - os.fspath -> str (no coercion)
> - os.fsdecode -> str (with coercion from bytes)
> - os.fsencode -> bytes (with coercion from str)
> - os._raw_fspath -> str-or-bytes (no coercion)
>
> (with "coercion" referring to how the result of __fspath__ and any
> directly passed in str or bytes objects are handled)
>
> The leading underscore on _raw_fspath would be of the "this is a
> documented and stable API, but you probably don't want to use it
> unless you really know what you're doing" variety, rather than the
> "this is an undocumented and potentially unstable private API"
> variety.

In this scenario could the protocol return bytes?

If the protocol cannot return bytes, then _raw_fspath will only return
bytes if directly passed bytes. This limits its utility for the
functions that consume it (presumably path_convert (os.open and friends)
and builtin open), since they already have to act specially based on the
types of their arguments (builtin open can accept an integer;
path_convert has to behave radically differently on str or bytes input)
and there's no reason they couldn't simply accept bytes directly while
they're doing that.

If the protocol can return bytes, then that means that types (DirEntry?
someone had an alternate path library with a bPath?) which return bytes
via the protocol will proliferate, and cannot be safely passed to
anything that uses os.fspath. Numerous copies of "def myfspath(x):
return os.fsdecode(os._raw_fspath(x))" will proliferate (or they'll just
monkey-patch os.fspath), and no-one actually uses os.fspath except toy
examples.

Why is it so objectionable for os.fspath to do coercion?
Re: [Python-Dev] pathlib - current status of discussions
On 14 April 2016 at 13:14, Ethan Furman wrote:
> On 04/13/2016 07:57 PM, Nikolaus Rath wrote:
>> Either I haven't understood your answer, or you haven't understood my
>> question. I'm concerned about this case:
>>
>>     class Special(bytes):
>>         def __fspath__(self):
>>             return 'str-val'
>>     obj = Special('bytes-val', 'utf8')
>>     path_obj = fspath(obj, allow_bytes=True)
>>
>> With #2, path_obj == 'bytes-val'. With #3, path_obj == 'str-val'.
>
> I misunderstood your question. That is... an interesting case. ;)

In this kind of case, inheritance tends to trump protocol. For
example, int subclasses can't override operator.index:

    >>> from operator import index
    >>> class NotAnInt():
    ...     def __index__(self):
    ...         return 42
    ...
    >>> index(NotAnInt())
    42
    >>> class MyInt(int):
    ...     def __index__(self):
    ...         return 42
    ...
    >>> index(MyInt(53))
    53

The reasons for that behaviour are more pragmatic than philosophical:
builtins and their subclasses are extensively special-cased for speed
reasons, and those shortcuts are encountered before the interpreter
even considers using the general protocol.

In cases where the magic method return types are polymorphic (so
subclasses may want to override them) we'll use more restrictive exact
type checks for the shortcuts, but that argument doesn't apply for
typechecked protocols where the result is required to be an instance
of a particular builtin type (but subclasses are considered
acceptable).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 14 April 2016 at 12:49, Nick Coghlan wrote:
> The API could be something like:
>
> - os.fspath -> str-or-bytes
> - os.fsencode -> bytes (with coercion from str)
> - os.fsdecode -> str (with coercion from bytes)
> - os.strpath -> str (no coercion)

There seems to be fairly broad opposition to the idea of defining the
public API in terms of what os and os.path are likely to need, which
reminded me of Koos's suggestion of using a private API for the
str-or-bytes variant.

That approach would give us something like:

- os.fspath -> str (no coercion)
- os.fsdecode -> str (with coercion from bytes)
- os.fsencode -> bytes (with coercion from str)
- os._raw_fspath -> str-or-bytes (no coercion)

(with "coercion" referring to how the result of __fspath__ and any
directly passed in str or bytes objects are handled)

The leading underscore on _raw_fspath would be of the "this is a
documented and stable API, but you probably don't want to use it
unless you really know what you're doing" variety, rather than the
"this is an undocumented and potentially unstable private API"
variety.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
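To make the proposed split concrete, here is a minimal sketch of how the three flavours could relate. The functions raw_fspath, strict_fspath and decoded_fspath are hypothetical stand-ins written for illustration (they are not the os module's functions), and the sketch assumes the protocol is allowed to return bytes.

```python
import os


def raw_fspath(path):
    """Hypothetical stand-in for the proposed os._raw_fspath (no coercion)."""
    if isinstance(path, (str, bytes)):
        return path
    try:
        method = type(path).__fspath__
    except AttributeError:
        raise TypeError("not a path-like object: " + type(path).__name__) from None
    result = method(path)
    if not isinstance(result, (str, bytes)):
        raise TypeError("__fspath__ must return str or bytes")
    return result


def strict_fspath(path):
    """Hypothetical stand-in for the proposed str-only os.fspath."""
    result = raw_fspath(path)
    if not isinstance(result, str):
        # Fail fast instead of silently decoding a binary path.
        raise TypeError("bytes paths are rejected, not decoded")
    return result


def decoded_fspath(path):
    """Coercing flavour: roughly what an updated os.fsdecode would do."""
    return os.fsdecode(raw_fspath(path))


class BytesEntry:
    """Illustrative object whose protocol method returns bytes."""

    def __fspath__(self):
        return b"data.bin"
```

Under these assumptions, strict_fspath(BytesEntry()) raises TypeError, while decoded_fspath(BytesEntry()) returns 'data.bin', which is the difference between "no coercion" and "with coercion from bytes" in the list above.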
Re: [Python-Dev] pathlib - current status of discussions
On 04/13/2016 07:57 PM, Nikolaus Rath wrote:
> On Apr 13 2016, Ethan Furman wrote:
>> On 04/13/2016 03:45 PM, Nikolaus Rath wrote:
>>> When passing an object that is of type str and has a __fspath__
>>> attribute, all approaches return the value of __fspath__().
>>>
>>> However, when passing something of type bytes, the second approach
>>> returns the object, while the third returns the value of __fspath__().
>>>
>>> Is this intentional? I think a __fspath__ attribute should always be
>>> preferred.
>>
>> Yes, it is intentional. The second approach assumes __fspath__ can
>> only contain str, so there is no point in checking it for bytes.
>
> Either I haven't understood your answer, or you haven't understood my
> question. I'm concerned about this case:
>
>     class Special(bytes):
>         def __fspath__(self):
>             return 'str-val'
>     obj = Special('bytes-val', 'utf8')
>     path_obj = fspath(obj, allow_bytes=True)
>
> With #2, path_obj == 'bytes-val'. With #3, path_obj == 'str-val'.

I misunderstood your question. That is... an interesting case. ;)

--
~Ethan~
Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units
On 14 April 2016 at 08:26, Victor Stinner wrote:
> 2016-04-14 0:11 GMT+02:00 Ryan Gonzalez :
>> So code that depends on iterating through bytecode via HAS_ARG is going to
>> break...
>
> Sure. This change is backward incompatible for applications parsing
> bytecode in C or Python. That's why the patch also has to update the
> dis module.
>
> I don't see how you plan to keep backward compatibility, since the
> argument size changed from 2 bytes to 1 byte. You must update your
> code (written in C or Python or whatever).
>
> Fortunately, the dis module was enhanced in Python 3.4:
> get_instructions() now gives nice Instruction objects rather than only
> pure text output.
>
> FYI I wrote my own library to decode and encode bytecode. It provides
> abstract bytecode objects to easily modify bytecode:
> https://bytecode.readthedocs.org/
>
> I suggest using such a library (or simply the dis module for simple
> needs) if you have to handle bytecode, rather than writing your own
> code.
>
> I know a few other projects which handle bytecode directly:
>
> * https://pypi.python.org/pypi/codetransformer
> * https://github.com/serprex/byteplay
> * https://pypi.python.org/pypi/coverage
>
> IMHO it's not a big deal to update these projects for the future
> Python 3.6. I can even help them to support the new bytecode format.

+1

We've also had previous discussions on adding a "minimum viable
bytecode editing" API to the standard library, and updating these
third party modules to support wordcode instead of bytecode could
provide a good use-case-driven opportunity for defining that (i.e. it
wouldn't be about providing an end user facing API directly, but
rather about letting CPython take care of the bookkeeping details for
things like lnotab and sorting out jump targets).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] pathlib - current status of discussions
On Apr 13 2016, Ethan Furman wrote:
> On 04/13/2016 03:45 PM, Nikolaus Rath wrote:
>
>> When passing an object that is of type str and has a __fspath__
>> attribute, all approaches return the value of __fspath__().
>>
>> However, when passing something of type bytes, the second approach
>> returns the object, while the third returns the value of __fspath__().
>>
>> Is this intentional? I think a __fspath__ attribute should always be
>> preferred.
>
> Yes, it is intentional. The second approach assumes __fspath__ can
> only contain str, so there is no point in checking it for bytes.

Either I haven't understood your answer, or you haven't understood my
question. I'm concerned about this case:

    class Special(bytes):
        def __fspath__(self):
            return 'str-val'
    obj = Special('bytes-val', 'utf8')
    path_obj = fspath(obj, allow_bytes=True)

With #2, path_obj == 'bytes-val'. With #3, path_obj == 'str-val'.

I would expect that fspath(obj, allow_bytes=True) == 'str-val' (after
all, it's allow_bytes, not require_bytes).

Best,
-Nikolaus

--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«
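The difference Nikolaus describes can be reproduced with two small functions. These are hypothetical sketches in the spirit of "type match first" versus "protocol first"; they are not copies of the approaches in Brett's gist, and the function names are invented for illustration.

```python
def fspath_type_first(path):
    """Hypothetical: str/bytes (and subclasses) pass through before the protocol."""
    if isinstance(path, (str, bytes)):
        return path
    try:
        method = type(path).__fspath__
    except AttributeError:
        raise TypeError("not a path-like object") from None
    return method(path)


def fspath_protocol_first(path):
    """Hypothetical: __fspath__ is consulted first, then str/bytes pass through."""
    method = getattr(type(path), "__fspath__", None)
    if method is not None:
        return method(path)
    if isinstance(path, (str, bytes)):
        return path
    raise TypeError("not a path-like object")


class Special(bytes):
    def __fspath__(self):
        return 'str-val'


obj = Special('bytes-val', 'utf8')

# The same object yields different results depending on the ordering.
type_first_result = fspath_type_first(obj)          # the bytes object itself
protocol_first_result = fspath_protocol_first(obj)  # the str from __fspath__
```

Plain str and bytes arguments behave identically under both orderings; only a subclass that also defines __fspath__ exposes the difference.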
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 14 April 2016 at 07:37, Victor Stinner wrote:
> On Wednesday, 13 April 2016, Brett Cannon wrote:
>>
>> All of this is demonstrated in
>> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 by the
>> various possibilities. In the end it's not a corner case because the
>> definition of __fspath__ will be such that there's no ambiguity in what
>> os.fspath() will accept and what __fspath__ can return and the code will be
>> written to conform to what the PEP dictates (IOW I'm aware that this needs
>> to be considered in the implementation :) .
>
> I'm not a big fan of a flag parameter to change the return type of a
> function. Usually, two functions are preferred. In the os module we have
> getcwd/getcwdb for example. I don't know if it's a good example

It is, as one of the benefits of the "two separate functions" model is
to improve type inference during static analysis - you don't
necessarily know the values of parameters at analysis time, but you do
know which function is being called.

> Do you know other examples of Python functions taking a (flag) parameter to
> change the result type?

subprocess.Popen has a couple of flags that can do that (more
precisely, they change the return type of some methods on the
resulting object), but that's not an especially pretty API in general.

String based type variations are more common (e.g. file mode flags,
using the codecs module registry), but they're still used only
sparingly (since they make the code harder to reason about for both
humans and static analysers).

In terms of types for filesystem path APIs:

1. I assume we'll want a fast path for bytes & str to avoid
   performance regressions (especially in os.path, where we may be
   doing pure data manipulation without any IO operations)
2. I favour defining __fspath__ and os.fspath() in terms of what the
   os and os.path modules need to handle both DirEntry and pathlib
   (which I currently expect to be str-or-bytes)
3. For the benefit of higher level cross-platform code like pathlib,
   it likely makes sense to also have a str-only API that throws an
   exception rather than returning bytes

However, I also suggest deferring a decision on 3 until 2 has been
definitively answered by way of implementing the changes.

If I'm right about 2, then the API could be something like:

- os.fspath -> str-or-bytes
- os.fsencode -> bytes (with coercion from str)
- os.fsdecode -> str (with coercion from bytes)
- os.strpath -> str (no coercion)

It's also worth noting that os.fsencode and os.fsdecode are already
idempotent - their current signatures are "str-or-bytes -> bytes" and
"str-or-bytes -> str". With a str-or-bytes return type on os.fspath,
adapting them to handle rich path objects should just be a matter of
adding an os.fspath call as the first step.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
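The idempotency of os.fsencode and os.fsdecode is easy to check directly. The snippet below is a small demonstration using an ASCII file name chosen for illustration, so the result does not depend on the filesystem encoding.

```python
import os

name = "report.txt"        # illustrative ASCII name
raw = os.fsencode(name)    # str -> bytes

# Passing the "already converted" type through is a no-op,
# so both functions accept str-or-bytes.
encoded_twice = os.fsencode(raw)
decoded_from_str = os.fsdecode(name)
decoded_from_bytes = os.fsdecode(raw)
```

Because each function accepts both types, callers can apply them without first checking what they were handed.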
Re: [Python-Dev] pathlib - current status of discussions
On Apr 13, 2016 19:06, Brett Cannon wrote:
> On Wed, 13 Apr 2016 at 15:46 Nikolaus Rath wrote:
>> When passing an object that is of type str and has a __fspath__
>> attribute, all approaches return the value of __fspath__().
>>
>> However, when passing something of type bytes, the second approach
>> returns the object, while the third returns the value of __fspath__().
>>
>> Is this intentional? I think a __fspath__ attribute should always be
>> preferred.
>
> It's very much intentional. If we define __fspath__() to only return strings
> but still want to minimize boilerplate of allowing bytes to simply pass
> through without checking a path argument to see if it is bytes then approach
> #2 is warranted. But if __fspath__() can return bytes then approach #3 allows
> for it.

Er, the difference comes in when the object passed to os.fspath is a
subclass of bytes that, itself, has a __fspath__ method (which may
return a str). It's unlikely to occur in the wild, but is a semantic
difference between this case and all other objects with __fspath__
methods.
Re: [Python-Dev] pathlib - current status of discussions
On Apr 13, 2016 20:06, Chris Barker wrote:
>
> In this case, I don't know that we need to be tolerant of buggy
> __fspathname__() implementations -- they should be tested outside these
> checks, and not be buggy. So a buggy implementation may raise and may be
> ignored, depending on what Exception the bug triggers -- big deal. The
> only time it would matter is when the implementer is debugging the
> implementation.
>
> -CHB

Yes but you can often, and can in this case, restrict the contents of
the try block to a single operation - a name access, an attribute, a
subscript - and that sharply limits the risk of such a thing happening.
Sure, the object's __getattr(ibute)__ could still fail from something
deep inside it missing a different attribute, but that's it.
Re: [Python-Dev] pathlib - current status of discussions
On 04/13/2016 05:06 PM, Chris Barker wrote:
> In this case, I don't know that we need to be tolerant of buggy
> __fspathname__() implementations -- they should be tested outside these
> checks, and not be buggy. So a buggy implementation may raise and may be
> ignored, depending on what Exception the bug triggers -- big deal. The
> only time it would matter is when the implementer is debugging the
> implementation.

Yet the idea behind robust exception handling is to test as little as
possible and only catch what you know how to correct.

This code catches only one thing, only at one place, and we know how to
deal with it:

    try:
        fsp = obj.__fspath__
    except AttributeError:
        pass
    else:
        fsp = fsp()

Contrarily, this next code catches the same error, but it could happen
at the one place we know how to deal with it *or* anywhere further down
the call stack where we have no clue what the proper course is to handle
the problem... yet we suppress it anyway:

    try:
        fsp = obj.__fspath__()
    except AttributeError:
        pass

Certainly not code I want to see in the stdlib.

--
~Ethan~
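A small runnable comparison of the two patterns Ethan contrasts may help. The Buggy class and the helper names narrow and broad are made up for illustration, and returning None for "no protocol support" is just a convenience of this sketch.

```python
class Buggy:
    def __fspath__(self):
        # Bug inside the method: this attribute was never set.
        return self.missing_attribute


def narrow(obj):
    """Only the attribute lookup is guarded; bugs inside __fspath__ propagate."""
    try:
        fsp = obj.__fspath__
    except AttributeError:
        return None
    return fsp()


def broad(obj):
    """Lookup and call are both guarded; an internal AttributeError is swallowed."""
    try:
        return obj.__fspath__()
    except AttributeError:
        return None
```

Here broad(Buggy()) quietly returns None as if the object had no __fspath__ at all, whereas narrow(Buggy()) lets the AttributeError from the buggy method surface.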
Re: [Python-Dev] pathlib - current status of discussions
On Wed, Apr 13, 2016 at 1:47 PM, Random832 wrote:
> On Wed, Apr 13, 2016, at 16:39, Chris Barker wrote:
> > so are we worried that __fspath__ will exist and be callable, but might
> > raise an AttributeError somewhere inside itself? if so isn't it broken
> > anyway, so should it be ignored?
>
> Well, if you're going to say "ignore the protocol because it's broken",
> where do you stop? What if it raises some other exception? What if it
> raises SystemExit?

This is pretty much always the case with EAFP coding:

    try:
        something()
    except SomeError:
        do_something_else()

Unless SomeError is a custom defined error that you know is never going
to get raised anywhere else, then something() could raise SomeError for
the reason you expect, or some code deep in the call stack could raise
SomeError also, and you wouldn't know that.

I had a student run into this and it took him a good while to debug it.
But that was because the code in something() was pretty darn buggy. If he
had tested something() by itself, there would have been no issue finding
the problem.

In this case, I don't know that we need to be tolerant of buggy
__fspathname__() implementations -- they should be tested outside these
checks, and not be buggy. So a buggy implementation may raise and may be
ignored, depending on what Exception the bug triggers -- big deal. The
only time it would matter is when the implementer is debugging the
implementation.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
Re: [Python-Dev] pathlib - current status of discussions
On Wed, 13 Apr 2016 at 15:20 Victor Stinner wrote:
> Oh, since others voted, I will also vote and explain my vote.
>
> I like choice 1, str only, because it's very well defined. In Python
> 3, Unicode is simply the native type for text. It's accepted by almost
> all functions. In other emails, I also explained that Unicode is fine
> to store undecodable filenames on UNIX; it has worked as expected for
> many years (since Python 3.3).
>
> --
>
> If you cannot survive without bytes, I suggest adding two functions:
> one for str only, another which can return str or bytes.
>
> Maybe you want in fact two protocols: __fspath__ (str only) and
> __fspathb__ (bytes only)? os.fspathb() would first try __fspathb__, or
> fall back to os.fsencode(__fspath__). os.fspath() would first try
> __fspath__, or fall back to os.fsdecode(__fspathb__). IMHO it's not
> worth having such complexity while Unicode handles all use cases.

Implementing two magic methods for this seems like overkill. Best I
would be willing to do with automatic encode/decode is use
os.fsencode()/os.fsdecode() on the argument or what __fspath__()
returned.

> Or do you know functions implemented in Python accepting str *and* bytes?

On purpose, nothing off the top of my head.

> --
>
> The C implementation of the os module has an important
> path_converter() function:
>
>  * path_converter accepts (Unicode) strings and their
>  * subclasses, and bytes and their subclasses.  What
>  * it does with the argument depends on the platform:
>  *
>  *   * On Windows, if we get a (Unicode) string we
>  *     extract the wchar_t * and return it; if we get
>  *     bytes we extract the char * and return that.
>  *
>  *   * On all other platforms, strings are encoded
>  *     to bytes using PyUnicode_FSConverter, then we
>  *     extract the char * from the bytes object and
>  *     return that.
>
> This function will implement something like os.fspath().
>
> With os.fspath() only accepting str, we will return directly the
> Unicode string on Windows. On UNIX, Unicode will be encoded, as it's
> already done for Unicode strings.
>
> This specific function would benefit from flavor 4 (os.fspath() can
> return str and bytes), but it's more an exception than the rule. It
> would be more a micro-optimization than a good reason to drive the API
> design.

Yep, it's interesting to know but Chris and I won't let it drive the
decision (I assume).

-Brett

> Victor
>
> On Wednesday, 13 April 2016, Brett Cannon wrote:
> >
> > https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1
> > has the four potential approaches implemented (although it doesn't
> > follow the "separate functions" approach some are proposing and
> > instead goes with the allow_bytes approach I originally proposed).
Re: [Python-Dev] pathlib - current status of discussions
On Wed, 13 Apr 2016 at 15:46 Nikolaus Rath wrote:
> On Apr 13 2016, Brett Cannon wrote:
> > On Tue, 12 Apr 2016 at 22:38 Michael Mysinger via Python-Dev <
> > python-dev@python.org> wrote:
> >
> >> Ethan Furman stoneleaf.us> writes:
> >>
> >> > Do we allow bytes to be returned from os.fspath()? If yes, then do we
> >> > allow bytes from __fspath__()?
> >>
> >> De-lurking. Especially since the ultimate goal is better
> >> interoperability, I feel like an implementation that people can play
> >> with would help guide the few remaining decisions. To help test the
> >> various options you could temporarily add a
> >> _allow_bytes=GLOBAL_CONFIG_OPTION default argument to both
> >> pathlib.__fspath__() and os.fspath(), with distinct configurable
> >> defaults for each.
> >>
> >> In the spirit of Python 3 I feel like bytes might not be needed in
> >> practice, but something like this with defaults of False will allow
> >> people to easily test all the various options.
> >
> > https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has
> > the four potential approaches implemented (although it doesn't follow the
> > "separate functions" approach some are proposing and instead goes with
> > the allow_bytes approach I originally proposed).
>
> When passing an object that is of type str and has a __fspath__
> attribute, all approaches return the value of __fspath__().
>
> However, when passing something of type bytes, the second approach
> returns the object, while the third returns the value of __fspath__().
>
> Is this intentional? I think a __fspath__ attribute should always be
> preferred.

It's very much intentional. If we define __fspath__() to only return
strings but still want to minimize boilerplate of allowing bytes to
simply pass through without checking a path argument to see if it is
bytes then approach #2 is warranted. But if __fspath__() can return
bytes then approach #3 allows for it.
Re: [Python-Dev] pathlib - current status of discussions
On 04/13/2016 03:45 PM, Nikolaus Rath wrote:
> When passing an object that is of type str and has a __fspath__
> attribute, all approaches return the value of __fspath__().
>
> However, when passing something of type bytes, the second approach
> returns the object, while the third returns the value of __fspath__().
>
> Is this intentional? I think a __fspath__ attribute should always be
> preferred.

Yes, it is intentional. The second approach assumes __fspath__ can only
contain str, so there is no point in checking it for bytes.

--
~Ethan~
Re: [Python-Dev] pathlib - current status of discussions
On Apr 13 2016, Brett Cannon wrote:
> On Tue, 12 Apr 2016 at 22:38 Michael Mysinger via Python-Dev <
> python-dev@python.org> wrote:
>
>> Ethan Furman stoneleaf.us> writes:
>>
>> > Do we allow bytes to be returned from os.fspath()? If yes, then do we
>> > allow bytes from __fspath__()?
>>
>> De-lurking. Especially since the ultimate goal is better
>> interoperability, I feel like an implementation that people can play
>> with would help guide the few remaining decisions. To help test the
>> various options you could temporarily add a
>> _allow_bytes=GLOBAL_CONFIG_OPTION default argument to both
>> pathlib.__fspath__() and os.fspath(), with distinct configurable
>> defaults for each.
>>
>> In the spirit of Python 3 I feel like bytes might not be needed in
>> practice, but something like this with defaults of False will allow
>> people to easily test all the various options.
>
> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has
> the four potential approaches implemented (although it doesn't follow the
> "separate functions" approach some are proposing and instead goes with the
> allow_bytes approach I originally proposed).

When passing an object that is of type str and has a __fspath__
attribute, all approaches return the value of __fspath__().

However, when passing something of type bytes, the second approach
returns the object, while the third returns the value of __fspath__().

Is this intentional? I think a __fspath__ attribute should always be
preferred.

Best,
-Nikolaus

--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«
Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units
On 2016-04-13 12:24 PM, Victor Stinner wrote: Can someone please review the change? +1 for the change. I can take a look at the patch in a few days. Yury ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units
2016-04-14 0:11 GMT+02:00 Ryan Gonzalez :
> So code that depends on iterating through bytecode via HAS_ARG is going to
> break...

Sure. This change is backward incompatible for applications parsing bytecode in C or Python. That's why the patch also has to update the dis module.

I don't see how you plan to keep backward compatibility, since the argument size changed from 2 bytes to 1 byte. You must update your code (written in C or Python or whatever).

Fortunately, the dis module was enhanced in Python 3.4: get_instructions() now gives nice Instruction objects rather than only pure text output.

FYI I wrote my own library to encode and decode bytecode. It provides abstract bytecode objects to easily modify bytecode: https://bytecode.readthedocs.org/

I suggest using such a library (or simply the dis module for simple needs) if you have to handle bytecode, rather than writing your own code.

I know a few other projects which handle bytecode directly:

* https://pypi.python.org/pypi/codetransformer
* https://github.com/serprex/byteplay
* https://pypi.python.org/pypi/coverage

IMHO it's not a big deal to update these projects for the future Python 3.6. I can even help them to support the new bytecode format.

Victor
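As an illustration of the dis.get_instructions() suggestion above, here is a minimal sketch of walking a function's instructions without parsing the raw byte string. The function `sample` is a made-up example, and the exact opcode names printed depend on the Python version, so no output is shown.

```python
import dis


def sample(x):
    # Made-up function, used only to have some bytecode to inspect.
    return x + 1


# get_instructions() yields Instruction objects, so the caller never has to
# know how wide an instruction or its argument is in the raw byte string.
instructions = list(dis.get_instructions(sample))

for instr in instructions:
    # offset, opname and argrepr are attributes of dis.Instruction.
    print(instr.offset, instr.opname, instr.argrepr)
```

Code written this way should not need to care whether arguments are encoded in one byte or two, which is the point of the advice.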
Re: [Python-Dev] pathlib - current status of discussions
Oh, since others voted, I will also vote and explain my vote.

I like choice 1, str only, because it's very well defined. In Python 3, Unicode is simply the native type for text. It's accepted by almost all functions. In other emails, I also explained that Unicode is fine to store undecodable filenames on UNIX; it has worked as expected for many years (since Python 3.3).

--

If you cannot survive without bytes, I suggest adding two functions: one for str only, another which can return str or bytes.

Maybe you want in fact two protocols: __fspath__ (str only) and __fspathb__ (bytes only)? os.fspathb() would first try __fspathb__, or fall back to os.fsencode(__fspath__). os.fspath() would first try __fspath__, or fall back to os.fsdecode(__fspathb__). IMHO it's not worth having such complexity while Unicode handles all use cases. Or do you know functions implemented in Python accepting str *and* bytes?

--

The C implementation of the os module has an important path_converter() function:

 * path_converter accepts (Unicode) strings and their
 * subclasses, and bytes and their subclasses.  What
 * it does with the argument depends on the platform:
 *
 *   * On Windows, if we get a (Unicode) string we
 *     extract the wchar_t * and return it; if we get
 *     bytes we extract the char * and return that.
 *
 *   * On all other platforms, strings are encoded
 *     to bytes using PyUnicode_FSConverter, then we
 *     extract the char * from the bytes object and
 *     return that.

This function will implement something like os.fspath(). With os.fspath() only accepting str, we will return the Unicode string directly on Windows. On UNIX, Unicode will be encoded, as is already done for Unicode strings.

This specific function would benefit from flavor 4 (os.fspath() can return str and bytes), but it's more an exception than the rule. It would be more a micro-optimization than a good reason to drive the API design.

Victor

On Wednesday, April 13, 2016, Brett Cannon wrote:
> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has the
> four potential approaches implemented (although it doesn't follow the
> "separate functions" approach some are proposing and instead goes with the
> allow_bytes approach I originally proposed).
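To make the two-protocol idea in this message concrete, here is a rough sketch of the fallback order it describes. Everything here is hypothetical: `__fspathb__`, `fspath_sketch` and `fspathb_sketch` are names used only for illustration, the message itself argues the extra complexity is not worth it, and the pass-through of plain str/bytes inputs is an assumption the message does not spell out.

```python
import os


def fspath_sketch(obj):
    """Hypothetical str-only lookup: __fspath__, else decode __fspathb__."""
    if isinstance(obj, str):          # assumption: plain str passes through
        return obj
    cls = type(obj)
    if hasattr(cls, "__fspath__"):
        return cls.__fspath__(obj)
    if hasattr(cls, "__fspathb__"):   # hypothetical bytes-only protocol
        return os.fsdecode(cls.__fspathb__(obj))
    raise TypeError("expected str or path-like object, not " + cls.__name__)


def fspathb_sketch(obj):
    """Hypothetical bytes-only lookup: __fspathb__, else encode __fspath__."""
    if isinstance(obj, bytes):        # assumption: plain bytes pass through
        return obj
    cls = type(obj)
    if hasattr(cls, "__fspathb__"):
        return cls.__fspathb__(obj)
    if hasattr(cls, "__fspath__"):
        return os.fsencode(cls.__fspath__(obj))
    raise TypeError("expected bytes or path-like object, not " + cls.__name__)


class TextPath:
    """Made-up type implementing only the str protocol."""
    def __fspath__(self):
        return "a/b"


class BytesPath:
    """Made-up type implementing only the hypothetical bytes protocol."""
    def __fspathb__(self):
        return b"a/b"
```

With these definitions, each function returns its own type regardless of which protocol the object implements, at the cost of two protocols and two entry points.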
Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units
So code that depends on iterating through bytecode via HAS_ARG is going to break...

Darn it. :/

-- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something’s wrong. http://kirbyfan64.github.io/

On Apr 13, 2016 4:44 PM, "Victor Stinner" wrote:
> On Wednesday, April 13, 2016, Ryan Gonzalez wrote:
>
>> What is the value of HAS_ARG going to be now?
>
> I asked Demur to keep HAS_ARG(). Not really for backward compatibility,
> but for the dis module: to keep a nice assembler. There are also debug
> traces in ceval.c which use it.
>
> For ceval.c, we might use HAS_ARG() to micro-optimize oparg=0 (hardcode 0
> rather than reading the bytecode) for operators with no argument. Or maybe
> it's completely useless :-)
>
> Victor
Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units
On Wednesday, April 13, 2016, Ryan Gonzalez wrote:
> What is the value of HAS_ARG going to be now?

I asked Demur to keep HAS_ARG(). Not really for backward compatibility, but for the dis module: to keep a nice assembler. There are also debug traces in ceval.c which use it.

For ceval.c, we might use HAS_ARG() to micro-optimize oparg=0 (hardcode 0 rather than reading the bytecode) for operators with no argument. Or maybe it's completely useless :-)

Victor
[Python-Dev] Tag-based buildmaster (was: Most 3.x buildbots are green again ... )
(Cross-posting to python-buildbots, discussion is probably best continued there) On Wed, Apr 13, 2016 at 3:37 PM, Brett Cannon wrote: > On Wed, 13 Apr 2016 at 13:17 Zachary Ware > wrote: >> After receiving a suggestion from koobs several months ago, I've been >> intermittently thinking about completely redoing our buildmaster setup >> such that instead of a single builder per version on each slave, we >> instead set up a series of builders with particular 'tags', and each >> builder attaches to each slave that satisfies the tags (running each >> build only on the first slave available). This would allow us to test >> some of the rarer options (such as --without-threads) significantly >> more often than 'never', and generally get a lot more >> customization/flexibility of builds. I haven't had a chance to sit >> down and think out all the edge cases of this idea, but what do people >> generally think of it? I think the GitHub switchover will be a good >> time to do this if it's generally seen as a decent idea, since there >> will need to be some work on the buildmaster to do the switch anyway. > > So we have slaves connect to multiple builders who have requirements of what > they are testing? So the --without-threads master would have all slaves able > to compile --without-threads connect to it and then do that build? And those > same slaves may also connect to the gcc and clang masters to do those builds > as well? So would that mean slaves could potentially do a bunch of builds > per change? That sounds nice to me as long as the slave maintainers are also > up to utilizing this by double/triple/quadrupling their builds. Basically, yes. I'm unsure as to whether the build would be done on all matching slaves on each change, or rotate between them (or use the next available) on each change; that would likely come down to which scheme we collectively want. 
I also have vague ideas about having 'daily' or even 'weekly' tags for builds that are deemed to not need a build for every changeset, which could alleviate some of the multiplying. -- Zach ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
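For readers trying to picture the tag idea, here is a rough sketch of the matching rule as described: a builder lists the tags it requires, and it attaches to every build machine whose own tags satisfy them. All names and tags below are invented for illustration ("workers" stands for the build slaves in the message), and this is not how the real buildmaster is configured.

```python
# Hypothetical capabilities advertised by each build machine.
workers = {
    "linux-a": {"posix", "gcc", "clang", "without-threads"},
    "linux-b": {"posix", "gcc"},
    "win-a": {"windows", "msvc"},
}

# Hypothetical builders, each with the tags a machine must provide.
builders = {
    "gcc": {"gcc"},
    "clang": {"clang"},
    "without-threads": {"posix", "without-threads"},
}


def matching_workers(required_tags, available):
    """Return the sorted names of machines whose tags cover required_tags."""
    return sorted(
        name for name, tags in available.items() if required_tags <= tags
    )


for builder_name, required in sorted(builders.items()):
    print(builder_name, "->", matching_workers(required, workers))
```

Whether a build then runs on every match or only on the first available one is the scheduling question left open in the message.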
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
Oops sorry, I forgot to add that I have no strong opinion on the type (I only have a minor preference for str only). Victor ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On Wednesday, April 13, 2016, Brett Cannon wrote:
> All of this is demonstrated in
> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 by
> the various possibilities. In the end it's not a corner case because the
> definition of __fspath__ will be such that there's no ambiguity in what
> os.fspath() will accept and what __fspath__ can return and the code will be
> written to conform to what the PEP dictates (IOW I'm aware that this needs
> to be considered in the implementation :) .

I'm not a big fan of a flag parameter to change the return type of a function. Usually, two functions are preferred. In the os module we have getcwd/getcwdb, for example. I don't know if it's a good example.

Do you know other examples of Python functions taking a (flag) parameter to change the result type?

Victor
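The getcwd/getcwdb pair mentioned above shows the "two functions" style in the existing os module: the return type is fixed by the function name, not by a flag. A small sketch (the variable names are just for illustration):

```python
import os

# Same information, two spellings: the function name decides the type.
cwd_text = os.getcwd()     # always str
cwd_bytes = os.getcwdb()   # always bytes

print(type(cwd_text).__name__, type(cwd_bytes).__name__)
```

A caller, or a static checker, can tell the result type from the call site alone, which is harder when a boolean argument switches it.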
Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units
The EXTENDED_ARG is included in the multibyte ops; I treat it just like any other operator. Here's a snippet of my hacked-dis.dis output, which made it clear to me that I could just count them as an "operator with word operand."

Line 3000: x = x if x or not x and x is None else x

0001dc83 7c 00 00  LOAD_FAST             x
0001dc86 91 01 00  EXTENDED_ARG          1
0001dc89 70 9f dc  JUMP_IF_TRUE_OR_POP   L1dc9f
0001dc8c 7c 00 00  LOAD_FAST             x
0001dc8f 0c        UNARY_NOT
0001dc90 91 01 00  EXTENDED_ARG          1
0001dc93 6f 9f dc  JUMP_IF_FALSE_OR_POP  L1dc9f
0001dc96 7c 00 00  LOAD_FAST             x
0001dc99 74 01 00  LOAD_GLOBAL           None
0001dc9c 6b 08 00  COMPARE_OP            'is'
L1dc9f:
0001dc9f 91 01 00  EXTENDED_ARG          1
0001dca2 72 ab dc  POP_JUMP_IF_FALSE     L1dcab
0001dca5 7c 00 00  LOAD_FAST             x
0001dca8 6e 03 00  JUMP_FORWARD          L1dcae (+3)
L1dcab:
0001dcab 7c 00 00  LOAD_FAST             x
L1dcae:
0001dcae 7d 00 00  STORE_FAST            x

On Wed, Apr 13, 2016 at 2:23 PM, Victor Stinner wrote:
> 2016-04-13 23:02 GMT+02:00 Eric Fahlgren :
> > Percentage of 1-byte args = 96.80%
>
> Yeah, I expected such high ratio. Good news that you confirm it.
>
> > Non-argument ops = 53,719
> > One-byte args = 368,787
> > Multi-byte args = 12,191
>
> Again, only a very few arguments take multiple bytes. Good, the
> bytecode will be smaller.
>
> IMHO it's more a nice side effect than a real goal. The runtime
> performance matters more than the size of the bytecode, it's not like
> a bytecode takes 4 MB. It's probably closer to 1 KB and so can probably
> benefit from the fastest CPU caches.
> > > > Just for the record, here's my arithmetic: > > byteCodeSize = 1*nonArgumentOps + 3*oneByteArgs + 3*multiByteArgs > > wordCodeSize = 2*nonArgumentOps + 2*oneByteArgs + 4*multiByteArgs > > If multiByteArgs means any size > 1 byte, the wordCodeSize formula is > wrong: > > - no parameter: 2 bytes > - 8-bit parameter: 2 bytes > - 16-bit parameter: 4 bytes > - 24-bit parameter: 6 bytes > - 32-bit parameter: 8 bytes > > But you wrote that you didn't see EXTEND_ARG, so I guess that > multibyte means 16-bit in your case, and so your formula is correct. > > Hopefully, I don't expect 32-bit parameters in the wild, only 24-bit > parameter for function with annotation. > > > > (It is interesting to note that I have never encountered an EXTENDED_ARG > operator in the wild, only in my own synthetic examples.) > > As I wrote, EXTENDED_ARG can be seen when MAKE_FUNCTION is used with > annotations. > > Victor > ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units
What is the value of HAS_ARG going to be now? -- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something’s wrong. http://kirbyfan64.github.io/ On Apr 13, 2016 11:26 AM, "Victor Stinner" wrote: > Hi, > > In the middle of recent discussions about Python performance, it was > discussed to change the Python bytecode. Serhiy proposed to reuse > MicroPython short bytecode to reduce the disk space and reduce the > memory footprint. > > Demur Rumed proposes a different change to use a regular bytecode > using 16-bit units: an instruction has always one 8-bit argument, it's > zero if the instruction doesn't have an argument: > >http://bugs.python.org/issue26647 > > According to benchmarks, it looks faster: > > http://bugs.python.org/issue26647#msg263339 > > IMHO it's a nice enhancement: it makes the code simpler. The most > interesting change is made in Python/ceval.c: > > -if (HAS_ARG(opcode)) > -oparg = NEXTARG(); > +oparg = NEXTARG(); > > This code is the very hot loop evaluating Python bytecode. I expect > that removing a conditional branch here can reduce the CPU branch > misprediction. > > I reviewed first versions of the change, and IMHO it's almost ready to > be merged. But I would prefer to have a review from a least a second > core reviewer. > > Can someone please review the change? > > -- > > The side effect of wordcode is that arguments in 0..255 now uses 2 > bytes per instruction instead of 3, so it also reduce the size of > bytecode for the most common case. > > Larger argument, 16-bit argument (0..65,535), now uses 4 bytes instead > of 3. Arguments are supported up to 32-bit: 24-bit uses 3 units (6 > bytes), 32-bit uses 4 units (8 bytes). MAKE_FUNCTION uses 16-bit > argument for keyword defaults and 24-bit argument for annotations. > Other common instruction known to use large argument are jumps for > bytecode longer than 256 bytes. 
> > -- > > Right now, ceval.c still fetchs opcode and then oparg with two 8-bit > instructions. Later, we can discuss if it would be possible to ensure > that the bytecode is always aligned to 16-bit in memory to fetch the > two bytes using a uint16_t* pointer. > > Maybe we can overallocate 1 byte in codeobject.c and align manually > the memory block if needed. Or ceval.c should maybe copy the code if > it's not aligned? > > Raymond Hettinger proposes something like that, but it looks like > there are concerns about non-aligned memory accesses: > >http://bugs.python.org/issue25823 > > The cost of non-aligned memory accesses depends on the CPU > architecture, but it can raise a SIGBUS on some arch (MIPS and > SPARC?). > > Victor > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com > ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
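For code that currently iterates over bytecode with HAS_ARG, the following sketch shows what a loop over the proposed fixed-width format could look like, based only on the description quoted above: every instruction is an (opcode, 8-bit argument) pair, and larger arguments are built from EXTENDED_ARG prefixes. The function name and the opcode numbers in the sample stream are placeholders for illustration; real code should take EXTENDED_ARG from dis.opmap, or better, use the dis module directly.

```python
EXTENDED_ARG = 144  # placeholder for this sketch; use dis.opmap["EXTENDED_ARG"]


def iter_wordcode(code):
    """Yield (offset, opcode, oparg) from a bytes object of 2-byte units.

    EXTENDED_ARG units are folded into the argument of the instruction
    that follows them instead of being yielded on their own.
    """
    extended = 0
    for offset in range(0, len(code), 2):
        opcode = code[offset]
        oparg = code[offset + 1] | extended
        if opcode == EXTENDED_ARG:
            # Shift the accumulated bits up to make room for the next byte.
            extended = oparg << 8
        else:
            extended = 0
            yield offset, opcode, oparg


# Made-up stream: three instructions, the second one prefixed by EXTENDED_ARG.
sample_code = bytes([100, 5, EXTENDED_ARG, 1, 110, 2, 83, 0])
decoded = list(iter_wordcode(sample_code))
print(decoded)
```

Note that there is no HAS_ARG test anywhere in the loop: the argument byte is always present, which is the simplification made in ceval.c.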
Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units
2016-04-13 23:02 GMT+02:00 Eric Fahlgren :
> Percentage of 1-byte args = 96.80%

Yeah, I expected such a high ratio. Good news that you confirm it.

> Non-argument ops = 53,719
> One-byte args = 368,787
> Multi-byte args = 12,191

Again, only a very few arguments take multiple bytes. Good, the bytecode will be smaller.

IMHO it's more a nice side effect than a real goal. Runtime performance matters more than the size of the bytecode; it's not like bytecode takes 4 MB. It's probably closer to 1 KB, and so can probably benefit from the fastest CPU caches.

> Just for the record, here's my arithmetic:
> byteCodeSize = 1*nonArgumentOps + 3*oneByteArgs + 3*multiByteArgs
> wordCodeSize = 2*nonArgumentOps + 2*oneByteArgs + 4*multiByteArgs

If multiByteArgs means any size > 1 byte, the wordCodeSize formula is wrong:

- no parameter: 2 bytes
- 8-bit parameter: 2 bytes
- 16-bit parameter: 4 bytes
- 24-bit parameter: 6 bytes
- 32-bit parameter: 8 bytes

But you wrote that you didn't see EXTENDED_ARG, so I guess that multibyte means 16-bit in your case, and so your formula is correct.

I don't expect 32-bit parameters in the wild, only 24-bit parameters for functions with annotations.

> (It is interesting to note that I have never encountered an EXTENDED_ARG
> operator in the wild, only in my own synthetic examples.)

As I wrote, EXTENDED_ARG can be seen when MAKE_FUNCTION is used with annotations.

Victor
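The size list in this message can be written as a small function. This is only a sketch of the stated rule (one 2-byte unit per 8 bits of argument, with a minimum of one unit); the name `wordcode_size` is made up for illustration.

```python
def wordcode_size(oparg=0):
    """Bytes used by one instruction in the proposed 16-bit-unit format.

    An instruction without an argument is treated as having argument 0.
    """
    if not 0 <= oparg <= 0xFFFFFFFF:
        raise ValueError("argument must fit in 32 bits")
    units = 1
    while oparg > 0xFF:
        # Every additional byte of argument costs one more 2-byte unit.
        oparg >>= 8
        units += 1
    return 2 * units


for value in (0, 0xFF, 0x100, 0xFFFF, 0x10000, 0x1000000):
    print(hex(value), wordcode_size(value))
```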
Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units
On Wednesday, April 13, 2016 09:25, Victor Stinner wrote:
> The side effect of wordcode is that arguments in 0..255 now uses 2 bytes per
> instruction instead of 3, so it also reduce the size of bytecode for the most
> common case.
>
> Larger argument, 16-bit argument (0..65,535), now uses 4 bytes instead of 3.
> Arguments are supported up to 32-bit: 24-bit uses 3 units (6 bytes), 32-bit
> uses 4 units (8 bytes). MAKE_FUNCTION uses 16-bit argument for keyword
> defaults and 24-bit argument for annotations.
> Other common instruction known to use large argument are jumps for bytecode
> longer than 256 bytes.

A couple months ago during an earlier discussion of wordcode, I got curious enough to instrument dis.dis so that I could calculate the actual size changes expected in practice. I ran it on a large chunk of our product code, here are the results (looks best with a fixed font). I suspect the fairly significant reduction in footprint will also give better cache hit characteristics, so we might see some "magic" speed ups from that, too.

Code-generating source lines =    70,792
Total bytes                  = 1,196,653
Argument-bearing operators   =   380,978
Operands over 1 byte long    =    12,191
Extended arguments           =         0
Percentage of 1-byte args    =    96.80%

Total operators              =   434,697
Non-argument ops             =    53,719
One-byte args                =   368,787
Multi-byte args              =    12,191

Byte code size               = 1,196,653
Word code size               =   893,776
Word:byte size               =    74.69%

Just for the record, here's my arithmetic:

byteCodeSize = 1*nonArgumentOps + 3*oneByteArgs + 3*multiByteArgs
wordCodeSize = 2*nonArgumentOps + 2*oneByteArgs + 4*multiByteArgs

(It is interesting to note that I have never encountered an EXTENDED_ARG operator in the wild, only in my own synthetic examples.)
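The arithmetic in this message can be re-run from the three operator counts it reports. The sketch below simply plugs those counts into the two formulas; the variable names are illustrative.

```python
# Counts taken from the message above.
non_argument_ops = 53719
one_byte_args = 368787
multi_byte_args = 12191

# Old format: 1 byte per argument-less op, 3 bytes per op with an argument.
byte_code_size = 1 * non_argument_ops + 3 * one_byte_args + 3 * multi_byte_args

# Wordcode: 2 bytes per op, 4 bytes when the argument needs 16 bits.
word_code_size = 2 * non_argument_ops + 2 * one_byte_args + 4 * multi_byte_args

ratio_percent = round(100 * word_code_size / byte_code_size, 2)
one_byte_share = round(
    100 * one_byte_args / (one_byte_args + multi_byte_args), 2
)

print(byte_code_size, word_code_size, ratio_percent, one_byte_share)
```

The totals come out to 1,196,653 and 893,776 bytes, a ratio of 74.69%, matching the figures reported in the message.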
Re: [Python-Dev] pathlib - current status of discussions
On Wed, Apr 13, 2016, at 16:39, Chris Barker wrote: > so are we worried that __fspath__ will exist and be callable, but might > raise an AttributeError somewhere inside itself? if so isn't it broken > anyway, so should it be ignored? Well, if you're going to say "ignore the protocol because it's broken", where do you stop? What if it raises some other exception? What if it raises SystemExit? ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] pathlib - current status of discussions
On Wed, 13 Apr 2016 at 13:40 Chris Barker wrote: > so are we worried that __fspath__ will exist and be callable, but might > raise an AttributeError somewhere inside itself? if so isn't it broken > anyway, so should it be ignored? > It should propagate instead of swallowing up the exception, otherwise it's hard to debug why __fspath__ seems to be ignored. > > and I know it's asking permission rather than forgiveness, but what's > wrong with: > > if hasattr(path, "__fspath__"): > path = path.__fspath__() > > if you really want to check for the existence of the attribute first? > > Nothing. > or even: > > path = path.__fspath__ if hasattr(path, "__fspath__") else path > > That also works. > > (OK, really a Pythonic style question now) > Yes, this is getting a bit side-tracked over some example code to just get a concept across. -Brett > > -CHB > > > > On Wed, Apr 13, 2016 at 12:54 PM, Brett Cannon wrote: > >> >> >> On Wed, 13 Apr 2016 at 12:39 Fred Drake wrote: >> >>> On Wed, Apr 13, 2016 at 3:24 PM, Chris Angelico >>> wrote: >>> > Is that the intention, or should the exception catching be narrower? I >>> > know it's clunky to write it in Python, but AIUI it's less so in C: >>> > >>> > try: >>> > callme = path.__fspath__ >>> > except AttributeError: >>> > pass >>> > else: >>> > path = callme() >>> >>> +1 for this variant; I really don't like masking errors inside the >>> __fspath__ implementation. >>> >> >> Don't read too much into the code in that gist. I just did them quickly >> to get the point across of the proposals in terms of str/bytes, not what >> will be proposed in any final patch. >> >> ___ >> Python-Dev mailing list >> Python-Dev@python.org >> https://mail.python.org/mailman/listinfo/python-dev >> > Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/chris.barker%40noaa.gov >> >> > > > -- > > Christopher Barker, Ph.D. 
> Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R(206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > chris.bar...@noaa.gov > ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] pathlib - current status of discussions
so are we worried that __fspath__ will exist and be callable, but might raise an AttributeError somewhere inside itself? if so isn't it broken anyway, so should it be ignored?

and I know it's asking permission rather than forgiveness, but what's wrong with:

    if hasattr(path, "__fspath__"):
        path = path.__fspath__()

if you really want to check for the existence of the attribute first?

or even:

    path = path.__fspath__() if hasattr(path, "__fspath__") else path

(OK, really a Pythonic style question now)

-CHB

On Wed, Apr 13, 2016 at 12:54 PM, Brett Cannon wrote:
> On Wed, 13 Apr 2016 at 12:39 Fred Drake wrote:
>
>> On Wed, Apr 13, 2016 at 3:24 PM, Chris Angelico wrote:
>> > Is that the intention, or should the exception catching be narrower? I
>> > know it's clunky to write it in Python, but AIUI it's less so in C:
>> >
>> > try:
>> >     callme = path.__fspath__
>> > except AttributeError:
>> >     pass
>> > else:
>> >     path = callme()
>>
>> +1 for this variant; I really don't like masking errors inside the
>> __fspath__ implementation.
>
> Don't read too much into the code in that gist. I just did them quickly to
> get the point across of the proposals in terms of str/bytes, not what will
> be proposed in any final patch.

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception

chris.bar...@noaa.gov
Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!
On Wed, 13 Apr 2016 at 13:17 Zachary Ware wrote: > [SNIP] > --- > > After receiving a suggestion from koobs several months ago, I've been > intermittently thinking about completely redoing our buildmaster setup > such that instead of a single builder per version on each slave, we > instead set up a series of builders with particular 'tags', and each > builder attaches to each slave that satisfies the tags (running each > build only on the first slave available). This would allow us to test > some of the rarer options (such as --without-threads) significantly > more often than 'never', and generally get a lot more > customization/flexibility of builds. I haven't had a chance to sit > down and think out all the edge cases of this idea, but what do people > generally think of it? I think the GitHub switchover will be a good > time to do this if it's generally seen as a decent idea, since there > will need to be some work on the buildmaster to do the switch anyway. > So we have slaves connect to multiple builders who have requirements of what they are testing? So the --without-threads master would have all slaves able to compile --without-threads connect to it and then do that build? And those same slaves may also connect to the gcc and clang masters to do those builds as well? So would that mean slaves could potentially do a bunch of builds per change? That sounds nice to me as long as the slave maintainers are also up to utilizing this by double/triple/quadrupling their builds. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!
On Wed, Apr 13, 2016 at 6:40 AM, Victor Stinner wrote: > Hi, > > Last months, most 3.x buildbots failed randomly. Some of them were > always failing. I spent some time to fix almost all Windows and Linux > buildbots. There were a lot of different issues. Thank you for doing this! > Maybe it's time to move more 3.x buildbots to the "stable" category? > http://buildbot.python.org/all/waterfall?category=3.x.stable A few months ago, I put together a list of suggestions for updating the stable/unstable list, but never got around to implementing it. > We have many offline buildbots. What's the status of these buildbots? > Should we expect that they come back soon? My Windows 8.1 bot is a VM that resides on a machine that has been disturbingly unstable lately, and it's starting to seem like the instability is due to that VM. I hope to have it back up (and stable) again soon, but have no timetable for it. My Docs bot was off after losing power over the weekend, and I just hadn't noticed yet. It's back now. I'll ping the python-buildbots list about other offline bots. > Or would it be possible to hide them? It would help to check the > status of all buildbots. I'm not sure, but that would be a nice feature. > - the 4 ICC buildbots are failing with stack overflow, segfault, etc. > Again, I'm not sure that these buildbots are useful since it looks > like we don't support this compiler yet. Or does it help to work on > supporting this compiler? Who is working on ICC support? The Ubuntu ICC bot is generally quite stable. The OSX ICC bot is currently offline, but has only a couple of known issues. The Windows ICC bot is still a bit experimental, but has inched closer to producing a working build. R. David Murray and I have been working with Intel on ICC support. > By the way, I'm always surprised by the huge difference of time needed > to run a build on the different slaves: from a few minutes to more > than 3 hours. 
> The fastest Windows slave takes 28 minutes (run tests in
> parallel using 4 child processes), whereas the 3 others (run tests
> sequentially and) take between 2 hours and more than 3 hours! Why
> running tests on Windows takes so long?

Most of that is down to debug mode; building Python in debug mode links with the debug CRT, which also enables all manner of extra checks. When it's up, the non-debug Windows bot also runs the test suite in ~28 minutes, running sequentially.

---

After receiving a suggestion from koobs several months ago, I've been intermittently thinking about completely redoing our buildmaster setup such that instead of a single builder per version on each slave, we instead set up a series of builders with particular 'tags', and each builder attaches to each slave that satisfies the tags (running each build only on the first slave available). This would allow us to test some of the rarer options (such as --without-threads) significantly more often than 'never', and generally get a lot more customization/flexibility of builds. I haven't had a chance to sit down and think out all the edge cases of this idea, but what do people generally think of it? I think the GitHub switchover will be a good time to do this if it's generally seen as a decent idea, since there will need to be some work on the buildmaster to do the switch anyway.

-- Zach
Re: [Python-Dev] List posting custom [was: current status of discussions]
On Wed, Apr 13, 2016 at 5:56 AM, Stephen J. Turnbull wrote: > The following is my opinion, as will become obvious, but it's based on > over a decade of observing these lists, and other open source > development lists. In a context where some core developers have > unsubscribed from these lists, and others regularly report muting > threads with a certain air of asperity, I think it's worth the risk of > seeming arrogant to explain some of the customs (which are complex and > subtle) around posting to Python developer lists. I'm posting > publicly because there are several new developers whose activity and > fresh perspective is very welcome, but harmony *is* being disturbed, > IMO unnecessarily. > Thank you for this thoughtful post. While none of the quotes you refer to are mine, I did try to find whether any of the advice is something I should learn from. While I didn't find a whole lot (please do correct me if you think otherwise), it is also valuable to hear these things from someone more experienced, even just to confirm what I may have thought or guessed. I can't really tell, but possibly some of the thoughts are interesting even to people significantly more experienced than me. I know you are not interested in discussing this further here, but I'll add some inexperienced points of view inline below, just in case someone is interested: > This particular post caught my eye, but it's only an example of one of > the most unharmonious posting styles that has become common recently. > Attribution deliberately removed. > > > Sorry for disturbing this thread's harmony. > > *sigh* There is way too much of this on Python-Ideas recently, and > there shouldn't be any on Python-Dev. So please don't. Specifically, > disagreement with an apparently developing consensus is fine but > please avoid this: > > > >> Path is an alternative to os.path -- you don't need to use both. > > > > I agree with that quote of Chris. 
> > It's a waste of time to post *what* you agree with.[1] Decisions are > not taken by vote in this community, except for the color of the > bikeshed, where it is agreed that *what* decision is taken doesn't > matter, but that some decision should be taken expeditiously.[2] > Chris already stated this position clearly and it's not a "color", so > there is no need to reiterate. It simply wastes others' time to read > it. (Whether it was a waste of the poster's time is not for me to > comment on.) > > What matters to the decision is *why* you agree (or disagree). If you > think that some of Chris's arguments are bogus (and should be > disregarded) and others are important, that is valuable information. > It's even better if you can shed additional light on the matter > (example below). > > Also, expression of agreement is often a prelude to a request for > information. "I agree with Z's post. At least, I have never needed > X. *When* do you need X? Let's look for a better way than X!" > That's what I thought too. I remember several times recently that I have mentioned I agreed about something, then continuing to add more to it, or even saying I disagree about something else. Part of the reason to also state that I agree is an attempt to keep the overall tone more positive. After all, the other person might be a highly experienced core developer who just did not happen to have gone though all the same thoughts regarding that specific question recently. I hope that has not been interpreted as arrogance such as "I know better than these people". For me, as one of the (many?) newcomers, especially on -dev, it can sometimes be difficult to tell whether not getting a reaction means "Good point, I agree", "I did not understand so I'll just ignore it", "I don't want to argue with you" or something else. Then again, someone just saying essentially the same thing without a reference a few posts later just feels strange. 
Also, if the only thing people apparently do is disagree about things, it makes the overall tone of the discussions at least *seem* very negative. From this point of view there seems to be some good in positive comments. > Unsupported (dis)agreement to statements about "needs" also may be > taken as *rude*, because others may infer your arrogant claim to know > what *they* do or don't need. Admittedly there's a difficult > distinction here between Chris's *idiom* where "you don't need to" > translates to "In my understanding, it is generally not necessary to", > and your *unsupported* agreement, which in my dialect of English > changes the emphasis to imply you know better than those who disagree > with you and Chris. And, of course, the position that others are "too > easily offended" is often reasonable, but you should be aware that > there will be an impact on your reputation and ability to influence > development of Python (even if it doesn't come near the point where > a moderator invokes "Code of Conduct"). > > "Me too" posts aren't entirely forbidden, but I feel that
Re: [Python-Dev] pathlib - current status of discussions
On Wed, 13 Apr 2016 at 12:39 Fred Drake wrote:
> On Wed, Apr 13, 2016 at 3:24 PM, Chris Angelico wrote:
> > Is that the intention, or should the exception catching be narrower? I
> > know it's clunky to write it in Python, but AIUI it's less so in C:
> >
> > try:
> >     callme = path.__fspath__
> > except AttributeError:
> >     pass
> > else:
> >     path = callme()
>
> +1 for this variant; I really don't like masking errors inside the
> __fspath__ implementation.

Don't read too much into the code in that gist. I just did them quickly to get the point across of the proposals in terms of str/bytes, not what will be proposed in any final patch.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] pathlib - current status of discussions
On Thu, Apr 14, 2016 at 5:46 AM, Random832 wrote:
> On Wed, Apr 13, 2016, at 15:24, Chris Angelico wrote:
>> Is that the intention, or should the exception catching be narrower? I
>> know it's clunky to write it in Python, but AIUI it's less so in C:
>
> How is it less so in C? You lose the ability to PyObject_CallMethod.

I might be wrong, then. Wasn't sure how it was all implemented. Anyway, it's a correctness thing, not a simplicity one, so even if it is clunkier, it ought to be the case. And that is the intention, so we're fine.

ChrisA
Re: [Python-Dev] pathlib - current status of discussions
On Wed, Apr 13, 2016, at 15:24, Chris Angelico wrote:
> Is that the intention, or should the exception catching be narrower? I
> know it's clunky to write it in Python, but AIUI it's less so in C:

How is it less so in C? You lose the ability to PyObject_CallMethod.
Re: [Python-Dev] pathlib - current status of discussions
On 4/13/2016 13:49, Ethan Furman wrote:
> Number 3: it allows bytes, but only when told it's okay to do so.
> Having code get a bytes object when one is not expected is not a
> headache we need to inflict on anyone.

This is an artifact of the other needless restrictions I said I wouldn't rant about. I think it is in the best interest not to perpetuate those needless restrictions.
Re: [Python-Dev] pathlib - current status of discussions
On Thu, Apr 14, 2016 at 5:30 AM, Brett Cannon wrote:
> On Wed, 13 Apr 2016 at 12:25 Chris Angelico wrote:
>> On Thu, Apr 14, 2016 at 3:10 AM, Brett Cannon wrote:
>> > https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has
>> > the four potential approaches implemented (although it doesn't follow the
>> > "separate functions" approach some are proposing and instead goes with
>> > the allow_bytes approach I originally proposed).
>>
>> All of them have this construct:
>>
>> try:
>>     path = path.__fspath__()
>> except AttributeError:
>>     pass
>>
>> Is that the intention, or should the exception catching be narrower? I
>> know it's clunky to write it in Python, but AIUI it's less so in C:
>>
>> try:
>>     callme = path.__fspath__
>> except AttributeError:
>>     pass
>> else:
>>     path = callme()
>
> I'm assuming the C code will do what you're suggesting. My way is just
> faster to write in 2 minutes of coding. :)

Cool cool. Just checking!

You're already aware that my preference is for the first one, str-only. I don't think the second one has much value (a path-like object can only ever return a str, but a bytes can be passed through unchanged?), and the fourth strikes me as a bad idea (just allowing bytes any time). So my votes are +1, -0.5, +0, -1.

ChrisA
Re: [Python-Dev] pathlib - current status of discussions
On Wed, Apr 13, 2016 at 3:24 PM, Chris Angelico wrote:
> Is that the intention, or should the exception catching be narrower? I
> know it's clunky to write it in Python, but AIUI it's less so in C:
>
> try:
>     callme = path.__fspath__
> except AttributeError:
>     pass
> else:
>     path = callme()

+1 for this variant; I really don't like masking errors inside the __fspath__ implementation.

-Fred

--
Fred L. Drake, Jr.
"A storm broke loose in my mind." --Albert Einstein
Re: [Python-Dev] pathlib - current status of discussions
On Wed, 13 Apr 2016 at 12:25 Chris Angelico wrote:
> On Thu, Apr 14, 2016 at 3:10 AM, Brett Cannon wrote:
> > https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has the
> > four potential approaches implemented (although it doesn't follow the
> > "separate functions" approach some are proposing and instead goes with the
> > allow_bytes approach I originally proposed).
>
> All of them have this construct:
>
> try:
>     path = path.__fspath__()
> except AttributeError:
>     pass
>
> Is that the intention, or should the exception catching be narrower? I
> know it's clunky to write it in Python, but AIUI it's less so in C:
>
> try:
>     callme = path.__fspath__
> except AttributeError:
>     pass
> else:
>     path = callme()

I'm assuming the C code will do what you're suggesting. My way is just faster to write in 2 minutes of coding. :)
Re: [Python-Dev] pathlib - current status of discussions
On Thu, Apr 14, 2016 at 3:10 AM, Brett Cannon wrote:
> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has the
> four potential approaches implemented (although it doesn't follow the
> "separate functions" approach some are proposing and instead goes with the
> allow_bytes approach I originally proposed).

All of them have this construct:

    try:
        path = path.__fspath__()
    except AttributeError:
        pass

Is that the intention, or should the exception catching be narrower? I know it's clunky to write it in Python, but AIUI it's less so in C:

    try:
        callme = path.__fspath__
    except AttributeError:
        pass
    else:
        path = callme()

ChrisA
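The practical difference between the two constructs in the message above can be seen in a minimal sketch. The class `BrokenPath` and the helper names `broad_lookup` and `narrow_lookup` are hypothetical, invented here for illustration only; they are not taken from the gist or from any CPython patch.

```python
class BrokenPath:
    """Hypothetical path-like object whose __fspath__ contains a bug."""

    def __fspath__(self):
        # Bug: this attribute was never set, so AttributeError is raised
        # from *inside* the method body.
        return self.missing_attribute


def broad_lookup(path):
    # Broad form: the attribute lookup and the call are both inside the
    # try block, so an AttributeError raised by the method body is
    # silently swallowed and the original object is returned unchanged.
    try:
        path = path.__fspath__()
    except AttributeError:
        pass
    return path


def narrow_lookup(path):
    # Narrow form: only the attribute lookup is guarded. Errors raised
    # while the method runs propagate to the caller.
    try:
        callme = path.__fspath__
    except AttributeError:
        pass
    else:
        path = callme()
    return path
```

With the broad form, `broad_lookup(BrokenPath())` hands back the `BrokenPath` instance as if it had no protocol method at all, whereas `narrow_lookup(BrokenPath())` raises the `AttributeError`, which is the "masking errors" concern raised in the thread. Objects without `__fspath__`, such as a plain `str`, pass through both helpers unchanged.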
Re: [Python-Dev] pathlib - current status of discussions
Brett Cannon writes:
> In the spirit of Python 3 I feel like bytes might not be needed in practice,
> but something like this with defaults of False will allow people to easily
> test all the various options.
>
> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has the
> four potential approaches implemented (although it doesn't follow the
> "separate functions" approach some are proposing and instead goes with the
> allow_bytes approach I originally proposed).

Either number 1 or number 3 for me (I don't think bytes path-like objects are useful in Python).

Regards

Antoine.
Re: [Python-Dev] pathlib - current status of discussions
On 04/13/2016 10:22 AM, Alexander Walters wrote:
> On 4/13/2016 13:10, Brett Cannon wrote:
>> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has the
>> four potential approaches implemented (although it doesn't follow the
>> "separate functions" approach some are proposing and instead goes with the
>> allow_bytes approach I originally proposed).
>
> Number 4 is my personal favorite - it has a simple control flow path and
> is the least needlessly restrictive.

Number 3: it allows bytes, but only when told it's okay to do so. Having code get a bytes object when one is not expected is not a headache we need to inflict on anyone.

--
~Ethan~
Re: [Python-Dev] pathlib - current status of discussions
On 4/13/2016 13:10, Brett Cannon wrote:
> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has the
> four potential approaches implemented (although it doesn't follow the
> "separate functions" approach some are proposing and instead goes with the
> allow_bytes approach I originally proposed).

Number 4 is my personal favorite - it has a simple control flow path and is the least needlessly restrictive.

(I could rant about needless restrictions, but I am about a decade late for that, so I won't bother.)
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On Wed, 13 Apr 2016 at 09:52 Random832 wrote:
> On Wed, Apr 13, 2016, at 11:28, Ethan Furman wrote:
> > On 04/13/2016 08:17 AM, Random832 wrote:
> > > On Wed, Apr 13, 2016, at 10:21, Nick Coghlan wrote:
> > >> I'd expect the main consumers to be os and os.path, and would honestly
> > >> be surprised if we needed many explicit invocations above that layer,
> > >> other than in pathlib itself.
> > >
> > > I made a toy implementation to try this out, and making os.open support
> > > it does not get you builtin open "for free" as I had suspected; builtin
> > > open has its own type checks in _iomodule.c.
> >
> > Yup, it will take some effort to make this work.
>
> A corner case just occurred to me...
>
> For functions that will continue to accept str/bytes (and functions that
> accept some other type such as Number or file-like objects), what should
> be done with an object that is one of these, *and* has an __fspath__
> method, *and* this method returns a value other than the object's own
> value? Basically, should the protocol check be done unconditionally
> (before attempting to use the argument as a string) or only if the
> argument is not a string (there's an efficiency argument for this). Or
> should it be left "unspecified", with the understanding that such
> objects are badly behaved and may not be handled consistently across
> different functions / python implementations / cpython versions?
>
> Also, should the os.fspath (or whatever we call it) function itself
> accept str/bytes, even if these are not going to implement the protocol?

All of this is demonstrated in https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 by the various possibilities. In the end it's not a corner case because the definition of __fspath__ will be such that there's no ambiguity in what os.fspath() will accept and what __fspath__ can return and the code will be written to conform to what the PEP dictates (IOW I'm aware that this needs to be considered in the implementation :) .
Re: [Python-Dev] pathlib - current status of discussions
On Tue, 12 Apr 2016 at 22:38 Michael Mysinger via Python-Dev <python-dev@python.org> wrote:
> Ethan Furman writes:
>
> > Do we allow bytes to be returned from os.fspath()? If yes, then do we
> > allow bytes from __fspath__()?
>
> De-lurking. Especially since the ultimate goal is better interoperability, I
> feel like an implementation that people can play with would help guide the
> few remaining decisions. To help test the various options you could
> temporarily add a _allow_bytes=GLOBAL_CONFIG_OPTION default argument to both
> pathlib.__fspath__() and os.fspath(), with distinct configurable defaults for
> each.
>
> In the spirit of Python 3 I feel like bytes might not be needed in practice,
> but something like this with defaults of False will allow people to easily
> test all the various options.

https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has the four potential approaches implemented (although it doesn't follow the "separate functions" approach some are proposing and instead goes with the allow_bytes approach I originally proposed).
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 04/13/2016 09:58 AM, Brett Cannon wrote:
> On Wed, 13 Apr 2016 at 09:19 Fred Drake wrote:
>> I do the same, but... this is one of those cases where a caller will
>> usually be passing a constant directly. If passed as a positional
>> argument, it'll just be confusing ("what's True?" is my usual
>> reaction to a Boolean positional argument).
>
> It would be keyword-only so this isn't even a possibility.
>
>> If passed as a keyword argument
>> with a descriptive name, it'll be longer than I'd like to see:
>>
>>     path_str = os.fspath(path, allow_bytes=True)
>
> I think the expectation that the number of people actually directly
> calling this function with that argument specified is going to be
> rather small, so the common-case will simply be:
>
>     path_str = os.fspath(path)

That is certainly my expectation. :)

>> Names like os.fspath() and os.fssyspath() seem good to me.

A single function is definitely my preference, but if that's not possible then I'm fine with that pair of names.

--
~Ethan~
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On Apr 13 2016, Ethan Furman wrote:
> (I'm not very good at keeping similar sounding functions separate --
> what's the difference between shutil.copy and shutil.copy2? I have to
> look it up every time).

Well, "2" is more than "" (or 1), so copy2() copies *more* than copy() - it includes the metadata. That always helps me.

Best,
-Nikolaus

--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«
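For readers who, like Ethan, have to look the difference up every time, here is a small sketch of the distinction Nikolaus describes. It only uses the standard-library `shutil.copy` and `shutil.copy2`; the file names and the timestamp value are arbitrary choices made for this example.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.txt")
    with open(src, "w") as f:
        f.write("hello")

    # Give the source an old modification time so the difference is visible.
    old = 1_000_000_000  # an arbitrary timestamp in 2001
    os.utime(src, (old, old))

    # copy() copies the data and permission bits only.
    plain = shutil.copy(src, os.path.join(tmp, "copy.txt"))
    # copy2() additionally tries to preserve metadata such as timestamps.
    full = shutil.copy2(src, os.path.join(tmp, "copy2.txt"))

    plain_mtime = os.stat(plain).st_mtime
    full_mtime = os.stat(full).st_mtime
```

After this runs, `full_mtime` should match the old timestamp of the source, while `plain_mtime` reflects the time the copy was made.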
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On Wed, 13 Apr 2016 at 09:19 Fred Drake wrote:
> On Wed, Apr 13, 2016 at 11:09 AM, Ethan Furman wrote:
> > - a single os.fspath() with an allow_bytes parameter
> >   (mostly True in os and os.path, mostly False everywhere
> >   else)
>
> -0
>
> > - a str-only os.fspathname() and a str/bytes os.fspath()
>
> +1 on using separate functions.
>
> > I'm partial to the first choice as it is simplicity itself to know when
> > looking at it if bytes might be coming back by the presence or absence of a
> > second argument to the call; otherwise one has to keep straight in one's
> > head which is str-only and which might allow bytes (I'm not very good at
> > keeping similar sounding functions separate -- what's the difference between
> > shutil.copy and shutil.copy2? I have to look it up every time).
>
> I do the same, but... this is one of those cases where a caller will
> usually be passing a constant directly. If passed as a positional
> argument, it'll just be confusing ("what's True?" is my usual reaction
> to a Boolean positional argument).

It would be keyword-only so this isn't even a possibility.

> If passed as a keyword argument
> with a descriptive name, it'll be longer than I'd like to see:
>
>     path_str = os.fspath(path, allow_bytes=True)

I think the expectation that the number of people actually directly calling this function with that argument specified is going to be rather small, so the common-case will simply be:

    path_str = os.fspath(path)

> Names like os.fspath() and os.fssyspath() seem good to me.

-Brett
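To make the two API shapes under discussion concrete, here is a rough sketch of both: one function with a keyword-only `allow_bytes` flag, and a pair of separate functions. Everything here is an assumption made for illustration: the helper `_resolve`, the names `fspath_str_only` and `fspath_str_or_bytes`, and the exact type checks are not taken from the gist or from any proposed patch.

```python
def _resolve(path, allow_bytes):
    # Hypothetical shared helper: accept str (and bytes if permitted)
    # directly, otherwise consult __fspath__ on the object's type.
    allowed = (str, bytes) if allow_bytes else (str,)
    if isinstance(path, allowed):
        return path
    try:
        method = type(path).__fspath__
    except AttributeError:
        raise TypeError(
            f"expected a path-like object, got {type(path).__name__}"
        ) from None
    result = method(path)
    if not isinstance(result, allowed):
        raise TypeError(
            f"__fspath__ returned {type(result).__name__}, which is not allowed here"
        )
    return result


def fspath(path, *, allow_bytes=False):
    # Shape 1: a single function. The bare * makes the flag keyword-only,
    # so a call such as fspath(p, True) is rejected.
    return _resolve(path, allow_bytes)


def fspath_str_only(path):
    # Shape 2a: str-only variant (hypothetical name).
    return _resolve(path, False)


def fspath_str_or_bytes(path):
    # Shape 2b: str-or-bytes variant (hypothetical name).
    return _resolve(path, True)
```

Under these assumptions the common call stays `fspath(path)`, and a reader can tell from the presence of `allow_bytes=True` (shape 1) or from the function name (shape 2) whether bytes may come back.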
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On Wed, Apr 13, 2016, at 11:28, Ethan Furman wrote:
> On 04/13/2016 08:17 AM, Random832 wrote:
> > On Wed, Apr 13, 2016, at 10:21, Nick Coghlan wrote:
> >> I'd expect the main consumers to be os and os.path, and would honestly
> >> be surprised if we needed many explicit invocations above that layer,
> >> other than in pathlib itself.
> >
> > I made a toy implementation to try this out, and making os.open support
> > it does not get you builtin open "for free" as I had suspected; builtin
> > open has its own type checks in _iomodule.c.
>
> Yup, it will take some effort to make this work.

A corner case just occurred to me...

For functions that will continue to accept str/bytes (and functions that accept some other type such as Number or file-like objects), what should be done with an object that is one of these, *and* has an __fspath__ method, *and* this method returns a value other than the object's own value? Basically, should the protocol check be done unconditionally (before attempting to use the argument as a string) or only if the argument is not a string (there's an efficiency argument for this)? Or should it be left "unspecified", with the understanding that such objects are badly behaved and may not be handled consistently across different functions / python implementations / cpython versions?

Also, should the os.fspath (or whatever we call it) function itself accept str/bytes, even if these are not going to implement the protocol?
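The corner case described above can be illustrated with a short sketch. The class `OddStr` and the functions `protocol_first` and `string_first` are hypothetical names made up for this example; they show the two possible orderings of the checks, not what any real implementation does.

```python
class OddStr(str):
    """Hypothetical str subclass whose __fspath__ disagrees with its own value."""

    def __fspath__(self):
        return "/from/protocol"


def protocol_first(path):
    # Ordering A: always consult the protocol, even for strings.
    method = getattr(type(path), "__fspath__", None)
    if method is not None:
        return method(path)
    if isinstance(path, (str, bytes)):
        return path
    raise TypeError("not a path-like object")


def string_first(path):
    # Ordering B: strings and bytes short-circuit; the protocol is only
    # consulted for other objects (the cheaper check comes first).
    if isinstance(path, (str, bytes)):
        return path
    method = getattr(type(path), "__fspath__", None)
    if method is not None:
        return method(path)
    raise TypeError("not a path-like object")
```

For an `OddStr("/own/value")` argument, ordering A yields `"/from/protocol"` while ordering B yields the object's own value, which is exactly the inconsistency the message asks about. Ordinary strings give the same result under both orderings.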
Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!
On Wed, 13 Apr 2016 at 06:14 Stefan Krah wrote:
> Victor Stinner writes:
> > Maybe it's time to move more 3.x buildbots to the "stable" category?
> > http://buildbot.python.org/all/waterfall?category=3.x.stable
>
> +1 I think anything that is actually stable should be in that category.
>
> > By the way, I don't understand why "AMD64 OpenIndiana 3.x" is
> > considered as stable since it's failing with multiple issues since
> > many months and nobody is working on these failures. I suggest to move
> > this buildbot back to the unstable category.
>
> +1 The bot was very stable and fast for some time but has been unstable
> for at least a year.
>
> > - PPC64 AIX 3.x: failing tests: test_httplib, test_httpservers,
> > test_socket, test_distutils, test_asyncio, (...); random timeout
> > failure in test_eintr, etc. I don't have access to AIX and I'm not
> > interested to acquire an AIX license, nor to install it. I'm not sure
> > that it's useful to have an AIX buildbot and no core developer have
> > access to AIX, and nobody is working on AIX failures. Maybe HP wants
> > to help us to support AIX? (Provide manpower, access to AIX servers,
> > or something like that.)
>
> Well, I think in this case it's the gcc AIX maintainer running it, so...
>
> I think we should have a policy to stop reporting issues on unstable
> bots unless someone has a concrete fix OR the bot maintainers are
> known to fix issues fast (but that does not seem to be the case).

Official policy per https://www.python.org/dev/peps/pep-0011/#supporting-platforms states that there must be a core developer to maintain the compatibility, so if there's no one helping to keep a particular buildbot green then I agree it should be marked as unstable and thus not supported.
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 13 April 2016 at 17:31, Ethan Furman wrote:
> On 04/13/2016 09:27 AM, Paul Moore wrote:
>> On 13 April 2016 at 17:18, Fred Drake wrote:
>>> Names like os.fspath() and os.fssyspath() seem good to me.
>>
>> -1 on fssyspath - the "system" representation is bytes on POSIX, but
>> not on Windows. Let's be explicit and go with fsbytespath().
>
> It will be confusing that fsbytespath() can return a string.

Oh, wait, yes fssyspath is for allow_bytes=True which *may* be bytes, but could still be a string. My mistake.

On that basis, I could go with fssyspath (thinking "sys" = "low level").

Paul
Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!
On Wed, 13 Apr 2016 at 05:57 Tim Golden wrote:
> On 13/04/2016 12:40, Victor Stinner wrote:
> > Last months, most 3.x buildbots failed randomly. Some of them were
> > always failing. I spent some time to fix almost all Windows and Linux
> > buildbots. There were a lot of different issues.
>
> Can I state the obvious and offer a huge vote of thanks for this work,
> which is often tedious and unrewarding?

Yep, big thanks from me as well!
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 04/13/2016 09:27 AM, Paul Moore wrote:
> On 13 April 2016 at 17:18, Fred Drake wrote:
>> Names like os.fspath() and os.fssyspath() seem good to me.
>
> -1 on fssyspath - the "system" representation is bytes on POSIX, but
> not on Windows. Let's be explicit and go with fsbytespath().

It will be confusing that fsbytespath() can return a string.

--
~Ethan~
Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units
Nice work. I think that for CPython, speed is much more important than memory use for the code. Disk space is practically free for anything smaller than a video. :-)

On Wed, Apr 13, 2016 at 9:24 AM, Victor Stinner wrote:
> Hi,
>
> In the middle of recent discussions about Python performance, changing
> the Python bytecode was discussed. Serhiy proposed to reuse
> MicroPython short bytecode to reduce the disk space and reduce the
> memory footprint.
>
> Demur Rumed proposes a different change to use a regular bytecode
> using 16-bit units: an instruction always has one 8-bit argument, which
> is zero if the instruction doesn't have an argument:
>
>     http://bugs.python.org/issue26647
>
> According to benchmarks, it looks faster:
>
>     http://bugs.python.org/issue26647#msg263339
>
> IMHO it's a nice enhancement: it makes the code simpler. The most
> interesting change is made in Python/ceval.c:
>
>     -if (HAS_ARG(opcode))
>     -    oparg = NEXTARG();
>     +oparg = NEXTARG();
>
> This code is the very hot loop evaluating Python bytecode. I expect
> that removing a conditional branch here can reduce CPU branch
> mispredictions.
>
> I reviewed the first versions of the change, and IMHO it's almost ready
> to be merged. But I would prefer to have a review from at least a second
> core reviewer.
>
> Can someone please review the change?
>
> --
>
> The side effect of wordcode is that arguments in 0..255 now use 2
> bytes per instruction instead of 3, so it also reduces the size of
> bytecode for the most common case.
>
> A larger, 16-bit argument (0..65,535) now uses 4 bytes instead
> of 3. Arguments are supported up to 32-bit: 24-bit uses 3 units (6
> bytes), 32-bit uses 4 units (8 bytes). MAKE_FUNCTION uses a 16-bit
> argument for keyword defaults and a 24-bit argument for annotations.
> Other common instructions known to use large arguments are jumps for
> bytecode longer than 256 bytes.
>
> --
>
> Right now, ceval.c still fetches opcode and then oparg with two 8-bit
> instructions. Later, we can discuss if it would be possible to ensure
> that the bytecode is always aligned to 16-bit in memory to fetch the
> two bytes using a uint16_t* pointer.
>
> Maybe we can overallocate 1 byte in codeobject.c and align manually
> the memory block if needed. Or ceval.c should maybe copy the code if
> it's not aligned?
>
> Raymond Hettinger proposes something like that, but it looks like
> there are concerns about non-aligned memory accesses:
>
>     http://bugs.python.org/issue25823
>
> The cost of non-aligned memory accesses depends on the CPU
> architecture, but it can raise a SIGBUS on some arch (MIPS and
> SPARC?).
>
> Victor

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 04/13/2016 09:18 AM, Fred Drake wrote:
> On Wed, Apr 13, 2016 at 11:09 AM, Ethan Furman wrote:
>> - a single os.fspath() with an allow_bytes parameter
>>   (mostly True in os and os.path, mostly False everywhere
>>   else)
>
> -0
>
>> - a str-only os.fspathname() and a str/bytes os.fspath()
>
> +1 on using separate functions.
>
> Names like os.fspath() and os.fssyspath() seem good to me.

Ooh, I like that! I could probably keep those names separate in my head. :)

--
~Ethan~
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On Wed, Apr 13, 2016 at 12:27 PM, Paul Moore wrote:
> -1 on fssyspath - the "system" representation is bytes on POSIX, but
> not on Windows. Let's be explicit and go with fsbytespath().

Depends on the semantics; if we're expecting it to return str-or-bytes, os.fssyspath() seems fine. If only returning bytes (not sure that ever makes sense on Windows, since I don't use Windows), then I'd be happy with os.fsbytespath().

-Fred

--
Fred L. Drake, Jr.
"A storm broke loose in my mind." --Albert Einstein
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 13 April 2016 at 17:18, Fred Drake wrote:
> Names like os.fspath() and os.fssyspath() seem good to me.

-1 on fssyspath - the "system" representation is bytes on POSIX, but not on Windows. Let's be explicit and go with fsbytespath().

But agreed that always-constant boolean parameters are a bad idea. The hard bit is good naming of the separate functions (100% agree that shutil is a good example of how not to do it :-))

Paul
Re: [Python-Dev] Not receiving bug tracker emails
Glad it's working again! And it was a combination of R. David Murray, Ezio Melotti, Mark Mangoba (http://pyfound.blogspot.com/2016/04/the-psf-has-hired-it-manager.html in case you don't know who Mark is), and myself along with Upfront (b.p.o hosting provider).

On Tue, 12 Apr 2016 at 21:40 Terry Reedy wrote:
> On 4/4/2016 5:05 PM, Terry Reedy wrote:
>
> Since a few days, I am getting bug tracker emails again, in my Inbox. I
> just got a Rietveld review in the Inbox and I believe it went there
> directly instead of first to Junk. Thank you to whoever made the
> improvements.
>
> --
> Terry Jan Reedy
[Python-Dev] Wordcode: new regular bytecode using 16-bit units
Hi,

In the middle of recent discussions about Python performance, changing the Python bytecode came up. Serhiy proposed reusing MicroPython's short bytecode to reduce disk space and memory footprint.

Demur Rumed proposes a different change: a regular bytecode using 16-bit units, where an instruction always has one 8-bit argument, which is zero if the instruction doesn't take an argument:
http://bugs.python.org/issue26647

According to benchmarks, it looks faster:
http://bugs.python.org/issue26647#msg263339

IMHO it's a nice enhancement: it makes the code simpler. The most interesting change is made in Python/ceval.c:

    -if (HAS_ARG(opcode))
    -oparg = NEXTARG();
    +oparg = NEXTARG();

This code is in the very hot loop evaluating Python bytecode. I expect that removing a conditional branch here can reduce CPU branch mispredictions.

I reviewed the first versions of the change, and IMHO it's almost ready to be merged. But I would prefer to have a review from at least a second core reviewer. Can someone please review the change?

--

The side effect of wordcode is that an argument in 0..255 now uses 2 bytes per instruction instead of 3, so it also reduces the size of the bytecode in the most common case.

A larger, 16-bit argument (0..65,535) now uses 4 bytes instead of 3. Arguments are supported up to 32 bits: a 24-bit argument uses 3 units (6 bytes), a 32-bit argument uses 4 units (8 bytes).

MAKE_FUNCTION uses a 16-bit argument for keyword defaults and a 24-bit argument for annotations. Other common instructions known to use large arguments are jumps in bytecode longer than 256 bytes.

--

Right now, ceval.c still fetches the opcode and then the oparg with two 8-bit reads. Later, we can discuss whether it would be possible to ensure that the bytecode is always aligned to 16 bits in memory, so that the two bytes can be fetched using a uint16_t* pointer.

Maybe we can overallocate 1 byte in codeobject.c and manually align the memory block if needed. Or maybe ceval.c should copy the code if it's not aligned?

Raymond Hettinger proposes something like that, but it looks like there are concerns about non-aligned memory accesses:
http://bugs.python.org/issue25823

The cost of non-aligned memory accesses depends on the CPU architecture, but they can raise a SIGBUS on some architectures (MIPS and SPARC?).

Victor
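A minimal sketch of the unit layout described in the wordcode message, written in Python rather than the C of the actual patch. It assumes that arguments wider than 8 bits are carried by prefix units, one extra 16-bit unit per additional argument byte; the EXTENDED_ARG value and both function names are made up for illustration and are not taken from the patch.

```python
EXTENDED_ARG = 144  # assumed opcode number for the prefix unit, illustration only


def encode_wordcode(opcode, arg=0):
    """Encode one instruction as 16-bit units: one opcode byte plus one arg byte."""
    if not 0 <= arg <= 0xFFFFFFFF:
        raise ValueError("argument must fit in 32 bits")
    # Split the argument into bytes, most significant first, dropping leading zeros.
    arg_bytes = arg.to_bytes(4, "big").lstrip(b"\x00") or b"\x00"
    out = bytearray()
    # Every byte except the last one is carried by a prefix unit.
    for high in arg_bytes[:-1]:
        out += bytes([EXTENDED_ARG, high])
    # The real instruction always comes last and always has an argument byte.
    out += bytes([opcode, arg_bytes[-1]])
    return bytes(out)


def decode_wordcode(code):
    """Yield (opcode, arg) pairs; every unit is read the same way, with no HAS_ARG test."""
    extended = 0
    for i in range(0, len(code), 2):
        opcode, arg = code[i], code[i + 1] | extended
        if opcode == EXTENDED_ARG:
            # Shift the accumulated bits up to make room for the next byte.
            extended = arg << 8
        else:
            extended = 0
            yield opcode, arg
```

Under these assumptions an argument in 0..255 takes 2 bytes, a 16-bit argument 4 bytes, a 24-bit argument 6 bytes and a 32-bit argument 8 bytes, which matches the sizes quoted in the message.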
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On Wed, Apr 13, 2016 at 11:09 AM, Ethan Furman wrote:
> - a single os.fspath() with an allow_bytes parameter
> (mostly True in os and os.path, mostly False everywhere
> else)

-0

> - a str-only os.fspathname() and a str/bytes os.fspath()

+1 on using separate functions.

> I'm partial to the first choice as it is simplicity itself to know when
> looking at it if bytes might be coming back by the presence or absence of a
> second argument to the call; otherwise one has to keep straight in one's
> head which is str-only and which might allow bytes (I'm not very good at
> keeping similar sounding functions separate -- what's the difference between
> shutil.copy and shutil.copy2? I have to look it up every time).

I do the same, but... this is one of those cases where a caller will usually be passing a constant directly. If passed as a positional argument, it'll just be confusing ("what's True?" is my usual reaction to a Boolean positional argument). If passed as a keyword argument with a descriptive name, it'll be longer than I'd like to see:

    path_str = os.fspath(path, allow_bytes=True)

Names like os.fspath() and os.fssyspath() seem good to me.

-Fred

--
Fred L. Drake, Jr.
"A storm broke loose in my mind." --Albert Einstein
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 04/13/2016 08:17 AM, Random832 wrote:
> On Wed, Apr 13, 2016, at 10:21, Nick Coghlan wrote:
>> I'd expect the main consumers to be os and os.path, and would honestly
>> be surprised if we needed many explicit invocations above that layer,
>> other than in pathlib itself.
>
> I made a toy implementation to try this out, and making os.open support
> it does not get you builtin open "for free" as I had suspected; builtin
> open has its own type checks in _iomodule.c.

Yup, it will take some effort to make this work.

> Probably anything not implemented in pure python that deals with
> filenames is going to have to have its type checking revised.

Agreed. You can see why there was no point in pursuing the conversation unless someone was willing to do the work.

--
~Ethan~
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On Wed, Apr 13, 2016, at 10:21, Nick Coghlan wrote:
> I'd expect the main consumers to be os and os.path, and would honestly
> be surprised if we needed many explicit invocations above that layer,
> other than in pathlib itself.

I made a toy implementation to try this out, and making os.open support it does not get you builtin open "for free" as I had suspected; builtin open has its own type checks in _iomodule.c. Probably anything not implemented in pure Python that deals with filenames is going to have to have its type checking revised.
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 04/13/2016 07:21 AM, Nick Coghlan wrote:
> On 14 April 2016 at 00:11, Paul Moore wrote:
>> On 13 April 2016 at 14:51, Nick Coghlan wrote:
>>> The potential SE-strings only come back when you pass str, and the
>>> operating system data isn't properly encoded according to the nominal
>>> filesystem encoding. They round trip nicely to other operating system
>>> APIs, but can indeed be a problem if they escape to other parts of
>>> your program
>>
>> If the operating system APIs handle SE-strings correctly, is it not
>> acceptable to require the fspath protocol to return strings, and then
>> places like DirEntry or Ethan's module, when they want to return
>> bytes, can just SE-encode the bytes and return those?
>>
>> Or will the fspath protocol be used at a low enough level that it's
>> *below* the point where SE-encoded strings are handled properly?
>
> I'd expect the main consumers to be os and os.path, and would honestly
> be surprised if we needed many explicit invocations above that layer,
> other than in pathlib itself.
>
> That's actually the main factor in my suggesting the two level API
> design - from a protocol consumer perspective, bytes-or-str is a
> natural fit for os and os.path, while str-only is a natural fit for
> pathlib.
>
> I also now believe it makes sense to postpone a final decision on this
> aspect of the design until after a draft implementation has been put
> together, as my and Ethan's assumption that os and os.path will be the
> main consumers is exactly that: an assumption. Putting the draft
> implementation together will let us know whether or not it's an
> accurate one.

Sounds reasonable.

However, there is still one choice that needs to be made:

- a single os.fspath() with an allow_bytes parameter (mostly True in os and os.path, mostly False everywhere else)
- a str-only os.fspathname() and a str/bytes os.fspath()

I'm partial to the first choice as it is simplicity itself to know when looking at it if bytes might be coming back by the presence or absence of a second argument to the call; otherwise one has to keep straight in one's head which is str-only and which might allow bytes (I'm not very good at keeping similar sounding functions separate -- what's the difference between shutil.copy and shutil.copy2? I have to look it up every time).

--
~Ethan~
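The two API shapes under discussion can be put side by side in a rough pure-Python sketch. Everything here is hypothetical: the names fspath_with_flag, fspath_any and BytesEntry are invented for illustration, the thread says the real implementation would be in C, and the error messages are placeholders. Only the __fspath__ method name and the str-only versus str-or-bytes distinction come from the thread.

```python
def _raw_fspath(path):
    """Return str or bytes for a str, bytes, or object with __fspath__ (hypothetical helper)."""
    if isinstance(path, (str, bytes)):
        return path
    # Look the method up on the type rather than on the instance.
    try:
        method = type(path).__fspath__
    except AttributeError:
        raise TypeError(
            f"expected str, bytes or path object, not {type(path).__name__}"
        ) from None
    result = method(path)
    if not isinstance(result, (str, bytes)):
        raise TypeError(f"__fspath__ returned {type(result).__name__}")
    return result


# Option 1: a single function with an allow_bytes parameter.
def fspath_with_flag(path, allow_bytes=False):
    result = _raw_fspath(path)
    if isinstance(result, bytes) and not allow_bytes:
        raise TypeError("bytes paths are not accepted here")
    return result


# Option 2: two separately named functions.
def fspath_any(path):
    """str-or-bytes spelling."""
    return _raw_fspath(path)


def fspathname(path):
    """str-only spelling."""
    return fspath_with_flag(path, allow_bytes=False)


class BytesEntry:
    """Toy stand-in for an object whose protocol method returns bytes."""

    def __fspath__(self):
        return b"data.bin"
```

With option 1 the call site shows whether bytes may come back (fspath_with_flag(entry, allow_bytes=True)); with option 2 the reader has to remember which of the two names is the str-only one, which is the trade-off both messages weigh.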
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 14 April 2016 at 00:11, Paul Moore wrote:
> On 13 April 2016 at 14:51, Nick Coghlan wrote:
>> The potentially SE-strings only come back when you pass str, and the
>> operating system data isn't properly encoded according to the nominal
>> filesystem encoding. They round trip nicely to other operating system
>> APIs, but can indeed be a problem if they escape to other parts of
>> your program
>
> If the operating system APIs handle SE-strings correctly, is it not
> acceptable to require the fspath protocol to return strings, and then
> places like DirEntry or Ethan's module, when they want to return
> bytes, can just SE-encode the bytes and return those?
>
> Or will the fspath protocol be used at a low enough level that it's
> *below* the point where SE-encoded strings are handled properly?

I'd expect the main consumers to be os and os.path, and would honestly be surprised if we needed many explicit invocations above that layer, other than in pathlib itself.

That's actually the main factor in my suggesting the two level API design - from a protocol consumer perspective, bytes-or-str is a natural fit for os and os.path, while str-only is a natural fit for pathlib.

I also now believe it makes sense to postpone a final decision on this aspect of the design until after a draft implementation has been put together, as my and Ethan's assumption that os and os.path will be the main consumers is exactly that: an assumption. Putting the draft implementation together will let us know whether or not it's an accurate one.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 13 April 2016 at 14:51, Nick Coghlan wrote:
> The potentially SE-strings only come back when you pass str, and the
> operating system data isn't properly encoded according to the nominal
> filesystem encoding. They round trip nicely to other operating system
> APIs, but can indeed be a problem if they escape to other parts of
> your program

If the operating system APIs handle SE-strings correctly, is it not acceptable to require the fspath protocol to return strings, and then places like DirEntry or Ethan's module, when they want to return bytes, can just SE-encode the bytes and return those?

Or will the fspath protocol be used at a low enough level that it's *below* the point where SE-encoded strings are handled properly?

Paul
Re: [Python-Dev] pathlib - current status of discussions
On 13 April 2016 at 02:19, Chris Barker wrote:
> So: why use strings as the lingua franca of paths? i.e. the basis of the
> path protocol. maybe we should support only two path representations:
>
> 1) A "proper" path object -- i.e. pathlib.Path or anything else that
> supports the path protocol.
>
> 2) the bytes that the OS actually needs.
>
> this would mean that the protocol would be to have a __pathbytes__() method
> that would return the bytes that should be passed off to the OS.

The reason to favour strings over raw bytes for path manipulation is the same reason to favour them anywhere else: to avoid having to worry about encodings *while* you're manipulating things, and instead only worry about the encoding when actually talking to the OS (which may be UTF-16-LE to talk to a Windows API, or UTF-8 to talk to a *nix API, or something else entirely if your OS is set up that way, or you're writing the path to a file or network packet, rather than using it locally).

Regardless of what we decide about os.fspath's return type, that general principle won't change - if you're manipulating bytes paths directly, you're doing something relatively specialised (like working on CPython's own os module).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
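A small sketch of the "manipulate as str, encode only at the boundary" principle from the reply above. The path and the variable names are invented for illustration, and the two encodings are just the two examples named in the message, not a statement about what any particular OS call requires.

```python
# A str path can be taken apart and rebuilt without choosing an encoding.
# "\u00e9" is a precomposed e-acute, written as an escape to keep the source ASCII.
path = "docs/r\u00e9sum\u00e9.txt"
directory, _, name = path.rpartition("/")
renamed = f"{directory}/{name.upper()}"

# The encoding only matters at the boundary, and the boundary decides which one.
for_posix_api = renamed.encode("utf-8")        # e.g. a *nix API taking UTF-8 bytes
for_windows_api = renamed.encode("utf-16-le")  # e.g. a Windows wide-character API

# Both byte forms describe the same 15-character path.
same_text = (
    for_posix_api.decode("utf-8") == for_windows_api.decode("utf-16-le") == renamed
)
```

The same manipulation on bytes would have to know in advance which of the two byte layouts it was looking at before it could split on the separator or change case.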
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
On 13 April 2016 at 02:15, Ethan Furman wrote:
> On 04/11/2016 04:43 PM, Victor Stinner wrote:
>>
>> On 11 Apr 2016 11:11 PM, "Ethan Furman" wrote:
>
>>> So my concern in such a case is what happens if we pass this SE
>>> string somewhere else: a UTF-8 file, or over a socket, or into a
>>> database? Does this have issues that we wouldn't face if we just used
>>> bytes?
>>
>> "SE string" are returned by os.listdir(str), os.walk(str),
>> os.getenv(str), sys.argv[int], ... since Python 3.3. Nothing new under
>> the sun.
>
> So when we pass a bytes object in, Python (on posix) converts that to a
> string using surrogateescape, gets back strings from the os, and encodes
> them back to bytes, again using surrogateescape?

On POSIX, if you pass bytes to the os module, it will pass bytes to the underlying system API, and then pass bytes back to your application.

The potentially SE-strings only come back when you pass str, and the operating system data isn't properly encoded according to the nominal filesystem encoding. They round trip nicely to other operating system APIs, but can indeed be a problem if they escape to other parts of your program (hence ideas like http://bugs.python.org/issue18814#msg251694 and the preceding discussion in that issue).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
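A short sketch of what an "SE string" looks like and why it is only a problem once it escapes. It uses the surrogateescape error handler directly with an explicit UTF-8 codec, instead of os.fsdecode(), so that the result does not depend on the platform's filesystem encoding; the file name is made up.

```python
raw = b"caf\xe9.txt"  # Latin-1 encoded name; 0xE9 is not valid UTF-8 here

# Decoding with surrogateescape maps the undecodable byte 0xE9 to the lone
# surrogate U+DCE9 instead of raising.
name = raw.decode("utf-8", errors="surrogateescape")

# Handing it back to an OS-level API round-trips to the original bytes.
round_tripped = name.encode("utf-8", errors="surrogateescape")

# The trouble starts when the string reaches code that encodes strictly,
# such as writing to a UTF-8 file, a socket or a database.
try:
    name.encode("utf-8")
    escaped_safely = True
except UnicodeEncodeError:
    escaped_safely = False
```

Here round_tripped equals raw, while the strict encode fails, which is the "round trip nicely to other operating system APIs, but can indeed be a problem if they escape" behaviour described above.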
Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!
Victor Stinner writes:
> Maybe it's time to move more 3.x buildbots to the "stable" category?
> http://buildbot.python.org/all/waterfall?category=3.x.stable

+1

I think anything that is actually stable should be in that category.

> By the way, I don't understand why "AMD64 OpenIndiana 3.x" is
> considered stable, since it has been failing with multiple issues for
> many months and nobody is working on these failures. I suggest moving
> this buildbot back to the unstable category.

+1

The bot was very stable and fast for some time but has been unstable for at least a year.

> - PPC64 AIX 3.x: failing tests: test_httplib, test_httpservers,
> test_socket, test_distutils, test_asyncio, (...); random timeout
> failure in test_eintr, etc. I don't have access to AIX and I'm not
> interested in acquiring an AIX license, nor in installing it. I'm not
> sure that it's useful to have an AIX buildbot when no core developer
> has access to AIX and nobody is working on AIX failures. Maybe HP wants
> to help us to support AIX? (Provide manpower, access to AIX servers,
> or something like that.)

Well, I think in this case it's the gcc AIX maintainer running it, so...

I think we should have a policy to stop reporting issues on unstable bots unless someone has a concrete fix OR the bot maintainers are known to fix issues fast (but that does not seem to be the case).

Stefan Krah
Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!
On 13/04/2016 12:40, Victor Stinner wrote:
> Over the last few months, most 3.x buildbots failed randomly. Some of
> them were always failing. I spent some time fixing almost all Windows
> and Linux buildbots. There were a lot of different issues.

Can I state the obvious and offer a huge vote of thanks for this work, which is often tedious and unrewarding?

Thank you

TJG
Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!
On 4/13/2016 7:40 AM, Victor Stinner wrote:
> Over the last few months, most 3.x buildbots failed randomly. Some of
> them were always failing. I spent some time fixing almost all Windows
> and Linux buildbots. There were a lot of different issues.

Thanks for all of your work on this, Victor. It's much appreciated.

Eric.
Re: [Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!
On Wed, Apr 13, 2016 at 9:40 PM, Victor Stinner wrote:
> Maybe it's time to move more 3.x buildbots to the "stable" category?
> http://buildbot.python.org/all/waterfall?category=3.x.stable

Move the Bruces into stable, perhaps? The AMD64 Debian Root one. Been fairly consistently green.

ChrisA
[Python-Dev] Most 3.x buildbots are green again, please don't break them and watch them!
Hi,

Over the last few months, most 3.x buildbots failed randomly. Some of them were always failing. I spent some time fixing almost all Windows and Linux buildbots. There were a lot of different issues.

So please try not to break the buildbots again, and remember to watch them from time to time:
http://buildbot.python.org/all/waterfall?category=3.x.stable&category=3.x.unstable

In the coming weeks, I will try to backport some fixes to Python 3.5 (if needed) to make these buildbots more stable too.

Python 2.7 buildbots are also in a sad state (ex: test_marshal segfaults on Windows, see issue #25264). But it's not easy to get a Windows machine with the right compiler to develop Python 2.7 on Windows.

--

Maybe it's time to move more 3.x buildbots to the "stable" category?
http://buildbot.python.org/all/waterfall?category=3.x.stable

By the way, I don't understand why "AMD64 OpenIndiana 3.x" is considered stable, since it has been failing with multiple issues for many months and nobody is working on these failures. I suggest moving this buildbot back to the unstable category.

--

We have many offline buildbots. What's the status of these buildbots? Should we expect them to come back soon? Or would it be possible to hide them? It would make it easier to check the status of all buildbots.

--

Failing buildbots:

- AMD64 FreeBSD CURRENT 3.x: http://bugs.python.org/issue26566 -- I installed a fresh FreeBSD CURRENT in a VM and I'm unable to reproduce the failures. Maybe the buildbot slave is outdated and FreeBSD must be upgraded?

- AMD64 OpenIndiana 3.x, x86 OpenIndiana 3.x: test_socket failures on sendfile. Sorry, but I'm not really interested in this OS.

- PPC64 AIX 3.x: failing tests: test_httplib, test_httpservers, test_socket, test_distutils, test_asyncio, (...); random timeout failure in test_eintr, etc. I don't have access to AIX and I'm not interested in acquiring an AIX license, nor in installing it. I'm not sure that it's useful to have an AIX buildbot when no core developer has access to AIX and nobody is working on AIX failures. Maybe HP wants to help us to support AIX? (Provide manpower, access to AIX servers, or something like that.)

- x86 OpenBSD 3.x: 5 tests failed, test_crypt test_socket test_ssl test_strptime test_time. This OS needs some love ;-)

- the 4 ICC buildbots are failing with stack overflow, segfault, etc. Again, I'm not sure that these buildbots are useful since it looks like we don't support this compiler yet. Or do they help with the work on supporting this compiler? Who is working on ICC support?

--

FYI I also made some enhancements to regrtest (our test runner for the test suite), mostly to debug failures:

- display the duration of tests taking longer than 30 seconds
- new timestamp prefix, used to debug buildbot hangs
- when parallel tests are interrupted, display progress while waiting for completion
- add a timeout to the main process when using -jN: it should help to debug buildbot hangs
- a "Run tests in parallel using 3 child processes" or "Run tests sequentially" message, which helps to understand how tests are running. There is the -j1 trap, which has no effect: tests are still run sequentially. By the way, I proposed to really use subprocesses when -j1 is used: http://bugs.python.org/issue25285

The default timeout changed from 1 hour to 15 min; it's the maximum duration to run a single test file (ex: test_os.py). On my Linux box, running the whole test suite in parallel (10 child processes for my 4 CPU cores with hyperthreading) with Python compiled in debug mode (slow) takes 4 min 37 sec. Tell me if the default timeout is too low. It can be configured per buildbot if needed (TESTTIMEOUT env var).

--

By the way, I'm always surprised by the huge difference in the time needed to run a build on the different slaves: from a few minutes to more than 3 hours. The fastest Windows slave takes 28 minutes (it runs tests in parallel using 4 child processes), whereas the 3 others (which run tests sequentially) take between 2 hours and more than 3 hours! Why does running tests on Windows take so long?

Maybe we should make sure that no buildbot runs tests sequentially, because it creates a lot of annoying side effects (even if it sometimes helps to find tricky bugs, sometimes bugs restricted to the tests themselves) and because a lot of tests simply wait a few seconds. So running multiple tests in parallel doesn't burn your CPU, it's just faster. IMHO the risk of random timeout failures is low compared to the speedup.

--

The most interesting bug was a deadlock in locale.setlocale() on Windows 7: the bug made the buildbot hang "sometimes" (randomly). Jeremy Kloth identified the bug, but Steve Dower informed us that it's already fixed in Visual Studio 2015 Update 1: so please update VS if you haven't already. Steve added a post-build test to check if the ucrtbase/ucrtbased DLL has the known bug.
=> http://bugs.python.org/issue26624

Victor
___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailm