Re: [Python-ideas] Verbatim names (allowing keywords as names)
On 16.05.2018 02:41, Steven D'Aprano wrote:
> Some examples:
>
>     result = \except + 1
>     result = something.\except
>     result = \except.\finally

Maybe that could get combined with Guido's original suggestion by making the \ optional after a .?

Example:

    class A ():
        \global = 'Hello'
        def __init__(self):
            self.except = 0

        def \finally(self):
            return 'bye'

    print(A.global)
    a = A()
    a.except += 1
    print(a.finally())

or with a module, in my_module.py:

    \except = 0

elsewhere:

    import my_module
    print(my_module.except)

or

    from my_module import \except
    print(\except)

Best,
Wolfgang
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
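For comparison, here is a minimal sketch of what is already possible without new syntax: keyword-named attributes can exist at runtime, and only the parser rejects spelling them out in source code. The class A mirrors the example above; using setattr/getattr as a stand-in for the proposed \name spelling is my illustration, not part of the proposal.

```python
# Keyword-named attributes are legal at runtime; only the source-level
# spelling (A.global, a.except) is a syntax error today.
class A:
    pass

setattr(A, "global", "Hello")   # stands in for the proposed  \global = 'Hello'

a = A()
setattr(a, "except", 0)         # stands in for  self.except = 0
setattr(a, "except", getattr(a, "except") + 1)   # stands in for  a.except += 1

class_value = getattr(A, "global")
instance_value = getattr(a, "except")
print(class_value, instance_value)
```

The verbosity of the getattr/setattr calls is roughly the inconvenience the verbatim-names idea is meant to remove.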
Re: [Python-ideas] Have a "j" format option for lists
On 05/09/2018 02:39 PM, Facundo Batista wrote:
> This way, I could do:
>
>     >>> authors = ["John", "Mary", "Estela"]
>     >>> "Authors: {:, j}".format(authors)
>     'Authors: John, Mary, Estela'
>
> In this case the join can be made in the format yes, but this proposal would be very useful when the info to format comes inside a structure together with other stuff, like...
>
>     >>> info = {
>     ...     'title': "A book",
>     ...     'price': Decimal("2.34"),
>     ...     'authors': ["John", "Mary", "Estela"],
>     ... }
>     >>> print("{title!r} (${price}) by {authors:, j}".format(**info))
>     "A book" ($2.34) by John, Mary, Estela
>
> What do you think?

For reference (first message of a rather long previous thread):
https://mail.python.org/pipermail/python-ideas/2015-September/035787.html
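The proposed behaviour can be approximated today with a user-defined __format__ method. The sketch below is only an illustration under my own assumptions: the class name Joinable is hypothetical, and I assume the spec means "everything before the trailing j is the separator".

```python
class Joinable(list):
    """Hypothetical list subclass that understands a '<separator>j' format spec."""

    def __format__(self, spec):
        if spec.endswith("j"):
            separator = spec[:-1]          # ", j" -> ", "
            return separator.join(str(item) for item in self)
        # Any other spec is handled (or rejected) by the default machinery.
        return super().__format__(spec)


authors = Joinable(["John", "Mary", "Estela"])
line = "Authors: {:, j}".format(authors)
print(line)
```

The difference from the proposal is that the data has to be wrapped first, whereas the "j" option would work on any plain list.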
Re: [Python-ideas] Official site-packages/test directory
On 01/19/2018 05:48 PM, Guido van Rossum wrote:
> On Fri, Jan 19, 2018 at 8:30 AM, Wolfgang Maier <wolfgang.ma...@biologie.uni-freiburg.de> wrote:
>> I think that's a really nice idea. With an official site-packages/test directory there could be pip support for optionally installing tests alongside a package if its layout allows it. So end users could just install things without tests, but developers could do:
>>
>>     pip install --with-tests
>>
>> or something to get everything?
>
> Oh, I just realized there's another problem here. The existing 'test' package (which is not a namespace package) would hide the site-packages/test directory.

Well, that shouldn't be a big obstacle since one could just as well choose another name ( __tests__ for example?).

Alternatively, package-specific test directories could exist *inside* site-packages. So much like today's .dist-info directories there could be .test dirs?
Re: [Python-ideas] Official site-packages/test directory
On 01/19/2018 03:27 PM, Stefan Krah wrote:
> Hello,
>
> I wonder if we could get an official site-packages/test directory. Currently it seems to be problematic to distribute tests if they are outside the package directory.
>
> Here is a nice overview of the two main layout possibilities:
>
> http://pytest.readthedocs.io/en/reorganize-docs/new-docs/user/directory_structure.html
>
> I like the outside-the-package approach, mostly for reasons described very eloquently here:
>
> http://python-notes.curiousefficiency.org/en/latest/python_concepts/import_traps.html
>
> CPython itself of course also uses Lib/foo.py and Lib/test/test_foo.py, so it would make sense to have site-packages/foo.py and site-packages/test/test_foo.py.
>
> For me, this is the natural layout.

I think that's a really nice idea. With an official site-packages/test directory there could be pip support for optionally installing tests alongside a package if its layout allows it. So end users could just install things without tests, but developers could do:

    pip install --with-tests

or something to get everything?

Wolfgang
Re: [Python-ideas] tweaking the file system path protocol
On 05/29/2017 09:55 AM, Serhiy Storchaka wrote:
> 29.05.17 00:33, Wolfgang Maier wrote:
>> The path protocol does *not* use __fspath__ as an indicator that an object's str-representation is intended to be used as a path. If you had wanted this, the PEP should have defined __fspath__ not as a method, but as a flag and have the protocol check that flag, then call __str__ if appropriate.
>
> __fspath__ is a method because there is a need to support bytes paths. __fspath__() can return a bytes object, str() can't.

That's certainly one reason, but again it just shows that calling str(path_object) to get a path representation is wrong.
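A small sketch of the point about bytes paths. The class BytesPath and the path it holds are made up for illustration; os.fspath is the real entry point of the protocol.

```python
import os


class BytesPath:
    """Hypothetical path-like class whose path is a bytes object."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw


p = BytesPath(b"/tmp/data.bin")
via_protocol = os.fspath(p)   # the bytes path supplied by __fspath__
via_str = str(p)              # the default object repr, which is not a path at all
print(via_protocol)
print(via_str)
```

Here str() does not even fail; it just returns something that is useless as a path.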
Re: [Python-ideas] tweaking the file system path protocol
On 28.05.2017 18:32, Steven D'Aprano wrote:
> On Sun, May 28, 2017 at 05:35:38PM +0300, Koos Zevenhoven wrote:
>> Don't get me wrong, I like consistency very much. But regarding the __fspath__ case, there are not that many people *writing* fspath-enabled classes. Instead, there are many many many more people *using* such classes (and dealing with their compatibility issues in different ways).
>
> What sort of compatibility issues are you referring to? os.fspath is new in 3.6, and 3.7 isn't out yet, so I'm having trouble understanding what compatibility issues you mean.

As far as I'm aware the only such issue people had was with building interfaces that could deal with regular strings and pathlib.Path (introduced in 3.4 if I remember correctly) instances alike. Because calling str on a pathlib.Path instance returns the path as a regular string it looked like it could become a (bad) habit to just always call str on any received object for "compatibility" with both types of path representations. The path protocol is a response to this that provides an explicit and safe alternative.

>> For those people, the current behavior brings consistency
>
> That's a very unintuitive statement. How is it consistent for fspath to call the __fspath__ dunder method for some objects but ignore it for others?

The path protocol brings a standard way of dealing with diverse path representations, but only if you use it. If people keep using str(path_object) as before, then they are doing things wrongly and are no better or safer off than they were before!

The path protocol does *not* use __fspath__ as an indicator that an object's str-representation is intended to be used as a path. If you had wanted this, the PEP should have defined __fspath__ not as a method, but as a flag and have the protocol check that flag, then call __str__ if appropriate.

With __fspath__ being a method that can return whatever its author sees fit, calling str to get a path from an arbitrary object is just as wrong as it always was - it will work for pathlib.Path objects and might or might not work for some other types. Importantly, this has nothing to do with this proposal, but is in the nature of the protocol as it is defined *now*.

>> ---after all, it was of course designed by thinking about it from all angles and not just based on my or anyone else's own use cases only.
>
> Can you explain the reasoning to us? I don't think it is explained in the PEP.
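To make the "explicit and safe alternative" concrete, here is a minimal sketch contrasting the two habits. The helper names path_via_str and path_via_protocol are mine; os.fspath and pathlib.PurePosixPath are the real stdlib pieces.

```python
import os
import pathlib


def path_via_str(obj):
    # The old habit: works for str and pathlib paths, but accepts anything.
    return str(obj)


def path_via_protocol(obj):
    # The path protocol: str, bytes and objects with __fspath__ only.
    return os.fspath(obj)


good = pathlib.PurePosixPath("data/events.log")
same_result = path_via_str(good) == path_via_protocol(good)

not_a_path = None
silently_wrong = path_via_str(not_a_path)    # the string 'None'
try:
    path_via_protocol(not_a_path)
    rejected = False
except TypeError:
    rejected = True

print(same_result, repr(silently_wrong), rejected)
```

With str() a bug upstream can end up as a file literally named "None", while os.fspath fails immediately.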
Re: [Python-ideas] tweaking the file system path protocol
On 05/24/2017 02:41 AM, Steven D'Aprano wrote:
> On Wed, May 24, 2017 at 12:18:16AM +0300, Serhiy Storchaka wrote:
>> It seems to me that the purpose of this proposition is not performance, but the possibility to use __fspath__ in str or bytes subclasses. Currently defining __fspath__ in str or bytes subclasses doesn't have any effect.
>
> That's how I interpreted the proposal, with any performance issue being secondary. (I don't expect that converting path-like objects to strings would be the bottleneck in any application doing actual disk IO.)
>
>> I don't know a reasonable use case for this feature. The __fspath__ method of str or bytes subclasses returning something not equivalent to self looks confusing to me.
>
> I can imagine at least two:
>
> - emulating something like DOS 8.3 versus long file names;
> - case normalisation
>
> but what would make this really useful is for debugging. For instance, I have used something like this to debug problems with int() being called wrongly:
>
>     py> class MyInt(int):
>     ...     def __int__(self):
>     ...         print("__int__ called")
>     ...         return super().__int__()
>     ...
>     py> x = MyInt(23)
>     py> int(x)
>     __int__ called
>     23
>
> It would be annoying and inconsistent if int(x) avoided calling __int__ on int subclasses. But that's exactly what happens with fspath and str. I see that as a bug, not a feature: I find it hard to believe that we would design an interface for string-like objects (paths) and then intentionally prohibit it from applying to strings.
>
> And if we did, surely it's a misfeature. Why *shouldn't* subclasses of str get the same opportunity to customize the result of __fspath__ as they get to customize their __repr__ and __str__?
>
>     py> class MyStr(str):
>     ...     def __repr__(self):
>     ...         return 'repr'
>     ...     def __str__(self):
>     ...         return 'str'
>     ...
>     py> s = MyStr('abcdef')
>     py> repr(s)
>     'repr'
>     py> str(s)
>     'str'

This is almost exactly what I have been thinking (just that I couldn't have presented it so clearly)!

Let's look at a potential use case for this. Assume that in a package you want to handle several paths to different files and directories that are all located in a common package-specific parent directory. Then using the path protocol you could write this:

    import os

    class PackageBase (object):
        basepath = '/home/.package'

    class PackagePath (str, PackageBase):
        def __fspath__ (self):
            return os.path.join(self.basepath, str(self))

    config_file = PackagePath('.config')
    log_file = PackagePath('events.log')
    data_dir = PackagePath('data')

    # open for appending so that the write below is allowed
    with open(log_file, 'a') as log:
        log.write('package paths initialized.\n')

Just that this wouldn't currently work because PackagePath inherits from str. Of course, there are other ways to achieve the above, but when you think about designing a Path-like object class, str is just a pretty attractive base class to start from.

Now let's look at compatibility of a class like PackagePath under this proposal:

- if client code uses e.g. str(config_file) and proceeds to treat the resulting object as a path, unexpected things will happen and, yes, that's bad. However, this is no different from any other Path-like object for which __str__ and __fspath__ don't define the same return value.

- if client code uses the PEP-recommended backwards-compatible way of dealing with paths,

      path.__fspath__() if hasattr(path, "__fspath__") else path

  things will just work. Interestingly, this would *currently* produce an unexpected result, namely that it would execute the __fspath__ method of the str-subclass.

- if client code uses instances of PackagePath as paths directly, then in Python 3.6 and below that would lead to an unintended outcome, while in Python 3.7 things would work. This is *really* bad.

But what it means is that, under the proposal, using a str or bytes subclass with an __fspath__ method defined makes your code backwards-incompatible, and the solution would be not to use such a class if you want to be backwards-compatible (and that should get documented somewhere). This restriction, of course, limits the usefulness of the proposal in the near future, but that disadvantage will vanish over time. In five years, no longer supporting Python 3.6 may not be a big deal (for comparison, Python 3.2 was released 6 years ago and pip stopped supporting it last year). As Steven pointed out, the proposal is *very* unlikely to break existing code.

So to summarize, the proposal

- avoids an up-front isinstance check in the protocol and thereby speeds up the processing of exact strings and bytes and of anything that follows the path protocol.*

- slows down the processing of instances of regular str and bytes subclasses*

- makes the "path.__fspath__() if hasattr(path, "__fspath__") else path" idiom consistent for subclasses of str and bytes that define __fspath__

- opens up the opportunity to write str/bytes subclasses that represent a path other than just their self in the future*
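The current behaviour described above can be checked with a short sketch. It reuses the PackagePath idea from the message, but with posixpath.join (my substitution) so that the result does not depend on the platform, and without touching the file system.

```python
import os
import posixpath


class PackagePath(str):
    basepath = "/home/.package"

    def __fspath__(self):
        return posixpath.join(self.basepath, str(self))


log_file = PackagePath("events.log")

# Calling the method directly gives the intended full path ...
direct = log_file.__fspath__()

# ... but os.fspath() sees a str instance first and returns it unchanged,
# so __fspath__ is never consulted for str subclasses.
through_os = os.fspath(log_file)

# The PEP-recommended compatibility idiom does call the method.
idiom = log_file.__fspath__() if hasattr(log_file, "__fspath__") else log_file

print(direct, through_os, idiom)
```

So today the idiom and os.fspath disagree for such a class, which is the inconsistency the third summary point refers to.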
Re: [Python-ideas] tweaking the file system path protocol
On 05/23/2017 06:17 PM, Koos Zevenhoven wrote:
> On Tue, May 23, 2017 at 1:12 PM, Wolfgang Maier wrote:
>> What do you think of this idea for a slight modification to os.fspath: the current version checks whether its arg is an instance of str, bytes or any subclass and, if so, returns the arg unchanged. In all other cases it tries to call the type's __fspath__ method to see if it can get str, bytes, or a subclass thereof this way.
>>
>> My proposal is to change this to:
>> 1) check whether the type of the argument is str or bytes *exactly*; if so, return the argument unchanged
>> 2) check whether __fspath__ can be called on the type and returns an instance of str, bytes, or any subclass (just like in the current version)
>> 3) check whether the type is a subclass of str or bytes and, if so, return it unchanged

Hi Koos and thanks for your detailed response,

> The reason why this was not done was that a str or bytes subclass that implements __fspath__(self) would work in both pre-3.6 and 3.6+ but behave differently. This would also be incompatible with existing code using str(path) for compatibility with the stdlib (the old way, which people still use for pre-3.6 compatibility even in new code).

I'm not sure that sounds very convincing because that exact problem exists, was discussed and accepted in your PEP 519 for all other classes. I do not really see why subclasses of str and bytes should require special backwards compatibility here. Is there a reason why you are thinking they should be treated specially?

>> This would have the following implications:
>> a) it would speed up the very common case when the arg is either a str or a bytes instance exactly
>
> To get the same performance benefit for str and bytes, but without changing functionality, there could first be the exact type check and then the isinstance check. This would add some performance penalty for PathLike objects. Removing the isinstance part of the __fspath__() return value, which I find less useful, would compensate for that. (3) would not be necessary in this version.

Right, that was one thing I forgot to mention in my list. My proposal would also speed up processing of pathlike objects because it moves the __fspath__ call up in front of the isinstance check. Your alternative would speed up only str and bytes, but would slow down Path-like classes. In addition, I'm not sure that removing the isinstance check on the return value of __fspath__() is a good idea because that would mean giving up the guarantee that os.fspath returns an instance of str or bytes and would effectively force library code to do the isinstance check anyway even if the function may have performed it already, which would worsen performance further.

> Are you asking for other reasons, or because you actually have a use case where this matters? If this performance really matters somewhere, the version I describe above could be considered. It would have 100% backwards compatibility, or a little less (99% ?) if the isinstance check of the __fspath__() return value is removed for performance compensation.

That use case question is somewhat difficult to answer. I had this idea when working on two bug tracker issues (one concerning fnmatch and a follow-up one on os.path.normcase, which is called by fnmatch.filter and, in turn, calls os.fspath). fnmatch.filter is a case where performance matters and the decision when and where to call the rather expensive os.path.normcase->os.fspath there is not entirely straightforward. So, yes, I was basically looking at this because of a potential use case, but I say potential because I'm far from sure that any speed gain in os.fspath will be big enough to be useful for fnmatch.filter in the end.

>> b) user-defined classes that inherit from str or bytes could control their path representation just like any other class
>
> Again, this would cause differences in behavior between different Python versions, and based on whether str(path) is used or not.
>
> -- Koos
Re: [Python-ideas] tweaking the file system path protocol
On 05/23/2017 06:41 PM, Wolfgang Maier wrote:
> On 05/23/2017 06:17 PM, Koos Zevenhoven wrote:
>> On Tue, May 23, 2017 at 1:12 PM, Wolfgang Maier wrote:
>>> What do you think of this idea for a slight modification to os.fspath: the current version checks whether its arg is an instance of str, bytes or any subclass and, if so, returns the arg unchanged. In all other cases it tries to call the type's __fspath__ method to see if it can get str, bytes, or a subclass thereof this way.
>>>
>>> My proposal is to change this to:
>>> 1) check whether the type of the argument is str or bytes *exactly*; if so, return the argument unchanged
>>> 2) check whether __fspath__ can be called on the type and returns an instance of str, bytes, or any subclass (just like in the current version)
>>> 3) check whether the type is a subclass of str or bytes and, if so, return it unchanged
>
> Hi Koos and thanks for your detailed response,
>
>> The reason why this was not done was that a str or bytes subclass that implements __fspath__(self) would work in both pre-3.6 and 3.6+ but behave differently. This would also be incompatible with existing code using str(path) for compatibility with the stdlib (the old way, which people still use for pre-3.6 compatibility even in new code).
>
> I'm not sure that sounds very convincing because that exact problem exists, was discussed and accepted in your PEP 519 for all other classes. I do not really see why subclasses of str and bytes should require special backwards compatibility here. Is there a reason why you are thinking they should be treated specially?

Ah, sorry, I misunderstood what you were trying to say, but now I'm getting it! Subclasses of str and bytes were of course usable as path arguments before simply because they were subclasses of them. Now they would be picked up based on their __fspath__ method, but old versions of Python executing code using them would still use them directly. Have to think about this one a bit, but thanks for pointing it out.

[...]
[Python-ideas] tweaking the file system path protocol
What do you think of this idea for a slight modification to os.fspath: the current version checks whether its arg is an instance of str, bytes or any subclass and, if so, returns the arg unchanged. In all other cases it tries to call the type's __fspath__ method to see if it can get str, bytes, or a subclass thereof this way.

My proposal is to change this to:

1) check whether the type of the argument is str or bytes *exactly*; if so, return the argument unchanged
2) check whether __fspath__ can be called on the type and returns an instance of str, bytes, or any subclass (just like in the current version)
3) check whether the type is a subclass of str or bytes and, if so, return it unchanged

This would have the following implications:

a) it would speed up the very common case when the arg is either a str or a bytes instance exactly
b) user-defined classes that inherit from str or bytes could control their path representation just like any other class
c) subclasses of str/bytes that don't define __fspath__ would still work like they do now, but their processing would be slower
d) subclasses of str/bytes that accidentally define a __fspath__ method would change their behavior

I think cases c) and d) could be sufficiently rare that the pros outweigh the cons?

Here's how the proposal could be implemented in the pure Python version (os._fspath):

    def _fspath(path):
        path_type = type(path)
        if path_type is str or path_type is bytes:
            return path

        # Work from the object's type to match method resolution of other magic
        # methods.
        try:
            path_repr = path_type.__fspath__(path)
        except AttributeError:
            if hasattr(path_type, '__fspath__'):
                raise
            elif issubclass(path_type, (str, bytes)):
                return path
            else:
                raise TypeError("expected str, bytes or os.PathLike object, "
                                "not " + path_type.__name__)
        if isinstance(path_repr, (str, bytes)):
            return path_repr
        else:
            raise TypeError("expected {}.__fspath__() to return str or bytes, "
                            "not {}".format(path_type.__name__,
                                            type(path_repr).__name__))
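To see implications b) and c) side by side with today's os.fspath, here is a condensed sketch of the proposed lookup order. The name proposed_fspath and the two example classes are hypothetical, and the getattr-based lookup is a simplification of the try/except version: it does not try to distinguish an AttributeError raised inside __fspath__.

```python
import os


def proposed_fspath(path):
    """Condensed sketch of the proposed order: exact type, protocol, subclass."""
    path_type = type(path)
    if path_type is str or path_type is bytes:          # step 1
        return path
    fspath = getattr(path_type, "__fspath__", None)
    if fspath is not None:                              # step 2
        result = fspath(path)
        if isinstance(result, (str, bytes)):
            return result
        raise TypeError("expected {}.__fspath__() to return str or bytes, "
                        "not {}".format(path_type.__name__,
                                        type(result).__name__))
    if issubclass(path_type, (str, bytes)):             # step 3
        return path
    raise TypeError("expected str, bytes or os.PathLike object, "
                    "not " + path_type.__name__)


class PlainStr(str):
    """str subclass without __fspath__ (implication c)."""


class Redirected(str):
    """str subclass that defines __fspath__ (implications b and d)."""

    def __fspath__(self):
        return "/elsewhere/" + self


print(proposed_fspath(PlainStr("x")), os.fspath(PlainStr("x")))
print(proposed_fspath(Redirected("x")), os.fspath(Redirected("x")))
```

Only the Redirected case differs between the two functions, which is exactly the behavior change the proposal is about.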
Re: [Python-ideas] fnmatch.filter_false
On 19.05.2017 20:01, tritium-l...@sdamon.com wrote:
> On Friday, May 19, 2017 10:03 AM, Wolfgang Maier wrote:
>> On 05/17/2017 07:55 PM, tritium-l...@sdamon.com wrote:
>>> Top posting, apologies.
>>>
>>> I'm sure there is a better way to do it, and there is a performance hit, but it's negligible. This is also a three line delta of the function.
>>>
>>> [...]
>>>
>>> The first test (modified code) timed at 22.492161903402575, where the second test (unmodified) timed at 19.31892032324
>>
>> If you don't care about slow-downs in this range, you could use this pattern:
>>
>>     excluded = set(filter(data, '__*'))
>>     result = [item for item in data if item not in excluded]
>>
>> It seems to take just as much longer although the slow-down is not constant but depends on the size of the set you need to generate.
>>
>> Wolfgang
>
> If I didn't care about performance, I wouldn't be using filter - the only reason to use filter over a list comprehension is performance. The standard library has a performant inclusion filter, but does not have a performant exclusion filter.

I'm sorry, but then your statement above doesn't make any sense to me: "I'm sure there is a better way to do it, and there is a performance hit, but its negligible."

I'm proposing an alternative to you whose timing is very similar to that of your own suggestion, without copying or modifying stdlib code.

That said, I still like your idea of adding the exclude functionality to fnmatch. I just thought you may be interested in a solution that works right now.
Re: [Python-ideas] fnmatch.filter_false
On 05/17/2017 07:55 PM, tritium-l...@sdamon.com wrote:
> Top posting, apologies.
>
> I'm sure there is a better way to do it, and there is a performance hit, but its negligible. This is also a three line delta of the function.
>
>     from fnmatch import _compile_pattern, filter as old_filter
>     import os
>     import os.path
>     import posixpath
>
>     data = os.listdir()
>
>     def filter(names, pat, *, invert=False):
>         """Return the subset of the list NAMES that match PAT."""
>         result = []
>         pat = os.path.normcase(pat)
>         match = _compile_pattern(pat)
>         if os.path is posixpath:
>             # normcase on posix is NOP. Optimize it away from the loop.
>             for name in names:
>                 if bool(match(name)) == (not invert):
>                     result.append(name)
>         else:
>             for name in names:
>                 if bool(match(os.path.normcase(name))) == (not invert):
>                     result.append(name)
>         return result
>
>     if __name__ == '__main__':
>         import timeit
>         print(timeit.timeit(
>             "filter(data, '__*')",
>             setup="from __main__ import filter, data"
>         ))
>         print(timeit.timeit(
>             "filter(data, '__*')",
>             setup="from __main__ import old_filter as filter, data"
>         ))
>
> The first test (modified code) timed at 22.492161903402575, where the second test (unmodified) timed at 19.31892032324

If you don't care about slow-downs in this range, you could use this pattern:

    excluded = set(filter(data, '__*'))
    result = [item for item in data if item not in excluded]

It seems to take just as much longer although the slow-down is not constant but depends on the size of the set you need to generate.

Wolfgang
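The set-based pattern can be tried on fixed data with the unmodified stdlib function. The list of names below is made up so that the result is predictable; the thread used os.listdir() instead.

```python
import fnmatch

# Made-up sample data standing in for os.listdir().
names = ["__init__.py", "main.py", "__pycache__", "README", "utils.py"]

# Let the fast stdlib filter find the matches, then keep everything else.
excluded = set(fnmatch.filter(names, "__*"))
kept = [name for name in names if name not in excluded]

print(kept)
```

The extra cost compared with a dedicated exclusion filter is building the set and one more pass over the input, which is why the slow-down grows with the number of matches.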
Re: [Python-ideas] Way to repeat other than "for _ in range(x)"
On 03/30/2017 04:23 PM, Pavol Lisy wrote:
On 3/30/17, Nick Coghlan wrote:
On 30 March 2017 at 19:18, Markus Meskanen wrote:

    d = [[0] * 5 for _ in range(10)]

    d = [[0]*5]*10  # what about this?

These are not quite the same when the repeated object is mutable. Compare:

    >>> matrix1 = [[0] * 5 for _ in range(10)]
    >>> matrix1[0].append(1)
    >>> matrix1
    [[0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]

    >>> matrix2=[[0]*5]*10
    >>> matrix2[0].append(1)
    >>> matrix2
    [[0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1]]

so the comprehension is usually necessary.

Wolfgang
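The reason for the difference is object identity, which a short sketch can show directly. The variable names here are mine, and the matrices are smaller than in the session above.

```python
independent = [[0] * 5 for _ in range(3)]   # the inner list is built three times
aliased = [[0] * 5] * 3                     # one inner list, referenced three times

independent_rows_distinct = independent[0] is not independent[1]
aliased_rows_shared = aliased[0] is aliased[1]

aliased[0][0] = 9
first_column = [row[0] for row in aliased]  # the change shows up in every "row"

print(independent_rows_distinct, aliased_rows_shared, first_column)
```

The inner [0] * 5 is harmless only because integers are immutable; the outer multiplication repeats references to one mutable list.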
Re: [Python-ideas] for/except/else
On 03/03/2017 04:36 AM, Nick Coghlan wrote: On 2 March 2017 at 21:06, Wolfgang Maier mailto:wolfgang.ma...@biologie.uni-freiburg.de>> wrote: - overall I looked at 114 code blocks that contain one or more breaks Thanks for doing that research :) Of the remaining 19 non-trivial cases - 9 are variations of your classical search idiom above, i.e., there's an else clause there and nothing more is needed - 6 are variations of your "nested side-effects" form presented above with debatable (see above) benefit from except break - 2 do not use an else clause currently, but have multiple breaks that do partly redundant things that could be combined in a single except break clause Those 8 cases could also be reviewed to see whether a flag variable might be clearer than relying on nested side effects or code repetition. [...] This is a case where a flag variable may be easier to read than loop state manipulations: may_have_common_prefix = True while may_have_common_prefix: prefix = None for item in items: if not item: may_have_common_prefix = False break if prefix is None: prefix = item[0] elif item[0] != prefix: may_have_common_prefix = False break else: # all subitems start with a common "prefix". 
# move it out of the branch for item in items: del item[0] subpatternappend(prefix) Although the whole thing could likely be cleaned up even more via itertools.zip_longest: for first_uncommon_idx, aligned_entries in enumerate(itertools.zip_longest(*items)): if not all_true_and_same(aligned_entries): break else: # Everything was common, so clear all entries first_uncommon_idx = None for item in items: del item[:first_uncommon_idx] (Batching the deletes like that may even be slightly faster than deleting common entries one at a time) Given the following helper function: def all_true_and_same(entries): itr = iter(entries) try: first_entry = next(itr) except StopIteration: return False if not first_entry: return False for entry in itr: if not entry or entry != first_entry: return False return True - finally, 1 is a complicated break dance to achieve sth that clearly would have been easier with except break; from typing.py: [...] I think is another case that is asking for the inner loop to be factored out to a named function, not for reasons of re-use, but for reasons of making the code more readable and self-documenting :) It's true that using a flag or factoring out redundant code is always a possibility. Having the except clause would clearly not let people do anything they couldn't have done before. On the other hand, the same is true for the else clause - it's only advantage here is that it's existing already - because a single flag could always distinguish between a break having occurred or not: brk = False for item in iterable: if some_condition: brk = True break if brk: do_stuff_upon_breaking_out() else: do_alternative_stuff() is a general pattern that would always work without except *and* else. However, the fact that else exists generates a regrettable asymmetry in that there is direct language support for detecting one outcome, but not the other. Stressing the analogy to try/except/else one more time, it's as if "else" wasn't available for try blocks. 
You could always use a flag to substitute for it:

    dealt_with_exception = False
    try:
        do_stuff()
    except:
        deal_with_exception()
        dealt_with_exception = True
    if not dealt_with_exception:
        do_stuff_you_would_do_in_an_else_block()

So IMO the real difference here is that the except clause after for would require adding it to the language, while the else clauses are there already. With that we're back at the high bar for adding new syntax :(

A somewhat similar case that comes to mind here is PEP 315 -- Enhanced While Loop, which got rejected for two reasons, the first one being pretty much the same as the argument here, i.e., that instead of the proposed do .. while it's always possible to factor out or duplicate a line of code. However, the second reason was that it required the new "do" keyword, something not necessary for the current suggestion.
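To make the try/except/else analogy concrete, here is a minimal runnable sketch comparing the built-in else clause with the flag-based substitute. The function names (with_else, with_flag, ok, fail) and the choice of ValueError are mine, for illustration only:

```python
def with_else(action):
    """Use the built-in else clause of try/except/else."""
    log = []
    try:
        action()
    except ValueError:
        log.append("handled")
    else:
        log.append("no exception")
    return log


def with_flag(action):
    """Emulate the else clause with a manually managed flag."""
    log = []
    dealt_with_exception = False
    try:
        action()
    except ValueError:
        log.append("handled")
        dealt_with_exception = True
    # The "else" branch must run only when no exception was handled.
    if not dealt_with_exception:
        log.append("no exception")
    return log


def ok():
    """Hypothetical action that succeeds."""
    return None


def fail():
    """Hypothetical action that raises."""
    raise ValueError("boom")
```

Both variants produce the same log for both actions, which is the sense in which the else clause is "only" a convenience.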
Re: [Python-ideas] for/except/else
On 03/02/2017 07:05 PM, Brett Cannon wrote:

- overall I looked at 114 code blocks that contain one or more breaks

I wanted to say thanks for taking the time to go through the stdlib and doing such a thorough analysis of the impact of your suggestion! It always helps to have real-world numbers to know whether an idea will be useful (or not).

- 84 of these are trivial use cases that simply break out of a while True block or terminate a while/for loop prematurely (no use for any follow-up clause there)
- 8 more are causing a side-effect before a single break, and it would be pointless to put this into an except break clause
- 3 more cause different, non-redundant side-effects before different breaks from the same loop and, obviously, an except break clause would not help them either

=> So the vast majority of breaks does *not* need an except break *nor* an else clause, but that's just as expected.

Of the remaining 19 non-trivial cases
- 9 are variations of your classical search idiom above, i.e., there's an else clause there and nothing more is needed
- 6 are variations of your "nested side-effects" form presented above with debatable (see above) benefit from except break
- 2 do not use an else clause currently, but have multiple breaks that do partly redundant things that could be combined in a single except break clause
- 1 is an example of breaking out of two loops; from sre_parse._parse_sub: [...]
- finally, 1 is a complicated break dance to achieve something that clearly would have been easier with except break; from typing.py:

My summary: I do see use-cases for the except break clause, but, admittedly, they are relatively rare and may not be worth the hassle of introducing new syntax.

IOW out of 114 cases, 4 may benefit from an 'except' block? If I'm reading those numbers correctly then ~3.5% of cases would benefit which isn't high enough to add the syntax and related complexity IMO.
Hmm, I'm not sure how much sense it makes to express this in percent since the total you're comparing to is rather arbitrary. The 114 cases include *any* for/while loop I could find that contains at least a single break. More than 90 of these loops do not use an "else" clause either, showing that even this currently supported syntax is used rarely.

I found only 19 cases that are complex enough to be candidates for an except clause (17 of these use the else clause). For 9 of these 19 (the ones using the classical search idiom) an except clause would not be applicable, but it could be used in the 10 remaining cases (though all of them could also make use of a flag or could be refactored instead). So depending on what you want to emphasize you could also say that the proposal could affect as much as 10/19 or 52.6% of cases.
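For anyone who wants to check the two readings of the numbers, here is a small sketch that only recomputes the counts quoted in this thread. The variable names and the percent helper are mine, chosen for illustration:

```python
# Counts as reported in the stdlib survey quoted above.
trivial = 84 + 8 + 3          # breaks with no use for a follow-up clause
non_trivial = 19              # loops complex enough to be candidates
total = trivial + non_trivial

search_idiom = 9              # classical search idiom: else is all that's needed
could_use_except = non_trivial - search_idiom


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


share_of_all_loops = percent(4, total)                        # Brett's reading
share_of_candidates = percent(could_use_except, non_trivial)  # the 10/19 reading
```

The same survey yields roughly 3.5% or roughly 52.6% depending only on which denominator is chosen, which is why the percentage on its own says little.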
Re: [Python-ideas] for/except/else
On 02.03.2017 06:46, Nick Coghlan wrote:

On 1 March 2017 at 19:37, Wolfgang Maier <wolfgang.ma...@biologie.uni-freiburg.de> wrote:

Now here's the proposal: allow an except (or except break) clause to follow for/while loops that will be executed if the loop was terminated by a break statement.

Now while it's possible that Nick had a good reason not to do so,

I never really thought about it, as I only use the "else:" clause for search loops where there aren't any side effects in the "break" case (other than the search result being bound to the loop variable), so while I find "except break:" useful as an explanatory tool, I don't have any practical need for it. I think you've made as strong a case for the idea as could reasonably be made :)

However, Steven raises a good point that this would complicate the handling of loops in the code generator a fair bit, as it would add up to two additional jump targets in cases where the new clause was used. Currently, compiling loops only needs to track the start of the loop (for continue), and the first instruction after the loop (for break). With this change, they'd also need to track:
- the start of the "except break" clause (for break when the clause is used)
- the start of the "else" clause (for the non-break case when both trailing clauses are present)

I think you could get away with only one additional jump target as I showed in my previous reply to Steven. The heavier burden would be on the parser, which would have to distinguish the existing and the two new loop variants (loop with except clause, loop with except and else clause) but, anyway, that's probably not really the point. What weighs heavier, I think, is your design argument.
The design level argument against adding the clause is that it breaks the "one obvious way" principle, as the preferred form for search loops looks like this:

    for item in iterable:
        if condition(item):
            break
    else:
        # Else clause either raises an exception or sets a default value
        item = get_default_value()
    # If we get here, we know "item" is a valid reference
    operation(item)

And you can easily switch the `break` out for a suitable `return` if you move this into a helper function:

    def find_item_of_interest(iterable):
        for item in iterable:
            if condition(item):
                return item
        # The early return means we can skip using "else"
        return get_default_value()

Given that basic structure as a foundation, you only switch to the "nested side effect" form if you have to:

    for item in iterable:
        if condition(item):
            operation(item)
            break
    else:
        # Else clause neither raises an exception nor sets a default value
        condition_was_never_true(iterable)

This form is generally less amenable to being extracted into a reusable helper function, since it couples the search loop directly to the operation performed on the bound item, whereas decoupling them gives you a lot more flexibility in the eventual code structure.

The proposal in this thread then has the significant downside of only covering the "nested side effect" case:

    for item in iterable:
        if condition(item):
            break
    except break:
        operation(item)
    else:
        condition_was_never_true(iterable)

While being even *less* amenable to being pushed down into a helper function (since converting the "break" to a "return" would bypass the "except break" clause).

I'm actually not quite buying this last argument. If you wanted to refactor this to "return" instead of "break", you could simply put the return into the except break block. In many real-world situations with multiple breaks from a loop this could actually make things easier instead of worse.
Personally, the "nested side effect" form makes me uncomfortable every time I use it because the side effects on breaking or not breaking the loop don't end up at the same indentation level and not necessarily together. However, I'm gathering from the discussion so far that not too many people are thinking like me about this point, so maybe I should simply adjust my mind-set. All that said, this is a very nice abstract view on things! I really learned quite a bit from this, thank you :) As always though, reality can be expected to be quite a bit more complicated than theory so I decided to check the stdlib for real uses of break. This is quite a tedious task since break is used in many different ways and I couldn't come up with a good automated way of classifying them. So what I did is just go through stdlib code (in reverse alphabetical order) containing the break keyword and put it i
Re: [Python-ideas] for/except/else
On 01.03.2017 12:56, Steven D'Aprano wrote:

- How is this implemented? Currently "break" is a simple unconditional GOTO which jumps past the for block. This will need to change to something significantly more complex.

One way to implement this with unconditional GOTOs would be (in pseudocode):

    LOOP:
        on break GOTO EXCEPT
    ELSE:
        ...
        GOTO THEN
    EXCEPT:
        ...
    THEN:
        ...

So at the byte-code level (but only there) the order of except and else would be reversed. Was that a reason why you were asking about the order of except and else in my proposal?

Anyway, I'm sure there are people much more skilled at compiler programming than me here.
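To illustrate the jump layout, here is a rough sketch that simulates the labels as states of a tiny state machine in plain Python. This is not how CPython compiles loops; the run function, its parameters and the trace strings are hypothetical names used only to show the order in which the blocks would be visited:

```python
def run(items, should_break):
    """Simulate LOOP/ELSE/EXCEPT/THEN with explicit 'jumps' between labels."""
    trace = []
    label = "LOOP"
    while label != "END":
        if label == "LOOP":
            label = "ELSE"                # normal exit falls through to ELSE
            for item in items:
                trace.append(f"body:{item}")
                if should_break(item):
                    label = "EXCEPT"      # on break GOTO EXCEPT
                    break
        elif label == "ELSE":
            trace.append("else")
            label = "THEN"                # GOTO THEN, jumping over EXCEPT
        elif label == "EXCEPT":
            trace.append("except")
            label = "THEN"                # falls through into THEN
        elif label == "THEN":
            trace.append("then")
            label = "END"
    return trace
```

With this layout a break needs only one extra target (EXCEPT), and the else block is the one that ends with an explicit jump.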
Re: [Python-ideas] for/except/else
On 01.03.2017 12:56, Steven D'Aprano wrote:

On Wed, Mar 01, 2017 at 10:37:17AM +0100, Wolfgang Maier wrote:

Now here's the proposal: allow an except (or except break) clause to follow for/while loops that will be executed if the loop was terminated by a break statement.

Let me see if I understand the proposal in full. You would allow:

    for i in (1, 2, 3):
        print(i)
        if i == 2:
            break
    except break:  # or just except
        assert i == 2
        print("a break was executed")
    else:
        print("never reached")  # this is never reached
    print("for loop is done")

as an alternative to something like:

    broke_out = False
    for i in (1, 2, 3):
        print(i)
        if i == 2:
            broke_out = True
            break
    else:
        print("never reached")  # this is never reached
    if broke_out:
        assert i == 2
        print("a break was executed")
    print("for loop is done")

Correct.

I must admit the suggestion seems a little bit neater than having to manage a flag myself, but on the other hand I can't remember the last time I've needed to manage a flag like that. And on the gripping hand, this is even simpler than both alternatives:

    for i in (1, 2, 3):
        print(i)
        if i == 2:
            assert i == 2
            print("a break was executed")
            break
    else:
        print("never reached")  # this is never reached
    print("for loop is done")

Right, that's how you'd likely implement the behavior today, but see my argument about the two alternative code branches not ending up together at the same level of indentation.

There are some significant unanswered questions:

- Does it matter which order the for...except...else are in? Obviously the for block must come first, but apart from that?

Just like in try/except/else, the order would be for (or while)/except/else with the difference that both except and else would be optional.

- How is this implemented? Currently "break" is a simple unconditional GOTO which jumps past the for block. This will need to change to something significantly more complex.

Yeah, I know; that's why I listed this under cons.

- There are other ways to exit a for-loop than just break. Which of them, if any, will also run the except block?

None of them (though, honestly, I cannot think of anything but exceptions here; what do you have in mind?)
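On the question of other loop exits, here is a short sketch of how today's for/else already behaves for break, return and a raised exception. The function name exit_via and the event strings are hypothetical, used only for illustration; under the proposal, only the "break" case would reach an except break clause:

```python
def exit_via(kind):
    """Leave a for loop via `kind` and record which blocks ran."""
    events = []
    try:
        for i in (1, 2, 3):
            if i == 2:
                if kind == "break":
                    break
                if kind == "return":
                    return events + ["returned"]
                if kind == "raise":
                    raise ValueError(i)
        else:
            # Runs only when the loop finished without break.
            events.append("else")
        events.append("after loop")
    except ValueError:
        events.append("exception")
    return events
```

Of the three early exits, break is the only one that continues with the code directly after the loop, so it is the only one where a trailing clause can distinguish the two outcomes.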
[Python-ideas] for/except/else
I know what the regulars among you will be thinking (time machine, high bar for language syntax changes, etc.) so let me start by assuring you that I'm well aware of all of this, that I did research the topic before posting and that this is not the same as a previous suggestion using almost the same subject line.

Now here's the proposal: allow an except (or except break) clause to follow for/while loops that will be executed if the loop was terminated by a break statement.

The idea is certainly not new. In fact, Nick Coghlan, in his blog post http://python-notes.curiousefficiency.org/en/latest/python_concepts/break_else.html, uses it to provide a mental model for the meaning of the else following for/while, but, as far as I'm aware, he never suggested making it legal Python syntax.

Now while it's possible that Nick had a good reason not to do so, I think there would be three advantages to this:

- as explained by Nick, the existence of "except break" would strengthen the analogy with try/except/else and help people understand what the existing else clause after a loop is good for. There has been much debate over the else clause in the past, most prominently, a long discussion on this list back in 2009 (I recommend interested people to start with Steven D'Aprano's summary of it at https://mail.python.org/pipermail/python-ideas/2009-October/006155.html) that shows that for/else is misunderstood by/unknown to many Python programmers.

- in some situations for/except/else would make code more readable by bringing logical alternatives closer together and to the same indentation level in the code. Consider a simple example (taken from the docs.python Tutorial):

    for n in range(2, 10):
        for x in range(2, n):
            if n % x == 0:
                print(n, 'equals', x, '*', n//x)
                break
        else:
            # loop fell through without finding a factor
            print(n, 'is a prime number')

There are two logical outcomes of the inner for loop here - a given number can be either prime or not. However, the two code branches dealing with them end up at different levels of indentation and in different places, one inside and one outside the loop block. This second issue can become much more annoying in more complex code where the loop may contain additional code after the break statement. Now compare this to:

    for n in range(2, 10):
        for x in range(2, n):
            if n % x == 0:
                break
        except break:
            print(n, 'equals', x, '*', n//x)
        else:
            # loop fell through without finding a factor
            print(n, 'is a prime number')

IMO, this reflects the logic better.

- it could provide an elegant solution for the "How to break out of two loops" issue. This is another topic that comes up rather regularly (python-list, stackoverflow) and there is again a very good blog post about it, this time from Ned Batchelder at https://nedbatchelder.com/blog/201608/breaking_out_of_two_loops.html. Stealing his example, here's code (at least) a newcomer may come up with before realizing it can't work:

    s = "a string to examine"
    for i in range(len(s)):
        for j in range(i+1, len(s)):
            if s[i] == s[j]:
                answer = (i, j)
                break   # How to break twice???

with for/except/else this could be written as:

    s = "a string to examine"
    for i in range(len(s)):
        for j in range(i+1, len(s)):
            if s[i] == s[j]:
                break
        except break:
            answer = (i, j)
            break

So much for the pros. Of course there are cons, too. The classical one for any syntax change, of course, is:

- burden on developers who have to implement and maintain the new syntax. Specifically, this proposal would make parsing/compiling of loops more complicated.

Others include:

- using except will make people think of exceptions and that may cause new confusion; while that's true, I would argue that, in fact, break and exceptions are rather similar features in that they are gotos in disguise, so except will still be used to catch an interruption in normal control flow.
- the new syntax will not help people understand for/else if except is not used; importantly, I'm *not* proposing to disallow the use of for/else without except (if that would ever happen it would be in the *very* distant future) so that would indeed mean that people would encounter for/else, not only in legacy, but also in newly written code. However, I would expect that they would also start seeing for/except increasingly (not least because it solves the "break out of two loops" issue) so they would be nudged towards thinking of the else after for/while more like the else in try/except/else, just as Nick proposes.

Interestingly, there has been another proposal on this list several years ago about allowing try/else without except, which I liked at the time and which would have made try/[except/]else work exactly as my proposed for/except/else. Here it is: https://m
Re: [Python-ideas] PEP: Distributing a Subset of the Standard Library
On 29.11.2016 10:39, Paul Moore wrote:

On 28 November 2016 at 22:33, Steve Dower wrote:

Given that, this wouldn't necessarily need to be an executable file. The finder could locate a "foo.missing" file and raise ModuleNotFoundError with the contents of the file as the message. No need to allow/require any Python code at all, and no risk of polluting sys.modules.

I like this idea. Would it completely satisfy the original use case for the proposal? (Or, to put it another way, is there any specific need for arbitrary code execution in the missing.py file?)

The only thing that I could think of so far would be cross-platform .missing.py files that query the system (e.g. using the platform module) to generate adequate messages for the specific platform or distro. E.g., correctly recommend to use dnf install or yum install or apt install, etc.
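As a rough idea of what such a message generator could look like, here is a minimal sketch. It probes for a package manager with shutil.which instead of the platform module; the function name missing_message, the INSTALLERS tuple and the message wording are all assumptions of mine and not part of the proposal:

```python
import shutil

# Package managers to probe for, in order of preference (an assumption).
INSTALLERS = ("dnf", "yum", "apt")


def missing_message(module, package, which=shutil.which):
    """Build an install hint for `module`, shipped as system package `package`.

    `which` is injectable so the lookup can be faked in tests.
    """
    for tool in INSTALLERS:
        if which(tool):
            return f"No module named {module!r}; try: {tool} install {package}"
    return (
        f"No module named {module!r}; install the {package!r} package "
        "with your system's package manager"
    )
```

A distributor would still have to supply the right package name per distro, which is exactly the kind of logic a plain-text "foo.missing" file cannot express.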
Re: [Python-ideas] PEP: Distributing a Subset of the Standard Library
On 28.11.2016 23:52, Chris Angelico wrote:

+1, because this also provides a coherent way to reword the try/except import idiom:

    # Current idiom
    # somefile.py
    try:
        import foo
    except ImportError:
        import subst_foo as foo

    # New idiom:
    # foo.missing.py
    import subst_foo as foo
    import sys; sys.modules["foo"] = foo

    # somefile.py
    import foo

Hmm. I would rather take this example as an argument against the proposed behavior. It invites too many clever hacks. I thought that the idea was that .missing.py does *not* act as a replacement module, but, more or less, just as a message generator.
Re: [Python-ideas] PEP: Distributing a Subset of the Standard Library
On 28.11.2016 23:19, Nathaniel Smith wrote:

I'd suggest that we additionally specify that if we find a foo.missing.py, then the code is executed but -- unlike a regular module load -- it's not automatically inserted into sys.modules["foo"]. That seems like it could only create confusion. And it doesn't restrict functionality, because if someone really wants to implement some clever shenanigans, they can always modify sys.modules["foo"] by hand.

This also suggests that the overall error-handling flow for 'import foo' should look like:

1) run foo.missing.py
2) if it raises an exception: propagate that
3) otherwise, if sys.modules["foo"] is missing: raise some variety of ImportError.
4) otherwise, use sys.modules["foo"] as the object that should be bound to 'foo' in the original invoker's namespace

I think this might make everyone who was worried about exception handling downthread happy -- it allows a .missing.py file to successfully import if it really wants to, but only if it explicitly fulfills the 'import' requirement that the module should somehow be made available.

A refined alternative (refined from my previous post, which may have ended up too nested): instead of triggering an immediate search for a .missing.py file, why not have the interpreter intercept any ModuleNotFoundError that bubbles up to the top without being caught, then use the name attribute of the exception to look for the .missing.py file? Agreed, this is more complicated to implement, but it would avoid any performance loss in situations where running code knows how to deal with the missing module anyway.

Wolfgang
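The "intercept only uncaught errors" idea can be approximated today with sys.excepthook. The following is just a sketch under simplifying assumptions: it reads a plain-text "foo.missing" file (the variant discussed elsewhere in this thread) rather than executing a .missing.py file, and the names find_missing_note and missing_aware_excepthook are hypothetical:

```python
import sys
from pathlib import Path


def find_missing_note(name, search_path):
    """Return the text of the first '<name>.missing' file found, else None."""
    for entry in search_path:
        candidate = Path(entry) / f"{name}.missing"
        if candidate.is_file():
            return candidate.read_text().strip()
    return None


def missing_aware_excepthook(exc_type, exc, tb):
    """Print a hint for an uncaught ModuleNotFoundError, then the traceback."""
    if isinstance(exc, ModuleNotFoundError) and exc.name:
        note = find_missing_note(exc.name, sys.path)
        if note:
            print(note, file=sys.stderr)
    # Always fall back to the default reporting.
    sys.__excepthook__(exc_type, exc, tb)


# To try it out, install the hook explicitly:
# sys.excepthook = missing_aware_excepthook
```

Because the hook only runs for exceptions nobody caught, code that handles the missing module itself (such as the fallback import idiom) never pays for the extra file search.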
Re: [Python-ideas] PEP: Distributing a Subset of the Standard Library
On 28.11.2016 22:26, Paul Moore wrote:

On 28 November 2016 at 21:11, Ethan Furman wrote:

One "successful" use-case that would be impacted is the fallback import idiom:

    try:
        # this would do two full searches before getting the error
        import BlahBlah
    except ImportError:
        import blahblah

Under this proposal, the above idiom could potentially now fail. If there's a BlahBlah.missing.py, then that will get executed rather than an ImportError being raised, so the fallback wouldn't be executed. This could actually be a serious issue for code that currently protects against optional stdlib modules not being available like this. There's no guarantee that I can see that a .missing.py file would raise ImportError (even if we said that was the intended behaviour, there's nothing to enforce it).

Could the proposal execute the .missing.py file and then raise ImportError? I could imagine that having problems of its own, though...

How about addressing both concerns by triggering the search for .missing.py only if an ImportError bubbles up uncaught (a bit similar to StopIteration nowadays)?

Wolfgang