Re: [Python-ideas] Verbatim names (allowing keywords as names)

2018-05-16 Thread Wolfgang Maier

On 16.05.2018 02:41, Steven D'Aprano wrote:


Some examples:

 result = \except + 1

 result = something.\except

 result = \except.\finally



Maybe that could get combined with Guido's original suggestion by making 
the \ optional after a .?


Example:

class A ():
    \global = 'Hello'
    def __init__(self):
        self.except = 0

    def \finally(self):
        return 'bye'

print(A.global)
a = A()
a.except += 1
print(a.finally())

or with a module, in my_module.py:

\except = 0

elsewhere:

import my_module
print(my_module.except)

or

from my_module import \except
print(\except)
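
For comparison, current Python already tolerates keyword-named attributes as long as they are accessed by string rather than by syntax. A minimal sketch of that workaround, which the proposed syntax would make unnecessary (the `ns` object is only an illustration, not part of the proposal):

```python
import keyword
from types import SimpleNamespace

ns = SimpleNamespace()

# 'except' is a reserved word, so writing `ns.except = 0` is a SyntaxError today.
is_reserved = keyword.iskeyword("except")

# The attribute itself can exist; only the syntax forbids spelling it out.
setattr(ns, "except", 0)
setattr(ns, "except", getattr(ns, "except") + 1)
value = getattr(ns, "except")
print(value)
```

So the restriction lives purely in the parser, which is why a verbatim-name marker would be enough to lift it.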

Best,
Wolfgang

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Have a "j" format option for lists

2018-05-09 Thread Wolfgang Maier

On 05/09/2018 02:39 PM, Facundo Batista wrote:

This way, I could do:


authors = ["John", "Mary", "Estela"]
"Authors: {:, j}".format(authors)

'Authors: John, Mary, Estela'

In this case the join can be made in the format yes, but this proposal
would be very useful when the info to format comes inside a structure
together with other stuff, like...


info = {

...   'title': "A book",
...   'price': Decimal("2.34"),
...   'authors': ["John", "Mary", "Estela"],
... }
...

print("{title!r} (${price}) by {authors:, j}".format(**info))

"A book" ($2.34) by John, Mary, Estela

What do you think?



For reference (first message of a rather long previous thread): 
https://mail.python.org/pipermail/python-ideas/2015-September/035787.html
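
To make the idea concrete, here is a rough sketch of how the proposed "j" option could be emulated today with a list subclass that overrides `__format__`. The class name `JoinList` and its exact handling of the format spec are my assumptions for illustration, not something from the proposal or the earlier thread:

```python
from decimal import Decimal


class JoinList(list):
    """Hypothetical list subclass emulating the proposed 'j' format option."""

    def __format__(self, spec):
        if spec.endswith("j"):
            # Everything before the trailing 'j' is used as the separator.
            return spec[:-1].join(str(item) for item in self)
        return super().__format__(spec)


info = {
    "title": "A book",
    "price": Decimal("2.34"),
    "authors": JoinList(["John", "Mary", "Estela"]),
}
line = "{title!r} (${price}) by {authors:, j}".format(**info)
print(line)
```

The obvious drawback is that the data has to be wrapped before formatting, which is exactly what a built-in option would avoid.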




Re: [Python-ideas] Official site-packages/test directory

2018-01-19 Thread Wolfgang Maier

On 01/19/2018 05:48 PM, Guido van Rossum wrote:
On Fri, Jan 19, 2018 at 8:30 AM, Wolfgang Maier wrote:



I think that's a really nice idea.
With an official site-packages/test directory there could be pip
support for optionally installing tests alongside a package if its
layout allows it. So end users could just install things without
tests, but developers could do: pip install  --with-tests
or something to get everything?


Oh, I just realized there's another problem here. The existing 'test' 
package (which is not a namespace package) would hide the 
site-packages/test directory.




Well, that shouldn't be a big obstacle since one could just as well 
choose another name ( __tests__ for example?).
Alternatively, package-specific test directories could exist *inside* 
site-packages. So much like today's .dist-info directories 
there could be .test dirs?



Re: [Python-ideas] Official site-packages/test directory

2018-01-19 Thread Wolfgang Maier

On 01/19/2018 03:27 PM, Stefan Krah wrote:


Hello,

I wonder if we could get an official site-packages/test directory.  Currently
it seems to be problematic to distribute tests if they are outside the package
directory.  Here is a nice overview of the two main layout possibilities:

http://pytest.readthedocs.io/en/reorganize-docs/new-docs/user/directory_structure.html


I like the outside-the-package approach, mostly for reasons described very
eloquently here:

http://python-notes.curiousefficiency.org/en/latest/python_concepts/import_traps.html


CPython itself of course also uses Lib/foo.py and Lib/test/test_foo.py, so it
would make sense to have site-packages/foo.py and 
site-packages/test/test_foo.py.

For me, this is the natural layout.



I think that's a really nice idea.
With an official site-packages/test directory there could be pip support 
for optionally installing tests alongside a package if its layout allows 
it. So end users could just install things without tests, but developers 
could do: pip install  --with-tests or something to get everything?


Wolfgang



Re: [Python-ideas] tweaking the file system path protocol

2017-05-29 Thread Wolfgang Maier

On 05/29/2017 09:55 AM, Serhiy Storchaka wrote:

29.05.17 00:33, Wolfgang Maier wrote:
The path protocol does *not* use __fspath__ as an indicator that an 
object's str-representation is intended to be used as a path. If you 
had wanted this, the PEP should have defined __fspath__ not as a 
method, but as a flag and have the protocol check that flag, then call 
__str__ if appropriate.


__fspath__ is a method because there is a need to support bytes paths. 
__fspath__() can return a bytes object, str() can't.




That's certainly one reason, but again just shows that calling 
str(path_object) to get a path representation is wrong.
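
A small sketch of the bytes case, assuming a made-up class `BytesPath` that keeps its path as bytes. It only illustrates the point that str() has no way to hand back such a path:

```python
import os


class BytesPath:
    """Hypothetical path-like class that stores its path as bytes."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw


p = BytesPath(b"/tmp/data.bin")
via_protocol = os.fspath(p)  # the bytes path, unchanged
via_str = str(p)             # default object repr, not a usable path
print(via_protocol, via_str)
```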




Re: [Python-ideas] tweaking the file system path protocol

2017-05-28 Thread Wolfgang Maier

On 28.05.2017 18:32, Steven D'Aprano wrote:

On Sun, May 28, 2017 at 05:35:38PM +0300, Koos Zevenhoven wrote:


Don't get me wrong, I like consistency very much. But regarding the
__fspath__ case, there are not that many people *writing*
fspath-enabled classes. Instead, there are many many many more people
*using* such classes (and dealing with their compatibility issues in
different ways).


What sort of compatibility issues are you referring to? os.fspath is new
in 3.6, and 3.7 isn't out yet, so I'm having trouble understanding what
compatibility issues you mean.



As far as I'm aware the only such issue people had was with building 
interfaces that could deal with regular strings and pathlib.Path 
(introduced in 3.4 if I remember correctly) instances alike. Because 
calling str on a pathlib.Path instance returns the path as a regular 
string it looked like it could become a (bad) habit to just always call 
str on any received object for "compatibility" with both types of path 
representations. The path protocol is a response to this that provides 
an explicit and safe alternative.





For those people, the current behavior brings consistency


That's a very unintuitive statement. How is it consistent for fspath to
call the __fspath__ dunder method for some objects but ignore it for
others?



The path protocol brings a standard way of dealing with diverse path 
representations, but only if you use it. If people keep using 
str(path_object) as before, then they are doing things wrongly and are 
no better or safer off than they were before! The path protocol does 
*not* use __fspath__ as an indicator that an object's str-representation 
is intended to be used as a path. If you had wanted this, the PEP should 
have defined __fspath__ not as a method, but as a flag and have the 
protocol check that flag, then call __str__ if appropriate.
With __fspath__ being a method that can return whatever its author sees 
fit, calling str to get a path from an arbitrary object is just as wrong 
as it always was - it will work for pathlib.Path objects and might or 
might not work for some other types. Importantly, this has nothing to do 
with this proposal, but is in the nature of the protocol as it is 
defined *now*.
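
To illustrate, a minimal sketch with a hypothetical class `Report` whose `__str__` and `__fspath__` deliberately disagree (both names and paths are made up for the example):

```python
import os


class Report:
    """Hypothetical path-like object whose str() is a label, not a path."""

    def __init__(self, name, location):
        self.name = name
        self.location = location

    def __str__(self):
        return "Report({})".format(self.name)

    def __fspath__(self):
        return self.location


r = Report("weekly", "/var/reports/weekly.txt")
label = str(r)       # human-readable label
path = os.fspath(r)  # the actual path

# str() also "succeeds" on objects that are not paths at all,
# whereas os.fspath() rejects them.
try:
    os.fspath(42)
except TypeError:
    rejected = True
else:
    rejected = False
print(label, path, rejected)
```

Code that calls str() here would silently get the label, or the string '42', and only fail later when the file system is touched.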





---after all, it was of course designed by thinking about
it from all angles and not just based on my or anyone else's own use
cases only.


Can you explain the reasoning to us? I don't think it is explained in the
PEP.






Re: [Python-ideas] tweaking the file system path protocol

2017-05-24 Thread Wolfgang Maier

On 05/24/2017 02:41 AM, Steven D'Aprano wrote:

On Wed, May 24, 2017 at 12:18:16AM +0300, Serhiy Storchaka wrote:


It seems to me that the purpose of this proposition is not performance,
but the possibility to use __fspath__ in str or bytes subclasses.
Currently defining __fspath__ in str or bytes subclasses doesn't have
any effect.


That's how I interpreted the proposal, with any performance issue being
secondary. (I don't expect that converting path-like objects to strings
would be the bottleneck in any application doing actual disk IO.)

  

I don't know a reasonable use case for this feature. The __fspath__
method of str or bytes subclasses returning something not equivalent to
self looks confusing to me.


I can imagine at least two:

- emulating something like DOS 8.3 versus long file names;
- case normalisation

but what would make this really useful is for debugging. For instance, I
have used something like this to debug problems with int() being called
wrongly:

py> class MyInt(int):
...     def __int__(self):
...         print("__int__ called")
...         return super().__int__()
...
py> x = MyInt(23)
py> int(x)
__int__ called
23

It would be annoying and inconsistent if int(x) avoided calling __int__
on int subclasses. But that's exactly what happens with fspath and str.
I see that as a bug, not a feature: I find it hard to believe that we
would design an interface for string-like objects (paths) and then
intentionally prohibit it from applying to strings.

And if we did, surely it's a misfeature. Why *shouldn't* subclasses of
str get the same opportunity to customize the result of __fspath__ as
they get to customize their __repr__ and __str__?

py> class MyStr(str):
...     def __repr__(self):
...         return 'repr'
...     def __str__(self):
...         return 'str'
...
py> s = MyStr('abcdef')
py> repr(s)
'repr'
py> str(s)
'str'



This is almost exactly what I have been thinking (just that I couldn't 
have presented it so clearly)!


Let's look at a potential use case for this. Assume that in a package you 
want to handle several paths to different files and directories that are 
all located in a common package-specific parent directory. Then using 
the path protocol you could write this:


import os

class PackageBase (object):
    basepath = '/home/.package'

class PackagePath (str, PackageBase):
    def __fspath__ (self):
        return os.path.join(self.basepath, str(self))

config_file = PackagePath('.config')
log_file = PackagePath('events.log')
data_dir = PackagePath('data')

# open for writing; the default mode 'r' would not allow log.write()
with open(log_file, 'w') as log:
    log.write('package paths initialized.\n')


Just that this wouldn't currently work because PackagePath inherits from 
str. Of course, there are other ways to achieve the above, but when you 
think about designing a Path-like object class str is just a pretty 
attractive base class to start from.
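
A quick check of that claim, as a sketch against the os.fspath behaviour described in this thread (the class is a stripped-down variant of the one above):

```python
import os


class PackagePath(str):
    basepath = '/home/.package'

    def __fspath__(self):
        return os.path.join(self.basepath, str(self))


config_file = PackagePath('.config')

# os.fspath sees a str instance first and returns it unchanged,
# so the __fspath__ method defined above is never consulted.
resolved = os.fspath(config_file)
print(resolved)
```

The result is the very same object that went in, i.e. the relative name and not the joined package path.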


Now let's look at compatibility of a class like PackagePath under this 
proposal:


- if client code uses e.g. str(config_file) and proceeds to treat the 
resulting object as a path unexpected things will happen and, yes, 
that's bad. However, this is no different from any other Path-like 
object for which __str__ and __fspath__ don't define the same return value.


- if client code uses the PEP-recommended backwards-compatible way of 
dealing with paths,


path.__fspath__() if hasattr(path, "__fspath__") else path

things will just work. Interestingly, this would *currently* produce an 
unexpected result, namely that it would execute the __fspath__ method of 
the str-subclass


- if client code uses instances of PackagePath as paths directly, then in 
Python 3.6 and below that would lead to an unintended outcome, while in 
Python 3.7 things would work. This is *really* bad.
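
The inconsistency in the second case can be seen in a short sketch. The helper name `compat_fspath` is made up here; its body is just the backwards-compatible idiom quoted above:

```python
import os


class PackagePath(str):
    basepath = '/home/.package'

    def __fspath__(self):
        return os.path.join(self.basepath, str(self))


def compat_fspath(path):
    # Hypothetical wrapper around the PEP-recommended idiom.
    return path.__fspath__() if hasattr(path, "__fspath__") else path


p = PackagePath('data')
via_idiom = compat_fspath(p)  # calls the subclass's __fspath__
via_os = os.fspath(p)         # returns the str subclass unchanged
print(via_idiom, via_os)
```

With os.fspath as it stands, the two ways of getting a path disagree for such a class; under the proposal they would agree.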


But what it means is that, under the proposal, using a str or bytes 
subclass with an __fspath__ method defined makes your code 
backwards-incompatible and the solution would be not to use such a class 
if you want to be backwards-compatible (and that should get documented 
somewhere). This restriction, of course, limits the usefulness of the 
proposal in the near future, but that disadvantage will vanish over 
time. In 5 years, dropping support for Python 3.6 may no longer be a big 
deal (for comparison, Python 3.2 was released 6 years ago and pip stopped 
supporting it last year). As Steven pointed out, 
the proposal is *very* unlikely to break existing code.


So to summarize, the proposal

- avoids an up-front isinstance check in the protocol and thereby speeds 
up the processing of exact strings and bytes and of anything that 
follows the path protocol.*


- slows down the processing of instances of regular str and bytes 
subclasses*


- makes the "path.__fspath__() if hasattr(path, "__fspath__") else path" 
idiom consistent for subclasses of str and bytes that define __fspath__


- opens up the opportunity to write str/bytes subclasses that represent 
a path other than just their self in the future*

Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Wolfgang Maier

On 05/23/2017 06:17 PM, Koos Zevenhoven wrote:

On Tue, May 23, 2017 at 1:12 PM, Wolfgang Maier
 wrote:

What do you think of this idea for a slight modification to os.fspath:
the current version checks whether its arg is an instance of str, bytes or
any subclass and, if so, returns the arg unchanged. In all other cases it
tries to call the type's __fspath__ method to see if it can get str, bytes,
or a subclass thereof this way.

My proposal is to change this to:
1) check whether the type of the argument is str or bytes *exactly*; if so,
return the argument unchanged
2) check whether __fspath__ can be called on the type and returns an instance
of str, bytes, or any subclass (just like in the current version)
3) check whether the type is a subclass of str or bytes and, if so, return
it unchanged




Hi Koos and thanks for your detailed response,


The reason why this was not done was that a str or bytes subclass that
implements __fspath__(self) would work in both pre-3.6 and 3.6+ but
behave differently. This would also be incompatible with existing
code using str(path) for compatibility with the stdlib (the old way,
which people still use for pre-3.6 compatibility even in new code).



I'm not sure that sounds very convincing because that exact problem 
exists, was discussed and accepted in your PEP 519 for all other 
classes. I do not really see why subclasses of str and bytes should 
require special backwards compatibility here. Is there a reason why you 
are thinking they should be treated specially?



This would have the following implications:
a) it would speed up the very common case when the arg is either a str or a
bytes instance exactly


To get the same performance benefit for str and bytes, but without
changing functionality, there could first be the exact type check and
then the isinstance check. This would add some performance penalty for
PathLike objects. Removing the isinstance part of the __fspath__()
return value, which I find less useful, would compensate for that. (3)
would not be necessary in this version.



Right, that was one thing I forgot to mention in my list. My proposal 
would also speed up processing of pathlike objects because it moves the 
__fspath__ call up in front of the isinstance check. Your alternative 
would speed up only str and bytes, but would slow down Path-like classes.
In addition, I'm not sure that removing the isinstance check on the 
return value of __fspath__() is a good idea because that would mean 
giving up the guarantee that os.fspath returns an instance of str or 
bytes and would effectively force library code to do the isinstance 
check anyway even if the function may have performed it already, which 
would worsen performance further.
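
The guarantee in question is easy to demonstrate. A minimal sketch, with a hypothetical class `BadPath` whose `__fspath__` returns something that is not a path:

```python
import os


class BadPath:
    """Hypothetical class whose __fspath__ returns a non-path object."""

    def __fspath__(self):
        return 42


try:
    os.fspath(BadPath())
except TypeError as exc:
    message = str(exc)
else:
    message = None
print(message)
```

Because os.fspath raises here, callers can rely on getting str or bytes back and do not need their own isinstance check.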



Are you asking for other reasons, or because you actually have a use
case where this matters? If this performance really matters somewhere,
the version I describe above could be considered. It would have 100%
backwards compatibility, or a little less (99% ?) if the isinstance
check of the __fspath__() return value is removed for performance
compensation.



That use case question is somewhat difficult to answer. I had this idea 
when working on two bug tracker issues (one concerning fnmatch and a 
follow-up one on os.path.normcase, which is called by fnmatch.filter 
and, in turn, calls os.fspath). fnmatch.filter is a case where performance 
matters and the decision when and where to call the rather expensive 
os.path.normcase->os.fspath there is not entirely straightforward. So, 
yes, I was basically looking at this because of a potential use case, 
but I say potential because I'm far from sure that any speed gain in 
os.fspath will be big enough to be useful for fnmatch.filter in the end.




b) user-defined classes that inherit from str or bytes could control their
path representation just like any other class


Again, this would cause differences in behavior between different
Python versions, and based on whether str(path) is used or not.

—Koos





Re: [Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Wolfgang Maier

On 05/23/2017 06:41 PM, Wolfgang Maier wrote:

On 05/23/2017 06:17 PM, Koos Zevenhoven wrote:

On Tue, May 23, 2017 at 1:12 PM, Wolfgang Maier
 wrote:

What do you think of this idea for a slight modification to os.fspath:
the current version checks whether its arg is an instance of str, bytes or
any subclass and, if so, returns the arg unchanged. In all other cases it
tries to call the type's __fspath__ method to see if it can get str, bytes,
or a subclass thereof this way.

My proposal is to change this to:
1) check whether the type of the argument is str or bytes *exactly*; if so,
return the argument unchanged
2) check whether __fspath__ can be called on the type and returns an instance
of str, bytes, or any subclass (just like in the current version)
3) check whether the type is a subclass of str or bytes and, if so, return
it unchanged




Hi Koos and thanks for your detailed response,


The reason why this was not done was that a str or bytes subclass that
implements __fspath__(self) would work in both pre-3.6 and 3.6+ but
behave differently. This would also be incompatible with existing
code using str(path) for compatibility with the stdlib (the old way,
which people still use for pre-3.6 compatibility even in new code).



I'm not sure that sounds very convincing because that exact problem 
exists, was discussed and accepted in your PEP 519 for all other 
classes. I do not really see why subclasses of str and bytes should 
require special backwards compatibility here. Is there a reason why you 
are thinking they should be treated specially?




Ah, sorry, I misunderstood what you were trying to say, but now I'm 
getting it! Subclasses of str and bytes were of course usable as path 
arguments before simply because they were subclasses of them. Now they 
would be picked up based on their __fspath__ method, but old versions of 
Python executing code using them would still use them directly. Have to 
think about this one a bit, but thanks for pointing it out.



This would have the following implications:
a) it would speed up the very common case when the arg is either a str or a
bytes instance exactly


To get the same performance benefit for str and bytes, but without
changing functionality, there could first be the exact type check and
then the isinstance check. This would add some performance penalty for
PathLike objects. Removing the isinstance part of the __fspath__()
return value, which I find less useful, would compensate for that. (3)
would not be necessary in this version.



Right, that was one thing I forgot to mention in my list. My proposal 
would also speed up processing of pathlike objects because it moves the 
__fspath__ call up in front of the isinstance check. Your alternative 
would speed up only str and bytes, but would slow down Path-like classes.
In addition, I'm not sure that removing the isinstance check on the 
return value of __fspath__() is a good idea because that would mean 
giving up the guarantee that os.fspath returns an instance of str or 
bytes and would effectively force library code to do the isinstance 
check anyway even if the function may have performed it already, which 
would worsen performance further.



Are you asking for other reasons, or because you actually have a use
case where this matters? If this performance really matters somewhere,
the version I describe above could be considered. It would have 100%
backwards compatibility, or a little less (99% ?) if the isinstance
check of the __fspath__() return value is removed for performance
compensation.



That use case question is somewhat difficult to answer. I had this idea 
when working on two bug tracker issues (one concerning fnmatch and a 
follow-up one on os.path.normcase, which is called by fnmatch.filter 
and, in turn, calls os.fspath). fnmatch.filter is a case where performance 
matters and the decision when and where to call the rather expensive 
os.path.normcase->os.fspath there is not entirely straightforward. So, 
yes, I was basically looking at this because of a potential use case, 
but I say potential because I'm far from sure that any speed gain in 
os.fspath will be big enough to be useful for fnmatch.filter in the end.



b) user-defined classes that inherit from str or bytes could control their
path representation just like any other class


Again, this would cause differences in behavior between different
Python versions, and based on whether str(path) is used or not.

—Koos





[Python-ideas] tweaking the file system path protocol

2017-05-23 Thread Wolfgang Maier

What do you think of this idea for a slight modification to os.fspath:
the current version checks whether its arg is an instance of str, bytes 
or any subclass and, if so, returns the arg unchanged. In all other 
cases it tries to call the type's __fspath__ method to see if it can get 
str, bytes, or a subclass thereof this way.


My proposal is to change this to:
1) check whether the type of the argument is str or bytes *exactly*; if 
so, return the argument unchanged
2) check whether __fspath__ can be called on the type and returns an 
instance of str, bytes, or any subclass (just like in the current version)
3) check whether the type is a subclass of str or bytes and, if so, 
return it unchanged


This would have the following implications:
a) it would speed up the very common case when the arg is either a str 
or a bytes instance exactly
b) user-defined classes that inherit from str or bytes could control 
their path representation just like any other class
c) subclasses of str/bytes that don't define __fspath__ would still work 
like they do now, but their processing would be slower
d) subclasses of str/bytes that accidentally define a __fspath__ method 
would change their behavior


I think cases c) and d) could be sufficiently rare that the pros 
outweigh the cons?



Here's how the proposal could be implemented in the pure Python version 
(os._fspath):


def _fspath(path):
    path_type = type(path)
    if path_type is str or path_type is bytes:
        return path

    # Work from the object's type to match method resolution of other magic
    # methods.
    try:
        path_repr = path_type.__fspath__(path)
    except AttributeError:
        if hasattr(path_type, '__fspath__'):
            raise
        elif issubclass(path_type, (str, bytes)):
            return path
        else:
            raise TypeError("expected str, bytes or os.PathLike object, "
                            "not " + path_type.__name__)
    if isinstance(path_repr, (str, bytes)):
        return path_repr
    else:
        raise TypeError("expected {}.__fspath__() to return str or bytes, "
                        "not {}".format(path_type.__name__,
                                        type(path_repr).__name__))
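
To show what would change in practice, here is a condensed sketch of the same lookup order together with two toy str subclasses. The names `proposed_fspath`, `Plain` and `Prefixed` are made up for the example, and the function uses getattr instead of try/except purely for brevity:

```python
import os


def proposed_fspath(path):
    """Condensed sketch of the lookup order proposed above (not os.fspath)."""
    path_type = type(path)
    # 1) exact str/bytes: return immediately
    if path_type is str or path_type is bytes:
        return path
    # 2) anything defining __fspath__ on its type, including str subclasses
    fspath_method = getattr(path_type, "__fspath__", None)
    if fspath_method is not None:
        path_repr = fspath_method(path)
        if isinstance(path_repr, (str, bytes)):
            return path_repr
        raise TypeError("expected {}.__fspath__() to return str or bytes, "
                        "not {}".format(path_type.__name__,
                                        type(path_repr).__name__))
    # 3) str/bytes subclasses without __fspath__: unchanged, as today
    if issubclass(path_type, (str, bytes)):
        return path
    raise TypeError("expected str, bytes or os.PathLike object, "
                    "not " + path_type.__name__)


class Plain(str):
    pass


class Prefixed(str):
    def __fspath__(self):
        return "/base/" + str(self)


plain, prefixed = Plain("a.txt"), Prefixed("a.txt")
results = {
    "exact": proposed_fspath("a.txt"),
    "plain_subclass": proposed_fspath(plain),
    "custom_subclass": proposed_fspath(prefixed),
    "custom_subclass_today": os.fspath(prefixed),
}
print(results)
```

Only the `Prefixed` case differs from today's os.fspath, which corresponds to implications b) and d) above.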



Re: [Python-ideas] fnmatch.filter_false

2017-05-20 Thread Wolfgang Maier
On 19.05.2017 20:01, 
tritium-l...@sdamon.com wrote:




-Original Message-
From: Python-ideas [mailto:python-ideas-bounces+tritium-
list=sdamon@python.org] On Behalf Of Wolfgang Maier
Sent: Friday, May 19, 2017 10:03 AM
To: python-ideas@python.org
Subject: Re: [Python-ideas] fnmatch.filter_false

On 05/17/2017 07:55 PM,
tritium-l...@sdamon.com wrote:

Top posting, apologies.

I'm sure there is a better way to do it, and there is a performance hit, but
it's negligible.  This is also a three line delta of the function.

from fnmatch import _compile_pattern, filter as old_filter
import os
import os.path
import posixpath


data = os.listdir()

def filter(names, pat, *, invert=False):
    """Return the subset of the list NAMES that match PAT."""
    result = []
    pat = os.path.normcase(pat)
    match = _compile_pattern(pat)
    if os.path is posixpath:
        # normcase on posix is NOP. Optimize it away from the loop.
        for name in names:
            if bool(match(name)) == (not invert):
                result.append(name)
    else:
        for name in names:
            if bool(match(os.path.normcase(name))) == (not invert):
                result.append(name)
    return result

if __name__ == '__main__':
    import timeit
    print(timeit.timeit(
        "filter(data, '__*')",
        setup="from __main__ import filter, data"
    ))
    print(timeit.timeit(
        "filter(data, '__*')",
        setup="from __main__ import old_filter as filter, data"
    ))

The first test (modified code) timed at 22.492161903402575, where the second
test (unmodified) timed at 19.31892032324



If you don't care about slow-downs in this range, you could use this
pattern:

excluded = set(filter(data, '__*'))
result = [item for item in data if item not in excluded]

It seems to incur about the same slow-down, although that slow-down is not
constant but depends on the size of the set you need to generate.

Wolfgang




If I didn't care about performance, I wouldn't be using filter - the only
reason to use filter over a list comprehension is performance.  The standard
library has a performant inclusion filter, but does not have a performant
exclusion filter.



I'm sorry, but then your statement above doesn't make any sense to me:
"I'm sure there is a better way to do it, and there is a performance 
hit, but its negligible."
I'm proposing an alternative to you which performs very similarly to 
your own suggestion without copying or modifying stdlib code.


That said I still like your idea of adding the exclude functionality to 
fnmatch. I just thought you may be interested in a solution that works 
right now.




Re: [Python-ideas] fnmatch.filter_false

2017-05-19 Thread Wolfgang Maier
On 05/17/2017 07:55 PM, 
tritium-l...@sdamon.com wrote:

Top posting, apologies.

I'm sure there is a better way to do it, and there is a performance hit, but
it's negligible.  This is also a three line delta of the function.

from fnmatch import _compile_pattern, filter as old_filter
import os
import os.path
import posixpath


data = os.listdir()

def filter(names, pat, *, invert=False):
    """Return the subset of the list NAMES that match PAT."""
    result = []
    pat = os.path.normcase(pat)
    match = _compile_pattern(pat)
    if os.path is posixpath:
        # normcase on posix is NOP. Optimize it away from the loop.
        for name in names:
            if bool(match(name)) == (not invert):
                result.append(name)
    else:
        for name in names:
            if bool(match(os.path.normcase(name))) == (not invert):
                result.append(name)
    return result

if __name__ == '__main__':
    import timeit
    print(timeit.timeit(
        "filter(data, '__*')",
        setup="from __main__ import filter, data"
    ))
    print(timeit.timeit(
        "filter(data, '__*')",
        setup="from __main__ import old_filter as filter, data"
    ))

The first test (modified code) timed at 22.492161903402575, where the second
test (unmodified) timed at 19.31892032324



If you don't care about slow-downs in this range, you could use this 
pattern:


excluded = set(filter(data, '__*'))
result = [item for item in data if item not in excluded]

It seems to incur about the same slow-down, although that slow-down is not 
constant but depends on the size of the set you need to generate.
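
Spelled out as a self-contained sketch with the stock fnmatch.filter and a fixed list of names (the sample names are made up so that the result is reproducible):

```python
import fnmatch

data = ["__init__.py", "main.py", "__pycache__", "README", "main.py"]

# Collect everything that matches, then keep what is not in that set.
excluded = set(fnmatch.filter(data, "__*"))
result = [item for item in data if item not in excluded]
print(result)
```

Order and duplicates of the non-matching names are preserved, since the set is only used for membership tests.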


Wolfgang




Re: [Python-ideas] Way to repeat other than "for _ in range(x)"

2017-03-30 Thread Wolfgang Maier

On 03/30/2017 04:23 PM, Pavol Lisy wrote:

On 3/30/17, Nick Coghlan  wrote:

On 30 March 2017 at 19:18, Markus Meskanen 
wrote:


d = [[0] * 5 for _ in range(10)]


d = [[0]*5]*10  # what about this?



These are not quite the same when the repeated object is mutable. Compare:

>>> matrix1 = [[0] * 5 for _ in range(10)]
>>> matrix1[0].append(1)
>>> matrix1
[[0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], 
[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 
0, 0, 0, 0], [0, 0, 0, 0, 0]]


>>> matrix2=[[0]*5]*10
>>> matrix2[0].append(1)
>>> matrix2
[[0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 
0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], 
[0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1]]


so the comprehension is usually necessary.
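
The underlying reason is object identity, which a short sketch makes visible by counting distinct row objects:

```python
matrix1 = [[0] * 5 for _ in range(10)]
matrix2 = [[0] * 5] * 10

# The comprehension builds ten separate lists; the multiplication
# repeats ten references to one and the same list.
distinct_rows_1 = len({id(row) for row in matrix1})
distinct_rows_2 = len({id(row) for row in matrix2})
print(distinct_rows_1, distinct_rows_2)
```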

Wolfgang



Re: [Python-ideas] for/except/else

2017-03-03 Thread Wolfgang Maier

On 03/03/2017 04:36 AM, Nick Coghlan wrote:

On 2 March 2017 at 21:06, Wolfgang Maier wrote:


- overall I looked at 114 code blocks that contain one or more breaks



Thanks for doing that research :)



Of the remaining 19 non-trivial cases

- 9 are variations of your classical search idiom above, i.e.,
there's an else clause there and nothing more is needed

- 6 are variations of your "nested side-effects" form presented
above with debatable (see above) benefit from except break

- 2 do not use an else clause currently, but have multiple breaks
that do partly redundant things that could be combined in a single
except break clause



Those 8 cases could also be reviewed to see whether a flag variable
might be clearer than relying on nested side effects or code repetition.



[...]



This is a case where a flag variable may be easier to read than loop
state manipulations:

may_have_common_prefix = True
while may_have_common_prefix:
    prefix = None
    for item in items:
        if not item:
            may_have_common_prefix = False
            break
        if prefix is None:
            prefix = item[0]
        elif item[0] != prefix:
            may_have_common_prefix = False
            break
    else:
        # all subitems start with a common "prefix".
        # move it out of the branch
        for item in items:
            del item[0]
        subpatternappend(prefix)

Although the whole thing could likely be cleaned up even more via
itertools.zip_longest:

for first_uncommon_idx, aligned_entries in enumerate(itertools.zip_longest(*items)):
    if not all_true_and_same(aligned_entries):
        break
else:
    # Everything was common, so clear all entries
    first_uncommon_idx = None
for item in items:
    del item[:first_uncommon_idx]

(Batching the deletes like that may even be slightly faster than
deleting common entries one at a time)

Given the following helper function:

def all_true_and_same(entries):
    itr = iter(entries)
    try:
        first_entry = next(itr)
    except StopIteration:
        return False
    if not first_entry:
        return False
    for entry in itr:
        if not entry or entry != first_entry:
            return False
    return True


- finally, 1 is a complicated break dance to achieve something that
clearly would have been easier with except break; from typing.py:




[...]



I think this is another case that is asking for the inner loop to be factored
out to a named function, not for reasons of re-use, but for reasons of
making the code more readable and self-documenting :)



It's true that using a flag or factoring out redundant code is always a 
possibility. Having the except clause would clearly not let people do 
anything they couldn't have done before.
On the other hand, the same is true for the else clause - its only 
advantage here is that it exists already - because a single flag 
could always distinguish between a break having occurred or not:


    brk = False
    for item in iterable:
        if some_condition:
            brk = True
            break
    if brk:
        do_stuff_upon_breaking_out()
    else:
        do_alternative_stuff()

is a general pattern that would always work without except *and* else.

However, the fact that else exists generates a regrettable asymmetry in 
that there is direct language support for detecting one outcome, but not 
the other.


Stressing the analogy to try/except/else one more time, it's as if 
"else" wasn't available for try blocks. You could always use a flag to 
substitute for it:


    dealt_with_exception = False
    try:
        do_stuff()
    except:
        deal_with_exception()
        dealt_with_exception = True
    # the else block runs only if no exception occurred
    if not dealt_with_exception:
        do_stuff_you_would_do_in_an_else_block()

So IMO the real difference here is that the except clause after for 
would require adding it to the language, while the else clauses are 
there already. With that we're back at the high bar for adding new syntax :(
A somewhat similar case that comes to mind here is PEP 315 -- Enhanced 
While Loop, which got rejected for two reasons, the first one being 
pretty much the same as the argument here, i.e., that instead of the 
proposed do .. while it's always possible to factor out or duplicate a 
line of code. However, the second reason was that it required the new 
"do" keyword, something not necessary for the current suggestion.



Re: [Python-ideas] for/except/else

2017-03-03 Thread Wolfgang Maier

On 03/02/2017 07:05 PM, Brett Cannon wrote:


- overall I looked at 114 code blocks that contain one or more breaks


I wanted to say thanks for taking the time to go through the stdlib and
doing such a thorough analysis of the impact of your suggestion! It
always helps to have real-world numbers to know whether an idea will be
useful (or not).



- 84 of these are trivial use cases that simply break out of a while
True block or terminate a while/for loop prematurely (no use for any
follow-up clause there)

- 8 more are causing a side-effect before a single break, and it would
be pointless to put this into an except break clause

- 3 more cause different, non-redundant side-effects before different
breaks from the same loop and, obviously, an except break clause would
not help them either

=> So the vast majority of breaks need *neither* an except break *nor*
an else clause, but that's just as expected.


Of the remaining 19 non-trivial cases

- 9 are variations of your classical search idiom above, i.e., there's
an else clause there and nothing more is needed

- 6 are variations of your "nested side-effects" form presented above
with debatable (see above) benefit from except break

- 2 do not use an else clause currently, but have multiple breaks that
do partly redundant things that could be combined in a single except
break clause

- 1 is an example of breaking out of two loops; from
sre_parse._parse_sub:



[...]


- finally, 1 is a complicated break dance to achieve something that clearly
would have been easier with except break; from typing.py:

My summary: I do see use-cases for the except break clause, but,
admittedly, they are relatively rare and may be not worth the hassle of
introducing new syntax.


IOW out of 114 cases, 4 may benefit from an 'except' block? If I'm
reading those numbers correctly then ~3.5% of cases would benefit which
isn't high enough to add the syntax and related complexity IMO.


Hmm, I'm not sure how much sense it makes to express this in percent 
since the total you're comparing to is rather arbitrary.
The 114 cases include *any* for/while loop I could find that contains at 
least a single break. More than 90 of these loops do not use an "else" 
clause either, showing that even this currently supported syntax is used 
rarely.
I found only 19 cases that are complex enough to be candidates for an 
except clause (17 of these use the else clause). For 9 of these 19 (the 
ones using the classical search idiom) an except clause would not be 
applicable, but it could be used in the 10 remaining cases (though all 
of them could also make use of a flag or could be refactored instead). 
So depending on what you want to emphasize you could also say that the 
proposal could affect as much as 10/19 or 52.6% of cases.
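To make the dependence on the denominator concrete, here is a small sketch that recomputes the ratios from the counts reported above. The variable names and the helper are mine and purely illustrative.

```python
# Counts taken from the stdlib survey quoted above; names are illustrative.
trivial = 84 + 8 + 3                # loops with no use for a follow-up clause
search_idiom = 9                    # classical search idiom, else is enough
except_candidates = 6 + 2 + 1 + 1   # cases where except break could be used
non_trivial = search_idiom + except_candidates
total = trivial + non_trivial


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


share_of_all_loops = percent(except_candidates, total)
share_of_non_trivial = percent(except_candidates, non_trivial)
print(total, non_trivial, share_of_all_loops, share_of_non_trivial)
```

The same ten loops come out as a small or a large share depending on whether you divide by all loops containing a break or only by the non-trivial ones.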



Re: [Python-ideas] for/except/else

2017-03-02 Thread Wolfgang Maier

On 02.03.2017 06:46, Nick Coghlan wrote:

On 1 March 2017 at 19:37, Wolfgang Maier
<wolfgang.ma...@biologie.uni-freiburg.de>
wrote:

Now here's the proposal: allow an except (or except break) clause to
follow for/while loops that will be executed if the loop was
terminated by a break statement.

Now while it's possible that Nick had a good reason not to do so,


I never really thought about it, as I only use the "else:" clause for
search loops where there aren't any side effects in the "break" case
(other than the search result being bound to the loop variable), so
while I find "except break:" useful as an explanatory tool, I don't have
any practical need for it.

I think you've made as strong a case for the idea as could reasonably be
made :)

However, Steven raises a good point that this would complicate the
handling of loops in the code generator a fair bit, as it would add up
to two additional jump targets in cases wherever the new clause was used.

Currently, compiling loops only needs to track the start of the loop
(for continue), and the first instruction after the loop (for break).
With this change, they'd also need to track:

- the start of the "except break" clause (for break when the clause is used)
- the start of the "else" clause (for the non-break case when both
trailing clauses are present)



I think you could get away with only one additional jump target as I 
showed in my previous reply to Steven. The heavier burden would be on 
the parser, which would have to distinguish the existing and the two new 
loop variants (loop with except clause, loop with except and else 
clause) but, anyway, that's probably not really the point.

What weighs heavier, I think, is your design argument.


The design level argument against adding the clause is that it breaks
the "one obvious way" principle, as the preferred form for search loops
looks like this:

    for item in iterable:
        if condition(item):
            break
    else:
        # Else clause either raises an exception or sets a default value
        item = get_default_value()

    # If we get here, we know "item" is a valid reference
    operation(item)

And you can easily switch the `break` out for a suitable `return` if you
move this into a helper function:

    def find_item_of_interest(iterable):
        for item in iterable:
            if condition(item):
                return item
        # The early return means we can skip using "else"
        return get_default_value()

Given that basic structure as a foundation, you only switch to the
"nested side effect" form if you have to:

    for item in iterable:
        if condition(item):
            operation(item)
            break
    else:
        # Else clause neither raises an exception nor sets a default value
        condition_was_never_true(iterable)

This form is generally less amenable to being extracted into a reusable
helper function, since it couples the search loop directly to the
operation performed on the bound item, whereas decoupling them gives you
a lot more flexibility in the eventual code structure.

The proposal in this thread then has the significant downside of only
covering the "nested side effect" case:

    for item in iterable:
        if condition(item):
            break
    except break:
        operation(item)
    else:
        condition_was_never_true(iterable)

While being even *less* amenable to being pushed down into a helper
function (since converting the "break" to a "return" would bypass the
"except break" clause).


I'm actually not quite buying this last argument. If you wanted to 
refactor this to "return" instead of "break", you could simply put the 
return into the except break block. In many real-world situations with 
multiple breaks from a loop this could actually make things easier 
instead of worse.
Personally, the "nested side effect" form makes me uncomfortable every 
time I use it because the side effects on breaking or not breaking the 
loop don't end up at the same indentation level and not necessarily 
together. However, I'm gathering from the discussion so far that not too 
many people are thinking like me about this point, so maybe I should 
simply adjust my mind-set.



All that said, this is a very nice abstract view on things! I really 
learned quite a bit from this, thank you :)


As always though, reality can be expected to be quite a bit more 
complicated than theory so I decided to check the stdlib for real uses 
of break. This is quite a tedious task since break is used in many 
different ways and I couldn't come up with a good automated way of 
classifying them. So what I did is just go through stdlib code (in 
reverse alphabetical order) containing the break keyword and put it i

Re: [Python-ideas] for/except/else

2017-03-01 Thread Wolfgang Maier

On 01.03.2017 12:56, Steven D'Aprano wrote:


- How is this implemented? Currently "break" is a simple
  unconditional GOTO which jumps past the for block. This will
  need to change to something significantly more complex.



One way to implement this with unconditional GOTOs would be (in pseudocode):

LOOP:
on break GOTO EXCEPT
ELSE:
...
GOTO THEN
EXCEPT:
...
THEN:
...

So at the byte-code level (but only there) the order of except and else 
would be reversed. Was that a reason why you were asking about the order 
of except and else in my proposal?
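For illustration only, the same control flow can be emulated in today's Python with a small helper. The function name and its callback parameters are made up for this sketch and are not part of the proposal; the comments map each step to the labels in the pseudocode.

```python
def loop_with_clauses(iterable, should_break, on_break, on_else):
    """Emulate for/except break/else with a flag (hypothetical helper)."""
    broke_out = False
    last = None
    for item in iterable:              # LOOP
        last = item
        if should_break(item):
            broke_out = True           # on break GOTO EXCEPT
            break
    if broke_out:
        result = on_break(last)        # EXCEPT
    else:
        result = on_else()             # ELSE
    return result                      # THEN


with_break = loop_with_clauses(
    [1, 2, 3], lambda i: i == 2, lambda i: f"broke at {i}", lambda: "exhausted")
without_break = loop_with_clauses(
    [1, 3], lambda i: i == 2, lambda i: f"broke at {i}", lambda: "exhausted")
print(with_break)
print(without_break)
```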


Anyway, I'm sure there are people much more skilled at compiler 
programming than me here.





Re: [Python-ideas] for/except/else

2017-03-01 Thread Wolfgang Maier

On 01.03.2017 12:56, Steven D'Aprano wrote:

On Wed, Mar 01, 2017 at 10:37:17AM +0100, Wolfgang Maier wrote:


Now here's the proposal: allow an except (or except break) clause to
follow for/while loops that will be executed if the loop was terminated
by a break statement.


Let me see if I understand the proposal in full. You would allow:


    for i in (1, 2, 3):
        print(i)
        if i == 2:
            break
    except break:  # or just except
        assert i == 2
        print("a break was executed")
    else:
        print("never reached")  # this is never reached
    print("for loop is done")


as an alternative to something like:


    broke_out = False
    for i in (1, 2, 3):
        print(i)
        if i == 2:
            broke_out = True
            break
    else:
        print("never reached")  # this is never reached
    if broke_out:
        assert i == 2
        print("a break was executed")
    print("for loop is done")




correct.


I must admit the suggestion seems a little bit neater than having to
manage a flag myself, but on the other hand I can't remember the last
time I've needed to manage a flag like that.

And on the gripping hand, this is even simpler than both alternatives:

    for i in (1, 2, 3):
        print(i)
        if i == 2:
            assert i == 2
            print("a break was executed")
            break
    else:
        print("never reached")  # this is never reached
    print("for loop is done")



Right, that's how you'd likely implement the behavior today, but see my 
argument about the two alternative code branches not ending up together 
at the same level of indentation.





There are some significant unanswered questions:

- Does it matter which order the for...except...else are in?
  Obviously the for block must come first, but apart from that?



Just like in try/except/else, the order would be for (or 
while)/except/else with the difference that both except and else would 
be optional.



- How is this implemented? Currently "break" is a simple
  unconditional GOTO which jumps past the for block. This will
  need to change to something significantly more complex.



Yeah, I know; that's why I listed this under cons.


- There are other ways to exit a for-loop than just break. Which
  of them, if any, will also run the except block?



None of them (though, honestly, I cannot think of anything but 
exceptions here; what do you have in mind?)
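For what it's worth, besides exceptions a return from the enclosing function also leaves a loop early. The sketch below only shows how today's else clause treats these exits (it is skipped for break, return and exceptions alike); the function names are made up for the illustration.

```python
def exit_by(kind):
    """Leave a for loop in different ways and record whether else ran."""
    log = []

    def inner():
        for i in (1, 2, 3):
            if i == 2:
                if kind == "break":
                    break
                if kind == "return":
                    return
                if kind == "raise":
                    raise ValueError(i)
        else:
            # Only reached when the loop ran to exhaustion.
            log.append("else ran")

    try:
        inner()
    except ValueError:
        log.append("exception propagated")
    return log


for kind in ("break", "return", "raise", "none"):
    print(kind, exit_by(kind))
```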









[Python-ideas] for/except/else

2017-03-01 Thread Wolfgang Maier
I know what the regulars among you will be thinking (time machine, high 
bar for language syntax changes, etc.) so let me start by assuring you 
that I'm well aware of all of this, that I did research the topic before 
posting and that this is not the same as a previous suggestion using 
almost the same subject line.


Now here's the proposal: allow an except (or except break) clause to 
follow for/while loops that will be executed if the loop was terminated 
by a break statement.


The idea is certainly not new. In fact, Nick Coghlan, in his blog post
http://python-notes.curiousefficiency.org/en/latest/python_concepts/break_else.html, 
uses it to provide a mental model for the meaning of the else following 
for/while, but, as far as I'm aware, he never suggested to make it legal 
Python syntax.


Now while it's possible that Nick had a good reason not to do so, I 
think there would be three advantages to this:


- as explained by Nick, the existence of "except break" would strengthen 
the analogy with try/except/else and help people understand what the 
existing else clause after a loop is good for.
There has been much debate over the else clause in the past, most 
prominently a long discussion on this list back in 2009 (I recommend 
that interested people start with Steven D'Aprano's summary of it at 
https://mail.python.org/pipermail/python-ideas/2009-October/006155.html) 
that shows that for/else is misunderstood by/unknown to many Python 
programmers.


- in some situations for/except/else would make code more readable by 
bringing logical alternatives closer together and to the same 
indentation level in the code. Consider a simple example (taken from the 
docs.python.org tutorial):


    for n in range(2, 10):
        for x in range(2, n):
            if n % x == 0:
                print(n, 'equals', x, '*', n//x)
                break
        else:
            # loop fell through without finding a factor
            print(n, 'is a prime number')

There are two logical outcomes of the inner for loop here - a given 
number can be either prime or not. However, the two code branches 
dealing with them end up at different levels of indentation and in 
different places, one inside and one outside the loop block. This second 
issue can become much more annoying in more complex code where the loop 
may contain additional code after the break statement.


Now compare this to:

    for n in range(2, 10):
        for x in range(2, n):
            if n % x == 0:
                break
        except break:
            print(n, 'equals', x, '*', n//x)
        else:
            # loop fell through without finding a factor
            print(n, 'is a prime number')

IMO, this reflects the logic better.
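Since the second form is not valid Python today, here is a sketch of how it can be approximated with a flag so that both outcomes sit at the same indentation level. The flag name and the function wrapper (which collects the lines instead of printing them directly) are my additions.

```python
def classify(limit=10):
    """Describe each n in range(2, limit) as prime or as a product."""
    lines = []
    for n in range(2, limit):
        found_factor = False
        for x in range(2, n):
            if n % x == 0:
                found_factor = True
                break
        if found_factor:      # stands in for "except break:"
            lines.append(f"{n} equals {x} * {n // x}")
        else:                 # stands in for the loop's "else:"
            lines.append(f"{n} is a prime number")
    return lines


print("\n".join(classify()))
```

The price is the extra flag, which is exactly the bookkeeping the proposed clause would remove.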


- it could provide an elegant solution for the "How to break out of two 
loops" issue. This is another topic that comes up rather regularly 
(python-list, stackoverflow) and there is again a very good blog post 
about it, this time from Ned Batchelder at 
https://nedbatchelder.com/blog/201608/breaking_out_of_two_loops.html.
Stealing his example, here's code that a newcomer (at least) may come up 
with before realizing it can't work:


    s = "a string to examine"
    for i in range(len(s)):
        for j in range(i+1, len(s)):
            if s[i] == s[j]:
                answer = (i, j)
                break   # How to break twice???

with for/except/else this could be written as:

    s = "a string to examine"
    for i in range(len(s)):
        for j in range(i+1, len(s)):
            if s[i] == s[j]:
                break
        except break:
            answer = (i, j)
            break
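For comparison, a common workaround in current Python is to move the nested loops into a function and use return, which leaves both loops at once. The function name below is made up for this sketch.

```python
def first_repeated_pair(s):
    """Return (i, j) for the first pair of equal characters, or None."""
    for i in range(len(s)):
        for j in range(i + 1, len(s)):
            if s[i] == s[j]:
                return (i, j)   # leaves both loops at once
    return None


answer = first_repeated_pair("a string to examine")
print(answer)
```

This works, but it forces the search into a separate function even when the surrounding code would otherwise read fine inline.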


So much for the pros. Of course there are cons, too. The classical one 
for any syntax change, of course, is:
- burden on developers who have to implement and maintain the new 
syntax. Specifically, this proposal would make parsing/compiling of 
loops more complicated.


Others include:
- using except will make people think of exceptions and that may cause 
new confusion; while that's true, I would argue that, in fact, break and 
exceptions are rather similar features in that they are gotos in 
disguise, so except will still be used to catch an interruption in 
normal control flow.


- the new syntax will not help people understand for/else if except is 
not used; importantly, I'm *not* proposing to disallow the use of 
for/else without except (if that would ever happen it would be in the 
*very* distant future) so that would indeed mean that people would 
encounter for/else, not only in legacy, but also in newly written code. 
However, I would expect that they would also start seeing for/except 
increasingly (not least because it solves the "break out of two loops" 
issue) so they would be nudged towards thinking of the else after 
for/while more like the else in try/except/else just as Nick proposes 
it. Interestingly, there has been another proposal on this list several 
years ago about allowing try/else without except, which I liked at the 
time and which would have made try/[except/]else work exactly as my 
proposed for/except/else. Here it is:

https://m

Re: [Python-ideas] PEP: Distributing a Subset of the Standard Library

2016-11-29 Thread Wolfgang Maier

On 29.11.2016 10:39, Paul Moore wrote:

On 28 November 2016 at 22:33, Steve Dower  wrote:

Given that, this wouldn't necessarily need to be an executable file. The
finder could locate a "foo.missing" file and raise ModuleNotFoundError with
the contents of the file as the message. No need to allow/require any Python
code at all, and no risk of polluting sys.modules.


I like this idea. Would it completely satisfy the original use case
for the proposal? (Or, to put it another way, is there any specific
need for arbitrary code execution in the missing.py file?)



The only thing that I could think of so far would be cross-platform 
.missing.py files that query the system (e.g. using the platform module) 
to generate adequate messages for the specific platform or distro. E.g., 
correctly recommend to use dnf install or yum install or apt install, etc.
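A rough sketch of what such a message generator might look like. Everything here is hypothetical: the function name, the example package name and the idea of probing for package managers with shutil.which are only my illustration and are not part of the PEP draft.

```python
import shutil

# Hypothetical mapping from package manager executable to install command.
INSTALL_COMMANDS = {
    "dnf": "dnf install {pkg}",
    "yum": "yum install {pkg}",
    "apt": "apt install {pkg}",
}


def missing_message(module, pkg, which=shutil.which):
    """Build a hint for a missing module based on the tools found on PATH."""
    for tool, template in INSTALL_COMMANDS.items():
        if which(tool):
            return (f"{module} is not installed; try: "
                    + template.format(pkg=pkg))
    return f"{module} is not installed; please consult your distribution."


# The package name is an assumption and will differ between distros.
print(missing_message("tkinter", "python3-tkinter"))
```

Nothing in this needs to touch sys.modules, so it would stay a pure message generator.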





Re: [Python-ideas] PEP: Distributing a Subset of the Standard Library

2016-11-28 Thread Wolfgang Maier

On 28.11.2016 23:52, Chris Angelico wrote:


+1, because this also provides a coherent way to reword the try/except
import idiom:

    # Current idiom
    # somefile.py
    try:
        import foo
    except ImportError:
        import subst_foo as foo

    # New idiom:
    # foo.missing.py
    import subst_foo as foo
    import sys; sys.modules["foo"] = foo
    # somefile.py
    import foo



Hmm. I would rather take this example as an argument against the 
proposed behavior. It invites too many clever hacks. I thought that the 
idea was that .missing.py does *not* act as a replacement module, but, 
more or less, just as a message generator.




Re: [Python-ideas] PEP: Distributing a Subset of the Standard Library

2016-11-28 Thread Wolfgang Maier

On 28.11.2016 23:19, Nathaniel Smith wrote:


I'd suggest that we additionally specify that if we find a
foo.missing.py, then the code is executed but -- unlike a regular
module load -- it's not automatically inserted into
sys.modules["foo"]. That seems like it could only create confusion.
And it doesn't restrict functionality, because if someone really wants
to implement some clever shenanigans, they can always modify
sys.modules["foo"] by hand.

This also suggests that the overall error-handling flow for 'import
foo' should look like:

1) run foo.missing.py
2) if it raises an exception: propagate that
3) otherwise, if sys.modules["foo"] is missing: raise some variety of
ImportError.
4) otherwise, use sys.modules["foo"] as the object that should be
bound to 'foo' in the original invoker's namespace

I think this might make everyone who was worried about exception
handling downthread happy -- it allows a .missing.py file to
successfully import if it really wants to, but only if it explicitly
fulfills the 'import' requirement that the module should somehow be made
available.



A refined alternative (refined from my previous post, which may have 
ended up too nested): instead of triggering an immediate search for a 
.missing.py file, why not have the interpreter intercept any 
ModuleNotFoundError that bubbles up to the top without being caught, and 
then use the name attribute of the exception to look for the .missing.py 
file? Agreed, this is more complicated to implement, but it would avoid 
any performance loss in situations where running code knows how to deal 
with the missing module anyway.
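As a very rough illustration of the idea in pure Python, one could emulate the behavior with sys.excepthook. This is only a sketch under assumptions: the hook function, the lookup of a plain-text "<name>.missing" file on sys.path (instead of an executable .missing.py) and the message handling are all made up here and are not how the proposal would actually be implemented.

```python
import os
import sys


def find_missing_hint(name, search_path=None):
    """Return the text of the first '<name>.missing' file found, or None."""
    for directory in (sys.path if search_path is None else search_path):
        candidate = os.path.join(directory or ".", name + ".missing")
        if os.path.isfile(candidate):
            with open(candidate, encoding="utf-8") as f:
                return f.read().strip()
    return None


def missing_aware_excepthook(exc_type, exc, tb):
    """Called only for exceptions that nobody caught."""
    if isinstance(exc, ModuleNotFoundError) and exc.name:
        hint = find_missing_hint(exc.name)
        if hint:
            print(hint, file=sys.stderr)
    # Fall back to the default traceback display in every case.
    sys.__excepthook__(exc_type, exc, tb)


sys.excepthook = missing_aware_excepthook
```

Code that wraps the import in try/except never reaches the hook, so the fallback import idiom keeps working and pays no extra search cost.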


Wolfgang



Re: [Python-ideas] PEP: Distributing a Subset of the Standard Library

2016-11-28 Thread Wolfgang Maier

On 28.11.2016 22:26, Paul Moore wrote:

On 28 November 2016 at 21:11, Ethan Furman  wrote:

One "successful" use-case that would be impacted is the fallback import
idiom:

    try:
        # this would do two full searches before getting the error
        import BlahBlah
    except ImportError:
        import blahblah


Under this proposal, the above idiom could potentially now fail. If
there's a BlahBlah.missing.py, then that will get executed rather than
an ImportError being raised, so the fallback wouldn't be executed.
This could actually be a serious issue for code that currently
protects against optional stdlib modules not being available like
this. There's no guarantee that I can see that a .missing.py file
would raise ImportError (even if we said that was the intended
behaviour, there's nothing to enforce it).

Could the proposal execute the .missing.py file and then raise
ImportError? I could imagine that having problems of its own,
though...



How about addressing both concerns by triggering the search for 
.missing.py only if an ImportError bubbles up uncaught (a bit similar to 
StopIteration nowadays)?


Wolfgang
