[Python-ideas] Re: Shorthand syntax for lambda functions that have a single parameter

2021-10-04 Thread Dominik Vilsmeier
Abdulla Al Kathiri wrote:
> Oh I forgot what if you want to return a set from your lambda? Maybe a lambda 
> set should at least have one assignment statement to qualify it as one. 
> Expressions only inside a set syntax will be just a normal set that doesn’t 
> care about order as you pointed out. But a lambda set will care about the 
> order just like when you do a normal multi-lines def function. 
> def f(x):
>     print(x)
>     z = x + 3
>     return z
> Is equivalent to
> (x) => {print(x), z = x + 3, z}

You can achieve this already with tuples and assignment expressions (not very 
readable though):

lambda x: (print(x), z := x+3, z)[-1]
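
For example, a quick check that the tuple/walrus version behaves like the multi-line def:

    f = lambda x: (print(x), z := x + 3, z)[-1]
    print(f(2))  # prints 2, then 5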


[Python-ideas] Re: Shorthand syntax for lambda functions that have a single parameter

2021-09-29 Thread Dominik Vilsmeier
Chris Angelico wrote:
> On Wed, Sep 29, 2021 at 10:56 PM Dominik Vilsmeier
> dominik.vilsme...@gmx.de wrote:
> > members.sort(key=(?[1], ?[0]))
> How do you know whether this is one function that returns a tuple, or
> a tuple of two functions?
> ChrisA

You are right, I didn't think of this ambiguity w.r.t. the start of the 
expression. While `lambda` clearly marks where the lambda body starts, `?` 
doesn't do this. Actually the above example could also be translated to `lambda 
x: members.sort(key=(x[1], x[0]))` (which doesn't make sense, of course, but 
it's valid syntax).


[Python-ideas] Shorthand syntax for lambda functions that have a single parameter

2021-09-29 Thread Dominik Vilsmeier
Lambda functions that have a single parameter are a common thing, e.g. for 
"key" functions: `sorted(items, key=lambda x: x['key'])`. In these cases, 
however, the rather long keyword "lambda" together with the repetition of the 
parameter name results in more overhead than actual content (the `x['key']`) 
and thus makes the expression more difficult to read. The same applies to `map` 
and `filter`, where the lambda adds a lot of visual overhead and makes the call 
harder to read: `filter(lambda x: x > 0, items)` or `map(lambda x: f'{x:.3f}', 
items)`.

Hence the proposal is to add a new syntax via the new token `?`. For the 
examples above:

* `sorted(items, key=?['key'])`
* `filter(? > 0, items)`
* `map(f'{?:.3f}', items)`

The rules are simple: whenever the token `?` is encountered as part of an 
expression (at a position where a name/identifier would be legal), the 
expression is replaced by a lambda function with a single parameter which has 
that expression as a return value, where any instances of `?` are replaced by 
the name of that single parameter. For example:

* `?['key']` translates to `lambda x: x['key']`
* `? > 0` translates to `lambda x: x > 0`
* `f'{?:.3f}'` translates to `lambda x: f'{x:.3f}'`
* `?*?` translates to `lambda x: x*x`

The difference is that the replacement function would use an unnamed 
parameter, i.e. one that doesn't shadow any names from outer scopes. So `? * x` 
would use `x` from enclosing scopes, not as the lambda parameter.

Regarding operator precedence, the rules would be similar to those for lambda 
functions, i.e. the expression extends as far to the right as it would for a 
lambda function.

To find more example use cases, I used `grep -rnE 'lambda 
[_a-zA-Z][_a-zA-Z0-9]*:' Lib/` on the CPython standard library to find single 
parameter lambda functions and there are various results coming up (although 
not all are applicable, e.g. because they return a constant value). I'll 
include a few examples below, but the actual list is much longer:

```
modes[char] = max(items, key=lambda x: x[1])
modes[char] = max(items, key=?[1])  # proposed new syntax

obj = unwrap(obj, stop=(lambda f: hasattr(f, "__signature__")))
obj = unwrap(obj, stop=hasattr(?, "__signature__"))

s = re.sub(r"-[a-z]\b", lambda m: m.group().upper(), s)
s = re.sub(r"-[a-z]\b", ?.group().upper(), s)

members.sort(key=lambda t: (t[1], t[0]))
members.sort(key=(?[1], ?[0]))
```

Of course, there is no requirement to use `?` as the token; it could be any 
other character that's currently not used by Python, e.g. `$` or `!` would be 
possible too.


[Python-ideas] Re: Add command-line option to unittest for enabling post-mortem debugging

2021-01-09 Thread Dominik Vilsmeier
In case someone is interested, I created a corresponding pull request here: 
https://github.com/python/cpython/pull/23900
It's a lightweight change since the relevant methods `TestCase.debug` and 
`TestSuite.debug` were already in place. The docstring of these methods makes 
it clear that this is what they were intended to be used for:

"Run the test without collecting the result. This allows exceptions raised by 
the test to be propagated to the caller, and can be used to support running 
tests under a debugger."
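
For illustration, a minimal sketch of how these methods can be driven today from outside (assuming a module `test` with a `TestFoo.test_foo` case as in the quoted example below):

    import pdb
    import unittest

    # TestSuite.debug() propagates the exception instead of collecting it,
    # so pdb can do a post-mortem on the failing frame.
    suite = unittest.defaultTestLoader.loadTestsFromName("test.TestFoo.test_foo")
    try:
        suite.debug()
    except Exception:
        pdb.post_mortem()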

Dominik Vilsmeier wrote:
> Consider the following example:
> import unittest
> 
> def foo():
>     for x in [1, 2, 'oops', 4]:
>         print(x + 100)
> 
> class TestFoo(unittest.TestCase):
>     def test_foo(self):
>         self.assertIs(foo(), None)
> 
> if __name__ == '__main__':
>     unittest.main()
> 
> If we were calling foo directly we could enter post-mortem debugging via
> python -m pdb test.py.
> However since foo is wrapped in a test case, unittest eats the
> exception and thus prevents post-mortem debugging. --failfast doesn't help,
> the exception is still swallowed.
> Since I am not aware of a solution that enables post-mortem debugging in such 
> a case
> (without modifying the test scripts, please correct me if one exists), I 
> propose adding a
> command-line option to unittest for running
> test cases in debug mode so that post-mortem debugging can be used.
> P.S.: There is also this SO
> question on a similar topic.


[Python-ideas] Re: Proposed new syntax for subscripting (was PEP 472)

2020-09-01 Thread Dominik Vilsmeier

On 01.09.20 17:44, Steven D'Aprano wrote:


(9) Keyword-only subscripts are permitted:

 obj[spam=1, eggs=2]
 # calls type(obj).__getitem__(spam=1, eggs=2)

 del obj[spam=1, eggs=2]
 # calls type(obj).__delitem__(spam=1, eggs=2)

but note that the setter is awkward since the signature requires the
first parameter:

 obj[spam=1, eggs=2] = value
 # wants to call type(obj).__setitem__(???, value, spam=1, eggs=2)

Proposed solution: this is a runtime error unless the setitem method
gives the first parameter a default, e.g.:

 def __setitem__(self, index=None, value=None, **kwargs):

Note that the second parameter will always be present, nevertheless, to
satisfy the interpreter, it too will require a default value.

(Editorial comment: this is undoubtably an awkward and ugly corner case,
but I am reluctant to prohibit keyword-only assignment.)


Why does the signature require the first `index` parameter? When I see
`obj[spam=1, eggs=2] = value` there's no positional index and so I
wouldn't expect one to be passed. Similar to how the following works:

    >>> def foo():
    ...     pass
    ...
    >>> foo(*())

If there's nothing to unpack nothing will be assigned to any of the
parameters.

So the following signature would work with keyword-only subscripts:

    def __setitem__(self, value, **kwargs):

I don't know how the `[] =` operator is translated to `__setitem__` at the
implementation level, so perhaps the no-positional-index case would
require yet another opcode, but thinking of it in the following way it
would definitely be possible:

    obj.__setitem__(*(pos_index + (value,)), **kwargs)

where `pos_index` is the positional index collected from the `[]`
operator as a tuple (and if no such index is given it defaults to the
empty tuple). This matches the above no-index signature of
`__setitem__`. This is also the signature of the corresponding
`__getitem__` and `__delitem__` methods.
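
As a minimal sketch, simulating the proposed dispatch with an explicit call (the keyword-subscript syntax itself doesn't exist yet, and `Grid` is made up):

    class Grid:
        def __setitem__(self, value, **kwargs):
            print("value:", value, "kwargs:", kwargs)

    pos_index = ()  # no positional index given in the subscript
    Grid().__setitem__(*(pos_index + (42,)), spam=1, eggs=2)
    # value: 42 kwargs: {'spam': 1, 'eggs': 2}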


[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-09-01 Thread Dominik Vilsmeier

On 01.09.20 11:22, Zig Zigson wrote:


I believe I described my case poorly, the process to get from one state (key) 
to the next is an external (slow) process; the value stored is not the next 
state but a value calculated while advancing the state. This dict serves as a 
way to quickly skip steps of the external process when it has repeated itself, 
and thus calculating the next key would be exorbitantly slower than iterating 
through the whole dict.
In any case as a poster pointed out above, my example is not as compelling in 
terms of a speedup as I thought, the dict key iteration is not very long in 
practice compared to other operations I need to do.


What I meant is that you *could* store the next state, e.g. alongside
that computed value, as a tuple. So you would have `state, value =
states[state]`. This increases the memory usage but it saves you from
iterating the whole dict, if that is what you want.
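
A minimal sketch of that layout (states and values made up):

    # each key maps to (next_state, value), so a cycle can be walked via
    # repeated O(1) lookups instead of iterating the whole dict
    states = {"a": ("b", 1), "b": ("c", 2), "c": ("a", 3)}

    state, total = "a", 0
    for _ in range(len(states)):
        state, value = states[state]
        total += value
    print(total)  # 6, after walking the full cycle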


[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-09-01 Thread Dominik Vilsmeier

On 31.08.20 06:01, junkneno...@gmail.com wrote:


I have a use case which relates to this request: iterating over a dict starting 
from a given key. I would like to achieve this without having to pay the full 
O(n) cost if I'm going to be iterating over only a few items. My understanding 
is that this should be achievable without needing to iterate through the entire 
dict, since the dict's internal key lookup points to a particular index of 
dk_entries anyway.

My sample use case at a high level is when the dict stores values uniquely 
representing the state of a process (say, the hash of a changing object), and 
the values represent some outcome of a step in that process. The process can 
contain loops, so at each step we check if the current state's outcome is 
already stored (thus we want a dict for O(1) lookup), and when a matching state 
is found we'd like to stop and loop over the in-between states performing some 
operation on their values (say, summing their outcome values).
We may continue the process and find state-loops many times (the actual use 
case involves non-deterministic branches and thus possibly many loops), and the 
state-dict might reach a very large size, so iterating over the entire dict 
every time we find a matching key is undesirable, as is storing keys in an 
associated list as this would ~double the memory used.


Unless I'm misunderstanding the task, it sounds like this could be
solved by repeated lookups of cycle elements. It seems to be a special
case anyway that all cycles are inserted in order into the dict. I.e.
instead of iterating from one key to another you would just iterate the
cycle:

    if outcome in states:
        cycle = [outcome]
        while (state := states[cycle[-1]]) != outcome:
            cycle.append(state)
        result = sum(cycle)



[Python-ideas] Re: Make start, stop, step properties on islice

2020-08-12 Thread Dominik Vilsmeier

On 12.08.20 10:37, Mathew Elman wrote:


Is there a reason that itertools.islice does not provide its start, stop and 
step values as attributes, similar to range?
This seems like a sensible and useful thing to have, and would also allow 
islice's to have a __len__.


Not all iterators need to be finite. E.g. `it.islice(it.count(), 5,
None)` skips the first five elements but is still infinite, so it doesn't
have a length.
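
For example:

    import itertools as it

    s = it.islice(it.count(), 5, None)  # infinite, so no possible __len__
    print(next(s), next(s))  # 5 6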


[Python-ideas] Re: use type hints and slices to specify a valid numerical range, example: `Angle = int[0:361]`

2020-08-08 Thread Dominik Vilsmeier

On 08.08.20 05:48, David Mertz wrote:


On Fri, Aug 7, 2020, 6:03 PM Paul Moore <p.f.mo...@gmail.com> wrote:

> x: int[0:]  # any ints greater than or equal to zero would
match, others would fail
> x: int[:101]  # any ints less than 101 match
> x: int[0:101:2]  # even less than 101

I suspect the biggest issue with this is that it's likely to be
extremely hard (given the dynamic nature of Python) to check such
type assertions statically.


Yes, it's hard in the sense that it would require solving the halting
problem.


How is it any more difficult than already existing type checks? The
proposal talks about type hints for static type analysis and hence
implies using literals for the slices. There's no dynamic nature to it.
You can define `int[x]` to be a subtype of `int[y]` if the numbers
defined by `x` are a subset of those defined by `y`.
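
A sketch of that subset rule for literal slices (half-open intervals as in the proposal; the helper name is made up and the step is ignored for brevity):

    def is_subtype(x: slice, y: slice) -> bool:
        # int[x] is a subtype of int[y] iff x's range lies within y's
        lo = lambda s: s.start if s.start is not None else float("-inf")
        hi = lambda s: s.stop if s.stop is not None else float("inf")
        return lo(x) >= lo(y) and hi(x) <= hi(y)

    print(is_subtype(slice(0, 5), slice(0, 11)))   # True
    print(is_subtype(slice(-1, 5), slice(0, 11)))  # False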

Referring to one of the examples including `random.randint`:

    def foo(x: int[0:11]):
    pass

    foo(random.randint(0, 10))

Regarding static type analysis, the only information about
`random.randint` is that it returns an `int` and hence this should be an
error. If you want to use it that way, you'd have to manually cast the
value:

    foo(typing.cast(int[0:11], random.randint(0, 10)))

This of course raises questions about the usefulness of the feature.

By the way, I would find it more intuitive if the `stop` boundary was
included in the interval, i.e. `int[0:10]` means all the integer numbers
from 0 to 10 (including 10). I don't think that conflicts with the
current notion of slices since they always indicate a position in the
sequence, not the values directly (then `int[0:10]` would mean "the
first 10 integer numbers", but where do they even start?).



[Python-ideas] Re: Decorators for class non function properties

2020-08-06 Thread Dominik Vilsmeier

On 06.08.20 04:58, Guido van Rossum wrote:


On Wed, Aug 5, 2020 at 6:42 PM Steven D'Aprano <st...@pearwood.info> wrote:

On Wed, Aug 05, 2020 at 06:15:22PM -0700, Guido van Rossum wrote:
> On Wed, Aug 5, 2020 at 5:55 PM Steven D'Aprano
> <st...@pearwood.info> wrote:

> > That require two different rules for decorators:
> >
> > @decorator over a `class name` or `def name` statement:
> >
> > - execute the statement
> > - bind `name = decorator(name)`
> >
>
> But that's not what's done. (Proof: if the decorator raises, the
name
> remains unbound.)

You are technically correct, which is the best kind of correct.

The documentation uses very close to the same wording as me

https://docs.python.org/3/reference/compound_stmts.html#function-definitions

but does make the point that "except that the original function is
not
temporarily bound to the name func". Since I wasn't writing a
reference
manual, I didn't think this level of pedantry was needed :-)

The bottom line is that the function or class statement has to be
executed *in some sense* in order to create the function or class
object, that object has to be passed to the decorator, and finally
the
object returned by the decorator has to be bound to the original name.


But that's the same as it would be for a decorated assignment, right? In
```
@deco
x = func(arg)
```
This executes `func(arg)` to create the value of the expression, then
passes it to the decorator, and finally the decorator's result is
bound to the name `x`.

In both cases there's something that gets executed to create something
(in one case, a function or class object, in another case, some other
object), and then gets bound to a name; in both cases a decorator, if
present, is inserted to transform the value just before it is bound.

A much better argument against decorating assignments is that you can
already write it just fine as
```
x = deco(func(arg))
```
and the decorated version is in no way more readable, nor does it
provide more power or expressivity.


It seems that the OP has many such transformations and wants to use
decorators to define a pipeline of transformations:

    @foo
    @bar
    @baz
    something = initial_value

instead of

    something = foo(bar(baz(initial_value)))

which becomes unreadable for longer function names / more functions in
general.

However one can always define such pipelines in advance and then apply
them when needed:

    pipeline = Pipeline(foo, bar, baz)
    something = pipeline(initial_value)
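
where `Pipeline` could be a small helper along these lines (a sketch; the class is hypothetical):

    from functools import reduce

    class Pipeline:
        def __init__(self, *funcs):
            self.funcs = funcs

        def __call__(self, value):
            # apply right-to-left, matching foo(bar(baz(value)))
            return reduce(lambda v, f: f(v), reversed(self.funcs), value)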

For single usage one can even define such a pipeline via decorators by
using a function to provide the initial value:

    @foo
    @bar
    @baz
    @pipeline
    def something():
        return initial_value

where `pipeline` simply returns the return value of the function it's
applied to.
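
A minimal sketch of that `pipeline` helper, using builtins in place of foo/bar/baz:

    def pipeline(func):
        # call the decorated function immediately; the decorators above
        # then transform its return value before it is bound to the name
        return func()

    @str
    @abs
    @pipeline
    def something():
        return -10

    print(something)  # '10', i.e. str(abs(-10))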



[Python-ideas] Re: Decorators for class non function properties

2020-08-05 Thread Dominik Vilsmeier

On 05.08.20 12:40, Greg Ewing wrote:


On 5/08/20 9:13 pm, Serhiy Storchaka wrote:

the code can be written as

 class Neuron:
 activation = linear_activation(activation)

I do not see reasons to introduce special syntax for this very specific
code.


A considerable number of moons ago, I suggested that

    @my_property
    fred = 42

should expand to

    fred = my_property("fred", 42)

The point being to give the descriptor access to the name of
the attribute, without having to repeat yourself.


That should be possible by doing `fred = my_property(42)` and defining
`__set_name__` on the `my_property` class.

https://docs.python.org/3/reference/datamodel.html#object.__set_name__
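
A minimal sketch of such a descriptor:

    class my_property:
        def __init__(self, value):
            self.value = value

        def __set_name__(self, owner, name):
            # called automatically at class creation with the attribute name
            self.name = name

        def __get__(self, obj, objtype=None):
            return self.value

    class C:
        fred = my_property(42)

    print(C.fred)                   # 42
    print(C.__dict__['fred'].name)  # 'fred', captured without repetition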


[Python-ideas] Re: Introduce a boundary parameter for typing.get_type_hints when used with a class object

2020-08-05 Thread Dominik Vilsmeier

On 04.08.20 22:05, Guido van Rossum wrote:


Maybe get-type-hints can be refactored to make writing such a function
simpler. IIRC the part that takes a single annotation and evaluates it
is a private function.


That's what I considered and indeed it relies on `typing._eval_type`. If
this was public then all building blocks would be in place.



On Tue, Aug 4, 2020 at 12:57 David Mertz <me...@gnosis.cx> wrote:

This definitely feels to me like though if an oddball case that
"write your own function" seems like the best solution. I accept
the OP needs it, but I have trouble imagining that many others would.

On Tue, Aug 4, 2020, 7:42 AM Dominik Vilsmeier
<dominik.vilsme...@gmx.de> wrote:

In one of my projects I'm reusing class-level type annotations
to identify relevant attributes for serialization, e.g.
similar to the following:

    attrs = {name: getattr(obj, name) for name in
get_type_hints(type(obj))}

This is convenient because it merges the type annotations from
the various stages in the class hierarchy, e.g.

    class Base:
        a: int
    class Derived(Base):
        b: str

results in `attrs == dict(a=..., b=...)`.

However it becomes inconvenient if external base classes are
involved that define their own, unrelated type annotations, e.g.

    class External:  # from some other distribution
        unrelated: float
    class Base(External):
        a: int

It would be helpful if `get_type_hints` had a `boundary`
parameter that, when used with a class object, determines the
upper boundary for the MRO. So it could be used in the
following way:

    get_type_hints(type(derived_obj), boundary=Base)

to exclude any type annotations further up the class hierarchy
(including the ones from `Base`).

Regarding the implementation this would effectively skip over
base classes in the reverse MRO until it reaches the `boundary`.

What do you think?

--
--Guido (mobile)



[Python-ideas] Introduce a boundary parameter for typing.get_type_hints when used with a class object

2020-08-04 Thread Dominik Vilsmeier
In one of my projects I'm reusing class-level type annotations to identify 
relevant attributes for serialization, e.g. similar to the following:

attrs = {name: getattr(obj, name) for name in get_type_hints(type(obj))}

This is convenient because it merges the type annotations from the various 
stages in the class hierarchy, e.g.

class Base:
a: int
class Derived(Base):
b: str

results in `attrs == dict(a=..., b=...)`.

However it becomes inconvenient if external base classes are involved that 
define their own, unrelated type annotations, e.g.

class External:  # from some other distribution
unrelated: float
class Base(External):
a: int

It would be helpful if `get_type_hints` had a `boundary` parameter that, when 
used with a class object, determines the upper boundary for the MRO. So it 
could be used in the following way:

get_type_hints(type(derived_obj), boundary=Base)

to exclude any type annotations further up the class hierarchy (including the 
ones from `Base`).

Regarding the implementation this would effectively skip over base classes in 
the reverse MRO until it reaches the `boundary`.
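
A hedged sketch of the lookup part (the function name is made up; it reads per-class `__annotations__` directly, whereas the real `get_type_hints` would additionally evaluate string annotations):

    def get_type_hints_bounded(cls, boundary=object):
        hints = {}
        for base in reversed(cls.__mro__):
            if issubclass(boundary, base):
                continue  # base is the boundary itself or above it in the MRO
            hints.update(vars(base).get("__annotations__", {}))
        return hints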

What do you think?


[Python-ideas] Re: Filtered wildcard imports?

2020-08-03 Thread Dominik Vilsmeier

On 02.08.20 15:36, Sebastian M. Ernst wrote:


Hi all,

yet another (possibly bad?) idea from day-to-day work ...

I occasionally need to import a lot of "stuff" from certain modules. The
"stuff" is usually following a pattern. E.g. I have modules that
(mostly) collect special exception classes and I usually need them all
in one push - but nothing else from those modules.


It seems the responsibility of the package / distribution you are using
to provide appropriate namespaces for their objects. If this "stuff" is
following a certain pattern it seems to be distinct from other parts of
the module, especially since you as a user have the need to only import
those objects. So if they lived in their own namespace then you could
simply do

    from somepkg.errors import *
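
Alternatively, the package itself can steer a plain wildcard import by defining `__all__`, e.g. in a hypothetical `somepkg/errors.py`:

    # export only the exception classes via `from somepkg.errors import *`
    __all__ = [name for name in dir() if name.endswith("Error")]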


[Python-ideas] Re: Idea: Extend "for ... else ..." to allow "for ... if break ..." else

2020-07-29 Thread Dominik Vilsmeier

On 29.07.20 13:33, Jonathan Fine wrote:


Thank you all, particularly Guido, for your contributions. Having some
examples will help support the exploration of this idea.

Here's a baby example - searching in a nested loop. Suppose we're
looking for the word 'apple' in a collection of books. Once we've
found it, we stop.

    for book in books:
        for page in book:
            if 'apple' in page:
                break
        if break:
            break


This can be realized already with `else: continue`:

    for book in books:
        for page in book:
            if 'apple' in page:
                break
        else:
            continue
        break

However it looks more like this should be a function and just return
when there is a match:

    for book in books:
        for page in book:
            if 'apple' in page:
                return True

Or flatten the loop with itertools:

    for page in it.chain.from_iterable(books):
        if 'apple' in page:
            break

This can also be combined with functions `any` or `next` to check if
there's a match or to get the actual page.
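
For example (assuming `books` is an iterable of pages, as above):

    import itertools as it

    # is there any match at all?
    found = any('apple' in page for page in it.chain.from_iterable(books))

    # or get the first matching page (None if there is none)
    page = next((p for p in it.chain.from_iterable(books) if 'apple' in p), None)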



However, suppose we say that we only look at the first 5000 or so
words in each book. (We suppose a page is a list of words.)

This leads to the following code.

    for book in books:
        word_count = 0
        for page in book:
            word_count += len(page)
            if word in page:
                break
            if word_count >= 5000:
                break found
        if break found:
            break


This also could be a function that just returns on a match. Or you could
use `itertools.islice` to limit the number of words. I don't see a
reason for double break here.
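
E.g., a sketch using `islice` (with pages being lists of words, as stated):

    import itertools as it

    def contains_word(book, word, limit=5000):
        # inspect no more than `limit` words across the book's pages
        words = it.chain.from_iterable(book)
        return word in it.islice(words, limit)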



At this time, I'd like us to focus on examples of existing code, and
semantics that might be helpful. I think once we have this, the
discussion of syntax will be easier.

By the way, the word_count example is as I typed it, but it has a
typo. Did you spot it when you read it? (I only noticed it when
re-reading my message.)

Finally, thank you for your contributions. More examples please.


I think the need for two (or more) distinct `break` reasons or the same
`break` reason at multiple different locations in a loop is pretty rare.
Are there any counter examples? Otherwise such cases can be handled
already today and there's no need for additional syntax (apart from the
"else" ambiguity).


[Python-ideas] Re: default parameter in fuctions to clean up flow

2020-07-27 Thread Dominik Vilsmeier

On 27.07.20 16:01, Peter Moore wrote:


I have had a long standing unanswered question on stackoverflow: is it 
possible to pass a function to a default parameter so that you could do in 
essence things like this.

def time_diff(target_time, curr_time=lambda: datetime.now()):
    return curr_time - target_time


There was a discussion about this topic recently:

https://mail.python.org/archives/list/python-ideas@python.org/thread/MILIX6HSW3PRUNWWP6BN2G2D7PXYFZJ7/



[Python-ideas] Re: PEP 472 -- Support for indexing with keyword arguments

2020-07-19 Thread Dominik Vilsmeier

On 17.07.20 22:11, Todd wrote:


On Fri, Jul 17, 2020 at 12:19 PM David Mertz <me...@gnosis.cx> wrote:

Fwiw, I'm probably -0 on the feature itself. Someone suggested it
could be useful for xarray, but I'm not sure now what that would
look like. If someone had an example, I could easily be moved.


Here is what it currently looks like to assign values to indices in
xarray (adapted from a tutorial):

ds["empty"].loc[dict(lon=5, lat=6)] = 10

This could be changed to:

ds["empty"][lon=5, lat=6] = 10

This becomes even a bigger advantage if we include slicing, which I
think we should:

ds["empty"].loc[dict(lon=slice(1, 5), lat=slice(3, None))] = 10


But this looks unnecessarily complicated. Why can't xarray allow the
following:

    ds["empty"]["lon", 1:5, "lat", 3:] = 10

which looks very close to the proposed syntax below. Not that I'm
against the proposal but I think that any use case involving *only*
keyword arguments isn't a very strong one, because it can easily be
solved that way without a change to existing syntax. Only when
positional and keyword arguments are mixed, it becomes difficult to
distinguish (for both the reader and the method).
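
For illustration, a sketch of how a library could support that style today, since slices are already legal inside a tuple subscript (the `Loc` class is made up):

    class Loc:
        def __setitem__(self, key, value):
            names, slices = key[::2], key[1::2]
            print(dict(zip(names, slices)), "<-", value)

    Loc()["lon", 1:5, "lat", 3:] = 10
    # {'lon': slice(1, 5, None), 'lat': slice(3, None, None)} <- 10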



to

ds["empty"][lon=1:5, lat=6:] = 10



[Python-ideas] Re: New clause in FOR and WHILE instead of ELSE

2020-07-15 Thread Dominik Vilsmeier

But `finally` with a `for` loop is redundant since the code can be
placed just after the loop. For `try/except` it's a different situation
since the exception might bubble up, so "normal" code after the `try`
won't be reached.

Also `on_break` doesn't seem really important since that code can be
executed inside the loop right before the `break`.

We already have `else` for `on_finish` and I think when recalling the
analogy to exceptions it's not that confusing: a `try` block can be
exited either normally (because all code has been executed) or because
it raised an exception; here `else` means it exited normally. Similarly
a `for` loop can terminate either normally by executing all iterations
or because a `break` occurred; and similarly `else` means it terminated
normally.
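
A tiny demo of those semantics:

    for x in [1, 2, 3]:
        if x > 99:
            break
    else:
        print("terminated normally, no break")  # this runs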

On 15.07.20 08:47, Mathew Elman wrote:

But in `for...else` the `else` call isn't always called, so changing
`else` for `finally` doesn't make sense. What you're suggesting is
replacing`else` with `on_finish` and adding `finally` and`on_break`.

I agree that having `finally` could make the use cases of `else`
clearer, but I am not convinced renaming "else" to "on_finish" would
help the confusion for the 0 iteration case.

I think that since this suggestion doesn't help with the 0 iteration
case (my first idea here didn't either), it feels like added extra
compound statements need to be immediately intuitive to be worth
having - either because they read like a sentence or parallel existing
python e.g. `try-except-else-finally` or `if-elif-else` etc.


On Wed, 15 Jul 2020 at 06:47, Steve Barnes <gadgetst...@live.co.uk> wrote:

Can I suggest that for loops the `else` would be a lot clearer if
it was spelt `finally` as was done for PEP-0341 for try blocks and
that we might possibly need one or more `on_…` clauses such as
`on_break` and `on_finish` I think that this would be a lot clearer:

for i in range(N):
    if i > 3:
        break
on_break:  # Called if loop was broken
    print(i)
on_finish:  # Called if loop was not broken
    print("Loop Completed")
finally:  # Always called (replaces for…else)
    print("Loop Ended")

Which I think would be a lot easier for newcomers to learn than
try…for…else…except…else e.g.:

try:
    for i in range(N):
        if i > 3:
            break
        elif i % 2 == 0:
            raise ValueError("Odds Only")
        else:  # to if
            print(i)
    else:  # Else to loop
        print("Loop Completed")
except ValueError as err:
    print(err)
else:  # to try
    print("No Exception")
finally:
    print("Try Ended")

Where the multitude of elses makes my eyes cross.

Steve Barnes







[Python-ideas] Re: New clause in FOR and WHILE instead of ELSE

2020-07-14 Thread Dominik Vilsmeier

On 14.07.20 09:54, Mathew Elman wrote:


What about adding `except` to the compound loop statement?
That way in cases where there needs to be clarity you can raise a
specific exception rather than just `break`.
Keeping the logic of why you "break" the loop inside the loop and
would also allow multiple reasons for breaking from a for loop to
remain clear.

e.g.

for i in range(N):
    if i > 3:
        raise ValueError
except ValueError:
    print(i)  # >> 4
else:
    print("Loop not entered")



That can be done already today by putting the `for` loop in the `try`
body. Also here `else` should rather mean `did not raise` as for the
normal `try/except/else` usage (and similar to how `for/else` means `did
not break`). The more interesting part is to detect whether the loop did
some work at all (i.e. whether the iterable was empty). But also this
can be done with some small overhead:

    loop = Loop(iterable)
    for x in loop:
        pass
    if loop.empty:
        pass

The `Loop` class here wraps the sentinel logic required to detect if the
iterable was empty.
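
A minimal sketch of such a `Loop` wrapper:

    class Loop:
        def __init__(self, iterable):
            self._it = iter(iterable)
            self.empty = True

        def __iter__(self):
            for item in self._it:
                self.empty = False  # at least one item was produced
                yield item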



[Python-ideas] Re: Add builtin function for min(max())

2020-07-09 Thread Dominik Vilsmeier

On 09.07.20 21:04, Ethan Furman wrote:


On 07/03/2020 05:03 PM, Steven D'Aprano wrote:


    def clamp(value, lower, upper):
        """Clamp value to the closed interval lower...upper.

        The limits lower and upper can be set to None to
        mean -∞ and +∞ respectively.
        """
        if not (lower is None or upper is None):
            if lower > upper:
                raise ValueError('lower must be <= to upper')
        if lower == upper is not None:
            return lower
        if lower is not None and value < lower:
            value = lower
        elif upper is not None and value > upper:
            value = upper
        return value


I'm having a hard time understanding this line:

   if lower == upper is not None:

As near as I can tell, `upper is not None` will be either True or 
False, meaning the condition will only ever be True if `lower` is also 
either True or False, and since I would not expect `lower` to ever be 
True or False, I expect this condition to always fail.  Am I missing 
something?


It's comparison operator chaining
(https://docs.python.org/3/reference/expressions.html#comparisons),
i.e. shorthand notation for:

    if (lower == upper) and upper is not None:



[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-09 Thread Dominik Vilsmeier

On 09.07.20 14:25, Chris Angelico wrote:


On Thu, Jul 9, 2020 at 9:16 PM Steven D'Aprano wrote:

Unless I have missed any others, we've only seen three use-cases:

(a) The original post wanted a full sequence API for dictionaries, with
the ability to insert keys at a specific index, not just get the N-th
key. Not going to happen.

(b) You've suggested "get any item", but that's probably better written
as `next(mydict.items)`.

(c) And `random.choice(mydict.items())` which seems to be lacking any
*concrete* use-case -- under what circumstances would we want this and
care about it's performance *enough* to add this to builtin dict views?


Getting a random element from a dict (not an arbitrary one but a
random one) definitely does have a use-case. I've wanted it at times.


This doesn't seem to be an argument for modifying dict views since the
actual "issue" lies with `random.choice` and the way it can (or cannot)
interact with the containers its given. Even when implemented on dict
views, this still won't for work for sets for example. And it's
definitely reasonable to ask "pick a random element from this set" (in
terms or real world language). But from the perspective of the set API
there is no way to pick specific elements and hence it must either rely
on conversion to a sequence or rely on implementation details of the set
data structure. So if the use case is to get a random element from a
dict, the discussion should rather be centered around `random.choice`
and also respect other non-sequence containers.
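
The current workaround is an O(n) conversion, e.g.:

    import random

    d = {"a": 1, "b": 2, "c": 3}
    key = random.choice(list(d))  # materialize the keys, then pick uniformly
    value = d[key]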


[Python-ideas] Re: An alternative to using a clamp / clip / trim function

2020-07-08 Thread Dominik Vilsmeier

On 08.07.20 17:19, Jonathan Fine wrote:


Hi All

This is related to discussion
https://mail.python.org/archives/list/python-ideas@python.org/thread/KWAOQFSV3YJYQV2Y5JXGXFCXHJ3WFLRS/#ZT3OBOPNIMXQ2MU7N5RFBL5AJSYRZJ6Q

In Python, lists don't have a join method. Instead, it's strings that
have the join method. Hence we have:
    >>> ', '.join('abcde')
    'a, b, c, d, e'

The intermediate method we can save and store and use again.
    >>> joiner = ', '.join
    >>> joiner('fghijk')
    'f, g, h, i, j, k'

We can do something similar when clamping, clipping or trimming a
value. For example, suppose we want limits on the room temperature,
that the thermostat cannot override.
    >>> aircon_clipper = Clamper(50, 80)

    >>> thermostat_temp = 40
    >>> target_temp = aircon_clipper(thermostat_temp)
    >>> target_temp
    50


You could also use `functools.partial` for that purpose:

    aircon_clipper = partial(clamp, min=50, max=80)

So a (builtin) function serves this purpose as well.
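
For instance (a sketch, assuming a `clamp(value, lower, upper)` signature like the one discussed earlier in the thread):

    from functools import partial

    def clamp(value, lower=None, upper=None):
        if lower is not None and value < lower:
            return lower
        if upper is not None and value > upper:
            return upper
        return value

    aircon_clipper = partial(clamp, lower=50, upper=80)
    print(aircon_clipper(40))  # 50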


[Python-ideas] Re: multidimensional lists

2020-07-08 Thread Dominik Vilsmeier

On 08.07.20 15:09, Hans Ginzel wrote:


Why not to allow tuple as a list index?


>>> T = [[11, 12, 5, 2], [15, 6, 10], [10, 8, 12, 5], [12, 15, 8, 6]]
>>> print(T[1][2])
10
>>> print(T[1, 2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers or slices, not tuple


Numpy offers this convenience for multi-dimensional arrays. But in
Python the sub-lists don't need to have the same length, so for example

    T = [[1, 2, 3], [4], [5, 6, 7]]

and then `T[1, 2]` doesn't even exist. Also it could similarly mean "get
the second and third element from the list" which is what
`operator.itemgetter` does:

    >>> operator.itemgetter(1, 2)(T)
    ([15, 6, 10], [10, 8, 12, 5])


[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-07 Thread Dominik Vilsmeier

On 07.07.20 19:41, Stephen J. Turnbull wrote:


Dominik Vilsmeier writes:

  > Well, the point is that this "except comparisons" is not quite true:
  >
  >      >>> i = {'a': []}.items()
  >      >>> s = {('a', 1)}
  >      >>> i == s
  >      TypeError: unhashable type: 'list'
  >
  > If passed a set as `other` operand, dict_items seems to decide to
  > convert itself to a set, for no obvious reasons since, as you
  > mentioned, it does know how to compare itself to another view
  > containing non-hashable values:

The obvious reason is that they didn't want to implement the
comparison if you didn't have to, the dict.items() view is a Set, so
the obvious implementation is s == set(i).

  > So if you're dealing with items views and want to compare them to a set
  > representing dict items, then you need an extra `try/except` in order to
  > handle non-hashable values in the items view.

Sounds like you have a change to propose here, then.  Put the
try/except in the __eq__ for the items view class when comparing
against a set.  I would expect it to be accepted, as comparing items
views is pretty expensive so the slight additional overhead would
likely be acceptable, and if you get the exception, you know the
equality comparison against a set is false since a set cannot contain
that element, so this possibility can't affect worst-case performance
by much, if at all.

Exactly, this seems like it would come at a small cost, and preventing
this TypeError during equality testing seems definitely worth it.


  > Surely [dict.values equality comparison having object equality
  > semantics] must be a relic from pre-3.7 days where dicts were
  > unordered and hence order-based comparison wouldn't be possible
  > (though PEP 3106 describes an O(n*m) algorithm). However the
  > current behavior is unfortunate because it might trick users

You mean "users might trick themselves".  We provide documentation for
exactly this reason.

Correct, and the Python docs are very good. But given the amount of
StackOverflow questions which are well covered by the docs, it seems
that users often go by their intuition when they encounter new
situations. The dict_values __eq__ might be one such scenario. But I
understand that any other implementation has to make its own
assumptions, so the current situation is fine (it's probably a rare use
case anyway).


[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-07 Thread Dominik Vilsmeier

On 07.07.20 19:09, Christopher Barker wrote:


On Tue, Jul 7, 2020 at 6:56 AM Dominik Vilsmeier
<dominik.vilsme...@gmx.de> wrote:

Well, the point is that this "except comparisons" is not quite true:

 >>> i = {'a': []}.items()
 >>> s = {('a', 1)}
 >>> i == s
 TypeError: unhashable type: 'list'

If passed a set as `other` operand, dict_items seems to decide to
convert itself to a set, for no obvious reasons since, as you
mentioned,
it does know how to compare itself to another view containing
non-hashable values:

 >>> i == {'a': {}}.items()
 False

So if you're dealing with items views and want to compare them to
a set
representing dict items, then you need an extra `try/except` in
order to
handle non-hashable values in the items view. Not only does this
require
an extra precautionary step, it also seems strange given that in
Python
you can compare all sorts of objects without exceptions being
raised. I
can't think of any another built-in type that would raise an exception
on equality `==` comparison. dict_items seems to make an exception to
that rule.


I think this really is a bug (well, missing feature). It surely
*could* be implemented to work, but maybe not efficiently or easily,
so may well not be worth it -- is the use case of comparing a
dict_items with another dict_items really helpful?


I think it can be done both efficient and easy, since the current
implementation has chosen that converting the view to a set fulfills
these requirements. So all you'd have to do is to catch this TypeError
inside __eq__ and return False since a `set` can never contain
non-hashable elements. Note that it's the comparison with a `set` that
causes this TypeError, not the one between two dict views.



In fact, my first thought was that the way to do the comparison is to
convert to a dict, rather than a set, and then do the compare. And
then I realized that dict_items don't exist without a dict anyway, so
you really should be comparing the "host" dicts anyway. Which leaves
exactly no use cases for this operation.


I tested converting the `set` to an items view via `dict(a_set).items()`
and then use this for the __eq__ comparison but it is quite a bit slower
than comparing a view to the set directly (if it doesn't contain any
non-hashable values); after all it creates a new object including memory
allocation.



[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-07 Thread Dominik Vilsmeier

On 07.07.20 17:37, Inada Naoki wrote:


On Tue, Jul 7, 2020 at 10:52 PM Dominik Vilsmeier wrote:

Surely that must be a relic from pre-3.7 days where dicts were unordered
and hence order-based comparison wouldn't be possible (though PEP 3106
describes an O(n*m) algorithm). However the current behavior is
unfortunate because it might trick users into believing that this is a
meaningful comparison between distinct objects (given that it works with
`dict.keys` and `dict.items`) when it isn't.

So why not make dict_values a Sequence, providing __getitem__ and
additionally order-based __eq__ comparison?

It was rejected in this thread.
https://mail.python.org/archives/list/python-...@python.org/thread/R2MPDTTMJXAF54SICFSAWPPCCEWAJ7WF/#K3SYX4DER3WAOWGQ4SPKCKXSXLXTIVAQ

All right, I see that having __eq__ for dict_values is not for debate.
But what about the other idea, making dict_values a Sequence? It does
provide some useful features, like getting the first value of a dict or
`.count` values.
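
E.g., today both require going through an iterator or an O(n) copy:

    d = {"a": 1, "b": 2}
    first_value = next(iter(d.values()))  # 1
    count = list(d.values()).count(2)     # 1, via an O(n) copy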


[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-07 Thread Dominik Vilsmeier

On 05.07.20 16:56, Stephen J. Turnbull wrote:


Steven D'Aprano writes:

  > Regarding your observation that dict views behave poorly if they
  > have unhashable values, I agree, it is both odd and makes them less
  > useful. Possibly at some point between the PEP and the release of
  > the feature something changed, or perhaps it's just an oversight.

I'm not sure what you expect from views, though:

Python 3.8.3 (default, May 15 2020, 14:39:37)
>>> set([[1]])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
>>> {'a' : [1]}.keys() <= {'a' : [1], 'b' : 2}.keys()
True
>>> {'a' : [1]}.values() <= {'a' : [1], 'b' : 2}.values()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<=' not supported between instances of 'dict_values' and 'dict_values'
>>> {'a' : [1]}.items() <= {'a' : [1], 'b' : 2}.items()
True

So all of the above are consistent with the behavior of sets, except
that items views do some part of comparisons themselves to deal with
non-hashables which is an extension to set behavior. And values views
don't pretend to be sets.  The ValuesView ABC is not derived from Set,
presumably because dict.values returns something like a multiset.

Most set operations on key and item views seem to convert to set and
where appropriate return set (which makes sense, since returning a
view would require synthesizing a dict to be the view of!)  This means
you can't do set operations (except comparisons) on items views if any
values aren't hashable.


Well, the point is that this "except comparisons" is not quite true:

    >>> i = {'a': []}.items()
    >>> s = {('a', 1)}
    >>> i == s
    TypeError: unhashable type: 'list'

If passed a set as `other` operand, dict_items seems to decide to
convert itself to a set, for no obvious reasons since, as you mentioned,
it does know how to compare itself to another view containing
non-hashable values:

    >>> i == {'a': {}}.items()
    False

So if you're dealing with items views and want to compare them to a set
representing dict items, then you need an extra `try/except` in order to
handle non-hashable values in the items view. Not only does this require
an extra precautionary step, it also seems strange given that in Python
you can compare all sorts of objects without exceptions being raised. I
can't think of any another built-in type that would raise an exception
on equality `==` comparison. dict_items seems to make an exception to
that rule.



I'm not sure what I think about this:


>>> {'a' : 1, 'b' : 2}.values() == {'b' : 2, 'a' : 1}.values()
False

That does seem less than useful.  But I guess a multiset comparison
requires an auxiliary data structure that can be sorted or a
complicated, possibly O(n^2), comparison in place.


dict_values seems to rely on object.__eq__ since they always compare
unequal (except when it is the same object); this behavior is mentioned
by the docs:

> An equality comparison between one `dict.values()` view and another
will always return `False`. This also applies when comparing
`dict.values()` to itself.

Surely that must be a relic from pre-3.7 days where dicts were unordered
and hence order-based comparison wouldn't be possible (though PEP 3106
describes an O(n*m) algorithm). However the current behavior is
unfortunate because it might trick users into believing that this is a
meaningful comparison between distinct objects (given that it works with
`dict.keys` and `dict.items`) when it isn't.

So why not make dict_values a Sequence, providing __getitem__ and
additionally order-based __eq__ comparison?


[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-01 Thread Dominik Vilsmeier

On 01.07.20 13:32, Steven D'Aprano wrote:


On Wed, Jul 01, 2020 at 01:07:34PM +0200, Dominik Vilsmeier wrote:


What is the reason for `dict.items` to return a set-like object?

This is the third time I've linked to the PEP:

https://www.python.org/dev/peps/pep-3106/


Thanks for linking to the PEP (to be fair, the second link arrived after
my post).

However reading that PEP only increases the number of question marks.
The PEP speaks mostly about `dict.keys` and occasionally mentions that
similar things can be done for `dict.items`. For example:

> Because of the set behavior, it will be possible to check whether two
dicts have the same keys by simply testing:
> if a.keys() == b.keys(): ...

which makes perfectly sense. But then, immediately after that, it mentions:

> and similarly for .items().

So this means we can do `a.items() == b.items()`. But wait, if the
dicts' items are equal, aren't the dicts themselves equal then? So why
not just compare `a == b`? Why would I want to compare `a.items() ==
b.items()`?

Then, regarding the equality tests (`==`), the Specification section
mentions the following for `dict_keys`:

> To specify the semantics, we can specify x == y as:
> set(x) == set(y)   if both x and y are d_keys instances
> set(x) == y    if x is a d_keys instance
> x == set(y)    if y is a d_keys instance

This makes sense again. And then for `dict_items`:

    # As well as the set operations mentioned for d_keys above.
    # However the specifications suggested there will not work if
    # the values aren't hashable.  Fortunately, the operations can
    # still be implemented efficiently.  For example, this is how
    # intersection can be specified:
    # [...]
    # And here is equality:

    def __eq__(self, other):
        if isinstance(other, (set, frozenset, d_keys)):
            if len(self) != len(other):
                return False
            for item in other:
                if item not in self:
                    return False
            return True
        # [...] handling other cases

This is what I expected how `dict_items` would handle equality tests.
However it doesn't seem to do that:

    >>> self = {'a': []}.items()
    >>> other = {'b'}
    >>> self == other
    TypeError: unhashable type: 'list'
    >>> def __eq__(self, other): ...  # paste the function here
    >>> d_keys = type({}.keys())
    >>> __eq__(self, other)
    False
    >>> self == self
    True

I don't know if it was always implemented like that but at least the PEP
mentions this caveat and I find it surprising as well.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NSTPE3SET5YVB7IZP2SUYXNNMCDEZDWK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Access (ordered) dict by index; insert slice

2020-07-01 Thread Dominik Vilsmeier

On 30.06.20 05:08, Steven D'Aprano wrote:


On Tue, Jun 30, 2020 at 11:10:20AM +0900, Inada Naoki wrote:

On Mon, Jun 29, 2020 at 9:12 PM Hans Ginzel  wrote:

What are the reasons why the object dict.items() is not subscriptable,
i.e. dict.items()[0]?

Because dict is optimized for random access by key and iteration, but not for
random access by index.

But we're not talking about *dict*, we're talking about dict.items which
returns a set-like object:

 py> from collections.abc import Set
 py> isinstance({}.items(), Set)
 True

So dict.items isn't subscriptable because it's an unordered set, not a
sequence.


What is the reason for `dict.items` to return a set-like object? The
values can be non-hashable and in this case the behavior can be surprising:

    >>> {'a': []}.items() & {'b'}
    TypeError: unhashable type: 'list'

Furthermore [the
documentation](https://docs.python.org/3/library/stdtypes.html#dict-views)
states the following:

> If all values are hashable, so that `(key, value)` pairs are unique
and hashable, then the items view is also set-like.

This sounds like the return type of `dict.items` depended on the actual
values contained (which again would be surprising) but actually it
doesn't seem to be the case:

    >>> from collections.abc import Set
    >>> isinstance({'a': []}.items(), Set)
    True

`dict.items` could provide all of its "standard" behavior (like
membership testing, reversing, etc) without being set-like.

The fact that `==` with `dict.items` raises TypeError for non-hashable
values is also a little surprising since it supports membership testing
and hence could check `len(self) == len(other) and all(x in self for x
in other)` (though that drops the type comparison, but if you cannot
have a set, why would you compare them anyway):

    >>> self = {'a': []}.items()
    >>> other = {('a', 1)}
    >>> self == other
    TypeError: unhashable type: 'list'
    >>> len(self) == len(other) and all(x in self for x in other)
    False
    >>> other = [('a', [])]
    >>> len(self) == len(other) and all(x in self for x in other)
    True

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/AQDMOL5JRYEQMGQSYKABAV2VW6U2M6N2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: String module name

2020-06-16 Thread Dominik Vilsmeier

On 16.06.20 10:00, redrad...@gmail.com wrote:


You cannot trust PyPi either ...

I think the user should decide whether to allow code from an arbitrary URL to
access the filesystem, network or anything else, as `wasmtime` and `deno` do


If you want to do this, you can still download the code and use
`importlib` to import it.
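
For example (a rough sketch; the URL is made up):

    import importlib.util
    import urllib.request

    # Fetch a single-module file; no dependency resolution happens here.
    urllib.request.urlretrieve('https://example.com/some_module.py',
                               'some_module.py')

    # Import it from the downloaded file.
    spec = importlib.util.spec_from_file_location('some_module',
                                                  'some_module.py')
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)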

But usually you want to import a whole package (or parts of it), not a
stand-alone module. And this package might have dependencies on other
packages. And these dependencies might even conflict with the
dependencies of other packages that you are using. So this whole process
is fairly complex and is better resolved before application startup.
There exists a variety of tools that deal with package management (e.g.
pip, poetry, ...).
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/IQYYKZUQZS75IQYJIITP5R3PF3W7PP2X/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: approximate equality operator ("PEP 485 follow-up")

2020-06-16 Thread Dominik Vilsmeier

On 14.06.20 17:52, David Mertz wrote:


On Sun, Jun 14, 2020, 10:22 AM Greg Ewing <greg.ew...@canterbury.ac.nz> wrote:

On 15/06/20 12:39 am, Sebastian M. Ernst wrote:
> It's such a common problem when dealing with floating point numbers

Is it really? I've done quite a lot of work with floating
point numbers, and I've very rarely needed to compare two
of them for almost-equality. When I do, I always want to
be in control of the tolerance rather than have a default
tolerance provided for me.


I've had occasion to use math.isclose(), np.isclose(), and
np.allclose() quite often. And most of the time, the default
tolerances are good enough for my purpose. Note that NumPy and math
use different algorithms to define closeness, moreover.


I never use `math.isclose` or `np.isclose` without specifying
tolerances, even if they happen to be the defaults (which is rare).
There is no such thing as "x and y are approximately equal"; the
question is always within what bounds. And this question must be
answered by the programmer and the answer should be stated explicitly.
Obviously these tolerances are application dependent, be it measurement
errors, limited precision of sensors, numerical errors, etc.
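
For illustration, this is the style I'm arguing for (the numbers are
of course made up for the example):

    import math

    # Sensor with roughly 1e-3 absolute precision, so that's the bound
    # we actually care about -- stated explicitly, not inherited from
    # a default.
    measured, expected = 0.1001, 0.1
    assert math.isclose(measured, expected, rel_tol=0.0, abs_tol=1e-3)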

What makes the default values so special anyway? If I were to design
such a function, I wouldn't provide any defaults at all. Yes, I read
PEP-485, but I'm not convinced. The paragraph [Relative Tolerance
Default](https://www.python.org/dev/peps/pep-0485/#relative-tolerance-default)
starts with:

> The relative tolerance required for two values to be considered
"close" is entirely use-case dependent.

That doesn't call for a default value.

[`np.isclose`](https://numpy.org/doc/stable/reference/generated/numpy.isclose.html)
is even more extreme: They also specify (non-zero) defaults and because
of that they need to display a *warning* at their docs which reads:

> The default atol is not appropriate for comparing numbers that are
much smaller than one (see Notes).

Then in the notes there is:

> atol should be carefully selected for the use case at hand.

Sounds like it would've been more appropriate to not specify a default
in the first place.

Sure, some people might complain, who want a quick way to determine if
two numbers are approximately equal, but as mentioned above, this
question cannot be answered without specifying the bounds. All that the
default tolerances do is prevent people from thinking about the
appropriate values for their specific application.

Since an operator doesn't allow specifying any tolerances, it's not a
suitable replacement for the `isclose` functions.

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PEDZUPKJYKI5ZNLMEGJKDT7YGBZO4B36/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: the 'z' string escape

2020-06-16 Thread Dominik Vilsmeier

On 16.06.20 08:40, Paul Sokolovsky wrote:


Hello,

On Tue, 16 Jun 2020 14:37:39 +0900
"Stephen J. Turnbull"  wrote:


Soni L. writes:

  > so I propose a \z string escape which lets me write the above as
  > shown below:
  >
  >      """switches to toml config format. the old
  > 'repos' \z table is preserved as 'repos_old'"""

We already have that, if you don't care about left-alignment:


"""123456789\

... abcdefghi"""
'123456789abcdefghi'



And if you care about left-alignment, but don't care about extra
function call, we have
https://docs.python.org/3/library/textwrap.html#textwrap.dedent

That said, Java (version 13) puts Python to shame with its multi-line
strings, which work "as expected" re: leading indentation out of the
box: https://openjdk.java.net/jeps/355 .

So, I wouldn't "boo, hiss" someone proposing something like:

   s = _"""
   Just imagine,
  this works
   like you would expect!
   """

But my response would be my usual - what Python actually needs is
macro/AST preprocessing capability, e.g. support for handling
'"""' strings in user-defined manner. But we can ship some
predefined macros, sure. E.g. '_' as a string prefix (like above) would
run a string thru (analog of) textwrap.dedent().


Alternatively the compiler could run such pure functions if their
arguments consist solely of literals (i.e. perform advanced constant
folding).

This means the following

    msg = textwrap.dedent('''abc
    def
    ghi''')

would be converted already at compile-time. Surely it would be
convenient to maintain a list of compatible functions, limited to the
builtins and stdlib.

Another common use case is `str.split`, e.g. `colors = "red green
blue".split()`. This could also be converted at compile-time.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SYYVOQG6OZT3VG45RCXIRJPMCM5V6SEF/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Delayed computation

2020-05-29 Thread Dominik Vilsmeier

On 29.05.20 20:38, David Mertz wrote:


On Fri, May 29, 2020 at 1:56 PM Rhodri James <rho...@kynesim.co.uk> wrote:

Presumably "delayed" is something that would be automatically
applied to
the actual parameter given, otherwise your call graphs might or might
not actually be call graphs depending on how the function was called.
What happens if I call "foo(y=0)" for instance?


I am slightly hijacking the thread.  I think the "solution" to the
narrow "problem" of mutable default arguments is not at all worth
having.  So yes, if that was the only, or even main, purpose of a
hypothetical 'delayed' keyword and 'DelayedType', it would absolutely
not be worthwhile.  It would just happen to solve that problem as a
side effect.

Where I think it is valuable is the idea of letting all the normal
operations work on EITHER a DelayedType or whatever type the operation
would otherwise operate on.  So no, `foo(y=0)` would pass in a
concrete type and do greedy calculations, nothing delayed, no
in-memory call graph (other than whatever is implicit in the bytecode).


I'm still struggling to imagine a real use case which can't already be
solved by generators. Usually the purpose of such computation graphs is
to execute on some specialized hardware or because you want to backtrack
through the graph (e.g. Tensorflow, PyTorch, etc). Dask seems to be
similar in a sense that the user can choose different execution models
for the graph.

With generators you also don't have the problem of "concretizing" the
result since any function that consumes an iterable naturally does this.
If you really want to delay such a computation it's easy to write a
custom type or generator function to do so and then use `next` (or even
`concretize = next` beforehand).
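
A minimal sketch of that pattern:

    def delayed(func, *args, **kwargs):
        # Generator that performs the computation only when asked to.
        yield func(*args, **kwargs)

    g = delayed(sum, [1, 2, 3])  # nothing is computed yet
    concretize = next
    print(concretize(g))         # 6, computed here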

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JZXWI2FWLBNRWA7RTMM24IFCTA3DFHE3/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Optional keyword arguments

2020-05-29 Thread Dominik Vilsmeier

On 29.05.20 15:09, Chris Angelico wrote:


On Fri, May 29, 2020 at 10:51 PM Steven D'Aprano  wrote:

On Thu, May 28, 2020 at 08:04:07PM +1000, Chris Angelico wrote:

If it's a
language feature, then the name 'x' must be in the state of "local
variable without a value".

Oh ho, I see what you are doing now :-)

I'm going to stick with my argument that Python variables have two
states: bound or unbound. But you want to talk about the *meta-state* of
what scope they are in:

 LEGB = Local, Enclosing (nonlocal), Global, Builtin

There's at least one other case not captured in that acronym, Class
scope. There may be other odd corner cases.

In any case, if you want to distinguish between "unbound locals" and
"unbound globals" and even "unbound builtins", then I acknowledge that
these are genuine, and important, distinctions to make, with real
semantic differences in Python.

The reason locals are special is that you can't have a module-level
name without a value, because it's exactly the same as simply not
having one; but you CAN have a local name that must be exactly that
local, and you can't look up a module or builtin name, but it still
doesn't have a value. I believe local scope is the only one that
behaves this way.


Indeed locals are special, but why was it designed this way? Why not
resolve such an unbound local name in the enclosing scopes?

It seems that there is no way to modify locals once the function is
compiled (this is probably due to the fact that locals are optimized as
a static array?). For example:

    >>> x = 1
    >>> def foo():
    ... exec('x = 2')
    ... print(x)
    ...
    >>> foo()
    1

However in Python 2.7 this is possible:

    >>> x = 1
    >>> def foo():
    ... exec('x = 2')
    ... print(x)
    ...
    >>> foo()
    2
    >>> x
    1
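
For completeness, in Python 3 one can still get at the assignment by
passing an explicit namespace to `exec` (a workaround, not a change to
how locals behave):

    >>> def foo():
    ...     ns = {}
    ...     exec('x = 2', globals(), ns)
    ...     print(ns['x'])
    ...
    >>> foo()
    2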
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NH4LP3YI5YDK7VNZKRZKZ4GOFFV7Z4XC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Optional keyword arguments

2020-05-28 Thread Dominik Vilsmeier

On 28.05.20 17:44, Christopher Barker wrote:


On Thu, May 28, 2020 at 3:50 AM Alex Hall <alex.moj...@gmail.com> wrote:

On Thu, May 28, 2020 at 12:38 PM Greg Ewing
<greg.ew...@canterbury.ac.nz> wrote:


But I'm having trouble thinking of one. I can't remember ever
writing a function with a default argument value that *has* to
be mutable and *has* to have a new one created on each call
*unless* the caller provided one.


Actually, we need to one further: a default argument value that *has* to
be mutable and *has* to have a new one created on each call
*unless* the caller provided one ...

and *has* to treat None as valid value.


That's the scenario where you'd need to create a sentinel object to take
the role of None. However late binding of defaults won't save you from this.
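
A minimal sketch of that sentinel pattern (names made up):

    _MISSING = object()  # unique module-level sentinel

    def fetch(key, default=_MISSING):
        # None is a perfectly valid default for the caller to supply,
        # so it can't double as the "no argument given" marker.
        data = {'a': 1}
        if key in data:
            return data[key]
        if default is _MISSING:
            raise KeyError(key)
        return default

    fetch('b', None)   # returns None
    # fetch('b')       # would raise KeyError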

The biggest advantage, as far as I understood, is that you can specify a
default (expression) as part of the function header and hence provide a
meaningful example value to the users rather than just None.

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/P3XDOGYMUJQQWKNKL5O3IZLRFGJYMLNI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Optional keyword arguments

2020-05-26 Thread Dominik Vilsmeier

On 26.05.20 14:10, David Mertz wrote:


All of those uses, including those where you say otherwise, treat None
as a sentinel. In the iter() case, the optional seconds argument is
*called* 'sentinel'. Guido recently mentioned that he had forgotten
the two argument form of iter(), which is indeed funny... But useful.


Maybe we have a different understanding of "sentinel" in this context. I
understand it as an auxiliary object that is used to detect whether the
user has supplied an argument for a parameter or not. So if the set of
possible (meaningful) arguments is "A" then the sentinel must not be an
element of A. So in cases where None has meaning as an argument it can't
act as a sentinel. `iter` is probably implemented via varargs but if it
was designed to take a `sentinel=` keyword parameter then you'd need a
dedicated sentinel object since the user can supply *any* object as the
(user-defined) sentinel, including None:

    >>> list(iter([1, 2, None, 4, 5].pop, None))
    [5, 4]


Well, ok functools.reduce() really does make its own sentinel in
order to show None as a "plain value". So I'll grant that one case is
slightly helped by a hypothetical 'undef'.

The NumPy, deque, and lru_cache cases are all ones where None is a
perfect sentinel and the hypothetical 'undef' syntax would have zero
value.


For both `deque` and `lru_cache` None is a sensible argument so it can't
act as a sentinel. It just happens that these two cases don't need to
check if an argument was supplied or not, so they don't need a sentinel.
For the Numpy cases, `np.sum` and `np.diff`, None does have a meaning
from user perspective, so they need a dedicated sentinel (which is
`np._NoValue`). If `keepdims` is not supplied, it won't be passed on to
sub-classes; if it is set to None then the sub-class receives
`keepdims=None` as well:

    >>> class Test(np.ndarray):
    ... def sum(self, **kwargs):
    ... return kwargs
    ...
    >>> a = Test(0)
    >>> np.sum(a)
    {'axis': None, 'out': None}
    >>> np.sum(a, keepdims=None)
    {'axis': None, 'out': None, 'keepdims': None}

For `np.diff`, if no argument is provided for `append` (or `prepend`)
then nothing is appended (prepended), otherwise the supplied value is
used (including None):

    >>> np.diff([1, 2])
    array([1])
    >>> np.diff([1, 2], append=None)
    TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'

For `np.concatenate` None is a meaningful argument to `axis` since it
will flatten the arrays before concatenation.



I was wondering if anyone would mention Pandas, which is great, but in
many ways an abuse of Pythonic programming. There, None in an
initializing collection (often) gets converted to NaN, both of which
mean "missing", which is something different. This is kind of an abuse
of both None and NaN... which they know, and introduced an
experimental pd.NA for exactly that reason... Unfortunately, so far,
actually using pd.NA is cumbersome, but hopefully that gets better in
the next version.

I wouldn't say it's an abuse, it's an interpretation of these values.
Using NaN has the clear advantage that it fits into a float array so
it's memory efficient.


Within actual Pandas and function parameters, None is always a sentinel.

On Tue, May 26, 2020, 4:48 AM Dominik Vilsmeier
<dominik.vilsme...@gmx.de> wrote:

On 26.05.20 06:03, David Mertz wrote:


On Mon, May 25, 2020, 11:56 PM Christopher Barker

well, yes and no. this conversation was in the context of
"None" works fine most of the time.


How many functions take None as a non-sentinel value?! How many
of that tiny numbers do so only because they are poorly designed.

None already is an excellent sentinel. We really don't need
others. In the rare case where we really need to distinguish None
from "some other sentinel" we should create our own special one.

The only functions I can think of where None is appropriately
non-sentinel are print(), id(), type(), and maybe a couple other
oddball special ones.

Seriously, can you name a function from the standard library or
another popular library where None doesn't have a sentinel role
as a function argument (default or not)?


* From the builtins there is `iter` which accepts a sentinel as
second argument (including None).
* `dataclasses.field` can receive `default=None` so it needs a
sentinel.
* `functools.reduce` accepts None for its `initial` parameter
(https://github.com/python/cpython/blob/3.8/Lib/functools.py#L232).
* There is also

[`sched.scheduler.enterabs`](https://github.com/python/cpython/blob/v3.8.3/Lib/sched.py#L65)
where `kwargs=None` will be passed on to the underlying `Event`.

For the following ones None could be a sentinel but it's still a
valid (meaningful) argument (different from the default): [...]

[Python-ideas] Re: Optional keyword arguments

2020-05-26 Thread Dominik Vilsmeier

On 26.05.20 06:03, David Mertz wrote:


On Mon, May 25, 2020, 11:56 PM Christopher Barker

well, yes and no. this conversation was in the context of "None"
works fine most of the time.


How many functions take None as a non-sentinel value?! How many of
that tiny numbers do so only because they are poorly designed.

None already is an excellent sentinel. We really don't need others. In
the rare case where we really need to distinguish None from "some
other sentinel" we should create our own special one.

The only functions I can think of where None is appropriately
non-sentinel are print(), id(), type(), and maybe a couple other
oddball special ones.

Seriously, can you name a function from the standard library or
another popular library where None doesn't have a sentinel role as a
function argument (default or not)?


* From the builtins there is `iter` which accepts a sentinel as second
argument (including None).
* `dataclasses.field` can receive `default=None` so it needs a sentinel.
* `functools.reduce` accepts None for its `initial` parameter
(https://github.com/python/cpython/blob/3.8/Lib/functools.py#L232).
* There is also
[`sched.scheduler.enterabs`](https://github.com/python/cpython/blob/v3.8.3/Lib/sched.py#L65)
where `kwargs=None` will be passed on to the underlying `Event`.

For the following ones None could be a sentinel but it's still a valid
(meaningful) argument (different from the default):

* `functools.lru_cache` -- `maxsize=None` means no bounds for the cache
(default is 128).
* `collections.deque` -- `maxlen=None` means no bounds for the deque
(though this is the default).

Other example functions from Numpy:

*
[`numpy.concatenate`](https://numpy.org/doc/1.18/reference/generated/numpy.concatenate.html)
-- here `axis=None` means to flatten the arrays before concatenation
(the default is `axis=0`).
* Any function performing a reduction, e.g.
[`np.sum`](https://numpy.org/doc/1.18/reference/generated/numpy.sum.html)
-- here if `keepdims=` is provided (including None) then it will be passed
to the `sum` method of ndarray-sub-classes, otherwise not.
*
[`np.diff`](https://numpy.org/doc/1.18/reference/generated/numpy.diff.html)
supports prepending / appending values prior to the computation,
including None (though that application is probably rare).

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/U4U7X36COLQ776LRB4O6O4BEXDXFWHJK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Optional keyword arguments

2020-05-25 Thread Dominik Vilsmeier

On 25.05.20 18:29, Chris Angelico wrote:


On Tue, May 26, 2020 at 2:24 AM Dominik Vilsmeier
 wrote:

On 25.05.20 17:29, Ricky Teachey wrote:


On Mon, May 25, 2020, 6:49 AM Rob Cliffe via Python-ideas 
 wrote:

  (Possibly heretical) Thought:
ISTM that when the decision was made that arg default values should be evaluated
 once, at function definition time,
rather than
 every time the function is called and the default needs to be supplied
that that was the *wrong* decision.
There may have been what seemed good reasons for it at the time (can anyone 
point me
to any relevant discussions, or is this too far back in the Python primeval 
soup?).
But it is a constant surprise to newbies (and sometimes not-so-newbies).
As is attested to by the number of web pages on this topic.  (Many of them 
defend
the status quo and explain that it's really quite logical - but why does the 
status quo
*need* to be defended quite so vigorously?)



First of all: supplying a default object one time and having it start fresh at 
every call would require copying the object. But it is not clear what kind of 
copying of these default values should be done. The language doesn't inherently 
know how to arbitrarily make copies of every object; decisions have to be made 
to define what copying the object would MEAN in different contexts.

It wouldn't copy the provided default, it would just reevaluate the expression. 
Python already has a way of deferring evaluation: generator expressions:

 >>> x = 1
 >>> g = (x for __ in range(2))
 >>> next(g)
 1
 >>> x = 2
 >>> next(g)
 2

It's like using a generator expression as the default value and then if the 
argument is not provided Python would use `next(gen)` instead of the `gen` 
object itself to fill the missing value. E.g.:

    def foo(x = ([] for __ in it.count())):  # if `x` is not provided, use `next` on that generator
        pass

Doing this today would use the generator itself to fill a missing `x`, so this 
doesn't buy anything without changing the language.


Well if you want to define the semantics that way, there's a way
cleaner form. Just talk about a lambda function:

def foo(x = lambda: []):
 pass

and then the function would be called and its return value assigned to
x, if the parameter isn't given.

Indeed, the above example originated from the idea of treating generator
expressions as default values in a special way, namely such that if the
corresponding parameter receives no argument then `next(gen)` would be
used instead of the `gen` object itself to supply a value (it would be a
breaking change but how many functions use generator expressions as
defaults?). But then the construct `([] for __ in it.count())` is worse
than `if x is None:` so there's no point in doing that.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LTR754HLGKNXUX7LWH4DLW43QJE6Z75U/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Optional keyword arguments

2020-05-25 Thread Dominik Vilsmeier

On 25.05.20 17:29, Ricky Teachey wrote:



On Mon, May 25, 2020, 6:49 AM Rob Cliffe via Python-ideas
mailto:python-ideas@python.org>> wrote:

 (Possibly heretical) Thought:
ISTM that when the decision was made that arg default values
should be evaluated
        once, at function definition time,
rather than
        every time the function is called and the default needs to
be supplied
that that was the *wrong* decision.
There may have been what seemed good reasons for it at the time
(can anyone point me
to any relevant discussions, or is this too far back in the Python
primeval soup?).
But it is a constant surprise to newbies (and sometimes
not-so-newbies).
As is attested to by the number of web pages on this topic.  (Many
of them defend
the status quo and explain that it's really quite logical - but
why does the status quo
*need* to be defended quite so vigorously?)



First of all: supplying a default object one time and having it
start fresh at every call would require copying the object. But it
is not clear what kind of copying of these default values should be
done. The language doesn't inherently know how to arbitrarily make
copies of every object; decisions have to be made to define what
copying the object would MEAN in different contexts.


It wouldn't copy the provided default, it would just reevaluate the
expression. Python already has a way of deferring evaluation: generator
expressions:

    >>> x = 1
    >>> g = (x for __ in range(2))
    >>> next(g)
    1
    >>> x = 2
    >>> next(g)
    2

It's like using a generator expression as the default value and then if
the argument is not provided Python would use `next(gen)` instead of the
`gen` object itself to fill the missing value. E.g.:

    def foo(x = ([] for __ in it.count())):  # if `x` is not provided, use `next` on that generator
        pass

Doing this today would use the generator itself to fill a missing `x`,
so this doesn't buy anything without changing the language.

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/YFKI6SMQGOEI334GP5IYLRIQVZ7NNT4T/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Optional keyword arguments

2020-05-25 Thread Dominik Vilsmeier

On 25.05.20 03:03, Rob Cliffe via Python-ideas wrote:



On 24/05/2020 21:03, Dominik Vilsmeier wrote:


On 24.05.20 18:34, Alex Hall wrote:



OK, let's forget the colon. The point is just to have some kind of
'modifier' on the default value to say 'this is evaluated on each
function call', while still having something that looks like
`arg=`. Maybe something like:

     def func(options=from {}):


It looks like the most common use case for this is to deal with
mutable defaults, so what is needed is some way to specify a default
factory, similar to `collections.defaultdict(list)` or
`dataclasses.field(default_factory=list)`. This can be handled by a
decorator, e.g. by manually supplying the factories or perhaps
inferring them from type annotations:

    @supply_defaults
    def foo(x: list = None, y: dict = None):
        print(x, y)  # [], {}


    @supply_defaults(x=list, y=dict)
    def bar(x=None, y=None):
        print(x, y)  # [], {}



That's very clever, but if you compare it with the status quo:

    def bar(x=None, y=None):
        if x is None: x = []
        if y is None: y = {}

it doesn't save a lot of typing and will be far more obscure to newbies
who may not know about decorators.


Actually it was intended to use type annotations a la PEP 585 (using
builtin types directly) and hence not requiring explicit specification
of the factory:

    @supply_defaults
    def bar(x: list = None, y: dict = None):
        pass

This also has the advantage that the types are visible in the function
signature. Sure this works only in a limited number of cases, but the
case of mutable defaults seems to be quite prominent, and this solves it
at little cost.


No, forget fudges.
I think what is needed is to take the bull by the horns and add some
*new syntax*
that says "this default value should be (re)calculated every time it
is needed".
Personally I don't think the walrus operator is too bad:
    def bar(x:=[], y:={}):


What about using `~` instead of `:=`? As a horizontal symbol it has some
similarity to `=`, and usually "~" denotes proportionality, which also
connects to the use case. For proportionality "x ~ y"
means there's a non-zero constant "k" such that "x = k*y" and in the
case of defaults it would mean, there's a non-trivial step such that `x
= step(y)` (where `step=eval` roughly).

    def bar(x=1, y~[], z ~ {}):

It looks better with spaces around "~" but that's probably a matter of
being used to it.

A disadvantage is that `~` is already a unary operator, so one could do
this: `def foo(x~~y)`. But how often does this occur anyway?

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/O644JM442V5K4J5ZTJPRFTQ6YY4XMI74/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Optional keyword arguments

2020-05-24 Thread Dominik Vilsmeier

On 24.05.20 19:38, David Mertz wrote:


As syntax, I presume this would be something like:

output = []
for x in data:
    a = delayed inc(x)
    b = delayed double(x)
    c = delayed add(a, b)
    output.append(c)

total = sum(output)  # concrete answer here.

Obviously the simple example of adding scalars isn't worth the delay
thing.  But if those were expensive operations that built up a call
graph, it could be useful laziness.


Do you have an example which can't be solved by using generator
expressions and itertools? As far as I understand the Dask docs the
purpose of this is to execute in parallel which wouldn't be the case for
pure Python I suppose? The above example can be written as:

    a = (inc(x) for x in data)
    b = (double(x) for x in data)
    c = (add(x, y) for x, y in zip(a, b))
    total = sum(c)

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/4556SWPB4UNBZBXD4LM57UCA7ESVENVM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Optional keyword arguments

2020-05-24 Thread Dominik Vilsmeier

On 24.05.20 18:34, Alex Hall wrote:



OK, let's forget the colon. The point is just to have some kind of 
'modifier' on the default value to say 'this is evaluated on each 
function call', while still having something that looks like 
`arg=`. Maybe something like:


     def func(options=from {}):

It looks like the most common use case for this is to deal with mutable 
defaults, so what is needed is some way to specify a default factory, 
similar to `collections.defaultdict(list)` or 
`dataclasses.field(default_factory=list)`. This can be handled by a 
decorator, e.g. by manually supplying the factories or perhaps inferring 
them from type annotations:


    @supply_defaults
    def foo(x: list = None, y: dict = None):
    print(x, y)  # [], {}


    @supply_defaults(x=list, y=dict)
    def bar(x=None, y=None):
    print(x, y)  # [], {}


This doesn't require any change to the syntax and should serve most 
purposes. A rough implementation of such a decorator:



    import functools
    import inspect


    def supply_defaults(*args, **defaults):
        def decorator(func):
            signature = inspect.signature(func)
            defaults.update(
                (name, param.annotation)
                for name, param in signature.parameters.items()
                if param.default is None and param.annotation is not param.empty
            )

            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                bound = signature.bind(*args, **kwargs)
                kwargs.update(
                    (name, defaults[name]())
                    for name in defaults.keys() - bound.arguments.keys()
                )
                return func(*args, **kwargs)

            return wrapper

        if args:
            return decorator(args[0])
        return decorator

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/KHFG4FKL6XZHIPRYDQE4W6D4OU6ZPLNA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Equality between some of the indexed collections

2020-05-09 Thread Dominik Vilsmeier


On 09.05.20 22:16, Andrew Barnert wrote:

On May 9, 2020, at 02:58, Dominik Vilsmeier  wrote:


Initially I assumed that the reason for this new functionality was
concerned with cases where the types of two objects are not precisely
known and hence instead of converting them to a common type such as
list, a direct elementwise comparison is preferable (that's probably
uncommon though). Instead in the case where two objects are known to
have different types but nevertheless need to be compared
element-by-element, the performance argument makes sense of course.

So as a practical step forward, what about providing a wrapper type
which performs all operations elementwise on the operands. So for example:

    if all(elementwise(chars) == string):
        ...

Here the `elementwise(chars) == string` part returns a generator which
performs the `==` comparison element-by-element.

This doesn't perform any length checks yet, so as a bonus one could add
an `all` property:

    if elementwise(chars).all == string:
        ...

There’s an obvious use for the .all, but do you ever have a use for the 
elementwise itself? When do you need to iterate all the individual comparisons? 
(In numpy, an array of bools has all kinds of uses, starting with indexing or 
selecting with it, but I don’t think any of them are doable here.)

I probably took too much inspiration from Numpy :-) Also I thought it
would nicely fit with the builtin `all` and `any`, but you are right,
there's probably not much use for the elementwise iterator itself. So
one could use `elementwise` as a namespace for `elementwise.all(chars)
== string` and `elementwise.any(chars) == string` which automatically
reduce the elementwise comparisons and the former also performs a length
check prior to that. This would still leave the option of having
`elementwise(x) == y` return an iterator without reducing (if desired).

And obviously this would be a lot simpler if it was just the all object rather 
than the elementwise object—and even a little simpler to use:

    element_compare(chars) == string

(In fact, I think someone submitted effectively that under a different name for 
more-itertools and it was rejected because it seemed really useful but 
more-itertools didn’t seem like the right place for it. I have a similar 
“lexicompare” in my toolbox, but it has extra options that YAGNI. Anyway, even 
if I’m remembering right, you probably don’t need to dig up the more-itertools 
PR because it’s easy enough to redo from scratch.)


___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2FSDNFSNBPKP6Y67CRU7W46JO3TUNS74/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Equality between some of the indexed collections

2020-05-09 Thread Dominik Vilsmeier

On 09.05.20 14:16, Dominik Vilsmeier wrote:


On 09.05.20 12:18, Alex Hall wrote:


On Sat, May 9, 2020 at 11:57 AM Dominik Vilsmeier
<dominik.vilsme...@gmx.de> wrote:

So as a practical step forward, what about providing a wrapper type
which performs all operations elementwise on the operands. So for
example:

    if all(elementwise(chars) == string):
        ...

Here the `elementwise(chars) == string` part returns a generator which
performs the `==` comparison element-by-element.


Now `==` has returned an object that's always truthy, which is pretty
dangerous.


That can be resolved by returning a custom generator type which
implements `def __bool__(self): raise TypeError('missing r.h.s.
operand')`.


After reading this again, I realized the error message is nonsensical in
this context. It should rather be something like: `TypeError('The truth
value of an elementwise comparison is ambiguous')` (again taking some
inspiration from Numpy).
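
Putting both together, a sketch of such a wrapper with the corrected
message (the `elementwise_result` name is made up):

    class elementwise_result:
        # Iterator wrapper whose truth value is an error, so that a
        # bare `if elementwise(a) == b:` fails loudly instead of being
        # always truthy.
        def __init__(self, gen):
            self._gen = gen

        def __iter__(self):
            return self._gen

        def __bool__(self):
            raise TypeError('The truth value of an elementwise '
                            'comparison is ambiguous')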
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/X2N77J2I5KGNADQJ7GKLL3Z6NJ3RKGPC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Equality between some of the indexed collections

2020-05-09 Thread Dominik Vilsmeier

On 09.05.20 12:18, Alex Hall wrote:


On Sat, May 9, 2020 at 11:57 AM Dominik Vilsmeier
<dominik.vilsme...@gmx.de> wrote:

So as a practical step forward, what about providing a wrapper type
which performs all operations elementwise on the operands. So for
example:

    if all(elementwise(chars) == string):
        ...

Here the `elementwise(chars) == string` part returns a generator which
performs the `==` comparison element-by-element.


Now `==` has returned an object that's always truthy, which is pretty
dangerous.


That can be resolved by returning a custom generator type which
implements `def __bool__(self): raise TypeError('missing r.h.s. operand')`.



This doesn't perform any length checks yet, so as a bonus one could add
an `all` property:

    if elementwise(chars).all == string:
        ...


This is now basically numpy.

```
In[14]: eq = numpy.array([1, 2, 3]) == [1, 2, 4]
In[15]: eq
Out[15]: array([ True,  True, False])
In[16]: eq.all()
Out[16]: False
In[17]: eq.any()
Out[17]: True
In[18]: bool(eq)
Traceback (most recent call last):
...
ValueError: The truth value of an array with more than one element is
ambiguous. Use a.any() or a.all()
```

I've used number instead of strings because numpy treats strings as
units instead of iterables for this kind of purpose, so you'd have to
do some extra wrapping in lists to explicitly ask for character
comparisons.



Actually I took some inspiration from Numpy but the advantage is of
course not having to install Numpy. The thus provided functionality is
only a very small subset of what Numpy provides.

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BDUA3W47HMXVHWPI5XTFUP2JYNBR5M5J/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Equality between some of the indexed collections

2020-05-09 Thread Dominik Vilsmeier

On 08.05.20 19:01, Steven D'Aprano wrote:


All this proposal adds is *duck-typing* to the comparison, for when
it doesn't matter what the container type is, you care only about the
values in the container. Why be forced to do a possibly expensive (and
maybe very expensive!) manual coercion to a common type just to check
the values for equality element by element, and then throw away the
coerced object?

If you have ever written `a == list(b)` or similar, then You Already
Needed It :-)


Initially I assumed that the reason for this new functionality was
concerned with cases where the types of two objects are not precisely
known and hence instead of converting them to a common type such as
list, a direct elementwise comparison is preferable (that's probably
uncommon though). Instead in the case where two objects are known to
have different types but nevertheless need to be compared
element-by-element, the performance argument makes sense of course.

So as a practical step forward, what about providing a wrapper type
which performs all operations elementwise on the operands. So for example:

    if all(elementwise(chars) == string):
        ...

Here the `elementwise(chars) == string` part returns a generator which
performs the `==` comparison element-by-element.

This doesn't perform any length checks yet, so as a bonus one could add
an `all` property:

    if elementwise(chars).all == string:
        ...

This first checks the lengths of the operands and only then compares for
equality. This wrapper type has the advantage that it can also be used
with any other operator, not just equality.

Here's a rough implementation of such a type:

    import functools
    import itertools
    import operator


    class elementwise:
        def __init__(self, obj, *, zip_func=zip):
            self.lhs = obj
            self.zip_func = zip_func

        def __eq__(self, other): return self.apply_op(other, op=operator.eq)
        def __lt__(self, other): return self.apply_op(other, op=operator.lt)
        ...  # define other operators here

        def apply_op(self, other, *, op):
            return self.make_generator(other, op=op)

        def make_generator(self, other, *, op):
            return itertools.starmap(op, self.zip_func(self.lhs, other))

        @property
        def all(self):
            zip_func = functools.partial(itertools.zip_longest,
                                         fillvalue=object())
            return elementwise_all(self.lhs, zip_func=zip_func)


    class elementwise_all(elementwise):
        def apply_op(self, other, *, op):
            try:
                length_check = len(self.lhs) == len(other)
            except TypeError:
                length_check = True
            return length_check and all(self.make_generator(other, op=op))
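
A quick check of the above sketch:

    chars = ['a', 'b', 'c']
    print(list(elementwise(chars) == 'abc'))  # [True, True, True]
    print(elementwise(chars).all == 'abc')    # True
    print(elementwise(chars).all == 'ab')     # False (length check fails)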
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CCPJWQ5TYCJHEUVZD554EEBYUIPIJIKP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: zip() as a class constructor (meta) [was: ... Length-Checking To zip]

2020-05-07 Thread Dominik Vilsmeier

On 07.05.20 09:38, Stephen J. Turnbull wrote:


Christopher Barker writes:

  > So while yes, alternate constructors are a common pattern, I don't
  > think they are a common pattern for classes like zip.

That's a matter of programming style, I think.  There's no real
difference between

 zip(a, b, length='checksame')

and

 zip.checksame(a, b)

They just initialize an internal attribute differently, which takes
one of a very few values.


The big difference between these two versions is about usability. A flag
is convenient if the actual value is expected to be determined only at
runtime so you can write `zip(a, b, length=length)`. A distinct function
on the other hand emphasizes the expectation that this behavior is
usually determined when the code is written; it would be awkward to
write `getattr(zip, length)(a, b)`. Both this and the different behavior
of zip-flavors speak in favor of the second, `zip.` version. One
concern however is that `zip.checksame` looks like `checksame` is a
convenience function of `zip` that doesn't necessarily perform any
zipping; it could only perform the length check and return True or
False. Sure, for general iterators this would not be very useful because
they get consumed in the process but for someone who doesn't know the
details this might not be as obvious. Maybe people start writing code
like this:

    if not zip.checksame(a, b):
        raise ValueError()
    for stuff in zip(a, b):
        ...
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/VOLDNSQUXHCDYTPUQZBBMBKNZE7HKCFJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Equality between some of the indexed collections

2020-05-07 Thread Dominik Vilsmeier

On 07.05.20 11:11, Steven D'Aprano wrote:


On Sat, May 02, 2020 at 05:12:58AM -, Ahmed Amr wrote:


Currently, when comparing a list of items to an array of the same
items for equality (==) it returns False, I'm thinking that it would
make sense to return True in that context, as we're comparing item
values and we have the same way of indexing both collections, so we
can compare item values.


Perhaps we ought to add a second "equals" operator? To avoid
bikeshedding over syntax, I'm initially going to use the ancient 1960s
Fortran syntax and spell it `.EQ.`.

[...]

We could define this .EQ. operate as *sequence equality*, defined very
roughly as:

    def .EQ.(a, b):
        return len(a) == len(b) and all(x == y for x, y in zip(a, b))


But why do we even need a new operator when this simple function does
the job (at least for sized iterables)?
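
That is, something along these lines (a sketch for sized iterables):

    def seq_equal(a, b):
        # Duck-typed, order-sensitive equality over the elements.
        return len(a) == len(b) and all(x == y for x, y in zip(a, b))

    seq_equal([1, 2, 3], (1, 2, 3))  # True, regardless of list vs. tuple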

How common is it to compare two objects where you cannot determine
whether one or the other is a tuple or a list already from the
surrounding context? In the end these objects must come from somewhere
and usually functions declare either list or tuple as their return type.

Since for custom types you can already define `__eq__` this really comes
down to the builtin types, among which the theoretical equality between
tuple and list has been debated in much detail but is it used in practice?
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/A45PRNTU63YDHSN3VMPF7HPCOUWMPWNL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Passing Arguments Through Thin Wrappers (was : Auto-assign attributes from __init__ arguments)

2020-05-05 Thread Dominik Vilsmeier

On 05.05.20 15:35, Dan Sommers wrote:


On Tue, 5 May 2020 23:06:39 +1000
Steven D'Aprano  wrote:


... help me solve the DRY problem for module-level functions:

 def function(spam, eggs, cheese, aardvark):
 do stuff
 call _private_function(spam, eggs, cheese, aardvark)

since this bites me about twice as often as the `self.spam = spam`
issue.

(That's not me being snarky by the way, it's a genuine question:
dataclasses are a mystery to me, so I don't know what they can and can't
do.)

Lisp macros have a "&whole" feature that captures the entire collection
of arguments to the macro:

 http://clhs.lisp.se/Body/03_dd.htm

Perhaps Python could adopt something similar?  Unlike *args and
**kwargs, &whole captures all of the parameters, not just the
non-positional, non-named ones.  The idea would be something like this:

    def function(spam, eggs, cheese, aardvark, &whole):
        do_stuff
        _private_function(&whole)

which would call _private_function as function was called.


What about a way for overloading function signatures? The arguments are
then bound to both signatures and the function has access to all the
parameters. For example:

    def function(spam, eggs, cheese, aardvark) with (*args):
        ...  # do stuff
        _private_function(*args)

Calling `function(1, 2, 3, 4)` results in `spam, eggs, cheese, aardvark
= 1, 2, 3, 4` and `args = (1, 2, 3, 4)`.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OHGIK5YHET6YUANFDT7RRXRT47AIAYM5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 618: Add Optional Length-Checking To zip

2020-05-04 Thread Dominik Vilsmeier

On 04.05.20 16:14, Steven D'Aprano wrote:


On Sun, May 03, 2020 at 09:41:00PM -0700, Christopher Barker wrote:

On Sun, May 3, 2020 at 6:17 PM Steven D'Aprano  wrote:


 map(func, x, y, strict=True)  # ?

Admittedly the word "strict" in the context of `map` would be rather
confusing.


This a really good argument for "equal" rather than "strict".

Sorry, I'm not seeing why this would be confusing for `map` but not
`zip`. And "equal" might suggest that x and y need to be equal.


of course it would be confusing for zip.

Dominik seems to think that it is acceptable for zip but confusing for
map, so I don't think that any confusion deserves to be described with
"of course". At least, it's not obvious to me why it is confusing.


I think it's acceptable for `zip` because it's possible to infer its
meaning without much ambiguity. I think it's reasonable to associate
`strict=True` with the process of zipping (i.e. neither with the
arguments nor the result directly). And then the only thing it could
possibly be strict about is the length of the iterables. On the other
hand my perception is probably biased by having participated in this
discussion and hence knowing already what the meaning is without having
to think about it. It would be interesting to hear from a person who
hasn't participated in this discussion what they expect from `strict=True`.

However for `map` I wouldn't associate `strict=True` with the
zipping-functionality that it provides. This is a nice feature but not
the main purpose of the function -- it would equally work without it
(requiring manual zipping). I'd associate this keyword either with the
process of mapping or the produced values. Regarding the former I could
imagine an alternative interpretation: currently `map` stops on a
`StopIteration` which is leaked by the mapping function (unlike
generators which convert this into a `RuntimeError`). So I could imagine
that `strict=True` activates this conversion also for `map`, i.e. being
strict in a sense that it accepts a `StopIteration` only from the mapped
iterator and not from the mapping function.
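
For reference, this is the leak I mean (behavior as of current CPython;
a small sketch):

    >>> def f(x):
    ...     if x > 1:
    ...         raise StopIteration  # leaks out of map
    ...     return x
    ...
    >>> list(map(f, [0, 1, 2, 3]))
    [0, 1]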



Perhaps "truncate" or even "trunc" is a better keyword than either
strict or equal. Not that I'm arguing for a keyword here.


But it wouldn't be truncating anything.

`truncate=True` would be the current behaviour, which truncates at the
shortest input:

 py> list(zip('a', range(10)))
 [('a', 0)]



`truncate=True` seems to be quite clear about its behavior, but for
`truncate=False` I think it's not obvious that this raises an exception.
It just says it won't truncate on the shortest iterator but otherwise is
not explicit about its behavior. `strict` on the other hand implies an
"intervening reaction" when not playing by the rules, like an exception
being raised.

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/YTCIKQTT4F3X2BK4SKOGH4QH7C4DECX5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 618: Add Optional Length-Checking To zip

2020-05-04 Thread Dominik Vilsmeier

On 04.05.20 19:43, Steve Barnes wrote:


How about "balanced" or "exact" as possible names. The main thing that I think 
is vital is for the docstring(s) to mention that they all exist - the current zip (in 3.8) doesn't 
mention zip_longest so if you don't already know about it.


What about "equilen" as an analogy to "equidistant" combined with "len".
Or simply "samelength". "exact" is also appealing but it doesn't say
what's being exact. It might also be insightful to look at the negated
flag, e.g. `exact=False` -- which could suggest that once an iterator is
exhausted it just continues to yield tuples of reduced length, i.e.
`list(zip('ab', range(3), exact=False))` could result in `[('a', 0),
('b', 1), (2,)]`.


-Original Message-
From: MRAB 
Sent: 04 May 2020 17:55
To: python-ideas@python.org
Subject: [Python-ideas] Re: PEP 618: Add Optional Length-Checking To zip

On 2020-05-04 13:17, Dominik Vilsmeier wrote:

"strict" doesn't say what it's being strict about. That information
has to be inferred by the reader.

[snip]

And "equal" doesn't say what it's equal.

What we need is a word that means "same length", much as "shorter" and "longer" 
are about length.

There's "coextensive", but that'll probably get a -1.
___
Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an 
email to python-ideas-le...@python.org 
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/K3OGDVFTB46BQVGWXU2Q3K2V24MBUQIZ/
Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CEXXJOSBDTOOP74QH6XHFZJF4QXEURXP/
Code of Conduct: http://python.org/psf/codeofconduct/

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/36IIEWDRLQTUFBUX5GN62P5JR5K5XLU3/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 618: Add Optional Length-Checking To zip

2020-05-04 Thread Dominik Vilsmeier

There could be other modes, such as `mode="repeat"` which reuses the
last value of each iterator as a fillvalue, or `mode="wrap"` which is
similar to `zip(*(it.cycle(x) for x in its))`.

So indeed a binary flag protects from additional requests to further
overload that function. This can be a good thing but it could also miss
on (future) opportunities.
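
To make that concrete, here is a rough sketch of what `mode="repeat"`
could mean, built on `itertools.zip_longest` (the name `zip_repeat` and
the exact semantics are of course hypothetical):

    import itertools as it

    _SENTINEL = object()

    def zip_repeat(*iterables):
        # Hypothetical mode="repeat": once an input is exhausted, its
        # last value keeps being reused as a fill value (this sketch
        # assumes every input yields at least one item).
        last = [_SENTINEL] * len(iterables)
        for row in it.zip_longest(*iterables, fillvalue=_SENTINEL):
            row = tuple(prev if item is _SENTINEL else item
                        for item, prev in zip(row, last))
            last = list(row)
            yield row

    print(list(zip_repeat('ab', range(4))))
    # [('a', 0), ('b', 1), ('b', 2), ('b', 3)]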


On 04.05.20 16:55, Ricky Teachey wrote:

I'm wondering if a `mode` (or similar) keyword argument, with multiple
possible options, should be included in the "rejected" section of the PEP.

zip(*args, mode="shortest")  <-- default
zip(*args, mode="equal")  <-- or "even"

An advantage of this way is if the option to zip in different ways
proves popular, you could later add zip longest as an option, and
maybe others I'm not smart enough to think of:

zip(*args, mode="longest")

The PEP should argue against doing this since `strict=True/False`
forever limits you to only two modes of zipping.

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going
home or actually going home." - Happy Chandler



___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/35KD5VWNOOYOKGE7M2B7RTBTFQ2N7ZNM/
Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FKRV5Q6QZSNZZQFBIOS6HLMIV2H6GWGH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 618: Add Optional Length-Checking To zip

2020-05-04 Thread Dominik Vilsmeier

"strict" doesn't say what it's being strict about. That information has
to be inferred by the reader. As a keyword argument I'd expect it to
relate to the function's main purpose, so for `zip` I can understand how
this refers to the arguments (since their items end up in the resulting
tuples). However the main purpose of `map` is to produce new values,
that depend on the provided function, so here the focus shifts from the
input to the result. Hence I'd expect that `strict=True` refers to the
produced values somehow (perhaps asserting that no value is produced
twice).

So if `zip` gets `strict=True` then I think it's clearer if `map` gets
`zip_strict=True`, as that is explicit about its relation to the
arguments.

On 04.05.20 06:57, Guido van Rossum wrote:

I should really stay out of this (hundreds of messages and still
bickering^Wbikeshedding :-), but I personally find strict=True *less*
confusing than equal=True, both for zip() and for map(). If I didn't
know what was going on, seeing equal=True would make me wonder about
whether equality between the elements might be involved somehow.

On Sun, May 3, 2020 at 9:42 PM Christopher Barker <python...@gmail.com> wrote:

On Sun, May 3, 2020 at 6:17 PM Steven D'Aprano <st...@pearwood.info> wrote:

> >  map(func, x, y, strict=True)  # ?
> >
> > Admittedly the word "strict" in the context of `map` would
be rather
> > confusing.
> >
>
> This a really good argument for "equal" rather than "strict".

Sorry, I'm not seeing why this would be confusing for `map`
but not
`zip`. And "equal" might suggest that x and y need to be equal.


of course it would be confusing for zip. I and others have been
advocating for "equal" over "strict" for a while. this is yet
another argument. Since I never liked "strict", I'm not sure I can
argue why it might be more confusing for map than zip :-)

Perhaps "truncate" or even "trunc" is a better keyword than
either
strict or equal. Not that I'm arguing for a keyword here.


But it wouldn't be truncating anything. If we want to be wordy,
equal_length would do it -- but I wouldn't want to be that wordy.

-CHB



--
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org

To unsubscribe send an email to python-ideas-le...@python.org

https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at

https://mail.python.org/archives/list/python-ideas@python.org/message/DK3PG4ITHWSSCN4S4KW5EDPEBP26OSXF/
Code of Conduct: http://python.org/psf/codeofconduct/



--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)


___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OID2HSI2GLYXIG6KINKT7WP4WHJCQIPY/
Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/K2BK567LX7PMFYVT4S43LJ6UUGU4MD3C/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 618: Add Optional Length-Checking To zip

2020-05-03 Thread Dominik Vilsmeier

If `zip` gets a `strict` keyword-only parameter, a slightly related
question is whether `map` should also receive one?

`map` can be used as zip + transform:

    map(func, x, y)
    (func(a, b) for a, b in zip(x, y))  # similar

Now if I'm using the first option and I want to enable the strict check,
I need to switch either to the second one or use `itertools.starmap`
with `zip`:

    it.starmap(func, zip(x, y, strict=True))
    (func(a, b) for a, b in zip(x, y, strict=True))

    map(func, x, y, strict=True)  # ?

Admittedly the word "strict" in the context of `map` would be rather
confusing.

On 01.05.20 20:10, Brandt Bucher wrote:

I have pushed a first draft of PEP 618:

https://www.python.org/dev/peps/pep-0618

Please let me know what you think – I'd love to hear any *new* feedback that 
hasn't yet been addressed in the PEP!

Brandt
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZBB5L2I45PNLTRW7CCV4FDJO5DB7M5UT/
Code of Conduct: http://python.org/psf/codeofconduct/

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/6FN6DJOJ4PIMWLSSP5XS5MCWB7T3C4WR/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Equality between some of the indexed collections

2020-05-02 Thread Dominik Vilsmeier

`frozenset` and `set` provide a counterexample:

>>> frozenset({1}) == {1}
True
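
Even though the two compare equal, only one of them is usable as a dict
key, which is exactly the property in question:

    >>> d = {frozenset({1}): 'x'}
    >>> d[{1}]
    Traceback (most recent call last):
      ...
    TypeError: unhashable type: 'set'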

On 02.05.20 22:36, Guido van Rossum wrote:

It does look like that would violate a basic property of `==` -- if
two values compare equal, they should be equally usable as dict keys.
I can't think of any counterexamples.

On Sat, May 2, 2020 at 1:33 PM Alex Hall <alex.moj...@gmail.com> wrote:

On Sat, May 2, 2020 at 9:51 PM Serhiy Storchaka <storch...@gmail.com> wrote:

02.05.20 21:34, Ahmed Amr wrote:
> I see there are ways to compare them item-wise, I'm suggesting to
> bake that functionality inside the core implementation of such
> indexed structures.
> Also those solutions are direct with tuples and lists, but it
> wouldn't be as direct with arrays-lists/tuples comparisons for example.

If we make `(1, 2, 3) == [1, 2, 3]` we would need to make
`hash((1, 2, 3)) == hash([1, 2, 3])`.


Would we? Is the contract `x == y => hash(x) == hash(y)` still
required if hash(y) is an error? What situation involving dicts
could lead to a bug if `(1, 2, 3) == [1, 2, 3]` but `hash((1, 2,
3))` is defined and `hash([1, 2, 3])` isn't?

The closest example I can think of is that you might think you can
do `{(1, 2, 3): 4}[[1, 2, 3]]`, but once you get `TypeError:
unhashable type: 'list'` it'd be easy to fix.
___
Python-ideas mailing list -- python-ideas@python.org

To unsubscribe send an email to python-ideas-le...@python.org

https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at

https://mail.python.org/archives/list/python-ideas@python.org/message/BMSP5BQP2UURBKV5LPLQXO6PZDP5PQGX/
Code of Conduct: http://python.org/psf/codeofconduct/



--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)


___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OPT6D6COYSMATTARQTUVVMAOPP6LEGHN/
Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/KMKPS3XCES5T5J4TY2PX3UQ7XPWE5AOB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: zip should return partial results in StopIteration

2020-04-22 Thread Dominik Vilsmeier

On 22.04.20 11:19, Steven D'Aprano wrote:


On Wed, Apr 22, 2020 at 10:52:44AM +0200, Dominik Vilsmeier wrote:


You can basically use the code from this StackOverflow answer (code
attached below) to cache the last object yielded by each iterator:
https://stackoverflow.com/a/61126744

Caching the result of iterators is unsafe if the value yielded depends
on the environment at the time. It can leave you vulnerable to Time Of
Check To Time Of Use bugs, or inaccurate results.




Good point, but in this case it only caches the most recently yielded
value. It's the value that would have ended up in a tuple during `zip`
if one of the other iterators had not stopped the iteration. So whatever
state the environment is in, it still corresponds to those cached
values. How this partial result is then used is a different question.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/27K2DBLAVAO4VXWUQFRPWLZHU54N2T6T/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: zip should return partial results in StopIteration

2020-04-22 Thread Dominik Vilsmeier

On 22.04.20 06:43, Soni L. wrote:




On 2020-04-21 7:12 p.m., Steven D'Aprano wrote:

On Tue, Apr 21, 2020 at 05:33:24PM -0300, Soni L. wrote:

> 1. see the other thread (strict zip), especially the parts where they
> brought up the lack of peekable/unput iterators in the context of
> getting a count out of an iterator.


I've seen it, as far as I can tell there are already good solutions
for getting a count out of an iterator.


are there *any* solutions for getting partial results out of zip with 
different-length iterators of unknown origin?




You can basically use the code from this StackOverflow answer (code 
attached below) to cache the last object yielded by each iterator: 
https://stackoverflow.com/a/61126744


    iterators = [
        iter([0, 1, 2]),
        iter([3, 4]),
        iter([5]),
        iter([6, 7, 8]),
    ]

    iterators = [cache_last(i) for i in iterators]
    print(list(zip(*iterators)))   # [(0, 3, 5, 6)]

    partial = []
    for i in iterators:
        try:
            partial.append(i.last)
        except StopIteration:
            break
    print(partial)   # [1, 4]


This is the code for `cache_last`:

    class cache_last:
        def __init__(self, iterable):
            self.obj = iterable
            self._iter = iter(iterable)
            self._sentinel = object()

        @property
        def last(self):
            if self.exhausted:
                raise StopIteration
            return self._last

        @property
        def exhausted(self):
            if not hasattr(self, '_last'):
                raise ValueError('Not started!')
            return self._last is self._sentinel

        def __next__(self):
            try:
                self._last = next(self._iter)
            except StopIteration:
                self._last = self._sentinel
                raise
            return self._last

        def __iter__(self):
            return self

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/37OI7YBIGCW3Y5JD347XWY3RM3I3SJZC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Proposal: Keyword Unpacking Shortcut [was Re: Keyword arguments self-assignment]

2020-04-20 Thread Dominik Vilsmeier

On 20.04.20 12:52, Steven D'Aprano wrote:


On Mon, Apr 20, 2020 at 11:15:32AM +0200, Alex Hall wrote:

On Mon, Apr 20, 2020 at 2:48 AM Steven D'Aprano  wrote:

I have an actual, concrete possible enhancement in mind: relaxing the
restriction on parameter order.


What? Do you think that the current restriction is bad, and we should just
drop it? Why?

No, I have no opinion at the moment on whether we should relax that
restriction. I'm saying that the mode-shift suggestion:

 func(arg, name=value,
  *,  # change to auto-fill mode
  alpha, beta, gamma,
  )

will rule out any further relaxation on that restriction, and that is a
point against it. That's a concrete enhancement that we might allow some
time. Whether *I personally* want that enhancement is irrelevant.

You on the other hand, claim that my suggestion:

 func(arg, name=value,
  **{alpha, beta, gamma},
  )

will also rule out some unspecified, unknown, unimagined future
enhancements. I'm saying that's a weak argument, unless you have a
specific enhancement in mind.


This rules out the possibility of treating sets as mappings from their
elements to `True`. Unlikely, but so are positional arguments following
keyword arguments.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PJ4WWVMMLCP2UXF5ZP2YQHUEWP7WC76X/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Proposal: Keyword Unpacking Shortcut [was Re: Keyword arguments self-assignment]

2020-04-19 Thread Dominik Vilsmeier

On 19.04.20 12:57, Steven D'Aprano wrote:


On Sat, Apr 18, 2020 at 09:13:44PM +0200, Dominik Vilsmeier wrote:


     func(foo, **, bar)  vs.  func(foo, **{bar})

It's still a mode switch, only the beginning and end markers have
changed. Instead of `**,` (or `**mapping,`) we then have `**{` as the
opening marker and instead of `)` (the parenthesis that closes the
function call) we have `}` as the closing marker.

How do you define a mode switch?

I don't have a clear definition; instead my point was to show that there
is no substantial difference between `**` and `**{...}` (though you seem
to think differently).

Is a list display a mode? Is a string a mode? Is a float a mode?

I'd say yes, technically they are. There's a substantial difference
between writing `sys.exit()` and `"sys.exit()"`. Same for 'comment
mode': `# sys.exit()`, and most IDEs have shortcuts for switching it on
and off. However I think a major difference is that all these can exist
in different contexts, also in isolation, and hence we think of them as
self-contained entities. The modal aspect is only relevant to the compiler.

In some sense, maybe, but to me the critical factor is that nobody talks
about "list mode", "string mode", let alone "float mode". Its about the
mental model.

With `func(foo, **, bar, baz, quux)` if I use `**` as a pseudo-argument,
the interpreter switches to "auto-fill" mode and everything that follows
that (until the end of the function call) has to be interpreted
according to the mode.

With your proposal if I use `**{` then the interpreter similarly
switches to auto-fill mode and everything that follows until the next
`}` is affected by that mode. There's no big difference. The idea behind
`**` is that you could also use `**kwargs` instead and `**` is just the
case when you don't have anything to unpack (as an analogy to `*args`
vs. `*` in a function definition when you don't need to consume varargs).

A few people immediately started describing this as a mode, without
prompting. I think it is a very natural way of thinking about it.

I think what obscures the modal aspect of `**{...}` is the fact that it
somehow looks like an expression, but it isn't. It's special syntax that
can only be used in function calls, to tell the compiler that it should
autofill the parameter names. It's a mode in disguise.

And we have no way of turning the mode off. So if there is ever a
proposal to allow positional arguments to follow keyword arguments, it
won't be compatible with auto-fill mode.

That is true, it doesn't have an explicit end token. However I'm not
convinced that ruling out the positional-after-keyword-arguments option
is a relevant argument against it.

With `func(foo, **{bar, baz, quux})` the mental model is closer to
ordinary argument or dict unpacking. Nobody refers to this:

 func(spam, *[eggs, cheese, aardvark], hovercraft)

as "list mode" or "argument unpacking mode". It's just "unpacking a
list" or similar. No-one thinks about the interpreter entering a special
"collect list mode" even if that's what the parser actually does, in
some sense. We read the list as an entity, which then gets unpacked.


In this example `*` and `[eggs, cheese, aardvark]` are distinct
entities, the latter can exist without the former and it has the exact
same meaning, independent of context. So we think about it as a list
that gets unpacked (and the list being a concept that can exist in
isolation, without unpacking).

With the proposed syntax we have `**{eggs, cheese, aardvark}` and here
the `**` and `{...}` parts are inseparable. Even though the latter could
exist in isolation but then it means something completely different. In
the `**{...}` listing of names these not only refer to their objects as
usual but also serve the purpose of identifying keyword parameter names.
This unusual extension of identifier meaning is limited by `**{` and `}`
and hence I consider it a mode, just like `**,` and `)`.


Likewise for dict unpacking: nobody thinks of `{'a': expr}` as entering
"dict mode". You just make a dict, then unpack it.

And nobody (I hope...) will think of keyword shortcut as a mode:

 func(foo, **{bar, baz}, quux=1)`

It's just unpacking an autofilled set of parameter names. Not a mode at
all. And notice that there is absolutely no difficulty with some future
enhancement to allow positional arguments after keyword arguments.

"unpacking an autofilled set of parameter names" implies that two
distinct actions take place. First the set of parameter names is
autofilled and converted into something that can be unpacked, and then
it actually gets unpacked. But that's not what is happening, you can't
have `**({bar, baz})`. What you *describe* is another proposal, namely
using e.g. `{:bar, :baz}` to construct the mapping and then `**` to
unpack it; `**({:bar, :baz})` works without problem. So

[Python-ideas] Re: Proposal: Keyword Unpacking Shortcut [was Re: Keyword arguments self-assignment]

2020-04-18 Thread Dominik Vilsmeier

On 18.04.20 08:14, Steven D'Aprano wrote:


This proposal is an alternative to Rodrigo's "Keyword arguments
self-assignment" thread.

Rodrigo, please feel free to mine this for useful nuggets in your PEP.

(I don't claim to have invented the syntax -- I think it might have been
Alex Hall?)


Keyword Unpacking Shortcut
--

Inside function calls, the syntax

 **{identifier [, ...]}

expands to a set of `identifier=identifier` argument bindings.


I don't see how this proposal is significantly different from the `**`
version:

    func(foo, **, bar)  vs.  func(foo, **{bar})

It's still a mode switch, only the beginning and end markers have
changed. Instead of `**,` (or `**mapping,`) we then have `**{` as the
opening marker and instead of `)` (the parenthesis that closes the
function call) we have `}` as the closing marker.

You have criticized the use of modal interfaces in this message from
which I quote
(https://mail.python.org/archives/list/python-ideas@python.org/message/TPNFSJHPWQWZFO7VM6COQN6ZPOWWKG2X/):


Modes are okay when they are really big (e.g. "I'm in Python programming
mode now") but otherwise should be minimized, with an obvious beginning
and end. If you have to have a mode, they should be *really obvious*,
and this isn't. There's a really fine difference between modes:

 my_super_function_with_too_many_parameters(
 args, bufsize, executable, stdin, stdout, stderr,
 preexec_fn, close_fds, shell, cwd, env, kwargs,
 universal_newlines, startupinfo, creationflags,
 restore_signals, start_new_session, *, pass_fds,
 encoding, errors, text, file, mode, buffering,
 newline, closefd, opener, meta, private, dunder,
 invert, ignorecase, ascii_only, seed, bindings,
 callback, log, font, size, style, justify, pumpkin,
 )

If you miss the star in the middle of the call, there's no hint in the
rest of the call to clue you in to the fact that you changed modes.


(The above example uses `*` as the marker, while in the meantime `**`
has been proposed.)

I'm not convinced that `**{` and `}` make the beginning and end of the
mode "really obvious":

my_super_function_with_too_many_parameters(
args, bufsize, executable, stdin, stdout, stderr,
preexec_fn, close_fds, shell, cwd, env, kwargs,
universal_newlines, startupinfo, creationflags,
restore_signals, start_new_session, **{pass_fds,
encoding, errors, text, file, mode, buffering,
newline, closefd, opener, meta, private, dunder,
invert, ignorecase, ascii_only, seed, bindings,
callback, log, font, size, style, justify, pumpkin})

For really long argument lists you can expect the mode switch to be
placed on a separate line for the sake of readability and then it's hard
to miss either way:

my_super_function_with_too_many_parameters(
args, bufsize, executable, stdin, stdout, stderr,
preexec_fn, close_fds, shell, cwd, env, kwargs,
universal_newlines, startupinfo, creationflags,
restore_signals, start_new_session,
**,
pass_fds, encoding, errors, text, file, mode, buffering,
newline, closefd, opener, meta, private, dunder,
invert, ignorecase, ascii_only, seed, bindings,
callback, log, font, size, style, justify, pumpkin,
)

In addition to that, more and more advanced IDEs are available and those
could easily highlight the "autofill" part that follows the `**` (or
`**{`) to help readability.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BH46T7WKJI5LDPC35AVWSEBZLZX5LM3S/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Keyword arguments self-assignment

2020-04-17 Thread Dominik Vilsmeier

On 17.04.20 10:53, Steven D'Aprano wrote:


I think that, as little as I like the original proposal and am not
really convinced it is necessary, I think that it is better to have the
explicit token (the key/parameter name) on the left, and the implicit
token (blank) on the right:

 key=

I suspect it may be because we read left-to-right in English and Python,
so having the implicit blank come first is mentally like running into a
pothole at high speed :-)


For function calls, I think it is much easier to infer the parameter
name than the argument name. That's because usually functions are meant
to generalize, so their parameter names denote a more general concept.
Variable names on the other hand refer to some specific data that's used
throughout the program. For example imagine the function

    def call(*, phone_number, ...)  # here '...' means any number of
                                    # additional parameters

Then in the code we might have the following variables:

    phone_number_jane = '123'
    phone_number_john = '456'

Using any of these variables to invoke the `call` function, it is pretty
obvious which parameter it should be assigned to (for the human reader
at least, not the compiler):

    call(=phone_number_jane)

If on the other hand the parameter was specified it would be ambiguous
which variable to use:

    call(phone_number=)  # should we call Jane or John?

Now this doesn't relate to the original proposal yet, but imagine this
is inside some other function and we happen to have another variable
`phone_number` around:

    def create_conference_call_with_jane_and_john(phone_number):
        """`phone_number` will be placed in a conference call with Jane
        and John."""

        phone_number_jane = '123'
        phone_number_john = '456'

        call(phone_number=)  # Whom do we call here?

Yes, there is a variable `phone_number` around but why use this
specifically? `phone_number_jane` and `phone_number_john` are also phone
numbers and `call` only asks for a phone number in general, so it's
ambiguous which one to use.

If on the other hand I read `call(=phone_number)` then I know
`phone_number` is a phone number and `call` expects one for its
`phone_number` parameter so it's pretty clear how the assignment should
be made.

Another example:

    purchases = load_data('purchases.txt')
    sales = load_data('sales.txt')
    data = load_data('user_data.txt')

    process(data=)  # We have multiple data sets available, so in the
                    # face of ambiguity, refuse the temptation to guess.

Function parameters usually represent a concept and arguments then
represent the specific data. Often (though not always) specific data can
be assigned to a concept, but the other way round is almost always
ambiguous.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/VDYS37U4OQ7TEJ26VQT5DACPD4DJVOTH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Keyword arguments self-assignment

2020-04-17 Thread Dominik Vilsmeier


On 17.04.20 23:18, Andrew Barnert via Python-ideas wrote:

On Apr 17, 2020, at 13:39, Alex Hall  wrote:


I also find the example with :keyword a bit jarring at first glance,
so I propose a double colon to alleviate the problem if we go in
that direction. Compare:

    { :a, "b": x, :c }
  { ::a, "b": x, ::c }


I can see the point of explicitly calling out the syntax like this,
and, while it doesn’t really feel necessary to me, it doesn’t feel at
all onerous either. So if there are people who feel strongly against
the single colon and like the double colon, I’m fine with the double
colon.

(I’m still not sure we need to do anything. And I don’t love extending
dict displays, and think that if we do need a solution, maybe there’s
a better one lurking out there that nobody has hit on. But I
definitely dislike extending dict displays less than extending some
but not other uses of **unpacking, or having a mode switch in the
middle of call syntax, or anything else proposed so far.)


It's been a lot of emails, so I probably missed it, but could you
summarize your objections regarding the idea of using `**` as a "mode
switch"? That is:

    func(pos, **, foo, bar)

I think it aligns nicely with the `*` in function definitions. `def
func(a, *, b, c)` or `def func(a, *args, b, c)` means anything following
the `*` part is a keyword-only parameter. Similarly we have today
`func(a, **kwargs, b=b, c=c)`, i.e. anything that follows the `**` part
has to be provided as a keyword argument. Since this eliminates all
ambiguity, why not leave out the parameter names: `func(a, **kwargs, b,
c)`; and since unpacking is not necessarily required, we could have
`func(a, **, b, c)`.



___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/DS4HBKMMRSV6KQ6QHLS5GNTM5RG4SUXR/
Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/I4A4QVMJ3FTW5EHNC2JY2VDJSCJIGA6G/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Keyword arguments self-assignment

2020-04-17 Thread Dominik Vilsmeier


On 17.04.20 12:49, Alex Hall wrote:


But this means the reader could miss the star, especially with a
very large function call over multiple lines, and if that reader
happens to use that particular function A LOT and know the
parameter order without having to look they would pretty easily
believe the arguments are doing something different than what is
actually happening.


Thank you for giving an actual scenario explaining how confusion could
occur. Personally I think it's a very unlikely edge case (particularly
someone comparing argument order to their memory), and someone falsely
thinking that correct code is buggy is not a major problem anyway.

I propose using two asterisks instead of one as the magical argument
separator. `**` is more closely associated with keyword arguments,
it's harder to visually miss, and it avoids the problem mentioned
[here](https://mail.python.org/archives/list/python-ideas@python.org/message/XFZ5VH5DKIFJ423FKCTHXPHDONAO3DFI/)
which I think was a valid point. So a call would look like:

function(**, dunder, invert, private, meta, ignorecase)


In that case, would you also allow `**kwargs` unpacking to serve as a
separator? That is:

    function(**kwargs, dunder, invert, private, meta, ignorecase)

Currently this is a SyntaxError. I think it would fit the symmetry with
respect to `def func(*args, bar)` vs. `def func(*, bar)`; whether or not
there is something to unpack, what follows after it remains unaffected.
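
For illustration (the exact wording of the SyntaxError may vary between
versions):

    >>> function(**kwargs, dunder, invert)
      File "<stdin>", line 1
    SyntaxError: positional argument follows keyword argument unpacking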



___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/WJYJBOWVR7W3ACCOPS3XP6MEFDEIV3LV/
Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/M4SEGQXMGU6BDNO7CPPZ6I57OLM27MR6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Keyword arguments self-assignment

2020-04-17 Thread Dominik Vilsmeier

I agree that introducing a new way of creating implicit dict literals
only for the purpose of saving on keyword arguments seems too much of a
change, although it would be an elegant solution since it builds on
already existing structures. And I don't think it hurts readability for
function calls, since all your examples could be written as:

    bar(**{:a, :b, :c}, **kwargs)

This doesn't have the problem of accidentally overriding keys as it is
similar to:

    bar(a=a, b=b, c=c, **kwargs)


On 17.04.20 06:41, oliveira.rodrig...@gmail.com wrote:

I believe this is a different feature, non-exclusive to the one proposed here, 
that would also make it possible not to re-declare keywords.

But implementing this change with the argument of making function calls less 
repetitive or verbose when having redundant named keywords and variables 
doesn't sell it to me.

See, function calls would still suffer to be less redundant if we go with this:

```python
def foo(a, b, **kwargs):
 c = ...
 bar(**{:a, :b, :c, d: kwargs["d"]})  # this just got worse
```

```python
def foo(a, b, **kwargs):
 c = ...
 # all parameters definition is away from the function call, not a fan
 # one can possibly overwrite some key on kwarg without knowing
 kwargs.update({:a, :b, :c})
 bar(**kwargs)
```

```python
def foo(a, b, **kwargs):
 c = ...
 bar(**(kwargs | {:a, :b, :c}))  # a little better but one can still
                                 # overwrite some key on kwarg without knowing
```

Using a "magical" separator does the job and has little interactions with other 
syntaxes, using the `*` character seems better than just picking another random one (like 
we did with `/`). Comparing with all the above excerpts, this is still more appealing and 
clearer for me:

```python
def foo(a, b, **kwargs):
 c = ...
 bar(*, a, b, c, **kwargs)  # also, if any of `a`, `b` or `c` is in
                            # `kwargs` we get a proper error
```

Rodrigo Martins de Oliveira
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ATCTNM5DTDTXLLCOFEHDHM7OP2MYTQDW/
Code of Conduct: http://python.org/psf/codeofconduct/

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GI6UDR5OU7H7Q46STLTWBQ3CLUEIWW2K/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Keyword arguments self-assignment

2020-04-16 Thread Dominik Vilsmeier


On 16.04.20 22:28, Alex Hall wrote:

On Thu, Apr 16, 2020 at 10:13 PM Kyle Stanley <aeros...@gmail.com> wrote:

Dominik Vilsmeier wrote:
> I'm not sure if this is doable from the compiler perspective,
but what
> about allowing tuples after `**` unpacking:
>
>      requests.post(url, **(data, params))
>
>      # similar to
>      requests.post(url, data=data, params=params)

+1. I can see the practical utility of the feature, but was
strongly against the
other syntax proposals so far. IMO, the above alternative does a
great job of
using an existing feature, and I think it would be rather easy to
explain how
it works.


If we go in that direction, I'd prefer curly braces instead so that
it's more reminiscent of a dict instead of a tuple, although
technically it will look like a set literal.

Do you intend this "shortcut" syntax to also work in other contexts?
Because indeed if it looks like a set literal it would be confusing if
it emerges as a dict.


Some other possible syntaxes for a dict (which would have to be
unpacked in a function call) with string keys equal to the variable
name, i.e. {"foo": foo, "bar": bar}:

{*, foo, bar}

This looks like someone forgot an iterable after the `*`.

{**, foo, bar}

This resembles **kwargs, but when using it to unpack keyword arguments
it looks weird: `func(**{**, foo, bar})`.

{:, foo, bar}
{{ foo, bar }}

This is already valid syntax and attempts to store a set inside another set.

{* foo, bar *}

The `*foo` part is already valid syntax and the `bar *` looks like
someone forgot the second operand for the binary multiply.

{: foo, bar :}
{: foo, bar}

Personally in these cases I usually write dict(foo=foo, bar=bar)
instead of a dict literal because I don't like the quotes, but even
then I'm sad that I have to write the word 'dict'. So I would prefer
that we covered raw dicts rather than function calls, or both.

If at all, I'd prefer something like {:foo, :bar}. But anyway this takes
the proposal in a different direction.



___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SATYP4EW2ONMA4TFVFLWNILHTBWU3TNG/
Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/I2XUUDW2BZVX55QSJB3LQ6BRZTDMPQIM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Keyword arguments self-assignment

2020-04-16 Thread Dominik Vilsmeier

I'm not sure if this is doable from the compiler perspective, but what
about allowing tuples after `**` unpacking:

    requests.post(url, **(data, params))

    # similar to
    requests.post(url, data=data, params=params)

Probably some magic would need to happen in order to merge the names
with their values, but the same is true for the `post(url, data=,
params=)` syntax.


On 16.04.20 18:57, oliveira.rodrig...@gmail.com wrote:

@StevenDAprano and this goes for @RhodriJames , thank you for sharing your 
point of view. Indeed the proposed syntax is obscure and would not be that 
readable for beginners.

Couldn't we work around this somehow? The concept still seems good to me, it's just the 
syntax that is obscure; maybe something like this would work:

```python
# '*' character delimits that subsequent passed parameters will be passed
# as keyword arguments with the same name of the variables used
self.do_something(positional, *, keyword)
# this would be equivalent to:
self.do_something(positional, keyword=keyword)
```
I believe this is readable even if you don't know Python: `positional` and 
`keyword` are being passed as parameters; the `*` character is mysterious at 
first, but so it is in `def foo(a, *, b)`, and it doesn't get in the way of 
basic readability.

Rodrigo Martins de Oliveira
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/N2ZY5NQ5T2OJRSUGZJOANEQOGEQIYYIK/
Code of Conduct: http://python.org/psf/codeofconduct/

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/S44VDMZ4AFBSGIQEVWMKKWOW4P6WRVXY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Keyword arguments self-assignment

2020-04-16 Thread Dominik Vilsmeier

For function definitions, the introduction of `*` to mark keyword-only
parameters was consistent with existing syntax in a sense that `def
foo(*args, bar)` had `args` consume all positional arguments, so `bar`
can only be passed via keyword. Now using `def foo(*, bar)` just omits
the positional argument part, but leaves `bar` unchanged.

However for function calls you can have positional arguments following
argument unpacking:

    def foo(a, b):
        print(a, b)

    a, b = 2, 1
    foo(*[], b, a)  # prints "1 2"

Now omitting the unpacking part would change the meaning of what
follows, namely that it is to be interpreted as keyword arguments:

    foo(*, b, a)  # prints "2 1"

Sure you could argue that a lonely `*` in a function call has to have
some effect, so it must change what comes after it, but this slight
asymmetry between definition and calling of a function could be confusing.

On 16.04.20 19:54, Alex Hall wrote:


I beg to differ.  I do find "def foo(a, *, b)" gets in the way of
readability.

--
Rhodri James *-* Kynesim Ltd


In what way?

In any case, focusing on the calling syntax being proposed, is there
anything unreadable about:

foo(a, *, b)

compared to

foo(a, b=b)

? I think in the proposed syntax it's quite easy to understand that
we're passing arguments 'a' and 'b' even if we have no idea what the
'*' means, and it's small enough that it's fairly easy to mentally
filter out.

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NWJTHDZMZHXVEMEXKKA4TSFYU47IU5BB/
Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XFZ5VH5DKIFJ423FKCTHXPHDONAO3DFI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Exception spaces

2020-04-12 Thread Dominik Vilsmeier

On 12.04.20 02:20, Soni L. wrote:

I figured something better instead. you can have a class ESpace, but 
you use it like so:


    espace = ESpace()

    try:
    foo(espace=espace)
    except espace.module.submodule.Exception:
    ...

e.g. for builtins:

    espace = ESpace()

    try:
    raise espace.ValueError
    except espace.ValueError:
    ...

and it dynamically creates subclasses of whatever you give it. I'm not 
sure how doable this is in current python, but it's super close to 
what I want. so hey if it works well, we can promote it to the stdlib? 
just need to encourage ppl not to check the type of their espace 
argument so you can silently swap the external one for the stdlib one 
and nothing breaks.


(still need a better way to pass it into operators but eh)


This is possible for example by defining ESpace as a class which returns 
the corresponding exception subclasses via `__getattr__` and stores them 
in a cache. In order to work with a default (i.e. if no espace is 
provided) some base class which delegates attribute lookups directly to 
`builtins` would be required. Something like the following should do:


    import builtins
    import inspect


    class BaseESpace:
        def __getattr__(self, name):
            obj = getattr(builtins, name, None)
            if inspect.isclass(obj) and issubclass(obj, Exception):
                return obj
            else:
                raise AttributeError(name)


    class ESpace(BaseESpace):
        def __init__(self):
            self.cache = {}

        def __getattr__(self, name):
            try:
                return self.cache[name]
            except KeyError:
                custom = type(name, (super().__getattr__(name),), {})
                self.cache[name] = custom
                return custom


    def func_works(espace=BaseESpace()):
        raise espace.ValueError('foo')


    def func_bug(espace=BaseESpace()):
        int('xyz')


    espace = ESpace()

    try:
        func_works(espace=espace)
    except espace.ValueError:
        print('custom exception raised')
    except ValueError:
        print('builtin exception raised')

    try:
        func_works()
    except espace.ValueError:
        print('custom exception raised')
    except ValueError:
        print('builtin exception raised')

    try:
        func_bug()
    except espace.ValueError:
        print('custom exception raised')
    except ValueError:
        print('builtin exception raised')
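
If I'm not mistaken, running this sketch prints "custom exception
raised" once, followed by "builtin exception raised" twice.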


So it seems there are plenty of options to realize this in custom 
projects without a change to the syntax. In any case, these approaches 
require the API developers to add the extra `espace` parameter to their 
functions, so all of this can only work based on mutual agreement.


Regarding operators, you can always `try / except` and then re-raise a 
similar exception from `espace`.
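
A minimal sketch of that for item access, reusing `BaseESpace` from
above (`get_item` is just an illustrative name):

    def get_item(container, key, espace=BaseESpace()):
        # re-raise a "similar" exception taken from the espace
        try:
            return container[key]
        except KeyError as err:
            raise espace.KeyError(*err.args) from err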




___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/44WBWPOA36S2H5ZLUEFIZXUHG2AZGYHS/
Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RPSRBX2CBDHVFMDDV5NB7KPTYZOOPMES/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Exception spaces

2020-04-11 Thread Dominik Vilsmeier

If I understand correctly, you want a way of distinguishing between
exceptions that were explicitly and intentionally `raise`d as part of
an API and exceptions that were unintentionally raised due to bugs. So
for example:

    raise ValueError(f'Illegal value for xyz: {xyz}')  # part of the API
    foo['bug']  # this should have been 'bag' so it's a bug

In addition to that, the user of the API should be able to decide
whether they let the API raise in their *exception space* or not.

I think you could realize this in today's Python by using `raise ...
from espace`, where espace is an instance of a custom exception, and then
checking the resulting exception's `__cause__`. So for example:

    class ESpace(Exception):
        pass

    # I'm the user of an API and I want to distinguish their API errors
    # from bugs.
    espace = ESpace()
    try:
        api_func(espace=espace)
    except KeyError as err:
        if err.__cause__ is espace:  # it's part of the API
            pass
        else:  # it's a bug
            pass

And the API functions would have to raise their exceptions explicitly
from the provided `espace`:

    def api_func(espace=None):
        raise KeyError() from espace  # part of the API; sets the
                                      # exception's __cause__ to `espace`
        foo['bug']  # here __cause__ will be None, just like if no
                    # `espace` had been provided

It's probably an abuse of the exception mechanism and also relies on a
dunder, but for your own projects it could serve the purpose.


On 11.04.20 18:07, Soni L. wrote:

the reason I'm proposing this is that I like standard exception types
having well-defined semantics. this "special, very fragile, syntax for
the same purpose" doesn't take away from that, and instead just adds
to it.

it's a way of having multiple, independent, "local" dynamic scopes.
instead of just one global dynamic scope. and enables you to freely
use standard exception types.

if anything it's closer to passing in a namespace with a bunch of
standard exception types every time you wanna do stuff. which... I
could also get behind tbh. an stdlib addition that dynamically exposes
subclasses of the standard exception types, unique to that namespace
instance. would still need some way of passing it in to operators and
stuff tho.

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/HAT3Y77ONNJ5VKOFH6ZPFQ3LZXEQB2Q4/
Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/X7YXXEBILZNA52UUDN7PC4R6RHVAIXGO/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Make `del x` an expression evaluating to `x`

2020-03-12 Thread Dominik Vilsmeier

On Thu, 12 Mar 2020 at 21:10, Dominik Vilsmeier wrote:

If I wanted to split the computation over multiple lines and yet have it
optimized I would just reuse the same (target) name instead of creating
a temporary one and then discarding it in the next step:

  a = b * c
  a += d

This is __exactly__ the point: the numpy patch author said that, with
that patch, this is done automatically __without__ the need to split
the code.


The point is that sometimes you _do_ want to split your code over
multiple lines for the sake of clarity.
Using meaningful names for each of the steps seems intuitive but since
this leads to temporary objects it can hurt performance. I had the
impression that the original example was addressing exactly this case.
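
A small sketch of that trade-off, assuming NumPy arrays (the array size
is arbitrary):

    import numpy as np

    b, c, d = (np.ones(10**7) for _ in range(3))

    # readable, named steps -- but `b * c` allocates a temporary array:
    product = b * c
    result = product + d

    # reusing one target avoids the extra allocation:
    result = b * c
    result += d  # in-place addition, no new temporary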



I think this is not entirely a bad idea (the idea, not the
implementation). The problem is this can be safely done only for
immutables. Furthermore this will speed up things only if the object
is large. The split code will be slower for any other case.

Maybe this can be implemented for tuples too, the only other immutable
in Python, apart str, that can be large. But I rarely see operations
with tuples that needed to be optimized.

`int` can be large too; for example `1024 ** 1024 ** 1024`.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PLVRHDOJFYRTBXULZ4VDIFKHNPZC4KBA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Incremental step on road to improving situation around iterable strings

2020-02-24 Thread Dominik Vilsmeier

I agree, a warning that is never converted to an error indicates that
this is more about style than behavior (and in that sense it is use case
specific). It would also be annoying for people who intentionally
iterate over strings and find this a useful feature.

So this sounds more like a job for a linter or, because it's dealing
with types, a type checker. So what about this compromise: mypy, for
example, could add a flag to treat strings as atomic, i.e. it would then
flag usage of strings where an iterable or a sequence is expected. Would
that solve the problem?
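
For instance, mypy already catches the accidental-return case mentioned
earlier in this thread (output paraphrased):

    from typing import List

    def names() -> List[str]:
        return "alice"  # mypy: Incompatible return value type
                        # (got "str", expected "List[str]")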

On 24.02.20 23:31, Paul Moore wrote:

On Mon, 24 Feb 2020 at 20:13, Alex Hall  wrote:

Conversely, I can't remember a case where I've ever accidentally
iterated over a string when I meant not to.

Do you ever return a string from a function where you should have returned a 
list containing one string? Or similarly passed a string to a function? 
Forgotten to put a trailing comma in a singleton tuple? Forgotten to add 
.items() to `for key, value in kwargs:`?

Not that I remember - that's what I said, basically. No, I'm not
perfect (far from it!) but I don't recall ever hitting this issue.


compelling arguments are typically
around demonstrating how much code would be demonstrably better with
the new behaviour

That represents a misunderstanding of my position. I think I'm an outlier among 
the advocates in this thread, but I do not believe that implementing any of the 
ideas in this proposal would significantly affect code that lives in the long 
term. Some code would become slightly better, some slightly worse.

I beg to differ.

* Code that chooses to use `.chars()` would fail to work on versions
of Python before whatever version implemented this (3.9? 3.10?). That
makes it effectively unusable in libraries for years to come.
* If you make iterating over strings produce a warning before
`.chars()` is available as an option for any code that would be
affected, you're inflicting a warning on all of that code.
* A warning that will never become an error is (IMO) unacceptable.
It's making it annoying to use a particular construct, but with no
intention of ever doing anything beyond annoying people into doing
what you want them to do.
* A warning that *will* become an error just delays the problem -
let's assume we're discussing the point when it becomes an error.

As a maintainer of pip, which currently still supports Python 2.7, and
which will support versions of Python earlier than 3.9 for years yet,
I'd appreciate it if you would explain what pip should do about this
proposed change. (Note: if you suggest just suppressing the warning,
I'll counter by asking you why we'd ever remove the code to suppress
the warning, and in that case what's the point of it?)

And pip is an application, so easier. What about the `packaging`
library? What should that do? In that case, modifying global state
(the warning config) when the library is imported is generally
considered bad form, so how do we protect our users from this warning
being triggered by our code? Again, we won't be able to use `.chars()`
for years.

Boilerplate like

if sys.version_info >= (3, 9):
    def chars(s):
        return s.chars()
else:
    def chars(s):
        return s

would be an option, but that's a lot of clutter for every project to
add for something that *isn't a problem* - remember, long-running,
well-maintained libraries with a broad user base will likely have
already flushed out any bugs that might result from accidentally
iterating over strings. And these days, projects often use mypy which
will catch such errors as well. So this is literally useless
boilerplate for them.


My concern surrounds the user experience when debugging code that accidentally 
iterates over a string. So it's impossible for me to show you code that becomes 
significantly better because that's not what I'm arguing about, and it's unfair 
to say that quoting people who have struggled with these bugs is not evidence 
for the problem.

OK. That's a fair point. But why can't we find other approaches? Type
checking with mypy would catch returning a string when it should be a
list of strings. Same with all of your other examples above. How was
your experience suggesting mypy for this type of problem? I suspect
that, as you are talking about beginners, you didn't inflict anything
that advanced on them - is there anything that could be done to make
mypy more useful in a beginner context?


I would like to reiterate a point that I think is very important and many 
people seem to be brushing aside. We don't have to *break* existing code. We 
can get a lot of value, at least in terms of aiding debugging, just by adding a 
warning.

Years of experience maintaining libraries and applications have
convinced me that warnings can cause as much "breakage" as any other
change. Just saying "you can suppress them" doesn't make the problem
go away. And warnings that are suppressed by default are basically

[Python-ideas] Re: Make ~ (tilde) a binary operator, e.g. __sim__(self, other)

2020-02-24 Thread Dominik Vilsmeier

But that behavior is specific to Jupyter and so it could similarly
provide auto-complete or syntax highlighting for usage of
`smf.ols(formula=...)`. Like PyCharm provides syntax highlighting for
strings used as the `pattern` argument for the various functions of the
`re` module. So if it's only for the sake of auto-complete or syntax
highlighting, this is on the IDE and not on the language, in my opinion.

On 24.02.20 17:48, David Mertz wrote:

I get auto-complete on column names in Pandas when I'm in Jupyter. But
yes, I agree with you.

On Mon, Feb 24, 2020, 11:43 AM Dominik Vilsmeier <dominik.vilsme...@gmx.de> wrote:

I don't see what's wrong with the status quo:

 smf.ols(formula='Lottery ~ Literacy + Wealth + Region', data=df)

If I understand correctly you want to use instead:

 smf.ols(formula=df.Lottery ~ df.Literacy + df.Wealth + df.Region)

Or since some people favor indexing over attribute access for
column names:

 smf.ols(formula=df['Lottery'] ~ df['Literacy'] + df['Wealth'] +
df['Region'])

Both alternatives are much more verbose since you have to repeat the
`df` part or even worse the brackets for indexing. In any case you need
to type the column names that you would like to include and there's no
auto-complete on column names that would help you typing it. So I don't
see what the benefit of the operator version is.

In addition this requires Pandas to implement the modeling but there's
much more to Pandas than just modeling so perhaps that better remains a
separate project.

On 24.02.20 01:27, Aaron Hall via Python-ideas wrote:
> I have no behavior for integers in mind. I would expect
> high-level libraries to want to implement behavior for it.
>
> - sympy
> - pandas, numpy, sklearn, statsmodels
> - other mathematically minded libraries (monadic bind or compose?)
>
> To do this we need a name. I like `__sim__`. Then we'll need
> `__rsim__` and `__isim__` for completeness. We need to make room
> for it in the grammar. Is it ok to give it the same priority of
> evaluation as `+` or `-`, or slightly higher?
>
> In the past we've made additions to the language when we've been
> parsing and evaluating strings. That's what we're currently doing
> in statsmodels right now because we lack the binary (in the sense
> of two-arguments) `~`.
>
> See: https://www.statsmodels.org/dev/example_formulas.html
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at

https://mail.python.org/archives/list/python-ideas@python.org/message/JWC4HJVTHQA532VIW62UXVPMOEVVR2IT/
> Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at

https://mail.python.org/archives/list/python-ideas@python.org/message/VQOYS4E5DVLZITGKLREB3YGPU6NEUNHR/
Code of Conduct: http://python.org/psf/codeofconduct/

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/5XV45NDBN4UW5JZ3FXDFQ3IG4K7EOJCN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Make ~ (tilde) a binary operator, e.g. __sim__(self, other)

2020-02-24 Thread Dominik Vilsmeier


On 24.02.20 13:24, Rhodri James wrote:

This seems a lot like trying to shoehorn something in so one can write
idiomatic R in Python.  That on the whole sounds like a bad idea; a
friend of mine use to say he could write FORTRAN in any language but
no one else could read it.  Wouldn't it be more pythonic (or more
accurately anything-other-than-R-ic) to use an interface that was more
like

model = model_fn(prediction, seq_of_predictors, data_table)


The problem here is that the `seq_of_predictors` doesn't include a way
for specifying their relationship with `prediction`, i.e. one cannot
(easily) distinguish

    P ~ X + Y + Z

versus

    P ~ X * Y + Z
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TSW3HJFVPY6UWSILKIHFA64QUU7RE3EX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Make ~ (tilde) a binary operator, e.g. __sim__(self, other)

2020-02-24 Thread Dominik Vilsmeier

I don't see what's wrong with the status quo:

    smf.ols(formula='Lottery ~ Literacy + Wealth + Region', data=df)

If I understand correctly you want to use instead:

    smf.ols(formula=df.Lottery ~ df.Literacy + df.Wealth + df.Region)

Or since some people favor indexing over attribute access for column names:

    smf.ols(formula=df['Lottery'] ~ df['Literacy'] + df['Wealth'] +
df['Region'])

Both alternatives are much more verbose since you have to repeat the
`df` part or even worse the brackets for indexing. In any case you need
to type the column names that you would like to include and there's no
auto-complete on column names that would help you typing it. So I don't
see what the benefit of the operator version is.

In addition this requires Pandas to implement the modeling but there's
much more to Pandas than just modeling so perhaps that better remains a
separate project.

On 24.02.20 01:27, Aaron Hall via Python-ideas wrote:

I have no behavior for integers in mind. I would expect high-level libraries to 
want to implement behavior for it.

- sympy
- pandas, numpy, sklearn, statsmodels
- other mathematically minded libraries (monadic bind or compose?)

To do this we need a name. I like `__sim__`. Then we'll need `__rsim__` and 
`__isim__` for completeness. We need to make room for it in the grammar. Is it 
ok to give it the same priority of evaluation as `+` or `-`, or slightly higher?

In the past we've made additions to the language when we've been parsing and 
evaluating strings. That's what we're currently doing in statsmodels right now 
because we lack the binary (in the sense of two-arguments) `~`.

See: https://www.statsmodels.org/dev/example_formulas.html
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JWC4HJVTHQA532VIW62UXVPMOEVVR2IT/
Code of Conduct: http://python.org/psf/codeofconduct/

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/VQOYS4E5DVLZITGKLREB3YGPU6NEUNHR/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Proposal: Complex comprehensions containing statements

2020-02-22 Thread Dominik Vilsmeier


 
I also use PyCharm but I don't fold comprehensions; ideally I don't have to
since comprehensions are meant to be simple and concise. Folding a
comprehension takes away all the information, including the input to the
operation.

Regarding names, the example function you presented, `clean`, isn't very
expressive. For example `strip_and_filter_empty_lines` would be clear about
the involved operations. Naming the result instead would be something like
`stripped_and_empty_lines_removed`. Not very nice, especially since, when
you're reusing that name elsewhere, you carry around that verbosity. It's
easier and cleaner to name actions than the results of those actions. And
again, with a function call it's clear where the result originates from (the
function argument) but a folded comprehension hides that information.

One of the points of using a function is to not define it in the local scope,
but in some other namespace, so it can be reused and tested. Even if you find
the need to define a local non-one-liner, prefixing it with an underscore most
likely prevents name clashes and even if not, PyCharm will readily report the
problem.

On 2/22/20, 10:15 Alex Hall wrote:

> 1. At least in my editor (PyCharm), I can collapse (fold) list
>    comprehensions just as easily as functions.
> 2. In this example the list comprehension already has a name -
>    `clean_lines`. Using a function actually forces me to come up with a
>    second pointless name.
> 3. On the note of coming up with names, if I want to use a local function
>    (which I often do) then I also have to worry about names in two scopes
>    clashing, and might invent more pointless names.
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/SPWPNOCSZMOFAF45XYDCXD4MQR5LKXZO/
> Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FLKGHCVNDUWGBVRM3YEQVPRJXCI5KHR4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Proposal: Complex comprehensions containing statements

2020-02-22 Thread Dominik Vilsmeier


 
You have to consider that for the reader of the code that's one line
containing a function call versus six lines containing that comprehension. An
advantage of functions is that you can hide the implementation but a
comprehension always contains all the code. If the function name is clear
about what it does you don't have to look at the function itself. So I prefer
to read

    lines = list(clean())

versus a six lines long list comprehension (though the function name could be
improved).

On 2/22/20, 07:57 Alex Hall wrote:

> > You might be able to avoid calling the method twice using the walrus
> > operator.
>
> I specifically discussed the walrus operator solution, but both you and
> Dominik Vilsmeier seem to have missed that.
>
> > I'd use the list constructor with a named function anyway, rather than
> > inlining it in a comprehension. I consider that more readable.
>
> I'm curious, how do you find this:
>
>     def clean():
>         for line in lines:
>             line = line.strip()
>             if line:
>                 yield line
>
>     clean_lines = list(clean())
>
> more readable than this?
>
>     clean_lines = [
>         for line in lines:
>             line = line.strip()
>             if line:
>                 yield line
>     ]
>
> It's not that I find my version particularly readable, but I don't see how
> it's worse.
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/UNMZTO7QGYD53SUWSFMGZEVUPEIOSAVF/
> Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2IBR3DIHISUMIXDTTFH6FXTHLAYKFOFL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Proposal: Complex comprehensions containing statements

2020-02-21 Thread Dominik Vilsmeier


 
This syntax basically allows putting any code from functions into
comprehensions. This enables people to write fewer functions at the cost of
more complex comprehensions. But functions are a good idea indeed:

* They have a name and from that name it should be clear what that function
  does without having to look at the implementation.
* Functions can have documentation to provide more information about their
  behaviour.
* Functions have meaningful arguments possibly with type annotations that
  make it clear what input and output of that operation is.
* Functions can be reused.
* Functions can be tested.

With that extended comprehension syntax you'll lose all of the above
benefits. Since people can put arbitrarily complex code into the
comprehensions this puts a burden on whoever has to read the code.

As for your first example this is already possible via

    [stripped for line in lines if (stripped := line.strip())]

The matrix example is already pretty complex, and the comprehension difficult
to read.

Plus for simple comprehensions you now have two ways to write them:

    [f(x) for x in stuff]
    [for x in stuff: f(x)]

I think the first version is much better because you can see immediately what
the resulting list contains: f(x) objects. In the second version you have to
reach the end of the comprehension to get that information and the `for x in`
part is not really interesting.

On 2/21/20, 14:49 Alex Hall wrote:

  > Yes but then it's the same as defining a generator-function.
   
   List comprehensions are already the same as other things, but they're nice anyway. `lambda` is the same as defining a function, but it's nice too. Syntactic sugar is helpful sometimes. I think this:
   
   clean = [
   for line in lines:
   stripped = line.strip()
   if stripped:
   yield stripped
   ]
   
   is easily nicer than this:
   
   def clean_lines():
   for line in lines:
   line = line.strip()
   if line:
   yield line
   
   clean = list(clean_lines())
   
   And this:
   
   new_matrix = [
   for row in matrix: yield [
   for cell in row:
   try:
   yield f(cell)
   except ValueError:
   yield 0
   ]
   ]
   
   is nicer than any of these:
   
   new_matrix = []
   for row in matrix:
   def new_row():
   for cell in row:
   try:
   yield f(cell)
   except ValueError:
   yield 0
   
   new_matrix.append(list(new_row()))
   
   
   
   def new_row(row):
   for cell in row:
   try:
   yield f(cell)
   except ValueError:
   yield 0
   
   new_matrix = [list(new_row(row)) for row in matrix]
   
   
   
   def safe_f(cell):
   try:
   return f(cell)
   except ValueError:
   return 0
   
   new_matrix = [
   [
   safe_f(cell)
   for cell in row
   ]
   for row in matrix
   ]
   
    > I think it's ambiguous, like in this example:
    >
    >     clean = [
    >         for line in lines:
    >             stripped = line.strip()
    >             if stripped:
    >                 stripped
    >     ]
    >
    > what says that it's the last stripped that should be yielded?

    Because it's the only statement that *can* be yielded. The `yield` is implicit when there's exactly one statement you can put it in front of. You can't `yield stripped = line.strip()`. You can technically have `stripped = yield line.strip()` but we ignore those possibilities.

    > > If that function is the whole statement and there is no other
    > > expression statement in the comprehension, it will be yielded. I can't
    > > tell if there's more to your question.
    >
    > Imagine this one:
    >
    >     foo = [
    >         for x in range(5):
    >             f(x)
    >             if x % 2:
    >                 x
    >     ]
    >
    > what will be the result?

    It will be a SyntaxError, because it's ambiguous.
   
    Here's a new idea: `yield` is only optional in inline comprehensions, i.e. where the loop body consists entirely of a single expression. So for example this is allowed:
   
   new_row = [for cell in row: f(cell)]
   
   but this is not:
   
   new_row = [
   for cell in row:
   thing = g(cell)
   f(thing)
   ]
   
   Instead the user must write `yield f(thing)` at the end.
   
   This would mean that you only need to add `yield` when the comprehension is already somewhat long so it's less significant, and there's only one very simple special case to learn about.
   ___
   Python-ideas mailing list -- python-ideas@python.org
   To unsubscribe send an email to python-ideas-le...@python.org
   
  https://mail.python.org/mailman3/lists/python-ideas.python.org/
   Message archived at 
  https://mail.python.org/archives/list/python-ideas@python.org/message/4QSZ5LCWKBDHFR3VRMPVAV2C5JIODSEP/
   Code of Conduct: 
  http://python.org/psf/codeofconduct/
   
 
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/D52IFGBE54NUHZVFSCKMHRR3JZVLJQUX/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Really long ints

2020-02-05 Thread Dominik Vilsmeier
Dan Sommers wrote:
> On Wed, 5 Feb 2020 11:09:16 +
> Jonathan Fine jfine2...@gmail.com wrote:
> > How about something like:
> >
> > >>> def t1(*argv):
> > ...     value = 0
> > ...     for n in argv:
> > ...         value *= 1_000
> > ...         value += n
> > ...     return value
> > >>> t1(123, 456, 789)
> > 123456789
> >
> > Similarly, define t2 to use 1_000_000, t3 to use 1_000_000_000 and so
> > on, instead of 1_000. For really big numbers, you might want to use t10.
>
> Someone previously asked about a "base"; your idea could be extended to
> accommodate same:
>
> >>> def tbuilder(base):
> ...     def t(*argv):
> ...         value = 0
> ...         for n in argv:
> ...             value *= base
> ...             value += n
> ...         return value
> ...     return t
> >>> tbuilder(1000)(123, 456, 789)
> 123456789

This won't work when leading zeros are involved, e.g. consider `123_006_789`:
`t(123, 006, 789)` gives a SyntaxError. Also it's weird that when we want to
write an int literal we end up with two function calls and individual
numbers as arguments (+ you'd need to have that function handy in the first
place). It would be clearer to use a kw-only argument `base=1000` but even then
it doesn't really convey the idea of a literal.

> > If you're dealing with really big integers (say 1000 digits or more)
> > then you might want to use https://pypi.org/project/gmpy2/, in
> > which case you'll appreciate the extra flexibility provided by
> > t10. (This would allow t10 to return a gmpy integer, if import gmpy2
> > succeeds.)
> > +1
> > Finally, perhaps really big numbers should be stored
> > as data, rather
> > than placed in source code files. (For example, this would allow these
> > large pieces of data to be verified, via a secure hash.)
> > +1
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/HR73ICP2D43OTOCRZAXFU2LPNOXNQI3E/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Ability to specify function for auto-test in Enum and pass the given value as an argument to _generate_next_value_

2019-10-27 Thread Dominik Vilsmeier
Steve Jorgensen wrote:
> After messing around with Enum for a while, there's one small thing that
> I'd like to see improved. It seems limiting to me that the only way to trigger
> _generate_next_value is to pass auto().
> What if, for a particular Enum, I would like to be able to use
> () as a shorthand for auto()? How about a more complex
> auto-generation that determines the final value based on both the given value 
> and the
> name/key

What's wrong with `auto()`? Why do you need a shorthand for that? Also `()` is 
an empty tuple so it's just another enum value.

> As an example of the second case, start with an Enum subclas in which each
> item, by default, has the name/key as its value and a prettified version of 
> the name as
> its label, but one can also specify the value and label using a 2-item tuple 
> for the
> value. Now, let's say we want to be able to specify a custom label and still 
> use the
> name/key as the value either as a None/label tuple or as a string starting 
> with a
> comma.

It seems that your second case can be accomplished by using a custom descriptor 
that receives the field name via `__set_name__`. For example:

from enum import Enum


class Value:
    def __init__(self, val):
        self.val = val
        self.name = None

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        return f'{self.name} {self.val}'

    def __set__(self, instance, value):
        raise AttributeError


class MyEnum(Enum):
    RED = Value(1)
    GREEN = Value(2)
    BLUE = Value(3)

> Using a custom test for auto, one could identify those cases. Passing the
> assigned
> value to the _generate_next_value_ function would allow it to make use of
> that information. For backward compatibility, the signature of the
> _generate_next_value_ function can be checked to make sure it can accept the
> extra argument for that before passing that.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/6GVFYIT6JYAWFYK6BY53AJSGQRB5L5Y6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 584: Add + and += operators to the built-in dict class.

2019-10-23 Thread Dominik Vilsmeier
Jan Greis wrote:
> On 22/10/2019 06:43, Richard Musil wrote:
> > It is not a "concatenation" though, because you lost
> > {"key1": "val1"} 
> > in the process. The concatenation is not _just_ "writing something 
> > after something", you can do it with anything, but the actual 
> > operation, producing the result.
> My point is that if I saw {"key1": "val1", "key2": "val2"} + {"key1":
> "val3"}, I would expect that it would be equivalent to {"key1": "val1",
> "key2": "val2", "key1": "val3"}.

But that reasoning only works with literals. And chances are that you're not 
going to see something like this in real code. Because why would you add two 
dict literals?

Instead you're going to see something like this: `d1 + d2`. And if one has to 
infer the details of that operation by coming up with some hypothetical example 
involving literals, that doesn't speak in favor of the syntax.

As mentioned, here it is up to the variable names to be clear about what 
happens. E.g.

default_preferences + user_preferences

For that example it's pretty clear that `user_preferences` is meant to 
supersede `default_preferences`. But variable names might not always be 
completely clear or even if they are, they might not allow the reader to infer 
any precedence. And then, "in the face of [that] ambiguity", one has to "refuse 
the temptation to guess". Maybe it's better not to introduce that ambiguity in 
the first place.

> Similarly, I would expect that
> deque([1, 2, 3], maxlen=4) + deque([4, 5]) == deque([1, 2, 3, 4, 5], 
> maxlen=4) == deque([2, 3, 4, 5], maxlen=4)
> which indeed is true.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RDBCJGWMK45RY676YEXQATIPHWMLVQ3Z/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

2019-10-23 Thread Dominik Vilsmeier
I don't see what's wrong with `["one", "two", "three"]`. It's the most explicit 
and from the compiler perspective it's probably also as optimal as it can get. 
Also it doesn't hurt readability. Actually it helps. With syntax highlighting 
the word boundaries immediately become clear.

If you're having long lists of string literals and you're annoyed by having to 
type `"` and `,` for every element, then it is the job of your IDE to properly 
support you while coding, not the job of the syntax (as long as it's clear and 
concise).

For that reason all the advanced IDEs with all their features exists. Without 
code completion for example you could also ask for new syntax that helps you 
abbreviating long variable names, because it's too much to type. So instead of 
writing `this_is_a_very_long_but_expressive_name` you could do `this_is...` in 
case there's only one name that starts with "this_is" which can be resolved 
from your scope. That would even shorten the code. Nevertheless I think that 
code completion is a good idea and that we have to use the exact same name 
every time.

The same applies to these "word literals". If you need a list of words, you can 
already create a list literal with the words inside. If that's too much typing, 
then you should ask your favorite IDE to implement corresponding refactoring 
assistance. I'm pretty sure the guys at PyCharm would consider adding something 
like this (e.g. if the caret is inside a string literal you can access the 
context menu via + and there could be something like "split words").

Steve Jorgensen wrote:
> See 
> https://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Literals#The_%_Notatio...
> for what Ruby offers.
> For me, the arrays are the most useful aspect.
> %w{one two three}
> => ["one", "two", "three"]
> 
> I did a search, and I don't see that this has been suggested before, but I 
> might have
> missed something. I'm guessing I'm not the first person to ask whether this 
> seems like a
> desirable feature to add to Python.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/J7X7BGBNZY43NANEB5OLJXCQFMZ7KHJH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 584: Add + and += operators to the built-in dict class.

2019-10-21 Thread Dominik Vilsmeier
Steven D'Aprano wrote:
> On Sat, Oct 19, 2019 at 02:02:43PM -0400, David Mertz wrote:
> > The plus operation on two dictionaries feels far more
> > natural as a
> > vectorised merge, were it to mean anything.  E.g., I'd expect
> > {'a': 5, 'b': 4} + {'a': 3, 'b': 1}
> > {'a': 8, 'b': 5}
> > Outside of Counter when would this behaviour be useful?

For example one could use dicts to represent data tables, with the keys being 
either indices or column names and the values being lists (rows or columns). 
Then for joining two such tables it would be desirable if values are added, 
because then you could simply do `joint_table = table1 + table2`.

Or having a list of records from different sources:

purchases_online = {'item1': [datetime1, datetime2, ...], 'item2': ...}
purchases_store = {'item1': [datetime3, datetime4, ...], ...}
# Records from both sources should be concatenated:
purchases_overall = purchases_online + purchases_store
# Then doing some analysis on the overall purchases.

`pandas.Series` also behaves dict-like (almost) and does add the values on "+".
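
For instance (a quick REPL illustration):

    >>> import pandas as pd
    >>> pd.Series({'a': 5, 'b': 4}) + pd.Series({'a': 3, 'b': 1})
    a    8
    b    5
    dtype: int64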

> I expect that this feels natural to you because you're thinking about 
> simple (dare I say "toy"?) examples like the above, rather than 
> practical use-cases like "merging multiple preferences":
> prefs = defaults + system_prefs + user_prefs
> 
> # or if you prefer the alternative syntax
> prefs = defaults | system_prefs | user_prefs
> 
> (Note that in this case, the lack of commutativity is a good thing: we 
> want the last seen value to win.)

In this case you'd have to infer the order of precedence from the variable 
names, not the "+" syntax itself. I.e. if you had spelled it `a + b + c` I 
would have no idea whether `a` or `c` has highest precedence. Compare that with 
a "directed" operator symbol (again, I'm not particularly arguing for "<<"):

prefs = defaults << system_prefs << user_prefs

Here it becomes immediately clear that `system_prefs` supersedes `defaults` and 
`user_prefs` supersedes the other two. A drawback of "+"  here is that you 
can't infer this information from the syntax itself.

Also I'm not sure if this is a good example, since in case something in 
`system_prefs` changes you'd have to recompute the whole thing (`prefs`), since 
you can't tell whether that setting was overwritten by `user_prefs`. I think in 
such a case it would be better to use `collections.ChainMap` for providing a 
hierarchy of preferences, which lets you easily update each level.
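
A minimal sketch of that `ChainMap` approach (the variable names are
illustrative):

    from collections import ChainMap

    defaults = {'theme': 'light', 'lang': 'en'}
    system_prefs = {'lang': 'de'}
    user_prefs = {'theme': 'dark'}

    # Lookups search left to right, so user_prefs supersedes system_prefs,
    # which supersedes defaults.
    prefs = ChainMap(user_prefs, system_prefs, defaults)
    assert prefs['theme'] == 'dark' and prefs['lang'] == 'de'

    # Each level stays independently updatable; lookups see the change.
    system_prefs['lang'] = 'fr'
    assert prefs['lang'] == 'fr'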

> Dicts are a key:value store, not a multiset, and outside of specialised 
> subclasses like Counter, we can't expect that adding the values is 
> meaningful or even possible. "Adding the values" is too specialised and 
> not general enough for dicts, as a slightly less toy example might show:
> d = ({'customerID': 12932063,
>   'purchaseHistory': ,
>   'name': 'Joe Consumer',
>   'rewardsID': 391187} 
>   + {'name': 'Jane Consumer', 'rewardsID': 445137}
>   )
> 
> Having d['name'] to be 'Joe ConsumerJane Consumer' and d['rewardsID'] to 
> be 836324 would be the very opposite of useful behaviour.

I agree that adding the values doesn't make sense for that example but neither 
does updating the values. Why would you want to take a record corresponding to 
"Joe Consumer" and partially update it with data from another consumer ("Jane 
Consumer")? Actually I couldn't tell what the result of that example should be.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ASKUQIDTVYW2EJ6JBOPJ5QZOKYRH62UC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 584: Add + and += operators to the built-in dict class.

2019-10-21 Thread Dominik Vilsmeier
Steven D'Aprano wrote:
> On Sun, Oct 20, 2019 at 11:29:54PM -0000, Dominik Vilsmeier wrote:
> > The question is, why would someone who has experience
> > with adding 
> > counters but never felt the need to add dicts, assume that this 
> > behavior is specialized in Counter and not inherited by
> > dict.
> I think you might mean inherited from dict? dict doesn't inherit
> Counter's behaviour because Counter is the subclass and dict the parent
> class.

Yes, sorry for the confusion, I meant "inherited from". One of the occasional 
non-native speaker issues :-)

> > Maybe at some point they'll encounter a scenario
> > where they need to 
> > recursive-merge (the Counter style) two dicts and then they might 
> > assume that they just need to add the dicts
>
> Okay. So what? If they do this, it will be a mistake. Programmers make
> mistakes thousands of times a day, it is neither our responsibility nor 
> within our power to prevent them all.
> Programmer error is not a good reason to reject a feature.

But it is our responsibility to assist programmers and help them make as few
errors as possible by providing a clear and unambiguous syntax. If a specific
syntax feature is ambiguous in its meaning, it's more likely to be an attractor 
of errors.

> > since they're familiar 
> > with this behavior from Counter and Counter subclasses
> > dict so 
> > it's reasonable to assume this behavior is inherited.
>
> No it isn't reasonable. Counters are designed to count. Their values
> are supposed to be ints, usually positive ints. dicts are general 
> key:value stores where the values can be any kind of object at all, not 
> just numbers or even strings. Most objects don't support addition.
> It is totally unreasonable to assume that dict addition will add values 
> by default when by default, objects cannot be added.

I fully agree to that. But someone working with dicts which store floats or 
lists might be tempted to assume that `d1 + d2` means "add the values" 
(especially if they're reading the code). If in that specific context it's 
perfectly fine (and maybe even reasonable) to add dict values it is more 
difficult to neglect that assumption (unless they're already familiar with the 
syntax). Yes, it's the programmers responsibility to be aware of what specific 
syntax does, but the language should assist as much as possible.

> > how obvious is the conclusion that dict performs a 
> > shallow merge and resolves conflicting keys by giving precedence to 
> > the r.h.s. operand?
>
> About as obvious that update performs a shallow merge and resolves
> duplicate keys by giving precedence to the last seen value.

Only if you know that "+" means "update" in that specific context. Otherwise 
one could even think that there are already ways to copy-merge two dicts, so 
why would they introduce new syntax for that, so "+" must mean something
else (possibly the complementary behavior, preserving l.h.s. values).
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/Y4YKYHQCSEQKIVURCWZCOOJI7PODRK5G/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 584: Add + and += operators to the built-in dict class.

2019-10-21 Thread Dominik Vilsmeier
Steven D'Aprano wrote:
> On Sun, Oct 20, 2019 at 11:48:10PM -0000, Dominik Vilsmeier wrote:
> > Regarding "|" operator, I think a drawback is the
> > resemblance with 
> > "or" (after all it's associated with "__or__") so people might assume 
> > behavior similar to x or y where x takes precedence (for truthy
> > 
> > values of x). So when reading d1 | d2 one could falsely assume 
> > that values in d1 take precedence over the ones in d2 for 
> > conflicting keys. And this is also the existing set behavior (though 
> > it's not really relevant in this case):
> > There's a much easier way to demonstrate what you did:
> >>> {1} | {1.0}
> {1}
> 
> In any case, dict.update already has this behaviour:
> >>> d = {1: 'a'}
> >>> d.update({1.0: 'A'})
> >>> d
> {1: 'A'}
> 
> The existing key is kept, only the value is changed. The PEP gives a 
> proposed implementation, which if I remember correctly is:
> # d1 | d2
> d = d1.copy()
> d.update(d2)
> 
> so it will keep the current dict behaviour:
> 
> keys are stable (first key seen wins)
> values are updated (last value seen wins)

Exactly, so the dict "+" behavior would match the set "|" behavior, preserving 
the keys. But how many users will be concerned about whether the keys are going 
to be preserved? I guess almost everybody will want to know what happens with 
the values, and that question remains unanswered by just looking at the "+" or 
"|" syntax. It's reasonable to assume that values are preserved as well, i.e. 
`d1 + d2` adds the missing keys from `d2` to `d1`. Of course, once you know 
that "+" is actually similar to "update" you can infer that the last value 
wins. But "+" simply doesn't read "update". So in order to know you'll have to 
look it up, but following that argument you could basically settle on any 
operator symbol for the update operation. A drawback of "+" is that different 
interpretations are plausible, and this fact cannot be denied as can be seen 
from the ongoing discussion. Of course one can blame the programmer, if they 
didn't check the documentation carefully enough, also since "in the 
 face of ambiguity, refuse the temptation to guess". But in the end the 
language should assist the programmer and it's better not to introduce 
ambiguity in the first place.

> 
> I think that, strictly speaking, this "keys are stable" behaviour is not 
> guaranteed by the language reference. But it's probably so deeply built 
> into the implementation of dicts that it is unlike to ever change. (I 
> think Guido mentioned something about it being a side-effect of the way 
> dict __setitem__ works?)
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EX2B2QKNVKSTEW7GFPJBLZF7S4TG25R7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 584: Add + and += operators to the built-in dict class.

2019-10-20 Thread Dominik Vilsmeier
Guido van Rossum wrote:
> So the choice is really only three way.
> 1) Add d1 + d2 and d1 += d2 (using similarity with list + and +=)
> 2) Add d1 | d2 and d1 |= d2 (similar to set | and |=)
> 3) Do nothing
> We're not going to introduce a brand new operator for this purpose, nor are
> we going to use a different existing operator.

I didn't mean to argue for another operator, but rather to point out that 
i.m.o. "+" is not a good choice, for similar reasons why I think 
`collections.deque` shouldn't support "+" (regarding ambiguity of precedence 
and potential "loss" of data). Besides for `dict` even more interpretations of 
the meaning of "+" are plausible.

Regarding "|" operator, I think a drawback is the resemblance with "or" (after 
all it's associated with "__or__") so people might assume behavior similar to 
`x or y` where `x` takes precedence (for truthy values of `x`). So when reading 
`d1 | d2` one could falsely assume that values in `d1` take precedence over the 
ones in `d2` for conflicting keys. And this is also the existing `set` behavior 
(though it's not really relevant in this case):

>>> class Test:
...     def __init__(self, x):
...         self.x = x
...     def __hash__(self):
...         return 0
...     def __eq__(self, other):
...         return True
...
>>> s = {Test(1)} | {Test(2)}
>>> s.pop().x  # leftmost wins.
1

> The asymmetry of the operation (in case there are matching keys with
> conflicting values) doesn't bother me, nor does the behavior of Counter
> affect how I feel about this.
> The += or |= operator will have to behave identical to d1.update(d2) when
> it comes to matching keys.
> I'm not sure whether += or |= needs to be an exact alias for dict.update.
> For lists, += and .extend() behave identically: both accept arbitrary
> sequences as right argument. But for sets, |= requires the right argument
> to be a set, while set.update() does not. (The not-in-place operators
> always require matching types: l1 + l2 requires l2 to be a list, s1 | s2
> requires s2 to be a set.) But this is only a second-order consistency issue
> -- we should probably just follow the operator we're choosing in the end,
> either + or |.
> IMO the reason this is such a tough choice is that Python learners are
> typically introduced to list and dict early on, while sets are introduced
> later. However, the tutorial on docs.python.org covers sets before dicts --
> but lists are covered much earlier, and dicts make some cameo appearances
> in the section on control flow. Perhaps more typical, the tutorial at
> https://www.tutorialspoint.com/python/
> discusses data types in this order:
> numbers, strings, lists, tuples, dictionary, date -- it doesn't
> mention sets at all. This matches Python's historical development (sets
> weren't added until Python 2.3).
> So if we want to cater to what most beginners will know, + and += would be
> the best choice. But if we want to be more future-proof and consistent, |
> and |= are best -- after all dicts are closer to sets (both are hash
> tables) than to lists. (I know you can argue that dicts are closer to lists
> because both support __getitem__ -- but I find that similarity shallower
> than the hash table nature.)
> In the end I'm +0.5 on | and |=, +0 on + and +=, and -0 on doing nothing.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RY43TMSILA5AFS5ENDFFD4LASHC2IKBD/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 584: Add + and += operators to the built-in dict class.

2019-10-20 Thread Dominik Vilsmeier
Christopher Barker wrote:
> On Sat, Oct 19, 2019 at 3:14 PM Dominik Vilsmeier dominik.vilsme...@gmx.de
> wrote:
> > I like the proposal of adding an operator but I
> > dislike the usage of "+".
> > I'd expect this to do a recursive merge on the dict values for duplicate
> > keys (i.e. adding the values), even more so since Counter (being a
> > subclass of dict) already has that behavior.
>
> I think that's actually a counter argument (ha!) -- since there IS a
> special "counter" type, why would anyone expect the regular dict to act
> that way?

The question is, why would someone who has experience with adding counters but 
never felt the need to add dicts, assume that this behavior is specialized in 
`Counter` and not inherited by `dict`. Maybe at some point they'll encounter a 
scenario where they need to recursive-merge (the `Counter` style) two dicts and 
then they might assume that they just need to add the dicts since they're 
familiar with this behavior from `Counter` and `Counter` subclasses `dict` so 
it's reasonable to assume this behavior is inherited.

> Also, that behavior only makes sense for particular dicts -- it really is a
> special case, perfect for a dict subclass (you know, maybe call it
> Counter), but not for generic dict behavor.

Maybe from a language design point of view, but a user might not be aware that 
this behavior is too specialized for generic dict. Besides `Counter`, `pandas` 
is another prominent example that uses the recursive merge strategy for 
mapping-like types (not necessarily in the collections.abc sense but exposing a 
similar interface):

>>> import pandas as pd
>>> s1 = pd.Series([[0, 1], 2])
>>> s2 = pd.Series([[3, 4], 5])
>>> s1 + s2
0    [0, 1, 3, 4]
1               7
dtype: object

Someone who is familiar with these types is probably used to that behavior and 
so it's easy to assume that it originates from dict. And even if they think 
it's too specialized, so `dict` must be doing something else, how obvious is 
the conclusion that dict performs a shallow merge and resolves conflicting keys 
by giving precedence to the r.h.s. operand?

> > I think it would be helpful if the associated
> > operator is not a symmetric
> > symbol but instead is explicit about which operand takes precedence for
> > conflicting keys. The lshift "<<" operator, for example, does have this
> > property. It would be pretty clear what a << b means:
>
> Well, maybe. But I think there are two ways of thinking about "intuitive":
> 1) if someone sees this code, will they be right in knowing what it means?
> (Readability)
> 2) if someone want to do something? Will they think to try this?
> (Discoverability)
> So << might be more intuitive from a readability perspective, but less
> discoverable.
> Note in this discussion (particularly the previous long one) that
> apparently newbies often expect to be able to add dicts.
> That being said, << or | is a lot better than adding yet another operator.

I wasn't arguing particularly for the "<<" operator, I wanted to pointed out 
why, i.m.o., the "+" operator, as a symmetric symbol, isn't an ideal choice by 
comparing it to a non-symmetric operator symbol. I agree that "+" is likely 
more discoverable. Regarding intuition, as you pointed out, it's a two-way 
relationship: operation <--> syntax. So if someone wants to perform an operation
it should be intuitive to think of the syntax (discoverability) and if someone
reads the syntax it should be intuitive to associate it with the operation
(readability, interpretability); I think "+" isn't very good at the latter.

> > take the items of "b" and put them into "a" (or a
> > copy thereof,
> > overwriting what's already there) in order to create the result. The PEP
> > mentions lack of interest in this operator though, as well as:
> > The "cuteness" value of abusing the operator to
> > indicate information
> > flow got old shortly after C++ did it.
> > I think a clear advantage of "<<" over "+" is that it indicates the
> > direction (or precedence) which is important if items are potentially to be
> > overwritten. I'd say "old but gold".
> > In the section about "Dict addition is lossy" you
> > write that "no other form of addition is lossy". This is true for the
> > builtin types (except for floating point accuracy) but as part of the
> > stdlib we have collections.deque which supports "+" and can be lossy if
> > it specifies maxlen. For example:
> > >>> d1 = deque([1, 2], maxlen=3)
> > >>> d2 = deque([3, 4])
> > >>> d1 + d2
> > deque([2, 3, 4], maxlen=3)
> > 
> > I 

[Python-ideas] Re: PEP 584: Add + and += operators to the built-in dict class.

2019-10-19 Thread Dominik Vilsmeier
I like the proposal of adding an operator but I dislike the usage of "+". I'd 
expect this to do a recursive merge on the dict values for duplicate keys (i.e. 
adding the values), even more so since `Counter` (being a subclass of dict) 
already has that behavior. I understand that "+" is meant as a shorthand for 
`update` and this is what `Counter` does but what sticks more to the mind is 
the resulting behavior.
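
For reference, this is the `Counter` behavior being alluded to:

    >>> from collections import Counter
    >>> Counter({'a': 5, 'b': 4}) + Counter({'a': 3, 'b': 1})
    Counter({'a': 8, 'b': 5})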

Furthermore, since this operation is potentially lossy, I think it would be 
helpful if the associated operator is not a symmetric symbol but instead is 
explicit about which operand takes precedence for conflicting keys. The lshift 
"<<" operator, for example, does have this property. It would be pretty clear 
what this means `a << b`: take the items of "b" and put them into "a" (or a 
copy thereof, overwriting what's already there) in order to create the result. 
The PEP mentions lack of interest in this operator though, as well as:

> The "cuteness" value of abusing the operator to indicate information flow got 
> old shortly after C++ did it.

I think a clear advantage of "<<" over "+" is that it indicates the direction 
(or precedence) which is important if items are potentially to be overwritten. 
I'd say "old but gold".

In the section about [Dict addition is 
lossy](https://www.python.org/dev/peps/pep-0584/#dict-addition-is-lossy) you 
write that "no other form of addition is lossy". This is true for the builtin 
types (except for floating point accuracy) but as part of the stdlib we have 
`collections.deque` which supports "+" and can be lossy if it specifies 
`maxlen`. For example:

>>> d1 = deque([1, 2], maxlen=3)
>>> d2 = deque([3, 4])
>>> d1 + d2
deque([2, 3, 4], maxlen=3)

I think this is unfortunate especially since as a double ended queue it 
supports both `extend` and `extendleft`, so it's not clear whether this extends 
d1 by d2 or left-extends d2 by d1 (though the latter would probably be 
ambiguous about the order of items appended). Usage of `d1 << d2` on the other 
hand would be explicit and clear about the direction of data flow.
Although a bit different for dicts, it would as well indicate which of the 
operands takes precedence over the other.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/6A3DW3UI6BQGI7H4YWT62JWZOKETZLKL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Allow kwargs in __{get|set|del|}item__

2019-10-08 Thread Dominik Vilsmeier
Caleb Donovick wrote:
> > It captures a tiny fraction of Pandas style filtering
> > while complicating
> > the syntax of Python
> Sure, maybe we can't represent all filters super concisely, but at least
> inequalities, or any filter on a single axis, would not be hard. E.g.
> db[x=LT(1)] == db[db.x < 1]

Django for example allows filtering with keyword args including inequalities in 
this way: `x__lt=1` (which translates to "x < 1"). Whether that's more readable 
or not is another question but at least with "keyword args" in `[]` every 
package could come up with its own specifications on how this is used.
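
To illustrate that point, here is a toy sketch of how such keyword lookups
could be parsed by any package, applied to plain dict records (all names are
hypothetical, not Django's API):

    import operator

    OPS = {'': operator.eq, 'lt': operator.lt, 'gt': operator.gt}

    def make_predicate(**lookups):
        # Split Django-style lookups like x__lt=1 into (field, op, value).
        tests = [(key.partition('__')[0], OPS[key.partition('__')[2]], value)
                 for key, value in lookups.items()]
        return lambda record: all(op(record[f], v) for f, op, v in tests)

    pred = make_predicate(x__lt=1, y=2)
    assert pred({'x': 0, 'y': 2}) and not pred({'x': 3, 'y': 2})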

> Granted I don’t really see a way to express logical connectives between
> filters in a beautiful way -- beyond doing something like db[filter=OR(x=1,
> y=2)] which really isn't any better than db.filter(OR(x=1, y=2))
> > db['x=1']
>
> Ah yes, 'cause parsing strings is a reasonable replacement for language
> support.  I have no idea why Pandas dropped support for this but I have to
> imagine it's because it's horribly ugly, prone to bugs and difficult to
> metaprogram.

Pandas didn't drop the support for query strings, they can be used via the 
`df.query` method. For example: `df.query('x == 1 and y == 2')`. That's equally 
explicit but creating query strings (dynamically) has of course its downsides. 
Given that `[]` is used for element access, extending it to further use cases 
is indeed appealing.
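
A minimal runnable illustration of the query-string style:

    import pandas as pd

    df = pd.DataFrame({'x': [1, 1, 2], 'y': [2, 3, 2]})
    print(df.query('x == 1 and y == 2'))  # keeps only the first row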

On the other hand one could always argue that a functional interface equally 
does the job, e.g. pandas could provide a function accepting keyword args: 
`df.select(x=1, y=2)`. Or the Django style: `Q(x=1) | Q(y__lt=2)`.


> Semantically meaningful strings are terrible. Every time I
> write a string literal for any reason other than I want a human to read
> that string I die a little inside.  Which is part of the reason I want
> db[x=1] instead of db[{'x':1}].  And yes everything is a string under the
> hood in python but that doesn't make semantic strings less terrible. Really
> under the hood (in assembly) everything is gotos but that doesn't make
> their use better either. /rant
> On Mon, Oct 7, 2019 at 10:07 PM David Mertz me...@gnosis.cx
> wrote:
> > It's really not a worthwhile win.  It captures a tiny
> > fraction of Pandas
> > style filtering while complicating the syntax of Python. Here's another
> > Pandas filter:
> >   db[db.x < 1]
> > 
> > No help there with the next syntax.  Here's another:
> >   db[(db.x == 1) | (db.y == 2)]
> > 
> > A much better idea doesn't require any changes in Python, just a clever
> > class method. Pandas did this for a while, but deprecated it because...
> > reasons. Still, the OP is free to create his version:
> > db['x=1']
> > 
> > Or
> > db['x<1']
> > db['x=1 or y=2']
> > 
> > You can bikeshed the spelling of those predicates, but it doesn't matter,
> > they are just strings that you can see however you decide is best.
> > On Mon, Oct 7, 2019, 8:38 PM Steven D'Aprano st...@pearwood.info wrote:
> > On Tue, Oct 08, 2019 at 09:19:07AM +1100,
> > Cameron Simpson wrote:
> > On 07Oct2019 10:56, Joao S. O. Bueno jsbu...@python.org.br wrote:
> > So, in short, your idea is to allow "=" signs inside [] get notation to be
> > translated to dicts on the call,
> > Subjectively that seems like a tiny tiny win. I'm quite -1 on this idea;
> > language spec bloat to negligible gain.
> > As per Caleb's initial post, this is how Pandas currently does it:
> > db[db['x'] == 1]
> > 
> > Replacing that with db[x=1] seems like a HUGE win to me.
> > Even db[{'x': 1}] is pretty clunky.
> > --
> > Steven
> > 
> > Python-ideas mailing list -- python-ideas@python.org
> > To unsubscribe send an email to python-ideas-le...@python.org
> > https://mail.python.org/mailman3/lists/python-ideas.python.org/
> > Message archived at
> > https://mail.python.org/archives/list/python-ideas@python.org/message/RQH4VJ...
> > Code of Conduct: http://python.org/psf/codeofconduct/
> > 
> > Python-ideas mailing list -- python-ideas@python.org
> > To unsubscribe send an email to python-ideas-le...@python.org
> > https://mail.python.org/mailman3/lists/python-ideas.python.org/
> > Message archived at
> > https://mail.python.org/archives/list/python-ideas@python.org/message/5O7BLO...
> > Code of Conduct: http://python.org/psf/codeofconduct/
> >
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3LXDQE7ICKIE2KIZWJORFBRAL5M3NHLT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Dominik Vilsmeier
It might be interesting to note that NumPy provides various set routines 
operating on its arrays (and hence on lists as well, by conversion): 
https://docs.scipy.org/doc/numpy/reference/routines.set.html

For 
[intersection](https://docs.scipy.org/doc/numpy/reference/generated/numpy.intersect1d.html)
 for example they do the following:
1. Concatenate the arrays,
2. Sort the result,
3. Compare subsequent elements for equality.

Most likely because for each of the steps, there is a C extension that provides 
an efficient implementation.
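A minimal sketch of that approach (simplified; the real `intersect1d` 
additionally offers an `assume_unique` fast path and index-returning options):

import numpy as np

a = np.array([3, 1, 2, 4])
b = np.array([2, 5, 4])
aux = np.sort(np.concatenate((np.unique(a), np.unique(b))))
# After sorting, values present in both arrays sit next to each other:
intersection = aux[:-1][aux[1:] == aux[:-1]]
assert list(intersection) == [2, 4]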

For [membership 
testing](https://github.com/numpy/numpy/blob/d9b1e32cb8ef90d6b4a47853241db2a28146a57d/numpy/lib/arraysetops.py#L560),
i.e. checking which elements of `a` are in `b`, they instead use a condition to 
decide when it's faster to loop over `b`, comparing each element against `a`, 
rather than use the concat+sort approach:

if len(b) < 10 * len(a) ** 0.145

Not sure where they got the exact numbers from (maybe from benchmarks?).
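For reference, that short-`b` strategy amounts to one vectorized comparison per 
element of `b`; roughly (a simplified sketch of the linked code, which also 
handles `invert` and various dtype details):

import numpy as np

def in1d_loop(a, b):
    # one vectorized pass over `a` per element of `b`
    mask = np.zeros(len(a), dtype=bool)
    for item in b:
        mask |= (a == item)
    return mask

a = np.array([1, 2, 3, 4])
assert list(in1d_loop(a, [2, 4])) == [False, True, False, True]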
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/VFBIHAQBZNWO45KQAPUZ52YERO5ODBHP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Inspired by Scala, a new syntax for Union type

2019-08-30 Thread Dominik Vilsmeier
Andrew Barnert wrote:
> On Aug 29, 2019, at 16:03, Dominik Vilsmeier dominik.vilsme...@gmx.de wrote:
> > I never really understood the importance of
> > Optional. Often it can be left out altogether and in other cases I find
> > Union[T, None] more expressive (explicit) than Optional[T] (+
> > the latter saves only 3 chars).
> > Especially for people not familiar with typing, the meaning of Optional is
> > not obvious at first sight. Union[T, None] on the other hand is pretty 
> > clear.
> > Also in other cases, where the default (fallback) is different from None,
> > you'd have to use Union anyway. For example a function that normally returns
> > an object of type T but in some circumstances it cannot and then it returns
> > the reason as a str, i.e. -> Union[T, str];
> > Optional won't help here.
> > But this should be very rare.
> Most functions that can return a fallback value return a fallback value of 
> the expected
> return type. For example, a get(key, default) method will return the default 
> param, and
> the caller should pass in a default value of the type they’re expecting to 
> look up. So,
> this shouldn’t be get(key: KeyType, default: T) -> Union[ValueType, T], it 
> should be
> get(key: KeyType, default: ValueType) -> ValueType. Or maybe get(key: 
> KeyType, default:
> Optional[ValueType]=None) -> Optional[ValueType].
> Most functions that want to explain why they failed do so by raising an 
> exception, not
> by returning a string.
> And what other cases are there?

Well, I actually made this up, so I can't think of any other real cases either 
:-)

> Of course you could be trying to add type checking to some weird legacy 
> codebase that
> doesn’t do things Pythonically, so you have to use Union returns. But that’s 
> specific to
> that one weird codebase.
> Meanwhile, Optional return values are common all over Python.
> Also, Python’s typing system is a lot easier to grasp if you’re familiar with 
> an
> established modern-typed language (Swift, Scala, Haskell, F#, etc.), and they 
> also use
> Optional[T] (or optional or Maybe t or some other spelling of the same 
> idea) all
> over the place—so often that many of them have added shortcuts like T? to make 
> it easier to
> write and less intrusive to read.

I don't have experience in any of these languages (I'm basically self-taught in 
Python), so I learned typing, including `Optional`, mostly from the docs. That 
doesn't necessarily imply understanding the importance of the concept, but I 
acknowledge that `Optional[T]` is much easier to read than `Union[T, None]`: 
the former has less visual overhead and reads more like "natural" language. 
Combine this with the fact that functions return `None` when they don't hit a 
`return` statement (or the convention of explicitly putting `return None` at 
the end), and the meaning of `Optional[T]` becomes clearer.
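For what it's worth, the two spellings are literally the same type at runtime, 
so the question is purely one of readability:

from typing import Optional, Union

assert Optional[int] == Union[int, None]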

> I think there may be a gap in the docs. They make perfect sense to someone 
> with
> experience in one of those languages, but a team that has nobody with that 
> experience
> might be a little lost. There’s a mile-high overview, a theory paper, and 
> then basically
> just reference docs that expect you to already know all the key concepts that 
> you don’t
> already know. Maybe that’s something that an outsider who’s trying to learn 
> from the docs
> plus trial and error could help improve?
> > Scanning through the docs and PEP I can't find
> > strongly motivating examples for Optional (over Union[T, None]).
> > E.g. in the following:
> > def lookup(self, name: str) -> Optional[Node]:
> >     nodes = self.get(name)
> >     if nodes:
> >         return nodes[-1]
> >     return None
> > I would rather write Union[Node, None] because that's much more explicit
> > about what happens.
> > Then introducing ~T in place of Optional[T] just further
> > obfuscates the meaning of the code:
> > def lookup(self, name: str) -> ~Node:
> > The ~ is easy to miss (at least for human readers) and the meaning is not
> > obvious.
> > That’s kind of funny, because I had to read your Union[Node, None] a couple 
> > times
> before I realized you hadn’t written Union[Node, Node]. :)

I had a similar thought when writing this, so I get the point. I'm not arguing 
against `Optional` I just think it's less self-explanatory than `Union[T, 
None]` when you see it for the first time and if you're not familiar with the 
concept in general. But that doesn't mean you shouldn't familiarize yourself 
with it :-)

> I do dislike ~ for other reasons (but I already mentioned them, Guido isn’t 
> convinced,
so… fine, I don’t hate it that much). But I don’t think…

[Python-ideas] Re: Inspired by Scala, a new syntax for Union type

2019-08-29 Thread Dominik Vilsmeier
Guido van Rossum wrote:
> On Thu, Aug 29, 2019 at 4:04 PM Dominik Vilsmeier dominik.vilsme...@gmx.de
> wrote:
> > I never really understood the importance of
> > Optional. Often it can be
> > left out altogether and in other cases I find Union[T, None] more
> > expressive (explicit) than Optional[T] (+ the latter saves only 3
> > chars).
> > I respectfully disagree. In our (huge) codebase we see way more occurrences
> of Optional than of Union. It's not that it saves a tremendous amount of
> typing -- it's a much more intuitive meaning. Every time I see Union[T,
> None] I have to read it carefully to see what it means. When I see
> Optional[T] my brain moves on immediately (in a sense it's only one bit of
> information).

You are probably right, it's all a matter of how used our brains are to seeing 
stuff. So if I started using it more frequently, after some time I would 
probably appreciate it over `Union[T, None]`.

> > Especially for people not familiar with typing, the
> > meaning of Optional
> > is not obvious at first sight. Union[T, None] on the other hand is pretty
> > clear. Also in other cases, where the default (fallback) is different from
> > None, you'd have to use Union anyway. For example a function
> > that
> > normally returns an object of type T but in some circumstances it cannot
> > and then it returns the reason as a str, i.e. -> Union[T,
> > str];
> > Optional won't help here. Scanning through the docs and PEP I can't find
> > strongly motivating examples for Optional (over Union[T, None]).
> > E.g.
> > in the following:
> > def lookup(self, name: str) -> Optional[Node]:
> >     nodes = self.get(name)
> >     if nodes:
> >         return nodes[-1]
> >     return None
> > 
> > I would rather write Union[Node, None] because that's much more explicit
> > about what happens.
> > Then introducing ~T in place of Optional[T] just further
> > obfuscates
> > the meaning of the code:
> > def lookup(self, name: str) -> ~Node:
> > 
> > The ~ is easy to be missed (at least by human readers) and the meaning
> > not obvious.
> > Do you easily miss the - in an expression like -2?

I don't miss the `-` in the context because my brain is trained on recognizing 
such patterns. We encounter negative numbers everywhere, from (pre-)school on, 
so this pattern is easy to recognize. However `~Node` is not something you've 
likely seen in the real world (or anywhere), so it's much harder to recognize. 
I cn wrt ths txt wtht vwls or eevn rdoerer teh lterets and you'll still be able 
to read it because your brain just fills in what it expects (i.e. what it is 
accustomed to). For that reason `~Node` is much harder to recognize than 
`-3637` because I wouldn't expect a `~` to appear in that place.

> Surely the meaning of ? in a programming language also has to be learned.
> And not every language uses it to mean "optional" (IIRC there's a language
> where it means "boolean" -- maybe Scheme?)
> > For Union on the other hand it would be
> > more helpful to have a shorter
> > syntax, int | str seems pretty clear, but what prevents tuples (int,
> > str) from being interpreted as unions by type checkers. This doesn't
> > require any changes to the built-in types and it is aligned with the
> > already existing syntax for checking multiple types with isinstance or
> > issubclass: isinstance(x, (int, str)). Having used this a couple
> > of
> > times, whenever I see a tuple of types I immediately think of them as or
> > options.
> > First, if we were to introduce (int, str) it would make
> more sense for
> it to mean Tuple[int, str] (tuples are also a very common type). Second,
> comma is already very overloaded.
> Yes, it's unfortunate that (int, str) means "union" in
> isinstance()
> but it's not enough to sway me.
> Anyway, let's just paint the bikeshed some color. :-)

I don't think it's unfortunate; it's pretty neat (and intuitive) syntax. When 
checking whether `x` is an instance of `y`, it makes sense to "list" (as a 
`tuple`) multiple options for `y`. It's a clever way of reusing the available syntax / 
functionality of the language. And I think this is what typing should do as 
well: build around the existing language and use whatever is available. Adding 
`__or__` to `type` for allowing things like `int | str` on the other hand bends 
the language toward typing and thus is a step in the opposite direction.

Also, I don't think it's the comma that receives emphasis in the syntax `(int, 
str)`; it's rather the parens, and those, as a bonus, provide visual 
boundaries for the beginning and end of the union. Consider

def foo(x: str | int, y: list)

[Python-ideas] Re: Inspired by Scala, a new syntax for Union type

2019-08-29 Thread Dominik Vilsmeier
I never really understood the importance of `Optional`. Often it can be left 
out altogether and in other cases I find `Union[T, None]` more expressive 
(explicit) than `Optional[T]` (+ the latter saves only 3 chars).

Especially for people not familiar with typing, the meaning of `Optional` is 
not obvious at first sight. `Union[T, None]` on the other hand is pretty clear. 
Also in other cases, where the default (fallback) is different from `None`, 
you'd have to use `Union` anyway. For example a function that normally returns 
an object of type `T` but in some circumstances it cannot and then it returns 
the reason as a `str`, i.e. `-> Union[T, str]`; `Optional` won't help here. 
Scanning through the docs and PEP I can't find strongly motivating examples for 
`Optional` (over `Union[T, None]`). E.g. in the following:

def lookup(self, name: str) -> Optional[Node]:
    nodes = self.get(name)
    if nodes:
        return nodes[-1]
    return None

I would rather write `Union[Node, None]` because that's much more explicit 
about what happens.

Then introducing `~T` in place of `Optional[T]` just further obfuscates the 
meaning of the code:

def lookup(self, name: str) -> ~Node:

The `~` is easy to miss (at least for human readers) and its meaning is not 
obvious.

For `Union` on the other hand it would be more helpful to have a shorter 
syntax. `int | str` seems pretty clear, but then what prevents tuples `(int, 
str)` from being interpreted as unions by type checkers? This doesn't require any 
changes to the built-in types and it is aligned with the already existing 
syntax for checking multiple types with `isinstance` or `issubclass`: 
`isinstance(x, (int, str))`. Having used this a couple of times, whenever I see 
a tuple of types I immediately think of them as `or` options.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2CWLDHFUIWQK4ZMITBDUEY22PF2Y3O5J/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add a `dig` method to dictionaries supporting the retrieval of nested keys

2019-08-29 Thread Dominik Vilsmeier
This sounds to me like functionality of a specific type (more specific than 
`dict`). A `dict` can hold any key-value pairs; it is not necessarily a nested 
structure of dicts. Thus putting a method for such a specific use case on the 
general dict type doesn't feel right.

I think it's better if such functionality is provided by whatever 
infrastructure creates those specific data structures (the nested dicts). For 
example there is this project: [pyhocon](https://github.com/chimpler/pyhocon/), 
a HOCON parser for Python which supports exactly that syntax:

conf['databases.mysql.host']  # conf is a nested dict of depth 3

Also writing a custom function isn't too much work:

def dig(d, *keys, default=None):
    obj = d.get(keys[0], default)
    if len(keys) > 1:
        if isinstance(obj, dict):
            return dig(obj, *keys[1:], default=default)
        return default
    return obj
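A quick check of this sketch on some (made-up) nested data:

conf = {'databases': {'mysql': {'host': 'localhost'}}}
assert dig(conf, 'databases', 'mysql', 'host') == 'localhost'
assert dig(conf, 'databases', 'oracle', 'host', default='n/a') == 'n/a'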
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/Q3Z4STN4VMNMKGMWB37WL3BLHETEKRDI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Inspired by Scala, a new syntax for Union type

2019-08-29 Thread Dominik Vilsmeier
What about using `(int, str)` for indicating a `Union`? This doesn't have 
compatibility issues and it's similar to `isinstance(foo, (int, str))`, so it 
should be fairly intuitive:

def bar(foo: (int, str) = 0):
    ...

Also it's similar to `get_args(Union[int, str])`.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QEL35QYFF5KXSOE27C46NFHM5KDH3VMB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Add "Slice" type to typing module

2019-08-28 Thread Dominik Vilsmeier
Usually slices are created with integers, such as `foo[1:5]`. However, slices 
are not restricted to integers for the `start`, `stop` and `step` parameters 
and thus can be used with any types. The most prominent example is probably 
`pandas` which allows slicing by index (and the index can be `str` for example):

df = pd.DataFrame([[0], [1], [2]], index=list('abc'), columns=['x'])
print(df.loc['a':'b'])  # The first two items, 'a' and 'b' inclusive.

In one of my projects I also employed the slice syntax in a similar way. 
Sometimes more fine-grained control is desired, e.g. allowing only `str` slices 
and not `int` slices. For that purpose it would be useful to explicitly 
indicate the type:

class Foo:
    def __getitem__(self, index: Slice[str]):
        pass

foo = Foo()
foo[1:5]  # type checker complains
foo['foo':'bar']  # no complaints here

The syntax would be somewhat similar to `slice` itself, with `Slice[T]` being 
shorthand for `Slice[T, T]`, i.e. specifying both the `start` and the `stop` 
type.
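Since `Slice` is hypothetical, a runtime approximation of what `Slice[str]` 
would promise statically might look like this (purely illustrative):

class Foo:
    def __getitem__(self, index):  # statically: index: Slice[str]
        if not isinstance(index, slice):
            raise TypeError('expected a slice')
        for part in (index.start, index.stop):
            if part is not None and not isinstance(part, str):
                raise TypeError('slice components must be str')
        return index.start, index.stop

foo = Foo()
assert foo['foo':'bar'] == ('foo', 'bar')
# foo[1:5] would raise TypeError at runtime instead of failing type checking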

Now with [PEP 
585](https://www.python.org/dev/peps/pep-0585/#importing-of-typing) using 
`slice` directly in annotations seems to be the preferred way, so I'm not even 
sure whether it is realistic to ask for such a new type to be added to 
`typing`. Apart from that, what else would it take for this idea to become part 
of the language (if accepted)?

Best regards,
Dominik
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/P4UVHUMVQHH26NXBYT6UGN6K7U4S/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Add command-line option to unittest for enabling post-mortem debugging

2019-08-27 Thread Dominik Vilsmeier
Consider the following example:

import unittest

def foo():
    for x in [1, 2, 'oops', 4]:
        print(x + 100)

class TestFoo(unittest.TestCase):
    def test_foo(self):
        self.assertIs(foo(), None)

if __name__ == '__main__':
    unittest.main()

If we were calling `foo` directly we could enter post-mortem debugging via 
`python -m pdb test.py`.
However, since `foo` is wrapped in a test case, `unittest` eats the exception 
and thus prevents post-mortem debugging. `--failfast` doesn't help either; the 
exception is still swallowed.

Since I am not aware of a solution that enables post-mortem debugging in such a 
case (without modifying the test scripts, please correct me if one exists), I 
propose adding a command-line option to `unittest` for [running test cases in 
debug 
mode](https://docs.python.org/3/library/unittest.html#unittest.TestCase.debug) 
so that post-mortem debugging can be used.
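For comparison, a minimal sketch of the workaround available today, which 
requires editing the test script and is exactly what this proposal wants to 
avoid (replacing the `unittest.main()` call in the example above):

import pdb

try:
    # TestCase.debug() runs the test without collecting the result,
    # so the exception propagates and post-mortem debugging works.
    TestFoo('test_foo').debug()
except Exception:
    pdb.post_mortem()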

P.S.: There is also [this SO 
question](https://stackoverflow.com/q/4398967/3767239) on a similar topic.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NO66SFJ37RB7W65BK46CRCZDIRJ7VCHQ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: adding support for a "raw output" in JSON serializer

2019-08-15 Thread Dominik Vilsmeier
Steven D'Aprano wrote:
> On Tue, Aug 13, 2019 at 06:01:27PM -0400, Eric V. Smith wrote:
> > dataclasses does something similar: it wants to know
> > if something is a 
> > typing.ClassVar, but it doesn't want to import typing to find out.
> > Why not? Is importing typing especially expensive or a bottleneck?
> Wouldn't it be a once-off cost?
> > So it has:
> > typing = sys.modules.get('typing')
> > if typing:
> > if (_is_classvar(a_type, typing)
> > or (isinstance(f.type, str)
> > and _is_type(f.type, cls, typing, typing.ClassVar,
> >  _is_classvar))):
> > 
> > If typing hasn't been imported, it knows that a_type can't be 
> > typing.ClassVar.
> > That's very clever, but alas it's not clever enough because its false.
> Or rather, it's false in general. ClassVar is magic, it can't be used 
> with isinstance() or issubclass(), so I won't argue about the above 
> snippet. I don't understand it well enough to argue its correctness.
> But for regular classes like Decimal, this clever trick doesn't always 
> work. Here's an example:
> py> import decimal
> py> x = decimal.Decimal()
> py> del decimal
> py> del sys.modules['decimal']
> At this point, we have a Decimal instance, but no "decimal" in the 
> module cache. Your test for whether x is a Decimal will wrongly deny x 
> is a Decimal:
> # the wrong way to do it
> py> _decimal = sys.modules.get('decimal')
> py> isinstance(x, _decimal.Decimal) if _decimal else False
> False
> but if we're a bit less clever, we get the truth:
> py> import decimal
> py> isinstance(x, decimal.Decimal)
> True
> You can't rely on the presence of 'decimal' in the module cache as a 
> proxy for whether or not x might be a Decimal instance.

The `isinstance` check doesn't necessarily prevent tampering:

import decimal
d = decimal.Decimal('1.23')

# oops (1)
decimal.Decimal = type('Decimal', tuple(), {})

# oops (2)
import sys
sys.modules['decimal'] = type('decimal', tuple(),
                              dict(Decimal=type('Decimal', tuple(), {})))

# oops (3)
import sys
del sys.modules['decimal']
with open('decimal.py', 'w') as fh:
    fh.write('Decimal = type("Decimal", tuple(), {})')

import json
json.dumps(d)

But I agree that such tampering is weird enough that one need not expect it.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LNYZUJIA2ZZVJN5UL4EUXBSDTK5AZ4ZX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Proposal: Use "for x = value" to assign in scope

2019-08-09 Thread Dominik Vilsmeier
Peter O'Connor wrote:
> Alright hear me out here:
> I've often found that it would be useful for the following type of
> expression to be condensed to a one-liner:
> def running_average(x_seq):
>     averages = []
>     avg = 0
>     for t, x in enumerate(x_seq):
>         avg = avg*t/(t+1) + x/(t+1)
>         averages.append(avg)
>     return averages
> Because really, there's only one line doing the heavy lifting here, the
> rest is kind of boilerplate.
> Then I learned about the beautiful and terrible "for x in [value]":
> def running_average(x_seq):
>     return [avg for avg in [0] for t, x in enumerate(x_seq) for avg in
>             [avg*t/(t+1) + x/(t+1)]]
> Many people find this objectionable because it looks like there are 3 for
> loops, but really there's only one: loops 0 and 2 are actually assignments.

You can solve this via `itertools.accumulate` in a concise and clear way:

import itertools as it
[x/n for n, x in enumerate(it.accumulate(x_seq), 1)]

> My Proposal
> What if we just officially bless this "using for as a temporary assignment"
> arrangement, and allow "for x=value" to mean "assign within the scope of
> this for".  It would be identical to "for x in [value]", just more
> readable.  The running average function would then be:
> def running_average(x_seq):
>     return [avg for avg=0 for t, x in enumerate(x_seq) for avg = avg *
>             t/(t+1) + x / (t+1)]
> -- P.S. 1
> I am aware of Python 3.8's new "walrus" operator, which would make it:
> def running_average(x_seq):
>     avg = 0
>     return [avg := avg*t/(t+1) + x / (t+1) for t, x in enumerate(x_seq)]
> But it seems ugly and bug-prone to be initializing an in-comprehension
> variable OUTSIDE the comprehension.
> -- P.S. 2
> The "for x = value" syntax can achieve things that are not nicely
> achievable using the := walrus.  Consider the following example (wherein we
> carry forward a "hidden" variable h but do not return it):
> y_seq = [y for h=0 for x in x_seq for y, h = update(x, h)]
> There's not really a nice way to do this with the walrus because you can't
> (as far as I understand) combine it with tuple-unpacking.  You'd have to do
> something awkward like:
> yh = None, 0
> y_seq, _ = zip(*(yh := update(x, yh[1]) for x in x_seq))

You can't use `:=` with tuple unpacking but you can use it with tuples 
directly; this requires a definition of the initial tuple (preferably outside 
the loop) but this is i.m.o. a plus since it clearly marks the initial 
conditions for your algorithm:

yh = (None, 0)  # initial values
[(yh := update(x, yh[1]))[0] for x in x_seq]
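To make that concrete, here is a self-contained run with a toy `update` 
function (made up purely so the snippet executes):

def update(x, h):
    h = h + x         # new hidden state
    return 2 * h, h   # (output, new hidden state)

x_seq = [1, 2, 3]
yh = (None, 0)  # initial values
y_seq = [(yh := update(x, yh[1]))[0] for x in x_seq]
assert y_seq == [2, 6, 12]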

If you ever really need to carry a variable over into a comprehension (which 
might be a valid thing, for example in a class body), then you can still resort 
to an additional `for` loop (as you've already indicated); after all it's not 
too bad, and you can even put the different loops on different lines and add a 
comment if necessary:

class Foo:
    a = 1
    # this won't work:
    b = [x*a for x in range(5)]
    # we can carry `a` over to the comprehension as follows:
    b = [x*a for a in [a] for x in range(5)]

Using `for x = ...` instead doesn't make much of a difference, especially since 
most people will see the `for` and immediately assume it's a loop (so in that 
sense it's even more confusing).
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XKMPOMS3QSUURFCI5YDEAPJRGCCQAXBI/
Code of Conduct: http://python.org/psf/codeofconduct/


  1   2   >