[Python-Dev] Re: Azure Pipelines PR: Spurious failure of 3.8 branch

2020-02-03 Thread Chris Withers
This seems the best thread to follow up on, just had a spurious failure 
backporting a patch to 3.8 from master:


https://dev.azure.com/Python/cpython/_build/results?buildId=57386&view=logs&j=c83831cd-3752-5cc7-2f01-8276919eb334&t=5a421c4a-0933-53d5-26b9-04b36ad165eb 



Trying the close/re-open to get CI to re-run...

Chris

On 01/02/2020 23:20, Kyle Stanley wrote:

> I think we're at the point where it's probably okay to disable Azure
> Pipelines as a required check and replace it with the GitHub Actions
> checks.


Sounds good, GitHub Actions CI looks like it's been working smoothly. 
Since you introduced it, I haven't encountered any issues in my own 
PRs or ones that I've reviewed.


> But ignoring that for now, I think it's probably best to re-run CI
> (close/reopen). Just in case the change did actually cause a problem
> that may only show up on that particular configuration. The agent isn't
> actually within our control, so it'll be recreated automatically.

Thanks for the advice! I figured this would be the best option in this 
situation, but since I wasn't sure about how exactly the agents 
worked, it seemed like a good idea to ask first.




On Sat, Feb 1, 2020 at 6:27 AM Steve Dower wrote:


On 01Feb2020 1840, Kyle Stanley wrote:
> In a recent PR (https://github.com/python/cpython/pull/18057), I
> received the following error message in the Azure Pipelines build results:
>
> ##[error]We stopped hearing from agent Azure Pipelines 5. Verify the
> agent machine is running and has a healthy network connection. Anything
> that terminates an agent process, starves it for CPU, or blocks its
> network access can cause this error. For more information, see:
> https://go.microsoft.com/fwlink/?linkid=846610
>
> Build:
> https://dev.azure.com/Python/cpython/_build/results?buildId=57319&view=results
>
> Is there something on our end we can do to bring the agent back online,
> or should I simply wait a while and then try to restart the PR checks?
> Normally I'd avoid doing that, but in this case it's entirely unrelated
> to the PR.

I think we're at the point where it's probably okay to disable Azure
Pipelines as a required check and replace it with the GitHub Actions
checks.

But ignoring that for now, I think it's probably best to re-run CI
(close/reopen). Just in case the change did actually cause a problem
that may only show up on that particular configuration. The agent isn't
actually within our control, so it'll be recreated automatically.

(FWIW, the two failing buildbots on the PR are unsupported for 3.9, but
haven't been disabled yet.)

Cheers,
Steve


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WANEWGWPZKN2KZGRWSC55DEF7RHRMQ4X/
Code of Conduct: http://python.org/psf/codeofconduct/


Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/5WFANO453YSLMSJIYLRHNVUTPRAQ4VC6/


[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Raymond Hettinger
> PyObject_RichCompareBool(x, y, op) has a (valuable!) shortcut: if x and y are 
> the same object, then equality comparison returns True and inequality False. 
> No attempt is made to execute __eq__ or __ne__ methods in those cases.
> 
> This has visible consequences all over the place, but they don't appear to be 
> documented. For example,
> 
> ...
> despite that math.nan == math.nan is False.
> 
> It's usually clear which methods will be called, and when, but not really 
> here. Any _context_ that calls PyObject_RichCompareBool() under the covers, 
> for an equality or inequality test, may or may not invoke __eq__ or __ne__, 
> depending on whether the comparands are the same object. Also any context 
> that inlines these special cases to avoid the overhead of calling 
> PyObject_RichCompareBool() at all.
> 
> If it's intended that Python-the-language requires this, that needs to be 
> documented.

This has been slowly, but perhaps incompletely, documented over the years and 
has become baked into some of the collections ABCs as well.  For example, 
Sequence.__contains__() is defined as:

    def __contains__(self, value):
        for v in self:
            if v is value or v == value:  # note the identity test
                return True
        return False
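
To make the effect of that identity test concrete, here is a minimal sketch. The class name Row is hypothetical and exists only for illustration; it inherits the Sequence.__contains__() mixin quoted above and is probed with a NaN:

```python
import math
from collections.abc import Sequence


class Row(Sequence):
    """Hypothetical read-only sequence; inherits Sequence.__contains__()."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


nan = math.nan
row = Row([1.0, nan])

equal_to_itself = (nan == nan)          # NaN compares unequal to itself
found_by_identity = nan in row          # found anyway, via "v is value"
found_other_nan = float("nan") in row   # a different NaN object is not found
builtin_list_agrees = nan in [1.0, nan]
```

The same NaN object is found, while a freshly created NaN is not, which is the visible consequence being discussed in this thread.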

Various collections need to assume reflexivity, not just for speed, but so that 
we can reason about them and so that they can maintain internal consistency. 
For example, MutableSet defines pop() as:

    def pop(self):
        """Return the popped value.  Raise KeyError if empty."""
        it = iter(self)
        try:
            value = next(it)
        except StopIteration:
            raise KeyError from None
        self.discard(value)
        return value

That pop() logic implicitly assumes an invariant between membership and 
iteration:

   assert(x in collection for x in collection)

We really don't want to pop() a value *x* and then find that *x* is still in 
the container.   This would happen if iter() found the *x*, but discard() 
couldn't find the object because the object can't or won't recognize itself:

 s = {float('NaN')}
 s.pop()
 assert not s  # Do we want the language to guarantee that
               # s is now empty?  I think we must.

The code for clear() depends on pop() working:

    def clear(self):
        """This is slow (creates N new iterators!) but effective."""
        try:
            while True:
                self.pop()
        except KeyError:
            pass

It would be unfortunate if clear() could not guarantee a post-condition that the 
container is empty:

 s = {float('NaN')}
 s.clear()
 assert not s   # Can this be allowed to fail?
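
As a rough illustration of how the mixin pop() and clear() behave for a user-defined container, here is a sketch of a list-backed MutableSet. The class name ListSet is hypothetical. It relies on list.remove() and the list "in" test, both of which apply the identity shortcut in CPython:

```python
from collections.abc import MutableSet


class ListSet(MutableSet):
    """Hypothetical list-backed set; pop() and clear() come from MutableSet."""

    def __init__(self, iterable=()):
        self._items = []
        for value in iterable:
            self.add(value)

    def __contains__(self, value):
        return value in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def add(self, value):
        if value not in self._items:
            self._items.append(value)

    def discard(self, value):
        try:
            self._items.remove(value)  # list.remove() checks identity first
        except ValueError:
            pass


nan = float("nan")
s = ListSet([nan, 1, 2])
popped = s.pop()               # mixin pop(): next(iter(self)), then discard()
nan_still_present = nan in s

s.add(nan)
s.clear()                      # mixin clear(): pop() until KeyError
size_after_clear = len(s)
```

Under these assumptions the NaN is really gone after pop(), and clear() leaves the container empty.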

The case of count() is less clear-cut, but even there identity-implies-equality 
improves our ability to reason about code:  Given some list, *s*, possibly 
already populated, would you want the following code to always work:

 c = s.count(x)
 s.append(x)
 assert s.count(x) == c + 1  # To me, this is fundamental to what
                             # the word "count" means.
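
A small check of that property with a NaN on a built-in list (a sketch of the behaviour as observed in CPython, not a statement of what the language guarantees):

```python
nan = float("nan")
s = [1.0, nan, 2.0]

c = s.count(nan)                    # the same object is counted
s.append(nan)
after = s.count(nan)

other = s.count(float("nan"))       # a distinct NaN object matches nothing
```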

I can't find it now, but I remember a possibly related discussion where we 
collectively rejected a proposal for an __is__() method.  IIRC, the reasoning 
was that our ability to think about code correctly depended on this being true:

a = b
assert a is b

Back to the discussion at hand, I had thought our position was roughly:

* __eq__ can return anything it wants.

* Containers are allowed but not required to assume that 
identity-implies-equality.

* Python's core containers make that assumption so that we can keep
  the containers internally consistent and so that we can reason about
  the results of operations.

Also, I believe that even very early dict code (at least as far back as Py 
1.5.2) had logic for "v is value or v == value".

As far as NaNs go, the only question is how far to propagate their notion of 
irreflexivity. Should "x == x" return False for them? We've decided yes.  When 
it comes to containers, who makes the rules, the containers or their elements?  
Mostly, we let the elements rule, but containers are allowed to make useful 
assumptions about the elements when necessary.  This isn't much different from 
the rules for the "==" operator where __eq__() can return whatever it wants, 
but functions are still allowed to write "if x == y: ..." and assume that a 
meaningful boolean value has been returned (even if it wasn't).  Likewise, the 
rule for "<" is that it can return whatever it wants, but sorted() and min() 
are allowed to assume a meaningful total ordering (which might or might not be 
true).  In other words, containers and functions are allowed, when necessary or 
useful, to override the decisions made by their data.   This seems like a 
reasonable state of affairs.
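
To illustrate the point about "==" with a deliberately odd element type (the class name Vague is hypothetical), here is a sketch in which __eq__() returns a string and the surrounding code simply takes its truth value:

```python
class Vague:
    """Hypothetical class whose __eq__ returns a non-boolean object."""

    def __eq__(self, other):
        return "no"  # a non-empty string, hence truthy


a, b = Vague(), Vague()

result = (a == b)                               # the operator hands back "no"
branch = "taken" if a == b else "skipped"       # "if" only looks at truthiness
in_list = b in [a]                              # the list test does the same
```

The element returned whatever it wanted; the "if" statement and the list membership test both chose to treat it as a boolean.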

The current docs make an effort to describe what we have now: 
https://docs.python.org/3/reference/expressions.html#value-comparisons 

Sorry for the lack of concision.  I'm 

[Python-Dev] Re: Request to postpone some Python 3.9 incompatible changes to Python 3.10

2020-02-03 Thread Raymond Hettinger
> We propose to revert 5 changes:
> 
>   • Removed tostring/fromstring methods in array.array and base64 modules
>   • Removed collections aliases to ABC classes
>   • Removed fractions.gcd() function (which is similar to math.gcd())
>   • Remove "U" mode of open(): having to use io.open() just for Python 2 
> makes the code uglier
>   • Removed old plistlib API: 2.7 doesn't have the new API

+1 from me.  We don't gain anything by removing these in 3.9 instead of 3.10, 
so it is perfectly reasonable to ease the burden on users by deferring them for 
another release.
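
For code that wants to work regardless of when the removals land, the spellings below should already be available. This is only a sketch: which replacement corresponds to each item in the list above is an assumption on my part, and the "U" mode of open() is left out.

```python
import array
import math
import plistlib
from collections.abc import Mapping   # instead of the alias in collections

# array.array: tobytes()/frombytes() instead of tostring()/fromstring()
buf = array.array("B", [1, 2, 3])
raw = buf.tobytes()
copy = array.array("B")
copy.frombytes(raw)

# math.gcd() instead of fractions.gcd()
divisor = math.gcd(12, 18)

is_mapping = isinstance({}, Mapping)

# plistlib: dumps()/loads() rather than the old API
round_trip = plistlib.loads(plistlib.dumps({"answer": 42}))
```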


Raymond
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/52V6RP2WBC43OWTLBICS77MD3IGSV5CI/


[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Raymond Hettinger
I forgot to mention that list.index() also uses PyObject_RichCompareBool().  
Given a non-empty list *s*:

s[0] = x
assert s.index(x) == 0   # We want this to always work

or:
 
s = [x]
assert s.index(x) == 0   # Should not raise a ValueError

If those two assertions aren't reliable, then it's hard to correctly reason 
about algorithms that use index() to find previously stored objects. This, of 
course, is the primary use case for index().

Likewise, list.remove() also uses PyObject_RichCompareBool():

s = []
...
s.append(x)
s.remove(x)

In a code review, would you suspect that the above code could fail?  If so, how 
would you mitigate the risk to prevent failure?  Off-hand, the simplest 
remediation I can think of is:

s = []
...
s.append(x)
if x == x:  # New, perplexing code
    s.remove(x)  # Now, this is guaranteed not to fail
else:
    logging.debug(f"Removing the first occurrence of {x!r} the hard way")
    for i, y in enumerate(s):
        if x is y:
            del s[i]
            break
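
For comparison, the fallback branch above can be written as a standalone helper and run next to list.remove(). The function name remove_by_identity is hypothetical; this is a sketch, not something from the standard library:

```python
def remove_by_identity(items, target):
    """Delete the first element that *is* target; raise ValueError if absent."""
    for i, y in enumerate(items):
        if y is target:
            del items[i]
            return
    raise ValueError("target object not in list")


nan = float("nan")

s = [1, nan, 2]
s.remove(nan)                  # works today thanks to the identity shortcut

t = [1, nan, 2]
remove_by_identity(t, nan)     # what one would need without that shortcut
```

Both lists end up as [1, 2]; the helper only becomes necessary if the shortcut cannot be relied on.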

In summary, I think it is important to guarantee the identity-implies-equality 
step currently in PyObject_RichCompareBool().  It isn't just an optimization, 
it is necessary for writing correct application code without tricks such as the 
"if x == x: ..." test.


Raymond
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/NDBUPT6OWNLPLTD5MI3A3VYNNKLMA3ME/


[Python-Dev] Re: Request to postpone some Python 3.9 incompatible changes to Python 3.10

2020-02-03 Thread Petr Viktorin

On 2020-01-31 19:47, Mike Miller wrote:


On 2020-01-23 07:20, Victor Stinner wrote:
 > Python 3.9 introduces many small incompatible changes which broke tons


There's a well-known and established way of signaling breaking changes 
in software platforms—it is to increment the major version number.


Rather than debating the merits of breaking code on 3.9 or 3.10, 
wouldn't it make more sense to do it in a Python 4.0 instead?  Well, 
either of these strategies sound logical to me:


- Python 4.0 with removal of all of the Python 3-era deprecations
- Continuing Python 3.1X with no breaks

In other words, we should keep compatibility, or not.  In any case, from 
the looks of it these will be tiny breaks compared to the Unicode 
transition.


The Unicode transition also looked very small back when 3.0 was planned.
It only takes one such not-so-little thing to make a big breaking 
release like 3.0. And even if all the changes were little, I wouldn't 
want to inflict 10 years of papercuts at once.


When the changes are rolled out gradually across minor releases, those 
that cause unforeseen trouble in real-world code can be identified in 
the alphas/betas, and rethought/reverted if necessary.



Ethan Furman wrote:
I've gotta say, I like that plan.  Instead of going to x.10, go to x+1.0.  Every ten years we bump the major version and get rid of all the deprecations. 


I don't. I hope the 10-year (and counting) transition from Python 2 to 
Python 3 will not become a tradition.
I'd rather iterate on making removals less drastic (e.g. by making the 
DeprecationWarnings more visible). Iterate with a feedback loop, rather 
than do a one-time change and hope that everything goes well.
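
As one possible way of making DeprecationWarnings more visible on the user side, a test suite can record them or turn them into errors with the warnings module. A minimal sketch, where old_api() is a hypothetical deprecated function:

```python
import warnings


def old_api():
    """Hypothetical function that is scheduled for removal."""
    warnings.warn(
        "old_api() is deprecated; use new_api() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return 42


# Record the warning instead of letting the default filters hide it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", DeprecationWarning)
    value = old_api()

# Or escalate it to an exception, so that a test run fails loudly.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        old_api()
        raised = None
    except DeprecationWarning as exc:
        raised = str(exc)
```

Running the interpreter with -W error::DeprecationWarning has a similar effect for a whole process.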

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/CHMOFONIBAACIW5A5SNHLTV6A5BEQXYT/


[Python-Dev] Re: Request to postpone some Python 3.9 incompatible changes to Python 3.10

2020-02-03 Thread Thomas Wouters
On Mon, Feb 3, 2020 at 10:53 AM Petr Viktorin  wrote:

> On 2020-01-31 19:47, Mike Miller wrote:
> >
> > On 2020-01-23 07:20, Victor Stinner wrote:
> >  > Python 3.9 introduces many small incompatible changes which broke tons
> >
> >
> > There's a well-known and established way of signaling breaking changes
> > in software platforms—it is to increment the major version number.
> >
> > Rather than debating the merits of breaking code on 3.9 or 3.10,
> > wouldn't it make more sense to do it in a Python 4.0 instead?  Well,
> > either of these strategies sound logical to me:
> >
> > - Python 4.0 with removal of all of the Python 3-era deprecations
> > - Continuing Python 3.1X with no breaks
> >
> > In other words, we should keep compatibility, or not.  In any case, from
> > the looks of it these will be tiny breaks compared to the Unicode
> > transition.
>
> The Unicode transition also looked very small back when 3.0 was planned.
> It only takes one such not-so-little thing to make a big breaking
> release like 3.0. And even if all the changes were little, I wouldn't
> want to inflict 10 years of papercuts at once.
>

While I agree with the sentiment that gradual deprecations are more easily
managed, this statement about Python 3.0 is not true. The unicode
transition was never thought to be small, and that's *why* 3.0 was such a
big change. We knew it was going to break everything, so we took the
opportunity to break more things, like the behaviour of indexing
bytestrings. (Bytestrings were even going to be *mutable* for a while.) I
think we can all agree that it was a mistake, and that's certainly
something we don't want to repeat: even if we defer actual removal of
features for a x.0.0 release, it must not become carte blanche for breaking
things.
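
For readers who did not live through it, the bytestring indexing change mentioned above looks like this in Python 3 (a small sketch; bytearray is shown only as the mutable counterpart that exists today):

```python
data = b"abc"

first = data[0]            # indexing bytes yields an int in Python 3
first_slice = data[0:1]    # slicing still yields bytes
text_first = "abc"[0]      # indexing str yields a one-character str

buf = bytearray(data)      # the mutable variant is a separate type
buf[0] = ord("x")
changed = bytes(buf)
```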


>
> When the changes are rolled out gradually across minor releases, those
> that cause unforeseen trouble in real-world code can be identified in
> the alphas/betas, and rethought/reverted if necessary.
>

That's also the case if things are (loudly) deprecated *but not removed*
until a x.0.0 release. The replacement would already be in use for years
before the old way would go away.


>
>
> Ethan Furman wrote:
> > I've gotta say, I like that plan.  Instead of going to x.10, go to
> x+1.0.  Every ten years we bump the major version and get rid of all the
> deprecations.
>
> I don't. I hope the 10-year (and counting) transition from Python 2 to
> Python 3 will not become a tradition.
> I'd rather iterate on making removals less drastic (e.g. by making the
> DeprecationWarnings more visible). Iterate with a feedback loop, rather
> than do a one-time change and hope that everything goes well.


-- 
Thomas Wouters 

Hi! I'm an email virus! Think twice before sending your email to help me
spread!
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/2R5CKAZICLAGNENAEBN6Z4ZNZRLM7YGW/


[Python-Dev] Re: Request to postpone some Python 3.9 incompatible changes to Python 3.10

2020-02-03 Thread Petr Viktorin

On 2020-02-03 12:55, Thomas Wouters wrote:



On Mon, Feb 3, 2020 at 10:53 AM Petr Viktorin wrote:


On 2020-01-31 19:47, Mike Miller wrote:
 >
 > On 2020-01-23 07:20, Victor Stinner wrote:
 >  > Python 3.9 introduces many small incompatible changes which broke tons
 >
 >
 > There's a well-known and established way of signaling breaking changes
 > in software platforms—it is to increment the major version number.
 >
 > Rather than debating the merits of breaking code on 3.9 or 3.10,
 > wouldn't it make more sense to do it in a Python 4.0 instead?  Well,
 > either of these strategies sound logical to me:
 >
 > - Python 4.0 with removal of all of the Python 3-era deprecations
 > - Continuing Python 3.1X with no breaks
 >
 > In other words, we should keep compatibility, or not.  In any case, from
 > the looks of it these will be tiny breaks compared to the Unicode
 > transition.

The Unicode transition also looked very small back when 3.0 was planned.
It only takes one such not-so-little thing to make a big breaking
release like 3.0. And even if all the changes were little, I wouldn't
want to inflict 10 years of papercuts at once.


I agree with the sentiment that gradual deprecations are more easily 
managed, this statement about Python 3.0 is not true. The unicode 
transition was never thought to be small, and that's *why* 3.0 was such 
a big change.


Alright, "very small" is an overstatement. But it did seem much smaller 
than it turned out to be.
https://docs.python.org/3.0/whatsnew/3.0.html lists it as the last of 
the big breaking changes, while most of the porting efforts were spent 
on it.



We knew it was going to break everything, so we took the 
opportunity to break more things, like the behaviour of indexing 
bytestrings. (Bytestrings were even going to be *mutable* for a while.) 
I think we can all agree that it was a mistake, and that's certainly 
something we don't want to repeat: even if we defer actual removal of 
features for a x.0.0 release, it must not become carte blanche for 
breaking things.


When the changes are rolled out gradually across minor releases, those
that cause unforeseen trouble in real-world code can be identified in
the alphas/betas, and rethought/reverted if necessary.


That's also the case if things are (loudly) deprecated *but not removed* 
until a x.0.0 release. The replacement would already be in use for years 
before the old way would go away.


I fear that this is only so if no unexpected issues come up.
If we do a loud deprecation, how can we be sure we did it right?





Ethan Furman wrote:
 > I've gotta say, I like that plan.  Instead of going to x.10, go
 > to x+1.0.  Every ten years we bump the major version and get rid of
 > all the deprecations.

I don't. I hope the 10-year (and counting) transition from Python 2 to
Python 3 will not become a tradition.
I'd rather iterate on making removals less drastic (e.g. by making the
DeprecationWarnings more visible). Iterate with a feedback loop, rather
than do a one-time change and hope that everything goes well.

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/X5RBXEEB5BFPXGGGT2MTG4KZL3OBXQFD/


[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Serhiy Storchaka
We could introduce parallel kinds of collections: 
ValueList/IdentityList, ValueDict/IdentityDict, etc. The former would use 
comparison by value and would not preserve identity (so we could use more 
efficient storage for homogeneous collections, for example a list of 
small ints could spend 1 byte/item). The latter would use comparison by 
identity.


IdentityDict was already discussed before. There is demand for this 
feature, but it is not large if we keep backward compatibility. There is a 
workaround (a dict of id(key) to a tuple of (key, value)), which is not 
compatible with IdentityDict, so the latter can be a replacement in a 
public API.
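
The workaround mentioned above, a dict of id(key) to a tuple of (key, value), can be sketched roughly as follows. The class name IdentityMap and its method set are hypothetical; no such type exists in the standard library:

```python
class IdentityMap:
    """Hypothetical mapping that compares keys by identity only."""

    def __init__(self):
        # id(key) -> (key, value); storing the key keeps it alive, so its
        # id() cannot be reused by another object while the entry exists.
        self._data = {}

    def __setitem__(self, key, value):
        self._data[id(key)] = (key, value)

    def __getitem__(self, key):
        try:
            return self._data[id(key)][1]
        except KeyError:
            raise KeyError(key) from None

    def __delitem__(self, key):
        try:
            del self._data[id(key)]
        except KeyError:
            raise KeyError(key) from None

    def __contains__(self, key):
        return id(key) in self._data

    def __len__(self):
        return len(self._data)

    def keys(self):
        return [key for key, _ in self._data.values()]


a = [1, 2]
b = [1, 2]          # equal to a, but a different (and unhashable) object

d = IdentityMap()
d[a] = "first"
d[b] = "second"
```

Equal but distinct keys get separate entries, and unhashable keys are accepted, since only id() is ever hashed.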


Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/SITZY26MW4LYHB7VN5RVCQZIMYTYHOGM/


[Python-Dev] Re: Request to postpone some Python 3.9 incompatible changes to Python 3.10

2020-02-03 Thread Antoine Pitrou
On Mon, 3 Feb 2020 13:18:46 +0100
Petr Viktorin  wrote:
> > 
> > I agree with the sentiment that gradual deprecations are more easily 
> > managed, this statement about Python 3.0 is not true. The unicode 
> > transition was never thought to be small, and that's *why* 3.0 was such 
> > a big change.  
> 
> Alright, "very small" is an overstatement. But it did seem much smaller 
> than it turned out to be.
> https://docs.python.org/3.0/whatsnew/3.0.html lists it as the last of 
> the big breaking changes, while most of the porting efforts were spent 
> on it.

I don't think the order in this page has any relationship to the
difficulty of each change.  I'm not sure there was any particular
reason to order them in this way, but it seems to me that the easier
to understand changes were put first.

AFAIR, we were all acutely aware that the text model overhaul was going
to be the hardest change for many users (especially those who have
never encountered non-ASCII input). Most other changes in Python 3 have
a predetermined context-free repair, while migrating to the new text
model requires *thinking* about the context in which each Python 2
bytestring is used.  It forces people to think about the problem
they're solving in ways many of them didn't think of it before.
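
A small illustration of why the repair is not context-free: the same bytes are different text depending on which encoding the context implies, and Python 3 refuses to guess. This is a sketch; the sample word is arbitrary:

```python
payload = "caf\u00e9".encode("utf-8")     # b'caf\xc3\xa9'

as_utf8 = payload.decode("utf-8")         # the intended text
as_latin1 = payload.decode("latin-1")     # also "succeeds", but is mojibake

try:
    "price: " + payload                   # str + bytes is an error in Python 3
    mixed_ok = True
except TypeError:
    mixed_ok = False
```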

Regards

Antoine.

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/HFGK6EFTOX7RIWFQE3B2AZ5PZ37SO6SB/


[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Guido van Rossum
+1 on everything Raymond says here (and in his second message).

I don't see a need for more classes or ABCs.

On Mon, Feb 3, 2020 at 00:36 Raymond Hettinger 
wrote:

> > PyObject_RichCompareBool(x, y, op) has a (valuable!) shortcut: if x and
> y are the same object, then equality comparison returns True and inequality
> False. No attempt is made to execute __eq__ or __ne__ methods in those
> cases.
> >
> > This has visible consequences all over the place, but they don't appear
> to be documented. For example,
> >
> > ...
> > despite that math.nan == math.nan is False.
> >
> > It's usually clear which methods will be called, and when, but not
> really here. Any _context_ that calls PyObject_RichCompareBool() under the
> covers, for an equality or inequality test, may or may not invoke __eq__ or
> __ne__, depending on whether the comparands are the same object. Also any
> context that inlines these special cases to avoid the overhead of calling
> PyObject_RichCompareBool() at all.
> >
> > If it's intended that Python-the-language requires this, that needs to
> be documented.
>
> This has been slowly, but perhaps incompletely documented over the years
> and has become baked in the some of the collections ABCs as well.  For
> example, Sequence.__contains__() is defined as:
>
> def __contains__(self, value):
>     for v in self:
>         if v is value or v == value:  # note the identity test
>             return True
>     return False
>
> Various collections need to assume reflexivity, not just for speed, but so
> that we can reason about them and so that they can maintain internal
> consistency. For example, MutableSet defines pop() as:
>
> def pop(self):
>     """Return the popped value.  Raise KeyError if empty."""
>     it = iter(self)
>     try:
>         value = next(it)
>     except StopIteration:
>         raise KeyError from None
>     self.discard(value)
>     return value
>
> That pop() logic implicitly assumes an invariant between membership and
> iteration:
>
>    assert(x in collection for x in collection)
>
> We really don't want to pop() a value *x* and then find that *x* is still
> in the container.   This would happen if iter() found the *x*, but
> discard() couldn't find the object because the object can't or won't
> recognize itself:
>
>  s = {float('NaN')}
>  s.pop()
>  assert not s  # Do we want the language to guarantee
>                # that s is now empty?  I think we must.
>
> The code for clear() depends on pop() working:
>
> def clear(self):
>     """This is slow (creates N new iterators!) but effective."""
>     try:
>         while True:
>             self.pop()
>     except KeyError:
>         pass
>
> It would unfortunate if clear() could not guarantee a post-condition that
> the container is empty:
>
>  s = {float('NaN')}
>  s.clear()
>  assert not s   # Can this be allowed to fail?
>
> The case of count() is less clear-cut, but even there
> identity-implies-equality improves our ability to reason about code:  Given
> some list, *s*, possibly already populated, would you want the following
> code to always work:
>
>  c = s.count(x)
>  s.append(x)
>  assert s.count(x) == c + 1  # To me, this is fundamental to
>                              # what the word "count" means.
>
> I can't find it now, but remember a possibly related discussion where we
> collectively rejected a proposal for an __is__() method.  IIRC, the
> reasoning was that our ability to think about code correctly depended on
> this being true:
>
> a = b
> assert a is b
>
> Back to the discussion at hand, I had thought our position was roughly:
>
> * __eq__ can return anything it wants.
>
> * Containers are allowed but not required to assume that
> identity-implies-equality.
>
> * Python's core containers make that assumption so that we can keep
>   the containers internally consistent and so that we can reason about
>   the results of operations.
>
> Also, I believe that even very early dict code (at least as far back as Py
> 1.5.2) had logic for "v is value or v == value".
>
> As far as NaNs go, the only question is how far to propagate their notion
> of irreflexivity. Should "x == x" return False for them? We've decided
> yes.  When it comes to containers, who makes the rules, the containers or
> their elements.  Mostly, we let the elements rule, but containers are
> allowed to make useful assumptions about the elements when necessary.  This
> isn't much different than the rules for the "==" operator where __eq__()
> can return whatever it wants, but functions are still allowed to write "if
> x == y: ..." and assumes that meaningful boolean value has been returned
> (even if it wasn't).  Likewise, the rule for "<" is that it can return
> whatever it wants, but sorted() and min() are allowed to assume a
> meaningful total ordering (which might or might not be true).  In other
> words, contai

[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Tim Peters
[Tim]
>> PyObject_RichCompareBool(x, y, op) has a (valuable!) shortcut: if x
>> and y are the same object, then equality comparison returns True
>> and inequality False. No attempt is made to execute __eq__ or
>> __ne__ methods in those cases.
>> ...
>> If it's intended that Python-the-language requires this, that needs to
>> be documented.

[Raymond]
> This has been slowly, but perhaps incompletely documented over the
> years and has become baked in the some of the collections ABCs as well.
>  For example, Sequence.__contains__() is defined as:
>
> def __contains__(self, value):
>     for v in self:
>         if v is value or v == value:  # note the identity test
>             return True
>     return False

But it's unclear to me whether that's intended to constrain all
implementations, or is just mimicking CPython's list.__contains__.
That's always a problem with operational definitions.  For example,
does it also constrain all implementations to check in iteration
order?  The order can be visible, e.g, in the number of times v.__eq__
is called.


> Various collections need to assume reflexivity, not just for speed, but so
> that we can reason about them and so that they can maintain internal
> consistency. For example, MutableSet defines pop() as:
>
> def pop(self):
>     """Return the popped value.  Raise KeyError if empty."""
>     it = iter(self)
>     try:
>         value = next(it)
>     except StopIteration:
>         raise KeyError from None
>     self.discard(value)
>     return value

As above, except CPython's own set implementation
doesn't faithfully conform to that:

>>> x = set(range(0, 10, 2))
>>> next(iter(x))
0
>>> x.pop() # returns first in iteration order
0
>>> x.add(1)
>>> next(iter(x))
1
>>> x.pop()  # ditto
1
>>> x.add(1)  # but try it again!
>>> next(iter(x))
1
>>> x.pop() # oops! didn't pop the first in iteration order
2

Not that I care ;-)  Just emphasizing that it's tricky to say no more
(or less) than what's intended.

> That pop() logic implicitly assumes an invariant between membership and 
> iteration:
>
>assert(x in collection for x in collection)

Missing an "all".
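
To spell out why the "all" matters: without it, the assert tests the truth value of a generator object, which is always true, so the check can never fail. A short sketch with made-up data:

```python
items = [1, 2, 3]
elsewhere = [99]

# A generator expression is an object, and objects are truthy by default,
# so this is True even though every individual membership test is False.
unchecked = bool(x in elsewhere for x in items)

# With all(), the membership tests are actually evaluated.
checked = all(x in elsewhere for x in items)

# The invariant as presumably intended:
invariant = all(x in items for x in items)
```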

> We really don't want to pop() a value *x* and then find that *x* is still
> in the container.   This would happen if iter() found the *x*, but discard()
> couldn't find the object because the object can't or won't recognize itself:

Speaking of which, why is "discard()" called instead of "remove()"?
It's sending a mixed message:  discard() is appropriate when you're
_not_ sure the object being removed is present.
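
For reference, the difference between the two methods on a built-in set:

```python
s = {1, 2}

s.discard(3)            # silently does nothing when the element is absent

try:
    s.remove(3)         # raises KeyError when the element is absent
    remove_raised = False
except KeyError:
    remove_raised = True
```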


>  s = {float('NaN')}
>  s.pop()
>  assert not s  # Do we want the language to guarantee that
>   # s is now empty?  I think we must.

I can't imagine an actual container implementation that wouldn't, but
no actual container implements pop() in the odd way MutableSet.pop()
is written.  CPython's set.pop does nothing of the sort - doesn't even
have a pointer equality test (except against C's NULL and `dummy`,
used merely to find "the first (starting at the search finger)" slot
actually in use).

In a world where we decided that the identity shortcut is _not_
guaranteed by the language, the real consequence would be that the
MutableSet.pop() implementation would need to be changed (or made
NotImplemented, or documented as being specific to CPython).

> The code for clear() depends on pop() working:
>
> def clear(self):
>     """This is slow (creates N new iterators!) but effective."""
>     try:
>         while True:
>             self.pop()
>     except KeyError:
>         pass
>
> It would unfortunate if clear() could not guarantee a post-condition that the
> container is empty:

That's again a consequence of how MutableSet.pop was written.  No
actual container has any problem implementing clear() without needing
any kind of object comparison.
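
A sketch of that point: a container whose pop() and clear() never compare elements keeps working even for objects that refuse to equal themselves, and even when its own discard() deliberately omits the identity shortcut. The names NeverEqual and StackSet are hypothetical:

```python
from collections.abc import MutableSet


class NeverEqual:
    """Hypothetical element that refuses to equal anything, itself included."""

    def __eq__(self, other):
        return False


class StackSet(MutableSet):
    """Hypothetical list-backed set with comparison-free pop() and clear()."""

    def __init__(self):
        self._items = []

    def __contains__(self, value):
        return any(v is value or v == value for v in self._items)

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def add(self, value):
        if value not in self:
            self._items.append(value)

    def discard(self, value):
        for i, v in enumerate(self._items):
            if v == value:          # deliberately no identity shortcut here
                del self._items[i]
                return

    def pop(self):
        try:
            return self._items.pop()    # no element comparison involved
        except IndexError:
            raise KeyError("pop from an empty set") from None

    def clear(self):
        self._items.clear()             # likewise, no comparison needed


x = NeverEqual()
s = StackSet()
s.add(x)

s.discard(x)                    # cannot find x without the shortcut
size_after_discard = len(s)

popped = s.pop()                # but pop() still removes it
size_after_pop = len(s)

s.add(x)
s.add(NeverEqual())
s.clear()
size_after_clear = len(s)
```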

>  s = {float('NaN')}
>  s.clear()
>  assert not s   # Can this be allowed to fail?

No, but as above it's a very far stretch to say that clear() emptying
a container _relies_ on the object identity shortcut.  That's just a
consequence of an odd specific clear() implementation, relying in turn
on an odd specific pop() implementation that assumes the shortcut is
in place.
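
As a sketch of that point, a container can guarantee the post-condition by emptying its backing store directly. `ListSet` below is a hypothetical class for illustration only; its lookups deliberately use nothing but `==`, yet `clear()` still leaves it empty because no element comparison is involved.

```python
from collections.abc import MutableSet


class ListSet(MutableSet):
    """Hypothetical list-backed set; lookups deliberately use only ==."""

    def __init__(self, items=()):
        self._items = list(items)

    def __contains__(self, value):
        return any(value == item for item in self._items)

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def add(self, value):
        if value not in self:
            self._items.append(value)

    def discard(self, value):
        self._items = [item for item in self._items if not (item == value)]

    def clear(self):
        # Empty the backing store directly: no __eq__ call, no identity test.
        self._items.clear()


s = ListSet([float("nan")])
s.clear()
is_empty = not s
print(is_empty)
```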


> The case of count() is less clear-cut, but even there
> identity-implies-equality improves our ability to reason about code:

Absolutely!  That "x is x implies equality" is very useful.  But
that's not the question ;-)

>  Given some list, *s*, possibly already populated, would you want the
> following code to always work:
>
>  c = s.count(x)
>  s.append(x)
>  assert s.count(x) == c + 1  # To me, this is fundamental
>                              # to what the word "count" means.

I would, yes.  But it's also possible to define s.
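
For reference, a small sketch of the count() invariant with a NaN, assuming CPython's list behaviour (identity checked before equality for each element). The variable names are illustrative only.

```python
x = float("nan")
s = [1.0, x]

c = s.count(x)                  # found by identity, not by ==
s.append(x)
count_after = s.count(x)        # CPython: identity-or-equality per element

# A count that relies on == alone never matches a NaN.
eq_only_count = sum(1 for item in s if item == x)

print(c, count_after, eq_only_count)
```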

[Python-Dev] Re: Request to postpone some Python 3.9 incompatible changes to Python 3.10

2020-02-03 Thread Mike Miller



On 2020-02-03 01:50, Petr Viktorin wrote:
> When the changes are rolled out gradually across minor releases, those that
> cause unforeseen trouble in real-world code can be identified in the
> alphas/betas, and rethought/reverted if necessary.



To be clear, my suggestion was to maintain gradual deprecations and warnings, 
but have a single removal event at the start of a new major version number.  So 
there will be many years of betas and releases to haggle over.


Also, I believe it is possible to separate the "mechanically fixable" breaks 
from larger changes in fundamentals.  Few folks are willing to entertain or 
stomach the latter at this point in the lifecycle of Python.  If one happens to 
occur, scheduling it for X.4 instead of X+1.0 doesn't help much, and may serve 
to obscure it.


In some sense, distributing the breaks avoids potential/temporary bad press at 
the cost of easy planning.  However, it feels like a very regular, easy to 
understand process should trump that in the long run.


As a refinement to the idea above, perhaps a sub rule could be added:

  - No deprecations should appear in an X.9 release, to give folks time
to prepare.

-Mike
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/A7DWXEXHIBQ24ZOMTJ55NYCWGFOHRPMS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Request to postpone some Python 3.9 incompatible changes to Python 3.10

2020-02-03 Thread Ethan Furman

On 2020-01-23 07:20, Victor Stinner wrote:
> Python 3.9 introduces many small incompatible changes which broke tons


On 2020-01-31 19:47, Mike Miller wrote:
> There's a well-known and established way of signaling breaking changes in
> software platforms -- it is to increment the major version number.
>
> Rather than debating the merits of breaking code on 3.9 or 3.10, wouldn't it
> make more sense to do it in a Python 4.0 instead?  Well, either of these
> strategies sounds logical to me:
>
> - Python 4.0 with removal of all of the Python 3-era deprecations
> - Continuing Python 3.1X with no breaks
>
> In other words, we should keep compatibility, or not.  In any case, from the
> looks of it these will be tiny breaks compared to the Unicode transition.


Ethan Furman wrote:
> I've gotta say, I like that plan.  Instead of going to x.10, go to x+1.0.
> Every ten years we bump the major version and get rid of all the deprecations.


Petr Viktorin wrote:
> I don't. I hope the 10-year (and counting) transition from Python 2 to Python 3
> will not become a tradition.
> I'd rather iterate on making removals less drastic (e.g. by making the
> DeprecationWarnings more visible). Iterate with a feedback loop, rather than do
> a one-time change and hope that everything goes well.


As a user I would much rather know that my 3.2 code worked in every version of 
3.x, and not have to make changes in 3.5 and 3.7 and 3.11.  Talk about death by 
paper cuts!  I'd either be stuck updating already working code to get the 
benefits of the latest Python 3, or having multiple versions of Python 3 on my 
system.  Both options are galling.

Having the latest Python 2, the latest Python 3, and the latest Python 4 is 
much more palatable.

--
~Ethan~
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HMYYAX6NCAT2MT5E32QXBMRJ2JH7JSNU/


[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Sebastian Berg
Now, probably this has been rejected a hundred times before, and there
are some very good reasons why it is a horrible thought...

But if `PyObject_RichCompareBool(..., Py_EQ)` is such a fundamental
operation (and in a sense it seems to me that it is), is there a point
in explicitly defining it?

That would mean adding `operator.equivalent(a, b) -> bool` which would
allow float to override the result and let
`operator.equivalent(float("NaN"), float("NaN"))` return True;
luckily very few types would actually override the operation.

That operator would obviously be allowed to use the shortcut.

At that point container `==` and `in` (and equivalence) are defined
based on element equivalence.
NAs (missing value handling) may be an actual use-case where it is more
than a theoretical thought. However, I do not seriously work with NAs
myself.
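
A rough pure-Python sketch of the operation being discussed, assuming it is simply "identity or equality" with no per-type override. The function name `equivalent` is hypothetical (no such function exists in the `operator` module), and the list comparisons shown reflect CPython's behaviour.

```python
def equivalent(a, b):
    """Hypothetical helper: roughly PyObject_RichCompareBool(a, b, Py_EQ)."""
    return a is b or bool(a == b)


nan = float("nan")
other_nan = float("nan")

same_object = equivalent(nan, nan)            # identity shortcut applies
distinct_objects = equivalent(nan, other_nan) # falls back to ==, which fails for NaN
ints_and_floats = equivalent(1, 1.0)          # plain equality still works

# CPython's list equality compares elements the same way.
list_same = [nan] == [nan]
list_distinct = [nan] == [other_nan]

print(same_object, distinct_objects, ints_and_floats, list_same, list_distinct)
```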

- Sebastian


On Mon, 2020-02-03 at 16:00 -0600, Tim Peters wrote:
> [Tim]
> > > PyObject_RichCompareBool(x, y, op) has a (valuable!) shortcut: if
> > > x
> > > and y are the same object, then equality comparison returns True
> > > and inequality False. No attempt is made to execute __eq__ or
> > > __ne__ methods in those cases.
> > > ...
> > > If it's intended that Python-the-language requires this, that
> > > needs to
> > > be documented.
> 
> [Raymond]
> > This has been slowly, but perhaps incompletely, documented over the
> > years and has become baked into some of the collections ABCs as
> > well.
> >  For example, Sequence.__contains__() is defined as:
> > 
> > def __contains__(self, value):
> >     for v in self:
> >         if v is value or v == value:  # note the identity test
> >             return True
> >     return False
> 
> But it's unclear to me whether that's intended to constrain all
> implementations, or is just mimicking CPython's list.__contains__.
> That's always a problem with operational definitions.  For example,
> does it also constrain all implementations to check in iteration
> order?  The order can be visible, e.g, in the number of times
> v.__eq__
> is called.
> 
> 
> > Various collections need to assume reflexivity, not just for speed,
> > but so that we
> > can reason about them and so that they can maintain internal
> > consistency. For
> > example, MutableSet defines pop() as:
> > 
> > def pop(self):
> >     """Return the popped value.  Raise KeyError if empty."""
> >     it = iter(self)
> >     try:
> >         value = next(it)
> >     except StopIteration:
> >         raise KeyError from None
> >     self.discard(value)
> >     return value
> 
> As above, except CPython's own set implementation
> doesn't faithfully conform to that:
> 
> >>> x = set(range(0, 10, 2))
> >>> next(iter(x))
> 0
> >>> x.pop() # returns first in iteration order
> 0
> >>> x.add(1)
> >>> next(iter(x))
> 1
> >>> x.pop()  # ditto
> 1
> >>> x.add(1)  # but try it again!
> >>> next(iter(x))
> 1
> >>> x.pop() # oops! didn't pop the first in iteration order
> 2
> 
> Not that I care ;-)  Just emphasizing that it's tricky to say no more
> (or less) than what's intended.
> 
> > That pop() logic implicitly assumes an invariant between membership
> > and iteration:
> > 
> >assert(x in collection for x in collection)
> 
> Missing an "all".
> 
> > We really don't want to pop() a value *x* and then find that *x* is
> > still
> > in the container.   This would happen if iter() found the *x*, but
> > discard()
> > couldn't find the object because the object can't or won't
> > recognize itself:
> 
> Speaking of which, why is "discard()" called instead of "remove()"?
> It's sending a mixed message:  discard() is appropriate when you're
> _not_ sure the object being removed is present.
> 
> 
> >  s = {float('NaN')}
> >  s.pop()
> >  assert not s  # Do we want the language to guarantee that
> >                # s is now empty?  I think we must.
> 
> I can't imagine an actual container implementation that wouldn't, but
> no actual container implements pop() in the odd way MutableSet.pop()
> is written.  CPython's set.pop does nothing of the sort - doesn't even
> have a pointer equality test (except against C's NULL and `dummy`,
> used merely to find "the first (starting at the search finger)" slot
> actually in use).
> 
> In a world where we decided that the identity shortcut is _not_
> guaranteed by the language, the real consequence would be that the
> MutableSet.pop() implementation would need to be changed (or made
> NotImplemented, or documented as being specific to CPython).
> 
> > The code for clear() depends on pop() working:
> > 
> > def clear(self):
> >     """This is slow (creates N new iterators!) but effective."""
> >     try:
> >         while True:
> >             self.pop()
> >     except KeyError:
> >         pass
> > 
> > It would unfortunate if clear() could n

[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Larry Hastings

On 2/3/20 3:07 PM, Sebastian Berg wrote:

> That would mean adding `operator.equivalent(a, b) -> bool` which would
> allow float to override the result and let
> `operator.equivalent(float("NaN"), float("NaN"))` return True;
> luckily very few types would actually override the operation.


You misunderstand what's going on here.  Python deliberately makes 
float('NaN') != float('NaN'), and in fact there's special code to ensure 
that behavior.  Why?  Because it's mandated by the IEEE 754 
floating-point standard.


   https://en.wikipedia.org/wiki/NaN#Comparison_with_NaN

This bizarre behavior is often exploited by people exploring the murkier 
corners of Python's behavior.  Changing it is (sadly) not viable.
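
A short illustration of the IEEE 754 behaviour described above, using only the standard library. A NaN compares unequal to everything, including itself, so `math.isnan()` is the reliable way to detect one.

```python
import math

nan = float("nan")

equal_to_itself = nan == nan        # False under IEEE 754
unequal_to_itself = nan != nan      # True
ordered = (nan < 0.0) or (nan > 0.0) or (nan <= nan) or (nan >= nan)
detected = math.isnan(nan)          # the supported way to test for NaN

print(equal_to_itself, unequal_to_itself, ordered, detected)
```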



//arry/

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GOJNWAJSFHBSCCJD2RYWNDRN7RJHYWD3/


[Python-Dev] Re: Request to postpone some Python 3.9 incompatible changes to Python 3.10

2020-02-03 Thread Brett Cannon
Ethan Furman wrote:
> On 2020-01-23 07:20, Victor Stinner wrote:
> > Python 3.9 introduces many small incompatible changes which broke tons
>
> On 2020-01-31 19:47, Mike Miller wrote:
> > There's a well-known and established way of signaling breaking changes in
> > software platforms -- it is to increment the major version number.
> > Rather than debating the merits of breaking code on 3.9 or 3.10, wouldn't
> > it make more sense to do it in a Python 4.0 instead?  Well, either of these
> > strategies sounds logical to me:
> >
> > - Python 4.0 with removal of all of the Python 3-era deprecations
> > - Continuing Python 3.1X with no breaks
> >
> > In other words, we should keep compatibility, or not.  In any case, from
> > the looks of it these will be tiny breaks compared to the Unicode transition.
>
> Ethan Furman wrote:
> > I've gotta say, I like that plan.  Instead of going to x.10, go to x+1.0.
> > Every ten years we bump the major version and get rid of all the
> > deprecations.
>
> Petr Viktorin wrote:
> > I don't. I hope the 10-year (and counting) transition from Python 2 to
> > Python 3 will not become a tradition.
> > I'd rather iterate on making removals less drastic (e.g. by making the
> > DeprecationWarnings more visible). Iterate with a feedback loop, rather
> > than do a one-time change and hope that everything goes well.
>
> As a user I would much rather know that my 3.2 code worked in every version
> of 3.x, and not have to make changes in 3.5 and 3.7 and 3.11.  Talk about
> death by paper cuts!  I'd either be stuck updating already working code to
> get the benefits of the latest Python 3, or having multiple versions of
> Python 3 on my system.  Both options are galling.
> Having the latest Python 2, the latest Python 3, and the latest Python 4 is
> much more palatable.

Until you're being asked to maintain all of that for a decade. We paid a major 
price keeping Python 2 alive for over a decade. Now I'm not saying it wasn't 
the right thing to do considering what we changed, but the stuff we are 
talking about removing doesn't require a massive rewrite on behalf of 
users. And we know from experience that anything left in will get used, no 
matter how loudly we broadcast that it is deprecated (and we know people still 
do not have a habit of running their code with warnings turned on).
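
For readers who want to check their own code, here is a hedged sketch of surfacing DeprecationWarning with the standard `warnings` module. `old_api` is a made-up function standing in for any deprecated call; the same effect is available from the command line with `python -W error::DeprecationWarning`.

```python
import warnings


def old_api():
    """Hypothetical deprecated function, for illustration only."""
    warnings.warn(
        "old_api() is deprecated; use new_api()",
        DeprecationWarning,
        stacklevel=2,
    )
    return 42


# Record every warning instead of relying on the default filters,
# which hide most DeprecationWarnings.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api()

categories = [w.category for w in caught]

# Turn the warning into an exception, as a test suite might.
raised = None
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        old_api()
    except DeprecationWarning as exc:
        raised = str(exc)

print(result, categories, raised)
```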

I think people also forget that, prior to worrying about maintaining 
backwards-compatibility with Python 2, we deprecated for a release and then we 
removed (so an 18 month deprecation period). Python survived, users survived, 
and we all had more time for other things than trying to keep deprecated stuff 
from completely rotting out (look at inspect, and at trying to wedge 
keyword-only arguments into argspec() and friends, for the price paid to keep 
stuff we would otherwise have shifted users off of). And I know some people 
think that if we flat-out say we won't touch deprecated code, that should be 
enough, but even triaging issues for deprecated stuff as "won't fix" is still 
not free.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/352ZQCOMN4YN7HHS5U2UK2B353M3CBCP/


[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Sebastian Berg
On Mon, 2020-02-03 at 16:43 -0800, Larry Hastings wrote:
> On 2/3/20 3:07 PM, Sebastian Berg wrote:
> > That would mean adding `operator.equivalent(a, b) -> bool` which
> > would
> > allow float to override the result and let
> > `operator.equivalent(float("NaN"), float("NaN"))` return True;
> > luckily very few types would actually override the operation.
> 
> You misunderstand what's going on here.  Python deliberately makes
> float('NaN') != float('NaN'), and in fact there's special code to
> ensure that behavior.  Why?  Because it's mandated by the IEEE 754
> floating-point standard.
> 
> > https://en.wikipedia.org/wiki/NaN#Comparison_with_NaN
> > 
> 
> This bizarre behavior is often exploited by people exploring the
> murkier corners of Python's behavior.  Changing it is (sadly) not
> viable.
> 

Of course it is not; I am not saying that it should be changed. What I
mainly meant is that this discussion has always involved
two distinct, slightly different operations:

1. `==` has of course the logic `NaN == NaN -> False`
2. `PyObject_RichCompareBool(a, b, Py_EQ)` was argued to have a useful
   logic of `a is b or a == b`. And I argued that you could define:
   
   def identical(a, b):  # proposed as operator.identical
       res = a is b or a == b
       assert type(res) is bool  # arrays have unclear logic
       return res

   to "bless" it as its own desired logic when dealing with containers
   (mainly).

And making that distinction at the language level would be
a (possibly ugly) resolution of the problem.
Only `identical` is actually always allowed to use the `is` shortcut.
Now, for all practical purposes "identical" is maybe already correctly
defined by `a is b or bool(a == b)` (NaN being the largest
inconsistency, since NaN is not a singleton).
Along that line, I could argue that `PyObject_RichCompareBool` is
actually incorrectly implemented and it should be replaced with a new
`PyObject_Identical` in most places where it is used.

Once you get to the point where you accept the existence of `identical`
as a distinct operation, allowing `identical(NaN, NaN)` to be always
true *can* make sense, and resolves current inconsistencies w.r.t.
containers and NaNs.
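
A sketch of what "allowing `identical(NaN, NaN)` to be always true" could look like. This is purely illustrative: the explicit `math.isnan` branch stands in for a per-type override that does not exist in Python, and the function is not part of the `operator` module.

```python
import math


def identical(a, b):
    """Hypothetical: identity-or-equality, extended so any two float NaNs match."""
    if a is b:
        return True
    # Assumed stand-in for a float-specific override of the operation.
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True
    return bool(a == b)


two_nans = identical(float("nan"), float("nan"))  # distinct objects, still a match
mixed = identical(1, 1.0)                         # ordinary equality is unchanged
different = identical(1, 2)

print(two_nans, mixed, different)
```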

- Sebastian

> 
> /arry
> 


___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CQIATNXMQW3GKZMAKF22GD6TVAO2X5KK/


[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Chris Angelico
On Tue, Feb 4, 2020 at 10:12 AM Sebastian Berg
 wrote:
>
> Now, probably this has been rejected a hundred times before, and there
> are some very good reasons why it is a horrible thought...
>
> But if `PyObject_RichCompareBool(..., Py_EQ)` is such a fundamental
> operation (and in a sense it seems to me that it is), is there a point
> in explicitly defining it?
>
> That would mean adding `operator.equivalent(a, b) -> bool` which would
> allow float to override the result and let
> `operator.equivalent(float("NaN"), float("NaN"))` return True;
> luckily very few types would actually override the operation.
>
> That operator would obviously be allowed to use the shortcut.
>
> At that point container `==` and `in` (and equivalence) is defined
> based on element equivalence.
> NAs (missing value handling) may be an actual use-case where it is more
> than a theoretical thought. However, I do not seriously work with NAs
> myself.

The implication here is that there would be a corresponding dunder
method, yes? If it's possible for a type to override it, that would
need a dunder. I think that's not necessary; but if there were some
useful name that could be given to this "identical or equal"
comparison, then I think it'd be useful to (a) put that function into
the operator table, and (b) use that name in the description of
container operations.

Can the word "equivalent" be used for this, perhaps?

ChrisA
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/6W2WWBXDD5H45DCGV4H2BRUSC3LI7JWD/


[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Steven D'Aprano
On Tue, Feb 04, 2020 at 12:33:44PM +1100, Chris Angelico wrote:

[Sebastian Berg]
> > But if `PyObject_RichCompareBool(..., Py_EQ)` is such a fundamental
> > operation (and in a sense it seems to me that it is), is there a point
> > in explicitly defining it?
> >
> > That would mean adding `operator.equivalent(a, b) -> bool` which would
> > allow float to override the result and let
> > `operator.equivalent(float("NaN"), float("NaN"))` return True;
> > luckily very few types would actually override the operation.

> The implication here is that there would be a corresponding dunder
> method, yes? If it's possible for a type to override it, that would
> need a dunder.

I think the whole point of this is that it *cannot* be overridden. 
That's the gist of Raymond's comments about being able to reason about 
behaviour. Individual values can override the equality test, but they 
cannot override the identity test, and that's a good thing.

Can we summarise this issue like this?

[quote] 
Containers or other compound objects are permitted to use identity 
testing to shortcut what would otherwise be an equality test (e.g. in 
list equality tests, and containment tests), even if that would change 
the behaviour of unusual values, such as floating point NANs which 
compare unequal to themselves, or objects where `__eq__` has side 
effects.

Such containers are permitted to assume that their contents all obey the 
reflexivity of equality (each value is equal to itself) and so avoid 
calling `__eq__` or `__ne__`.

This is an implementation-specific detail which may differ across 
different container types and interpreters.
[end quote]

I don't think we need to make any promises about which specific 
containers use this rule. If you need to know, you can test it for 
yourself:

    if (t := {'a': float('NAN')}) == t:
        print('dict equality obeys reflexivity')

but otherwise, most people shouldn't need to care.
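
The same kind of self-test can be extended to other built-in containers. This is a sketch only; the results are what CPython produces and, as noted above, another implementation would be free to differ.

```python
nan = float("nan")

# Each entry asks whether the container treats an element as equal to itself.
results = {
    "list": [nan] == [nan],
    "tuple": (nan,) == (nan,),
    "dict": {"a": nan} == {"a": nan},
    "set membership": nan in {nan},
}

for name, reflexive in results.items():
    print(f"{name}: {'obeys' if reflexive else 'does not obey'} reflexivity")
```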


[...]
> Can the word "equivalent" be used for this, perhaps?

We don't need and shouldn't have a dunder for this, but the word 
"equivalent" would be wrong in any case. Two objects may be equivalent 
but not equal: for example, when it comes to iteration, the string "abc" 
is equivalent to the list ['a', 'b', 'c'].
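
A two-line illustration of that distinction, with variable names chosen for this example: the string and the list yield the same items when iterated, yet they do not compare equal.

```python
text = "abc"
letters = ["a", "b", "c"]

same_under_iteration = list(text) == letters  # equivalent when iterated
equal = text == letters                       # but not equal as objects

print(same_under_iteration, equal)
```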

I don't think there is any accurate term shorter than "identical or 
equal".


-- 
Steven
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/R3M5UDOLY27QQYIHB4F5H6JPZ2KRZUBL/


[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Chris Angelico
On Tue, Feb 4, 2020 at 1:08 PM Steven D'Aprano  wrote:
>
> On Tue, Feb 04, 2020 at 12:33:44PM +1100, Chris Angelico wrote:
>
> [Sebastian Berg]
> > > But if `PyObject_RichCompareBool(..., Py_EQ)` is such a fundamental
> > > operation (and in a sense it seems to me that it is), is there a point
> > > in explicitly defining it?
> > >
> > > That would mean adding `operator.equivalent(a, b) -> bool` which would
> > > allow float to override the result and let
> > > `operator.equivalent(float("NaN"), float("NaN"))` return True;
> > > luckily very few types would actually override the operation.
>
> > The implication here is that there would be a corresponding dunder
> > method, yes? If it's possible for a type to override it, that would
> > need a dunder.
>
> I think the whole point of this is that it *cannot* be overridden.

Yes, I agree.

> Can we summarise this issue like this?
>
> [quote]
> Containers or other compound objects are permitted to use identity
> testing to shortcut what would otherwise be an equality test (e.g. in
> list equality tests, and containment tests), even if that would change
> the behaviour of unusual values, such as floating point NANs which
> compare unequal to themselves, or objects where `__eq__` has side
> effects.
>
> Such containers are permitted to assume that their contents all obey the
> reflexivity of equality (each value is equal to itself) and so avoid
> calling `__eq__` or `__ne__`.
>
> This is an implementation-specific detail which may differ across
> different container types and interpreters.
> [end quote]

I'd actually rather see it codified as a specific form of comparison
and made a guarantee, upon which other guarantees and invariants can
be based. It's not an optimization (although it can have the effect of
improving performance), it's a codification of the expectations of
containers. As such, this comparison would be defined by language
rules as the way that built-in containers behave, and would also be
the recommended and normal obvious way to build other container types.

> > Can the word "equivalent" be used for this, perhaps?
>
> We don't need and shouldn't have a dunder for this, but the word
> "equivalent" would be wrong in any case. Two objects may be equivalent
> but not equal, for example, when it comes to iteration, the string "abc"
> is equivalent to the list ['a', 'b', 'c'].

Hmm, true, although that's equivalent only in one specific situation.
In mathematics, "congruent" means that two things are functionally
equivalent (eg triangles with the same length sides; in programming
terms we'd probably say that two such triangles would be "equal" but
not identical), even if there's a specific context for such
equivalence, such as stating that 12,345 is congruent to 11 modulo 7,
because the remainders 12345%7 and 11%7 are both 4. So maybe
"congruent" could be used for this concept?

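The modular arithmetic in the paragraph above can be checked directly. This is a plain worked example; the variable names are illustrative.

```python
a, b, modulus = 12345, 11, 7

remainder_a = a % modulus                 # 12345 = 7 * 1763 + 4
remainder_b = b % modulus                 # 11 = 7 * 1 + 4
congruent = (a - b) % modulus == 0        # equivalent definition of congruence

print(remainder_a, remainder_b, congruent)
```
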
ChrisA
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AFWTEJ4LHPRGALIB4GNURQ26VCVZXZRC/


[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Steven D'Aprano
On Mon, Feb 03, 2020 at 05:26:38PM -0800, Sebastian Berg wrote:

> 1. `==` has of course the logic `NaN == NaN -> False`
> 2. `PyObject_RichCompareBool(a, b, Py_EQ)` was argued to have a useful
>    logic of `a is b or a == b`. And I argued that you could define:
>
>    def identical(a, b):  # proposed as operator.identical
>        res = a is b or a == b
>        assert type(res) is bool  # arrays have unclear logic
>        return res
>
>    to "bless" it as its own desired logic when dealing with containers
>    (mainly).

Note that Python arrays define equality similarly to other containers:

py> from array import array
py> array('i', [1, 2, 3]) == array('i', [2, 3, 1])
False

It is numpy arrays which do something unusual with equality. (And I 
would argue that they are wrong to do so. But that ship has long sailed 
over the horizon.)


> Only `identical` is actually always allowed to use the `is` shortcut.

You can't enforce that (and why would you want to?).

If I want to use an `is` shortcut in my `__eq__` methods, or write out 
the condition in full, who are you to say that's forbidden unless I call 
`identical`?
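
For instance, nothing stops a class from taking the identity shortcut inside its own `__eq__`. The `Point` class below is a hypothetical example written for this illustration.

```python
class Point:
    """Hypothetical value type whose __eq__ takes the identity shortcut itself."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if self is other:  # the `is` shortcut, written out by hand
            return True
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


nan_point = Point(float("nan"), 0.0)
equal_to_itself = nan_point == nan_point                              # shortcut applies
equal_to_twin = Point(float("nan"), 0.0) == Point(float("nan"), 0.0)  # distinct NaNs differ
ordinary = Point(1, 2) == Point(1, 2)

print(equal_to_itself, equal_to_twin, ordinary)
```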


> Now, for all practical purposes "identical" is maybe already correctly
> defined by `a is b or bool(a == b)` (NaN being the largest
> inconsistency, since NaN is not a singleton).
> Along that line, I could argue that `PyObject_RichCompareBool` is
> actually incorrectly implemented and it should be replaced with a new
> `PyObject_Identical` in most places where it is used.

In what way is PyObject_RichCompareBool incorrect? Can you point to a 
bug caused by this incorrect implementation?


> Once you get to the point where you accept the existence of `identical`
> as a distinct operation, allowing `identical(NaN, NaN)` to be always
> true *can* make sense

We already have `identical` in the language, it is the `is` operator. 
Your "identical" function is misnamed, it should be 
"identical_or_equal".

If you want to argue that "identical or equal" is such a fundamental and 
important operation in Python code that we ought to offer it ready-made 
in the operator module, I'm listening. But my gut feeling here is to say 
"not every one-line expression needs to be in the stdlib".

PyObject_RichCompareBool is a different story. "Identical or equal" is 
not so simple to implement correctly in C code, and it is a common 
operation used in lists, tuples, dicts and possibly others, so it makes 
sense for there to be a C API for it.


> and resolves current inconsistencies w.r.t. containers and NaNs.

How does it resolve these (alleged) inconsistencies?

The current status quo is that containers perform operations such as 
equality by testing for identity or equality, which they are permitted 
to do and is documented. Changing them to use your "identical or equal" 
API will (as far as I can see) change nothing about the semantics, 
behaviour or even implementation (since the C-level containers like list 
will surely still call PyObject_RichCompareBool rather than a 
Python-level wrapper).

So whatever inconsistencies exist, they will still exist.

If I have missed something, please tell me.



-- 
Steven
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BIW455YJTJLS7XXZ2XT557Y7WBQQITJI/


[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Glenn Linderman

On 2/3/2020 6:21 PM, Chris Angelico wrote:


> Hmm, true, although that's equivalent only in one specific situation.
> In mathematics, "congruent" means that two things are functionally
> equivalent (eg triangles with the same length sides; in programming
> terms we'd probably say that two such triangles would be "equal" but
> not identical), even if there's a specific context for such
> equivalence, such as stating that 12,345 is congruent to 11 modulo 7,
> because the remainders 12345%7 and 11%7 are both 4. So maybe
> "congruent" could be used for this concept?


Congruent is different objects with the same characteristics, whereas 
identical is far stronger: same objects.


But the reason  <=  and   >=  were invented was to avoid saying

`a < b or a == b` and `a > b or a == b`.

It is just a shorthand.

So just invent   is==   as shorthand for   a is b  or  a == b.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2ESFOJXCSMVBBKY5L2FJKLDWYNH4POUJ/


[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?

2020-02-03 Thread Sebastian Berg
On Tue, 2020-02-04 at 13:44 +1100, Steven D'Aprano wrote:
> On Mon, Feb 03, 2020 at 05:26:38PM -0800, Sebastian Berg wrote:
> 


> If you want to argue that "identical or equal" is such a fundamental
> and 
> important operation in Python code that we ought to offer it ready-
> made 
> in the operator module, I'm listening. But my gut feeling here is to
> say 
> "not every one line expression needs to be in the stdlib".
> 

Probably, yes. I am only semi seriously suggesting it. I am happy to
get to the conclusion: NumPy is weird and NaNs are a corner case that
you just have to understand at some point.

Anyway, yes, I hinted at a dunder; I am not sure that is remotely
reasonable. And yes, I thought that if this is an important enough
"concept", it may make sense to bless it with a Python-side function.


> PyObject_RichCompareBool is a different story. "Identical or equal"
> is 
> not so simple to implement correctly in C code, and it is a common


Of course, it is just as simple in C if PyObject_RichCompareBool
simply did not include the identity check, in which case it would be
identical to `bool(a == b)` in Python. (Which of course would be annoying
to have to type out.)

>  
> operation used in lists, tuples, dicts and possibly others, so it
> makes 
> sense for there to be a C API for it.
> 
> 
> > and resolves current inconsistencies w.r.t. containers and NaNs.
> 
> How does it resolve these (alleged) inconsistencies?
> 

The alleged inconsistencies (which may be just me) are along these
lines (plus those with NumPy):

import math
print({math.inf - math.inf for i in range(10)})  # ten distinct NaN objects: ten elements
print({math.nan for i in range(10)})  # the same NaN object each time: one element

Maybe I am alone in perceiving that as an inconsistency. I _was_ saying
that if you had a dunder for this, you could enforce that:

 * `a is b` implies `congruent(a, b)`
 * `a == b` implies `congruent(a, b)`
 * `hash(a) == hash(b)` implies `congruent(a, b)`.

So the "inconsistencies" are that of course `hash(NaN) == hash(NaN)` and
`NaN is NaN` fail to imply `NaN == NaN`, while congruent could be
enforced to do it "right".
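
A rough sketch of what I mean, assuming `congruent` is simply spelled as
"identical or equal" (the name and this definition are only an
illustration, nothing like it exists in Python today):

```python
import math


def congruent(a, b):
    """Hypothetical 'identical or equal' check; not an existing Python API."""
    return a is b or bool(a == b)


# Each subtraction builds a new NaN object, so none of them is identical
# or equal to another and the set keeps all of them.
fresh = {math.inf - math.inf for _ in range(10)}
# math.nan is one object, so the set finds it again by identity.
shared = {math.nan for _ in range(10)}
print(len(fresh), len(shared))

nan = math.nan
print(nan is nan, nan == nan, congruent(nan, nan))

one_int, one_float = 1, 1.0
print(one_int is one_float, one_int == one_float, congruent(one_int, one_float))
```

With this spelling the first two implications hold by construction; the
one about hashes would need more than this function to enforce.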

Chris said it much better anyway, and is probably right to disregard
the dunder part:

1. Name the current operation (congruent?) to reason about it?
2. Bless it with its own function? (helps maybe documenting it)
3. Consider if it's worth resolving the above inconsistencies by making
   it an operator with a dunder.

I am happy to stop at 0 :). I am sure similar discussions about the
hash of NaN come up once a year.

- Sebastian


> The current status quo is that containers perform operations such as 
> equality by testing for identity or equality, which they are
> permitted 
> to do and is documented. Changing them to use your "identical or
> equal" 
> API will (as far as I can see) change nothing about the semantics, 
> behaviour or even implementation (since the C-level containers like
> list 
> will surely still call PyObject_RichCompareBool rather than a 
> Python-level wrapper).
> 
> So whatever inconsistencies exist, they will still exist.
> 
> If I have missed something, please tell me.
> 
> 
> 

