Now, probably this has been rejected a hundred times before, and there
are some very good reasons why it is a horrible thought...

But if `PyObject_RichCompareBool(..., Py_EQ)` is such a fundamental
operation (and in a sense it seems to me that it is), is there a point
in explicitly defining it?

That would mean adding `operator.equivalent(a, b) -> bool` which would
allow float to override the result and let
`operator.equivalent(float("NaN"), float("NaN"))` return True;
luckily very few types would actually override the operation.

That operator would obviously be allowed to use the shortcut.
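
To make the idea concrete, here is a rough pure-Python sketch. The name
`equivalent` is hypothetical (nothing like it exists in the `operator`
module today), and the float special case is only a stand-in for a
per-type override; the rest mirrors what
`PyObject_RichCompareBool(a, b, Py_EQ)` does: identity shortcut first,
then `==`.

```python
import math


def equivalent(a, b):
    """Hypothetical operator.equivalent(a, b) -> bool (not a real API)."""
    # Identity shortcut, as in PyObject_RichCompareBool(a, b, Py_EQ).
    if a is b:
        return True
    # Stand-in for a float override: treat any two NaNs as equivalent.
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True
    # Otherwise fall back to ordinary ==, coerced to a bool.
    return bool(a == b)
```

With this sketch, `equivalent(float("NaN"), float("NaN"))` is True even
for two distinct NaN objects, while `float("NaN") == float("NaN")` stays
False.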

At that point container `==` and `in` (and equivalence) are defined
based on element equivalence.
NAs (missing value handling) may be an actual use-case where it is more
than a theoretical thought. However, I do not seriously work with NAs
myself.
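
For reference, this is how the shortcut already shows up in CPython's
list `==` and `in` today. It is only a small demonstration of current
behaviour, not of the proposal; `nan` and `other` are just illustrative
names.

```python
nan = float("nan")

# The element-level comparison is not reflexive for NaN ...
assert nan != nan

# ... but list comparison and membership use the identity shortcut,
# so the very same object is found and the lists compare equal.
assert [nan] == [nan]
assert nan in [nan]

# A distinct NaN object gets no shortcut and falls back to ==.
other = float("nan")
assert other is not nan
assert [nan] != [other]
assert other not in [nan]
```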

- Sebastian


On Mon, 2020-02-03 at 16:00 -0600, Tim Peters wrote:
> [Tim]
> > > PyObject_RichCompareBool(x, y, op) has a (valuable!) shortcut: if x
> > > and y are the same object, then equality comparison returns True
> > > and inequality False. No attempt is made to execute __eq__ or
> > > __ne__ methods in those cases.
> > > ...
> > > If it's intended that Python-the-language requires this, that needs
> > > to be documented.
> 
> [Raymond]
> > This has been slowly, but perhaps incompletely documented over the
> > years and has become baked into some of the collections ABCs as
> > well.
> >  For example, Sequence.__contains__() is defined as:
> > 
> >     def __contains__(self, value):
> >         for v in self:
> >             if v is value or v == value:    # note the identity test
> >                 return True
> >         return False
> 
> But it's unclear to me whether that's intended to constrain all
> implementations, or is just mimicking CPython's list.__contains__.
> That's always a problem with operational definitions.  For example,
> does it also constrain all implementations to check in iteration
> order?  The order can be visible, e.g., in the number of times
> v.__eq__ is called.
> 
> 
> > Various collections need to assume reflexivity, not just for speed,
> > but so that we can reason about them and so that they can maintain
> > internal consistency. For example, MutableSet defines pop() as:
> > 
> >     def pop(self):
> >         """Return the popped value.  Raise KeyError if empty."""
> >         it = iter(self)
> >         try:
> >             value = next(it)
> >         except StopIteration:
> >             raise KeyError from None
> >         self.discard(value)
> >         return value
> 
> As above, except CPython's own set implementation
> doesn't faithfully conform to that:
> 
> >>> x = set(range(0, 10, 2))
> >>> next(iter(x))
> 0
> >>> x.pop() # returns first in iteration order
> 0
> >>> x.add(1)
> >>> next(iter(x))
> 1
> >>> x.pop()  # ditto
> 1
> >>> x.add(1)  # but try it again!
> >>> next(iter(x))
> 1
> >>> x.pop() # oops! didn't pop the first in iteration order
> 2
> 
> Not that I care ;-)  Just emphasizing that it's tricky to say no more
> (or less) than what's intended.
> 
> > That pop() logic implicitly assumes an invariant between membership
> > and iteration:
> > 
> >        assert(x in collection for x in collection)
> 
> Missing an "all".
> 
> > We really don't want to pop() a value *x* and then find that *x* is
> > still in the container.   This would happen if iter() found the *x*,
> > but discard() couldn't find the object because the object can't or
> > won't recognize itself:
> 
> Speaking of which, why is "discard()" called instead of "remove()"?
> It's sending a mixed message:  discard() is appropriate when you're
> _not_ sure the object being removed is present.
> 
> 
> >      s = {float('NaN')}
> >      s.pop()
> >      assert not s    # Do we want the language to guarantee that
> >                      # s is now empty?  I think we must.
> 
> I can't imagine an actual container implementation that wouldn't, but
> no actual container implements pop() in the odd way MutableSet.pop()
> is written.  CPython's set.pop does nothing of the sort - doesn't even
> have a pointer equality test (except against C's NULL and `dummy`,
> used merely to find "the first (starting at the search finger)" slot
> actually in use).
> 
> In a world where we decided that the identity shortcut is _not_
> guaranteed by the language, the real consequence would be that the
> MutableSet.pop() implementation would need to be changed (or made
> NotImplemented, or documented as being specific to CPython).
> 
> > The code for clear() depends on pop() working:
> > 
> >     def clear(self):
> >         """This is slow (creates N new iterators!) but effective."""
> >         try:
> >             while True:
> >                 self.pop()
> >         except KeyError:
> >             pass
> > 
> > It would be unfortunate if clear() could not guarantee a
> > post-condition that the container is empty:
> 
> That's again a consequence of how MutableSet.pop was written.  No
> actual container has any problem implementing clear() without needing
> any kind of object comparison.
> 
> >      s = {float('NaN')}
> >      s.clear()
> >      assert not s           # Can this be allowed to fail?
> 
> No, but as above it's a very far stretch to say that clear() emptying
> a container _relies_ on the object identity shortcut.  That's just a
> consequence of an odd specific clear() implementation, relying in
> turn on an odd specific pop() implementation that assumes the shortcut
> is in place.
> 
> 
> > The case of count() is less clear-cut, but even there
> > identity-implies-equality improves our ability to reason about code:
> 
> Absolutely!  That "x is x implies equality" is very useful.  But
> that's not the question ;-)
> 
> >  Given some list, *s*, possibly already populated, would you want
> > the following code to always work:
> > 
> >      c = s.count(x)
> >      s.append(x)
> >      assert s.count(x) == c + 1    # To me, this is fundamental
> >                                    # to what the word "count" means.
> 
> I would, yes.  But it's also possible to define s.count(x) as
> 
>     sum(x == y for y in s)
> 
> and live with the consequences of __eq__.
> 
> > ...
> > Back to the discussion at hand, I had thought our position was
> > roughly:
> > 
> > * __eq__ can return anything it wants.
> > 
> > * Containers are allowed but not required to assume that
> >   identity-implies-equality.
> > 
> > * Python's core containers make that assumption so that we can keep
> >   the containers internally consistent and so that we can reason
> >   about the results of operations.
> 
> All reasonable!  Python just needs something now like a benevolent
> dictator ;-)
> 
> > Also, I believe that even very early dict code (at least as far
> > back as Py 1.5.2) had logic for "v is value or v == value".
> 
> Memory fades, but it seems to me that very early Pythons may even
> have exploited the shortcut for `==` too.
> 
> > ...
> > The current docs make an effort to describe what we have now: 
> > https://docs.python.org/3/reference/expressions.html#value-comparisons
> 
> Yes, that's been pointed out, and it's at worst "a good start".  The
> people on the original PR that kicked this off weren't aware that
> it existed.  Terry Reedy said he's thinking about how to (at least)
> make it more discoverable, although at that time Guido appeared to be
> leaning "implementation defined" instead.
> 
> [in another msg]
> >  Forgot to mention that list.index() also uses
> > PyObject_RichCompareBool()
> 
> A quick scan found about 100 calls to PyObject_RichCompareBool
> passing Py_EQ.  So it screams for a way to spell out what's required
> that doesn't degenerate into an exhaustive list of specific
> functions/methods/contexts.
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/44XXRXK2MVDY7GKWTURZK7XFCHIR6JRX/
> Code of Conduct: http://python.org/psf/codeofconduct/
> 

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/S5SZS5YGVXF5UOYSLNC6JR6YY5D3BSTQ/
Code of Conduct: http://python.org/psf/codeofconduct/
