Now, probably this has been rejected a hundred times before, and there are some very good reason why it is a horrible thought...
But if `PyObject_RichCompareBool(..., Py_EQ)` is such a fundamental
operation (and in a sense it seems to me that it is), is there a point
in explicitly defining it?
That would mean adding `operator.equivalent(a, b) -> bool` which would
allow float to override the result and let
`operator.equivalent_value(float("NaN"), float("NaN))` return True;
luckily very few types would actually override the operation.
That operator would obviously be allowed to use the shortcut.
At that point container `==` and `in` (and equivalence) is defined
based on element equivalence.
NAs (missing value handling) may be an actual use-case where it is more
than a theoretical thought. However, I do not seriously work with NAs
myself.
- Sebastian
On Mon, 2020-02-03 at 16:00 -0600, Tim Peters wrote:
> [Tim]
> > > PyObject_RichCompareBool(x, y, op) has a (valuable!) shortcut: if
> > > x
> > > and y are the same object, then equality comparison returns True
> > > and inequality False. No attempt is made to execute __eq__ or
> > > __ne__ methods in those cases.
> > > ...
> > > If it's intended that Python-the-language requires this, that
> > > needs to
> > > be documented.
>
> [Raymond]
> > This has been slowly, but perhaps incompletely documented over the
> > years and has become baked in the some of the collections ABCs as
> > well.
> > For example, Sequence.__contains__() is defined as:
> >
> > def __contains__(self, value):
> > for v in self:
> > if v is value or v == value: # note the
> > identity test
> > return True
> > return False
>
> But it's unclear to me whether that's intended to constrain all
> implementations, or is just mimicking CPython's list.__contains__.
> That's always a problem with operational definitions. For example,
> does it also constrain all implementations to check in iteration
> order? The order can be visible, e.g, in the number of times
> v.__eq__
> is called.
>
>
> > Various collections need to assume reflexivity, not just for speed,
> > but so that we
> > can reason about them and so that they can maintain internal
> > consistency. For
> > example, MutableSet defines pop() as:
> >
> > def pop(self):
> > """Return the popped value. Raise KeyError if empty."""
> > it = iter(self)
> > try:
> > value = next(it)
> > except StopIteration:
> > raise KeyError from None
> > self.discard(value)
> > return value
>
> As above, except CPyhon's own set implementation implementation
> doesn't faithfully conform to that:
>
> > > > x = set(range(0, 10, 2))
> > > > next(iter(x))
> 0
> > > > x.pop() # returns first in iteration order
> 0
> > > > x.add(1)
> > > > next(iter(x))
> 1
> > > > x.pop() # ditto
> 1
> > > > x.add(1) # but try it again!
> > > > next(iter(x))
> 1
> > > > x.pop() # oops! didn't pop the first in iteration order
> 2
>
> Not that I care ;-) Just emphasizing that it's tricky to say no more
> (or less) than what's intended.
>
> > That pop() logic implicitly assumes an invariant between membership
> > and iteration:
> >
> > assert(x in collection for x in collection)
>
> Missing an "all".
>
> > We really don't want to pop() a value *x* and then find that *x* is
> > still
> > in the container. This would happen if iter() found the *x*, but
> > discard()
> > couldn't find the object because the object can't or won't
> > recognize itself:
>
> Speaking of which, why is "discard()" called instead of "remove()"?
> It's sending a mixed message: discard() is appropriate when you're
> _not_ sure the object being removed is present.
>
>
> > s = {float('NaN')}
> > s.pop()
> > assert not s # Do we want the language to
> > guarantee that
> > # s is now empty? I
> > think we must.
>
> I can't imagine an actual container implementation that wouldn't. but
> no actual container implements pop() in the odd way MutableSet.pop()
> is written. CPython's set.pop does nothing of the sort - doesn't
> even
> have a pointer equality test (except against C's NULL and `dummy`,
> used merely to find "the first (starting at the search finger)" slot
> actually in use).
>
> In a world where we decided that the identity shortcut is _not_
> guaranteed by the language, the real consequence would be that the
> MutableSet.pop() implementation would need to be changed (or made
> NotImplemented, or documented as being specific to CPython).
>
> > The code for clear() depends on pop() working:
> >
> > def clear(self):
> > """This is slow (creates N new iterators!) but
> > effective."""
> > try:
> > while True:
> > self.pop()
> > except KeyError:
> > pass
> >
> > It would unfortunate if clear() could not guarantee a post-
> > condition that the
> > container is empty:
>
> That's again a consequence of how MutableSet.pop was written. No
> actual container has any problem implementing clear() without needing
> any kind of object comparison.
>
> > s = {float('NaN')}
> > s.clear()
> > assert not s # Can this be allowed to fail?
>
> No, but as above it's a very far stretch to say that clear() emptying
> a container _relies_ on the object identity shortcut. That's a just
> a
> consequence of an odd specific clear() implementation, relying in
> turn
> on an odd specific pop() implementation that assumes the shortcut is
> in place.
>
>
> > The case of count() is less clear-cut, but even there identity-
> > implies-equality
> > improves our ability to reason about code:
>
> Absolutely! That "x is x implies equality" is very useful. But
> that's not the question ;-)
>
> > Given some list, *s*, possibly already populated, would you want
> > the
> > following code to always work:
> >
> > c = s.count(x)
> > s.append(x)
> > assert s.count(x) == c + 1 # To me, this is
> > fundamental
> > to what
> > the word "count" means.
>
> I would, yes. But it's also possible to define s.count(x) as
>
> sum(x == y for y in s)
>
> and live with the consequences of __eq__.
>
> > ...
> > Back to the discussion at hand, I had thought our position was
> > roughly:
> >
> > * __eq__ can return anything it wants.
> >
> > * Containers are allowed but not required to assume that identity-
> > implies-equality.
> >
> > * Python's core containers make that assumption so that we can keep
> > the containers internally consistent and so that we can reason
> > about
> > the results of operations.
>
> All reasonable! Python just needs something now like a benevolent
> dictator ;-)
>
> > Also, I believe that even very early dict code (at least as far
> > back
> > as Py 1.5.2) had logic for "v is value or v == value".
>
> Memory fades, but it seems to me that very early Pythons may even
> have
> exploited the shortcut for `==` too.
>
> > ...
> > The current docs make an effort to describe what we have now:
> > https://docs.python.org/3/reference/expressions.html#value-comparisons
>
> Yes, that's been pointed out, and it's at worst "a good start". The
> people on the original PR that kicked this off weren't aware of that
> it existed. Terry Reedy said he's thinking about how to (at least)
> make it more discoverable, although at that time Guido appeared to be
> leaning "implementation defined" instead.
>
> [in another msg]
> > forget to mention that list.index() also uses
> > PyObject_RichCompareBool()
>
> A quick scan found about 100 calls to PyObject_RichCompareBool
> passing
> Py_EQ. So it screams for a way to spell out what's required that
> doesn't degenerate into an exhaustive list of specific
> functions/methods/contexts.
> _______________________________________________
> Python-Dev mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/[email protected]/message/44XXRXK2MVDY7GKWTURZK7XFCHIR6JRX/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
signature.asc
Description: This is a digitally signed message part
_______________________________________________ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/S5SZS5YGVXF5UOYSLNC6JR6YY5D3BSTQ/ Code of Conduct: http://python.org/psf/codeofconduct/
