Now, probably this has been rejected a hundred times before, and there are some very good reason why it is a horrible thought...
But if `PyObject_RichCompareBool(..., Py_EQ)` is such a fundamental operation (and in a sense it seems to me that it is), is there a point in explicitly defining it? That would mean adding `operator.equivalent(a, b) -> bool` which would allow float to override the result and let `operator.equivalent_value(float("NaN"), float("NaN))` return True; luckily very few types would actually override the operation. That operator would obviously be allowed to use the shortcut. At that point container `==` and `in` (and equivalence) is defined based on element equivalence. NAs (missing value handling) may be an actual use-case where it is more than a theoretical thought. However, I do not seriously work with NAs myself. - Sebastian On Mon, 2020-02-03 at 16:00 -0600, Tim Peters wrote: > [Tim] > > > PyObject_RichCompareBool(x, y, op) has a (valuable!) shortcut: if > > > x > > > and y are the same object, then equality comparison returns True > > > and inequality False. No attempt is made to execute __eq__ or > > > __ne__ methods in those cases. > > > ... > > > If it's intended that Python-the-language requires this, that > > > needs to > > > be documented. > > [Raymond] > > This has been slowly, but perhaps incompletely documented over the > > years and has become baked in the some of the collections ABCs as > > well. > > For example, Sequence.__contains__() is defined as: > > > > def __contains__(self, value): > > for v in self: > > if v is value or v == value: # note the > > identity test > > return True > > return False > > But it's unclear to me whether that's intended to constrain all > implementations, or is just mimicking CPython's list.__contains__. > That's always a problem with operational definitions. For example, > does it also constrain all implementations to check in iteration > order? The order can be visible, e.g, in the number of times > v.__eq__ > is called. > > > > Various collections need to assume reflexivity, not just for speed, > > but so that we > > can reason about them and so that they can maintain internal > > consistency. For > > example, MutableSet defines pop() as: > > > > def pop(self): > > """Return the popped value. Raise KeyError if empty.""" > > it = iter(self) > > try: > > value = next(it) > > except StopIteration: > > raise KeyError from None > > self.discard(value) > > return value > > As above, except CPyhon's own set implementation implementation > doesn't faithfully conform to that: > > > > > x = set(range(0, 10, 2)) > > > > next(iter(x)) > 0 > > > > x.pop() # returns first in iteration order > 0 > > > > x.add(1) > > > > next(iter(x)) > 1 > > > > x.pop() # ditto > 1 > > > > x.add(1) # but try it again! > > > > next(iter(x)) > 1 > > > > x.pop() # oops! didn't pop the first in iteration order > 2 > > Not that I care ;-) Just emphasizing that it's tricky to say no more > (or less) than what's intended. > > > That pop() logic implicitly assumes an invariant between membership > > and iteration: > > > > assert(x in collection for x in collection) > > Missing an "all". > > > We really don't want to pop() a value *x* and then find that *x* is > > still > > in the container. This would happen if iter() found the *x*, but > > discard() > > couldn't find the object because the object can't or won't > > recognize itself: > > Speaking of which, why is "discard()" called instead of "remove()"? > It's sending a mixed message: discard() is appropriate when you're > _not_ sure the object being removed is present. > > > > s = {float('NaN')} > > s.pop() > > assert not s # Do we want the language to > > guarantee that > > # s is now empty? I > > think we must. > > I can't imagine an actual container implementation that wouldn't. but > no actual container implements pop() in the odd way MutableSet.pop() > is written. CPython's set.pop does nothing of the sort - doesn't > even > have a pointer equality test (except against C's NULL and `dummy`, > used merely to find "the first (starting at the search finger)" slot > actually in use). > > In a world where we decided that the identity shortcut is _not_ > guaranteed by the language, the real consequence would be that the > MutableSet.pop() implementation would need to be changed (or made > NotImplemented, or documented as being specific to CPython). > > > The code for clear() depends on pop() working: > > > > def clear(self): > > """This is slow (creates N new iterators!) but > > effective.""" > > try: > > while True: > > self.pop() > > except KeyError: > > pass > > > > It would unfortunate if clear() could not guarantee a post- > > condition that the > > container is empty: > > That's again a consequence of how MutableSet.pop was written. No > actual container has any problem implementing clear() without needing > any kind of object comparison. > > > s = {float('NaN')} > > s.clear() > > assert not s # Can this be allowed to fail? > > No, but as above it's a very far stretch to say that clear() emptying > a container _relies_ on the object identity shortcut. That's a just > a > consequence of an odd specific clear() implementation, relying in > turn > on an odd specific pop() implementation that assumes the shortcut is > in place. > > > > The case of count() is less clear-cut, but even there identity- > > implies-equality > > improves our ability to reason about code: > > Absolutely! That "x is x implies equality" is very useful. But > that's not the question ;-) > > > Given some list, *s*, possibly already populated, would you want > > the > > following code to always work: > > > > c = s.count(x) > > s.append(x) > > assert s.count(x) == c + 1 # To me, this is > > fundamental > > to what > > the word "count" means. > > I would, yes. But it's also possible to define s.count(x) as > > sum(x == y for y in s) > > and live with the consequences of __eq__. > > > ... > > Back to the discussion at hand, I had thought our position was > > roughly: > > > > * __eq__ can return anything it wants. > > > > * Containers are allowed but not required to assume that identity- > > implies-equality. > > > > * Python's core containers make that assumption so that we can keep > > the containers internally consistent and so that we can reason > > about > > the results of operations. > > All reasonable! Python just needs something now like a benevolent > dictator ;-) > > > Also, I believe that even very early dict code (at least as far > > back > > as Py 1.5.2) had logic for "v is value or v == value". > > Memory fades, but it seems to me that very early Pythons may even > have > exploited the shortcut for `==` too. > > > ... > > The current docs make an effort to describe what we have now: > > https://docs.python.org/3/reference/expressions.html#value-comparisons > > Yes, that's been pointed out, and it's at worst "a good start". The > people on the original PR that kicked this off weren't aware of that > it existed. Terry Reedy said he's thinking about how to (at least) > make it more discoverable, although at that time Guido appeared to be > leaning "implementation defined" instead. > > [in another msg] > > forget to mention that list.index() also uses > > PyObject_RichCompareBool() > > A quick scan found about 100 calls to PyObject_RichCompareBool > passing > Py_EQ. So it screams for a way to spell out what's required that > doesn't degenerate into an exhaustive list of specific > functions/methods/contexts. > _______________________________________________ > Python-Dev mailing list -- python-dev@python.org > To unsubscribe send an email to python-dev-le...@python.org > https://mail.python.org/mailman3/lists/python-dev.python.org/ > Message archived at > https://mail.python.org/archives/list/python-dev@python.org/message/44XXRXK2MVDY7GKWTURZK7XFCHIR6JRX/ > Code of Conduct: http://python.org/psf/codeofconduct/ >
signature.asc
Description: This is a digitally signed message part
_______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/S5SZS5YGVXF5UOYSLNC6JR6YY5D3BSTQ/ Code of Conduct: http://python.org/psf/codeofconduct/