Couldn't you add __hash__ to collections.abc.Iterable ? Essentially, expose __hash__ there; then all iterables automatically have a default hash that hashes their ordered contents.
On Wednesday, January 4, 2017 at 7:37:26 PM UTC-5, Steven D'Aprano wrote: > > On Wed, Jan 04, 2017 at 04:38:05PM -0500, j...@math.brown.edu > <javascript:> wrote: > > Instead of the proposals like "hash.from_iterable()", would it make > sense > > to allow tuple.__hash__() to accept any iterable, when called as a > > classmethod? > > The public API for calculating the hash of something is to call the > hash() builtin function on some object, e.g. to call tuple.__hash__ you > write hash((a, b, c)). The __hash__ dunder method is implementation, not > interface, and normally shouldn't be called directly. > > Unless I'm missing something obvious, your proposal would require the > caller to call the dunder methods directly: > > class X: > def __hash__(self): > return tuple.__hash__(iter(self)) > > I consider that a poor interface design. > > But even if we decide to make an exception in this case, tuple.__hash__ > is currently an ordinary instance method right now. There's probably > code that relies on that fact and expects that: > > tuple.__hash__((a, b, c)) > > is currently the same as > > (a, b, c).__hash__() > > > (Starting with the hash() builtin itself, I expect, although that is > easy enough to fix if needed.) Your proposal will break backwards > compatibility, as it requires a change in semantics: > > (1) (a, b, c).__hash__() must keep the current behaviour, which > means behaving like a bound instance method; > > (2) But tuple.__hash__ will no longer return an unbound method (actually > a function object, but the difference is unimportant) and instead will > return something that behaves like a bound class method. > > Here's an implementation which does this: > > http://code.activestate.com/recipes/577030-dualmethod-descriptor/ > > so such a thing is possible. But it breaks backwards-compatability and > introduces something which I consider to be an unclean API (calling a > dunder method directly). Unless there's a *really* strong advantage to > > tuple.__hash__(...) > > over > > hash.from_iterable(...) > > (or equivalent), I would be against this change. > > > > > (And similarly with frozenset.__hash__(), so that the fast C > > implementation of that algorithm could be used, rather than the slow > > collections.Set._hash() implementation. Then the duplicated > implementation > > in _collections_abc.py's Set._hash() could be removed completely, > > delegating to frozenset.__hash__() instead.) > > This is a good point. Until now, I've been assuming that > hash.from_iterable should consider order. But frozenset shows us that > sometimes the hash should *not* consider order. > > This hints that perhaps the hash.from_iterable() should have its own > optional dunder method. Or maybe we need two functions: an ordered > version and an unordered version. > > Hmmm... just tossing out a wild idea here... let's get rid of the dunder > method part of your suggestion, and add new public class methods to > tuple and frozenset: > > tuple.hash_from_iter(iterable) > frozenset.hash_from_iter(iterable) > > > That gets rid of all the objections about backwards compatibility, since > these are new methods. They're not dunder names, so there are no > objections to being used as part of the public API. > > A possible objection is the question, is this functionality *actually* > important enough to bother? > > Another possible objection: are these methods part of the sequence/set > API? If not, do they really belong on the tuple/frozenset? Maybe they > belong elsewhere? > > > > > Would this API more cleanly communicate the algorithm being used and the > > implementation, > > No. If you want to communicate the algorithm being used, write some > documentation. > > Seriously, the public API doesn't communicate the algorithm used for the > implementation. How can it? We can keep the same interface and change > the implementation, or change the interface and keep the implementation. > The two are (mostly) independent. > > > > > while making a smaller increase in API surface area > > compared to introducing a new function? > > It's difficult to quantify "API surface area". On the one hand, we have > the addition of one or two new functions or methods. Contrast with: > > * introducing a new kind of method into the built-ins (one which > behaves like a classmethod when called from the class, and like > an instance method when called from an instance); > > * changing tuple.__hash__ from an ordinary method to one of the > above special methods; > > * and likewise for frozenset.__hash__; > > * change __hash__ from "only used as implementation, not as > interface" to "sometimes used as interface". > > > To me, adding one or two new methods/functions is the smaller, or at > least less disruptive, change. > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > python...@python.org <javascript:> > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ >
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/