On Wed, Jan 04, 2017 at 04:38:05PM -0500, j...@math.brown.edu wrote: > Instead of the proposals like "hash.from_iterable()", would it make sense > to allow tuple.__hash__() to accept any iterable, when called as a > classmethod?
The public API for calculating the hash of something is to call the hash() builtin function on some object, e.g. to call tuple.__hash__ you write hash((a, b, c)). The __hash__ dunder method is implementation, not interface, and normally shouldn't be called directly. Unless I'm missing something obvious, your proposal would require the caller to call the dunder methods directly: class X: def __hash__(self): return tuple.__hash__(iter(self)) I consider that a poor interface design. But even if we decide to make an exception in this case, tuple.__hash__ is currently an ordinary instance method right now. There's probably code that relies on that fact and expects that: tuple.__hash__((a, b, c)) is currently the same as (a, b, c).__hash__() (Starting with the hash() builtin itself, I expect, although that is easy enough to fix if needed.) Your proposal will break backwards compatibility, as it requires a change in semantics: (1) (a, b, c).__hash__() must keep the current behaviour, which means behaving like a bound instance method; (2) But tuple.__hash__ will no longer return an unbound method (actually a function object, but the difference is unimportant) and instead will return something that behaves like a bound class method. Here's an implementation which does this: http://code.activestate.com/recipes/577030-dualmethod-descriptor/ so such a thing is possible. But it breaks backwards-compatability and introduces something which I consider to be an unclean API (calling a dunder method directly). Unless there's a *really* strong advantage to tuple.__hash__(...) over hash.from_iterable(...) (or equivalent), I would be against this change. > (And similarly with frozenset.__hash__(), so that the fast C > implementation of that algorithm could be used, rather than the slow > collections.Set._hash() implementation. Then the duplicated implementation > in _collections_abc.py's Set._hash() could be removed completely, > delegating to frozenset.__hash__() instead.) This is a good point. Until now, I've been assuming that hash.from_iterable should consider order. But frozenset shows us that sometimes the hash should *not* consider order. This hints that perhaps the hash.from_iterable() should have its own optional dunder method. Or maybe we need two functions: an ordered version and an unordered version. Hmmm... just tossing out a wild idea here... let's get rid of the dunder method part of your suggestion, and add new public class methods to tuple and frozenset: tuple.hash_from_iter(iterable) frozenset.hash_from_iter(iterable) That gets rid of all the objections about backwards compatibility, since these are new methods. They're not dunder names, so there are no objections to being used as part of the public API. A possible objection is the question, is this functionality *actually* important enough to bother? Another possible objection: are these methods part of the sequence/set API? If not, do they really belong on the tuple/frozenset? Maybe they belong elsewhere? > Would this API more cleanly communicate the algorithm being used and the > implementation, No. If you want to communicate the algorithm being used, write some documentation. Seriously, the public API doesn't communicate the algorithm used for the implementation. How can it? We can keep the same interface and change the implementation, or change the interface and keep the implementation. The two are (mostly) independent. > while making a smaller increase in API surface area > compared to introducing a new function? It's difficult to quantify "API surface area". On the one hand, we have the addition of one or two new functions or methods. Contrast with: * introducing a new kind of method into the built-ins (one which behaves like a classmethod when called from the class, and like an instance method when called from an instance); * changing tuple.__hash__ from an ordinary method to one of the above special methods; * and likewise for frozenset.__hash__; * change __hash__ from "only used as implementation, not as interface" to "sometimes used as interface". To me, adding one or two new methods/functions is the smaller, or at least less disruptive, change. -- Steve _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/