On 28.12.2016 04:13, j...@math.brown.edu wrote:
> Suppose you have implemented an immutable Position type to represent
> the state of a game played on an MxN board, where the board size can
> grow quite large.
> ...
>
> According to
> https://docs.python.org/3/reference/datamodel.html#object.__hash__ :
>
> """
> it is advised to mix together the hash values of the components of the
> object that also play a part in comparison of objects by packing them
> into a tuple and hashing the tuple. Example:
>
>     def __hash__(self):
>         return hash((self.name, self.nick, self.color))
>
> """
>
> Applying this advice to the use cases above would require creating an
> arbitrarily large tuple in memory before passing it to hash(), which
> is then just thrown away. It would be preferable if there were a way
> to pass multiple values to hash() in a streaming fashion, such that
> the overall hash were computed incrementally, without building up a
> large object in memory first.

I think there's a misunderstanding here: the hash(obj) built-in merely
interfaces to the obj.__hash__() method (or the tp_hash slot for C
types) and returns whatever these methods give.

It doesn't implement any logic by itself.

If you would like to implement a more efficient hash algorithm
for your types, just go ahead and write it as a .__hash__()
method or tp_hash slot function and you're done.

The example from the docs is only meant to show how such a hash
function should work, i.e. that it should mix in all relevant
data attributes.

In your case, you'd probably use a simple for loop to calculate
the hash without creating tuples or any other temporary
structures.

As an example, here's the hash implementation tuples use:

    /* The addend 82520, was selected from the range(0, 1000000) for
       generating the greatest number of prime multipliers for tuples
       upto length eight:

         1082527, 1165049, 1082531, 1165057, 1247581, 1330103, 1082533,
         1330111, 1412633, 1165069, 1247599, 1495177, 1577699

       Tests have shown that it's not worth to cache the hash value, see
       issue #9685.
    */

    static Py_hash_t
    tuplehash(PyTupleObject *v)
    {
        Py_uhash_t x;  /* Unsigned for defined overflow behavior. */
        Py_hash_t y;
        Py_ssize_t len = Py_SIZE(v);
        PyObject **p;
        Py_uhash_t mult = _PyHASH_MULTIPLIER;
        x = 0x345678UL;
        p = v->ob_item;
        while (--len >= 0) {
            y = PyObject_Hash(*p++);
            if (y == -1)
                return -1;
            x = (x ^ y) * mult;
            /* the cast might truncate len; that doesn't change hash stability */
            mult += (Py_hash_t)(82520UL + len + len);
        }
        x += 97531UL;
        if (x == (Py_uhash_t)-1)
            x = -2;
        return x;
    }

As you can see, there's some magic going on there to make
sure that the hash values behave well when used as "keys"
for the dictionary implementation (which is their main
purpose in Python).

You are free to create your own hash implementation.
The only requirement to pay attention to is that objects which
compare equal must give the same hash value. This is needed
so that such objects map to the same dictionary slots.
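
To make the "simple for loop" suggestion concrete, here is a minimal
sketch in Python. The Position class, its row-based storage and the
_cells() helper are hypothetical names invented for illustration; the
mixing step is only loosely modelled on tuplehash() above (the
multiplier update is simplified), so it will not reproduce the hash
values of real tuples. The 64-bit mask is an assumption that stands in
for C unsigned overflow.

```python
class Position:
    """Hypothetical immutable board position (illustration only)."""

    _MASK = (1 << 64) - 1      # assumption: emulate unsigned 64-bit wraparound
    _MULTIPLIER = 1000003      # same starting multiplier as tuplehash()

    def __init__(self, rows):
        # Copy the rows so later changes to the caller's data cannot
        # alter this object (and with it, its hash).
        self._rows = [list(row) for row in rows]

    def _cells(self):
        # Yield the cells one at a time; no flat tuple is ever built.
        for row in self._rows:
            yield from row

    def __eq__(self, other):
        if not isinstance(other, Position):
            return NotImplemented
        return self._rows == other._rows

    def __hash__(self):
        # Mix in one cell per iteration, keeping only a running value.
        x = 0x345678
        mult = self._MULTIPLIER
        for cell in self._cells():
            x = ((x ^ hash(cell)) * mult) & self._MASK
            mult = (mult + 82520) & self._MASK   # simplified update
        return (x + 97531) & self._MASK


a = Position([[0, 1], [1, 0]])
b = Position([[0, 1], [1, 0]])
print(a == b, hash(a) == hash(b))
```

Because __hash__() depends only on the cells that __eq__() compares,
two equal positions always hash alike, which is the one property the
dict implementation relies on. Boards with the same cells but a
different shape would collide in this sketch; that is permitted, just
less efficient, and could be avoided by also mixing in the row length.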

There should be no need to have a special hash function which
works on iterables. As long as those iterable objects define
their own .__hash__() method or tp_hash slot, the hash() built-in
(and Python's dict implementation) will use these. If needed,
those methods can then build the hash value by iterating over
the object's internal data, along similar lines as the tuple
implementation above.

-- 
Marc-Andre Lemburg
eGenix.com (Jan 05 2017)