On 12/27/2016 07:13 PM, j...@math.brown.edu wrote:

> According to the docs [6]:
>
> """
> it is advised to mix together the hash values of the components of the
> object that also play a part in comparison of objects by packing them
> into a tuple and hashing the tuple. Example:
>
> def __hash__(self):
>     return hash((self.name, self.nick, self.color))
>
> """
>
> Applying this advice to the use cases above would require creating an
> arbitrarily large tuple in memory before passing it to hash(), which
> is then just thrown away. It would be preferable if there were a way
> to pass multiple values to hash() in a streaming fashion, such that
> the overall hash were computed incrementally, without building up a
> large object in memory first.

Part of the reason for writing __hash__ the way the docs suggest is that:

- it's simple
- it's reliable

However, it's not the only way to have a hash algorithm that works; in fact, 
the beginning of the sentence you quoted says:

    "The only required property is that objects which compare equal have
    the same hash value;"

In other words, objects that do not compare equal can also have the same hash
value (although too many such collisions will slow down lookups in Python's
hash-based containers, dict and set).
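
For instance, here is a rough sketch of a __hash__ that never builds a large
tuple. FrozenSeq is a made-up class for illustration, and hashing only the
length plus the first three items is an arbitrary choice, not a
recommendation. Equal instances always agree on both, so the one required
property holds; unequal instances may collide, which is allowed:

```python
class FrozenSeq:
    """Hypothetical immutable sequence wrapper, for illustration only."""

    def __init__(self, items):
        self._items = tuple(items)

    def __eq__(self, other):
        if isinstance(other, FrozenSeq):
            return self._items == other._items
        return NotImplemented

    def __hash__(self):
        # Only the length and a short prefix are hashed, so the temporary
        # tuple stays small no matter how many items are stored.
        return hash((len(self._items), self._items[:3]))


a = FrozenSeq(range(1000))
b = FrozenSeq(range(1000))
c = FrozenSeq(list(range(999)) + [-1])  # differs from a only in the last item

assert a == b and hash(a) == hash(b)  # equal objects, equal hashes (required)
assert a != c and hash(a) == hash(c)  # unequal objects colliding (allowed)
assert len({a, b, c}) == 2            # the set still tells a and c apart
```

The trade-off is the one mentioned above: the more unequal objects share a
hash, the more __eq__ calls the containers have to make.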

> (ii) priming the overall hash value with some class-specific initial
> value, so that if an instance of a different type of collection, which
> comprised the same items but which compared unequal, were to compute
> its hash value out of the same constituent items, we make sure our
> hash value differs.

This is unnecessary: hashes are compared first to weed out impossible
matches, but when the hashes are equal an actual __eq__ test is still done
[7].
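
To illustrate that order of operations, here is a small sketch. Counted is a
made-up class whose hash is set by hand, and the zero-call result for
differing hashes reflects how CPython's dict behaves; I am assuming rather
than asserting that other implementations do the same:

```python
class Counted:
    """Hypothetical key type that counts calls to __eq__."""

    eq_calls = 0

    def __init__(self, value, hash_value):
        self.value = value
        self._hash = hash_value  # hash chosen by hand, for illustration

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        Counted.eq_calls += 1
        if isinstance(other, Counted):
            return self.value == other.value
        return NotImplemented


table = {Counted("a", 1): "first"}

# Different hash: the lookup misses without consulting __eq__.
Counted.eq_calls = 0
found_other_hash = Counted("z", 2) in table
calls_when_hash_differs = Counted.eq_calls

# Same hash, different value: __eq__ is what tells the two keys apart.
Counted.eq_calls = 0
found_same_hash = Counted("b", 1) in table
calls_when_hash_collides = Counted.eq_calls

print(found_other_hash, calls_when_hash_differs)
print(found_same_hash, calls_when_hash_collides)
```

Neither lookup finds a match, but only the colliding one needs __eq__ to
reach that answer.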

--
~Ethan~


[6] https://docs.python.org/3/reference/datamodel.html#object.__hash__
[7] Some test code demonstrating the points above:

--- 8< ------------------------------------------------------------
from unittest import main, TestCase

class Eggs(object):
    def __init__(self, value):
        self.value = value
    def __hash__(self):
        return hash(self.value)
    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.value == other.value
        return NotImplemented

class Spam(object):
    def __init__(self, value):
        self.value = value
    def __hash__(self):
        return hash(self.value)
    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.value == other.value
        return NotImplemented

e10 = Eggs(1)
e20 = Eggs(2)
e11 = Eggs(1)
e21 = Eggs(2)

s10 = Spam(1)
s20 = Spam(2)
s11 = Spam(1)
s21 = Spam(2)

bag = {}
bag[e10] = 1
bag[s10] = 2
bag[e20] = 3
bag[s20] = 4

class TestEqualityAndHashing(TestCase):

    def test_equal(self):
        # same class, same value --> equal
        self.assertEqual(e10, e11)
        self.assertEqual(e20, e21)
        self.assertEqual(s10, s11)
        self.assertEqual(s20, s21)

    def test_not_equal(self):
        # different class, same value --> not equal
        self.assertEqual(e10.value, s10.value)
        self.assertNotEqual(e10, s10)
        self.assertEqual(e20.value, s20.value)
        self.assertNotEqual(e20, s20)

    def test_same_hash(self):
        # same class, same value, same hash
        self.assertEqual(hash(e10), hash(e11))
        self.assertEqual(hash(e20), hash(e21))
        self.assertEqual(hash(s10), hash(s11))
        self.assertEqual(hash(s20), hash(s21))

        # different class, same value, same hash
        self.assertEqual(hash(e10), hash(s10))
        self.assertEqual(hash(e11), hash(s11))
        self.assertEqual(hash(e20), hash(s20))
        self.assertEqual(hash(e21), hash(s21))

    def test_as_key(self):
        # different objects from different classes with the same hash
        # should still be distinct
        self.assertEqual(len(bag), 4)
        self.assertEqual(bag[e10], 1)
        self.assertEqual(bag[s10], 2)
        self.assertEqual(bag[e20], 3)
        self.assertEqual(bag[s20], 4)

        # equal objects of the same class (and so with the same hash)
        # should not be distinct
        self.assertEqual(bag[e11], 1)
        self.assertEqual(bag[s11], 2)
        self.assertEqual(bag[e21], 3)
        self.assertEqual(bag[s21], 4)

if __name__ == "__main__":
    main()
--- 8< ------------------------------------------------------------
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
