Tim Peters <t...@python.org> added the comment:

> advantage of my approach is that high-order bits become more
> important:

I don't much care about high-order bits, beyond that we don't systematically 
_lose_ them.  The dict and set lookup routines have their own strategies for 
incorporating high-order bits into the probe sequence, and those routines are 
the primary reason hash() exists in Python.

So there's scant reason to try to let high bits influence low bits (but good 
reason _not_ to, because - as explained before - the low-order bits are already 
in great shape for the important tuple component types).

Something else appears to be going on in this specific example, which has a 
very touchy set of hash codes:

> BEFORE:
> >>> L = [n << 60 for n in range(100)]; T = [(a,b,c) for a in L for b in L for 
> >>> c in L]; len(set(hash(x) for x in T))
> 500000

Which is what I get too on (at least) a 64-bit build.

> AFTER:
> >>> L = [n << 60 for n in range(100)]; T = [(a,b,c) for a in L for b in L for 
> >>> c in L]; len(set(hash(x) for x in T))
> 1000000

Ditto.  However, I get exactly the same from the code I posted before, which 
has no right shifts at all.

More, if I take the code exactly as it exists today, and merely comment out the

    mult += (Py_hash_t)(82520UL + len + len);

line, I also get a million distinct hash codes.

Half of the hash codes have bit 2**60 set, and apart from those all the hash 
codes have no bits set higher than 2**6.  Any number of accidents could cause 
the 2**60 bits to wipe each other out, and merely changing the sequence of 
multipliers was demonstrably enough to stop that.

Indeed, merely changing 82520 to 82524 in the current code is enough to get a 
million unique hash codes in this case.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34751>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to