On Fri, Dec 30, 2016 at 9:29 AM,  <j...@math.brown.edu> wrote:
> Updating the docs sounds like the more important change for now, given 3.7+.
> But before the docs make an official recommendation for that recipe, were
> the analyses that Steve and I did sufficient to confirm that its hash
> distribution and performance is good enough at scale, or is more rigorous
> analysis necessary?
>
> I've been trying to find a reasonably detailed and up-to-date reference on
> Python hash() result requirements and analysis methodology, with
> instructions on how to confirm if they're met, but am still looking. Would
> find that an interesting read if it's out there. But I'd take just an
> authoritative thumbs up here too. Just haven't heard one yet.

I'm not an expert on hash table implementation subtleties, but I can
quote Objects/dictobject.c: "Most hash schemes depend on having a
"good" hash function, in the sense of simulating randomness. Python
doesn't ..."

https://github.com/python/cpython/blob/d0a2f68a5f972e4474f5ecce182752f6f7a22178/Objects/dictobject.c#L133

(Basically IIUC the idea is that Python dicts use a relatively
sophisticated probe sequence to resolve collisions which allows the
use of relatively "poor" hashes. There are interesting trade-offs
here...)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to