Hao Hu writes:
> > On 17 Dec 2021, at 15:28, Chris Angelico <ros...@gmail.com> wrote:

> > The built-in hash() function is extremely generic, so it can't really
> > work that way. Adding a parameter to it would require (a) adding the
> > parameter to every __hash__ method of every object, including
> > user-defined objects;
>
> I would not say the opposite, however maybe it appears to be more
> complicated than it is really is. Probably it is worth a small
> analysis?

It's the user-defined objects that are the killer here. We don't want
to go wrecking dozens of projects' objects.

> >> For instance, if we create a caching programming interface that
> >> relies on a distributed kv store,

I would be very suspicious of using Python's hash builtin for such a
purpose. The Python hash functions are very carefully tuned for high
performance in one application only: equality testing in Python,
especially for dicts. Many __hash__ methods omit much of the object
being hashed; if the variation in your keys depends only on those
omitted attributes, you'll get a lot of collisions.

Others are extremely predictable. E.g., most integers, and other
numbers equal to integers, hash to themselves mod 2**61 - 1; I believe
-1 is the only exception. Predictability as such may not be a problem
for your kv store cache, but predictable == pattern, and if your
application happens to match that pattern, you could again end up with
a massive collision problem. I imagine this is much less likely to be
a problem than the case where keys depend on omitted attributes, since
presumably the __hash__ method is designed to cover the whole range.
And numbers are the only case I know of offhand.

> > I'd recommend hashlib:

+1

> Otherwise, would that be useful to add siphash24 or fnv in the
> hashlib as well?

I think that is a good idea. To me, it seems relatively likely to be
accepted quickly. However, many cryptographic algorithms are delicate
(e.g., to avoid timing attacks), so I could be wrong about that.
Folks like Christian Heimes might be very concerned about the
implementation as well as the algorithm. Note that Python/pyhash.c
seems to have implementations of both of these algorithms, although I
don't know if these implementations satisfy cryptographic needs.

Steve
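
To make the two points above concrete (predictable integer hashes, and
hashlib for cache keys), here is a minimal sketch, assuming CPython.
The modulus is read from sys.hash_info rather than hard-coded, since
2**61 - 1 applies to 64-bit builds only. stable_key is a hypothetical
helper name, and BLAKE2b with a 16-byte digest is just one reasonable
choice from hashlib, not something settled in this thread.

```python
import hashlib
import sys


def int_hash_examples() -> dict:
    """Show how predictable CPython's numeric hashes are."""
    m = sys.hash_info.modulus  # 2**61 - 1 on 64-bit CPython builds
    return {
        "hash(10)": hash(10),          # small ints hash to themselves
        "hash(10.0)": hash(10.0),      # numbers equal to ints hash alike
        "hash(m)": hash(m),            # reduced modulo m
        "hash(m + 10)": hash(m + 10),  # collides with hash(10)
        "hash(-1)": hash(-1),          # the special case
    }


def stable_key(*parts: str) -> str:
    """Hypothetical helper: derive a cache key from string parts.

    Unlike hash() on str, the result does not depend on the
    per-process hash seed (PYTHONHASHSEED), so separate processes
    and machines agree on it.
    """
    h = hashlib.blake2b(digest_size=16)
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") differ.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


if __name__ == "__main__":
    for name, value in int_hash_examples().items():
        print(name, "=", value)
    print(stable_key("user", "42"))
```

A key built this way covers every byte fed in, so it avoids both
problems described above, at the cost of being slower than hash().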