Hao Hu writes:
 > > On 17 Dec 2021, at 15:28, Chris Angelico <ros...@gmail.com> wrote:

 > > The built-in hash() function is extremely generic, so it can't really
 > > work that way. Adding a parameter to it would require (a) adding the
 > > parameter to every __hash__ method of every object, including
 > > user-defined objects;
 > 
 > I would not say the opposite; however, maybe it appears to be more
 > complicated than it really is. Probably it is worth a small
 > analysis?

It's the user-defined objects that are the killer here.  We don't want
to go wrecking dozens of projects' objects.

 > >> For instance, if we create a caching programming interface that
 > >> relies on a distributed kv store,

I would be very suspicious of using Python's hash builtin for such a
purpose.  The Python hash functions are very carefully tuned for high
performance in one application only: equality testing in Python,
especially for dicts.  Many __hash__ methods omit much of the object
being hashed; if the variation in your keys depends only on those
attributes, you'll get a lot of collisions.  Others are extremely
predictable.  E.g., most integers and other numbers equal to integers
hash to themselves mod 2**61 - 1; I believe -1 is the only exception.
Being predictable as such may not be a problem for your kv store
cache, but predictable == pattern, and if your application happens to
match that pattern, you could again end up with a massive collision
problem.  I imagine this is much less likely to be a problem than the
case where keys depend on omitted attributes, since presumably the
__hash__ method is designed to cover the whole range.  And numbers are
the only case I know of offhand.
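
To make both failure modes concrete, here is a small sketch.  It assumes
CPython, reads the modulus from sys.hash_info rather than hard-coding
2**61 - 1, and uses a made-up Point class (not from any real project)
whose __hash__ ignores one attribute:

```python
import sys

M = sys.hash_info.modulus  # 2**61 - 1 on typical 64-bit builds

# Small non-negative integers hash to themselves.
small = [hash(n) for n in range(5)]

# Larger ones are reduced modulo M, so n and n + M collide.
n = 12345
collide = hash(n) == hash(n + M)

# Numbers equal to an integer share its hash, whatever their type.
same_across_types = hash(1) == hash(1.0) == hash(complex(1, 0))

# -1 is the exception: CPython uses -1 internally as an error code.
minus_one = hash(-1)


class Point:
    """Hypothetical class whose __hash__ omits the y attribute."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash(self.x)  # y does not contribute


# 1000 unequal points, but only one distinct hash value.
distinct = len({hash(Point(0, y)) for y in range(1000)})
print(small, collide, same_across_types, minus_one, distinct)
```

That is harmless for dict lookups, where __eq__ settles the matter, but
a kv store keyed only on the hash value would treat all of those points
as one entry.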

 > > I'd recommend hashlib:

+1
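
A minimal sketch of what that could look like for cache keys, assuming
the keys are JSON-serializable.  The stable_key helper, the choice of
blake2b, and the 16-byte digest size are my own illustrative choices,
not something proposed in this thread:

```python
import hashlib
import json


def stable_key(obj) -> str:
    """Hypothetical helper: derive a cache key that is stable across
    processes and machines from a JSON-serializable object."""
    # Canonical form: sorted keys and fixed separators, so equal dicts
    # serialize to identical bytes regardless of insertion order.
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    digest = hashlib.blake2b(payload.encode("utf-8"), digest_size=16)
    return digest.hexdigest()


k1 = stable_key({"user": 42, "page": "home"})
k2 = stable_key({"page": "home", "user": 42})
k3 = stable_key({"user": 43, "page": "home"})
print(k1 == k2, k1 == k3)
```

Unlike hash() on str and bytes, which is salted per process unless
PYTHONHASHSEED is fixed, the digest depends only on the serialized
bytes, so different workers agree on the key.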

 > Otherwise, would it be useful to add siphash24 or fnv to
 > hashlib as well?

I think that is a good idea.  To me, it seems relatively likely to be
accepted quickly.  However, many cryptographic algorithms are delicate
(e.g., to avoid timing attacks), so I could be wrong about that.  Folks
like Christian Heimes might be very concerned about the implementation
as well as the algorithm.

Note that Python/pyhash.c seems to have implementations of both of
these algorithms, although I don't know if these implementations
satisfy cryptographic needs.
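
For what it's worth, you can ask a running interpreter which algorithm
its build uses for str and bytes hashing via sys.hash_info.  The value
varies by build and version, so the sketch below only prints it and
does not assume a particular answer:

```python
import sys

info = sys.hash_info

# Name of the algorithm used for str/bytes/memoryview hashing in this
# build; the value depends on how the interpreter was compiled.
algorithm = info.algorithm

# Internal output size of that algorithm and size of its seed key.
hash_bits = info.hash_bits
seed_bits = info.seed_bits

print(f"algorithm={algorithm} hash_bits={hash_bits} seed_bits={seed_bits}")
```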

Steve
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/H3BBUYLMJAPGGD66MN3R7A7M7SEYAX66/
Code of Conduct: http://python.org/psf/codeofconduct/
