Josh Rosenberg added the comment:

Assuming Siphash is in fact cryptographically secure (in the sense you can't 
choose a desired result hash with better odds that brute force), and it appears 
to be so, with a keyspace of 64 bits, if it's evenly distributed (which a 
cryptographically secure hash should be), that implies to even have a 1% chance 
of any two keys in a set colliding, you'd need over 2**29 keys ( you can plug 
in your own numbers for the conditional calculation here 
https://lazycackle.com/Probability_of_repeated_event_online_calculator__birthday_problem_.html
 ). Even at the one ten thousandth of a percent collision threshold, you're 
talking about a set of 6 million strings to find even one pair that match.

I'd still be leery of using such an approach for general purpose sets and 
dicts, where they could conceivably contain enough entries to pose a risk 
(vanishingly small, but not "heat death of the universe" small). But for Python 
implementation dictionaries (module, nested scope, class, and instance 
dictionaries), where we're talking about maybe a thousand attributes in extreme 
cases, which are almost never under the control of an "attacker" in any event, 
I could definitely see a low risk win. If you're assuming a dictionary with 
less than 10,000 keys, that's a hit would be literally one in a trillion; under 
a hundred and you're below one in a quadrillion chance, which I think is safe 
enough. If you wanted to make it "safe" you could conceivably use an approach 
that changed algorithms up front, depending on the size of the dictionary; less 
than a hundred entries, use hash only lookup, above 100, use "safe" lookup.

----------
nosy: +josh.r

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23712>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to