Re: [Python-Dev] Status of the fix for the hash collision vulnerability

Martin v. Löwis Tue, 17 Jan 2012 12:33:18 -0800

> I thought that the original problem was that with N insertions in the
> dictionary, by repeatedly inserting different keys generating the same
> hash value an attacker could arrange that the cost of finding an open
> slot is O(N), and thus the cost of N insertions is O(N^2).
> 
> If so, frequent resizing could make the attacker's problem much more
> difficult, as the distribution of secondary probes should change with
> each resize.


Not sure what you mean by "distribution of secondary probes".

Let H be the initial hash value, and let MASK be the current size
of the dictionary. Then I(n), the sequence of dictionary indices
being probed, is computed as

   I(0) = H & MASK
   PERTURB(0) = H
   I(n+1) = (5*I(n) + 1 + PERTURB(n)) & MASK
   PERTURN(n+1) = PERTURB(n) >> 5

So if two objects O1 and O2 have the same hash value H, the sequence of
probed indices is the same for any MASK value. It will be a different
sequence, yes, but they will still collide on each and every slot.

This is the very nature of open addressing. If it *wouldn't* try all
indices in the probe sequence, it may not be possible to perform
the lookup for a key correctly.

Regards,
Martin
_______________________________________________
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

Reply via email to