[issue13703] Hash collision security issue

2012-02-06 Thread Frank Sievertsen
Frank Sievertsen added the comment: > Agreed; it tops out with a constant, but if it takes only 16 bytes of > input to force another run through a 1000-long collision, that may > still be too much leverage. You should prepare the dict so that you have the collisions-run with a one-byt

[issue13703] Hash collision security issue

2012-01-25 Thread Frank Sievertsen
Frank Sievertsen added the comment: For the sake of completeness: Collision-counting (with Exception) has interesting effects, too. >>> d={((1<<(65+i))-2**(i+4)): 9 for i in range(1001)} >>> for i in list(d): ... del d[i] >>> d {} >>> 9 in d False &

[issue13703] Hash collision security issue

2012-01-25 Thread Frank Sievertsen
Frank Sievertsen added the comment: > Is it still able to find the value? Probably not. :( That's exactly why I stopped thinking about all two-hash-functions or rehashing ideas. -- ___ Python tracker <http://bugs.python.org

[issue13703] Hash collision security issue

2012-01-20 Thread Frank Sievertsen
Frank Sievertsen added the comment: > The hash value is just used to know if the key belongs to the left > or the right child tree. Yes, that's what I don't understand: How can you do this, when ALL hash-values are equal. -- __

[issue13703] Hash collision security issue

2012-01-20 Thread Frank Sievertsen
Frank Sievertsen added the comment: > They may be non-orderable, but since they are required to be hashable, > I guess one can build an comparison function with the following: Since we are are trying to fix a problem where hash(X) == hash(Y), you can't make them orderable by usi

[issue13703] Hash collision security issue

2012-01-19 Thread Frank Sievertsen
Frank Sievertsen added the comment: >> That's true. But without the suffix, I can pretty easy and efficient >> guess the prefix by just seeing the result of a few well-chosen and >> short repr(dict(X)). I suppose that's harder with the suffix. > Since the has

[issue13703] Hash collision security issue

2012-01-19 Thread Frank Sievertsen
Frank Sievertsen added the comment: > The suffix only introduces a constant change in all hash values > output, so even if you don't know the suffix, you can still > generate data sets with collisions by just having the prefix. That's true. But without the suffix, I

[issue13703] Hash collision security issue

2012-01-12 Thread Frank Sievertsen
Frank Sievertsen added the comment: I'd like to add a few notes: 1. both 32-bit and 64-bit python are vulnerable 2. collision-counting will break other things 3. imho, randomization is the way to go, enabled by default. 4. do we need a steady hash-function later again? I created ~500