[issue13703] Hash collision security issue

2012-02-06 Thread Frank Sievertsen
Frank Sievertsen pyt...@sievertsen.de added the comment: Agreed; it tops out with a constant, but if it takes only 16 bytes of input to force another run through a 1000-long collision, that may still be too much leverage. You should prepare the dict so that you have the collisions-run

[issue13703] Hash collision security issue

2012-01-25 Thread Frank Sievertsen
Frank Sievertsen pyt...@sievertsen.de added the comment: Is it still able to find the value? Probably not. :( That's exactly why I stopped thinking about all two-hash-functions or rehashing ideas. -- ___ Python tracker rep...@bugs.python.org

[issue13703] Hash collision security issue

2012-01-25 Thread Frank Sievertsen
Frank Sievertsen pyt...@sievertsen.de added the comment: For the sake of completeness: Collision-counting (with Exception) has interesting effects, too. d={((1(65+i))-2**(i+4)): 9 for i in range(1001)} for i in list(d): ... del d[i] d {} 9 in d False 0 in d Traceback (most recent call

[issue13703] Hash collision security issue

2012-01-20 Thread Frank Sievertsen
Frank Sievertsen pyt...@sievertsen.de added the comment: They may be non-orderable, but since they are required to be hashable, I guess one can build an comparison function with the following: Since we are are trying to fix a problem where hash(X) == hash(Y), you can't make them orderable

[issue13703] Hash collision security issue

2012-01-20 Thread Frank Sievertsen
Frank Sievertsen pyt...@sievertsen.de added the comment: The hash value is just used to know if the key belongs to the left or the right child tree. Yes, that's what I don't understand: How can you do this, when ALL hash-values are equal

[issue13703] Hash collision security issue

2012-01-19 Thread Frank Sievertsen
Frank Sievertsen pyt...@sievertsen.de added the comment: The suffix only introduces a constant change in all hash values output, so even if you don't know the suffix, you can still generate data sets with collisions by just having the prefix. That's true. But without the suffix, I can pretty

[issue13703] Hash collision security issue

2012-01-19 Thread Frank Sievertsen
Frank Sievertsen pyt...@sievertsen.de added the comment: That's true. But without the suffix, I can pretty easy and efficient guess the prefix by just seeing the result of a few well-chosen and short repr(dict(X)). I suppose that's harder with the suffix. Since the hash function is known

[issue13703] Hash collision security issue

2012-01-12 Thread Frank Sievertsen
Frank Sievertsen pyt...@sievertsen.de added the comment: I'd like to add a few notes: 1. both 32-bit and 64-bit python are vulnerable 2. collision-counting will break other things 3. imho, randomization is the way to go, enabled by default. 4. do we need a steady hash-function later again? I