> I thought that the original problem was that with N insertions in the > dictionary, by repeatedly inserting different keys generating the same > hash value an attacker could arrange that the cost of finding an open > slot is O(N), and thus the cost of N insertions is O(N^2). > > If so, frequent resizing could make the attacker's problem much more > difficult, as the distribution of secondary probes should change with > each resize.
The attack creates 60,000 strings (or more) with exactly the same hash value. A dictionary uses hash(str) & DICT_MASK to compute the bucket index, where DICT_HASH is the number of buckets minus one. If all strings have the same hash value, we always start in the same bucket and the key has to be compared to all previous strings to find the next empty bucket. The attack works because a LOT of strings are compared and comparing strings is slow. If hash(str1)&DICT_MASK == hash(str2)&DICT_MASK but hash(str1)!=hash(str2), strings are not compared (this is a common optimization in Python), and the so the attack would not be successful (it would be slow, but not as slow as comparing two strings). Victor _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com