Eric Appelt added the comment:

I also looked at hashes of strings themselves rather than frozensets to check the hashing of strings directly.
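A minimal sketch of this kind of check, not the script used to produce the attached plot. It assumes the strings are the combinations of the first n lowercase letters and that the "last 7 bits" of a hash means `hash(s) & 0x7F`. The helper names `letter_combinations` and `unique_low_bits` are hypothetical, as is the choice of 64-bit integers for the pseudorandom baseline.

```python
import random
from itertools import combinations
from string import ascii_lowercase


def letter_combinations(n):
    """Return all 2**n strings formed from combinations of the first n letters."""
    letters = ascii_lowercase[:n]
    return [
        "".join(combo)
        for size in range(n + 1)
        for combo in combinations(letters, size)
    ]


def unique_low_bits(values, bits=7):
    """Count the distinct values taken by the lowest `bits` bits of each integer."""
    mask = (1 << bits) - 1
    return len({v & mask for v in values})


# 128 strings for n=7: '', 'a', 'b', ..., 'abcdefg'
strings = letter_combinations(7)
str_count = unique_low_bits(hash(s) for s in strings)

# Baseline (assumed form): the same count for 128 pseudorandom 64-bit integers.
rand_count = unique_low_bits(random.getrandbits(64) for _ in range(len(strings)))

print(str_count, rand_count)
```

Since str hashes are randomized per interpreter process unless PYTHONHASHSEED is fixed, one run gives a single sample of the string count. A distribution would need many such samples, for instance from repeated runs.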
For example, for n=3:

    ['', 'a', 'b', 'c', 'ab', 'ac', 'bc', 'abc']

rather than:

    [frozenset(), frozenset({'a'}), frozenset({'b'}), frozenset({'c'}), frozenset({'b', 'a'}), frozenset({'c', 'a'}), frozenset({'b', 'c'}), frozenset({'b', 'a', 'c'})]

I made a distribution as in the last comment, but now using the number of unique last-7-bit sequences in a set of 128 such strings (n=7), and compared it to pseudorandom integers, just as was done before with frozensets of the letter combinations. This is shown in the file "str_string_n7_10k.png".

The last 7 bits of the small string hashes produce a distribution much like that of regular pseudorandom integers. So if there is a problem with the hash algorithm, it appears to be related to the frozenset hashing and not to strings.

----------
Added file: http://bugs.python.org/file45270/str_string_n7_10k.png

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26163>
_______________________________________