Tim Peters <t...@python.org> added the comment:
Something that may be slightly worth pursuing: in j = (5*j + 1) % 2**power to get a full-period sequence hitting every integer in range(2**power) exactly once, the multiplier (5) is a bit delicate (it has to be congruent to 1 mod 4), but the addend (1) only needs to be odd. When the hash code has a long string of high-order 1 bits, the lower bits of `perturb` are all ones repeatedly across iterations, and a string of `power` one bits is congruent to -1 mod 2**power, cancelling out the +1. Which people stumbled into here: all negative ints with small absolute value _do_ have a long string of high-order 1 bits (as did also my contrived 2**60 - 2). If we changed the addend to, say, +3 instead, it would still be possible to contrive cases as bad, but you'd be unlikely to stumble into them by accident, and they would be sensitive to the value of `power` (the table's current size). For example, for a table size of 8 (the smallest we use), only ints ending with bits 101 are congruent to -3 mod 8. So maximally "bad" hash codes would be of the form (bits, where "x" is "doesn't matter", and grouped into chunks of PERTURB_SHIFT (5) bits from the right): ... xx101 xx101 ... xx101 xx101 xxxxx Provided changing +1 to +3 didn't slow anything down (you never know until you try!), that would be fine by me. But I'm not bothered by the current behavior, so I won't be pursuing it. ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue38105> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com