Tim Peters <[email protected]> added the comment:
A more principled change would be to replace instances of this:
i = (i*5 + perturb + 1) & mask;
with this:
i = (i*5 + (perturb << 1) + 1) & mask;
The latter spelling has no fixed points. That's easy to see: `(perturb << 1)
+ 1` is necessarily odd, and then `i*5 + ODD` is even when `i` is odd, and vice
versa.
I had hoped that the time for a new shift could overlap with the multiply
latency, but no such luck. At least Visual Studio didn't multiply to begin
with: it first computes `i*4 + perturb` cheaply with a single LEA instruction,
then adds 1 (INC), then adds in `i` again.
I expect it would be able to fold in a "free" shifted add of perturb with
another LEA, so the latter spelling isn't necessarily more expensive. But I'm
generally loathe to increase operation counts relying on clever compilers to
exploit processor-specific tricks.
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue38105>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com