Hi,
On Wed, 30 May 2018 10:20 am, Niels Möller wrote:
> "Marco Bodrato" writes:
> Thinking about micro optimizations... Consider
>> count_trailing_zeros (c, ulimb);
>> ulimb = (ulimb >> 1) >> c;
> vs
>> count_trailing_zeros (c, ulimb);
>> ulimb >>= (c + 1);
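For illustration only, here is a minimal C sketch of the two forms quoted above. It assumes a 64-bit limb, and ctz64 is a hypothetical portable stand-in for GMP's count_trailing_zeros macro (the function names are mine, not from the patch). The two forms agree whenever c + 1 is smaller than the limb width. When ulimb has only its top bit set, c + 1 equals the limb width and a single shift by that amount is undefined in C, which may be one reason to prefer the two-shift form; the messages here do not say so.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for count_trailing_zeros: portable loop,
   x must be nonzero. */
static unsigned ctz64(uint64_t x)
{
    unsigned c = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        c++;
    }
    return c;
}

/* Two-shift form: shifts out the trailing zeros and the lowest set bit.
   Each shift count is at most 63, so both shifts are always defined. */
static uint64_t shift_two_steps(uint64_t ulimb)
{
    unsigned c = ctz64(ulimb);
    return (ulimb >> 1) >> c;
}

/* One-shift form: same result, but only defined when c + 1 < 64. */
static uint64_t shift_one_step(uint64_t ulimb)
{
    unsigned c = ctz64(ulimb);
    assert(c + 1 < 64); /* a shift by 64 would be undefined behavior */
    return ulimb >> (c + 1);
}
```

With ulimb = 40 (binary 101000), c is 3 and both functions return 2. With ulimb = 2^63, only shift_two_steps is safe to call, and it returns 0.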
ni...@lysator.liu.se (Niels Möller) writes:
But you may be right that it's important for performance to avoid a
redundant count_trailing_zeros on u.
It is slow on several machines, but it has become more common to provide
good leading/trailing bit count for newer microarchs.
It seems
"Marco Bodrato" writes:
> ... the effect is that in many cases (if U doesn't need reduction modulo V)
> the trailing zeros of U are removed twice.
But you may be right that it's important for performance to avoid a
redundant count_trailing_zeros on u.
It seems tricky to avoid that, without code
Hi,
On Mon, 28 May 2018 10:00 pm, Niels Möller wrote:
> I'd suggest the below (complete file, I think that's more readable
The code is clean.
You removed all gotos...
> The last part of the function requires vlimb odd, but tolerates
> arbitrary u, including 0.
... the effect is that