Niels,
> For these moderate sizes, does it pay off to precompute a full inverse
> for the Montgomery reduction, rather than using redc_1 or redc_2? (I
> don't quite understand which lines in your benchmark data I should look
> at).
I believe those sizes are too small. With a full inverse, w
Zimmermann Paul writes:
> I am trying to optimize the modular multiplications and squarings in GMP-ECM
> (where we use Montgomery's reduction).
For these moderate sizes, does it pay off to precompute a full inverse
for the Montgomery reduction, rather than using redc_1 or redc_2? (I
don't quite
Torbjörn,
> Providing special code for many un,vn combinations (as separate
> functions or as part of mpn_mul_basecase) quickly becomes unmanageable.
> If we want to handle all sizes <= 16 (say) we'll need 136 variants.
>
> I don't think it makes much sense providing code for just un=vn (e
Zimmermann Paul writes:
GMP currently has variable-size assembly code for mpn_mul_n on some
processors. Could it be faster to have fixed-size assembly code for
small values of n (say up to n=32)? Then mpn_mul_n() would simply be
a wrapper to those fixed-size functions, or to a variable-si
Hi,
GMP currently has variable-size assembly code for mpn_mul_n on some
processors. Could it be faster to have fixed-size assembly code for
small values of n (say up to n=32)? Then mpn_mul_n() would simply be
a wrapper to those fixed-size functions, or to a variable-size code
for n>32.
Pau
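
To make the wrapper idea above concrete, here is a minimal sketch in portable C. It is not GMP code: limb_t, mul_n_generic, mul_fixed_N, FIXED_MAX and my_mul_n are all hypothetical names, the limbs are 32 bits so that a plain uint64_t can hold each partial product, and the "fixed-size" routines are just the generic loop instantiated with a compile-time constant rather than hand-written assembly. It only illustrates the dispatch structure (a table of fixed-size functions for small n, with a variable-size fallback) and says nothing about whether this would be faster in practice.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;  /* hypothetical limb type, not GMP's mp_limb_t */

/* Variable-size schoolbook product: rp[0..2n-1] = ap[0..n-1] * bp[0..n-1].
   rp must not overlap ap or bp. */
static inline void mul_n_generic(limb_t *rp, const limb_t *ap,
                                 const limb_t *bp, size_t n)
{
    for (size_t i = 0; i < 2 * n; i++)
        rp[i] = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t carry = 0;
        for (size_t j = 0; j < n; j++) {
            /* Cannot overflow: (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1. */
            uint64_t t = (uint64_t)ap[j] * bp[i] + rp[i + j] + carry;
            rp[i + j] = (limb_t)t;
            carry = t >> 32;
        }
        rp[i + n] = (limb_t)carry;
    }
}

/* Stand-ins for fixed-size routines: the size is a compile-time constant,
   so the compiler is free to unroll the loops completely. */
#define DEFINE_MUL_FIXED(N)                                              \
    static void mul_fixed_##N(limb_t *rp, const limb_t *ap,              \
                              const limb_t *bp)                          \
    {                                                                    \
        mul_n_generic(rp, ap, bp, N);                                    \
    }

DEFINE_MUL_FIXED(1)
DEFINE_MUL_FIXED(2)
DEFINE_MUL_FIXED(3)
DEFINE_MUL_FIXED(4)

#define FIXED_MAX 4  /* assumption for the sketch; the mail suggests up to 32 */

typedef void (*mul_fixed_fn)(limb_t *, const limb_t *, const limb_t *);

static const mul_fixed_fn fixed_table[FIXED_MAX + 1] = {
    NULL, mul_fixed_1, mul_fixed_2, mul_fixed_3, mul_fixed_4
};

/* Wrapper: fixed-size code for small n, variable-size code otherwise. */
static void my_mul_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    if (n >= 1 && n <= FIXED_MAX)
        fixed_table[n](rp, ap, bp);
    else
        mul_n_generic(rp, ap, bp, n);
}
```

The indirect call through the table is itself a cost at these sizes, so a caller that already knows n (as a modular multiplication with a fixed modulus size would) might prefer to look up the function pointer once and reuse it.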