Re: fixed-size mpn_mul_n for small n?

2012-02-12 Thread Zimmermann Paul
Niels,
> For these moderate sizes, does it pay off to precompute a full inverse
> for the Montgomery reduction, rather than using redc_1 or redc_2? (I
> don't quite understand which lines in your benchmark data I should look
> at).
I believe those sizes are too small. With a full inverse, w

Re: fixed-size mpn_mul_n for small n?

2012-02-12 Thread Niels Möller
Zimmermann Paul writes:
> I am trying to optimize the modular multiplications and squarings in GMP-ECM
> (where we use Montgomery's reduction).
For these moderate sizes, does it pay off to precompute a full inverse for the Montgomery reduction, rather than using redc_1 or redc_2? (I don't quite
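To make the question concrete, here is a minimal sketch of Montgomery reduction driven by a single-limb inverse, which is the general idea behind a redc_1-style routine. It is not GMP's code: `limb_t`, `neg_inverse` and `redc_sketch` are hypothetical names, and 32-bit limbs are assumed so that every intermediate product fits in a `uint64_t`. A full-inverse variant would instead precompute an n-limb inverse once and replace the per-limb loop with larger multiplications.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;  /* assumption: 32-bit limbs, 64-bit accumulator */

/* Return -m0^{-1} mod 2^32 for odd m0 (the one-limb inverse). */
static limb_t neg_inverse(limb_t m0)
{
    assert(m0 & 1u);
    limb_t x = m0;                      /* m0*m0 == 1 mod 8: 3 correct bits */
    for (int i = 0; i < 4; i++)
        x *= 2u - m0 * x;               /* Newton step doubles the precision */
    return (limb_t)0 - x;
}

/* r[0..n-1] = t * 2^(-32n) mod m, where t has 2n limbs and t < m * 2^(32n).
   t is used as scratch space and is overwritten. */
static void redc_sketch(limb_t *r, limb_t *t, const limb_t *m,
                        size_t n, limb_t minv)
{
    limb_t top = 0;                     /* carry out of the 2n-limb buffer */
    for (size_t i = 0; i < n; i++) {
        /* Choose q so that adding q*m clears limb i. */
        limb_t q = (limb_t)((uint64_t)t[i] * minv);
        uint64_t carry = 0;
        for (size_t j = 0; j < n; j++) {
            uint64_t acc = (uint64_t)q * m[j] + t[i + j] + carry;
            t[i + j] = (limb_t)acc;
            carry = acc >> 32;
        }
        for (size_t k = i + n; k < 2 * n; k++) {
            uint64_t acc = (uint64_t)t[k] + carry;
            t[k] = (limb_t)acc;
            carry = acc >> 32;
        }
        top += (limb_t)carry;
    }

    /* The high half (with top) is below 2m; subtract m once if needed. */
    const limb_t *hi = t + n;
    int ge = (top != 0);
    if (!ge) {
        ge = 1;                         /* equality also triggers subtraction */
        for (size_t i = n; i-- > 0; ) {
            if (hi[i] != m[i]) { ge = hi[i] > m[i]; break; }
        }
    }
    uint64_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t d = (uint64_t)hi[i] - (ge ? m[i] : 0u) - borrow;
        r[i] = (limb_t)d;
        borrow = (d >> 32) & 1u;        /* set when the subtraction wrapped */
    }
}
```

In this sketch the inverse costs a handful of single-limb operations, whereas the reduction itself performs n passes of n multiply-accumulate steps. The trade-off raised in the thread is whether replacing those passes by multiplications with a precomputed n-limb inverse wins at moderate n.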

Re: fixed-size mpn_mul_n for small n?

2012-02-12 Thread Zimmermann Paul
Torbjörn,
> Providing special code for many un,vn combinations (as separate
> functions or as part of mpn_mul_basecase) quickly becomes unmanageable.
> If we want to handle all sizes <= 16 (say) we'll need 136 variants.
>
> I don't think it makes much sense providing code for just un=vn (e
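The figure of 136 can be checked by counting the size pairs, assuming the usual convention that un >= vn: there are 1 + 2 + ... + 16 = 16*17/2 = 136 such pairs with sizes up to 16 limbs. The short sketch below (with the hypothetical helper name `count_size_pairs`) simply enumerates them.

```c
#include <assert.h>

/* Count the (un, vn) pairs with 1 <= vn <= un <= max_limbs,
   i.e. the number of fixed-size variants one would have to provide. */
static int count_size_pairs(int max_limbs)
{
    assert(max_limbs >= 0);
    int count = 0;
    for (int un = 1; un <= max_limbs; un++)
        for (int vn = 1; vn <= un; vn++)
            count++;
    return count;
}
```

Restricting to un = vn would need only 16 variants for the same bound, while extending the full set to 32 limbs would need 32*33/2 = 528.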

Re: fixed-size mpn_mul_n for small n?

2012-02-12 Thread Torbjorn Granlund
Zimmermann Paul writes:
> GMP currently has variable-size assembly code for mpn_mul_n on some processors. Could it be faster to have fixed-size assembly code for small values of n (say up to n=32)? Then mpn_mul_n() would simply be a wrapper to those fixed-size functions, or to variable-si

fixed-size mpn_mul_n for small n?

2012-02-12 Thread Zimmermann Paul
Hi,
GMP currently has variable-size assembly code for mpn_mul_n on some processors. Could it be faster to have fixed-size assembly code for small values of n (say up to n=32)? Then mpn_mul_n() would simply be a wrapper to those fixed-size functions, or to variable-size code for n>32. Pau
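The wrapper structure proposed here could look roughly like the following sketch. It is only an illustration in portable C, not GMP code: `limb_t`, `mul_body`, `mul_1` to `mul_4`, `MAX_FIXED` and `mul_n_dispatch` are hypothetical names, 32-bit limbs are assumed, and the fixed-size routines are compiler-specialized stand-ins for what would be hand-written assembly. Only sizes up to 4 are covered to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;  /* assumption: 32-bit limbs, 64-bit accumulator */

/* Schoolbook product: rp[0..2n-1] = ap[0..n-1] * bp[0..n-1]. */
static inline void mul_body(limb_t *rp, const limb_t *ap,
                            const limb_t *bp, size_t n)
{
    for (size_t i = 0; i < 2 * n; i++)
        rp[i] = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t carry = 0;
        for (size_t j = 0; j < n; j++) {
            uint64_t acc = (uint64_t)ap[j] * bp[i] + rp[i + j] + carry;
            rp[i + j] = (limb_t)acc;
            carry = acc >> 32;
        }
        rp[i + n] = (limb_t)carry;
    }
}

/* Fixed-size entry points: n is a compile-time constant in each one,
   so the compiler is free to unroll the loops completely. */
#define DEFINE_MUL_FIXED(N)                                            \
    static void mul_##N(limb_t *rp, const limb_t *ap, const limb_t *bp) \
    { mul_body(rp, ap, bp, N); }

DEFINE_MUL_FIXED(1)
DEFINE_MUL_FIXED(2)
DEFINE_MUL_FIXED(3)
DEFINE_MUL_FIXED(4)

#define MAX_FIXED 4

typedef void (*mul_fixed_fn)(limb_t *, const limb_t *, const limb_t *);

/* Index n selects the routine for n limbs; slot 0 is unused. */
static const mul_fixed_fn fixed_table[MAX_FIXED + 1] = {
    NULL, mul_1, mul_2, mul_3, mul_4
};

/* Wrapper: fixed-size code for small n, variable-size code otherwise. */
static void mul_n_dispatch(limb_t *rp, const limb_t *ap,
                           const limb_t *bp, size_t n)
{
    assert(n > 0);
    if (n <= MAX_FIXED)
        fixed_table[n](rp, ap, bp);
    else
        mul_body(rp, ap, bp, n);
}
```

Whether this pays off depends on how the cost of the extra indirect call and the larger code footprint compares with the loop overhead saved, which is the question the message raises.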