Hi,

Samuel Vivien (in cc) and I noticed two things related to the FFT code:

1) The use of mpn_add_n_sub_n is not enabled by default in mul_fft.c.
   Enabling it might give a small speedup in some cases.

2) The previous code to multiply integers (mpn_mul_fft_full), which did two
   multiplications mod 2^N1+1 and 2^N2+1, has been replaced by new code which
   computes the product mod 2^(2N)-1 for 2N ~ N1+N2, using the fact that
   2^(2N)-1 = (2^N+1)(2^(N/2)+1)(2^(N/4)+1)...(2^(N/2^k)+1)(2^(N/2^k)-1).

However, the new code is not always faster. For example, on an Intel Core
i5-4590 with gcc 10.2.1:

Vanilla GMP 6.2.1:

    $ ./speed -s 1000000,2000000,5000000,10000000 mpn_mul_n mpn_mul_n mpn_mul_n
                 mpn_mul_n    mpn_mul_n    mpn_mul_n
    1000000    0.447985000  0.452133000 #0.447968000
    2000000   #0.949739000  0.949996000  0.956713000
    5000000   #2.669156000  2.678638000  2.677642000
    10000000  #5.540204000  5.547306000  5.547152000

With --enable-old-fft-full (without WANT_ADDSUB):

    1000000    0.448750000  0.451685000 #0.447571000
    2000000   #0.941488000  0.942339000  0.947603000
    5000000   #2.668528000  2.675208000  2.669147000
    10000000   5.543246000  5.580915000 #5.538155000

Best regards,
Paul
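
P.S. For readers who want to check the factorization in point 2, here is a minimal sketch in Python. It is not GMP code and does not reflect how mul_fft.c or the mpn layer is organized; the function names fermat_like_factors and mulmod_2n_minus_1 are made up for this illustration, and plain Python integers stand in for limb arrays. It builds the factor list of 2^(2N)-1 and multiplies mod 2^n-1 by recursively splitting into a "mod 2^(n/2)+1" part and a "mod 2^(n/2)-1" part, recombined by CRT.

```python
# Illustration only: hypothetical helper names, plain Python integers.

def fermat_like_factors(N, k):
    """Return [2^N+1, 2^(N/2)+1, ..., 2^(N/2^k)+1, 2^(N/2^k)-1]."""
    assert N % (1 << k) == 0, "N must be divisible by 2^k"
    factors = [(1 << (N >> i)) + 1 for i in range(k + 1)]
    factors.append((1 << (N >> k)) - 1)
    return factors


def mulmod_2n_minus_1(a, b, n, depth):
    """Return a*b mod 2^n - 1, splitting the modulus up to `depth` times."""
    if depth == 0 or n % 2 == 1:
        # Base case: direct multiplication and reduction.
        return (a * b) % ((1 << n) - 1)
    h = n // 2
    minus = (1 << h) - 1   # 2^h - 1
    plus = (1 << h) + 1    # 2^h + 1
    # Product mod 2^h - 1, computed recursively.
    rm = mulmod_2n_minus_1(a % minus, b % minus, h, depth - 1)
    # Product mod 2^h + 1, computed directly in this sketch.
    rp = ((a % plus) * (b % plus)) % plus
    # CRT: 2^h + 1 is congruent to 2 mod 2^h - 1, and the inverse
    # of 2 mod 2^h - 1 is 2^(h-1), since 2 * 2^(h-1) = 2^h = 1 there.
    t = ((rm - rp) * (1 << (h - 1))) % minus
    return rp + plus * t


def product(values):
    """Multiply all integers in `values` together."""
    result = 1
    for v in values:
        result *= v
    return result


if __name__ == "__main__":
    N, k = 32, 3
    # The factor list multiplies back to 2^(2N) - 1.
    assert product(fermat_like_factors(N, k)) == (1 << (2 * N)) - 1
    # When a*b < 2^(2N) - 1, the residue is the exact product.
    a, b = 123456789, 987654321
    assert mulmod_2n_minus_1(a, b, 2 * N, k + 1) == a * b
    print("identity and wraparound product check passed")
```

When both operands fit in N bits, their product is below 2^(2N)-1, so the residue mod 2^(2N)-1 is the full product; that is the property the size choice 2N ~ N1+N2 relies on.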