The code can be more easily browsed here: https://github.com/wbhart/flint2/tree/trunk/fft
Bill. On 2 January 2012 07:30, Bill Hart <goodwillh...@googlemail.com> wrote: > I've now added the negacyclic transforms to the flint repo and cleaned > it all up ready for some tuning code to be written. > > At the moment some hard coded tuning values are present in > mulmod_2expp1.c. There are no tuning values for the main > multiplication routine because there is no wrapper function > mpn_mul_fft or whatever at this point (it's trivial to write one, it > just gets the appropriate tuning values and calls > fft_mul_mfa_sqrt2_truncate). > > A port to MPIR would involve the following: > > * replace FLINT_BITS with GMP_LIMB_BITS throughout, similarly with > FLINT_MIN -> MIN > * replace malloc with TMP_MARK, TMP_LIMBS_BALLOC throughout > * replace free with TMP_FREE throughout > * #include gmp-impl.h throughout > * copy n_revbin from the ulong_extras directory (I hereby license it > BSD as per the remainder of the FFT implementation) > * add alternative code paths for mpn_sumdiff_n and mpn_addsub_n where > these are used but not available on certain architectures > > Eventually when I add tuning code it will need rewriting for MPIR. > It's trivial to tune this FFT. > > Obviously as two thirds of the time is spent in butterflies, any > assembly optimisation of these will speed things up. I don't plan to > do this myself. > > Bill. > > On 1 January 2012 02:28, Bill Hart <goodwillh...@googlemail.com> wrote: >> I have finished implementing and debugging a new Nussbaumer >> convolution. It isn't the absolutely fastest possible thing one can >> do, but it is still slightly better than what is in mpir. >> >> The fourth column in the following table shows the times for the new >> code. It is only used from a certain point on hence no changes to the >> first half of the table: >> >> bits mpir new_fft gmp nussbaumer >> 195840 1.149s 1.105s 0.997s >> 261120 1.483s 1.415s 1.396s >> 391296 0.261s 0.248s 0.282s >> 521728 0.344s 0.315s 0.411s >> 782592 0.577s 0.539s 0.628s >> 1043456 0.706s 0.688s 0.848s >> 1569024 1.229s 1.153s 1.317s >> 2092032 1.543s 1.462s 2.765s >> 3127296 0.283s 0.286s 0.408s >> 4169728 0.357s 0.350s 0.543s >> 6273024 0.621s 0.615s 0.843s >> 8364032 0.831s 0.799s 1.156s >> 12539904 1.441s 1.471s 1.798s >> 16719872 0.230s 0.209s 0.288s >> 25122816 0.379s 0.357s 0.434s >> 33497088 0.524s 0.462s 0.646s >> 50245632 0.833s 0.782s 1.035s >> 66994176 1.596s 0.967s 1.358s >> 100577280 1.906s 1.704s 2.177s 1.684s >> 134103040 2.784s 2.162s 2.984s 2.154s >> 201129984 3.971s 3.460s 4.536s 3.349s >> 268173312 5.146s 4.419s 5.781s 4.350s >> 402456576 7.548s 7.150s 9.867s 6.717s >> 536608768 9.841s 9.264s 12.71s 8.804s >> 804913152 15.48s 14.71s 20.06s 13.92s >> 1073217536 21.17s 18.85s 27.19s 17.96s >> 1610219520 31.64s 30.64s 43.37s 29.64s >> 2146959360 43.25s 39.17s 57.66s 39.07s >> 3220340736 70.14s 61.39s 92.94s 61.17s >> 4293787648 96.00s 80.15s 146.1s 79.13s >> 6441566208 150.2s 136.9s 217.5s 132.9s >> 8588754944 208.4s 183.3s 312.8s 177.2s >> 12883132416 327.4s 290.2s 447.7s 281.8s >> 17177509888 485.0s 382.2s 614.2s 372.6s >> >> The final step is to write a few missing test functions and to write >> tuning code. (Note the Nussbaumer code is in the mpir-fft repo not the >> flint repo at this stage. I will fix this tomorrow.) >> >> Happy New Year! >> >> Bill. -- You received this message because you are subscribed to the Google Groups "mpir-devel" group. To post to this group, send email to mpir-devel@googlegroups.com. To unsubscribe from this group, send email to mpir-devel+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/mpir-devel?hl=en.