I've now added the negacyclic transforms to the flint repo and cleaned
it all up ready for some tuning code to be written.

At the moment some hard coded tuning values are present in
mulmod_2expp1.c. There are no tuning values for the main
multiplication routine because there is no wrapper function
mpn_mul_fft or whatever at this point (it's trivial to write one, it
just gets the appropriate tuning values and calls
fft_mul_mfa_sqrt2_truncate).

A port to MPIR would involve the following:

* replace FLINT_BITS with GMP_LIMB_BITS throughout, similarly with
FLINT_MIN -> MIN
* replace malloc with TMP_MARK, TMP_LIMBS_BALLOC throughout
* replace free with TMP_FREE throughout
* #include gmp-impl.h throughout
* copy n_revbin from the ulong_extras directory (I hereby license it
BSD as per the remainder of the FFT implementation)
* add alternative code paths for mpn_sumdiff_n and mpn_addsub_n where
these are used but not available on certain architectures

Eventually when I add tuning code it will need rewriting for MPIR.
It's trivial to tune this FFT.

Obviously as two thirds of the time is spent in butterflies, any
assembly optimisation of these will speed things up. I don't plan to
do this myself.

Bill.

On 1 January 2012 02:28, Bill Hart <goodwillh...@googlemail.com> wrote:
> I have finished implementing and debugging a new Nussbaumer
> convolution. It isn't the absolutely fastest possible thing one can
> do, but it is still slightly better than what is in mpir.
>
> The fourth column in the following table shows the times for the new
> code. It is only used from a certain point on hence no changes to the
> first half of the table:
>
> bits              mpir    new_fft  gmp    nussbaumer
> 195840        1.149s 1.105s 0.997s
> 261120        1.483s 1.415s 1.396s
> 391296        0.261s 0.248s 0.282s
> 521728        0.344s 0.315s 0.411s
> 782592        0.577s 0.539s 0.628s
> 1043456       0.706s 0.688s 0.848s
> 1569024       1.229s 1.153s 1.317s
> 2092032       1.543s 1.462s 2.765s
> 3127296       0.283s 0.286s 0.408s
> 4169728       0.357s 0.350s 0.543s
> 6273024       0.621s 0.615s 0.843s
> 8364032       0.831s 0.799s 1.156s
> 12539904      1.441s 1.471s 1.798s
> 16719872      0.230s 0.209s 0.288s
> 25122816      0.379s 0.357s 0.434s
> 33497088      0.524s 0.462s 0.646s
> 50245632      0.833s 0.782s 1.035s
> 66994176      1.596s 0.967s 1.358s
> 100577280     1.906s 1.704s 2.177s 1.684s
> 134103040     2.784s 2.162s 2.984s 2.154s
> 201129984     3.971s 3.460s 4.536s 3.349s
> 268173312     5.146s 4.419s 5.781s 4.350s
> 402456576     7.548s 7.150s 9.867s 6.717s
> 536608768     9.841s 9.264s 12.71s 8.804s
> 804913152     15.48s 14.71s 20.06s 13.92s
> 1073217536    21.17s 18.85s 27.19s 17.96s
> 1610219520    31.64s 30.64s 43.37s 29.64s
> 2146959360    43.25s 39.17s 57.66s 39.07s
> 3220340736    70.14s 61.39s 92.94s 61.17s
> 4293787648    96.00s 80.15s 146.1s 79.13s
> 6441566208    150.2s 136.9s 217.5s 132.9s
> 8588754944    208.4s 183.3s 312.8s 177.2s
> 12883132416   327.4s 290.2s 447.7s 281.8s
> 17177509888   485.0s 382.2s 614.2s 372.6s
>
> The final step is to write a few missing test functions and to write
> tuning code. (Note the Nussbaumer code is in the mpir-fft repo not the
> flint repo at this stage. I will fix this tomorrow.)
>
> Happy New Year!
>
> Bill.

-- 
You received this message because you are subscribed to the Google Groups 
"mpir-devel" group.
To post to this group, send email to mpir-devel@googlegroups.com.
To unsubscribe from this group, send email to 
mpir-devel+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/mpir-devel?hl=en.

Reply via email to