The code can be more easily browsed here:

https://github.com/wbhart/flint2/tree/trunk/fft

Bill.

On 2 January 2012 07:30, Bill Hart <goodwillh...@googlemail.com> wrote:
> I've now added the negacyclic transforms to the flint repo and cleaned
> it all up ready for some tuning code to be written.
>
> At the moment some hard coded tuning values are present in
> mulmod_2expp1.c. There are no tuning values for the main
> multiplication routine because there is no wrapper function
> mpn_mul_fft or whatever at this point (it's trivial to write one, it
> just gets the appropriate tuning values and calls
> fft_mul_mfa_sqrt2_truncate).
>
> A port to MPIR would involve the following:
>
> * replace FLINT_BITS with GMP_LIMB_BITS throughout, similarly with
> FLINT_MIN -> MIN
> * replace malloc with TMP_MARK, TMP_LIMBS_BALLOC throughout
> * replace free with TMP_FREE throughout
> * #include gmp-impl.h throughout
> * copy n_revbin from the ulong_extras directory (I hereby license it
> BSD as per the remainder of the FFT implementation)
> * add alternative code paths for mpn_sumdiff_n and mpn_addsub_n where
> these are used but not available on certain architectures
>
> Eventually when I add tuning code it will need rewriting for MPIR.
> It's trivial to tune this FFT.
>
> Obviously as two thirds of the time is spent in butterflies, any
> assembly optimisation of these will speed things up. I don't plan to
> do this myself.
>
> Bill.
>
> On 1 January 2012 02:28, Bill Hart <goodwillh...@googlemail.com> wrote:
>> I have finished implementing and debugging a new Nussbaumer
>> convolution. It isn't the absolutely fastest possible thing one can
>> do, but it is still slightly better than what is in mpir.
>>
>> The fourth column in the following table shows the times for the new
>> code. It is only used from a certain point on hence no changes to the
>> first half of the table:
>>
>> bits              mpir    new_fft  gmp    nussbaumer
>> 195840        1.149s 1.105s 0.997s
>> 261120        1.483s 1.415s 1.396s
>> 391296        0.261s 0.248s 0.282s
>> 521728        0.344s 0.315s 0.411s
>> 782592        0.577s 0.539s 0.628s
>> 1043456       0.706s 0.688s 0.848s
>> 1569024       1.229s 1.153s 1.317s
>> 2092032       1.543s 1.462s 2.765s
>> 3127296       0.283s 0.286s 0.408s
>> 4169728       0.357s 0.350s 0.543s
>> 6273024       0.621s 0.615s 0.843s
>> 8364032       0.831s 0.799s 1.156s
>> 12539904      1.441s 1.471s 1.798s
>> 16719872      0.230s 0.209s 0.288s
>> 25122816      0.379s 0.357s 0.434s
>> 33497088      0.524s 0.462s 0.646s
>> 50245632      0.833s 0.782s 1.035s
>> 66994176      1.596s 0.967s 1.358s
>> 100577280     1.906s 1.704s 2.177s 1.684s
>> 134103040     2.784s 2.162s 2.984s 2.154s
>> 201129984     3.971s 3.460s 4.536s 3.349s
>> 268173312     5.146s 4.419s 5.781s 4.350s
>> 402456576     7.548s 7.150s 9.867s 6.717s
>> 536608768     9.841s 9.264s 12.71s 8.804s
>> 804913152     15.48s 14.71s 20.06s 13.92s
>> 1073217536    21.17s 18.85s 27.19s 17.96s
>> 1610219520    31.64s 30.64s 43.37s 29.64s
>> 2146959360    43.25s 39.17s 57.66s 39.07s
>> 3220340736    70.14s 61.39s 92.94s 61.17s
>> 4293787648    96.00s 80.15s 146.1s 79.13s
>> 6441566208    150.2s 136.9s 217.5s 132.9s
>> 8588754944    208.4s 183.3s 312.8s 177.2s
>> 12883132416   327.4s 290.2s 447.7s 281.8s
>> 17177509888   485.0s 382.2s 614.2s 372.6s
>>
>> The final step is to write a few missing test functions and to write
>> tuning code. (Note the Nussbaumer code is in the mpir-fft repo not the
>> flint repo at this stage. I will fix this tomorrow.)
>>
>> Happy New Year!
>>
>> Bill.

-- 
You received this message because you are subscribed to the Google Groups 
"mpir-devel" group.
To post to this group, send email to mpir-devel@googlegroups.com.
To unsubscribe from this group, send email to 
mpir-devel+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/mpir-devel?hl=en.

Reply via email to