I couldn't resist adding a speedup to the fft. It is a patch of only
about 3-5 lines and I have now tested it for a few hours until my
machine ran out of memory.

I think this is the last thing to go in before we start tuning/testing
in earnest.

I'll roll a new alpha4 now.

Bill.

On 21 October 2012 00:59, Bill Hart <goodwillh...@googlemail.com> wrote:
> Hi all,
>
> I looked into the timing jumps in the FFT in MPIR 2.6 and in flint2.
>
> About 2/3 of the difference in timings is due to the fact that the FFT
> uses coefficients of twice the size (and convolution length just over
> half the original length) once you go over certain boundaries. The
> pointwise multiplications are thus far more expensive due to the
> quadratic nature of their complexity. (A small fraction of the time is
> due to zeroing coefficients, some of which may not actually need to be
> zeroed -- a target for future optimisation.)
>
> There is an optimisation that can be made which will help. Namely, one
> can use fft coefficients slightly bigger than the originals (after
> adjusting the convolution length and truncation so that this is
> possible). At some point I may implement this. It shouldn't be hard to
> do, but we probably need to wait until we have really efficient
> pointwise multiplications.
>
> The effect does become proportionally less as the integers multiplied
> get larger, as the fft coefficients move out of the quadratic range
> for multiplication. But there are clearly many opportunities for
> improvement here.
>
> Bill.
>
> On 17 October 2012 23:38, Bill Hart <goodwillh...@googlemail.com> wrote:
>> Hi all,
>>
>> In discussing another issue with Paul Zimmermann, he noted the
>> following ugly jump in performance in the flint/mpir fft for
>> multiplying two integers of the given number of limbs together:
>>
>>>         mpn_mul_fft_main
>>>      limbs               time
>>> 1046780    0.556109766
>>> 1046781    0.556993896
>>> 1046782    0.556378786
>>> 1046783    0.556399360
>>> 1046784    0.559116916
>>> 1046785    0.783891259
>>> 1046786    0.788226240
>>> 1046787    0.792805531
>>> 1046788    0.784128436
>>> 1046789    0.786670290
>>> 1046790    0.795639613
>>
>> Obviously this is supposed to increase smoothly. So it suggests a
>> problem in the fft.
>>
>> This could possibly be a few things I can think of:
>>
>> * incorrect "tuning" values (not so much the values in fft-tuning.h as
>> the way the functions pick appropriate n, w parameters, etc)
>>
>> * I am not doing the truncation correctly
>>
>> * there is a big jump overall because of a big jump in timing of
>> individual pointwise multiplications
>>
>> I haven't found the problem yet, but will endeavour to find it soon.
>> There is some limited hope that the fft times will improve if I can
>> find the problem (and it isn't too hard to fix).
>>
>> Bill.

-- 
You received this message because you are subscribed to the Google Groups 
"mpir-devel" group.
To post to this group, send email to mpir-devel@googlegroups.com.
To unsubscribe from this group, send email to 
mpir-devel+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/mpir-devel?hl=en.

Reply via email to