I couldn't resist adding a speedup to the fft. It is a patch of only about 3-5 lines and I have now tested it for a few hours until my machine ran out of memory.
I think this is the last thing to go in before we start tuning/testing in earnest. I'll roll a new alpha4 now. Bill. On 21 October 2012 00:59, Bill Hart <goodwillh...@googlemail.com> wrote: > Hi all, > > I looked into the timing jumps in the FFT in MPIR 2.6 and in flint2. > > About 2/3 of the difference in timings is due to the fact that the FFT > uses coefficients of twice the size (and convolution length just over > half the original length) once you go over certain boundaries. The > pointwise multiplications are thus far more expensive due to the > quadratic nature of their complexity. (A small fraction of the time is > due to zeroing coefficients, some of which may not actually need to be > zeroed -- a target for future optimisation.) > > There is an optimisation that can be made which will help. Namely, one > can use fft coefficients slightly bigger than the originals (after > adjusting the convolution length and truncation so that this is > possible). At some point I may implement this. It shouldn't be hard to > do, but we probably need to wait until we have really efficient > pointwise multiplications. > > The effect does become proportionally less as the integers multiplied > get larger, as the fft coefficients move out of the quadratic range > for multiplication. But there are clearly many opportunities for > improvement here. > > Bill. > > On 17 October 2012 23:38, Bill Hart <goodwillh...@googlemail.com> wrote: >> Hi all, >> >> In discussing another issue with Paul Zimmermann, he noted the >> following ugly jump in performance in the flint/mpir fft for >> multiplying two integers of the given number of limbs together: >> >>> mpn_mul_fft_main >>> limbs time >>> 1046780 0.556109766 >>> 1046781 0.556993896 >>> 1046782 0.556378786 >>> 1046783 0.556399360 >>> 1046784 0.559116916 >>> 1046785 0.783891259 >>> 1046786 0.788226240 >>> 1046787 0.792805531 >>> 1046788 0.784128436 >>> 1046789 0.786670290 >>> 1046790 0.795639613 >> >> Obviously this is supposed to increase smoothly. So it suggests a >> problem in the fft. >> >> This could possibly be a few things I can think of: >> >> * incorrect "tuning" values (not so much the values in fft-tuning.h as >> the way the functions pick appropriate n, w parameters, etc) >> >> * I am not doing the truncation correctly >> >> * there is a big jump overall because of a big jump in timing of >> individual pointwise multiplications >> >> I haven't found the problem yet, but will endeavour to find it soon. >> There is some limited hope that the fft times will improve if I can >> find the problem (and it isn't too hard to fix). >> >> Bill. -- You received this message because you are subscribed to the Google Groups "mpir-devel" group. To post to this group, send email to mpir-devel@googlegroups.com. To unsubscribe from this group, send email to mpir-devel+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/mpir-devel?hl=en.