David clarified the timing in that last slide of his. Apparently it was just a misconfiguration external to MPIR. He reports that the correct time for MPIR is 224s, which is in line with what we might expect from the figures I gave for AMD chips.
Bill. On 26 December 2011 14:44, Bill Hart <goodwillh...@googlemail.com> wrote: > A few days ago, David Harvey gave a really nice talk at Warwick about > some new number theoretic transform code he has been working on. It > uses some nice tricks to get *really* fast butterflies. Even a C > implementation turned out to be quite fast, although he is also > working on some assembly optimised versions. Two really nice features > are very low memory usage and ability to easily parallelise it (which > he has done). > > The slides are available here: > > http://wiki.sagemath.org/SageFlintDays/slides?action=AttachFile&do=view&target=fastntt.pdf > > On the final slide you will find some timings of his new code versus > mpir-2.4.0 and gmp-5.0.2. On that machine (an Intel Xeon) it appears > that GMP is way faster than MPIR (and his code is slightly faster > again, and much faster when parallelised). > > Anyhow, I decided to do all my timings for gmp-5.0.2 to see how it > compares with mpir-2.4.0 and my new fft on my machine (a K10-2). I > don't have a copy of David's code, so I can't compare with that at > present. > > bits iters mpir new_fft gmp > 195840 1000 1.149s 1.105s 0.997s > 261120 1000 1.483s 1.415s 1.396s > 391296 100 0.261s 0.248s 0.282s > 521728 100 0.344s 0.315s 0.411s > 782592 100 0.577s 0.539s 0.628s > 1043456 100 0.706s 0.688s 0.848s > 1569024 100 1.229s 1.153s 1.317s > 2092032 100 1.543s 1.462s 2.765s > 3127296 10 0.283s 0.286s 0.408s > 4169728 10 0.357s 0.350s 0.543s > 6273024 10 0.621s 0.615s 0.843s > 8364032 10 0.831s 0.799s 1.156s > 12539904 10 1.441s 1.471s 1.798s > 16719872 1 0.230s 0.209s 0.288s > 25122816 1 0.379s 0.357s 0.434s > 33497088 1 0.524s 0.462s 0.646s > 50245632 1 0.833s 0.782s 1.035s > 66994176 1 1.596s 0.967s 1.358s > 100577280 1 1.906s 1.704s 2.177s > 134103040 1 2.784s 2.162s 2.984s > 201129984 1 3.971s 3.524s 4.536s > 268173312 1 5.146s 4.572s 5.781s > 402456576 1 7.548s 7.041s 9.867s > 536608768 1 9.841s 9.428s 12.71s > 804913152 1 15.48s 14.18s 20.06s > 1073217536 1 21.17s 19.33s 27.19s > 1610219520 1 31.64s 30.83s 43.37s > 2146959360 1 43.25s 39.17s 57.66s > 3220340736 1 70.14s 61.39s 92.94s > 4293787648 1 96.00s 80.15s 146.1s > 6441566208 1 150.2s 136.9s 217.5s > 8588754944 1 208.4s 183.3s 312.8s > 12883132416 1 327.4s 290.2s 447.7s > 17177509888 1 485.0s 382.2s 614.2s > > I've checked multiple ways that I am really compiling against the > right GMP. It also appears that GMP is configuring correctly on my > machine. So it looks very much like either MPIR misconfigured on > David's machine, or the GMP FFT is much, much better with Intel > machines or the MPIR FFT is way, way worse with Intel machines. > > I'll try to get some configuration details from David and eventually > I'll do some timings on an Intel machine to see how everything > compares. > > Last night I tried combining pointwise multiplications with the > inverse transform to improve memory locality. However, this failed > because calling mpir's Nussbaumer convolution rapidly in succession > results in an "Unable to allocate memory" error. I have no idea what > causes this (it is certainly not out of memory). The code was > exhibiting some very odd behaviour which I was unable to account for. > My best guess is that it is to do with too many allocations on the > stack or something. I am doubtlessly calling the function in a way it > was not intended to be called. Anyhow, this just underscores the need > to write a fast Nussbaumer convolution, which I currently do not have > time to do (I don't need it for my application). > > Bill. -- You received this message because you are subscribed to the Google Groups "mpir-devel" group. To post to this group, send email to mpir-devel@googlegroups.com. To unsubscribe from this group, send email to mpir-devel+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/mpir-devel?hl=en.