I was looking at Table 2, where the scale is multiplications per second, i.e. higher is better. But I might be misreading their data.
I'm not sure it is fair to compare a single CPU core with all the cores of a GPU. That might depend on the application. I also wonder whether dedicated parallel code for the CPU might do better.

Bill.

On 15 March 2012 23:30, Dann Corbit <dcor...@connx.com> wrote:
> -----Original Message-----
> From: mpir-devel@googlegroups.com [mailto:mpir-devel@googlegroups.com] On Behalf Of Bill Hart
> Sent: Thursday, March 15, 2012 1:09 PM
> To: mpir-devel@googlegroups.com
> Subject: Re: [mpir-devel]
>
> Now that I am not on my mobile I can access the article and I see that
> they are comparing with GMP (though they don't say what version -- and
> it makes a big difference). [They also reference GMP as being written
> by the "GNU Open Source Community". The GNU people would not be
> thrilled about that. They don't use the phrase "Open Source" but
> "Free Software".]
>
> Anyhow, the GPU only seems to beat the CPUs when the multiplications
> are small, basically below the FFT range. Either way, the GPU is giving
> some speedup, assuming they compared with a recent GMP. So it is
> interesting.
> >>>
> It looks to me like it is faster for the CUDA card than for the CPU
> everywhere (at least in the places that they tested). Consider the
> table on article page 372:
>
> Table 1. CPU single core vs GPU - multiplication time (in milliseconds)
>
> Size in K bits   Core 2 Q6600  Core i7 870   GTX 295   GTX 295   GTX 480
>                                             Old Code  New Code  New Code
> ------------------------------------------------------------------------
> 255 x 255               2.071        1.368     0.813       n/a       n/a
> 383 x 383               3.266        2.154       n/a     0.451     0.156
> 510 x 510               4.649        3.032     2.010       n/a       n/a
> 766 x 766               7.263        4.834       n/a     0.957     0.317
> 1020 x 1020            10.381        6.792     5.418       n/a       n/a
> 1532 x 1532            16.119       10.937       n/a     1.821     0.584
> 2040 x 2040            23.576       15.738    15.389     3.544     1.122
> 4080 x 4080            53.653       35.283    43.954     7.968     2.395
> 8160 x 8160           141.655       80.479   129.358    16.627     4.924
> 16320 x 16320         297.032      186.751   386.083    27.841     9.666
>
> I interpret this to mean (for instance) that my i7 can multiply two
> 383 Kbit numbers in 2.154 ms but my GTX 480 can do the job in 0.156 ms.
> (Ratio: 2.154/0.156 = 13.8 times faster)
> Further, my i7 CPU can multiply two 16,320 Kbit numbers in 186.751 ms
> but the GTX 480 can do it in 9.666 ms.
> (Ratio: 186.751/9.666 = 19.32 times faster)
> I guess that for still larger numbers the benefit will be even greater.
> I do not know if this table takes into account the copy time or not.
> It might be worthwhile to contact the original authors for
> clarification.
>
> --
> You received this message because you are subscribed to the Google Groups "mpir-devel" group.
> To post to this group, send email to mpir-devel@googlegroups.com.
> To unsubscribe from this group, send email to mpir-devel+unsubscr...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/mpir-devel?hl=en.
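
For anyone who wants to check the ratios quoted from Table 1, here is a minimal Python sketch. The timings are copied from the table as quoted in this thread; the names `TABLE_1`, `COLUMNS` and `speedup` are made up for illustration and come from neither the article nor GMP/MPIR. It only divides one column by another, so it says nothing about copy time or which GMP version was used.

```python
# Timings in milliseconds, copied from Table 1 as quoted in this thread.
# Keys are operand sizes in K bits; None marks an "n/a" entry.
COLUMNS = ("Core 2 Q6600", "Core i7 870",
           "GTX 295 old", "GTX 295 new", "GTX 480 new")

TABLE_1 = {
    255:   (2.071,   1.368,   0.813,   None,   None),
    383:   (3.266,   2.154,   None,    0.451,  0.156),
    510:   (4.649,   3.032,   2.010,   None,   None),
    766:   (7.263,   4.834,   None,    0.957,  0.317),
    1020:  (10.381,  6.792,   5.418,   None,   None),
    1532:  (16.119,  10.937,  None,    1.821,  0.584),
    2040:  (23.576,  15.738,  15.389,  3.544,  1.122),
    4080:  (53.653,  35.283,  43.954,  7.968,  2.395),
    8160:  (141.655, 80.479,  129.358, 16.627, 4.924),
    16320: (297.032, 186.751, 386.083, 27.841, 9.666),
}


def speedup(size_kbits, cpu="Core i7 870", gpu="GTX 480 new"):
    """Return cpu_time / gpu_time for one row, or None if either is n/a."""
    row = TABLE_1[size_kbits]
    cpu_ms = row[COLUMNS.index(cpu)]
    gpu_ms = row[COLUMNS.index(gpu)]
    if cpu_ms is None or gpu_ms is None:
        return None
    return cpu_ms / gpu_ms


if __name__ == "__main__":
    # Print the single-core i7 vs GTX 480 ratio for every row that has both.
    for size in sorted(TABLE_1):
        ratio = speedup(size)
        if ratio is not None:
            print(f"{size:>6} K bits: {ratio:.2f}x")
```

Passing other column names, e.g. `speedup(2040, cpu="Core 2 Q6600", gpu="GTX 295 old")`, gives the same comparison for the older hardware. Note that this is still one CPU core against the whole GPU, so the fairness question raised above applies to every number it prints.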