After a lot of work I have managed to remove the performance problem with the new division code on penryn.
Two important facts about core2/penryn are that it is always better to save muls and always better to save memory reads/writes where possible, since both take a long time on that architecture.

So I now have a version of the code which performs well on both Intel and AMD. Unfortunately the difference in the basecase range is much less pronounced on AMD, being only up to about 20% faster, with an average of more like 10%. However, the performance in the divide-and-conquer range has improved by 3-4% and we now beat GMP by 25% at certain points.

I still need to tune a couple of crossovers, but the new division code shouldn't need much in the way of changes now.

Bill.

On 18 February 2014 19:35, Bill Hart <goodwillh...@googlemail.com> wrote:

> Ah, the problem with mpn_sqr went away when I rebuilt everything from the
> latest trunk. I think I was missing some recent patches to the squaring
> code.
>
> So that leaves 1-5 as the major performance issues I'd like to deal with
> in this and the next release.
>
> Bill.
>
>
> On 18 February 2014 19:06, Bill Hart <goodwillh...@googlemail.com> wrote:
>
>> I ran mpir_bench_two on Penryn and K10. On the latter we seem to do
>> better, so I will focus on the former.
>>
>> I see four areas where we need some improvement:
>>
>> 1) Very unbalanced multiplication where one of the operands is in the fft
>> region (Fredrik's patch probably didn't go far enough).
>>
>> 2) Asymptotically fast division (in the fft range). We are about a factor
>> of 2 slower than GMP.
>>
>> 3) Our extended gcd code seems to be slower than GMP's (I thought we used
>> the same code nowadays).
>>
>> 4) Our fac_ui code is incredibly slow.
>>
>> 5) Division by a 64 bit number or 128 bit number (i.e. divrem1/2 with
>> full number of bits in divisor).
>>
>> I think 2 and possibly 5 have to wait for another release. But maybe 1, 3
>> and 4 are easy enough to fix.
>>
>> Also, for some odd reason, even when speed shows mpn_sqr to be faster in
>> MPIR than GMP, mpir_bench shows it the other way around, which is a mystery
>> to me, other than that there may be some performance issue in the mpz code.
>>
>> Bill.
>>