After a lot of work I have managed to remove the performance problem with
the new division code on Penryn.

Two important facts about Core 2/Penryn are that it is always better to save
muls and always better to save memory reads/writes where possible, since
both take a long time on that architecture.

So I now have a version of the code which performs well on Intel and AMD.
Unfortunately the difference in the basecase range is much less pronounced
on AMD, being only up to about 20% faster, with an average of more like
10%. However, the performance in the divide-and-conquer range has improved
by 3-4% and we now beat GMP by 25% at certain points.

I still need to tune a couple of crossovers, but the new division code
shouldn't need much in the way of changes now.

Bill.


On 18 February 2014 19:35, Bill Hart <goodwillh...@googlemail.com> wrote:

> Ah, the problem with mpn_sqr went away when I rebuilt everything from the
> latest trunk. I think I was missing some recent patches to the squaring
> code.
>
> So that leaves 1-5 as the major performance issues I'd like to deal with
> in this and the next release.
>
> Bill.
>
>
> On 18 February 2014 19:06, Bill Hart <goodwillh...@googlemail.com> wrote:
>
>> I ran mpir_bench_two on Penryn and K10. On the latter we seem to do
>> better, so I will focus on the former.
>>
>> I see five areas where we need some improvement:
>>
>> 1) Very unbalanced multiplication where one of the operands is in the fft
>> region (Fredrik's patch probably didn't go far enough).
>>
>> 2) Asymptotically fast division (in the fft range). We are about a factor
>> of 2 slower than GMP.
>>
>> 3) Our extended gcd code seems to be slower than GMP's (I thought we used
>> the same code nowadays).
>>
>> 4) Our fac_ui code is incredibly slow.
>>
>> 5) Division by a 64-bit or 128-bit number (i.e. divrem1/2 with the
>> full number of bits in the divisor).
>>
>> I think 2 and possibly 5 have to wait for another release. But maybe 1, 3
>> and 4 are easy enough to fix.
>>
>> Also, for some odd reason, even when speed shows mpn_sqr to be faster in
>> MPIR than GMP, mpir_bench shows it the other way around, which is a mystery
>> to me, other than that there may be some performance issue in the mpz code.
>>
>> Bill.
>>
>>
>
