On Wed, Oct 23, 2013 at 4:17 PM, Bill Hart <goodwillh...@googlemail.com> wrote:
> I have made a number of speedups to the precomputed inverse code I announced
> earlier, and removed two unneeded functions. It is now up to 20% faster
> than ordinary 2n x n division for n = 1 limb and from n = 3-15 limbs and up
> to 2.2 times as fast above n = 120 limbs.

So it's slower for n = 2 limbs? Could you hardcode this case?
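
For readers following along, here is a rough sketch of the general idea behind
division with a precomputed inverse. It uses Python big integers rather than
limb arrays, and it is a plain Barrett-style reduction, not the code Bill
announced. The names precompute_inverse and divrem_preinv and the 64-bit limb
size are assumptions made for illustration only.

```python
LIMB_BITS = 64  # assumed limb size, for illustration only


def precompute_inverse(d, n):
    """Return floor(B^(2n) / d) with B = 2^LIMB_BITS (hypothetical helper)."""
    return (1 << (2 * n * LIMB_BITS)) // d


def divrem_preinv(a, d, dinv, n):
    """Divide a (< B^(2n)) by d using the precomputed dinv.

    The quotient estimate only needs the high part of a * dinv, which is
    where a mulhigh routine would be used in limb-level code.
    """
    shift = 2 * n * LIMB_BITS
    assert 0 <= a < (1 << shift)
    q = (a * dinv) >> shift   # estimate, never larger than the true quotient
    r = a - q * d             # only the low part of q * d matters here
    while r >= d:             # small correction step
        q += 1
        r -= d
    return q, r
```

The inverse is computed once per divisor and reused, so each division costs
two multiplications plus a small correction instead of a full division.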

> There's nothing more I can do for it now. The gaps that remain are essential
> gaps, due to the raw speed of the division code in the mpir trunk (the code
> in mpir-2.6.0 is substantially slower).

Seems like a pretty good improvement, even if there is a gap!

> Unfortunately, in those gaps, the ordinary division code is up to 20% faster
> than using a precomputed inverse. We could eventually close the gaps by
> rewriting mullow and mulhigh in mpir, but this is a lot of work, including
> lots of assembly code and much careful thought about algorithms.

Assembly optimised basecase mullow and mulhigh would be very useful
for many other things too. Few people have the skills to do this,
though, and you are probably right to prioritise other things...
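
To make concrete what a basecase mullow computes, here is a minimal schoolbook
sketch in Python. It is not mpir's implementation: the function names, the
least-significant-limb-first list layout, and the 64-bit limb size are all
assumptions for illustration.

```python
LIMB_BITS = 64              # assumed limb size, for illustration only
MASK = (1 << LIMB_BITS) - 1


def limbs_to_int(limbs):
    """Convert a least-significant-first limb list to a Python int."""
    return sum(x << (LIMB_BITS * i) for i, x in enumerate(limbs))


def mullow_basecase(a, b):
    """Return the low n limbs of a * b for two n-limb operands."""
    n = len(a)
    assert len(b) == n
    out = [0] * n
    for i in range(n):
        carry = 0
        # Only partial products landing in limbs 0..n-1 are computed.
        for j in range(n - i):
            t = out[i + j] + a[i] * b[j] + carry
            out[i + j] = t & MASK
            carry = t >> LIMB_BITS
        # Any carry out of limb n-1 belongs to the high half and is dropped.
    return out
```

This does roughly half the limb multiplications of a full n x n schoolbook
product, which is the saving an assembly version would aim to realise.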

> As it is, it is possible to switch on mpir's mullow code for one of the
> multiplications and it slows it down, again due to the raw speed of mpir's
> ordinary multiplication code.

Fredrik
