On Wednesday 25 February 2009 20:00:19 Bill Hart wrote:
> I had a play with the karatsuba function today. I figured that one
> might be able to cut out some overhead by having it complete the last
> two iterations of the recursion in one go. I wrote some code which
> works for the last two iterations when n/2, at that point, is
> divisible by 4. But after fiddling for quite some time to try and find
> a way to make the existing code faster with this as a special case, I
> have not seen any speedups. Anyhow, here is the code, in case someone
> else had some ideas. It obviously only works on systems with
> mpn_addadd_n and mpn_addsub_n.

Consider a new assembler function

    carry = mpn_muladd_basecase(rp, sp1, n1, sp2, n2)

which is functionally equivalent to

    mpn_mul_basecase(tp, sp1, n1, sp2, n2);
    carry = mpn_add_n(rp, rp, tp, n1 + n2)

It would be implemented exactly like mpn_mul_basecase, but with the first row an addmul_1 rather than a mul_1, and an extra carry to be taken from row to row. This is easy to implement at maximum efficiency, and it runs at the same speed as a mul_basecase but with a double-sized add thrown in for free.

Clearly we can do the same with sqr_basecase.

Using it in the current karatsuba code would conflict with the use of addadd/addsub, but I suppose/hope we can come up with a variant where we can use both. You would only use it at the lowest level of recursion.

An extension of this would be the obvious mpn_muladd_kara, which we would use for the "middle mul", as you only need it once. It may be helpful to play around with the different karatsuba formulas to find a better match.

Jason
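
For anyone who wants to check the semantics before writing assembler, here is a rough portable C model of the muladd_basecase idea described above. It is only a sketch under assumptions: limb_t (a 32-bit limb), ref_addmul_1 and ref_muladd_basecase are hypothetical names made up for illustration, not existing MPIR functions, and it assumes n1 >= 1, n2 >= 1 and that rp has n1 + n2 limbs and does not overlap the sources. Every row is an addmul_1, and a single carry is handed from one row to the next.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit limb so that a 64-bit integer can hold limb*limb.
   Real MPIR code would use mp_limb_t and assembler instead. */
typedef uint32_t limb_t;

/* rp[0..n) += sp[0..n) * v; returns the high limb that falls off the end
   (same contract as mpn_addmul_1). */
static limb_t ref_addmul_1(limb_t *rp, const limb_t *sp, size_t n, limb_t v)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* (B-1)^2 + 2(B-1) = B^2 - 1, so this cannot overflow 64 bits */
        uint64_t t = (uint64_t)sp[i] * v + rp[i] + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 32);
    }
    return carry;
}

/* rp[0..n1+n2) += sp1[0..n1) * sp2[0..n2); returns the carry out (0 or 1).
   Assumes n1 >= 1 and n2 >= 1. */
static limb_t ref_muladd_basecase(limb_t *rp, const limb_t *sp1, size_t n1,
                                  const limb_t *sp2, size_t n2)
{
    limb_t carry = 0; /* the extra carry taken from row to row */
    for (size_t j = 0; j < n2; j++) {
        /* row j is an addmul_1, including the first row (j == 0) */
        uint64_t top = ref_addmul_1(rp + j, sp1, n1, sp2[j]);
        /* rp[j + n1] still holds the original addend limb; add the row's
           high limb and the carry left over from the previous row */
        top += (uint64_t)rp[j + n1] + carry;
        rp[j + n1] = (limb_t)top;
        carry = (limb_t)(top >> 32);
    }
    return carry;
}
```

The result should match mpn_mul_basecase into a temporary followed by mpn_add_n over n1 + n2 limbs, which gives a simple way to test a real assembler version against this model.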