On Tuesday 23 December 2008 22:52:10 Cactus wrote:
> On Dec 22, 11:55 pm, jason <ja...@njkfrudils.plus.com> wrote:
> > On Dec 20, 1:13 pm, Cactus <rieman...@googlemail.com> wrote:
> > > On Dec 20, 10:49 am, Cactus <rieman...@googlemail.com> wrote:
> > > > On Dec 20, 3:56 am, "Bill Hart" <goodwillh...@googlemail.com> wrote:
> > >
> > > Following up my earlier results, I have now played with alignment and
> > > jump decisions and I find that:
> > >
> > > jc .1
> > > jmp .2
> > >
> > > align 16
> > > .1:mov rax, [r10+r8*8]
> > >
> > > in which there is a jump to aligned code (rather than falling through
> > > and hence executing the padding code) gives significantly better
> > > results:
> > >
> > > Jason's Code (mp_add_n and mp_sub_n):
> > > Jason's Code (mp_addmul_n and mp_submul_n):
> > > Jason's Code (mp_mul_1):
> > >
> > > Running benchmarks
> > > Category base
> > > Program multiply
> > > multiply 128 128
> > > MPIRbench.base.multiply.128.128 result: 26701842
> > > multiply 512 512
> > > MPIRbench.base.multiply.512.512 result: 6455010
> > > multiply 8192 8192
> > > MPIRbench.base.multiply.8192.8192 result: 61537
> > > multiply 131072 131072
> > > MPIRbench.base.multiply.131072.131072 result: 938
> > > multiply 2097152 2097152
> > > MPIRbench.base.multiply.2097152.2097152 result: 23.0
> > > MPIRbench.base.multiply result: 46978.70
> > > Program divide
> > > divide 8192 32
> > > MPIRbench.base.divide.8192.32 result: 677900
> > > divide 8192 64
> > > MPIRbench.base.divide.8192.64 result: 689331
> > > divide 8192 128
> > > MPIRbench.base.divide.8192.128 result: 269308
> > > divide 8192 4096
> > > MPIRbench.base.divide.8192.4096 result: 116612
> > > divide 8192 8064
> > > MPIRbench.base.divide.8192.8064 result: 1027764
> > > divide 131072 8192
> > > MPIRbench.base.divide.131072.8192 result: 2667
> > > divide 131072 65536
> > > MPIRbench.base.divide.131072.65536 result: 1249
> > > divide 8388608 4194304
> > > MPIRbench.base.divide.8388608.4194304 result: 2.56
> > > MPIRbench.base.divide result: 24471.64
> > > MPIRbench.base result 33906.43
> > > Category app
> > > Program rsa
> > > rsa 512
> > > MPIRbench.app.rsa.512 result: 14055
> > > rsa 1024
> > > MPIRbench.app.rsa.1024 result: 2735
> > > rsa 2048
> > > MPIRbench.app.rsa.2048 result: 498
> > > MPIRbench.app.rsa result: 2675.09
> > > MPIRbench.app result 2675.09
> > > MPIRbench result: 9523.81
> > >
> > > This is about 8% faster than my original Windows code.
> > >
> > > Well done Jason!
> > >
> > > Brian
> >
> > I've put the mpn_mul_basecase in the mpir development branch, ready
> > for conversion to windows. http://www.digitalmischief.co.uk/fruitbowl/ is
> > the latest with a new mpn_sqr_basecase and mpn_redc_basecase, which
> > overall gives me a 60% (which by co-incidence is the same ratio as
> > 4/2.5 the addmul ratio's!!!) improvement over gmp-4.2.4, they are very
> > much still cut&paste, so expect a few more % in time. I'm going to try a
> > division_basecase and a mullow and mulhigh basecase next, there is
> > also a addmul loop in bdivmod.c which does something, and may be
> > worth doing.
>
> Hi Jason,
>
> Thanks for the mpn_mul_basecase code.
>
> I have converted this to Windows and it is slower than my old code -
> the mpirbench score with the new code is 9350 whereas the current code
> is 9550, which is a 2% performance loss. Only the mpn_mul_basecase
> code is different - I have kept your other routines in place in making
> this comparison.
>

Odd! Did you run tune? I assume your old code is the Gaudry code. It
doesn't even sound like it's running!

> In this case there is about the same prologue/epilogue overhead in
> both versions so it will be interesting to see how it compares on
> Linux.
>
> Brian
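
For anyone wanting to check the percentages quoted in this thread, here is a rough sketch in Python. The helper name `pct_change` is made up for illustration, and the inputs are just the rounded figures quoted above (9550 and 9350 for the mpirbench scores, 4 and 2.5 for the addmul ratio), so treat the results as approximate.

```python
def pct_change(old, new):
    """Relative change from old to new, in percent (hypothetical helper)."""
    return (new - old) / old * 100.0

# mpirbench scores quoted by Brian: current code vs. the new mpn_mul_basecase
score_change = pct_change(9550, 9350)

# Ratio quoted by Jason as 4/2.5; only the ratio matters here, not the units
ratio_gain = pct_change(2.5, 4.0)

print(f"mpirbench score change: {score_change:.1f}%")
print(f"gain implied by 4/2.5:  {ratio_gain:+.0f}%")
```

With these inputs the score change comes out at roughly a 2% drop and the 4/2.5 ratio at a 60% gain, which agrees with the numbers both posters gave.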