Running the k8/k10 asm code with no changes on the core2 machine sage, we get this:

popcount,hamdist        no popcount instruction

slowdowns
---------
add,sub                 0.50x
rshift1,lshift1         0.70x
k8 lshift,rshift        0.91x
addmul_1,submul_1       0.89x, but faster for <20 limbs...

speedups
--------
and,ior,xor                  1.13x
nand,nior,xnor,andn,iorn     1.50x
com                          2.00x
divebyff                     1.40x, although not better until 12 limbs
diveby3                      2.30x
addadd,addsub                1.50x
sumdiff                      1.26x
addlsh1                      1.50x
sublsh1                      1.40x
k10 lshift,rshift            1.18x
mul_1                        1.04x

For mul basecase we get

./speed -c -r -s 1-40 mpn_jaytest mpn_mul_basecase
overhead 6.12 cycles, precision 10000 units of 3.75e-10 secs, CPU freq 2666.76 MHz
        mpn_jaytest mpn_mul_basecase
1             #9.21       2.3531
2            #21.43       2.0070
3            #56.00       1.5234
4            #91.36       1.4960
5           #136.17       1.4320
6           #195.54       1.3769
7           #261.22       1.3718
8           #336.56       1.3482
9           #419.62       1.3441
10          #527.14       1.3054
11          #634.44       1.3105
12          #744.00       1.3172
13          #873.85       1.2931
14         #1024.55       1.1088
15         #1169.00       1.0873
16         #1328.89       1.0704
17         #1492.50       1.0672
18         #1710.00       1.0317
19         #1880.00       1.0488
20         #2112.00       1.0246
21         #2288.00       1.0385
22         #2547.50       1.0128
23         #2787.50       1.0063
24         #3012.50       1.0108
25         #3212.50       1.0241
26          3556.67      #0.9953
27          3836.67      #0.9939
28          4106.67      #0.9935
29          4370.00      #0.9908
30          4700.00      #0.9617
31          4996.67      #0.9973
32          5380.00      #0.9944
33          5685.00      #0.9727
34          6105.00      #0.9853
35          6375.00      #0.9906
36          6775.00      #0.9764
37          7540.00      #0.9151
38          7515.00      #0.9714
39         #7955.00       1.0578
40         #8750.00       1.0086

This is all with no tweaking, on

cpu family      : 6
model           : 29
model name      : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
stepping        : 1
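
A note on reading the mul basecase table: as far as I understand speed's -r option, the first column is the mpn_jaytest time (in cycles, given -c) and the second column is the mpn_mul_basecase time as a ratio of the first, with # marking the faster routine. A minimal Python sketch, assuming that interpretation, which converts a few sample rows back to absolute cycles; the helper names are mine and purely illustrative:

```python
# Sample rows copied from the table above: (limbs, mpn_jaytest time, ratio).
# Assumption: with -r, the second column is the mpn_mul_basecase time divided
# by the mpn_jaytest time, so basecase time = jaytest time * ratio.
ROWS = [
    (1, 9.21, 2.3531),
    (13, 873.85, 1.2931),
    (14, 1024.55, 1.1088),
    (25, 3212.50, 1.0241),
    (26, 3556.67, 0.9953),
    (38, 7515.00, 0.9714),
    (39, 7955.00, 1.0578),
]


def basecase_time(jaytest_time, ratio):
    """Recover the mpn_mul_basecase time from the ratio column."""
    return jaytest_time * ratio


def faster_routine(ratio):
    """A ratio above 1 means mpn_mul_basecase took longer than mpn_jaytest."""
    return "mpn_jaytest" if ratio > 1.0 else "mpn_mul_basecase"


if __name__ == "__main__":
    print("limbs    jaytest   basecase  faster")
    for limbs, t, r in ROWS:
        print(f"{limbs:5d}  {t:9.2f}  {basecase_time(t, r):9.2f}  {faster_routine(r)}")
```

Read this way, the k8/k10 code is ahead up to 25 limbs, the existing basecase is marginally ahead from 26 to 38 limbs, and the two are within a few percent of each other from about 18 limbs onward.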