I've been trying to account for each and every cycle in the asm routines, and I came across this oddity:
Let mpn_fn be unrolled X-way, then run ./speed -cD -s X-X*20 -t X mpn_fn and you see the per-loop time differences you expect, except at size 9*X, where there is an extra 12-14 cycles (a branch misprediction).

For example, for my mpn_add_n, which is 4-way unrolled:

~/gmpextra-1.0.2/tune# ./speed -cD -s 4-80 -t 4 mpn_add_n
overhead 6.04 cycles, precision 10000 units of 5.53e-10 secs, CPU freq 1808.23 MHz
        mpn_add_n
4       (16.11)
8          5.04
12         5.06
16         6.05
20         6.01
24         6.04
28         6.12
32         6.06
36        19.97
40         9.18
44         5.98
48         6.08
52         6.07
56         6.01
60         6.04
64         6.04
68         6.05
72         6.04
76         6.05
80         6.47

This is quite significant, as it amounts to a 21% slowdown at 36 limbs, and even at 100 limbs we are still losing 10% of the speed. It seems to be a general problem, as it affects every mpn function I've tested.

The problem appears to be that whenever a loop has a trip count >= 9, the CPU always mispredicts the loop branch once. It's possible that this is a problem only with my exact CPU (stepping); see below.

/gmp-4.2.4/tune# cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 44
model name      : AMD Sempron(tm) Processor 3000+
stepping        : 2
cpu MHz         : 1808.227
cache size      : 128 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow up rep_good pni lahf_lm
bogomips        : 3620.38
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc

The only fix I can think of is to unroll the function further, pushing the problem to larger sizes where the percentage slowdown is smaller.

--
You received this message because you are subscribed to the Google Groups "mpir-devel" group.