On Wednesday 22 December 2010 10:15:39 Cactus wrote:
> On Dec 22, 9:08 am, Jason <ja...@njkfrudils.plus.com> wrote:
> > Hi
> >
> > In trunk there is a new mpn_mul_2 for the nehalem/westmere , the old one
> > ran at (a measured) 7.59c/l and the new one at 6.84c/l , about 10%
> > speed-up , the optimal would be 6.0c/l (bound by add latency) this would
> > give a measured 5.87c/l . I'm going to try adding a cpuid serializing
> > instruction in our timing code to see if we can get proper timing for
> > the nehalem. Note: This new function is VERY sensitive to the exact
> > feed-in and wind-down code , it's a right old PITA . If only I could put
> > the pipelines in a known state at the start of the function , or time it
> > with the exact feed-in code.
>
> Hi Jason,
>
> I have added it to the nehalem x64 builds on Windows.
>
> Of course, the feed in/out code is different so its quite possible
> that this will interfere with the optimisation.
>
> Brian

It seems we already use cpuid to serialize; however, turning off turbo boost in the BIOS solves it.

With turbo boost on:

    ./speed -c -s 1000 mpn_add_n
    overhead 6.00 cycles, precision 1000000 units of 3.75e-10 secs, CPU freq 2664.58 MHz
            mpn_add_n
    1000      1933.00

and with turbo boost turned off:

    ./speed -c -s 1000 mpn_add_n
    overhead 6.00 cycles, precision 1000000 units of 3.75e-10 secs, CPU freq 2664.58 MHz
            mpn_add_n
    1000      2030.00

Clearly rdtsc counts the base clock, and if one core is boosted rdtsc still counts the base clock, giving impossible answers. I think I'll leave my BIOS with turbo boost switched off; accurate answers are far more important than a 5% speedup.

Jason
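
As a rough check on the figures above, here is a minimal Python sketch (not part of MPIR; the function names are made up for illustration). It converts the two ./speed totals to cycles per limb and derives the clock ratio they imply. It assumes mpn_add_n does identical work in both runs, so that the only difference is how fast the core runs relative to the rate at which rdtsc ticks.

```python
# Totals copied from the two ./speed runs above, both for 1000 limbs.
LIMBS = 1000
TICKS_TURBO_ON = 1933.00   # rdtsc ticks reported with turbo boost enabled
TICKS_TURBO_OFF = 2030.00  # rdtsc ticks reported with turbo boost disabled


def cycles_per_limb(total_ticks, limbs):
    """Apparent cost per limb, in rdtsc ticks."""
    return total_ticks / limbs


def implied_clock_ratio(ticks_off, ticks_on):
    """Core clock / rdtsc clock during the boosted run.

    Assumption: the same number of real core cycles was spent in both
    runs, so fewer rdtsc ticks means the core ran faster than the TSC.
    """
    return ticks_off / ticks_on


cpl_on = cycles_per_limb(TICKS_TURBO_ON, LIMBS)
cpl_off = cycles_per_limb(TICKS_TURBO_OFF, LIMBS)
ratio = implied_clock_ratio(TICKS_TURBO_OFF, TICKS_TURBO_ON)

print(f"turbo on : {cpl_on:.3f} ticks/limb")
print(f"turbo off: {cpl_off:.3f} ticks/limb")
print(f"implied boost over base clock: {(ratio - 1) * 100:.1f}%")
```

The implied boost comes out at about 5%, which matches the speedup given up by switching turbo boost off. In principle a boosted measurement could be rescaled by this ratio, but only if the boost level were known and stayed constant during the run, which is an assumption this sketch cannot verify.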