Hi, now that I have more accurate timings, here are the real changes made from mpir-2.2 to the upcoming mpir-2.3:

popcount  1310 to 1066, i.e. 1.25c/l at 4-way to 1.0c/l at 6-way
hamdist   2036 to 2040, i.e. 2.0c/l at 4-way to 2.0c/l at 2-way
mul_1     3779 to 3610, i.e. 3.75c/l at 4-way to 3.563c/l at 3-way
mul_2     7961 to 7172, i.e. 7.9c/l at 3-way to 7.1c/l at 3-way

The popcount and hamdist are as before, but mul_1 and mul_2 are showing some bit rot. In light of the better timings I'll give them another go.

Jason

On Dec 22, 5:57 pm, Jason <ja...@njkfrudils.plus.com> wrote:
> On Wednesday 22 December 2010 10:15:39 Cactus wrote:
> > On Dec 22, 9:08 am, Jason <ja...@njkfrudils.plus.com> wrote:
> > > Hi
> > >
> > > In trunk there is a new mpn_mul_2 for the nehalem/westmere. The old one
> > > ran at (a measured) 7.59c/l and the new one at 6.84c/l, about a 10%
> > > speed-up. The optimal would be 6.0c/l (bound by add latency); this would
> > > give a measured 5.87c/l. I'm going to try adding a cpuid serializing
> > > instruction in our timing code to see if we can get proper timing for
> > > the nehalem. Note: this new function is VERY sensitive to the exact
> > > feed-in and wind-down code, it's a right old PITA. If only I could put
> > > the pipelines in a known state at the start of the function, or time it
> > > with the exact feed-in code.
> >
> > Hi Jason,
> >
> > I have added it to the nehalem x64 builds on Windows.
> >
> > Of course, the feed in/out code is different so it's quite possible
> > that this will interfere with the optimisation.
> >
> > Brian
>
> It seems we already use cpuid to serialize; however, turning off turbo-boost
> in the BIOS solves it.
>
> with turbo boost
>
> ./speed -c -s 1000 mpn_add_n
> overhead 6.00 cycles, precision 1000000 units of 3.75e-10 secs, CPU freq
> 2664.58 MHz
>         mpn_add_n
> 1000      1933.00
>
> and with turbo-boost turned off
>
> ./speed -c -s 1000 mpn_add_n
> overhead 6.00 cycles, precision 1000000 units of 3.75e-10 secs, CPU freq
> 2664.58 MHz
>         mpn_add_n
> 1000      2030.00
>
> Clearly rdtsc counts the base clock, and if one core is boosted rdtsc still
> counts the base clock, giving impossible answers. I think I'll leave my
> BIOS with turbo-boost switched off; accurate answers are far more important
> than a 5% speedup.
>
> Jason
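
For reference, the arithmetic behind the figures in this thread can be sketched in a few lines of Python. This is only an illustration. It assumes the four function timings were taken at 1000 limbs (as the mpn_add_n runs were, via -s 1000), and the helper names cycles_per_limb and speedup_percent are made up here; they are not part of MPIR or its speed program.

```python
# Cycle counts copied from the thread. ASSUMPTION: each was measured over
# 1000 limbs, matching the "-s 1000" used for the mpn_add_n runs.
LIMBS = 1000

# function name -> (mpir-2.2 cycles, mpir-2.3 cycles)
timings = {
    "popcount": (1310, 1066),
    "hamdist": (2036, 2040),
    "mul_1": (3779, 3610),
    "mul_2": (7961, 7172),
}


def cycles_per_limb(cycles, limbs=LIMBS):
    """Measured cycles per limb (c/l), including loop overhead."""
    return cycles / limbs


def speedup_percent(old, new):
    """Percentage reduction in cycle count going from old to new."""
    return 100.0 * (old - new) / old


for name, (old, new) in timings.items():
    print(
        f"{name:9s} {cycles_per_limb(old):.3f} -> {cycles_per_limb(new):.3f} c/l "
        f"({speedup_percent(old, new):+.1f}%)"
    )

# mpn_add_n at 1000 limbs, as reported by ./speed with turbo-boost on and off.
# rdtsc ticks at the base clock, so a boosted core does the same work in
# fewer counted ticks; the ratio estimates the effective clock boost.
turbo_on_cycles = 1933.0
turbo_off_cycles = 2030.0
turbo_ratio = turbo_off_cycles / turbo_on_cycles
print(f"apparent turbo-boost factor: {turbo_ratio:.3f}")
```

The measured c/l values come out slightly above the nominal loop speeds quoted in the post because they include feed-in and wind-down overhead, and the turbo ratio lands close to the 5% mentioned above.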