In which case, could you send me the output of

./configure
make
cd tune
make tune

with a fresh checkout. Thanks.

Bill.

On 24 March 2014 22:07, Bill Hart <goodwillh...@googlemail.com> wrote:

> I don't think it matters much if we leave sqr_basecase.as in or not.
> There's no clear winner there. My inclination is to leave it in because it
> is faster for very small and very large basecase squarings.
>
> Bill.
>
>
> On 24 March 2014 21:57, Frithjof Schulze <sfrith...@gmail.com> wrote:
>
>>
>>
>> On Monday, March 24, 2014 11:12:34 AM UTC, leif wrote:
>>>
>>> Bill Hart wrote:
>>> > Leif said he was going to tune on a bobcat, but hasn't yet.
>>>
>>> Well, this of course^TM turned out to be a can of worms... ;-)
>>>
>>> So far it looks like the bobcat version of mpn_sqr_basecase was actually
>>> faster, but I don't really trust the figures. (I played a little with
>>> the "precision" option, but this seems to be logically limited to 2^31.
>>> With the default precision, I occasionally get non-monotonic numbers;
>>> not that one value was exceptionally bad -- as one would expect, but
>>> timings about twice as fast, despite both cores being on "performance",
>>> and the machine mostly idle.)
>>>
>>
>> If you want to compare, my values for ./speed -s 1-40 mpn_sqr_basecase
>> are
>>
>> with sqr...as -- without
>>  1 0.000000006 0.000000009
>>  2 0.000000012 0.000000017
>>  3 0.000000031 0.000000036
>>  4 0.000000056 0.000000054
>>  5 0.000000079 0.000000069
>>  6 0.000000100 0.000000094
>>  7 0.000000128 0.000000128
>>  8 0.000000159 0.000000151
>>  9 0.000000191 0.000000185
>> 10 0.000000228 0.000000223
>> 11 0.000000270 0.000000281
>> 12 0.000000318 0.000000320
>> 13 0.000000368 0.000000376
>> 14 0.000000418 0.000000419
>> 15 0.000000465 0.000000496
>> 16 0.000000527 0.000000539
>> 17 0.000000580 0.000000616
>> 18 0.000000646 0.000000668
>> 19 0.000000710 0.000000763
>> 20 0.000000788 0.000000822
>> 21 0.000000868 0.000000915
>> 22 0.000000935 0.000001000
>> 23 0.000001021 0.000001098
>> 24 0.000001107 0.000001170
>> 25 0.000001191 0.000001274
>> 26 0.000001286 0.000001372
>> 27 0.000001379 0.000001492
>> 28 0.000001496 0.000001577
>> 29 0.000001597 0.000001700
>> 30 0.000001704 0.000001802
>> 31 0.000001810 0.000001942
>> 32 0.000001927 0.000002038
>> 33 0.000002048 0.000002187
>> 34 0.000002167 0.000002299
>> 35 0.000002283 0.000002461
>> 36 0.000002409 0.000002577
>> 37 0.000002537 0.000002724
>> 38 0.000002672 0.000002864
>> 39 0.000002805 0.000003065
>> 40 0.000002959 0.000003180
>>
>> This is with
>>
>> gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
>> Linux 3.8.0-36-generic #52~precise1-Ubuntu SMP Mon Feb 3 21:54:46 UTC
>> 2014 x86_64
>>
>> ~/src/mpir/tune > cat /proc/cpuinfo
>> processor : 0
>> vendor_id : AuthenticAMD
>> cpu family : 20
>> model : 1
>> model name : AMD E-350 Processor
>> stepping : 0
>> microcode : 0x5000029
>> cpu MHz : 1600.000
>> cache size : 512 KB
>> physical id : 0
>> siblings : 2
>> core id : 0
>> cpu cores : 2
>> apicid : 0
>> initial apicid : 0
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 6
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>> cmov pat pse36 clflush mmx fxsr sse sse2 ht
>> syscall nx mmxext fxsr_opt
>> pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid
>> aperfmperf pni monitor ssse3 cx16 popcnt lahf_lm cmp_legacy svm extapic
>> cr8_legacy abm sse4a misalignsse 3dnowprefetch ibs skinit wdt arat
>> hw_pstate npt lbrv svm_lock nrip_save pausefilter
>> bogomips : 3193.07
>> TLB size : 1024 4K pages
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 36 bits physical, 48 bits virtual
>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>
>> processor : 1
>> vendor_id : AuthenticAMD
>> cpu family : 20
>> [...]
>>
>> -- Frithjof
>>
>> The raw output was:
>>
>> First try with mpn/x86_64/bobcat/sqr_basecase.as
>>
>> overhead 0.000000004 secs, precision 1000000 units of 6.25e-10 secs, CPU
>> freq 1600.00 MHz
>> mpn_sqr_basecase
>>  1 0.000000006
>>  2 0.000000012
>>  3 0.000000031
>>  4 0.000000056
>>  5 0.000000079
>>  6 0.000000100
>>  7 0.000000128
>>  8 0.000000159
>>  9 0.000000191
>> 10 0.000000228
>> 11 0.000000270
>> 12 0.000000318
>> 13 0.000000368
>> 14 0.000000418
>> 15 0.000000465
>> 16 0.000000527
>> 17 0.000000580
>> 18 0.000000646
>> 19 0.000000710
>> 20 0.000000788
>> 21 0.000000868
>> 22 0.000000935
>> 23 0.000001021
>> 24 0.000001107
>> 25 0.000001191
>> 26 0.000001286
>> 27 0.000001379
>> 28 0.000001496
>> 29 0.000001597
>> 30 0.000001704
>> 31 0.000001810
>> 32 0.000001927
>> 33 0.000002048
>> 34 0.000002167
>> 35 0.000002283
>> 36 0.000002409
>> 37 0.000002537
>> 38 0.000002672
>> 39 0.000002805
>> 40 0.000002959
>>
>> Second try with mpn/x86_64/bobcat/sqr_basecase.as removed
>>
>> overhead 0.000000004 secs, precision 1000000 units of 6.25e-10 secs, CPU
>> freq 1600.00 MHz
>> mpn_sqr_basecase
>>  1 0.000000009
>>  2 0.000000017
>>  3 0.000000036
>>  4 0.000000054
>>  5 0.000000069
>>  6 0.000000094
>>  7 0.000000128
>>  8 0.000000151
>>  9 0.000000185
>> 10 0.000000223
>> 11 0.000000281
>> 12 0.000000320
>> 13 0.000000376
>> 14 0.000000419
>> 15 0.000000496
>> 16 0.000000539
>> 17 0.000000616
>> 18 0.000000668
>> 19 0.000000763
>> 20 0.000000822
>> 21 0.000000915
>> 22 0.000001000
>> 23 0.000001098
>> 24 0.000001170
>> 25 0.000001274
>> 26 0.000001372
>> 27 0.000001492
>> 28 0.000001577
>> 29 0.000001700
>> 30 0.000001802
>> 31 0.000001942
>> 32 0.000002038
>> 33 0.000002187
>> 34 0.000002299
>> 35 0.000002461
>> 36 0.000002577
>> 37 0.000002724
>> 38 0.000002864
>> 39 0.000003065
>> 40 0.000003180
>>
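For anyone who wants to see where each variant wins without eyeballing two columns, here is a minimal Python sketch that parses two `./speed` runs and reports the faster one per operand size. It is only an illustration: the names `parse_speed_output`, `compare`, `WITH_ASM` and `WITHOUT_ASM` are hypothetical and not part of MPIR, and the sample data is just sizes 1 to 12 copied from the two runs quoted in this thread. It assumes every data line has the form `<size> <seconds>` and skips everything else (the `overhead ...` and `mpn_sqr_basecase` header lines).

```python
# Sample data: sizes 1-12 from the two runs quoted in this thread.
WITH_ASM = """mpn_sqr_basecase
1 0.000000006
2 0.000000012
3 0.000000031
4 0.000000056
5 0.000000079
6 0.000000100
7 0.000000128
8 0.000000159
9 0.000000191
10 0.000000228
11 0.000000270
12 0.000000318
"""

WITHOUT_ASM = """mpn_sqr_basecase
1 0.000000009
2 0.000000017
3 0.000000036
4 0.000000054
5 0.000000069
6 0.000000094
7 0.000000128
8 0.000000151
9 0.000000185
10 0.000000223
11 0.000000281
12 0.000000320
"""


def parse_speed_output(text):
    """Return {size: seconds} for lines of the form '<size> <seconds>'.

    Any line that does not consist of exactly one integer followed by
    one float (for example the header lines) is ignored.
    """
    timings = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        try:
            size, seconds = int(parts[0]), float(parts[1])
        except ValueError:
            continue
        timings[size] = seconds
    return timings


def compare(with_asm, without_asm):
    """Return (size, without/with ratio, winner) for sizes present in both runs."""
    rows = []
    for size in sorted(set(with_asm) & set(without_asm)):
        a, b = with_asm[size], without_asm[size]
        if a < b:
            winner = "with"
        elif a > b:
            winner = "without"
        else:
            winner = "tie"
        rows.append((size, b / a, winner))
    return rows


rows = compare(parse_speed_output(WITH_ASM), parse_speed_output(WITHOUT_ASM))
for size, ratio, winner in rows:
    # A ratio above 1 means the build with sqr_basecase.as was faster.
    print(f"{size:3d}  without/with = {ratio:.3f}  faster: {winner}")
```

A ratio is used rather than a difference so that small and large sizes can be compared on the same scale. Given the noise leif describes, single-run figures like these should be treated as indicative only; pasting in the full 1-40 output from several runs would give a more trustworthy picture.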