In which case, could you send me the output of

./configure
make
cd tune
make tune

with a fresh checkout. Thanks.

Bill.


On 24 March 2014 22:07, Bill Hart <goodwillh...@googlemail.com> wrote:

> I don't think it matters much if we leave sqr_basecase.as in or not.
> There's no clear winner there. My inclination is to leave it in because it
> is faster for very small and very large basecase squarings.
>
> Bill.
>
>
> On 24 March 2014 21:57, Frithjof Schulze <sfrith...@gmail.com> wrote:
>
>>
>>
>> On Monday, March 24, 2014 11:12:34 AM UTC, leif wrote:
>>>
>>> Bill Hart wrote:
>>> > Leif said he was going to tune on a bobcat, but hasn't yet.
>>>
>>> Well, this of course^TM turned out to be a can of worms... ;-)
>>>
>>> So far it looks like the bobcat version of mpn_sqr_basecase was actually
>>> faster, but I don't really trust the figures.  (I played a little with
>>> the "precision" option, but this seems to be logically limited to 2^31.
>>> With the default precision, I occasionally get non-monotonic numbers:
>>> not one exceptionally bad value, as one would expect, but timings
>>> about twice as fast, despite both cores being on "performance" and
>>> the machine being mostly idle.)
>>>
>>
>>  If you want to compare, my values for ./speed -s 1-40 mpn_sqr_basecase
>> are
>>
>> size  with sqr_basecase.as (secs)  without (secs)
>> 1 0.000000006 0.000000009
>> 2 0.000000012 0.000000017
>> 3 0.000000031 0.000000036
>> 4 0.000000056 0.000000054
>> 5 0.000000079 0.000000069
>> 6 0.000000100 0.000000094
>> 7 0.000000128 0.000000128
>> 8 0.000000159 0.000000151
>> 9 0.000000191 0.000000185
>> 10 0.000000228 0.000000223
>> 11 0.000000270 0.000000281
>> 12 0.000000318 0.000000320
>> 13 0.000000368 0.000000376
>> 14 0.000000418 0.000000419
>> 15 0.000000465 0.000000496
>> 16 0.000000527 0.000000539
>> 17 0.000000580 0.000000616
>> 18 0.000000646 0.000000668
>> 19 0.000000710 0.000000763
>> 20 0.000000788 0.000000822
>> 21 0.000000868 0.000000915
>> 22 0.000000935 0.000001000
>> 23 0.000001021 0.000001098
>> 24 0.000001107 0.000001170
>> 25 0.000001191 0.000001274
>> 26 0.000001286 0.000001372
>> 27 0.000001379 0.000001492
>> 28 0.000001496 0.000001577
>> 29 0.000001597 0.000001700
>> 30 0.000001704 0.000001802
>> 31 0.000001810 0.000001942
>> 32 0.000001927 0.000002038
>> 33 0.000002048 0.000002187
>> 34 0.000002167 0.000002299
>> 35 0.000002283 0.000002461
>> 36 0.000002409 0.000002577
>> 37 0.000002537 0.000002724
>> 38 0.000002672 0.000002864
>> 39 0.000002805 0.000003065
>> 40 0.000002959 0.000003180
>>
>> This is with
>>
>> gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
>> Linux 3.8.0-36-generic #52~precise1-Ubuntu SMP Mon Feb 3 21:54:46 UTC
>> 2014 x86_64
>>
>> ~/src/mpir/tune > cat /proc/cpuinfo
>> processor    : 0
>> vendor_id    : AuthenticAMD
>> cpu family    : 20
>> model        : 1
>> model name    : AMD E-350 Processor
>> stepping    : 0
>> microcode    : 0x5000029
>> cpu MHz        : 1600.000
>> cache size    : 512 KB
>> physical id    : 0
>> siblings    : 2
>> core id        : 0
>> cpu cores    : 2
>> apicid        : 0
>> initial apicid    : 0
>> fpu        : yes
>> fpu_exception    : yes
>> cpuid level    : 6
>> wp        : yes
>> flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>> cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
>> pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid
>> aperfmperf pni monitor ssse3 cx16 popcnt lahf_lm cmp_legacy svm extapic
>> cr8_legacy abm sse4a misalignsse 3dnowprefetch ibs skinit wdt arat
>> hw_pstate npt lbrv svm_lock nrip_save pausefilter
>> bogomips    : 3193.07
>> TLB size    : 1024 4K pages
>> clflush size    : 64
>> cache_alignment    : 64
>> address sizes    : 36 bits physical, 48 bits virtual
>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>
>> processor    : 1
>> vendor_id    : AuthenticAMD
>> cpu family    : 20
>> [...]
>>
>> -- Frithjof
>>
>> The raw output was:
>>
>> First try with mpn/x86_64/bobcat/sqr_basecase.as
>>
>> overhead 0.000000004 secs, precision 1000000 units of 6.25e-10 secs, CPU
>> freq 1600.00 MHz
>>         mpn_sqr_basecase
>> 1         0.000000006
>> 2         0.000000012
>> 3         0.000000031
>> 4         0.000000056
>> 5         0.000000079
>> 6         0.000000100
>> 7         0.000000128
>> 8         0.000000159
>> 9         0.000000191
>> 10        0.000000228
>> 11        0.000000270
>> 12        0.000000318
>> 13        0.000000368
>> 14        0.000000418
>> 15        0.000000465
>> 16        0.000000527
>> 17        0.000000580
>> 18        0.000000646
>> 19        0.000000710
>> 20        0.000000788
>> 21        0.000000868
>> 22        0.000000935
>> 23        0.000001021
>> 24        0.000001107
>> 25        0.000001191
>> 26        0.000001286
>> 27        0.000001379
>> 28        0.000001496
>> 29        0.000001597
>> 30        0.000001704
>> 31        0.000001810
>> 32        0.000001927
>> 33        0.000002048
>> 34        0.000002167
>> 35        0.000002283
>> 36        0.000002409
>> 37        0.000002537
>> 38        0.000002672
>> 39        0.000002805
>> 40        0.000002959
>>
>> Second try with mpn/x86_64/bobcat/sqr_basecase.as removed
>>
>> overhead 0.000000004 secs, precision 1000000 units of 6.25e-10 secs, CPU
>> freq 1600.00 MHz
>>         mpn_sqr_basecase
>> 1         0.000000009
>> 2         0.000000017
>> 3         0.000000036
>> 4         0.000000054
>> 5         0.000000069
>> 6         0.000000094
>> 7         0.000000128
>> 8         0.000000151
>> 9         0.000000185
>> 10        0.000000223
>> 11        0.000000281
>> 12        0.000000320
>> 13        0.000000376
>> 14        0.000000419
>> 15        0.000000496
>> 16        0.000000539
>> 17        0.000000616
>> 18        0.000000668
>> 19        0.000000763
>> 20        0.000000822
>> 21        0.000000915
>> 22        0.000001000
>> 23        0.000001098
>> 24        0.000001170
>> 25        0.000001274
>> 26        0.000001372
>> 27        0.000001492
>> 28        0.000001577
>> 29        0.000001700
>> 30        0.000001802
>> 31        0.000001942
>> 32        0.000002038
>> 33        0.000002187
>> 34        0.000002299
>> 35        0.000002461
>> 36        0.000002577
>> 37        0.000002724
>> 38        0.000002864
>> 39        0.000003065
>> 40        0.000003180
>>
>>
>>
>>
>
>
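
For anyone who wants to compare the two columns of the quoted table without eyeballing them, here is a minimal sketch, assuming Python 3 is at hand. The helper names (parse, classify) are made up for illustration and are not part of MPIR or its tune/speed tooling; the numbers are copied verbatim from the table above.

```python
# Timings copied from the quoted table:
# size, secs with sqr_basecase.as, secs without it.
TABLE = """\
1 0.000000006 0.000000009
2 0.000000012 0.000000017
3 0.000000031 0.000000036
4 0.000000056 0.000000054
5 0.000000079 0.000000069
6 0.000000100 0.000000094
7 0.000000128 0.000000128
8 0.000000159 0.000000151
9 0.000000191 0.000000185
10 0.000000228 0.000000223
11 0.000000270 0.000000281
12 0.000000318 0.000000320
13 0.000000368 0.000000376
14 0.000000418 0.000000419
15 0.000000465 0.000000496
16 0.000000527 0.000000539
17 0.000000580 0.000000616
18 0.000000646 0.000000668
19 0.000000710 0.000000763
20 0.000000788 0.000000822
21 0.000000868 0.000000915
22 0.000000935 0.000001000
23 0.000001021 0.000001098
24 0.000001107 0.000001170
25 0.000001191 0.000001274
26 0.000001286 0.000001372
27 0.000001379 0.000001492
28 0.000001496 0.000001577
29 0.000001597 0.000001700
30 0.000001704 0.000001802
31 0.000001810 0.000001942
32 0.000001927 0.000002038
33 0.000002048 0.000002187
34 0.000002167 0.000002299
35 0.000002283 0.000002461
36 0.000002409 0.000002577
37 0.000002537 0.000002724
38 0.000002672 0.000002864
39 0.000002805 0.000003065
40 0.000002959 0.000003180
"""


def parse(table):
    """Return a list of (size, secs_with_as, secs_without_as) tuples."""
    rows = []
    for line in table.splitlines():
        size, with_as, without_as = line.split()
        rows.append((int(size), float(with_as), float(without_as)))
    return rows


def classify(rows):
    """Split sizes by whether the .as version is faster, slower, or tied."""
    faster = [n for n, w, wo in rows if w < wo]
    slower = [n for n, w, wo in rows if w > wo]
    tied = [n for n, w, wo in rows if w == wo]
    return faster, slower, tied


if __name__ == "__main__":
    rows = parse(TABLE)
    for n, w, wo in rows:
        # A ratio below 1 means the version with sqr_basecase.as was quicker.
        print(f"{n:2d}  with/without = {w / wo:.3f}")
    faster, slower, tied = classify(rows)
    print("with .as faster:", faster)
    print("with .as slower:", slower)
    print("tied:", tied)
```

On these numbers the version with sqr_basecase.as is quicker for sizes 1-3 and 11-40, slower for 4-6 and 8-10, and tied at 7, which fits the remark that it is faster for very small and very large basecase squarings. This is a single run on one machine, so the small differences in the middle of the range may well be noise.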
