Re: [mpir-devel] Re: Future MPIR compatibility with GMP?

Bill Hart Sun, 10 Jan 2010 10:03:10 -0800

Actually, if you take the tuning values that are missing at the end of
the tuning file and replace them with the values we supplied
(MUL_FFT_TABLE, SQR_FFT_TABLE, etc) you will probably find the
assertion goes away. The assertion was left there on purpose to
indicate that the FFT is most definitely using wrong tuning values and
giving wrong results. Only the tuning values we supplied are
guaranteed to not trigger it!
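
For anyone who wants to script that replacement, here is a minimal sketch in Python. It makes several assumptions that do not come from this thread: the FFT parameters are single-line #define macros whose names contain "FFT", both files are read in as plain text, and the helper names fft_defines and restore_fft_values are made up for illustration. The macro values in the demo are invented and are not real tuning data.

```python
import re

# Matches the start of a single-line "#define NAME ..." (assumption: the
# FFT macros are not split across lines with backslash continuations).
DEFINE_RE = re.compile(r"^\s*#\s*define\s+(\w+)")


def fft_defines(header_text):
    """Return {macro name: full line} for every #define whose name contains 'FFT'."""
    found = {}
    for line in header_text.splitlines():
        m = DEFINE_RE.match(line)
        if m and "FFT" in m.group(1):
            found[m.group(1)] = line
    return found


def restore_fft_values(tuned_text, shipped_text):
    """Replace FFT #defines in tuned_text with the lines from shipped_text.

    Shipped FFT macros that are missing from the tuned text are appended
    at the end. All other tuned lines are kept unchanged.
    """
    shipped = fft_defines(shipped_text)
    seen = set()
    out = []
    for line in tuned_text.splitlines():
        m = DEFINE_RE.match(line)
        if m and m.group(1) in shipped:
            out.append(shipped[m.group(1)])  # supplied value wins
            seen.add(m.group(1))
        else:
            out.append(line)
    out.extend(line for name, line in shipped.items() if name not in seen)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    shipped = (
        "#define MUL_KARATSUBA_THRESHOLD 20\n"
        "#define MUL_FFT_TABLE { 1, 2, 0 }\n"
        "#define SQR_FFT_TABLE { 3, 4, 0 }\n"
    )
    tuned = (
        "#define MUL_KARATSUBA_THRESHOLD 24\n"
        "#define MUL_FFT_TABLE { 9, 9, 0 }\n"
    )
    print(restore_fft_values(tuned, shipped), end="")
```

The sketch keeps the freshly tuned non-FFT thresholds and takes only the FFT entries from the supplied file. Check the merged output by eye before copying it into gmp-mparam.h.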


Bill.

2010/1/10 Bill Hart <goodwillh...@googlemail.com>:
> You'll find that all the very large stuff will be markedly worse after
> tuning. One of the reasons we are replacing the FFT is that at present
> the only people that can tune it are us, and it takes hours!
>
> But thanks again for the comparisons.
>
> Bill.
>
> 2010/1/10 Gianrico Fini <gianrico.f...@gmail.com>:
>> I see the comparison is more balanced on your machine. The striking
>> imbalance is in the numbers for the Mersenne and Fermat tests.
>>
>> By the way, I tested it all again with gcc-4.4 and with the result
>> from (cd tune;make tune) inserted in gmp-mparam.h... for all three
>> libraries.
>>
>> Results for GMP-5.0.0 did not change (more or less). Results for
>> GMP-4.3.2 have improved a little. MPIR-1.3.0-rc4... unfortunately
>> changed... see below:
>>
>> --GMP-4.3.2--
>> $ mpir_bench_two/bench_two_gmp
>>
>> Running MPIR benchmark
>> GenuineIntel Family 6 Model 9 Stepping 5
>> Intel(R) Pentium(R) M processor 1400MHz
>> Speed: 1.40 GHz (reported)
>>  Category base
>>  Program multiply (weight 1.00)
>>         128         0 =>  8679016
>>         512         0 =>  1474201
>>        8192         0 =>    15863
>>      131072         0 =>      276
>>     2097152         0 =>     11.8
>>         128         0 =>  7055718
>>         512         0 =>   942638
>>        8192         0 =>    10977
>>      131072         0 =>      195
>>     2097152         0 =>     8.05
>>       15000         0 =>     5182
>>       20000         0 =>     3953
>>       30000         0 =>     2372
>>    16777216         0 =>     27.5
>>    16777216         0 =>     1.46 =>  3505, 2504
>>  Program divide (weight 1.00)
>>        8192         0 =>   210230
>>        8192         0 =>   169191
>>        8192         0 =>    51893
>>        8192         0 =>    14614
>>      131072         0 =>      202
>>     8388608         0 =>    0.677
>>        8192         0 =>   223045
>>    16777216         0 =>    0.426 =>  2080, 1486
>>  Program gcd (weight 0.50)
>>         128         0 =>   405431
>>         512         0 =>    62607
>>        8192         0 =>     1151
>>      131072         0 =>     14.5
>>     1048576         0 =>    0.701 =>   785,  560
>>  Program gcdext (weight 0.50)
>>         128         0 =>   269312
>>         512         0 =>    40694
>>        8192         0 =>      585
>>      131072         0 =>     8.23
>>     1048576         0 =>    0.436 =>   470,  336
>>  Program root (weight 0.30)
>>         128         0 =>   252938
>>         512         0 =>   171558
>>        8192         0 =>    24178
>>      131072         0 =>      305
>>     1048576         0 =>     16.1 =>  5525, 3946
>>  Program fac_ui (weight 0.20)
>>         128         0 =>   409654
>>        1512         0 =>     5605
>>       15000         0 =>     98.7
>>     1000010         0 =>    0.283
>>     2123456         0 =>    0.106 =>  92.6, 66.2 =>  1546, 1104
>>  Category app
>>  Program rsa (weight 1.00)
>>                   512 =>     2478
>>                  1024 =>      413
>>                  2048 =>     61.6 =>   398,  284
>>  Program pi (weight 1.00)
>>                 10000 =>      104
>>                100000 =>     3.90
>>               1000000 =>    0.218 =>  4.46, 3.18
>>  Program bpsw (weight 1.00)
>>                  1024 =>     74.5
>>                  4096 =>     2.34
>>                 16384 =>   0.0706 =>  2.31, 1.65
>>  Program wagstaff (weight 1.00)
>>                  1024 =>      327
>>                  4096 =>     10.9
>>                 16384 =>    0.342 =>  10.7, 7.62
>>  Program mersenne (weight 1.00)
>>                  3217 =>     4.30
>>                  4253 =>     2.15
>>                  4423 =>     1.94
>>                  9689 =>    0.249
>>                 11213 =>    0.182 => 0.959,0.685
>>  Program fermat (weight 1.00)
>>                     8 =>     1657
>>                    10 =>     83.6
>>                    12 =>     2.52 =>  70.4, 50.3 =>  12.0, 8.56 =>   136, 97.2
>>
>> --MPIR-1.3.0-rc4--
>> $ mpir_bench_two/bench_two
>>
>> Running MPIR benchmark
>> GenuineIntel Family 6 Model 9 Stepping 5
>> Intel(R) Pentium(R) M processor 1400MHz
>> Speed: 1.40 GHz (reported)
>>  Category base
>>  Program multiply (weight 1.00)
>>         128         0 =>  7648002
>>         512         0 =>   810865
>>        8192         0 =>    12089
>>      131072         0 =>      285
>>     2097152         0 =>     10.9
>>         128         0 =>  7683516
>>         512         0 =>   813577
>>        8192         0 =>     9003
>>      131072         0 =>      200
>>     2097152         0 =>     7.52
>>       15000         0 =>     4441
>>       20000         0 =>     3452
>>       30000         0 =>     2045
>>    16777216         0 =>     23.2
>>    16777216         0 =>     1.44 =>  3073, 2195
>>  Program divide (weight 1.00)
>>        8192         0 =>   217532
>>        8192         0 =>    85041
>>        8192         0 =>    54990
>>        8192         0 =>    14711
>>      131072         0 =>      181
>>     8388608         0 =>    0.645
>>        8192         0 =>  1111505
>>    16777216         0 =>    0.384 =>  2286, 1633
>>  Program gcd (weight 0.50)
>>         128         0 =>   331517
>>         512         0 =>    44950
>>        8192         0 =>     1114
>>      131072         0 =>     12.6
>>     1048576         0 =>    0.649 =>   671,  479
>>  Program gcdext (weight 0.50)
>>         128         0 =>   206896
>>         512         0 =>    34492
>>        8192         0 =>      490
>>      131072         0 =>     7.41
>>     1048576         0 =>    0.414 =>   404,  288
>>  Program root (weight 0.30)
>>         128         0 =>   320216
>>         512         0 =>   130654
>>        8192         0 =>    10332
>>      131072         0 =>     94.8
>>     1048576         0 =>     6.29 =>  3035, 2168
>>  Program fac_ui (weight 0.20)
>>         128         0 =>   326635
>>        1512         0 =>     4571
>> mul_fft.c:2346: GNU MP assertion failed: cc == 0
>>       15000         0 =>      119Aborted
>>
>> ...Yes, it started with some slightly better results (I see that tuning
>> is useful for multiplication), then aborted. Does changing the tuned
>> values cause instability? Is it a problem with the new compiler? I do
>> not know...
>> GMP 5 seems more stable and supports my platform, so I'll start using
>> it.
>> Thanks for all the information, and keep up the nice work.
>>
>> Gian.
>>
>> On 10 Jan, 18:20, Bill Hart <goodwillh...@googlemail.com> wrote:
>>> Sure, it's interesting to compare.
>>>
>>> On my 64 bit machine (Selmer):
>>>
>>> K10-2:
>>>
>>> Squaring:          MPIR 1.3.0 GMP 5.0.0
>>> =======         ========  ========
>>> 128 x 128 :          56715728   55997671
>>> 512 x 512 :          11350749   13487276
>>> 8192 x 8192 :          149696      151687
>>> 131072 x 131072 :       2512         2640
>>> 2097152 x 2097152 :    94.2          81.6
>>>
>>> Multiplication:
>>> ==========
>>> 128 x 128 :           57689204  56006766
>>> 512 x 512 :           11350738  10179077
>>> 8192 x 8192 :           104945     101532
>>> 131072 x 131072 :       1856         1848
>>> 2097152 x 2097152 :     65.7         54.8
>>>
>>> Unbalanced:
>>> ==========
>>> 15000 x 10000 :          51197      51819
>>> 20000 x 10000 :          40086      38484
>>> 30000 x 10000 :          23539      24674
>>> 16777216 x 512 :            392         456
>>> 16777216 x 262144 :      10.7        12.9
>>>
>>> Division :
>>> =========
>>> 8192 / 32 :               1420564  1318523
>>> 8192 / 64 :               1155167  1334473
>>> 8192 / 128 :               624077    805567
>>> 8192 / 4096 :             171758    249209
>>> 8192 / 8064 :            7084081 8455199
>>> 131072 / 65536 :            1992      2588
>>> 8388608 / 4194304 :       5.86       11.5
>>> 16777216 / 262144 :       4.03       6.94
>>>
>>> GCD :
>>> ====
>>> 128 x 128 :               1820216 1971827
>>> 512 x 512 :                 168623  221378
>>> 8192 x 8192 :                 5560      6321
>>> 131072 x 131072 :            115       121
>>> 1048576 x 1048576 :       5.93      6.27
>>>
>>> XGCD :
>>> =====
>>> 128 x 128 :                 682582  884318
>>> 512 x 512 :                 122152  154781
>>> 8192 x 8192 :                 3826     4339
>>> 131072 x 131072 :          73.3       76.1
>>> 1048576 x 1048576 :       3.89      4.22
>>>
>>> Root:
>>> ====
>>> 128 x 5 :                    996836  557837
>>> 512 x 3 :                    358609  446327
>>> 8192 x 11 :                  93080  141224
>>> 131072 x 3 :                  1016     3441
>>> 1048576 x 3 :                 55.8      166
>>>
>>> Fac_ui:
>>> =====
>>> 128 :                        1385073 1467919
>>> 1512 :                          46727    46355
>>> 10000 :                          1046      1046
>>> 1000010 :                       3.51       2.23
>>> 2123456 :                       1.27      0.796
>>>
>>> RSA :
>>> ====
>>> 512 :                            20478   21112
>>> 1024 :                            4488     4065
>>> 2048 :                              762      736
>>>
>>> Pi :
>>> ===
>>> 10000 :                            398      389
>>> 100000 :                          23.0    23.2
>>> 1000000 :                        1.36    1.32
>>>
>>> BPSW:
>>> =====
>>> 1024 :                              935    1483
>>> 4096 :                             26.2    31.2
>>> 16384 :                          0.714   0.871
>>>
>>> Wagstaff:
>>> ======
>>> 1024 :                            2307     2706
>>> 4096 :                             89.6     96.0
>>> 16384 :                           2.86     2.96
>>>
>>> Mersenne:
>>> =======
>>> 3217 :                             138      43.6
>>> 4253 :                            67.6      21.8
>>> 4423 :                            59.7      20.1
>>> 9689 :                            8.27      2.67
>>> 11213 :                          5.77      1.85
>>>
>>> Fermat:
>>> =====
>>> 8 :                              87725      6791
>>> 10 :                             3241        635
>>> 12 :                              80.3       25.0
>>>
>>> Overall:
>>> =======
>>>                                   1364       1186
>>>
>>> So a mixed bag really. I'm less impressed with the unbalanced
>>> multiplication than I was 10 minutes ago. :-(
>>>
>>> Clearly their division code has improved, our cube root code still
>>> sucks, and our gcd and xgcd still need optimising (that one file I
>>> keep carrying on about). Nothing else is jumping out at me.
>>>
>>> Bill.
>>>
>>> 2010/1/10 Gianrico Fini <gianrico.f...@gmail.com>:
>>>
>>> > Sorry, I don't have a 64-bit processor... I usually work on somewhat
>>> > old hardware.
>>>
>>> > Anyway, I think there is another problem with my measurements: I did
>>> > not "tune".
>>> > I was trying an update of the compiler to gcc-4.4...
>>> > Then I'll recompile the three libraries, retune them, recompile again,
>>> > and test.
>>> > If you are interested, I'll send the new results here again.
>>>
>>> > On 10 Jan, 17:09, Bill Hart <goodwillh...@googlemail.com> wrote:
>>> >> Thanks very much for taking the time to run those!!
>>>
>>> >> We suffer a little here because of suboptimal assembly code we provide
>>> >> for your (32 bit?) Pentium M processor, as can be seen from the
>>> >> multiply scores for small sizes (which are dominated by the assembly
>>> >> performance).
>>>
>>> >> MPIR 1.3:
>>>
>>> >>         8192         0 =>     8864
>>>
>>> >> GMP 5.0:
>>>
>>> >>         8192         0 =>    11419
>>>
>>> >> Even if we adjust for that, however, the GMP unbalanced multiply
>>> >> scores are still exceptional:
>>>
>>> >> MPIR 1.3:
>>>
>>> >>        15000         0 =>     4503
>>> >>        20000         0 =>     3481
>>> >>        30000         0 =>     2069
>>>
>>> >> GMP 5.0:
>>>
>>> >>        15000         0 =>     5482 (4254 adj.)
>>> >>        20000         0 =>     4619 (3584 adj.)
>>> >>        30000         0 =>     2929 (2272 adj.)
>>>
>>> >> Assuming my adjustment for the assembly bias is valid (questionable),
>>> >> it is clear they are getting up to 10% improvement over us with their
>>> >> higher unbalanced Toom functions. Pretty good work on their part!!
>>>
>>> >> I'd be curious to compare on a 64 bit machine where there should be
>>> >> little to no assembly bias. It looks to me that perhaps we still come
>>> >> out around the same on the pi test.
>>>
>>> >> Bill.
>>>
>>> >> 2010/1/10 Gianrico Fini <gianrico.f...@gmail.com>:
>>>
>>> >> > I tried, on my laptop. I couldn't work with their own test, so I
>>> >> > used the one I found on the MPIR main page. I paste the results here
>>> >> > (I'm running Gentoo, gcc-4.3.4).
>>>
>>> >> > --GMP-4.3.2--
>>> >> > $ mpir_bench_two/bench_two_gmp
>>>
>>> >> > Running MPIR benchmark
>>> >> > GenuineIntel Family 6 Model 9 Stepping 5
>>> >> > Intel(R) Pentium(R) M processor 1400MHz
>>> >> > Speed: 1.40 GHz (reported)
>>> >> >  Category base
>>> >> >   Program multiply (weight 1.00)
>>> >> >          128         0 =>  7873205
>>> >> >          512         0 =>  1412472
>>> >> >         8192         0 =>    15535
>>> >> >       131072         0 =>      252
>>> >> >      2097152         0 =>     10.9
>>> >> >          128         0 =>  7014812
>>> >> >          512         0 =>   895116
>>> >> >         8192         0 =>    10290
>>> >> >       131072         0 =>      170
>>> >> >      2097152         0 =>     7.41
>>> >> >        15000         0 =>     4916
>>> >> >        20000         0 =>     3796
>>> >> >        30000         0 =>     2333
>>> >> >     16777216         0 =>     27.4
>>> >> >     16777216         0 =>     1.37 =>  3313, 2366
>>> >> >   Program divide (weight 1.00)
>>> >> >         8192         0 =>   208271
>>> >> >         8192         0 =>   163357
>>> >> >         8192         0 =>    46225
>>> >> >         8192         0 =>    13306
>>> >> >       131072         0 =>      195
>>> >> >      8388608         0 =>    0.636
>>> >> >         8192         0 =>   219948
>>> >> >     16777216         0 =>    0.375 =>  1955, 1397
>>> >> >   Program gcd (weight 0.50)
>>> >> >          128         0 =>   379235
>>> >> >          512         0 =>    63447
>>> >> >         8192         0 =>     1155
>>> >> >       131072         0 =>     13.0
>>> >> >      1048576         0 =>    0.667 =>   752,  537
>>> >> >   Program gcdext (weight 0.50)
>>> >> >          128         0 =>   266774
>>> >> >          512         0 =>    39311
>>> >> >         8192         0 =>      576
>>> >> >       131072         0 =>     7.67
>>> >> >      1048576         0 =>    0.429 =>   457,  326
>>> >> >   Program root (weight 0.30)
>>> >> >          128         0 =>   254520
>>> >> >          512         0 =>   174983
>>> >> >         8192         0 =>    23189
>>> >> >       131072         0 =>      285
>>> >> >      1048576         0 =>     15.1 =>  5365, 3832
>>> >> >   Program fac_ui (weight 0.20)
>>> >> >          128         0 =>   392143
>>> >> >         1512         0 =>     5420
>>> >> >        15000         0 =>      100
>>> >> >      1000010         0 =>    0.272
>>> >> >      2123456         0 =>   0.0989 =>  89.4, 63.9 =>  1473, 1052
>>> >> >  Category app
>>> >> >   Program rsa (weight 1.00)
>>> >> >                    512 =>     2318
>>> >> >                   1024 =>      397
>>> >> >                   2048 =>     60.0 =>   381,  272
>>> >> >   Program pi (weight 1.00)
>>> >> >                  10000 =>     89.4
>>> >> >                 100000 =>     3.69
>>> >> >                1000000 =>    0.202 =>  4.05, 2.90
>>> >> >   Program bpsw (weight 1.00)
>>> >> >                   1024 =>     68.1
>>> >> >                   4096 =>     2.12
>>> >> >                  16384 =>   0.0661 =>  2.12, 1.52
>>> >> >   Program wagstaff (weight 1.00)
>>> >> >                   1024 =>      316
>>> >> >                   4096 =>     10.2
>>> >> >                  16384 =>    0.323 =>  10.1, 7.23
>>> >> >   Program mersenne (weight 1.00)
>>> >> >                   3217 =>     4.17
>>> >> >                   4253 =>     2.11
>>> >> >                   4423 =>     1.78
>>> >> >                   9689 =>    0.255
>>> >> >                  11213 =>    0.175 => 0.931,0.665
>>> >> >   Program fermat (weight 1.00)
>>>
>>> ...
>>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups 
>> "mpir-devel" group.
>> To post to this group, send email to mpir-de...@googlegroups.com.
>> To unsubscribe from this group, send email to 
>> mpir-devel+unsubscr...@googlegroups.com.
>> For more options, visit this group at 
>> http://groups.google.com/group/mpir-devel?hl=en.
>>
>
