[mpir-devel] Re: Future MPIR compatibility with GMP?

Gianrico Fini Sun, 10 Jan 2010 09:47:35 -0800

I see the comparison is more balanced on your machine. The impressive
imbalance is in the numbers for the Mersenne and Fermat tests.
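
To put numbers on that imbalance, the ratios can be worked out from the
Mersenne and Fermat rows of Bill's K10-2 table (quoted further down). This
is only a sketch: the dictionary layout and the names are illustrative, and
it assumes that a higher score means faster, as the rest of the benchmark
output suggests.

```python
# Scores copied from Bill's K10-2 table: (MPIR 1.3.0, GMP 5.0.0).
# Assumption: higher score = faster.
scores = {
    "mersenne 3217": (138, 43.6),
    "mersenne 4253": (67.6, 21.8),
    "mersenne 4423": (59.7, 20.1),
    "mersenne 9689": (8.27, 2.67),
    "mersenne 11213": (5.77, 1.85),
    "fermat 8": (87725, 6791),
    "fermat 10": (3241, 635),
    "fermat 12": (80.3, 25.0),
}


def ratio(mpir, gmp):
    """Return how many times larger the MPIR score is than the GMP score."""
    return mpir / gmp


ratios = {name: ratio(*pair) for name, pair in scores.items()}

for name, r in ratios.items():
    print(f"{name:15s} MPIR/GMP = {r:5.2f}")
```

On these rows the MPIR score is roughly 3 times the GMP score for Mersenne,
and between roughly 3 and 13 times for Fermat.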


By the way, I tested everything again with gcc-4.4 and with the output
of (cd tune;make tune) inserted into gmp-mparam.h, for all three
libraries.

Results for GMP-5.0.0 did not change (more or less). Results for
GMP-4.3.2 have improved a little. MPIR-1.3.0-rc4... unfortunately
changed... see below:

--GMP-4.3.2--
$ mpir_bench_two/bench_two_gmp

Running MPIR benchmark
GenuineIntel Family 6 Model 9 Stepping 5
Intel(R) Pentium(R) M processor 1400MHz
Speed: 1.40 GHz (reported)
 Category base
  Program multiply (weight 1.00)
         128         0 =>  8679016
         512         0 =>  1474201
        8192         0 =>    15863
      131072         0 =>      276
     2097152         0 =>     11.8
         128         0 =>  7055718
         512         0 =>   942638
        8192         0 =>    10977
      131072         0 =>      195
     2097152         0 =>     8.05
       15000         0 =>     5182
       20000         0 =>     3953
       30000         0 =>     2372
    16777216         0 =>     27.5
    16777216         0 =>     1.46 =>  3505, 2504
  Program divide (weight 1.00)
        8192         0 =>   210230
        8192         0 =>   169191
        8192         0 =>    51893
        8192         0 =>    14614
      131072         0 =>      202
     8388608         0 =>    0.677
        8192         0 =>   223045
    16777216         0 =>    0.426 =>  2080, 1486
  Program gcd (weight 0.50)
         128         0 =>   405431
         512         0 =>    62607
        8192         0 =>     1151
      131072         0 =>     14.5
     1048576         0 =>    0.701 =>   785,  560
  Program gcdext (weight 0.50)
         128         0 =>   269312
         512         0 =>    40694
        8192         0 =>      585
      131072         0 =>     8.23
     1048576         0 =>    0.436 =>   470,  336
  Program root (weight 0.30)
         128         0 =>   252938
         512         0 =>   171558
        8192         0 =>    24178
      131072         0 =>      305
     1048576         0 =>     16.1 =>  5525, 3946
  Program fac_ui (weight 0.20)
         128         0 =>   409654
        1512         0 =>     5605
       15000         0 =>     98.7
     1000010         0 =>    0.283
     2123456         0 =>    0.106 =>  92.6, 66.2 =>  1546, 1104
 Category app
  Program rsa (weight 1.00)
                   512 =>     2478
                  1024 =>      413
                  2048 =>     61.6 =>   398,  284
  Program pi (weight 1.00)
                 10000 =>      104
                100000 =>     3.90
               1000000 =>    0.218 =>  4.46, 3.18
  Program bpsw (weight 1.00)
                  1024 =>     74.5
                  4096 =>     2.34
                 16384 =>   0.0706 =>  2.31, 1.65
  Program wagstaff (weight 1.00)
                  1024 =>      327
                  4096 =>     10.9
                 16384 =>    0.342 =>  10.7, 7.62
  Program mersenne (weight 1.00)
                  3217 =>     4.30
                  4253 =>     2.15
                  4423 =>     1.94
                  9689 =>    0.249
                 11213 =>    0.182 => 0.959,0.685
  Program fermat (weight 1.00)
                     8 =>     1657
                    10 =>     83.6
                    12 =>     2.52 =>  70.4, 50.3 =>  12.0, 8.56 =>  136, 97.2
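
A quick way to see how much "a little" is: compare the per-program summary
figures of this GMP-4.3.2 run against the earlier GMP-4.3.2 run quoted at
the bottom of this message. The sketch below is illustrative only. It takes
the first summary figure printed for each base program, and the variable
names are not part of the benchmark. Note that the two runs differ in both
compiler (gcc-4.3.4 vs gcc-4.4) and tuning, so the change cannot be
attributed to either one alone.

```python
# First summary figure printed for each base program, GMP-4.3.2.
# before: gcc-4.3.4, not tuned (the earlier run quoted below in the thread).
# after:  gcc-4.4, tuned (the run shown just above).
before = {"multiply": 3313, "divide": 1955, "gcd": 752,
          "gcdext": 457, "root": 5365, "fac_ui": 89.4}
after = {"multiply": 3505, "divide": 2080, "gcd": 785,
         "gcdext": 470, "root": 5525, "fac_ui": 92.6}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


changes = {name: percent_change(before[name], after[name]) for name in before}

for name, pct in changes.items():
    print(f"{name:8s} {pct:+.1f}%")
```

Every base program moves up by somewhere between 2% and 7% on these
figures.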

--MPIR-1.3.0-rc4--
$ mpir_bench_two/bench_two

Running MPIR benchmark
GenuineIntel Family 6 Model 9 Stepping 5
Intel(R) Pentium(R) M processor 1400MHz
Speed: 1.40 GHz (reported)
 Category base
  Program multiply (weight 1.00)
         128         0 =>  7648002
         512         0 =>   810865
        8192         0 =>    12089
      131072         0 =>      285
     2097152         0 =>     10.9
         128         0 =>  7683516
         512         0 =>   813577
        8192         0 =>     9003
      131072         0 =>      200
     2097152         0 =>     7.52
       15000         0 =>     4441
       20000         0 =>     3452
       30000         0 =>     2045
    16777216         0 =>     23.2
    16777216         0 =>     1.44 =>  3073, 2195
  Program divide (weight 1.00)
        8192         0 =>   217532
        8192         0 =>    85041
        8192         0 =>    54990
        8192         0 =>    14711
      131072         0 =>      181
     8388608         0 =>    0.645
        8192         0 =>  1111505
    16777216         0 =>    0.384 =>  2286, 1633
  Program gcd (weight 0.50)
         128         0 =>   331517
         512         0 =>    44950
        8192         0 =>     1114
      131072         0 =>     12.6
     1048576         0 =>    0.649 =>   671,  479
  Program gcdext (weight 0.50)
         128         0 =>   206896
         512         0 =>    34492
        8192         0 =>      490
      131072         0 =>     7.41
     1048576         0 =>    0.414 =>   404,  288
  Program root (weight 0.30)
         128         0 =>   320216
         512         0 =>   130654
        8192         0 =>    10332
      131072         0 =>     94.8
     1048576         0 =>     6.29 =>  3035, 2168
  Program fac_ui (weight 0.20)
         128         0 =>   326635
        1512         0 =>     4571
mul_fft.c:2346: GNU MP assertion failed: cc == 0
       15000         0 =>      119Aborted

...Yes, it started with slightly better results (I see that tuning
is useful for multiplication), then aborted. Do the changed tuned values
cause instability? Is it a problem with the new compiler? I do not
know...
GMP 5 seems more stable and supports my platform, so I'll start using
it.
Thanks for all the information, and keep up the nice work.

Gian.

On 10 Jan, 18:20, Bill Hart <goodwillh...@googlemail.com> wrote:
> Sure, it's interesting to compare.
>
> On my 64 bit machine (Selmer):
>
> K10-2:
>
> Squaring:          MPIR 1.3.0 GMP 5.0.0
> =======         ========  ========
> 128 x 128 :          56715728   55997671
> 512 x 512 :          11350749   13487276
> 8192 x 8192 :          149696      151687
> 131072 x 131072 :       2512         2640
> 2097152 x 2097152 :    94.2          81.6
>
> Multiplication:
> ==========
> 128 x 128 :           57689204  56006766
> 512 x 512 :           11350738  10179077
> 8192 x 8192 :           104945     101532
> 131072 x 131072 :       1856         1848
> 2097152 x 2097152 :     65.7         54.8
>
> Unbalanced:
> ==========
> 15000 x 10000 :          51197      51819
> 20000 x 10000 :          40086      38484
> 30000 x 10000 :          23539      24674
> 16777216 x 512 :            392         456
> 16777216 x 262144 :      10.7        12.9
>
> Division :
> =========
> 8192 / 32 :               1420564  1318523
> 8192 / 64 :               1155167  1334473
> 8192 / 128 :               624077    805567
> 8192 / 4096 :             171758    249209
> 8192 / 8064 :            7084081 8455199
> 131072 / 65536 :            1992      2588
> 8388608 / 4194304 :       5.86       11.5
> 16777216 / 262144 :       4.03       6.94
>
> GCD :
> ====
> 128 x 128 :               1820216 1971827
> 512 x 512 :                 168623  221378
> 8192 x 8192 :                 5560      6321
> 131072 x 131072 :            115       121
> 1048576 x 1048576 :       5.93      6.27
>
> XGCD :
> =====
> 128 x 128 :                 682582  884318
> 512 x 512 :                 122152  154781
> 8192 x 8192 :                 3826     4339
> 131072 x 131072 :          73.3       76.1
> 1048576 x 1048576 :       3.89      4.22
>
> Root:
> ====
> 128 x 5 :                    996836  557837
> 512 x 3 :                    358609  446327
> 8192 x 11 :                  93080  141224
> 131072 x 3 :                  1016     3441
> 1048576 x 3 :                 55.8      166
>
> Fac_ui:
> =====
> 128 :                        1385073 1467919
> 1512 :                          46727    46355
> 10000 :                          1046      1046
> 1000010 :                       3.51       2.23
> 2123456 :                       1.27      0.796
>
> RSA :
> ====
> 512 :                            20478   21112
> 1024 :                            4488     4065
> 2048 :                              762      736
>
> Pi :
> ===
> 10000 :                            398      389
> 100000 :                          23.0    23.2
> 1000000 :                        1.36    1.32
>
> BPSW:
> =====
> 1024 :                              935    1483
> 4096 :                             26.2    31.2
> 16384 :                          0.714   0.871
>
> Wagstaff:
> ======
> 1024 :                            2307     2706
> 4096 :                             89.6     96.0
> 16384 :                           2.86     2.96
>
> Mersenne:
> =======
> 3217 :                             138      43.6
> 4253 :                            67.6      21.8
> 4423 :                            59.7      20.1
> 9689 :                            8.27      2.67
> 11213 :                          5.77      1.85
>
> Fermat:
> =====
> 8 :                              87725      6791
> 10 :                             3241        635
> 12 :                              80.3       25.0
>
> Overall:
> =======
>                                   1364       1186
>
> So a mixed bag really. I'm less impressed with the unbalanced
> multiplication than I was 10 minutes ago. :-(
>
> Clearly their division code has improved and our cube root code still
> sucks and our gcd and xgcd still need optimising (that one file I
> keep carrying on about). Nothing else is jumping out at me.
>
> Bill.
>
> 2010/1/10 Gianrico Fini <gianrico.f...@gmail.com>:
>
> > Sorry, I don't have a 64-bit processor... I usually work on somewhat
> > old hardware.
>
> > Anyway, I think there is another problem with my measurements: I did
> > not "tune".
> > I was trying to update the compiler to gcc-4.4...
> > Then I'll recompile the three libraries, retune them, recompile again,
> > and test.
> > If you are interested, I'll send the new results here.
>
> > On 10 Jan, 17:09, Bill Hart <goodwillh...@googlemail.com> wrote:
> >> Thanks very much for taking the time to run those!!
>
> >> We suffer a little here because of suboptimal assembly code we provide
> >> for your (32 bit?) Pentium M processor, as can be seen from the
> >> multiply scores for small sizes (which are dominated by the assembly
> >> performance).
>
> >> MPIR 1.3:
>
> >>         8192         0 =>     8864
>
> >> GMP 5.0:
>
> >>         8192         0 =>    11419
>
> >> Even if we adjust for that, however, the GMP unbalanced multiply
> >> scores are still exceptional:
>
> >> MPIR 1.3:
>
> >>        15000         0 =>     4503
> >>        20000         0 =>     3481
> >>        30000         0 =>     2069
>
> >> GMP 5.0:
>
> >>        15000         0 =>     5482 (4254 adj.)
> >>        20000         0 =>     4619 (3584 adj.)
> >>        30000         0 =>     2929 (2272 adj.)
>
> >> Assuming my adjustment for the assembly bias is valid (questionable),
> >> it is clear they are getting up to 10% improvement over us with their
> >> higher unbalanced Toom functions. Pretty good work on their part!!
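
The adjusted figures above can be reproduced with a few lines of
arithmetic. This is a sketch under assumptions, not a statement of how the
adjustment was actually done: it assumes the scaling factor is the ratio of
the two 8192 multiply scores quoted above (8864 / 11419), rounded to three
decimals, and that the adjusted scores were truncated to integers. Those
choices happen to match the quoted "adj." values; all names are
illustrative.

```python
# 8192-size multiply scores on the Pentium M, taken as a proxy for
# the speed of the underlying assembly code.
MPIR_8192 = 8864
GMP_8192 = 11419

# Raw unbalanced multiply scores: size -> (MPIR 1.3, GMP 5.0).
unbalanced = {15000: (4503, 5482), 20000: (3481, 4619), 30000: (2069, 2929)}

# Assumption: factor rounded to three decimals, adjusted score truncated.
factor = round(MPIR_8192 / GMP_8192, 3)

# Scale the GMP scores down by the assumed assembly advantage.
adjusted = {size: int(gmp * factor) for size, (_, gmp) in unbalanced.items()}

# Relative gain of the adjusted GMP score over the MPIR score.
gain = {size: adjusted[size] / mpir - 1.0
        for size, (mpir, _) in unbalanced.items()}

for size in unbalanced:
    print(f"{size}: adjusted GMP = {adjusted[size]}, "
          f"gain over MPIR = {gain[size]:+.1%}")
```

With these inputs the adjusted GMP score is below MPIR at 15000, about 3%
above at 20000, and about 10% above at 30000, which is consistent with the
"up to 10%" reading.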
>
> >> I'd be curious to compare on a 64 bit machine where there should be
> >> little to no assembly bias. It looks to me that perhaps we still come
> >> out around the same on the pi test.
>
> >> Bill.
>
> >> 2010/1/10 Gianrico Fini <gianrico.f...@gmail.com>:
>
> >> > I tried, on my laptop. I couldn't work with their own test, so I used the
> >> > one I've found on MPIR main page. I paste here the result (I'm running
> >> > Gentoo, gcc-4.3.4).
>
> >> > --GMP-4.3.2--
> >> > $ mpir_bench_two/bench_two_gmp
>
> >> > Running MPIR benchmark
> >> > GenuineIntel Family 6 Model 9 Stepping 5
> >> > Intel(R) Pentium(R) M processor 1400MHz
> >> > Speed: 1.40 GHz (reported)
> >> >  Category base
> >> >   Program multiply (weight 1.00)
> >> >          128         0 =>  7873205
> >> >          512         0 =>  1412472
> >> >         8192         0 =>    15535
> >> >       131072         0 =>      252
> >> >      2097152         0 =>     10.9
> >> >          128         0 =>  7014812
> >> >          512         0 =>   895116
> >> >         8192         0 =>    10290
> >> >       131072         0 =>      170
> >> >      2097152         0 =>     7.41
> >> >        15000         0 =>     4916
> >> >        20000         0 =>     3796
> >> >        30000         0 =>     2333
> >> >     16777216         0 =>     27.4
> >> >     16777216         0 =>     1.37 =>  3313, 2366
> >> >   Program divide (weight 1.00)
> >> >         8192         0 =>   208271
> >> >         8192         0 =>   163357
> >> >         8192         0 =>    46225
> >> >         8192         0 =>    13306
> >> >       131072         0 =>      195
> >> >      8388608         0 =>    0.636
> >> >         8192         0 =>   219948
> >> >     16777216         0 =>    0.375 =>  1955, 1397
> >> >   Program gcd (weight 0.50)
> >> >          128         0 =>   379235
> >> >          512         0 =>    63447
> >> >         8192         0 =>     1155
> >> >       131072         0 =>     13.0
> >> >      1048576         0 =>    0.667 =>   752,  537
> >> >   Program gcdext (weight 0.50)
> >> >          128         0 =>   266774
> >> >          512         0 =>    39311
> >> >         8192         0 =>      576
> >> >       131072         0 =>     7.67
> >> >      1048576         0 =>    0.429 =>   457,  326
> >> >   Program root (weight 0.30)
> >> >          128         0 =>   254520
> >> >          512         0 =>   174983
> >> >         8192         0 =>    23189
> >> >       131072         0 =>      285
> >> >      1048576         0 =>     15.1 =>  5365, 3832
> >> >   Program fac_ui (weight 0.20)
> >> >          128         0 =>   392143
> >> >         1512         0 =>     5420
> >> >        15000         0 =>      100
> >> >      1000010         0 =>    0.272
> >> >      2123456         0 =>   0.0989 =>  89.4, 63.9 =>  1473, 1052
> >> >  Category app
> >> >   Program rsa (weight 1.00)
> >> >                    512 =>     2318
> >> >                   1024 =>      397
> >> >                   2048 =>     60.0 =>   381,  272
> >> >   Program pi (weight 1.00)
> >> >                  10000 =>     89.4
> >> >                 100000 =>     3.69
> >> >                1000000 =>    0.202 =>  4.05, 2.90
> >> >   Program bpsw (weight 1.00)
> >> >                   1024 =>     68.1
> >> >                   4096 =>     2.12
> >> >                  16384 =>   0.0661 =>  2.12, 1.52
> >> >   Program wagstaff (weight 1.00)
> >> >                   1024 =>      316
> >> >                   4096 =>     10.2
> >> >                  16384 =>    0.323 =>  10.1, 7.23
> >> >   Program mersenne (weight 1.00)
> >> >                   3217 =>     4.17
> >> >                   4253 =>     2.11
> >> >                   4423 =>     1.78
> >> >                   9689 =>    0.255
> >> >                  11213 =>    0.175 => 0.931,0.665
> >> >   Program fermat (weight 1.00)
>
> ...

