https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78379

--- Comment #2 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
Here are some measurements with the AVX-enabling patch.
They were done on an AVX machine, namely gcc75 from the compile farm.

This was done with the command line

gfortran -static-libgfortran -finline-matmul-limit=0 -Ofast -o compare_mavx compare_2.f90

Unconditionally setting -mavx in the Makefile for matmul, with stock trunk:

 =========================================================
 ================            MEASURED GIGAFLOPS          =
 =========================================================
                 Matmul                           Matmul
                 fixed                 Matmul     variable
 Size  Loops     explicit   refMatmul  assumed    explicit
 =========================================================
    2  5000      0.067      0.077      0.051      0.069
    3  5000      0.193      0.218      0.157      0.194
    4  5000      0.429      0.423      0.368      0.435
    5  5000      0.609      0.659      0.556      0.630
    7  5000      0.948      1.018      0.931      1.009
    8  5000      1.608      1.251      1.589      1.715
    9  5000      1.755      1.484      1.745      1.856
   15  5000      2.710      2.175      2.963      3.105
   16  5000      4.289      2.510      4.541      4.784
   17  5000      4.411      3.032      4.675      4.888
   31  5000      6.165      4.395      6.912      6.902
   32  5000      8.800      4.362      8.793      8.809
   33  5000      8.156      4.463      8.145      8.193
   63  5000      9.727      4.364      9.709      9.716
   64  5000     11.828      4.023     11.810     11.798
   65  5000     10.726      4.489     10.654     10.725
  127  3920     12.144      4.292     12.281     12.268
  128  3829     13.829      4.484     13.807     13.841
  129  3741     12.986      4.438     12.964     12.985
  255   483     14.446      4.571     14.462     14.442
  256   477     15.738      4.707     15.744     15.738
  257   472     13.981      4.565     13.995     13.990
  511    60     14.954      4.674     14.977     14.933
  512    59     16.120      4.840     16.137     16.062
  513    59     14.488      4.392     14.497     14.490
 1023     7     15.011      3.573     15.021     14.995
 1024     7     15.938      3.489     15.947     15.938
 1025     7     14.670      3.568     14.683     14.627

With library-side switching
(https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01810.html):

 =========================================================
 ================            MEASURED GIGAFLOPS          =
 =========================================================
                 Matmul                           Matmul
                 fixed                 Matmul     variable
 Size  Loops     explicit   refMatmul  assumed    explicit
 =========================================================
    2  5000      0.067      0.080      0.053      0.067
    3  5000      0.192      0.226      0.159      0.192
    4  5000      0.427      0.436      0.364      0.431
    5  5000      0.588      0.664      0.543      0.621
    7  5000      0.938      0.914      0.926      1.011
    8  5000      1.589      1.235      1.558      1.671
    9  5000      1.704      1.486      1.694      1.810
   15  5000      2.638      2.175      2.854      3.031
   16  5000      4.234      2.532      4.533      4.745
   17  5000      4.374      3.044      4.677      4.839
   31  5000      6.207      4.401      6.891      6.918
   32  5000      8.824      4.364      8.614      8.603
   33  5000      7.954      4.349      7.945      7.944
   63  5000      8.802      4.369      9.728      9.764
   64  5000     11.845      4.025     11.783     11.849
   65  5000     10.753      4.595     10.719     10.753
  127  3920     12.023      4.314     12.285     12.004
  128  3829     13.427      4.369     13.722     13.742
  129  3741     12.877      4.323     12.668     12.985
  255   483     14.398      4.453     14.336     13.496
  256   477     15.708      4.680     15.711     15.465
  257   472     13.977      4.439     13.965     13.977
  511    60     14.920      4.691     14.937     14.939
  512    59     15.959      4.787     16.084     16.082
  513    59     14.444      4.636     14.464     14.452
 1023     7     14.978      3.448     14.979     14.980
 1024     7     15.903      3.640     15.900     15.905
 1025     7     14.638      3.464     14.626     14.636

With stock trunk:

 =========================================================
 ================            MEASURED GIGAFLOPS          =
 =========================================================
                 Matmul                           Matmul
                 fixed                 Matmul     variable
 Size  Loops     explicit   refMatmul  assumed    explicit
 =========================================================
    2  5000      0.072      0.078      0.053      0.072
    3  5000      0.199      0.224      0.165      0.200
    4  5000      0.458      0.403      0.387      0.462
    5  5000      0.629      0.661      0.563      0.651
    7  5000      1.073      1.010      1.029      1.131
    8  5000      1.671      1.234      1.637      1.760
    9  5000      1.732      1.465      1.720      1.829
   15  5000      2.895      2.152      3.195      3.349
   16  5000      3.870      2.483      4.168      4.318
   17  5000      3.976      3.029      4.253      4.424
   31  5000      6.210      4.403      6.861      6.868
   32  5000      7.551      4.293      7.544      7.509
   33  5000      7.119      4.418      7.094      7.090
   63  5000      8.742      4.377      8.753      8.728
   64  5000      9.415      4.019      9.384      9.260
   65  5000      8.882      4.540      8.842      8.856
  127  3920     10.073      4.432      9.966      9.988
  128  3829     10.556      4.469     10.552     10.405
  129  3741      9.923      4.428      9.990      9.930
  255   483     10.827      4.569     10.875     10.768
  256   477     11.328      4.705     11.281     11.129
  257   472     10.402      4.492     10.344     10.360
  511    60     10.947      4.674     11.003     10.938
  512    59     11.503      4.842     11.504     11.314
  513    59     10.654      4.672     10.651     10.619
 1023     7     10.941      3.641     10.944     10.863
 1024     7     11.370      3.587     11.261     11.193
 1025     7     10.734      3.601     10.652     10.704
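
To put the three library builds side by side, the following is a minimal
sketch (not part of the original measurements) that computes speedup ratios
relative to stock trunk. The values are copied from the "Matmul fixed
explicit" column of the three tables above for a few sizes; the names
fixed_explicit and speedups are made up for illustration.

```python
# GFLOPS from the "Matmul fixed explicit" column of the tables above.
# Keys are matrix sizes; values are
# (unconditional -mavx, library-side switching, stock trunk).
fixed_explicit = {
    16: (4.289, 4.234, 3.870),
    64: (11.828, 11.845, 9.415),
    256: (15.738, 15.708, 11.328),
    1024: (15.938, 15.903, 11.370),
}


def speedups(table):
    """Return {size: (mavx / stock, switching / stock)} ratios."""
    return {
        n: (avx / stock, switch / stock)
        for n, (avx, switch, stock) in table.items()
    }


if __name__ == "__main__":
    for n, (r_avx, r_switch) in speedups(fixed_explicit).items():
        print(f"{n:5d}  -mavx: {r_avx:.2f}x  switching: {r_switch:.2f}x")
```

For the sizes picked here, both AVX variants come out ahead of stock trunk,
and the two AVX variants stay close to each other.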

With inlined matmul, -Ofast without -mavx:

 =========================================================
 ================            MEASURED GIGAFLOPS          =
 =========================================================
                 Matmul                           Matmul
                 fixed                 Matmul     variable
 Size  Loops     explicit   refMatmul  assumed    explicit
 =========================================================
    2  5000      8.979      0.078      0.154      0.241
    3  5000     14.042      0.224      0.348      0.451
    4  5000      1.686      0.435      0.500      0.707
    5  5000      1.989      0.617      0.577      0.829
    7  5000      2.163      0.846      0.783      1.123
    8  5000      3.742      1.224      0.879      1.322
    9  5000      2.764      1.420      0.996      1.458
   15  5000      3.461      2.108      1.305      2.420
   16  5000      4.395      2.589      1.619      2.901
   17  5000      5.238      3.291      1.934      3.579
   31  5000      7.207      4.434      2.347      4.385
   32  5000      7.318      4.306      2.351      4.329
   33  5000      7.204      4.466      2.052      4.421
   63  5000      4.688      4.365      2.486      4.700
   64  5000      4.246      4.022      2.480      4.664
   65  5000      4.238      4.355      2.486      4.703
  127  3920      4.411      4.427      2.821      4.340
  128  3829      4.365      4.481      2.846      4.434
  129  3741      4.427      4.441      2.828      4.396
  255   483      4.561      4.569      2.972      4.517
  256   477      4.666      4.701      2.905      4.685
  257   472      4.520      4.573      2.974      4.550
  511    60      4.669      4.675      3.075      4.666
  512    59      4.823      4.843      3.095      4.835
  513    59      4.655      4.672      3.077      4.651
 1023     7      3.555      3.563      2.718      3.554
 1024     7      3.519      3.529      2.713      3.519
 1025     7      3.527      3.543      2.715      3.536

With the inlined version and -mavx:

 =========================================================
 ================            MEASURED GIGAFLOPS          =
 =========================================================
                 Matmul                           Matmul
                 fixed                 Matmul     variable
 Size  Loops     explicit   refMatmul  assumed    explicit
 =========================================================
    2  5000      8.990      0.074      0.155      0.206
    3  5000      7.488      0.212      0.304      0.396
    4  5000      1.773      0.342      0.501      0.533
    5  5000      2.000      0.552      0.615      0.739
    7  5000      2.163      0.919      0.807      1.057
    8  5000      3.369      1.388      0.905      1.578
    9  5000      2.694      1.347      1.020      1.492
   15  5000      3.441      2.201      1.325      2.631
   16  5000      1.831      3.399      1.677      4.137
   17  5000      4.554      3.461      1.976      4.120
   31  5000      7.111      5.286      2.372      5.712
   32  5000      8.384      5.887      2.040      6.725
   33  5000      7.218      5.374      2.057      5.798
   63  5000      8.131      6.107      2.477      6.418
   64  5000      8.707      6.518      2.313      7.228
   65  5000      7.768      6.003      2.427      4.503
  127  3920      6.714      5.688      2.761      6.293
  128  3829      7.067      6.688      2.777      6.880
  129  3741      6.277      6.023      2.765      6.296
  255   483      6.036      5.681      2.877      5.765
  256   477      6.177      5.869      2.921      5.917
  257   472      6.017      5.687      2.880      5.766
  511    60      6.156      5.878      2.848      5.920
  512    59      6.338      6.107      3.026      6.092
  513    59      6.125      5.826      2.954      5.817
 1023     7      4.130      4.111      2.623      4.104
 1024     7      4.270      4.219      2.667      4.198
 1025     7      4.206      4.159      2.616      4.149
