Here is my report on Fortran benchmarking. I compare the trunk dated
20080507 (no revision number, sorry) and the IRA branch rev. 135035. I
run the Polyhedron benchmark
(http://www.polyhedron.co.uk/polyhedron_benchmark_suite0html) which is
probably the most widely used benchmark in the Fortran community. I
don't have many points, but they're very well converged (the
"standard" parameters were used, which means each test, for each set
of compilation options, is run between 10 and 100 times, until the
timing standard deviation falls below 0.1%).
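
To make the stopping rule concrete, here is a minimal Python sketch of
a repeat-until-converged timing loop. It is not the Polyhedron
harness's code: the name run_until_converged, its parameters, and the
use of the relative standard error of the mean as the convergence
measure are all assumptions for illustration, and the real harness may
define its "Estim Err %" differently.

```python
import math
import statistics


def run_until_converged(time_once, min_runs=10, max_runs=100, target=0.001):
    """Call time_once() repeatedly and return (mean, repeats, rel_err).

    Hypothetical stopping rule: stop once at least min_runs samples exist
    and the standard error of the mean, relative to the mean, is below
    target (0.001 corresponds to 0.1%), or once max_runs is reached.
    """
    samples = []
    rel_err = float("inf")
    while len(samples) < max_runs:
        samples.append(time_once())
        if len(samples) < min_runs:
            continue  # never judge convergence on too few samples
        mean = statistics.fmean(samples)
        rel_err = statistics.stdev(samples) / math.sqrt(len(samples)) / mean
        if rel_err < target:
            break
    return statistics.fmean(samples), len(samples), rel_err
```

Here time_once would be any callable that runs one benchmark
executable and returns its wall-clock time in seconds. With a cap like
max_runs, a noisy test can stop at the limit with its error still
above the target.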

I compile with -march=native -ffast-math -funroll-loops -O3 and run on
a dual-processor, dual-core machine with 8GB RAM. /proc/cpuinfo says
it's a Dual-Core AMD Opteron(tm) Processor 2220 running at around
2.8 GHz. Full timings are below, with a summary here:

Overall (judged by geometric mean exec time), IRA introduces a 2.2%
regression in execution time (and a 2.7% regression in compilation
time, consistent with my previous mail). Using the CB algorithm
doesn't change this significantly.

The performance regression is mainly due to one testcase, induct,
which takes a 30% hit with IRA. If induct performed the same with IRA
as with the old allocator, the switch would be (for this benchmark)
performance-neutral. So I investigated induct and found that code from
the IRA branch compiler is already 30% slower than with trunk even
without -fira. So, is this an issue with the IRA branch itself, or has
the branch simply not been merged from trunk recently, with induct
having seen a large improvement on trunk in the meantime? I'd
appreciate it if you could enlighten me on this point.

So, other than that small question, everything seems mostly good on
the Fortran performance front.

Cheers,
FX


Comparison of execution time (best viewed in a fixed-width font):

    Benchmark   Execution time, compared to mainline
         Name        IRA     IRA-CB
    ---------   --------   --------
           ac     +1.59%     +6.80%
       aermod     +5.87%     +3.14%
          air     -0.33%     -0.83%
     capacita     +5.17%     +2.58%
      channel     +0.30%      0.00%
        doduc     -3.61%     -3.61%
      fatigue     -0.93%     -2.67%
      gas_dyn     +0.99%     +2.48%
       induct    +30.28%    +29.64%
        linpk     -1.80%     -1.57%
         mdbx     -2.19%     -2.74%
           nf     +0.74%     -0.30%
      protein     +1.30%     +1.58%
       rnflow     +0.16%     +0.22%
     test_fpu     -0.83%     -0.39%
         tfft     -0.72%     +0.14%
 ----------------------------------
 geometric mean   +2.25%     +2.16%
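
For anyone wanting to recheck the comparison, here is a minimal Python
sketch that recomputes the per-benchmark changes and the geometric
means from the "Ave Run" columns of the detailed timings in this
report (mainline and IRA with -fira). The helper names geometric_mean
and percent_change are mine, not part of any benchmark tool. Note that
the relative change between two geometric means and the plain average
of the per-benchmark percentages are different quantities and need not
agree to the last digit (the ratio of the reported geometric means,
21.27 / 20.85, comes out nearer 2.0%).

```python
import math

names = ["ac", "aermod", "air", "capacita", "channel", "doduc", "fatigue",
         "gas_dyn", "induct", "linpk", "mdbx", "nf", "protein", "rnflow",
         "test_fpu", "tfft"]

# Average run times in seconds, copied from the detailed timing tables.
mainline = [11.32, 38.82, 12.04, 78.93, 10.12, 35.21, 8.60, 10.08,
            34.38, 26.17, 16.41, 29.72, 57.54, 31.42, 18.07, 6.91]
ira = [11.50, 41.10, 12.00, 83.01, 10.15, 33.94, 8.52, 10.18,
       44.79, 25.70, 16.05, 29.94, 58.29, 31.47, 17.92, 6.86]


def geometric_mean(values):
    """Geometric mean, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def percent_change(new, old):
    """Relative change of new with respect to old, in percent."""
    return 100.0 * (new / old - 1.0)


# Per-benchmark change of IRA relative to mainline.
for name, base, new in zip(names, mainline, ira):
    print(f"{name:>10} {percent_change(new, base):+8.2f}%")

gm_main = geometric_mean(mainline)
gm_ira = geometric_mean(ira)
print(f"geometric mean: mainline {gm_main:.2f} s, IRA {gm_ira:.2f} s, "
      f"change {percent_change(gm_ira, gm_main):+.2f}%")
```

Swapping in the "Ave Run" column of the -fira-algorithm=CB table
gives the corresponding IRA-CB figures.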


Detailed timings for mainline:

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      7.36     1175251     11.32      15  0.0938
      aermod     90.03     2424785     38.82      14  0.0866
         air      6.83     1365405     12.04      19  0.0983
    capacita      2.54     1235764     78.93      23  0.0785
     channel      1.66     1254613     10.12      19  0.0885
       doduc     13.20     1416729     35.21      13  0.0870
     fatigue      6.20     1299862      8.60      12  0.0951
     gas_dyn      6.45     1269413     10.08     100  0.1026
      induct     19.57     1593762     34.38      10  0.0965
       linpk      1.43     1162116     26.17      77  0.2626
        mdbx      3.37     1192451     16.41      24  0.0939
          nf      7.65     1217536     29.72      68  0.1240
     protein     12.77     1342400     57.54      10  0.0942
      rnflow     12.81     1357019     31.42      12  0.0976
    test_fpu     11.78     1331485     18.07      24  0.0879
        tfft      1.13     1173880      6.91      24  0.0991

Geometric Mean Execution Time =      20.85 seconds


Timing for IRA branch with -fira:

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      6.06     1158971     11.50      15  0.0979
      aermod     94.22     2421896     41.10      12  0.0725
         air      7.07     1352645     12.00      23  0.0899
    capacita      2.89     1221980     83.01      25  0.1860
     channel      1.81     1241539     10.15      31  0.0879
       doduc     15.20     1404025     33.94      10  0.0628
     fatigue      6.17     1273630      8.52      14  0.0966
     gas_dyn      7.79     1256267     10.18      32  0.0920
      induct     14.28     1567935     44.79      12  0.0772
       linpk      1.44     1145546     25.70      77  0.0920
        mdbx      3.54     1181755     16.05      15  0.0588
          nf      7.73     1205207     29.94      66  0.0890
     protein     12.89     1325392     58.29      10  0.0458
      rnflow     12.45     1340531     31.47      12  0.0570
    test_fpu     12.18     1312704     17.92      58  0.0797
        tfft      1.28     1158396      6.86      32  0.0853

Geometric Mean Execution Time =      21.27 seconds


Timing for IRA branch with -fira -fira-algorithm=CB:

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      6.33     1158907     12.09      14  0.0943
      aermod     89.54     2421640     40.04      14  0.0877
         air      7.44     1352613     11.94      30  0.0841
    capacita      2.79     1221980     80.97      25  0.2601
     channel      1.75     1241411     10.12      24  0.0909
       doduc     14.12     1403417     33.94      10  0.0438
     fatigue      5.90     1273630      8.37      16  0.0884
     gas_dyn      7.01     1256267     10.33      38  0.0855
      induct     13.74     1568287     44.57      13  0.0978
       linpk      2.50     1145546     25.76      78  0.2625
        mdbx      3.53     1181979     15.96      49  0.0619
          nf      7.91     1205207     29.63      68  0.1055
     protein     12.36     1325264     58.45      10  0.0717
      rnflow     11.78     1340083     31.49      17  0.0892
    test_fpu     11.49     1311040     18.00      18  0.0615
        tfft      1.24     1158492      6.92      25  0.0807

Geometric Mean Execution Time =      21.25 seconds



-- 
FX Coudert
http://www.homepages.ucl.ac.uk/~uccafco/
