Sebastian,
Below are the results for the Polyhedron 2005 benchmarks on
x86_64-apple-darwin10, built with -O3 -ffast-math -funroll-loops using gcc
trunk at r169776: stock, with -fgraphite-identity, and with
-fgraphite-identity -ftree-loop-linear. I am surprised that
-ftree-loop-linear has essentially no impact on either run-time or
executable size. The increase in compile time on some of the benchmarks
(e.g. induct, 6.77 to 8.81) suggests that the pass is in effect.
Is this a poor combination of optimizations for -ftree-loop-linear, or
does Fortran code benefit less from that optimization?
Jack
P.S. Hopefully, once the remaining loop regressions in -fgraphite-identity
are fixed, the graphite results will improve a bit more.
Using built-in specs.
COLLECT_GCC=gcc-4
COLLECT_LTO_WRAPPER=/sw/lib/gcc4.6/libexec/gcc/x86_64-apple-darwin10.7.0/4.6.0/lto-wrapper
Target: x86_64-apple-darwin10.7.0
Configured with: ../gcc-4.6-20110202/configure --prefix=/sw
--prefix=/sw/lib/gcc4.6 --mandir=/sw/share/man --infodir=/sw/lib/gcc4.6/info
--with-build-config=bootstrap-lto --enable-stage1-languages=c,lto
--enable-languages=c,c++,fortran,lto,objc,obj-c++,java --with-gmp=/sw
--with-libiconv-prefix=/sw --with-ppl=/sw --with-cloog=/sw --with-mpc=/sw
--with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib
--program-suffix=-fsf-4.6 --enable-checking=yes --enable-cloog-backend=isl
Thread model: posix
gcc version 4.6.0 20110203 (experimental) (GCC)
command=gfortran -O3 -ffast-math -funroll-loops
Run-time

             stock   -fgraphite-identity   -fgraphite-identity
                                            -ftree-loop-linear
ac            8.80                  8.80                  8.80
aermod       17.32                 17.43                 17.43
air           5.48                  5.43                  5.44
capacita     32.45                 32.52                 32.53
channel       1.84                  1.84                  1.84
doduc        28.30                 26.28                 26.28
fatigue       8.13                  8.09                  8.09
gas_dyn       4.30                  4.32                  4.31
induct       13.07                 12.51                 12.51
linpk        15.47                 15.41                 15.41
mdbx         11.21                 11.21                 11.21
nf           29.91                 30.20                 30.01
protein      32.86                 32.21                 32.20
rnflow       23.94                 24.18                 24.17
test_fpu      8.02                  8.05                  8.04
tfft          1.87                  1.87                  1.87

Compile-time

             stock   -fgraphite-identity   -fgraphite-identity
                                            -ftree-loop-linear
ac            2.12                  2.12                  2.12
aermod       57.45                 59.22                 59.30
air           3.84                  4.37                  4.93
capacita      2.82                  2.94                  3.07
channel       1.00                  1.20                  1.33
doduc         8.57                  8.92                  8.95
fatigue       3.19                  3.17                  3.17
gas_dyn       5.38                  5.57                  5.57
induct        6.59                  6.77                  8.81
linpk         1.08                  1.33                  1.31
mdbx          2.83                  2.92                  2.92
nf            3.09                  3.08                  3.10
protein       8.51                  8.70                  8.67
rnflow        9.94                 10.09                 10.09
test_fpu      7.22                  7.24                  7.28
tfft          0.81                  0.88                  0.83

Executable size

             stock   -fgraphite-identity   -fgraphite-identity
                                            -ftree-loop-linear
ac           50976                 50976                 50976
aermod     1264832               1268928               1268928
air          73984                 82184                 82184
capacita     77976                 77976                 77976
channel      34792                 34792                 34792
doduc       193096                193096                193096
fatigue      86032                 86032                 86032
gas_dyn     119704                115608                115608
induct      174848                174848                174848
linpk        38648                 38648                 38648
mdbx         82072                 82072                 82072
nf           75912                 71816                 71816
protein     131992                131992                131992
rnflow      181080                181080                181080
test_fpu    155048                150952                150952
tfft         30760                 30760                 30760
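
For anyone who wants to quantify the effect of adding -ftree-loop-linear on
top of -fgraphite-identity, here is a minimal Python sketch. It only covers
a few rows copied from the tables above. The names pct_change and
loop_linear_delta are hypothetical helpers written for this illustration;
they are not part of the Polyhedron harness or of gcc.

```python
# Each tuple is (stock, -fgraphite-identity, -fgraphite-identity -ftree-loop-linear),
# copied from the tables above for a handful of benchmarks.
compile_time = {
    "air": (3.84, 4.37, 4.93),
    "channel": (1.00, 1.20, 1.33),
    "induct": (6.59, 6.77, 8.81),
}
run_time = {
    "air": (5.48, 5.43, 5.44),
    "channel": (1.84, 1.84, 1.84),
    "induct": (13.07, 12.51, 12.51),
}


def pct_change(base, new):
    """Percentage change of `new` relative to `base`."""
    return 100.0 * (new - base) / base


def loop_linear_delta(table):
    """Change from column 2 to column 3, i.e. what -ftree-loop-linear adds
    on top of -fgraphite-identity, in percent, rounded to one decimal."""
    return {
        name: round(pct_change(identity, linear), 1)
        for name, (_stock, identity, linear) in table.items()
    }


if __name__ == "__main__":
    print("compile-time delta (%):", loop_linear_delta(compile_time))
    print("run-time delta (%):    ", loop_linear_delta(run_time))
```

On these rows the compile-time deltas are in the double digits, while the
run-time deltas stay within a fraction of a percent.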