------- Comment #4 from jv244 at cam dot ac dot uk 2007-07-03 19:30 -------

Now I get the same timings for the hand-optimised loop and the compiled loop if I use the options:

gfortran -O3 -ffast-math -ftree-vectorize -march=native -funroll-loops -fvariable-expansion-in-unroller test.f90

Whereas -funroll-loops is quite commonly added, -fvariable-expansion-in-unroller is not. Could one have a heuristic that switches it on by default if -funroll-loops (and -ffast-math) is given?

For S31 the timings are:

> gfortran -O3 -ffast-math -ftree-vectorize -march=native -funroll-loops test.f90
> time ./a.out
real 0m6.618s

> gfortran -O3 -ffast-math -ftree-vectorize -march=native -funroll-loops -fvariable-expansion-in-unroller test.f90
> time ./a.out
real 0m4.457s

so the run time drops by about a third, i.e. the code is nearly 50% faster (6.618 / 4.457 = 1.48).

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25621
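
For readers unfamiliar with the flag, here is a rough sketch of the kind of transformation involved. It is written in C rather than Fortran, and it is not the loop from test.f90 (which is not shown in this comment). The function names sum_single and sum_split are made up for illustration. The first function is a plain reduction in which every addition waits on the previous one. The second splits the sum by hand over four accumulators, which is approximately what -fvariable-expansion-in-unroller lets the unroller do on its own. Splitting a floating-point sum reorders the additions and so can change rounding, which would explain why the optimisation only kicks in together with -ffast-math.

```c
#include <stddef.h>

/* Plain reduction: a single accumulator, so each addition depends on
   the result of the previous one. */
double sum_single(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Hand-split reduction (illustrative): four independent accumulators
   break the dependency chain; they are combined after the main loop. */
double sum_split(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main loop, unrolled by four. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    double s = (s0 + s1) + (s2 + s3);

    /* Remainder: up to three leftover elements. */
    for (; i < n; ++i)
        s += a[i];

    return s;
}
```

Both functions return the same value whenever the additions are exact (for example, small integer-valued inputs). For general floating-point data the results may differ in the last bits.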