http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25621



Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch
         Depends on|                            |53947



--- Comment #12 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 
2013-03-29 10:07:06 UTC ---

This has become much more of a vectorizer problem. In the timings below, ifort
generates code that is about 1.5 times as fast as gfortran's for routine S31 of
the initial comment. Given that this is a common dot product, it might be good
to see why that happens. Both compilers fail to notice that S32 is basically
the same code hand-unrolled.
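
For readers without the initial comment at hand, here is a minimal C sketch of the two loop shapes being compared: a plain dot product and a variant unrolled by hand with two accumulators. This is only an illustrative analogue, not the Fortran source of S31/S32 from the PR. The function names dot_simple and dot_unrolled2, the unroll factor of two, and the remainder handling are all assumptions made for the example.

```c
#include <stddef.h>

/* Plain dot product: a single accumulator, one multiply-add per iteration.
 * Hypothetical stand-in for the "default loop". */
double dot_simple(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* Hand-unrolled by two with split accumulators.
 * Hypothetical stand-in for the "hand optimized loop"; the unroll
 * factor is an assumption, not taken from the PR. */
double dot_unrolled2(const double *a, const double *b, size_t n)
{
    double sum0 = 0.0, sum1 = 0.0;
    size_t i = 0;

    /* Process elements in pairs, each pair feeding its own accumulator. */
    for (; i + 1 < n; i += 2) {
        sum0 += a[i] * b[i];
        sum1 += a[i + 1] * b[i + 1];
    }

    /* Pick up the last element when n is odd. */
    if (i < n)
        sum0 += a[i] * b[i];

    return sum0 + sum1;
}
```

Both functions compute the same sum up to floating-point reassociation, which is why one might hope a vectorizer that is allowed to reassociate (for example under -ffast-math) would treat them alike.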



Tested with the code in comment #6 (without inlining)



> gfortran -march=native -ffast-math -O3 -fno-inline PR25621.f90
> ./a.out
 default loop  0.56491500000000006
 hand optimized loop  0.74488600000000016

> ifort -xHost -O3 -fno-inline PR25621.f90
> ./a.out
 default loop  0.377943000000000
 hand optimized loop  0.579911000000000
