http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51179

--- Comment #11 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 
2012-06-30 11:26:59 UTC ---
It looks like this problem is solved in the current 4.7 and 4.8 branches. At
least on an AVX machine, the best performance found by the code in comment #4
jumps from 5.3 Gflops in 4.6 to 13.9 Gflops in 4.7/4.8. Great work.

I can't test this on Interlagos right now, but I guess it could be OK there as
well.
