[Bug tree-optimization/14741] graphite with loop blocking and interchanging doesn't optimize a matrix multiplication loop

Joost.VandeVondele at mat dot ethz.ch Sun, 17 May 2015 23:29:18 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14741


--- Comment #32 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
(In reply to Thomas Koenig from comment #31)
> If the middle end is not up to this, should we be looking at doing loop
> blocking in the Fortran front end, at least for the Matmul intrinsic?

I think this makes sense; fixing this issue in the middle end seems to be a
project on a different timescale. Ideally, matmul would expand to something
that generates good code even at, e.g., -O2 -march=native (which would require
both blocking and unrolling). At that point, the inlined code would be faster
than the runtime library for all sizes.
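
To make "blocking and unrolling" concrete, here is a minimal sketch in C of the
kind of loop nest an inlined matmul might expand to. It is an illustration
only, not what the Fortran front end or libgfortran actually emits: the names
matmul_naive and matmul_blocked, the tile size TILE = 32, the unroll factor of
4, and the square row-major layout are all assumptions chosen for brevity.

```c
#include <stddef.h>

/* Hypothetical tile size; a real choice would depend on the cache sizes. */
enum { TILE = 32 };

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Reference triple loop: c += a * b for row-major n x n matrices. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] += sum;
        }
}

/* Same operation, tiled over i, k and j, with the innermost j loop
   manually unrolled by 4 (assumed unroll factor). */
void matmul_blocked(size_t n, const double *a, const double *b, double *c)
{
    for (size_t ii = 0; ii < n; ii += TILE)
        for (size_t kk = 0; kk < n; kk += TILE)
            for (size_t jj = 0; jj < n; jj += TILE) {
                /* Clamp tile bounds so n need not be a multiple of TILE. */
                size_t i_end = min_size(ii + TILE, n);
                size_t k_end = min_size(kk + TILE, n);
                size_t j_end = min_size(jj + TILE, n);

                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = a[i * n + k];
                        size_t j = jj;
                        /* Unrolled body: four updates per iteration. */
                        for (; j + 4 <= j_end; j += 4) {
                            c[i * n + j]     += aik * b[k * n + j];
                            c[i * n + j + 1] += aik * b[k * n + j + 1];
                            c[i * n + j + 2] += aik * b[k * n + j + 2];
                            c[i * n + j + 3] += aik * b[k * n + j + 3];
                        }
                        /* Remainder when the tile width is not a multiple of 4. */
                        for (; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
}
```

Both functions accumulate into c, so the caller is expected to zero it first.
The tiled version reorders the floating-point additions, so results can differ
from the naive loop in the last bits for general inputs.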
