------- Comment #4 from tkoenig at gcc dot gnu dot org  2008-08-23 13:18 -------
Created an attachment (id=16134)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16134&action=view)
test case

Actually, the earlier test cases were a bit unfair, because
the middle end optimized away the calculation of the
values of c that were never used.

Attached is a better test case.

Timings on x86_64-unknown-linux-gnu:

 matmul:                                  12.840802   s
 subroutine without explicit interface:    0.88805580 s
 subroutine with explicit interface:       0.87605572 s
 inline with sum:                          2.0721283  s

While inlining with sum is still much better than matmul (roughly
6 times faster), a hand-rolled 3*3 subroutine is faster again by more
than a factor of 2, which I find a bit surprising.
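
For readers without the attachment, the three variants compared above
could look roughly like the sketch below. This is not the attached test
case: the module name mul3 and the subroutine name mul33 are made up for
illustration, the timing loops are omitted, and the "without explicit
interface" variant (an external subroutine) is left out. The results are
compared and printed at the end so that the compiler cannot drop the
computations as unused.

```fortran
! Illustrative sketch only; NOT the attached test case (id=16134).
! The names mul3 and mul33 are hypothetical.
module mul3
  implicit none
contains
  ! Hand-rolled 3x3 product; living in a module gives callers an
  ! explicit interface.
  subroutine mul33(a, b, c)
    real, intent(in)  :: a(3,3), b(3,3)
    real, intent(out) :: c(3,3)
    integer :: i, j
    do j = 1, 3
      do i = 1, 3
        c(i,j) = a(i,1)*b(1,j) + a(i,2)*b(2,j) + a(i,3)*b(3,j)
      end do
    end do
  end subroutine mul33
end module mul3

program compare
  use mul3
  implicit none
  real :: a(3,3), b(3,3), c1(3,3), c2(3,3), c3(3,3)
  integer :: i, j

  call random_number(a)
  call random_number(b)

  ! Variant 1: the matmul intrinsic
  c1 = matmul(a, b)

  ! Variant 2: hand-rolled subroutine with explicit interface
  call mul33(a, b, c2)

  ! Variant 3: "inline with sum", one dot product per element
  do j = 1, 3
    do i = 1, 3
      c3(i,j) = sum(a(i,:) * b(:,j))
    end do
  end do

  ! Use every result so none of the computations can be optimized away.
  if (maxval(abs(c1 - c2)) > 1.0e-5) stop 1
  if (maxval(abs(c1 - c3)) > 1.0e-5) stop 1
  print *, maxval(abs(c1 - c2)), maxval(abs(c1 - c3))
end program compare
```

To get timings like the ones above, each variant would have to be wrapped
in a repetition loop and bracketed with cpu_time or system_clock calls,
with the results still consumed afterwards.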


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37131
