Take the following code:
struct X { float array[4]; };
X a, b;
float foobar ()
{
  float s = 0;
  X c;
  for (unsigned int d = 0; d < 4; ++d)
    c.array[d] = a.array[d] * b.array[d];
  for (unsigned int d = 0; d < 4; ++d)
    s += c.array[d];
  return s;
}
With -O3 -funroll-loops -fno-ivopts -ffast-math (-fno-ivopts is needed because
some ADDR_EXPRs are not marked as invariant/constant, which I will file as a
separate bug) we get in .vars:
  c.array[0] = a.array[0] * b.array[0];
  c.array[1] = a.array[1] * b.array[1];
  c.array[2] = a.array[2] * b.array[2];
  D.1572 = a.array[3] * b.array[3];
  c.array[3] = D.1572;
  return D.1572 + c.array[0] + c.array[1] + c.array[2];
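Note that SRA could scalarize c.array here: after unrolling, every access to
it uses a constant index, so c never needs to live in memory.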
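For illustration, here is a hand-written sketch of roughly what the function
could look like if SRA ran after unrolling. The name foobar_scalarized and the
temporaries c0..c3 are made up for this example; this is not actual compiler
output.

```cpp
struct X { float array[4]; };
X a, b;

// Hypothetical result: each element of the local aggregate c has been
// replaced by its own scalar temporary, so nothing is stored to the stack.
float foobar_scalarized ()
{
  float c0 = a.array[0] * b.array[0];
  float c1 = a.array[1] * b.array[1];
  float c2 = a.array[2] * b.array[2];
  float c3 = a.array[3] * b.array[3];
  // Same summation order as in the .vars dump above.
  return c3 + c0 + c1 + c2;
}
```

With the element stores gone, the four products can stay in registers and the
remaining loads are only those from a and b.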
--
Summary: unrolling happens too late/SRA does not happen late enough
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Keywords: missed-optimization, TREE
Severity: enhancement
Priority: P2
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: pinskia at gcc dot gnu dot org
CC: gcc-bugs at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18754