It turns out that the code we generate for vectorized induction on ppc is quite terrible. The vector induction variable is advanced by a constant step in the loop (e.g., {4,4,4,4} as in the testcase below). This is the sequence gcc currently creates for altivec in order to generate the {4,4,4,4} vector:

        li 0,4
        stw 0,-48(1)
        lvewx 0,0,9
        vspltw 0,0,0

So, one thing to figure out is why we don't use the immediate form of the splat (vspltisw). The other is why this sequence ends up getting generated not only before the loop (see insns marked with "<<<1" below), but also inside the loop (see insns marked with "<<<2" below).

This is the testcase (it is basically the testcase gcc.dg/vect/no-tree-scev-cprop-vect-iv-1.c with a larger loop count to avoid complete unrolling):

int main1 (int X)
{
  int s = X;
  int i;
  for (i = 0; i < 96; i++)
    s += i;
  return s;
}

compiled as follows:

gcc -O2 -ftree-vectorize -maltivec -fno-tree-scev-cprop -S t.c

        li 0,4                  <<<1
        stw 0,-48(1)            <<<1
        ld 9,[EMAIL PROTECTED](2)
        li 0,23
        mr 11,3
        mtctr 0
        lvx 1,0,9
        addi 9,1,-48
        vor 13,1,1
        lvewx 0,0,9             <<<1
        vspltw 0,0,0            <<<1
        vadduwm 1,1,0
        .p2align 4,,15
.L2:
        li 0,4                  <<<2
        addi 9,1,-48
        vadduwm 13,13,1
        stw 0,-48(1)            <<<2
        lvewx 0,0,9             <<<2
        vspltw 0,0,0            <<<2
        vadduwm 1,1,0
        bdnz .L2

-- 
           Summary: Bad codegen for vectorized induction with altivec
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: dorit at il dot ibm dot com
 GCC build triplet: powerpc-linux
  GCC host triplet: powerpc-linux
GCC target triplet: powerpc-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31334
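
For reference, the loop shape discussed in this report can be modeled in plain C. This is only a sketch of the general vectorized-induction idea, assuming a vectorization factor of 4. It is not the code gcc emits, and the names main1_scalar, main1_vector_model, N and VF are made up for illustration. The point it tries to show is that the step vector {4,4,4,4} is loop-invariant, so it only needs to be built once, before the loop.

```c
#define N  96   /* loop count from the testcase */
#define VF 4    /* assumed vectorization factor: four 32-bit lanes */

/* Scalar loop, same computation as main1 in the testcase. */
int main1_scalar(int X)
{
    int s = X;
    for (int i = 0; i < N; i++)
        s += i;
    return s;
}

/* Lane-by-lane model of a vectorized version (hypothetical helper).
   Arrays of VF ints stand in for vector registers. */
int main1_vector_model(int X)
{
    int iv[VF]   = {0, 1, 2, 3};  /* vector induction variable */
    int step[VF] = {4, 4, 4, 4};  /* loop-invariant step, built once */
    int acc[VF]  = {0, 0, 0, 0};  /* per-lane partial sums */

    for (int n = 0; n < N / VF; n++) {
        for (int l = 0; l < VF; l++)
            acc[l] += iv[l];      /* accumulate, in the role of a vector add */
        for (int l = 0; l < VF; l++)
            iv[l] += step[l];     /* advance the induction vector by the step */
    }

    /* Reduce the partial sums and add the initial value. */
    int s = X;
    for (int l = 0; l < VF; l++)
        s += acc[l];
    return s;
}
```

Both functions return the same value for a given X. In this model nothing inside the loop needs to recreate step, which is what makes the "<<<2" insns in the listing above look redundant.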