The modified testcase from PR18767 shows a problem where a loop count variable still remains in a vectorized loop. Compiling the modified testcase with 'g++ -O2 -march=pentium4 -ftree-vectorize' produces the following code for the first loop:

        ...
        leal    -24(%ebp), %esi
        leal    -40(%ebp), %ebx
        leal    -56(%ebp), %ecx
        xorl    %eax, %eax
        xorl    %edx, %edx
.L2:
        addl    $1, %eax
        movaps  (%edx,%esi), %xmm0
        mulps   (%ebx,%edx), %xmm0
        movaps  %xmm0, (%edx,%ecx)
        addl    $16, %edx
        cmpl    $1, %eax
        jne     .L2
        ...

It looks like the compiler does not figure out that the conditional jump is never taken: %eax is already 1 after the first iteration, so the loop body executes exactly once and the counter, the compare and the jump are all redundant.

However, with 'g++ -O2 -march=pentium4 -ftree-vectorize -funroll-loops' the generated code is a lot better:

        ...
        movaps  -24(%ebp), %xmm0
        mulps   -40(%ebp), %xmm0
        movaps  %xmm0, -56(%ebp)
        ...

Uros.

--
Summary: Redundant loop count insns in simple vectorized loop
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: uros at kss-loka dot si
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18777