For the following test case, if we compile with -O3 -fprefetch-loop-arrays -march=amdfam10, the loop is versioned (for runtime alias checking) so that it can be vectorized. However, prefetches are generated in the non-vectorized version but not in the vectorized version.

void foo(int beta, float *a, float *b)
{
  int i;
  for(i=0; i<1024; i++)
    a[i] = a[i] + beta * b[i];
}

For the vectorized loop, in tree-ssa-loop-prefetch.c (idx_analyze_ref):

  if (TREE_CODE (base) == MISALIGNED_INDIRECT_REF
      || TREE_CODE (base) == ALIGN_INDIRECT_REF)
    return false;

false is returned because the references are misaligned indirect references:

  M*vect_p.18_61{misalignment: 0}
  M*vect_p.23_66{misalignment: 0}
  M*vect_p.31_74{misalignment: 0}

-- 
           Summary: No prefetch for the vectorized loop
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: changpeng dot fang at amd dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45022
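
For readers unfamiliar with loop versioning, the following is a hand-written sketch of roughly what the two loop versions described in this report look like. It is not GCC's actual output. The names foo_versioned, no_overlap and PREFETCH_AHEAD are hypothetical, the prefetch distance is an arbitrary assumption, and the "vectorized" branch is imitated by a 4-way unrolled scalar loop. The point is only to show where prefetches were observed (the scalar fallback) and where they were missing (the vectorized version).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical prefetch distance, in elements. GCC derives the real
   distance from target parameters; 16 is only an assumption here. */
#define PREFETCH_AHEAD 16

/* Runtime alias check: nonzero if [a, a+n) and [b, b+n) do not overlap. */
static int no_overlap(const float *a, const float *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(float));
    return pa + bytes <= pb || pb + bytes <= pa;
}

void foo_versioned(int beta, float *a, float *b)
{
    int i;

    if (no_overlap(a, b, 1024)) {
        /* Stand-in for the vectorized version: four elements per
           iteration. Per the report, no prefetches are emitted here. */
        for (i = 0; i < 1024; i += 4) {
            a[i]     = a[i]     + beta * b[i];
            a[i + 1] = a[i + 1] + beta * b[i + 1];
            a[i + 2] = a[i + 2] + beta * b[i + 2];
            a[i + 3] = a[i + 3] + beta * b[i + 3];
        }
    } else {
        /* Scalar fallback taken when the arrays may alias. This is the
           version in which prefetches were observed. */
        for (i = 0; i < 1024; i++) {
            if (i + PREFETCH_AHEAD < 1024) {
                __builtin_prefetch(&a[i + PREFETCH_AHEAD], 1); /* for write */
                __builtin_prefetch(&b[i + PREFETCH_AHEAD], 0); /* for read */
            }
            a[i] = a[i] + beta * b[i];
        }
    }
}
```

Calling foo_versioned with two separate arrays takes the first branch, while a call such as foo_versioned(2, buf, buf + 1) fails the overlap check and falls back to the scalar loop. Both branches compute the same result as foo; only the prefetching differs.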