http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49955
Summary: Fails to do partial basic-block SLP
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: rgue...@gcc.gnu.org
CC: i...@gcc.gnu.org

410.bwaves in shell_lam.f has a lot of arrays with inner dimension 5
operated on in loops that are either unrolled by early unrolling or
manually unrolled in source. All but one loop in shell_lam.f are not
vectorized. One reason is that basic-block vectorization gives up if it
sees interleaving size that is not a multiple of a supported
vectorization factor.

Testcase:

double a[1024], b[1024];
void foo (int k)
{
  int j;
  a[k*5 + 0] = a[k*5 + 0] + b[k*5 + 0];
  a[k*5 + 1] = a[k*5 + 1] + b[k*5 + 1];
  a[k*5 + 2] = a[k*5 + 2] + b[k*5 + 2];
  a[k*5 + 3] = a[k*5 + 3] + b[k*5 + 3];
  a[k*5 + 4] = a[k*5 + 4] + b[k*5 + 4];
}

taken from the last loop in shell_lam.f which has its innermost loop
unrolled (and loop SLP refuses to vectorize as well, see separate bug).

For the above we get:

t.c:6: note: === vect_analyze_data_ref_accesses ===
t.c:6: note: Detected interleaving of size 5
t.c:6: note: Detected interleaving of size 5
t.c:6: note: Detected interleaving of size 5
t.c:6: note: Vectorizing an unaligned access.
t.c:6: note: Vectorizing an unaligned access.
t.c:6: note: Vectorizing an unaligned access.
t.c:6: note: === vect_analyze_slp ===
t.c:6: note: get vectype with 2 units of type double
t.c:6: note: vectype: vector(2) double
t.c:6: note: Build SLP failed: unrolling required in basic block SLP
t.c:6: note: Failed to SLP the basic block.
t.c:6: note: not vectorized: failed to find SLP opportunities in basic block.

but of course we could simply vectorize with an interleaving size of 4
leaving the excess operations unvectorized (with optimization opportunity
if we can pick a properly sized and aligned set of accesses).
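
As a rough illustration of the result asked for above, here is a hand-written
sketch of foo with the first four accesses done as two vector(2) double
additions and the fifth left scalar. This is not vectorizer output: the name
foo_partial, the v2df typedef and the use of memcpy for the unaligned loads
and stores are all assumptions made for the example, written with the GCC
vector extension.

```c
#include <string.h>

double a[1024], b[1024];

/* Two-lane double vector via the GCC vector extension (assumed here to
   stand in for the "vector(2) double" vectype from the dump).  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Hypothetical hand-vectorized variant of foo: group of 4 vectorized,
   one excess operation left scalar.  */
void foo_partial (int k)
{
  double *pa = &a[k*5];
  const double *pb = &b[k*5];
  int i;

  /* Elements 0..3: two 2-wide additions.  memcpy is used so that the
     loads and stores are valid for unaligned addresses.  */
  for (i = 0; i < 4; i += 2)
    {
      v2df va, vb;
      memcpy (&va, pa + i, sizeof va);
      memcpy (&vb, pb + i, sizeof vb);
      va = va + vb;
      memcpy (pa + i, &va, sizeof va);
    }

  /* Element 4: the excess operation stays scalar.  */
  pa[4] = pa[4] + pb[4];
}
```

Which four of the five accesses go into the vector group is a free choice.
For odd k, for instance, k*5 + 1 is even, so starting the group at element 1
and leaving element 0 scalar would give 16-byte aligned vector accesses,
provided a and b themselves are 16-byte aligned.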