https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88978
            Bug ID: 88978
           Summary: Failed outer loop vectorization with grouped accesses
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

We fail to vectorize outer loops when there are grouped accesses in the
inner loop:

int a[1024];
int b[1024][1024];

void foo ()
{
  for (int i = 0; i < 512; ++i)
    {
      int a1 = a[2*i];
      int a2 = a[2*i+1];
      for (int j = 0; j < 1024; ++j)
        {
          b[j][2*i] = a1;
          b[j][2*i+1] = a2;
        }
    }
}

This is mostly because we cannot do SLP here (for implementation reasons).

We vectorize the following just fine, applying interleaving to the
outer loop accesses:

int a[1024];
int b[1024][1024];

void foo ()
{
  for (int i = 0; i < 512; ++i)
    {
      int a1 = a[2*i];
      int a2 = a[2*i+1];
      for (int j = 0; j < 1024; ++j)
        b[j][i] = a1+a2;
    }
}

The guard in question is the following. It is premature (it fires before
SLP would even be tried) and somewhat inaccurate, since what fails is
grouped accesses in the inner loop when doing outer loop vectorization,
not grouped accesses in an outer loop.

static bool
vect_analyze_data_ref_access (dr_vec_info *dr_info)
{
...
      if (loop && nested_in_vect_loop_p (loop, stmt_info))
        {
          if (dump_enabled_p ())
            dump_printf_loc (MSG_NOTE, vect_location,
                             "grouped access in outer loop.\n");
          return false;
        }
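For anyone wanting to reproduce this, here is a minimal sketch that wraps the first testcase with a scalar check of its result. The helpers init_a and count_mismatches are hypothetical names added for illustration and are not part of the report. The check relies on the fact that the loop nest copies a[] into every row of b[], so a correct build must leave b[j][k] == a[k] for all j and k.

```c
int a[1024];
int b[1024][1024];

/* First testcase from the report: two adjacent (grouped) stores in the
   inner loop, indexed by the outer loop variable i.  */
void foo (void)
{
  for (int i = 0; i < 512; ++i)
    {
      int a1 = a[2*i];
      int a2 = a[2*i+1];
      for (int j = 0; j < 1024; ++j)
        {
          b[j][2*i] = a1;
          b[j][2*i+1] = a2;
        }
    }
}

/* Hypothetical helper (not in the report): fill a[] with distinct,
   nonzero values so that a stale or misplaced store is detectable.  */
void init_a (void)
{
  for (int k = 0; k < 1024; ++k)
    a[k] = 3 * k + 1;
}

/* Hypothetical helper (not in the report): count the elements of b[]
   that differ from the expected value a[k] for column k.  */
int count_mismatches (void)
{
  int bad = 0;
  for (int j = 0; j < 1024; ++j)
    for (int k = 0; k < 1024; ++k)
      if (b[j][k] != a[k])
        ++bad;
  return bad;
}
```

Calling init_a (), then foo (), then count_mismatches () from a small main should yield 0. The report does not state which options were used; building with something like -O3 -fopt-info-vec-missed (or -fdump-tree-vect-details) is one assumed way to see whether the outer loop was rejected, and the exact diagnostic text may differ between GCC versions.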