https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61680
--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---

      /* When we perform grouped accesses and perform implicit CSE
         by detecting equal accesses and doing disambiguation with
         runtime alias tests like for
            .. = a[i];
            .. = a[i+1];
            a[i] = ..;
            a[i+1] = ..;
            *p = ..;
            .. = a[i];
            .. = a[i+1];
         where we will end up loading { a[i], a[i+1] } once, make
         sure that inserting group loads before the first load and
         stores after the last store will do the right thing.  */
      if ((STMT_VINFO_GROUPED_ACCESS (stmtinfo_a)
           && GROUP_SAME_DR_STMT (stmtinfo_a))
          || (STMT_VINFO_GROUPED_ACCESS (stmtinfo_b)
              && GROUP_SAME_DR_STMT (stmtinfo_b)))
        {
          gimple earlier_stmt;
          earlier_stmt = get_earlier_stmt (DR_STMT (dra), DR_STMT (drb));
          if (DR_IS_WRITE
                (STMT_VINFO_DATA_REF (vinfo_for_stmt (earlier_stmt))))
            {
              if (dump_enabled_p ())
                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                                 "READ_WRITE dependence in interleaving."
                                 "\n");
              return true;
            }
        }

is supposed to catch this kind of issue ... but it's very simple-minded
and GROUP_SAME_DR_STMT is not set (we don't have redundant loads to CSE).

What is interesting is that we don't CSE the loads from
w[i_26][{0,1}].  That would have likely "fixed" this as well ...

Removing the GROUP_SAME_DR_STMT tests fixes the testcase.
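
To make the effect of dropping the GROUP_SAME_DR_STMT tests concrete, here is a toy model of the check in plain C. It is only a sketch and not GCC code: struct toy_ref, toy_blocks_vectorization and toy_store_then_load are hypothetical names, and the flags merely stand in for STMT_VINFO_GROUPED_ACCESS, GROUP_SAME_DR_STMT and DR_IS_WRITE. The scenario (a grouped store followed by a grouped load, with no same-DR flag on either) is an assumed simplification, not the actual testcase from this PR.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-statement info the check looks at.  */
struct toy_ref
{
  int stmt_order;     /* position of the statement in the loop body */
  bool is_write;      /* models DR_IS_WRITE */
  bool grouped;       /* models STMT_VINFO_GROUPED_ACCESS */
  bool same_dr_stmt;  /* models GROUP_SAME_DR_STMT being set */
};

/* Return true if the pair has to be treated as a dependence that
   blocks vectorization.  With REQUIRE_SAME_DR the guard mirrors the
   quoted condition; without it the guard only asks for a grouped
   access, which models removing the GROUP_SAME_DR_STMT tests.  */
static bool
toy_blocks_vectorization (const struct toy_ref *a, const struct toy_ref *b,
                          bool require_same_dr)
{
  bool a_hit = a->grouped && (!require_same_dr || a->same_dr_stmt);
  bool b_hit = b->grouped && (!require_same_dr || b->same_dr_stmt);
  if (!(a_hit || b_hit))
    return false;

  /* Models get_earlier_stmt: pick the statement that comes first.  */
  const struct toy_ref *earlier = a->stmt_order < b->stmt_order ? a : b;
  return earlier->is_write;
}

/* Assumed scenario: grouped store first, grouped load second, and no
   redundant loads, so same_dr_stmt is false on both.  */
static bool
toy_store_then_load (bool require_same_dr)
{
  struct toy_ref store = { 0, true, true, false };
  struct toy_ref load = { 1, false, true, false };
  return toy_blocks_vectorization (&store, &load, require_same_dr);
}
```

In this model toy_store_then_load (true) returns false, so the dependence slips through the guard, while toy_store_then_load (false) returns true and the pair is rejected.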