https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93946
--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, now looking myself.  RTL expansion creates

(insn 8 7 9 2 (set (mem/j:SI (reg/v/f:SI 47 [ bv ]) [1 bv_3(D)->b.u.f+0 S4 A32])
        (reg:SI 49)) "t.c":12:13 -1
     (nil))
(insn 9 8 10 2 (set (mem/j:SI (reg/v/f:SI 48 [ ptr ]) [1 MEM[(struct aa *)ptr_1(D)].a.u.i+0 S4 A32])
        (const_int 0 [0])) "t.c":13:12 -1
     (nil))
(insn 10 9 11 2 (set (mem/j:SI (plus:SI (reg/v/f:SI 48 [ ptr ])
                (const_int 4 [0x4])) [1 MEM[(struct aa *)ptr_1(D)].a.u.i+4 S4 A32])
        (const_int 0 [0])) "t.c":13:12 -1
     (nil))
(insn 11 10 12 2 (set (mem/j:SI (reg/v/f:SI 48 [ ptr ]) [1 MEM[(struct bb *)ptr_1(D)].b.u.f+0 S4 A32])
        (const_int 0 [0])) "t.c":14:12 -1
     (nil))
(insn 12 11 13 2 (set (reg:SI 51)
        (mem/j:SI (reg/v/f:SI 47 [ bv ]) [1 bv_3(D)->b.u.f+0 S4 A32])) "t.c":15:17 -1
     (nil))

where insn 11 is the important one.  Somehow on nios2 the CSE1 removes
that store.

deferring deletion of insn with uid = 11.

and we end up with

(insn 8 7 9 2 (set (mem/j:SI (reg/v/f:SI 47 [ bv ]) [1 bv_3(D)->b.u.f+0 S4 A32])
        (reg:SI 49)) "t.c":12:13 5 {movsi_internal}
     (expr_list:REG_DEAD (reg:SI 49)
        (nil)))
(insn 9 8 10 2 (set (mem/j:SI (reg/v/f:SI 48 [ ptr ]) [1 MEM[(struct aa *)ptr_1(D)].a.u.i+0 S4 A32])
        (const_int 0 [0])) "t.c":13:12 5 {movsi_internal}
     (nil))
(insn 10 9 12 2 (set (mem/j:SI (plus:SI (reg/v/f:SI 48 [ ptr ])
                (const_int 4 [0x4])) [1 MEM[(struct aa *)ptr_1(D)].a.u.i+4 S4 A32])
        (const_int 0 [0])) "t.c":13:12 5 {movsi_internal}
     (nil))
(insn 12 10 13 2 (set (reg:SI 51 [ bv_3(D)->b.u.f ])
        (mem/j:SI (reg/v/f:SI 47 [ bv ]) [1 bv_3(D)->b.u.f+0 S4 A32])) "t.c":15:17 5 {movsi_internal}
     (expr_list:REG_DEAD (reg/v/f:SI 47 [ bv ])
        (nil)))

where there indeed is no scheduling barrier anymore.  I didn't know CSE
removes stores or why this only triggers on nios2, it looks like some
DF thing?

Backtrace of the "DSE":

#0  delete_insn (insn=0x7ffff6bc3400)
    at /space/rguenther/src/gcc/gcc/cfgrtl.c:135
#1  0x0000000000b0bfa5 in delete_insn_and_edges (insn=0x7ffff6bc3400)
    at /space/rguenther/src/gcc/gcc/cfgrtl.c:237
#2  0x0000000001a9d8eb in cse_insn (insn=0x7ffff6bc3400)
    at /space/rguenther/src/gcc/gcc/cse.c:5571
#3  0x0000000001aa0b76 in cse_extended_basic_block (ebb_data=0x7fffffffdc90)
    at /space/rguenther/src/gcc/gcc/cse.c:6614
#4  0x0000000001aa10a5 in cse_main (f=0x7ffff6cce310, nregs=52)
    at /space/rguenther/src/gcc/gcc/cse.c:6793

that's

      /* Similarly for no-op moves.  */
      else if (noop_insn)
        {
          if (cfun->can_throw_non_call_exceptions && can_throw_internal (insn))
            cse_cfg_altered = true;
          cse_cfg_altered |= delete_insn_and_edges (insn);
          /* No more processing for this set.  */
          sets[i].rtl = 0;

so apparently it does redundant store removal as well...

          /* Similarly, lots of targets don't allow no-op
             (set (mem x) (mem x)) moves.  Even (set (reg x) (reg x))
             might be impossible for certain registers (like CC registers).  */
          else if (n_sets == 1
                   && !CALL_P (insn)
                   && (MEM_P (trial) || REG_P (trial))
                   && rtx_equal_p (trial, dest)
                   && !side_effects_p (dest)
                   && (cfun->can_delete_dead_exceptions
                       || insn_nothrow_p (insn)))
            {
              SET_SRC (sets[i].rtl) = trial;
              noop_insn = true;
              break;
            }

where

(gdb) p debug_rtx (insn)
(insn 11 10 12 2 (set (mem/j:SI (reg/v/f:SI 48 [ ptr ]) [1 MEM[(struct bb *)ptr_1(D)].b.u.f+0 S4 A32])
        (const_int 0 [0])) "t.c":14:12 5 {movsi_internal}
     (expr_list:REG_DEAD (reg/v/f:SI 48 [ ptr ])
        (nil)))
(gdb) p debug_rtx (trial)
(mem/j:SI (reg/v/f:SI 48 [ ptr ]) [1 MEM[(struct bb *)ptr_1(D)].b.u.f+0 S4 A32])
$4 = void
(gdb) p debug_rtx (dest)
(mem/j:SI (reg/v/f:SI 48 [ ptr ]) [1 MEM[(struct bb *)ptr_1(D)].b.u.f+0 S4 A32])
$6 = void

so it might be that the trigger is a target where sizeof(long long) = 2 *
sizeof(long) _and_ we split stores to the larger type (I tried to pick a
set of types where sizeof is the same but alias-sets are different -
otherwise I'd have to cater for big vs. little-endian).
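
For reference, a minimal sketch of source that could expand to the five insns above. This is a reconstruction from the MEM_EXPRs in the dump, not necessarily the actual t.c: the names struct aa, struct bb, a.u.i, b.u.f, bv and ptr are taken from the RTL, while union U, union C, foo, run_once and the exact field layout are assumptions. It assumes a target such as nios2 where long is 4 bytes and long long is 8, so that the store to a.u.i is split into two SImode stores.

```c
/* Hypothetical reconstruction, not the PR's actual t.c.  */
union U { long long i; long f; };  /* assumed layout: i is 2x f on nios2 */
struct a { union U u; };
struct aa { struct a a; };
struct b { union U u; };
struct bb { struct b b; };

/* Assumed container so that both struct types share one object.  */
union C { struct aa aa; struct bb bb; };

__attribute__((noinline)) long
foo (struct bb *bv, void *ptr)
{
  struct aa *a = ptr;
  struct bb *b = ptr;
  bv->b.u.f = 1;     /* insn 8: store through bv */
  a->a.u.i = 0;      /* insns 9 and 10: split 8-byte store on a 32-bit target */
  b->b.u.f = 0;      /* insn 11: the store that CSE1 deletes as a no-op */
  return bv->b.u.f;  /* insn 12: load that must not move above the zero stores */
}

/* Calls foo with bv and ptr pointing at the same object.  */
long
run_once (void)
{
  union C v;
  return foo (&v.bb, &v);
}
```

With bv and ptr pointing at the same object, run_once should return 0. On a target where long and long long have the same size the store to a.u.i is not split, so this sketch would not be expected to hit the same path there.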