https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591
--- Comment #21 from Kewen Lin <linkw at gcc dot gnu.org> ---
For optimized IR:

  a$raw$3_220 = D.39813.rawD.30221[3];
  vect_a_raw_4_70.539_1584 = MEM <vector(4) short intD.20> [(short intD.20 *)&D.39813 + 8B];
  _1640 = a$raw$0_221 & 255;
  _1649 = a$raw$1_74 & 255;
  _1658 = a$raw$2_264 & 255;
  _52 = a$raw$3_220 & 255;
  vD.39776 = bD.39739;    // involved decl1
  MEM <unsigned charD.25[16]> [(charD.5 * {ref-all})&b00D.39742] = MEM <unsigned charD.25[16]> [(charD.5 * {ref-all})&vD.39776];
  vD.39776 ={v} {CLOBBER(eol)};
  vD.39779 = b00D.39742;  // involved decl2
  raw_u_1614 = vD.39779.rawD.30221[0];
  _1615 = raw_u_1614 << 8;
  vD.39779.rawD.30221[0] = _1615;
  raw_u_1622 = vD.39779.rawD.30221[1];
  _1623 = raw_u_1622 << 8;
  vD.39779.rawD.30221[1] = _1623;
  ...

Partition 1: size 16 align 16
        D.39819 vD.39749 vD.39756 vD.39764 aD.39773 vD.39779 vD.39735 vD.39736
        aD.39630 vD.39636 aD.39640 vD.39753 vD.39761 vD.39776 vD.39782

vD.39776 and vD.39779 are coalesced.

It's expanded as:

vD.39776 = bD.39739;

(insn 383 382 384 (set (reg:V2DI 616)
        (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 48 [0x30])) [7 MEM[(struct Vec128D.30433 *)_1274]+0 S16 A128])) -1
     (nil))

(insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [7 MEM[(struct Vec128D.30433 *)_10]+0 S16 A128])
        (reg:V2DI 616)) -1
     (nil))

MEM <unsigned charD.25[16]> [(charD.5 * {ref-all})&b00D.39742] = MEM <unsigned charD.25[16]> [(charD.5 * {ref-all})&vD.39776];

(insn 385 384 386 (set (reg:V2DI 617)
        (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [0 MEM <unsigned charD.25[16]> [(charD.5 * {ref-all})_10]+0 S16 A128])) "test.cc":14:19 -1
     (nil))

(insn 386 385 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 80 [0x50])) [0 MEM <unsigned charD.25[16]> [(charD.5 * {ref-all})_1277]+0 S16 A128])
        (reg:V2DI 617)) "test.cc":14:19 -1
     (nil))

vD.39776 ={v} {CLOBBER(eol)};
vD.39779 = b00D.39742;

(insn 387 386 388 (set (reg:V2DI 618)
        (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 80 [0x50])) [5 MEM[(struct Vec128D.30212 *)_1277]+0 S16 A128])) -1
     (nil))

(insn 388 387 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [5 MEM[(struct Vec128D.30212 *)_10]+0 S16 A128])
        (reg:V2DI 618)) -1
     (nil))

raw_u_1614 = vD.39779.rawD.30221[0];
_1615 = raw_u_1614 << 8;
vD.39779.rawD.30221[0] = _1615;

;; v.raw[0] = _1615;

(insn 389 388 390 (set (reg:HI 619)
        (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [4 MEM[(struct Vec128D.30212 *)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
     (nil))

(insn 390 389 391 (set (reg:SI 620)
        (ashift:SI (subreg:SI (reg:HI 619) 0)
            (const_int 8 [0x8]))) "test.cc":218:14 -1
     (nil))

(insn 391 390 0 (set (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
                (const_int 16 [0x10])) [4 MEM[(struct Vec128D.30212 *)_10].rawD.30221[0]+0 S2 A128])
        (subreg:HI (reg:SI 620) 2)) "test.cc":218:14 -1
     (nil))

=========

Later, insn 388 gets removed (along with insns 387 and 385), since the value
it stores is exactly the same as what insn 384 stores. The scheduler then
does not see any dependence between insn 389 and insn 384, which results in
an unexpected move.

Hi Richi, do you think this is an exact duplicate of the known -fstack-reuse
issue?
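
For readers without the testcase at hand, the source-level shape behind the dump can be sketched roughly as below. This is only an illustration under assumptions, not the actual test.cc from this PR: the names Vec128, BitCast, ShiftLeft8 and lane0_after_shift are hypothetical stand-ins, chosen to mirror the two differently typed 16-byte temporaries, the ref-all byte copy between them, and the per-lane "<< 8" seen in the IR. Whether any given compiler version coalesces these temporaries or miscompiles this exact code is not claimed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the Vec128 types in the dump (not the real test.cc).
template <typename T>
struct alignas(16) Vec128 {
  T raw[16 / sizeof(T)];
};

// Bit-cast via a byte copy, similar to the ref-all char[16] MEM copy in the IR.
// The by-value parameter 'v' is a short-lived 16-byte stack temporary.
template <typename To, typename From>
To BitCast(From v) {
  static_assert(sizeof(To) == sizeof(From), "size mismatch");
  To out;
  std::memcpy(&out, &v, sizeof(out));
  return out;
}

// Shift each 16-bit lane left by 8, like the raw_u << 8 statements.
// The by-value parameter is a second short-lived temporary of a different type.
Vec128<uint16_t> ShiftLeft8(Vec128<uint16_t> v) {
  for (auto &lane : v.raw) lane = static_cast<uint16_t>(lane << 8);
  return v;
}

// Store a signed lane, reinterpret as unsigned lanes, shift, read lane 0 back.
uint16_t lane0_after_shift(int16_t x) {
  Vec128<int16_t> b{};
  b.raw[0] = x;
  Vec128<uint16_t> b00 = BitCast<Vec128<uint16_t>>(b);
  return ShiftLeft8(b00).raw[0];
}
```

The point of the sketch is only that the two temporaries have disjoint lifetimes and different struct types, so they are candidates for sharing one stack slot once their scopes end.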