https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116783
            Bug ID: 116783
           Summary: [14/15 Regression] Wrong code at -O2 with late pair
                    fusion pass (wrong alias analysis)
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

Created attachment 59150
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59150&action=edit
Executable reduced testcase for the testsuite

The attached executable reproducer (exec.cc) is reduced from a Debian package
(kf6-ktexttemplate) which is getting miscompiled on AArch64 (see
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1080974).

The problem can be reproduced on aarch64 as follows:

$ g++ exec.cc -O2 -fstack-protector-strong -fno-late-combine-instructions -mno-late-ldp-fusion
$ ./a.out
$ g++ exec.cc -O2 -fstack-protector-strong -fno-late-combine-instructions
$ ./a.out
Aborted

Note that late-combine hides the problem on the trunk, such that
-fno-late-combine-instructions isn't needed to reproduce the problem with GCC
14 (but is on trunk).

Looking at what's going on in late ldp_fusion, I see only a single pair getting
formed:

fusing pair [L=1] (92,94), base=19, hazards: (-,106), move_range: (94,94)

and we have the following RTL fragment:

  174: x1:DI=sp:DI+0x200
   92: v30:V4SI=[x1:DI-0xb8]
      REG_DEAD x1:DI
  176: x1:DI=x19:DI
  106: [x1:DI]=const_vector
      REG_DEAD x1:DI
  177: x1:DI=sp:DI+0x200
   94: v29:V4SI=[x19:DI+0x10]
      REG_EQUIV [x19:DI+0x10]

now looking back to the last assignment to x19, we have:

x19:DI=sp:DI+0x148

so substituting through, we have:

x1 - 0xb8 = sp + 0x200 - 0xb8 = sp + 0x148 = x19

i.e. the load i92 is to the exact same address as the store i106, yet we fail
to detect this aliasing hazard (in the forward direction) and thus form the
load pair at i94, incorrectly re-ordering the load (i92) over the store.
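As a sanity check on the substitution above, here is a minimal sketch that redoes the address arithmetic at compile time. It is not GCC code: the `Access` struct and the `overlaps` helper are hypothetical names introduced only for illustration, and the 16-byte access size is taken from the V4SI mode of the loads and store in the fragment.

```cpp
#include <cstdint>

// Hypothetical helper type, not part of GCC: a memory access described
// as a constant byte offset from sp plus an access size.
struct Access {
    std::int64_t sp_offset;  // address = sp + sp_offset
    std::int64_t size;       // access size in bytes
};

// Two accesses overlap if their half-open byte ranges intersect.
constexpr bool overlaps(Access a, Access b) {
    return a.sp_offset < b.sp_offset + b.size
        && b.sp_offset < a.sp_offset + a.size;
}

// Register values from the RTL fragment, expressed relative to sp.
constexpr std::int64_t x1_at_i92 = 0x200;  // insn 174: x1 = sp + 0x200
constexpr std::int64_t x19 = 0x148;        // x19 = sp + 0x148

// V4SI accesses are 16 bytes wide.
constexpr Access load_i92{x1_at_i92 - 0xb8, 16};  // v30 = [x1 - 0xb8]
constexpr Access store_i106{x19, 16};             // x1 = x19; [x1] = const_vector
constexpr Access load_i94{x19 + 0x10, 16};        // v29 = [x19 + 0x10]

// The load i92 and the store i106 hit the same address, so moving i92
// down past i106 changes the value loaded.
static_assert(load_i92.sp_offset == store_i106.sp_offset, "same address");
static_assert(overlaps(load_i92, store_i106), "i92 aliases i106");

// The load i94 covers the next 16 bytes and does not touch the store.
static_assert(!overlaps(load_i94, store_i106), "i94 is disjoint from i106");
```

Under these assumptions the two loads are adjacent (sp+0x148 and sp+0x158), which is consistent with them being chosen as a pair, while only i92 conflicts with the intervening store.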

The problem seems to be not necessarily in pair-fusion.cc itself, however,
since memory_modified_in_insn_p fails to return true for the following
arguments:

(rr) pr mem
(mem/c:V4SI (plus:DI (reg:DI 1 x1 [195])
        (const_int -184 [0xffffffffffffff48])) [0 D.5008.d+0 S16 A64])
(rr) pr insn
(insn 106 176 177 5 (set (mem/c:V4SI (reg:DI 1 x1 [198]) [0 MEM <unsigned char[25]> [(struct Private *)&D.5008]+0 S16 A64])
        (const_vector:V4SI [
                (const_int 0 [0]) repeated x4
            ])) "exec.cc":20:13 discrim 1 1270 {*aarch64_simd_movv4si}
     (expr_list:REG_DEAD (reg:DI 1 x1 [198])
        (nil)))

where (naively) it looks like the MEM_EXPRs alias, so I would have expected the
alias analysis machinery to figure this out.

I'll try to dig into why memory_modified_in_insn_p ends up returning false
here.