https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116783
Bug ID: 116783
Summary: [14/15 Regression] Wrong code at -O2 with late pair fusion pass (wrong alias analysis)
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: acoplan at gcc dot gnu.org
Target Milestone: ---
Created attachment 59150
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59150&action=edit
Executable reduced testcase for the testsuite
The attached executable reproducer (exec.cc) is reduced from a Debian package
(kf6-ktexttemplate) which is getting miscompiled on AArch64 (see
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1080974).
The problem can be reproduced on aarch64 as follows:
$ g++ exec.cc -O2 -fstack-protector-strong -fno-late-combine-instructions -mno-late-ldp-fusion
$ ./a.out
$ g++ exec.cc -O2 -fstack-protector-strong -fno-late-combine-instructions
$ ./a.out
Aborted
Note that late-combine hides the problem on trunk, so
-fno-late-combine-instructions is needed to reproduce it there; the flag is
not needed to reproduce the problem with GCC 14.
Looking at what's going on in late ldp_fusion, I see only a single pair getting
formed:
fusing pair [L=1] (92,94), base=19, hazards: (-,106), move_range: (94,94)
and we have the following RTL fragment:
174: x1:DI=sp:DI+0x200
92: v30:V4SI=[x1:DI-0xb8]
REG_DEAD x1:DI
176: x1:DI=x19:DI
106: [x1:DI]=const_vector
REG_DEAD x1:DI
177: x1:DI=sp:DI+0x200
94: v29:V4SI=[x19:DI+0x10]
REG_EQUIV [x19:DI+0x10]
Looking back to the last assignment to x19, we have:
x19:DI=sp:DI+0x148
so substituting through, we have:
x1 - 0xb8 = sp + 0x200 - 0xb8 = sp + 0x148 = x19
i.e. the load i92 is to the exact same address as the store i106, yet we fail
to detect this aliasing hazard (in the forward direction) and thus form the
load pair at i94, incorrectly re-ordering the load (i92) over the store.
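
As a sanity check on the arithmetic above, here is a minimal C++ sketch that
models the three accesses as sp-relative byte ranges. The Access struct and
the overlaps helper are hypothetical names introduced only for illustration
(they are not part of GCC), and the 16-byte size is taken from the S16
annotation on the V4SI MEMs. It only restates the offsets from the RTL
fragment; it says nothing about how GCC's alias analysis is implemented.

```cpp
#include <cstdint>

// Hypothetical model: a memory access is a byte offset from sp plus a size.
struct Access {
    std::int64_t sp_offset;
    std::int64_t size;
};

// Two half-open byte ranges [offset, offset + size) overlap iff each one
// starts before the other ends.
constexpr bool overlaps(Access a, Access b) {
    return a.sp_offset < b.sp_offset + b.size
        && b.sp_offset < a.sp_offset + a.size;
}

constexpr std::int64_t x1_at_i174 = 0x200;  // i174: x1 = sp + 0x200
constexpr std::int64_t x19        = 0x148;  // x19 = sp + 0x148

constexpr Access load_i92{x1_at_i174 - 0xb8, 16};  // i92:  v30 = [x1 - 0xb8]
constexpr Access store_i106{x19, 16};              // i176: x1 = x19; i106: [x1] = const_vector
constexpr Access load_i94{x19 + 0x10, 16};         // i94:  v29 = [x19 + 0x10]

// The load i92 and the store i106 touch exactly the same 16 bytes, so
// moving i92 down past i106 changes the value that is loaded.
static_assert(load_i92.sp_offset == store_i106.sp_offset, "same address");
static_assert(overlaps(load_i92, store_i106), "i92 aliases i106");

// The load i94 starts where the store ends, so on its own it does not
// conflict with i106.
static_assert(!overlaps(load_i94, store_i106), "i94 is disjoint from i106");
```

Under this model the only conflict with the store is i92, which is consistent
with the pair being safe to form only if i94 is moved up to i92 rather than
i92 being moved down past i106.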
The problem does not necessarily seem to be in pair-fusion.cc itself, however,
since memory_modified_in_insn_p fails to return true for the following
arguments:
(rr) pr mem
(mem/c:V4SI (plus:DI (reg:DI 1 x1 [195])
(const_int -184 [0xffffffffffffff48])) [0 D.5008.d+0 S16 A64])
(rr) pr insn
(insn 106 176 177 5 (set (mem/c:V4SI (reg:DI 1 x1 [198]) [0 MEM <unsigned char[25]> [(struct Private *)&D.5008]+0 S16 A64])
(const_vector:V4SI [
(const_int 0 [0]) repeated x4
])) "exec.cc":20:13 discrim 1 1270 {*aarch64_simd_movv4si}
(expr_list:REG_DEAD (reg:DI 1 x1 [198])
(nil)))
where (naively) it looks like the MEM_EXPRs alias, so I would have expected the
alias analysis machinery to figure this out.
I'll try to dig into why memory_modified_in_insn_p ends up returning false
here.