https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100048
Bug ID: 100048
Summary: [10/11 Regression] Wrongful CSE'ing of SVE predicates.
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Target: aarch64-*

The following testcase

    #include "arm_sve.h"

    void foo(svfloat16_t in, float16_t *dst) {
      const svbool_t pg_q0 = svdupq_n_b16(1, 0, 1, 0, 0, 0, 0, 0);
      const svbool_t pg_f0 = svdupq_n_b16(1, 0, 0, 0, 0, 0, 0, 0);
      dst[0] = svaddv_f16(pg_f0, in);
      dst[1] = svaddv_f16(pg_q0, in);
    }

generates the right code at -O1 with -march=armv8-a+sve but generates wrong
code at -O2.

From this these expands are created

    (insn 22 21 23 2 (set (reg:VNx8BI 100)
            (subreg:VNx8BI (reg:VNx2BI 103) 0))
         (expr_list:REG_EQUAL (const_vector:VNx8BI [
                    (const_int 1 [0x1])
                    (const_int 0 [0])
                    (const_int 1 [0x1])
                    (const_int 0 [0]) repeated x5
                ])
            (nil)))

and

    (insn 15 14 16 2 (set (reg:VNx8BI 96)
            (subreg:VNx8BI (reg:VNx2BI 99) 0))
         (expr_list:REG_EQUAL (const_vector:VNx8BI [
                    (const_int 1 [0x1])
                    (const_int 0 [0]) repeated x7
                ])
            (nil)))

where the subregs are paradoxical. These incorrect paradoxical subregs cause
CSE to think these two predicates are the same. As such it CSEs them away into

    foo:
            pfalse  p2.b
            ptrue   p1.d, all
            trn1    p1.d, p1.d, p2.d
            faddv   h1, p1, z0.h
            str     h1, [x0]
            ptrue   p0.s, all
            trn1    p0.d, p0.d, p2.d
            faddv   h0, p0, z0.h
            str     h0, [x0, 2]
            ret

instead of the expected

    foo:
            pfalse  p2.b
            ptrue   p1.d, all
            ptrue   p0.s, all
            trn1    p1.s, p1.s, p2.s
            trn1    p0.s, p0.s, p2.s
            faddv   h1, p1, z0.h
            faddv   h0, p0, z0.h
            str     h1, [x0]
            str     h0, [x0, 2]
            ret
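For checking the generated code at run time, a portable reference model of what the testcase should compute can be useful. The sketch below is only an illustration and is not taken from GCC or the ACLE headers: model_addv, lane_active, PG_Q0 and PG_F0 are hypothetical names, float stands in for float16_t, and the lanes are summed sequentially, which need not match the reduction order of faddv for inputs that round. It assumes svdupq_n_b16 repeats its eight arguments across every 128-bit quadword of the predicate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference model, not part of GCC or the ACLE. */
enum { LANES_PER_QUAD = 8 }; /* eight 16-bit lanes per 128-bit quadword */

/* Predicate patterns from the testcase, one entry per 16-bit lane. */
static const int PG_Q0[LANES_PER_QUAD] = {1, 0, 1, 0, 0, 0, 0, 0};
static const int PG_F0[LANES_PER_QUAD] = {1, 0, 0, 0, 0, 0, 0, 0};

/* Assumed model of svdupq_n_b16: the pattern repeats every quadword. */
static int lane_active(const int pattern[LANES_PER_QUAD], size_t lane)
{
    return pattern[lane % LANES_PER_QUAD] != 0;
}

/* Assumed model of svaddv_f16: sum of the active lanes only.
   float is used in place of float16_t to keep the sketch portable. */
static float model_addv(const int pattern[LANES_PER_QUAD],
                        const float *in, size_t nlanes)
{
    assert(nlanes % LANES_PER_QUAD == 0); /* whole quadwords only */
    float sum = 0.0f;
    for (size_t i = 0; i < nlanes; i++)
        if (lane_active(pattern, i))
            sum += in[i];
    return sum;
}
```

With a 128-bit vector holding distinct powers of two, model_addv(PG_F0, ...) picks up lane 0 only while model_addv(PG_Q0, ...) picks up lanes 0 and 2, so the values stored to dst[0] and dst[1] must differ whenever lane 2 is nonzero.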