Kyrylo Tkachov <kyrylo.tkac...@arm.com> writes:
> Hi Richard,
>
>> -----Original Message-----
>> From: Gcc-patches <gcc-patches-bounces+kyrylo.tkachov=arm....@gcc.gnu.org>
>> On Behalf Of Richard Sandiford via Gcc-patches
>> Sent: Tuesday, May 9, 2023 7:48 AM
>> To: gcc-patches@gcc.gnu.org
>> Cc: Richard Sandiford <richard.sandif...@arm.com>
>> Subject: [PATCH 2/6] aarch64: Allow moves after tied-register intrinsics
>>
>> Some ACLE intrinsics map to instructions that tie the output
>> operand to an input operand. If all the operands are allocated
>> to different registers, and if MOVPRFX can't be used, we will need
>> a move either before the instruction or after it. Many tests only
>> matched the "before" case; this patch makes them accept the "after"
>> case too.
>>
>> gcc/testsuite/
>>     * gcc.target/aarch64/advsimd-intrinsics/bfcvtnq2-untied.c: Allow
>>     moves to occur after the intrinsic instruction, rather than requiring
>>     them to happen before.
>>     * gcc.target/aarch64/advsimd-intrinsics/bfdot-1.c: Likewise.
>>     * gcc.target/aarch64/advsimd-intrinsics/vdot-3-1.c: Likewise.
>
> I'm seeing some dot-product intrinsics failures:
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O1 check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O1 check-function-bodies ufooq_lane_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O2 check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O2 check-function-bodies ufooq_lane_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none check-function-bodies ufooq_lane_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O3 -g check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O3 -g check-function-bodies ufooq_lane_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -Og -g check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -Og -g check-function-bodies ufooq_lane_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -Os check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -Os check-function-bodies ufooq_lane_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O1 check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O1 check-function-bodies ufooq_laneq_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O2 check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O2 check-function-bodies ufooq_laneq_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none check-function-bodies ufooq_laneq_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O3 -g check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O3 -g check-function-bodies ufooq_laneq_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -Og -g check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -Og -g check-function-bodies ufooq_laneq_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -Os check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -Os check-function-bodies ufooq_laneq_untied

Ugh. Big-endian. Hadn't thought about that being an issue.
Was testing natively on little-endian aarch64-linux-gnu and
didn't see these.

> From a quick inspection it looks like it's just an alternative regalloc that
> moves the mov + dot instructions around, similar to what you fixed in
> bfdot-1.c and vdot-3-1.c.
> I guess they need a similar adjustment?

Yeah, will fix.

Thanks,
Richard