Kyrylo Tkachov <kyrylo.tkac...@arm.com> writes:
> Hi Richard,
>
>> -----Original Message-----
>> From: Gcc-patches <gcc-patches-
>> bounces+kyrylo.tkachov=arm....@gcc.gnu.org> On Behalf Of Richard
>> Sandiford via Gcc-patches
>> Sent: Tuesday, May 9, 2023 7:48 AM
>> To: gcc-patches@gcc.gnu.org
>> Cc: Richard Sandiford <richard.sandif...@arm.com>
>> Subject: [PATCH 2/6] aarch64: Allow moves after tied-register intrinsics
>>
>> Some ACLE intrinsics map to instructions that tie the output
>> operand to an input operand.  If all the operands are allocated
>> to different registers, and if MOVPRFX can't be used, we will need
>> a move either before the instruction or after it.  Many tests only
>> matched the "before" case; this patch makes them accept the "after"
>> case too.
>>
>> gcc/testsuite/
>>       * gcc.target/aarch64/advsimd-intrinsics/bfcvtnq2-untied.c: Allow
>>       moves to occur after the intrinsic instruction, rather than requiring
>>       them to happen before.
>>       * gcc.target/aarch64/advsimd-intrinsics/bfdot-1.c: Likewise.
>>       * gcc.target/aarch64/advsimd-intrinsics/vdot-3-1.c: Likewise.
>
> I'm seeing some dot-product intrinsics failures:
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c   -O1   check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c   -O1   check-function-bodies ufooq_lane_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c   -O2   check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c   -O2   check-function-bodies ufooq_lane_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   check-function-bodies ufooq_lane_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c   -O3 -g   check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c   -O3 -g   check-function-bodies ufooq_lane_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c   -Og -g   check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c   -Og -g   check-function-bodies ufooq_lane_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c   -Os   check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c   -Os   check-function-bodies ufooq_lane_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   -O1   check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   -O1   check-function-bodies ufooq_laneq_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   -O2   check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   -O2   check-function-bodies ufooq_laneq_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   check-function-bodies ufooq_laneq_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   -O3 -g   check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   -O3 -g   check-function-bodies ufooq_laneq_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   -Og -g   check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   -Og -g   check-function-bodies ufooq_laneq_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   -Os   check-function-bodies ufoo_untied
> FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   -Os   check-function-bodies ufooq_laneq_untied

Ugh.  Big-endian.  Hadn't thought about that being an issue.
Was testing natively on little-endian aarch64-linux-gnu and
didn't see these.

> From a quick inspection it looks like it's just an alternative regalloc that 
> moves the mov + dot instructions around, similar to what you fixed in 
> bfdot-1.c and vdot-3-1.c.
> I guess they need a similar adjustment?

Yeah, will fix.

Thanks,
Richard
