Hi all,

In this PR the vxarq_u64 intrinsic gets passed a rotate amount of 0 and the patterns don't handle it correctly. Because we adjust the rotate amount in RTL during expand to account for the canonical representation, we end up emitting a V2DImode rotate of 64, which the output instruction is not prepared to handle. What we should be doing is leaving it as 0 in that case, which is what this patch does.
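
For illustration only, here is a small scalar C model of the amount adjustment described above. It is a sketch, not the code in aarch64-simd.md: it assumes the canonical RTL form is a left rotate by 64 minus the XAR immediate (which is what the "rotate of 64" above suggests), and every function name in it is hypothetical.

```c
#include <stdint.h>

/* Rotate right by n, 0 <= n < 64.  n == 0 is special-cased because
   shifting a 64-bit value by 64 is undefined behaviour in C.  */
static uint64_t ror64 (uint64_t x, unsigned n)
{
  return n ? (x >> n) | (x << (64 - n)) : x;
}

/* Rotate left by n, 0 <= n < 64, with the same special case.  */
static uint64_t rotl64 (uint64_t x, unsigned n)
{
  return n ? (x << n) | (x >> (64 - n)) : x;
}

/* Hypothetical model of XAR on a single 64-bit lane:
   exclusive-OR the inputs, then rotate right by the immediate.  */
static uint64_t xar_lane (uint64_t a, uint64_t b, unsigned imm)
{
  return ror64 (a ^ b, imm);
}

/* Assumed pre-patch adjustment: right-rotate immediate to left-rotate
   amount.  Yields the out-of-range value 64 when imm == 0.  */
static unsigned naive_rotl_amount (unsigned imm)
{
  return 64 - imm;
}

/* Adjustment in the spirit of the patch: leave a zero amount as zero.  */
static unsigned fixed_rotl_amount (unsigned imm)
{
  return imm == 0 ? 0 : 64 - imm;
}
```

With the fixed mapping, rotl64 (x, fixed_rotl_amount (imm)) matches ror64 (x, imm) for every imm from 0 to 63, whereas the naive mapping produces 64 for imm == 0, which is outside the valid rotate range.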
A XAR with a rotate of 0 is really just an EOR and we could have emitted it as such but I thought that, at least at -O0, it would be nicer to emit the XAR-0 form as it's still a legal instruction and the user did ask for it through the intrinsic. At -O1 and above the optimisers kick in and simplify it to an EOR anyway.

Note: the SVE2 XAR instruction doesn't suffer from this problem because a rotate amount of 0 is actually not allowed by the instruction itself and the early intrinsic validation rejects it anyway.

Bootstrapped and tested on aarch64-none-linux-gnu.
Any comments? Will push to trunk next week if no objections.

Thanks,
Kyrill

Signed-off-by: Kyrylo Tkachov <[email protected]>

gcc/

	PR target/123584
	* config/aarch64/aarch64-simd.md (aarch64_xarqv2di): Leave zero rotate
	amounts as zero during expansion.
	(*aarch64_xarqv2di_insn): Account for zero rotate amounts.  Print # in
	rotate immediate.

gcc/testsuite/

	PR target/123584
	* gcc.target/aarch64/torture/xar-zero.c: New test.

0001-aarch64-PR-target-123584-Fix-expansion-of-SHA3-XAR-w.patch
