[llvm-bugs] [Bug 75822] [AArch64] Incorrectly combining MOVI/CMGT of differing vector lane widths in backend

LLVM Bugs via llvm-bugs Mon, 18 Dec 2023 08:14:07 -0800

Issue	75822
Summary	[AArch64] Incorrectly combining MOVI/CMGT of differing vector lane widths in backend
Labels	new issue
Assignees
Reporter	Benjins

The following code is miscompiled by the AArch64 backend when optimisations are enabled:
```llvm
define dso_local <16 x i1> @do_stuff(<16 x i8> %0) local_unnamed_addr #0 {
entry:
  %cmp.i = icmp slt <16 x i8> %0, <i8 1, i8 0, i8 0, i8 0, i8 1, i8 0, i8 0, i8 0, i8 1, i8 0, i8 0, i8 0, i8 1, i8 0, i8 0, i8 0>
  ret <16 x i1> %cmp.i
}
```


On trunk, `llc -O1` generates the following assembly for it:
```asm
do_stuff:
        cmle    v0.16b, v0.16b, #0
        ret
```

However, this is not correct. Version 17.0.1 generates the following, which is the expected output:
```asm
do_stuff:
        movi    v1.4s, #1
        cmgt    v0.16b, v1.16b, v0.16b
        ret
```

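Note that here the `movi` broadcasts a 32-bit 1 (so `1, 0, 0, 0` in terms of bytes), while the comparison is done byte-wise, which matches the constant vector in the IR. The trunk assembly, however, compares every byte against 0, as if the constant were 1 in all 16 lanes. The two differ whenever an input byte is 0 in a lane whose constant is 0: `0 < 0` is false, but `0 <= 0` is true.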
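To make the difference concrete, here is a small Python model of the two instruction sequences. It is only an illustrative sketch: the helper names (`movi_4s_as_bytes`, `cmgt_16b`, `cmle_zero_16b`, `expected_icmp_slt`) are made up for this example, lanes are modelled as plain signed Python integers, and little-endian lane layout is assumed.

```python
import struct

# Per-lane constants from the IR: the pattern <1, 0, 0, 0> repeated four times.
LANE_CONSTANTS = [1, 0, 0, 0] * 4


def movi_4s_as_bytes(imm):
    """Bytes of a 128-bit register after `movi vN.4s, #imm` (little-endian)."""
    return list(struct.pack("<4I", imm, imm, imm, imm))


def expected_icmp_slt(lanes):
    """Reference semantics of `icmp slt <16 x i8> %0, <constants>`."""
    return [x < c for x, c in zip(lanes, LANE_CONSTANTS)]


def cmgt_16b(lhs, rhs):
    """Model of `cmgt vD.16b, vLhs.16b, vRhs.16b`: per-byte signed lhs > rhs."""
    return [a > b for a, b in zip(lhs, rhs)]


def cmle_zero_16b(lanes):
    """Model of `cmle vD.16b, vN.16b, #0`: per-byte signed lane <= 0."""
    return [x <= 0 for x in lanes]


if __name__ == "__main__":
    zeros = [0] * 16
    good = cmgt_16b(movi_4s_as_bytes(1), zeros)  # 17.0.1 sequence
    bad = cmle_zero_16b(zeros)                   # trunk sequence
    mismatched = [i for i in range(16) if good[i] != bad[i]]
    print("lanes where trunk differs:", mismatched)
```

With an all-zero input, the model of the 17.0.1 sequence agrees with the IR semantics, while the model of the trunk sequence disagrees in every lane whose constant is 0.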

Godbolt (comparing w/ and w/o optimisations on trunk, and to 17.0.1):
https://godbolt.org/z/b1a15Me4r

A bisect showed this started in 3acbd38492c394dec32ccde3f11885e5b59d5aa9 (PR https://github.com/llvm/llvm-project/pull/74499 )

I suspect that the fix is to verify that the `MOVI` and the comparison use the same lane width if they're vectors, or something along those lines.
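As a rough illustration of that kind of check (a hypothetical Python model, not LLVM's actual code or API), the fold from "less than 1" to "less than or equal to 0" would only be allowed when the materialised constant, reinterpreted at the comparison's lane width, is a splat of 1. The function names and the 128-bit register size below are assumptions made for this sketch.

```python
def splat_bytes(imm, lane_bits, total_bits=128):
    """Little-endian bytes of a register holding `imm` in every lane."""
    lane_bytes = lane_bits // 8
    return list(imm.to_bytes(lane_bytes, "little")) * (total_bits // lane_bits)


def is_splat_at_width(reg_bytes, lane_bits, value):
    """True if every `lane_bits`-wide lane of the register equals `value`."""
    lane_bytes = lane_bits // 8
    want = list(value.to_bytes(lane_bytes, "little"))
    return all(
        reg_bytes[i:i + lane_bytes] == want
        for i in range(0, len(reg_bytes), lane_bytes)
    )


def may_fold_lt_one_to_le_zero(movi_imm, movi_lane_bits, cmp_lane_bits):
    """Hypothetical guard: fold `x < C` to `x <= 0` only if C is 1 in every
    lane at the comparison's own lane width."""
    reg = splat_bytes(movi_imm, movi_lane_bits)
    return is_splat_at_width(reg, cmp_lane_bits, 1)
```

In this model, `may_fold_lt_one_to_le_zero(1, 32, 8)` rejects the case from this report (a `.4s` constant feeding a `.16b` compare), while `may_fold_lt_one_to_le_zero(1, 8, 8)` accepts a constant that matches the compare's lane width.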

For triage/priority: this bug was found by a fuzzer testing SIMD codegen, not in manually written code.
