Issue 53120
Summary Aarch64: why clang combines sqadd + sqrdmulh into sqrdmlah?
Labels new issue
Assignees
Reporter Nekotekina
    Hello, I accidentally peeked at the arm_neon.h definitions in clang and noticed that the vqrdmlahq_s16 intrinsic is composed of a saturating add and a saturating rounding doubling multiply. I tried to test it, and it seems that in the edge case (INT16_MIN * INT16_MIN, where the multiplication saturates to INT16_MAX) the result of the saturating addition is wrong when the fused MLA would not saturate at all. I'm only learning the basics and can only test in an emulator, so I may be missing something. In the godbolt example, gcc does not combine the two instructions.

https://godbolt.org/z/E9WvPTbWG
```C++
#include <arm_neon.h>

int16x8_t good(int16x8_t a, int16x8_t b, int16x8_t c)
{
    return vqrdmlahq_s16(c, a, b);
}

int16x8_t bad(int16x8_t a, int16x8_t b, int16x8_t c)
{
    return vqaddq_s16(c, vqrdmulhq_s16(a, b));
}
```
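To make the edge case concrete without AArch64 hardware, here is a rough scalar model of a single 16-bit lane. It is only a sketch based on my reading of the Arm pseudocode for SQRDMULH, SQADD and SQRDMLAH, so the arithmetic is an assumption rather than a verified reference. The names `sat16`, `sqrdmulh_model`, `sqadd_model` and `sqrdmlah_model` are hypothetical helpers made up for illustration; they are not part of arm_neon.h.

```cpp
#include <cstdint>

// Hypothetical helper: clamp a wide intermediate value to the int16_t range.
static int16_t sat16(int64_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return static_cast<int16_t>(v);
}

// Assumed model of one SQRDMULH lane: sat((2*a*b + 2^15) >> 16).
// Relies on >> being an arithmetic shift for negative values
// (guaranteed since C++20, the usual behaviour before that).
int16_t sqrdmulh_model(int16_t a, int16_t b)
{
    int64_t p = 2 * int64_t(a) * int64_t(b) + (int64_t(1) << 15);
    return sat16(p >> 16);
}

// Assumed model of one SQADD lane: sat(a + b).
int16_t sqadd_model(int16_t a, int16_t b)
{
    return sat16(int64_t(a) + int64_t(b));
}

// Assumed model of one SQRDMLAH lane: the accumulator is added at full
// precision, and saturation happens only once, at the very end.
int16_t sqrdmlah_model(int16_t c, int16_t a, int16_t b)
{
    int64_t acc = int64_t(c) * 65536
                + 2 * int64_t(a) * int64_t(b)
                + (int64_t(1) << 15);
    return sat16(acc >> 16);
}
```

Under this model, with a = b = INT16_MIN and c = -1, `sqrdmlah_model(c, a, b)` gives 32767, while `sqadd_model(c, sqrdmulh_model(a, b))` gives 32766. The intermediate multiply has already been clamped to INT16_MAX before the add sees it, so the two sequences are not interchangeable for this input.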