| Field | Value |
|---|---|
| Issue | 181454 |
| Summary | [X86] Vector 64-bit `icmp ugt + blend` with constant should use arithmetic to avoid compare |
| Labels | new issue |
| Assignees | |
| Reporter | WalterKruger |

Because of gaps in x86's vector compare support, an unsigned 64-bit vector compare against a constant is implemented as a signed compare (`pcmpgtq`) with the sign bit of the operands flipped when SSE4.2 is available (the lowering is much worse without it). The result is often fed to `blendvpd`, which performs a conditional selection per element:
```asm
selectIfGreater:
movapd xmm3, xmm0
pxor xmm2, xmmword ptr [rip + .LCPI0_0]
pcmpgtq xmm2, xmmword ptr [rip + .LCPI0_1]
movdqa xmm0, xmm2
blendvpd xmm3, xmm1, xmm0
movapd xmm0, xmm3
ret
```
https://godbolt.org/z/jjc7Ga3Eo
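
For reference, the sign-flip lowering above can be modeled on a single 64-bit lane in plain C. This is only a scalar sketch: the function name `ugt_via_signed` is hypothetical, and it assumes that `.LCPI0_0` holds the sign-bit mask and `.LCPI0_1` holds the constant with its sign bit already flipped, which the listing does not show.

```c
#include <stdbool.h>
#include <stdint.h>

/* Scalar model of one 64-bit lane: unsigned (x > c) via a signed compare.
 * Hypothetical helper for illustration, not code taken from LLVM. */
static bool ugt_via_signed(uint64_t x, uint64_t c) {
    const uint64_t sign = UINT64_C(1) << 63;

    /* XOR with the sign bit maps unsigned order onto signed order
     * (the role of pxor). The unsigned-to-signed conversion assumes the
     * usual two's-complement behavior of mainstream compilers. */
    int64_t sx = (int64_t)(x ^ sign);
    int64_t sc = (int64_t)(c ^ sign); /* assumed to be pre-folded into a constant */

    return sx > sc; /* the role of pcmpgtq */
}
```

The XOR and the signed compare each need their own constant in the vector version, which is where the two constant-pool loads in the listing come from.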
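Because `blendvpd` only checks the most significant bit of each 64-bit element, the compare can be replaced by any sequence that produces the correct sign bit, and an addition followed by a bitwise operation is faster. `paddq` has better latency/throughput than `pcmpgtq` on most CPUs, and the sequence uses one fewer constant. The bitwise operation depends on the size of the constant `C` (compared as unsigned), because it has to correct the inputs for which the addition alone produces the wrong sign bit. When `C < INT64_MAX`, a large `x` (sign bit already set) can make the addition wrap around and clear the sign bit, so the sum is ORed with `x`. When `C > INT64_MAX`, a small `x` (sign bit clear) can have its sign bit set by the addition, so the sum is ANDed with `x`: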
```
C < INT64_MAX: (x + (INT64_MAX - C)) | x
C > INT64_MAX: (x + (INT64_MAX - C)) & x
```
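
As a quick sanity check alongside the Alive2 proofs, the two formulas can be modeled on a single 64-bit lane in C. This is a minimal sketch with hypothetical helper names (`ugt_msb_via_add`, `agrees_on_samples`); it only inspects the top bit, since that is all `blendvpd` reads, and it treats `C == INT64_MAX` with the OR form, where the added constant is zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Top bit of the add-based emulation of unsigned (x > c).
 * Hypothetical helper for illustration only. */
static bool ugt_msb_via_add(uint64_t x, uint64_t c) {
    const uint64_t int64_max = (uint64_t)INT64_MAX;
    uint64_t k = int64_max - c; /* wraps modulo 2^64 when c > INT64_MAX */
    uint64_t sum = x + k;       /* the role of paddq */

    /* OR repairs a wrapped sum for large x; AND masks a spurious
     * sign bit for small x. */
    uint64_t r = (c <= int64_max) ? (sum | x) : (sum & x);
    return (r >> 63) != 0;      /* blendvpd only looks at this bit */
}

/* Returns 1 if the emulation matches a plain unsigned compare on a few
 * boundary values of x around 0, c, INT64_MAX and UINT64_MAX. */
static int agrees_on_samples(uint64_t c) {
    const uint64_t xs[] = {
        0, 1, c - 1, c, c + 1,
        (uint64_t)INT64_MAX - 1, (uint64_t)INT64_MAX,
        (uint64_t)INT64_MAX + 1, UINT64_MAX - 1, UINT64_MAX
    };
    for (unsigned i = 0; i < sizeof xs / sizeof xs[0]; ++i) {
        if (ugt_msb_via_add(xs[i], c) != (xs[i] > c)) {
            return 0;
        }
    }
    return 1;
}
```

Calling `agrees_on_samples` with constants on both sides of `INT64_MAX` exercises both the OR and the AND form; it is a spot check, not a substitute for the proofs.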
## Alive2 Proofs
- `C < INT64_MAX`: https://alive2.llvm.org/ce/z/e1beWd
- `C > INT64_MAX`: https://alive2.llvm.org/ce/z/EL_twf