| Issue |
174169
|
| Summary |
LLVM fails to narrow i32/i64 smin/smax to PMINSW/PMAXSW on x86 with SSE2 when the values are known to be within the i16 range
|
| Labels |
new issue
|
| Assignees |
|
| Reporter |
johnplatts
|
`@llvm.smin.v4i32(%a, %b)` and `@llvm.smin.v2i64(%a, %b)` can both be optimized to PMINSW (or `@llvm.smin.v8i16`) on x86 with SSE2 if every lane of both `%a` and `%b` is known to be in the [-32768, 32767] range, undef, or poison.
Likewise, `@llvm.smax.v4i32(%a, %b)` and `@llvm.smax.v2i64(%a, %b)` can both be optimized to PMAXSW (or `@llvm.smax.v8i16`) on x86 with SSE2 under the same condition on the lanes of `%a` and `%b`.
The following Alive2 snippet demonstrates that the transformation to PMINSW/PMAXSW is valid when each lane of `%a` and `%b` is known to be in the [-32768, 32767] range, undef, or poison:
https://alive2.llvm.org/ce/z/py_J9L
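As an informal illustration of why the narrowing holds (not a substitute for the Alive2 proof), the following Python sketch models a single i32 or i64 lane as a sequence of signed 16-bit pieces and applies min/max to each piece independently, which is how PMINSW/PMAXSW would see the same bits. When a value is in the i16 range, every piece above the lowest one is pure sign extension (all zeros or all ones), so the piecewise result matches the wide result. All helper names here (`to_i16`, `pieces`, `join`, `piecewise`, `all_match`, `SAMPLES`) are hypothetical and exist only for this sketch; undef and poison are not modeled.

```python
def to_i16(x):
    """Interpret the low 16 bits of x as a signed 16-bit integer."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def pieces(x, bits):
    """Split a signed `bits`-wide value into signed 16-bit pieces, lowest first."""
    return [to_i16(x >> shift) for shift in range(0, bits, 16)]


def join(parts, bits):
    """Reassemble signed 16-bit pieces into one signed `bits`-wide value."""
    raw = 0
    for i, part in enumerate(parts):
        raw |= (part & 0xFFFF) << (16 * i)
    return raw - (1 << bits) if raw & (1 << (bits - 1)) else raw


def piecewise(op, a, b, bits):
    """Apply op (min or max) to each 16-bit piece, as PMINSW/PMAXSW would."""
    return join([op(x, y) for x, y in zip(pieces(a, bits), pieces(b, bits))], bits)


# Sample values inside the i16 range, including both boundaries.
SAMPLES = [-32768, -32767, -3, -1, 0, 1, 5, 32766, 32767]


def all_match(bits):
    """Check that piecewise min/max equals wide min/max for every sample pair."""
    return all(
        piecewise(op, a, b, bits) == op(a, b)
        for op in (min, max)
        for a in SAMPLES
        for b in SAMPLES
    )
```

The range precondition matters: for `a = 65536` and `b = 1` as i32 values, the piecewise minimum is 0 rather than 1, because the upper piece of `a` is no longer a sign extension of its lower piece.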
A related failure to vectorize an int16_t[8] min pattern to pminsw was reported in issue #48223, raised back in 2021.