[llvm-bugs] [Bug 174169] LLVM fails to optimize out i32/i64 smin/smax down to PMINSW/PMAXSW on x86 with SSE2 when values are known to be within i16 range

LLVM Bugs via llvm-bugs Thu, 01 Jan 2026 16:07:55 -0800

Issue	174169
Summary	LLVM fails to optimize out i32/i64 smin/smax down to PMINSW/PMAXSW on x86 with SSE2 when values are known to be within i16 range
Labels	new issue
Assignees
Reporter	johnplatts

`@llvm.smin.v4i32(%a, %b)` and `@llvm.smin.v2i64(%a, %b)` can both be optimized to PMINSW (or `@llvm.smin.v8i16`) on x86 with SSE2 if the values in each lane of both `%a` and `%b` are known to be either in the [-32768, 32767] range, undef, or poison.


Likewise, `@llvm.smax.v4i32(%a, %b)` and `@llvm.smax.v2i64(%a, %b)` can both be optimized to PMAXSW (or `@llvm.smax.v8i16`) on x86 with SSE2 if the values in each lane of both `%a` and `%b` are known to be either in the [-32768, 32767] range, undef, or poison.

The following Alive2 snippet demonstrates that the transformation to PMINSW/PMAXSW is valid when each lane of `%a` and `%b` is known to be either in the [-32768, 32767] range, undef, or poison:
https://alive2.llvm.org/ce/z/py_J9L
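
The reasoning behind the transformation can also be sketched in portable C. This is only an illustrative model, not LLVM code and not a substitute for the Alive2 proof: `pminmaxsw_model` is a scalar stand-in for PMINSW/PMAXSW, and the helper and checker names are hypothetical. The idea is that an i32 or i64 value within the i16 range consists of a low 16-bit lane holding the value and upper 16-bit lanes holding only sign bits (all 0 or all -1), so a lane-wise 16-bit min/max yields the right low lane and the right sign extension. The checker samples the i16 range with a stride rather than testing it exhaustively.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of PMINSW (want_max == 0) or PMAXSW (want_max != 0):
   two 128-bit inputs are treated as 8 signed 16-bit lanes each. */
static void pminmaxsw_model(const void *a, const void *b, void *out, int want_max) {
    int16_t x[8], y[8], r[8];
    memcpy(x, a, sizeof x);
    memcpy(y, b, sizeof y);
    for (int i = 0; i < 8; ++i) {
        int take_x = want_max ? (x[i] > y[i]) : (x[i] < y[i]);
        r[i] = take_x ? x[i] : y[i];
    }
    memcpy(out, r, sizeof r);
}

/* Hypothetical helper: smin/smax over 4 x i32 through the 16-bit model.
   Precondition (the assumption from the report): every lane is in i16 range. */
static void sminmax_v4i32_narrow(const int32_t a[4], const int32_t b[4],
                                 int32_t out[4], int want_max) {
    for (int i = 0; i < 4; ++i) {
        assert(a[i] >= INT16_MIN && a[i] <= INT16_MAX);
        assert(b[i] >= INT16_MIN && b[i] <= INT16_MAX);
    }
    pminmaxsw_model(a, b, out, want_max);
}

/* Hypothetical helper: the same for 2 x i64. */
static void sminmax_v2i64_narrow(const int64_t a[2], const int64_t b[2],
                                 int64_t out[2], int want_max) {
    for (int i = 0; i < 2; ++i) {
        assert(a[i] >= INT16_MIN && a[i] <= INT16_MAX);
        assert(b[i] >= INT16_MIN && b[i] <= INT16_MAX);
    }
    pminmaxsw_model(a, b, out, want_max);
}

/* Count lanes where the narrowed result differs from a plain scalar
   min/max, over a strided sample of the i16 range (endpoints included). */
int count_mismatches_v4i32(void) {
    int bad = 0;
    for (int32_t a = INT16_MIN; a <= INT16_MAX; a += 257) {
        for (int32_t b = INT16_MIN; b <= INT16_MAX; b += 255) {
            const int32_t va[4] = {a, b, -1 - a, 0};
            const int32_t vb[4] = {b, a, -1 - b, -1};
            for (int want_max = 0; want_max <= 1; ++want_max) {
                int32_t got[4];
                sminmax_v4i32_narrow(va, vb, got, want_max);
                for (int i = 0; i < 4; ++i) {
                    int32_t lo = va[i] < vb[i] ? va[i] : vb[i];
                    int32_t hi = va[i] < vb[i] ? vb[i] : va[i];
                    bad += got[i] != (want_max ? hi : lo);
                }
            }
        }
    }
    return bad;
}

int count_mismatches_v2i64(void) {
    int bad = 0;
    for (int64_t a = INT16_MIN; a <= INT16_MAX; a += 257) {
        for (int64_t b = INT16_MIN; b <= INT16_MAX; b += 255) {
            const int64_t va[2] = {a, -1 - b};
            const int64_t vb[2] = {b, a};
            for (int want_max = 0; want_max <= 1; ++want_max) {
                int64_t got[2];
                sminmax_v2i64_narrow(va, vb, got, want_max);
                for (int i = 0; i < 2; ++i) {
                    int64_t lo = va[i] < vb[i] ? va[i] : vb[i];
                    int64_t hi = va[i] < vb[i] ? vb[i] : va[i];
                    bad += got[i] != (want_max ? hi : lo);
                }
            }
        }
    }
    return bad;
}
```

Both checkers are expected to report zero mismatches on the sampled inputs. The range precondition matters: with an i32 lane of 65536 against 1, the 16-bit lanes are (0, 1) and (1, 0), so a lane-wise min would produce 0 instead of 1.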

A related failure, in which an int16_t[8] min pattern is not vectorized to pminsw, was reported in issue #48223, raised back in 2021.

_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
