[Bug target/121099] GCC doesn't optimize `_mm_set_ps()` very well
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121099
--- Comment #3 from LIU Hao ---
Yes, INSERTPS requires SSE4.1. However, the code is compiled with AVX, which implies SSE4.1, so it should be preferred.
[Bug target/121099] GCC doesn't optimize `_mm_set_ps()` very well
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121099
Richard Biener changed:
            What|Removed     |Added
----------------------------------------
          Status|UNCONFIRMED |NEW
  Ever confirmed|0           |1
Last reconfirmed|            |2025-07-16
--- Comment #2 from Richard Biener ---
We expand from
_1 = -y_2(D);
_5 = {x_4(D), 0.0, _1, 0.0};
_6 = {y_2(D), y_2(D), x_4(D), x_4(D)};
_7 = __builtin_ia32_cmpgtps (_6, _5);
_8 = __builtin_ia32_movmskps (_7); [tail call]
return _8;
{y_2(D), y_2(D), x_4(D), x_4(D)} should be handled by target vec_init.
The quoted clang code needs more than just SSE2.
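For readers without the original test case at hand, the GIMPLE above is consistent with a function of roughly the following shape. This is a reconstruction inferred only from the dump, not the reporter's actual code; the name `cmp_mask` is hypothetical. Note that `_mm_set_ps` takes its arguments from the highest lane to the lowest, so the constructors in the dump appear reversed.

```c
#include <xmmintrin.h>  /* SSE: _mm_set_ps, _mm_cmpgt_ps, _mm_movemask_ps */

/* Hypothetical reconstruction; the function name and exact source
   form are assumptions inferred from the GIMPLE dump. */
int cmp_mask(float x, float y)
{
    /* _6 = {y, y, x, x}: lanes listed low to high in GIMPLE. */
    __m128 lhs = _mm_set_ps(x, x, y, y);
    /* _5 = {x, 0.0, -y, 0.0}: again low to high. */
    __m128 rhs = _mm_set_ps(0.0f, -y, 0.0f, x);
    /* cmpgtps followed by movmskps: one result bit per lane. */
    return _mm_movemask_ps(_mm_cmpgt_ps(lhs, rhs));
}
```

Bit 0 of the result is `y > x`, bit 1 is `y > 0`, bit 2 is `x > -y`, and bit 3 is `x > 0`.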
[Bug target/121099] GCC doesn't optimize `_mm_set_ps()` very well
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121099
--- Comment #1 from LIU Hao ---
Given `y` in XMM0 and `x` in XMM1, `_mm_set_ps(x, x, y, y)` is clearly just `vshufps xmm2, xmm0, xmm1, 0` no matter what.
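The equivalence claimed here can be checked with intrinsics. The following is a minimal sketch, assuming plain SSE is available; the helper names `set_ps_xxyy`, `shuffle_xxyy`, and `same_lanes` are made up for illustration. `_mm_shuffle_ps(a, b, 0)` takes lanes 0 and 1 of the result from `a[0]` and lanes 2 and 3 from `b[0]`, which is the same selection `vshufps` makes with an immediate of 0.

```c
#include <string.h>     /* memcmp */
#include <xmmintrin.h>  /* SSE: _mm_set_ps, _mm_set_ss, _mm_shuffle_ps */

/* Reference form: lanes from low to high are {y, y, x, x}. */
static __m128 set_ps_xxyy(float x, float y)
{
    return _mm_set_ps(x, x, y, y);
}

/* Hand-written form: a single shuffle with immediate 0.
   _mm_set_ss only models "scalar already sits in lane 0". */
static __m128 shuffle_xxyy(float x, float y)
{
    __m128 vy = _mm_set_ss(y);  /* plays the role of XMM0 */
    __m128 vx = _mm_set_ss(x);  /* plays the role of XMM1 */
    return _mm_shuffle_ps(vy, vx, 0);
}

/* Returns 1 when both vectors hold identical bits in all four lanes. */
static int same_lanes(__m128 a, __m128 b)
{
    float fa[4], fb[4];
    _mm_storeu_ps(fa, a);
    _mm_storeu_ps(fb, b);
    return memcmp(fa, fb, sizeof fa) == 0;
}
```

Calling `same_lanes(set_ps_xxyy(x, y), shuffle_xxyy(x, y))` compares the two forms for a given pair of inputs.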
[Bug target/121099] GCC doesn't optimize `_mm_set_ps()` very well
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121099
Andrew Pinski changed:
    What|Removed |Added
-----------------------------
Severity|normal  |enhancement
