================
@@ -1558,10 +1559,10 @@ _mm_cvttss_si64(__m128 __a)
 /// \param __a
 ///    A 128-bit vector of [4 x float].
 /// \returns A 64-bit integer vector containing the converted values.
-static __inline__ __m64 __DEFAULT_FN_ATTRS_MMX
+static __inline__ __m64 __DEFAULT_FN_ATTRS_SSE2
 _mm_cvttps_pi32(__m128 __a)
 {
-  return (__m64)__builtin_ia32_cvttps2pi((__v4sf)__a);
+  return __trunc64(__builtin_ia32_cvttps2dq((__v4sf)__zeroupper64(__a)));
----------------
jyknight wrote:

I'm not sure: is `__builtin_convertvector` from float->int guaranteed to have
the semantics that this intrinsic requires?

Even if feasible, I'd prefer to leave that change to some future work that 
eliminates the `__builtin_ia32_cvttps2dq` (and similar functions), since the 
same should be done to `_mm_cvttps_epi32`, `_mm256_cvttps_epi32`, 
`_mm_cvtpd_epi32`, `_mm_cvtpd_pi32`, and `_mm256_cvtpd_epi32`, at least.
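
To make the semantics question concrete, here is a minimal scalar sketch of what one lane of `cvttps2dq` produces when the invalid-operation exception is masked: truncation toward zero, with NaN, infinities, and out-of-range inputs all yielding the "integer indefinite" value 0x80000000. The helper name `cvtt_lane_model` is hypothetical and is not part of this patch. As far as I understand, `__builtin_convertvector` is specified in terms of an element-wise C-style cast, and that cast is undefined for out-of-range floats, so it would not obviously promise the same result for those inputs.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical scalar model of one lane of cvttps2dq (illustration only,
 * not code from the patch). Assumes the invalid-operation exception is
 * masked, which is the default MXCSR state. */
static int32_t cvtt_lane_model(float x)
{
  /* 2^31 is exactly representable as a float. Any value >= 2^31 or
   * < -2^31, and any NaN, cannot be converted and yields the
   * "integer indefinite" value 0x80000000. */
  if (isnan(x) || x >= 2147483648.0f || x < -2147483648.0f)
    return INT32_MIN;

  /* In range, so the plain C conversion (truncation toward zero) is
   * well defined and matches the instruction. */
  return (int32_t)x;
}
```

Only the first branch is in question: for in-range inputs the C cast and the instruction agree, so any difference would show up solely for NaN, infinite, and out-of-range lanes.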

https://github.com/llvm/llvm-project/pull/96540
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
