Issue 91247
Summary [ppc64] SSE/VSX wrapper missing _mm_loadu_si64 function
Labels new issue
Reporter madscientist159
The SSE/VSX wrappers for ppc64[el] are missing the `_mm_loadu_si64()` function. This function appears to be largely an alias of `_mm_set_epi64()`, with an explicit unaligned load capability. However, `_mm_set_epi64()` also allows unaligned loads in practice, and the ppc64[el] wrapper function for `_mm_set_epi64()` already enables unaligned loads on POWER7+.

It appears the needed function is as follows -- this was tested on a Talos II workstation (POWER9) in Skia and functions correctly:

/* Load signed 64-bit integer from P into vector element 0.  The address need not be 16-byte aligned.  */
extern __inline __m128i
    __attribute__((__gnu_inline__, __always_inline__, __artificial__))
    _mm_loadu_si64 (void const *__P)
{
  return _mm_set_epi64((__m64)0LL, *(__m64 *)__P);
}
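
For anyone reviewing the behaviour without a POWER machine at hand, the semantics the wrapper is expected to provide can be sketched as a portable scalar model. This is only an illustration: `model_m128i` and `model_loadu_si64` are hypothetical names that do not exist in `emmintrin.h`, and the model assumes that lane 0 holds the low 64 bits and that the upper 64 bits are zeroed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit vector: lane[0] is the low 64 bits. */
typedef struct {
  uint64_t lane[2];
} model_m128i;

/* Scalar model of the expected behaviour (an assumption, not the real
   intrinsic): read 8 bytes from a possibly unaligned address, place them
   in the low lane, and zero the high lane.  */
static model_m128i model_loadu_si64(const void *p) {
  model_m128i v;
  assert(p != NULL);
  /* memcpy makes no alignment assumption about p. */
  memcpy(&v.lane[0], p, sizeof v.lane[0]);
  v.lane[1] = 0;
  return v;
}
```

A test on real hardware could compare the two 64-bit halves produced by `_mm_loadu_si64()` against this model for a pointer that is deliberately offset by one byte from an aligned buffer.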

If desired, I can create a merge request to add this function to `emmintrin.h`.
llvm-bugs mailing list
