Hi,

I am working on the LLVM IR of x86 vector intrinsics. In the target-specific
header file <xmmintrin.h>, some intrinsics are defined using GCC builtins, and
others are implemented using the vector support provided by LLVM, which I
guess is the preferred method whenever possible.

What determines when to use LLVM vector instructions and when to use GCC
builtins? For example, the store low value intrinsic is defined as:

static __inline__ void __attribute__((__always_inline__))
_mm_store_ss(float *__p, __m128 __a)
{
  struct __mm_store_ss_struct {
    float __u;
  } __attribute__((__packed__, __may_alias__));
  ((struct __mm_store_ss_struct*)__p)->__u = __a[0];
}


But the unaligned packed store intrinsic is defined as:

static __inline__ void __attribute__((__always_inline__, __nodebug__))
_mm_storeu_ps(float *__p, __m128 __a)
{
  __builtin_ia32_storeups(__p, __a);
}

Why not define it like this instead?

static __inline__ void __attribute__((__always_inline__, __nodebug__))
_mm_storeu_ps(float *__p, __m128 __a)
{
  struct __mm_store_ps_struct {
    __m128 __u;
  } __attribute__((__packed__, __may_alias__));
  ((struct __mm_store_ps_struct*)__p)->__u = __a;
}

Loads are already defined this way. That would generate more consistent LLVM
IR, because at the moment vector loads are translated into native vector
operations in LLVM IR, but vector stores are translated into calls to
external intrinsics.
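
To make the comparison concrete, here is a small self-contained sketch of the
symmetric form I have in mind. It does not use <xmmintrin.h>; it assumes only
the GCC/Clang vector extensions, and the names v4sf, my_loadu_ps, my_storeu_ps
and roundtrip_unaligned are made up for this example. It only illustrates that
the packed-struct pattern can express both the unaligned load and the
unaligned store; it says nothing about whether the generated code is as good
as what the builtin produces.

```c
/* Hypothetical stand-in for __m128, so the sketch needs no target header. */
typedef float v4sf __attribute__((__vector_size__(16)));

/* Unaligned load written with the packed, may_alias struct pattern. */
static inline v4sf my_loadu_ps(const float *p)
{
  struct my_loadu_ps_struct {
    v4sf u;
  } __attribute__((__packed__, __may_alias__));
  return ((const struct my_loadu_ps_struct *)p)->u;
}

/* Unaligned store written the same way, instead of calling a builtin. */
static inline void my_storeu_ps(float *p, v4sf a)
{
  struct my_storeu_ps_struct {
    v4sf u;
  } __attribute__((__packed__, __may_alias__));
  ((struct my_storeu_ps_struct *)p)->u = a;
}

/* Stores a vector one float past the start of a buffer (so the address is
   not guaranteed to be 16-byte aligned), loads it back, and returns 1 if
   the four lanes round-trip and the neighbouring floats are untouched. */
static int roundtrip_unaligned(void)
{
  float buf[8] = {0};
  v4sf in = {1.0f, 2.0f, 3.0f, 4.0f};
  float *p = buf + 1;

  my_storeu_ps(p, in);
  v4sf out = my_loadu_ps(p);

  return out[0] == 1.0f && out[1] == 2.0f &&
         out[2] == 3.0f && out[3] == 4.0f &&
         buf[0] == 0.0f && buf[5] == 0.0f;
}
```

Compiling something like this with -S -emit-llvm should make it easy to check
whether the store shows up as a plain vector store in the IR rather than as a
call, which is the consistency I am asking about.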

Many thanks in advance,

Victoria

_______________________________________________
cfe-users mailing list
cfe-users@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-users
