https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*, i?86-*-*
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2015-11-23
                 CC|                            |hjl.tools at gmail dot com
          Component|c++                         |target
     Ever confirmed|0                           |1

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
As the summary mentions 'volatile', I'll also point to the implementation of
the intrinsics, which reads:

/* Store four SPFP values.  The address must be 16-byte aligned.  */
extern __inline void __attribute__((__gnu_inline__, __always_inline__,
__artificial__))
_mm_store_ps (float *__P, __m128 __A)
{
  *(__v4sf *)__P = (__v4sf)__A;
}

so it does not use a volatile-qualified type to access *__P, which means
GCC does not treat the stores as volatile.
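
To illustrate the difference, here is a minimal sketch (not what the header
does) of a plain store next to a hypothetical volatile-qualified one.  The
names v4sf_t, store_plain, store_volatile and check_stores are made up for
this example, and the vector type uses the generic GCC vector extension
rather than the real __v4sf from xmmintrin.h:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __v4sf, built on the GCC vector extension.  */
typedef float v4sf_t __attribute__ ((__vector_size__ (16)));

/* Same shape as _mm_store_ps: the lvalue is not volatile-qualified, so
   the compiler is free to treat this like any other ordinary store.  */
static inline void
store_plain (float *p, v4sf_t a)
{
  assert (((uintptr_t) p & 15) == 0);   /* address must be 16-byte aligned */
  *(v4sf_t *) p = a;
}

/* Hypothetical variant: the access goes through a volatile-qualified
   lvalue, so it is a volatile access and has to be emitted as written.  */
static inline void
store_volatile (float *p, v4sf_t a)
{
  assert (((uintptr_t) p & 15) == 0);
  *(volatile v4sf_t *) p = a;
}

/* Returns 1 if both helpers wrote the expected four lanes.  */
static int
check_stores (void)
{
  float buf[8] __attribute__ ((__aligned__ (16)));
  static const float expect[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  v4sf_t a = { 1.0f, 2.0f, 3.0f, 4.0f };
  v4sf_t b = { 5.0f, 6.0f, 7.0f, 8.0f };

  store_plain (buf, a);
  store_volatile (buf + 4, b);
  return memcmp (buf, expect, sizeof expect) == 0;
}
```

Both helpers write the same bytes; what differs is only how much freedom the
optimizers have with the access.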

The arguments about strict-aliasing requirements still hold, because only
__m128 (and not __v4sf) is declared as __may_alias__:

/* The Intel API is flexible enough that we must allow aliasing with other
   vector types, and their scalar components.  */
typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));

/* Internal data types for implementing the intrinsics.  */
typedef float __v4sf __attribute__ ((__vector_size__ (16)));

so regular TBAA rules apply to the __v4sf store (and the __may_alias__ on
__A, which is passed by value, has no effect).
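
For comparison, a sketch of a store where the attribute sits on the type
that is actually used for the memory access.  This is only an illustration
under assumptions, not a proposed change to the header: v4sf_alias_t,
store_any and check_alias_store are hypothetical names, and the check
assumes IEEE-754 single precision for float:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for __m128: a vector type with may_alias.  */
typedef float v4sf_alias_t
  __attribute__ ((__vector_size__ (16), __may_alias__));

/* The dereferenced pointer type carries may_alias, so this access is
   exempt from type-based alias analysis whatever *p was declared as.  */
static inline void
store_any (void *p, v4sf_alias_t a)
{
  assert (((uintptr_t) p & 15) == 0);   /* address must be 16-byte aligned */
  *(v4sf_alias_t *) p = a;
}

/* Stores floats into an integer array and reads them back as integers.
   Returns 1 if the bit patterns match (IEEE-754 binary32 assumed).  */
static int
check_alias_store (void)
{
  uint32_t bits[4] __attribute__ ((__aligned__ (16))) = { 0, 0, 0, 0 };
  v4sf_alias_t a = { 1.0f, 2.0f, 3.0f, 4.0f };

  store_any (bits, a);
  return bits[0] == 0x3f800000u && bits[1] == 0x40000000u
         && bits[2] == 0x40400000u && bits[3] == 0x40800000u;
}
```

The point is that may_alias matters on the type the pointer is dereferenced
through, not on the type of the value being stored.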

-> target "bug", but I'd say an INVALID one.

HJ, I remember that the "master" copy of the intrinsics documentation lives
somewhere at Intel; what does it say about the two issues above?

Thus all of this boils down to the question of whether the intrinsics are
implemented correctly (as documented).  Honoring the volatile part would
mean either pessimizing all users or not being able to implement the
intrinsics as C functions.
