https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113025

--- Comment #4 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
(In reply to Xi Ruoyao from comment #3)
> (In reply to juki from comment #2)
> > Unfortunately alignment of the cast type was not causing this issue.
> > 
> > I changed all calls that were defined in GCC headers to use __m128i_u or
> > __m128d_u types to use those types before unaligned intrinsic.
> > 
> > For example LOAD_SI128 macro looks like the following:
> > 
> > #define LOAD_SI128(ptr) \
> >         ( ((uintptr_t)(ptr) & 15) == 0 ) ? _mm_load_si128((__m128i*)(ptr)) :
> > _mm_loadu_si128((__m128i_u*)(ptr))
> 
> This won't work if ptr is a __m128i *.  It is allowed to optimize
> (uintptr_t)(__m128i *)foo % 15 to 0 because the standard says (__m128i *)foo

I mean % 16, not % 15.

> invokes undefined behavior when foo is a pointer not aligned to 16-byte
> boundary (C23 section 6.3.2.3p6).
