https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908

--- Comment #10 from Matthias Kretz (Vir) <mkretz at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #9)
> One issue with
> 
> V load3(const unsigned long* ptr)
> {
>   V ret = {};
>   __builtin_memcpy(&ret, ptr, 3 * sizeof(unsigned long));
> 
> is that we cannot load a vector worth of data from ptr because that might trap

Unless the target has a masked load instruction (e.g. AVX512) or ptr is known
to be aligned to at least 16 bytes (in which case we know there cannot be a
page boundary at ptr + 24 bytes). No? In this specific example, ptr points to
a 32-byte vector object.
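
The alignment argument can be checked with a little address arithmetic. The
sketch below is only an illustration: page_size = 4096 and the helper name
tail_on_touched_page are assumptions made for the example, not anything from
GCC. It asks whether the last byte of a 32-byte load shares a page with byte
ptr + 16, which the 24-byte copy has to read anyway.

```cpp
#include <cstdint>

// Assumed page size; 4096 is typical on x86-64 but is not guaranteed.
constexpr std::uintptr_t page_size = 4096;

// Hypothetical helper: true if the last byte of a 32-byte load at addr lies on
// the same page as byte addr + 16, which the 3-element copy touches anyway.
constexpr bool tail_on_touched_page(std::uintptr_t addr)
{
  return (addr + 16) / page_size == (addr + 31) / page_size;
}

// 16-byte aligned, page boundary exactly at addr + 16: the tail is safe.
static_assert(tail_on_touched_page(4096 - 16), "aligned case 1");
// 16-byte aligned, page boundary exactly at addr + 32: the tail is safe.
static_assert(tail_on_touched_page(4096 - 32), "aligned case 2");
// Only 8-byte aligned: the boundary falls at addr + 24, so the tail may trap.
static_assert(!tail_on_touched_page(4096 - 24), "misaligned case");
```

Because any page size is a multiple of 16, a 16-byte-aligned ptr puts bytes
ptr + 16 through ptr + 31 inside one 16-byte block, and so on one page.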

The library can do this and it makes a difference:

    if (__builtin_object_size(ptr, 0) >= 4 * sizeof(T))
      __builtin_memcpy(&ret, ptr, 4 * sizeof(T));
    else
      __builtin_memcpy(&ret, ptr, 3 * sizeof(T));
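
For context, here is a self-contained sketch of how that branch might sit in
the load3 function from comment #9. The definitions of T and V are
assumptions (a four-element GCC vector-extension type). The sketch also
passes type 2 to __builtin_object_size instead of the 0 used above: type 0
yields (size_t)-1 when the size is unknown, which would select the
4-element copy, whereas type 2 yields 0 and falls back to the 3-element copy.
That change is a guess at the intent, not what the snippet above does.

```cpp
#include <cstddef>

// Assumed element and vector types; the comment only names them T and V.
using T = unsigned long;
typedef T V __attribute__((vector_size(4 * sizeof(T))));

V load3(const T* ptr)
{
  V ret = {};
  // Type 2 reports the minimum number of bytes known to remain in the object,
  // or 0 when nothing is known, so the wide copy is taken only when provably
  // in bounds.
  if (__builtin_object_size(ptr, 2) >= 4 * sizeof(T))
    __builtin_memcpy(&ret, ptr, 4 * sizeof(T));  // full-width load
  else
    __builtin_memcpy(&ret, ptr, 3 * sizeof(T));  // exact copy, no over-read
  return ret;
}
```

Note that the two branches differ in the fourth element: the wide copy leaves
whatever was stored at ptr[3], the narrow copy leaves zero. This only works if
the caller treats that element as padding.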
