https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124037

--- Comment #9 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Victor Do Nascimento
<[email protected]>:

https://gcc.gnu.org/g:4a30b45ffe3cb4ad2e35d73f1714f1a80e32edd7

commit r16-7915-g4a30b45ffe3cb4ad2e35d73f1714f1a80e32edd7
Author: Victor Do Nascimento <[email protected]>
Date:   Tue Feb 10 16:36:31 2026 +0000

    vect: fix vectorization of non-gather elementwise loads [PR124037]

    For the vectorization of non-contiguous memory accesses, such as
    loads from a particular struct member, and specifically when
    vectorizing with unknown bounds (thus using a pointer rather than an
    array), inadequate alignment checking allows a single vectorized
    loop iteration to cross a page boundary.  This can lead to
    segmentation faults in the resulting binaries.

    For example, for the given datatype:

        typedef struct {
          uint64_t a;
          uint64_t b;
          uint32_t flag;
          uint32_t pad;
        } Data;

    and a loop such as:

    int
    foo (Data *ptr) {
      if (ptr == NULL)
        return -1;

      int cnt;
      for (cnt = 0; cnt < MAX; cnt++) {
        if (ptr->flag == 0)
          break;
        ptr++;
      }
      return cnt;
    }

    the vectorizer yields the following cfg on armhf:

    <bb 1>:
    _41 = ptr_4(D) + 16;
    <bb 2>:
    _44 = MEM[(unsigned int *)ivtmp_42];
    ivtmp_45 = ivtmp_42 + 24;
    _46 = MEM[(unsigned int *)ivtmp_45];
    ivtmp_47 = ivtmp_45 + 24;
    _48 = MEM[(unsigned int *)ivtmp_47];
    ivtmp_49 = ivtmp_47 + 24;
    _50 = MEM[(unsigned int *)ivtmp_49];
    vect_cst__51 = {_44, _46, _48, _50};
    mask_patt_6.17_52 = vect_cst__51 == { 0, 0, 0, 0 };
    if (mask_patt_6.17_52 != { 0, 0, 0, 0 })
      goto <bb 4>;
    else
      ivtmp_43 = ivtmp_42 + 96;
      goto <bb 2>;
    <bb 4>:
    ...

    without any proper address alignment checks on the starting address
    or on whether alignment is preserved across iterations. We therefore
    fix the handling of such cases.

    To correct this, we modify the logic in `get_load_store_type',
    specifically the logic responsible for ensuring we don't read more
    than the scalar code would in the context of early breaks, extending
    it from handling only gather-scatter and strided SLP accesses to
    also cover element-wise accesses.  For these we require correct
    block alignment, promoting their `alignment_support_scheme' from
    `dr_unaligned_supported' to `dr_aligned'.

    gcc/ChangeLog:

            PR tree-optimization/124037
            * tree-vect-stmts.cc (get_load_store_type): Fix
            alignment_support_scheme categorization for early
            break VMAT_ELEMENTWISE accesses.

    gcc/testsuite/ChangeLog:

            * gcc.dg/vect/vect-pr124037.c: New.
            * g++.dg/vect/vect-pr124037.cc: New.
