https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97236

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
So what goes wrong is the single-element interleaving code-gen for the pointer
copy.  We have

t.c:18:21: note:   Detected single element interleaving picture_7(D)->p[i_18].p_pixels step 16

but for the store:

t.c:18:21: missed:   not consecutive access res_8(D)->p[i_18].p_pixels = _1;
t.c:18:21: note:   using strided accesses
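For orientation, here is a minimal sketch of the kind of loop these dump lines describe. It is not the testcase attached to the bug. The struct layout, the field names i_lines and i_pitch, the constant N_PLANES and the function name are all assumptions, chosen only so that each element is 16 bytes on an LP64 target, matching the reported step.

```c
#include <stddef.h>
#include <stdint.h>

#define N_PLANES 4   /* hypothetical plane count */

/* Hypothetical layout: an 8-byte pointer followed by 8 bytes of other
   data, so consecutive p_pixels fields are 16 bytes apart on LP64.  */
struct plane
{
  uint8_t *p_pixels;
  int i_lines;
  int i_pitch;
};

struct picture
{
  struct plane p[N_PLANES];
};

/* Copy only the pixel pointers.  The remaining fields of each element
   are the gap that both the load and the store must skip.  */
void
copy_pixel_pointers (struct picture *res, const struct picture *picture)
{
  for (int i = 0; i < N_PLANES; i++)
    res->p[i].p_pixels = picture->p[i].p_pixels;
}
```

With this shape, the load side is a group of size two in which only the first element is used, which would correspond to the "single element interleaving" note above.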

...

t.c:18:21: note:   ==> examining statement: _1 = picture_7(D)->p[i_18].p_pixels;
t.c:18:21: note:   vect_model_load_cost: aligned.
t.c:18:21: note:   vect_model_load_cost: inside_cost = 24, prologue_cost = 0 .

and in get_group_load_store_type we handle it as a single-element vector
(V1DI):

      if (!STMT_VINFO_STRIDED_P (first_stmt_info)
          && (can_overrun_p || !would_overrun_p)
          && compare_step_with_zero (vinfo, stmt_info) > 0)
        {
          /* First cope with the degenerate case of a single-element
             vector.  */
          if (known_eq (TYPE_VECTOR_SUBPARTS (vectype), 1U))
            *memory_access_type = VMAT_CONTIGUOUS;

This looks like it needs to be conditional on gap == 0?  Adding that fixes
the testcase.  This was added by g:6737facbb3c53a1f0158b05e4116c161ed9bc319.
Richard?  It looks like the !STMT_VINFO_STRIDED_P check might have been
supposed to prevent this?  In vectorizable_load we're also doing

  if (memory_access_type == VMAT_GATHER_SCATTER
      || (!slp && memory_access_type == VMAT_CONTIGUOUS))
    grouped_load = false;

but vect_transform_grouped_load doesn't like this case, possibly because
there's nothing to "permute" (eliding that alone doesn't fix the code-gen
issue).
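
To illustrate why the gap matters for the single-element-vector shortcut discussed above, here is a toy memory-level model. It is not GCC code, and it assumes that the contiguous classification makes the access advance by one element per iteration instead of by the full group. The function names and the group_size parameter are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Honors the group: touches one element per group and skips the rest.
   group_size == 1 means there is no gap.  */
static void
copy_strided (uint64_t *dst, const uint64_t *src, size_t n,
              size_t group_size)
{
  for (size_t i = 0; i < n; i++)
    dst[i * group_size] = src[i * group_size];
}

/* Ignores the group: walks memory one element at a time.  This matches
   copy_strided only when group_size == 1, that is, when gap == 0.  */
static void
copy_contiguous (uint64_t *dst, const uint64_t *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i];
}
```

With a group size of two, the contiguous variant overwrites the gap slots and never reaches the later group leaders. With a group size of one, the two variants touch exactly the same elements.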