On Wed, 26 Feb 2025, Tamar Christina wrote:
> > > >
> > > > No, I don't think so. The code that eventually performs a
> > > > contiguous sub-group access directly should never extend
> > > > the load beyond GROUP_SIZE - or should be gated on the DR
> > > > not executed speculatively. That is, we should "fix" this
> > > > elsewhere.
> > > >
> > >
> > > It doesn't, it's just not aligned within the range of GROUP_SIZE
> > > from what I remember.
> > >
> > > > If you have an updated patch I can look at what's wrong here if you
> > > > tell me how to reproduce (after applying the patch I suppose).
> > >
> > > Yes, applying the patch and running:
> > >
> > > /work/build/gcc/xgcc -B/work/build/gcc/
> > /work/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c -m64
> > -fdiagnostics-
> > plain-output -flto -ffat-lto-objects -msse2 -ftree-vectorize
> > -fno-tree-loop-
> > distribute-patterns -fno-vect-cost-model -fno-common -O2 -fdump-tree-vect-
> > details -msse4.1 -lm -o ./vect-early-break_26.exe
> >
> > So it works as in executing fine. We have a VF of 4 and
> >
> > note: recording new base alignment for &b
> > alignment: 32
> > misalignment: 0
> > based on: _1 = b[i_32];
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note: recording new base alignment for &a
> > alignment: 32
> > misalignment: 0
> > based on: _2 = a[i_32];
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note: vect_compute_data_ref_alignment:
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note: alignment increased due to early break to 32 bytes.
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > missed: misalign = 8 bytes of ref b[i_32]
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note: vect_compute_data_ref_alignment:
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note: alignment increased due to early break to 32 bytes.
> >
> > so no peeling necessary. But we also have like
> >
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > missed: misalign = 12 bytes of ref b[_6]
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note: vect_compute_data_ref_alignment:
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note: alignment increased due to early break to 32 bytes.
> >
> > and we are correctly saying we vectorize an unaligned access.
> >
> > The "issue" is we're having SLP nodes with a load permutation, their
> > expansion might not happen with the whole DR group in mind. I'd say
> > we simply refuse to do early break speculative load vectorization
> > for SLP nodes with a load permutation.
>
> This is what I was trying to say on IRC when I mentioned that the permutes
> can end up creating an unaligned access wrt to the original address.
>
> But the reason I was still trying to allow this case is because conceptually
> my assumption was that the permutes still maintain the access within
> the group. After all, they're just shifting elements around.
>
> In other words, I was assuming that the group a[i] - a[i-2] still stays within
> the group alignment of 32-bytes, even if the permute can make the second
> load in the group start at say, byte 28. My assumption was though that it
> can't
> make it start at byte 36.
>
> Are you saying that this is the case? that it can? Then I agree the load
> permutations
> on group loads are unsafe to speculate for unmasked loops...
Yes, looking at the code generation the loads do not stay within the
original properly aligned boundary.
Richard.
> Thanks,
> Tamar
> >
> > It looks like a latent issue to me which could also interfere with
> > gap peeling, I have to dig a bit further what code is responsible
> > for the current behavior ...
> >
> >
> >
> > > Thanks,
> > > Tamar
> > >
> > > >
> > > > > Enforcing the alignment on every group member would be wrong I think
> > > > > since
> > > > > that ends up higher overall alignment than they need.
> > > > >
> > > > > > So besides these issues in get_load_store_type the change looks
> > > > > > good now.
> > > > > >
> > > > >
> > > > > Thanks for the reviews.
> > > > >
> > > > > Tamar
> > > > > > Richard.
> > > > > >
> > > > > > > + else
> > > > > > > + *alignment_support_scheme = dr_aligned;
> > > > > > > + }
> > > > > > > +
> > > > > > > if (*alignment_support_scheme == dr_unaligned_unsupported)
> > > > > > > {
> > > > > > > if (dump_enabled_p ())
> > > > > > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > > > > > > index
> > > > > >
> > > >
> > b0cb081cba0ae8b11fbfcfcb8c6d440ec451ccb5..97caf61b345735d297ec49fd6ca
> > > > > > 64797435b46fc 100644
> > > > > > > --- a/gcc/tree-vectorizer.h
> > > > > > > +++ b/gcc/tree-vectorizer.h
> > > > > > > @@ -1281,7 +1281,11 @@ public:
> > > > > > >
> > > > > > > /* Set by early break vectorization when this DR needs peeling
> > > > > > > for
> > alignment
> > > > > > > for correctness. */
> > > > > > > - bool need_peeling_for_alignment;
> > > > > > > + bool safe_speculative_read_required;
> > > > > > > +
> > > > > > > + /* Set by early break vectorization when this DR's scalar
> > > > > > > accesses are
> > known
> > > > > > > + to be inbounds of a known bounds loop. */
> > > > > > > + bool scalar_access_known_in_bounds;
> > > > > > >
> > > > > > > tree base_decl;
> > > > > > >
> > > > > > > @@ -1997,6 +2001,35 @@ dr_target_alignment (dr_vec_info *dr_info)
> > > > > > > return dr_info->target_alignment;
> > > > > > > }
> > > > > > > #define DR_TARGET_ALIGNMENT(DR) dr_target_alignment (DR)
> > > > > > > +#define DR_SCALAR_KNOWN_BOUNDS(DR) (DR)-
> > > > > > >scalar_access_known_in_bounds
> > > > > > > +
> > > > > > > +/* Return if the stmt_vec_info requires peeling for alignment.
> > > > > > > */
> > > > > > > +inline bool
> > > > > > > +dr_safe_speculative_read_required (stmt_vec_info stmt_info)
> > > > > > > +{
> > > > > > > + dr_vec_info *dr_info;
> > > > > > > + if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
> > > > > > > + dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT
> > > > (stmt_info));
> > > > > > > + else
> > > > > > > + dr_info = STMT_VINFO_DR_INFO (stmt_info);
> > > > > > > +
> > > > > > > + return dr_info->safe_speculative_read_required;
> > > > > > > +}
> > > > > > > +
> > > > > > > +/* Set the safe_speculative_read_required for the the
> > > > > > > stmt_vec_info, if
> > group
> > > > > > > + access then set on the fist element otherwise set on DR
> > > > > > > directly. */
> > > > > > > +inline void
> > > > > > > +dr_set_safe_speculative_read_required (stmt_vec_info stmt_info,
> > > > > > > + bool requires_alignment)
> > > > > > > +{
> > > > > > > + dr_vec_info *dr_info;
> > > > > > > + if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
> > > > > > > + dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT
> > > > (stmt_info));
> > > > > > > + else
> > > > > > > + dr_info = STMT_VINFO_DR_INFO (stmt_info);
> > > > > > > +
> > > > > > > + dr_info->safe_speculative_read_required = requires_alignment;
> > > > > > > +}
> > > > > > >
> > > > > > > inline void
> > > > > > > set_dr_target_alignment (dr_vec_info *dr_info, poly_uint64 val)
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Richard Biener <[email protected]>
> > > > > > SUSE Software Solutions Germany GmbH,
> > > > > > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > > > Nuernberg)
> > > > >
> > > >
> > > > --
> > > > Richard Biener <[email protected]>
> > > > SUSE Software Solutions Germany GmbH,
> > > > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > Nuernberg)
> > >
> >
> > --
> > Richard Biener <[email protected]>
> > SUSE Software Solutions Germany GmbH,
> > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)