On Wed, 14 Feb 2024, Richard Sandiford wrote:

> Richard Biener <rguent...@suse.de> writes:
> > The following avoids accessing out-of-bound vector elements when
> > native encoding a boolean vector with sub-BITS_PER_UNIT precision
> > elements.  The error was basing the number of elements to extract
> > on the rounded up total byte size involved and the patch bases
> > everything on the total number of elements to extract instead.
> 
> It's too long ago to be certain, but I think this was a deliberate choice.
> The point of the new vector constant encoding is that it can give an
> allegedly sensible value for any given index, even out-of-range ones.
> 
> Since the padding bits are undefined, we should in principle have a free
> choice of what to use.  And for VLA, it's often better to continue the
> existing pattern rather than force to zero.
> 
> I don't strongly object to changing it.  I think we should be careful
> about relying on zeroing for correctness though.  The bits are in principle
> undefined and we can't rely on reading zeros from equivalent memory or
> register values.

The main motivation for a change here is to allow catching out-of-bound
indices to VECTOR_CST_ELT again, at least for constant nunits, since
such an access is likely a programming error like fat-fingering the
index.  I do think it's a regression that we no longer catch those.

It's probably also a bit non-obvious how an encoding continues, and
there might be DImode masks that can be represented by a zero-extended
QImode immediate but that, when "continued" into the padding, would
require a larger immediate.

The change also effectively only affects single-byte encodings, since
nunits is a power of two and so is the element size in bits, which
means the total bit size is only not byte-aligned when it is smaller
than a byte.

A patch restoring the VECTOR_CST_ELT bounds checking might be the
following:

diff --git a/gcc/tree.cc b/gcc/tree.cc
index 046a558d1b0..4c9b05167fd 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -10325,6 +10325,9 @@ vector_cst_elt (const_tree t, unsigned int i)
   if (i < encoded_nelts)
     return VECTOR_CST_ENCODED_ELT (t, i);
 
+  /* Catch out-of-bound element accesses.  */
+  gcc_checking_assert (maybe_gt (VECTOR_CST_NELTS (t), i));
+
   /* If there are no steps, the final encoded value is the right one.  */
   if (!VECTOR_CST_STEPPED_P (t))
     {

but it triggers quite a bit via const_binop, for example

#2  0x00000000011c1506 in const_binop (code=PLUS_EXPR, 
    arg1=<vector_cst 0x7ffff6f40b40>, arg2=<vector_cst 0x7ffff6e731e0>)
(gdb) p debug_generic_expr (arg1)
{ 12, 13, 14, 15 }
$5 = void
(gdb) p debug_generic_expr (arg2)
{ -2, -2, -2, -3 }
(gdb) p count
$4 = 6
(gdb) l
1711          if (!elts.new_binary_operation (type, arg1, arg2, step_ok_p))
1712            return NULL_TREE;
1713          unsigned int count = elts.encoded_nelts ();
1714          for (unsigned int i = 0; i < count; ++i)
1715            {
1716              tree elem1 = VECTOR_CST_ELT (arg1, i);
1717              tree elem2 = VECTOR_CST_ELT (arg2, i);
1718
1719              tree elt = const_binop (code, elem1, elem2);

This seems like an error to me - why would we, for fixed-size
vectors and for PLUS, ever create a vector encoding with 6 elements?!
That seems inefficient at best.

Richard.

> Thanks,
> Richard
> >
> > As a side-effect this now consistently results in zeros in the
> > padding of the last encoded byte which also avoids the failure
> > mode seen in PR113576.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > OK?
> >
> > Thanks,
> > Richard.
> >
> >     PR middle-end/113576
> >     * fold-const.cc (native_encode_vector_part): Avoid accessing
> >     out-of-bound elements.
> > ---
> >  gcc/fold-const.cc | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > index 80e211e18c0..8638757312b 100644
> > --- a/gcc/fold-const.cc
> > +++ b/gcc/fold-const.cc
> > @@ -8057,13 +8057,13 @@ native_encode_vector_part (const_tree expr, unsigned char *ptr, int len,
> >     off = 0;
> >  
> >        /* Zero the buffer and then set bits later where necessary.  */
> > -      int extract_bytes = MIN (len, total_bytes - off);
> > +      unsigned elts_per_byte = BITS_PER_UNIT / elt_bits;
> > +      unsigned first_elt = off * elts_per_byte;
> > +      unsigned extract_elts = MIN (len * elts_per_byte, count - first_elt);
> > +      unsigned extract_bytes = CEIL (elt_bits * extract_elts, BITS_PER_UNIT);
> >        if (ptr)
> >     memset (ptr, 0, extract_bytes);
> >  
> > -      unsigned int elts_per_byte = BITS_PER_UNIT / elt_bits;
> > -      unsigned int first_elt = off * elts_per_byte;
> > -      unsigned int extract_elts = extract_bytes * elts_per_byte;
> >        for (unsigned int i = 0; i < extract_elts; ++i)
> >     {
> >       tree elt = VECTOR_CST_ELT (expr, first_elt + i);
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
