On Wed, 14 Feb 2024, Richard Sandiford wrote:
> Richard Biener <[email protected]> writes:
> > The following avoids accessing out-of-bound vector elements when
> > native encoding a boolean vector with sub-BITS_PER_UNIT precision
> > elements. The error was basing the number of elements to extract
> > on the rounded up total byte size involved and the patch bases
> > everything on the total number of elements to extract instead.
>
> It's too long ago to be certain, but I think this was a deliberate choice.
> The point of the new vector constant encoding is that it can give an
> allegedly sensible value for any given index, even out-of-range ones.
>
> Since the padding bits are undefined, we should in principle have a free
> choice of what to use. And for VLA, it's often better to continue the
> existing pattern rather than force to zero.
>
> I don't strongly object to changing it. I think we should be careful
> about relying on zeroing for correctness though. The bits are in principle
> undefined and we can't rely on reading zeros from equivalent memory or
> register values.
The main motivation for a change here is to allow catching out-of-bound
indices again for VECTOR_CST_ELT, at least for constant nunits, since an
out-of-bound index might be a programming error like fat-fingering the
index.  I do think it's a regression that we no longer catch those.
It's probably also a bit non-obvious how an encoding continues, and
there might be DImode masks that can be represented by a zero-extended
QImode immediate but that, when "continued", would require a larger
immediate.
The change also effectively only alters behavior for 1-byte encodings,
since nunits is a power of two and so is the element size in bits.
A patch restoring the VECTOR_CST_ELT checking might be the following:
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 046a558d1b0..4c9b05167fd 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -10325,6 +10325,9 @@ vector_cst_elt (const_tree t, unsigned int i)
if (i < encoded_nelts)
return VECTOR_CST_ENCODED_ELT (t, i);
+ /* Catch out-of-bound element accesses. */
+ gcc_checking_assert (maybe_gt (VECTOR_CST_NELTS (t), i));
+
/* If there are no steps, the final encoded value is the right one. */
if (!VECTOR_CST_STEPPED_P (t))
{
but it triggers quite a bit via const_binop, for example:
#2 0x00000000011c1506 in const_binop (code=PLUS_EXPR,
arg1=<vector_cst 0x7ffff6f40b40>, arg2=<vector_cst 0x7ffff6e731e0>)
(gdb) p debug_generic_expr (arg1)
{ 12, 13, 14, 15 }
$5 = void
(gdb) p debug_generic_expr (arg2)
{ -2, -2, -2, -3 }
(gdb) p count
$4 = 6
(gdb) l
1711 if (!elts.new_binary_operation (type, arg1, arg2,
step_ok_p))
1712 return NULL_TREE;
1713 unsigned int count = elts.encoded_nelts ();
1714 for (unsigned int i = 0; i < count; ++i)
1715 {
1716 tree elem1 = VECTOR_CST_ELT (arg1, i);
1717 tree elem2 = VECTOR_CST_ELT (arg2, i);
1718
1719 tree elt = const_binop (code, elem1, elem2);
this seems like an error to me - why would we, for fixed-size vectors
and for PLUS, ever create a vector encoding with 6 elements?!  That
seems at least inefficient to me.
Richard.
> Thanks,
> Richard
> >
> > As a side-effect this now consistently results in zeros in the
> > padding of the last encoded byte which also avoids the failure
> > mode seen in PR113576.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > OK?
> >
> > Thanks,
> > Richard.
> >
> > PR middle-end/113576
> > * fold-const.cc (native_encode_vector_part): Avoid accessing
> > out-of-bound elements.
> > ---
> > gcc/fold-const.cc | 8 ++++----
> > 1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > index 80e211e18c0..8638757312b 100644
> > --- a/gcc/fold-const.cc
> > +++ b/gcc/fold-const.cc
> > @@ -8057,13 +8057,13 @@ native_encode_vector_part (const_tree expr, unsigned char *ptr, int len,
> > off = 0;
> >
> > /* Zero the buffer and then set bits later where necessary. */
> > - int extract_bytes = MIN (len, total_bytes - off);
> > + unsigned elts_per_byte = BITS_PER_UNIT / elt_bits;
> > + unsigned first_elt = off * elts_per_byte;
> > + unsigned extract_elts = MIN (len * elts_per_byte, count - first_elt);
> > + unsigned extract_bytes = CEIL (elt_bits * extract_elts, BITS_PER_UNIT);
> > if (ptr)
> > memset (ptr, 0, extract_bytes);
> >
> > - unsigned int elts_per_byte = BITS_PER_UNIT / elt_bits;
> > - unsigned int first_elt = off * elts_per_byte;
> > - unsigned int extract_elts = extract_bytes * elts_per_byte;
> > for (unsigned int i = 0; i < extract_elts; ++i)
> > {
> > tree elt = VECTOR_CST_ELT (expr, first_elt + i);
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)