On Fri, Jun 28, 2024 at 3:15 AM liuhongt <[email protected]> wrote:
>
> For the testcase in PR115406, here is part of the dump.
>
> char D.4882;
> vector(1) <signed-boolean:8> _1;
> vector(1) signed char _2;
> char _5;
>
> <bb 2> :
> _1 = { -1 };
>
> When assigning { -1 } to vector(1) <signed-boolean:8>,
> since TYPE_PRECISION (itype) <= BITS_PER_UNIT, it sets each bit of dest
> with each vector element. But I think the bit setting should only apply for
> TYPE_PRECISION (itype) < BITS_PER_UNIT, i.e. for vector(1)
> <signed-boolean:16>, it will be assigned as -1, instead of 1.
> Is there any specific reason vector(1) <signed-boolean:8> is handled
> differently from vector(1) <signed-boolean:16>?
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
I agree that <= BITS_PER_UNIT is suspicious, but the bit-precision code
should work for 8-bit entities as well. It seems we only set the LSB of
each element in the "mask". ISTR that SVE masks can have up to 8-bit
elements (for 8-byte data elements), so maybe that's why the check is
<= BITS_PER_UNIT. So maybe instead of just setting one bit in

  ptr[bit / BITS_PER_UNIT] |= 1 << (bit % BITS_PER_UNIT);

we should set elt_bits bits, aka (without testing)

  ptr[bit / BITS_PER_UNIT] |= ((1 << elt_bits) - 1) << (bit % BITS_PER_UNIT);
?
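
As a standalone illustration of that idea (not tested against GCC itself;
encode_bool_elts and its parameters are made-up names for this sketch, not
anything in fold-const.cc), something like the following sets all elt_bits
bits of each true element. It assumes elt_bits divides CHAR_BIT, so that an
element never straddles a byte boundary:

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical helper, not GCC code: pack COUNT boolean elements, each
   ELT_BITS wide, into the LEN-byte buffer PTR.  Element 0 goes in the
   least significant bits of byte 0.  A true element sets all of its
   ELT_BITS bits rather than only its lowest bit.  */
static void
encode_bool_elts (unsigned char *ptr, size_t len, const int *elts,
                  unsigned count, unsigned elt_bits)
{
  /* Assumption of this sketch: elements never cross a byte boundary.  */
  assert (elt_bits >= 1 && elt_bits <= CHAR_BIT
          && CHAR_BIT % elt_bits == 0);

  memset (ptr, 0, len);

  /* Parenthesized so the subtraction applies to the shifted value;
     "1 << elt_bits - 1" would parse as "1 << (elt_bits - 1)".  */
  unsigned mask = (1u << elt_bits) - 1;

  for (unsigned i = 0; i < count; i++)
    {
      unsigned bit = i * elt_bits;
      if (elts[i] && bit / CHAR_BIT < len)
        ptr[bit / CHAR_BIT] |= mask << (bit % CHAR_BIT);
    }
}
```

With elt_bits == 8 a single true element encodes as the byte 0xff, which
reads back as -1 through a signed char on x86, the value the testcase in
the PR checks for.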
> gcc/ChangeLog:
>
> PR middle-end/115406
> * fold-const.cc (native_encode_vector_part): Don't set each
> bit to the dest when TYPE_PRECISION (itype) == BITS_PER_UNIT.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr115406.c: New test.
> ---
> gcc/fold-const.cc | 2 +-
> gcc/testsuite/gcc.target/i386/pr115406.c | 23 +++++++++++++++++++++++
> 2 files changed, 24 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/gcc.target/i386/pr115406.c
>
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 710d697c021..0f045f851d1 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -8077,7 +8077,7 @@ native_encode_vector_part (const_tree expr, unsigned char *ptr, int len,
> {
> tree itype = TREE_TYPE (TREE_TYPE (expr));
> if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (expr))
> - && TYPE_PRECISION (itype) <= BITS_PER_UNIT)
> + && TYPE_PRECISION (itype) < BITS_PER_UNIT)
> {
> /* This is the only case in which elements can be smaller than a byte.
> Element 0 is always in the lsb of the containing byte. */
> diff --git a/gcc/testsuite/gcc.target/i386/pr115406.c b/gcc/testsuite/gcc.target/i386/pr115406.c
> new file mode 100644
> index 00000000000..623dff06fc3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr115406.c
> @@ -0,0 +1,23 @@
> +/* { dg-do run } */
> +/* { dg-options "-O0 -mavx512f" } */
> +/* { dg-require-effective-target avx512f } */
> +
> +typedef __attribute__((__vector_size__ (1))) char V;
> +
> +char
> +foo (V v)
> +{
> + return ((V) v == v)[0];
> +}
> +
> +int
> +main ()
> +{
> + if (!__builtin_cpu_supports ("avx512f"))
> + return 0;
> +
> + char x = foo ((V) { });
> + if (x != -1)
> + __builtin_abort ();
> + return 0;
> +}
> --
> 2.31.1
>