On Fri, Jun 28, 2024 at 3:15 AM liuhongt <hongtao....@intel.com> wrote:
>
> for the testcase in the PR115406, here is part of the dump.
>
>   char D.4882;
>   vector(1) <signed-boolean:8> _1;
>   vector(1) signed char _2;
>   char _5;
>
>   <bb 2> :
>   _1 = { -1 };
>
> When assign { -1 } to vector(1} {signed-boolean:8},
> Since TYPE_PRECISION (itype) <= BITS_PER_UNIT, so it set each bit of dest
> with each vector elemnet. But i think the bit setting should only apply for
> TYPE_PRECISION (itype) < BITS_PER_UNIT. .i.e for vector(1).
> <signed-boolean:16>, it will be assigned as -1, instead of 1.
> Is there any specific reason vector(1) <signed-boolean:8> is handled
> differently from vector<1> <signed-boolean:16>?
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?

I agree that <= BITS_PER_UNIT is suspicious, but the bit-precision
code should work for 8 bit
entities as well, it seems we only set the LSB of each element in the
"mask".  ISTR that SVE
masks can have up to 8 bit elements (for 8 byte data elements), so
maybe that's why
<= BITS_PER_UNIT.  So maybe instead of just setting one bit in

              ptr[bit / BITS_PER_UNIT] |= 1 << (bit % BITS_PER_UNIT);

we should set elt_bits bits, aka (without testing)

              ptr[bit / BITS_PER_UNIT] |= (1 << elt_bits - 1) << (bit
% BITS_PER_UNIT);

?

> gcc/ChangeLog:
>
>         PR middle-end/115406
>         * fold-const.cc (native_encode_vector_part): Don't set each
>         bit to the dest when TYPE_PRECISION (itype) == BITS_PER_UNIT.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/i386/pr115406.c: New test.
> ---
>  gcc/fold-const.cc                        |  2 +-
>  gcc/testsuite/gcc.target/i386/pr115406.c | 23 +++++++++++++++++++++++
>  2 files changed, 24 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr115406.c
>
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 710d697c021..0f045f851d1 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -8077,7 +8077,7 @@ native_encode_vector_part (const_tree expr, unsigned 
> char *ptr, int len,
>  {
>    tree itype = TREE_TYPE (TREE_TYPE (expr));
>    if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (expr))
> -      && TYPE_PRECISION (itype) <= BITS_PER_UNIT)
> +      && TYPE_PRECISION (itype) < BITS_PER_UNIT)
>      {
>        /* This is the only case in which elements can be smaller than a byte.
>          Element 0 is always in the lsb of the containing byte.  */
> diff --git a/gcc/testsuite/gcc.target/i386/pr115406.c 
> b/gcc/testsuite/gcc.target/i386/pr115406.c
> new file mode 100644
> index 00000000000..623dff06fc3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr115406.c
> @@ -0,0 +1,23 @@
> +/* { dg-do run } */
> +/* { dg-options "-O0 -mavx512f" } */
> +/* { dg-require-effective-target avx512f } */
> +
> +typedef __attribute__((__vector_size__ (1))) char V;
> +
> +char
> +foo (V v)
> +{
> +  return ((V) v == v)[0];
> +}
> +
> +int
> +main ()
> +{
> +  if (!__builtin_cpu_supports ("avx512f"))
> +    return 0;
> +
> +  char x = foo ((V) { });
> +  if (x != -1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --
> 2.31.1
>

Reply via email to