[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-07-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406

--- Comment #9 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:348d890c287a7ec4c88d3082ae6105537bd39398

commit r15-2161-g348d890c287a7ec4c88d3082ae6105537bd39398
Author: Richard Sandiford 
Date:   Fri Jul 19 19:09:37 2024 +0100

Treat boolean vector elements as 0/-1 [PR115406]

Previously we built vector boolean constants using 1 for true
elements and 0 for false elements.  This matches the predicates
produced by SVE's PTRUE instruction, but leads to a miscompilation
on AVX, where all bits of a boolean element should be set.

One option for RTL would be to make this target-configurable.
But that isn't really possible at the tree level, where vectors
should work in a more target-independent way.  (There is currently
no way to create a "generic" packed boolean vector, but never say
never :))  And, if we were going to pick a generic behaviour,
it would make sense to use 0/-1 rather than 0/1, for consistency
with integer vectors.

Both behaviours should work with SVE on read, since SVE ignores
the upper bits in each predicate element.  And the choice shouldn't
make much difference for RTL, since all SVE predicate modes are
expressed as vectors of BI, rather than of multi-bit booleans.

I suspect there might be some fallout from this change on SVE.
But I think we should at least give it a go, and see whether any
fallout provides a strong counterargument against the approach.

gcc/
PR middle-end/115406
* fold-const.cc (native_encode_vector_part): For vector booleans,
check whether an element is nonzero and, if so, set all of the
correspending bits in the target image.
* simplify-rtx.cc (native_encode_rtx): Likewise.

gcc/testsuite/
PR middle-end/115406
* gcc.dg/torture/pr115406.c: New test.

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-07-19 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406

Richard Sandiford  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #10 from Richard Sandiford  ---
Fixed.  Although the problem is latent on branches, I think we should only
consider a backport after a few months, to give plenty of time for fallout to
be detected.

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-07-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed|2024-06-10 00:00:00 |2024-7-18
 CC||rsandifo at gcc dot gnu.org

--- Comment #8 from Richard Biener  ---
IIRC Richard S. had a patch to native_encode_vector_part to fix this by making
it set all bits instead of just the LSB?

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-06-23 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406

--- Comment #7 from Hongtao Liu  ---

> 
> BTW, when assign -1 to vector(1) , should the upper bit be
> cleared? Look like only 1 element boolean vector is cleared, but not
> vector(2) .
> If the upper bits are not cleared, both 2 cases are equal.

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 710d697c021..0f045f851d1 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -8077,7 +8077,7 @@ native_encode_vector_part (const_tree expr, unsigned char
*ptr, int len,
 {
   tree itype = TREE_TYPE (TREE_TYPE (expr));
   if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (expr))
-  && TYPE_PRECISION (itype) <= BITS_PER_UNIT)
+  && TYPE_PRECISION (itype) < BITS_PER_UNIT)
 {
   /* This is the only case in which elements can be smaller than a byte.
 Element 0 is always in the lsb of the containing byte.  */


Can fix this.

It looks like it supposed to handle for itype *less than* but not *less equal*
BITS_PER_UNIT?

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406

--- Comment #6 from Hongtao Liu  ---
For 1 element vector, when backend doesn't support it's vector mode, the scalar
mode is used for the type, which makes expand_vec_cond_expr_p use QImode for
icode check.(vcond_mask_qiqi)

It could also be the case when both data type and cmp_type are
vector_boolean_type.

It looks like vcond_mask_qiqi is dichotomous.
For the former, it should be 
  operands[3] == 1 ? operands[1] : operands[2]

since mask is vector 1 boolean.

For the latter, it should be
 (operand[1] & operand[3]) | (operand[2] & ~operand[3]) 


BTW, when assign -1 to vector(1) , should the upper bit be
cleared? Look like only 1 element boolean vector is cleared, but not vector(2)
.
If the upper bits are not cleared, both 2 cases are equal.

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406

--- Comment #5 from Hongtao Liu  ---
>   _2 = VEC_COND_EXPR <_1, { -1 }, { 0 }>;

Hmm, it should check vcond_mask_qiv1qi instead of vcond_mask_qiqi, I guess
since the backend doesn't supports v1qi, TYPE_MODE of V is QImode, then it
wrongly checked vcond_mask_qiqi.

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406

--- Comment #4 from Hongtao Liu  ---

> 
> and for _2 = VIEW_CONVERT_EXPR(_1); we explicitly
> clear the upper bits due to PR113576, and then we get 1 hit the abort.
It's not VIEW_CONVERT_EXPR clear the uppper bits, but _1 = { -1 };

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406

--- Comment #3 from Hongtao Liu  ---
typedef __attribute__((__vector_size__ (1))) char V;

char
foo (V v)
{
  return ((V) v == v)[0];
}

int
main ()
{
  char x = foo ((V) { });
  if (x != -1)
__builtin_abort ();
}

w/ vcond_mask_qiqi, it's not lowered by veclower, and we get

char foo (V v)
{
  vector(1) signed char D.5142;
  char D.5141;
  vector(1)  _1;
  vector(1) signed char _2;
  char _5;

   :
  _1 = { -1 };
  _2 = VEC_COND_EXPR <_1, { -1 }, { 0 }>;
  D.5142 = _2;
  _5 = VIEW_CONVERT_EXPR(D.5142);

   :
:
  return _5;
}

But it's further simplified to 

char foo (V v)
{
  vector(1) signed char D.3765;
  char D.3764;
  vector(1)  _1;
  vector(1) signed char _2;
  char _5;

   :
  _1 = { -1 };
  _2 = VIEW_CONVERT_EXPR(_1);
  D.3765 = _2;
  _5 = VIEW_CONVERT_EXPR(D.3765);

   :
:
  return _5;

}

by isel

and for _2 = VIEW_CONVERT_EXPR(_1); we explicitly clear
the upper bits due to PR113576, and then we get 1 hit the abort.

It sound to me 
  _1 = { -1 };
  _2 = VEC_COND_EXPR <_1, { -1 }, { 0 }>;
shouldn't be simplified to 
_2 = VIEW_CONVERT_EXPR(_1);

when nunits is less than mode precision since the upper bit will be cleared.

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406

Hongtao Liu  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #2 from Hongtao Liu  ---
I'll take a look.

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-06-10 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406

Sam James  changed:

   What|Removed |Added

Summary|[15 Regression] wrong code  |[15 Regression] wrong code
   |with vector compare at -O0  |with vector compare at -O0
   |with -mavx512f  |with -mavx512f since
   ||r15-920-gb6c6d5abf0d31c
   Last reconfirmed||2024-06-10
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 CC||liuhongt at gcc dot gnu.org

--- Comment #1 from Sam James  ---
r15-920-gb6c6d5abf0d31c