[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160

2025-05-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090

--- Comment #10 from GCC Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:f725d6765373f7884a2ea23bc11409b15545958b

commit r16-809-gf725d6765373f7884a2ea23bc11409b15545958b
Author: Andrew Pinski 
Date:   Mon May 5 09:46:14 2025 -0700

combine: gen_lowpart_no_emit vs CLOBBER [PR120090]

The problem here is that simplify-rtx.cc expects gen_lowpart_no_emit
to return NULL on failure, but combine's hook was returning a CLOBBER.
After r16-160-ge6f89d78c1a7528e93458278, gcc.target/i386/avx512bw-pr103750-2.c
started to fail at -m32 because the new simplify code would return
RTL with a clobber in it rather than returning NULL.
To fix this, gen_lowpart_no_emit should return NULL on failure
instead of a clobber. This only changes the gen_lowpart_no_emit hook and
not the generic gen_lowpart hook, as parts of combine just pass the
gen_lowpart result on directly without checking the return value.

Bootstrapped and tested on x86_64-linux-gnu.

PR rtl-optimization/120090
gcc/ChangeLog:

* combine.cc (gen_lowpart_for_combine_no_emit): New function.
(RTL_HOOKS_GEN_LOWPART_NO_EMIT): Set to
gen_lowpart_for_combine_no_emit.

Signed-off-by: Andrew Pinski 
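
For readers who do not have combine.cc open, the contract mismatch described in the commit message can be sketched with a toy model. Everything below is an assumption made for illustration: `Expr`, `LowpartHook`, `lowpart_sentinel`, `lowpart_no_emit` and `simplify_with` are hypothetical stand-ins, not GCC's `rtx`, `rtl_hooks` or the real functions. The sketch only shows the shape of the problem (a caller that checks for null, a hook that reports failure with a sentinel) and the shape of the fix (a thin adapter between the two conventions).

```cpp
// Toy model only: none of these types or functions exist in GCC.
#include <cassert>
#include <memory>
#include <string>

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr {
  std::string code;   // e.g. "reg", "subreg", "clobber", "and", "const_int"
  ExprPtr op0, op1;   // operands; null for leaves
};

static ExprPtr make (const std::string &code, ExprPtr a = nullptr,
                     ExprPtr b = nullptr)
{
  assert (!code.empty ());
  return std::make_shared<Expr> (Expr{code, a, b});
}

// The generic caller's contract for a hook: nullptr means failure.
using LowpartHook = ExprPtr (*) (const ExprPtr &);

// Stand-in for combine's hook: failure is reported as a clobber sentinel.
static ExprPtr lowpart_sentinel (const ExprPtr &x)
{
  if (x && x->code == "reg")
    return make ("subreg", x);
  return make ("clobber", make ("const_int"));
}

// Stand-in for the fix: adapt the sentinel convention to the nullptr one.
static ExprPtr lowpart_no_emit (const ExprPtr &x)
{
  ExprPtr tem = lowpart_sentinel (x);
  if (!tem || tem->code == "clobber")
    return nullptr;
  return tem;
}

// Stand-in for the generic simplifier: it only ever checks for nullptr.
static ExprPtr simplify_with (LowpartHook hook, const ExprPtr &x)
{
  ExprPtr tem = hook (x);
  if (tem)
    return make ("and", tem, make ("const_int"));
  return nullptr;
}
```

In this model, `simplify_with (lowpart_sentinel, make ("unspec"))` builds an "and" whose first operand is a clobber, which mirrors the bad RTL from comment #5, while `simplify_with (lowpart_no_emit, make ("unspec"))` returns null so the caller gives up cleanly. Leaving `lowpart_sentinel` itself untouched mirrors the decision not to change the generic gen_lowpart hook.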

[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160

2025-05-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #11 from Andrew Pinski  ---
Fixed.

[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160

2025-05-05 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090

--- Comment #9 from Andrew Pinski  ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2025-May/682674.html

[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160

2025-05-05 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090

--- Comment #8 from Andrew Pinski  ---
Created attachment 61322
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61322&action=edit
Patch which I am testing

I did a quick test with the patch and compile.exp had no failures on x86_64.
Will do a full bootstrap/test before submitting.

[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160

2025-05-05 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090

--- Comment #7 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #6)
> 
> But nothing else in simplify-rtx.cc checks whether gen_lowpart_no_emit
> returned a CLOBBER. Or should we wrap gen_lowpart_for_combine and return
> NULL when it is a clobber ...
> 

That is this:
```
diff --git a/gcc/combine.cc b/gcc/combine.cc
index 67cf0447607..366886020e7 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -458,6 +458,7 @@ static rtx simplify_shift_const (rtx, enum rtx_code, machine_mode, rtx,
                                  int);
 static int recog_for_combine (rtx *, rtx_insn *, rtx *, unsigned = 0, unsigned = 0);
 static rtx gen_lowpart_for_combine (machine_mode, rtx);
+static rtx gen_lowpart_for_combine_no_emit (machine_mode, rtx);
 static enum rtx_code simplify_compare_const (enum rtx_code, machine_mode,
 rtx *, rtx *);
 static enum rtx_code simplify_comparison (enum rtx_code, rtx *, rtx *);
@@ -491,7 +492,7 @@ static rtx gen_lowpart_or_truncate (machine_mode, rtx);

 /* Our implementation of gen_lowpart never emits a new pseudo.  */
 #undef RTL_HOOKS_GEN_LOWPART_NO_EMIT
-#define RTL_HOOKS_GEN_LOWPART_NO_EMIT  gen_lowpart_for_combine
+#define RTL_HOOKS_GEN_LOWPART_NO_EMIT  gen_lowpart_for_combine_no_emit

 #undef RTL_HOOKS_REG_NONZERO_REG_BITS
 #define RTL_HOOKS_REG_NONZERO_REG_BITS reg_nonzero_bits_for_combine
@@ -11890,6 +11891,16 @@ gen_lowpart_for_combine (machine_mode omode, rtx x)
  fail:
   return gen_rtx_CLOBBER (omode, const0_rtx);
 }
+
+static rtx
+gen_lowpart_for_combine_no_emit (machine_mode omode, rtx x)
+{
+  rtx tem = gen_lowpart_for_combine (omode, x);
+  if (!tem || GET_CODE (tem) == CLOBBER)
+    return NULL_RTX;
+  return tem;
+}
+
 ^L
 /* Try to simplify a comparison between OP0 and a constant OP1,
    where CODE is the comparison code that will be tested, into a

```

[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160

2025-05-05 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org

--- Comment #6 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #5)
> before:(and:SI (subreg:SI (unspec:QI [
> (mem:V8HI (reg/f:SI 98 [ pi128.0_1 ]) [0 *pi128.0_1+0 S16 A128])
> (reg:V8HI 110 [ _16 ])
> (const_int 0 [0])
> ] UNSPEC_UNSIGNED_PCMP) 0)
> (const_int 255 [0xff]))
> after:(and:DI (clobber:DI (const_int 0 [0]))
> (const_int 255 [0xff]))
> 
> 
> clobber means we can't do it ...

Instead of rtl_hooks.gen_lowpart_no_emit returning NULL on failure,
combine's gen_lowpart hook returns a clobber.

This fixes the issue at hand:
```
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index 7bcbe11370f..ea730c4d9aa 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -1716,7 +1716,7 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
          && INTVAL (XEXP (op, 1)) > 0)
        {
          rtx tem = rtl_hooks.gen_lowpart_no_emit (mode, XEXP (op, 0));
-         if (tem)
+         if (tem && GET_CODE (tem) != CLOBBER)
            return simplify_gen_binary (AND, mode, tem, XEXP (op, 1));
        }


```

But nothing else in simplify-rtx.cc checks whether gen_lowpart_no_emit
returned a CLOBBER. Or should we wrap gen_lowpart_for_combine and return
NULL when it is a clobber ...

Still deciding.

[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160

2025-05-05 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090

--- Comment #5 from Andrew Pinski  ---

before:(and:SI (subreg:SI (unspec:QI [
(mem:V8HI (reg/f:SI 98 [ pi128.0_1 ]) [0 *pi128.0_1+0 S16 A128])
(reg:V8HI 110 [ _16 ])
(const_int 0 [0])
] UNSPEC_UNSIGNED_PCMP) 0)
(const_int 255 [0xff]))
after:(and:DI (clobber:DI (const_int 0 [0]))
(const_int 255 [0xff]))


clobber means we can't do it ...
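
To spell out why the "after" form is a dead end: as far as I understand it, combine treats a (clobber (const_int 0)) anywhere in a candidate pattern as a "could not do this" marker and refuses to match it. The sketch below is a toy illustration of that kind of check, not GCC code. `Node` and `contains_clobber` are hypothetical names, and the real check works on rtx and has more cases than this.

```cpp
// Toy model only: Node is a hypothetical stand-in for an RTL expression.
#include <cassert>
#include <string>
#include <vector>

struct Node {
  std::string code;        // e.g. "and", "subreg", "unspec", "clobber"
  std::vector<Node> ops;   // sub-expressions, empty for leaves
};

// Return true if EXPR, or any sub-expression of EXPR, is a clobber.
static bool contains_clobber (const Node &expr)
{
  if (expr.code == "clobber")
    return true;
  for (const Node &op : expr.ops)
    if (contains_clobber (op))
      return true;
  return false;
}
```

Modelling the two dumps above, the "before" tree (and (subreg (unspec)) (const_int)) passes this check, while the "after" tree (and (clobber (const_int)) (const_int)) does not, so a pattern built around it can never be recognized.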

[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160

2025-05-05 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090

--- Comment #4 from Andrew Pinski  ---
Created attachment 61321
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61321&action=edit
Reduced test which shows the kmov

This removes the other functions and leaves one which shows the issue.

[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160

2025-05-05 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090

--- Comment #3 from Andrew Pinski  ---
Before the change, combine could do this:
```
Trying 15 -> 16:
   15: {r121:SI=zero_extend(unspec[[r98:SI],r112:V8HI,0] 157);r115:QI=unspec[[r98:SI],r112:V8HI,0] 157;}
  REG_DEAD r112:V8HI
  REG_DEAD r98:SI
   16: r120:DI=zero_extend(r121:SI)
  REG_DEAD r121:SI
Successfully matched this instruction:
(parallel [
(set (reg:DI 120 [ _6 ])
(zero_extend:DI (unspec:QI [
(mem:V8HI (reg/f:SI 98 [ pi128.0_1 ]) [0 *pi128.0_1+0 S16 A128])
(reg:V8HI 112 [ _18 ])
(const_int 0 [0])
] UNSPEC_UNSIGNED_PCMP)))
(set (reg:QI 115)
(unspec:QI [
(mem:V8HI (reg/f:SI 98 [ pi128.0_1 ]) [0 *pi128.0_1+0 S16 A128])
(reg:V8HI 112 [ _18 ])
(const_int 0 [0])
] UNSPEC_UNSIGNED_PCMP))
])
allowing combination of insns 15 and 16
original costs 0 + 4 = 0
replacement cost 0
deferring deletion of insn with uid = 15.
deferring deletion of insn with uid = 15.
modifying insn i3    16: {r120:DI=zero_extend(unspec[[r98:SI],r112:V8HI,0] 157);r115:QI=unspec[[r98:SI],r112:V8HI,0] 157;}
  REG_DEAD r98:SI
  REG_DEAD r112:V8HI
deferring rescan insn with uid = 16.
```

But after the change, combine no longer does this combination, and the dump does not say why.

[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160

2025-05-05 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090

Jakub Jelinek  changed:

   What|Removed |Added

   Keywords|needs-bisection |
 CC||jakub at gcc dot gnu.org,
   ||pinskia at gcc dot gnu.org
Summary|[16 Regression] |[16 Regression]
   |gcc.target/i386/avx512bw-pr |gcc.target/i386/avx512bw-pr
   |103750-2.c  |103750-2.c since r16-160

--- Comment #2 from Jakub Jelinek  ---
Started with r16-160-ge6f89d78c1a7528e93458278e35d365544a18c26