[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090
--- Comment #10 from GCC Commits ---
The trunk branch has been updated by Andrew Pinski:
https://gcc.gnu.org/g:f725d6765373f7884a2ea23bc11409b15545958b
commit r16-809-gf725d6765373f7884a2ea23bc11409b15545958b
Author: Andrew Pinski
Date: Mon May 5 09:46:14 2025 -0700

    combine: gen_lowpart_no_emit vs CLOBBER [PR120090]

    The problem here is simplify-rtx.cc expects gen_lowpart_no_emit
    to return NULL on failure but combine's hook was returning CLOBBER.
    After r16-160-ge6f89d78c1a7528e93458278, gcc.target/i386/avx512bw-pr103750-2.c
    started to fail at -m32 due to this, as the new simplify code would return
    an RTL with a clobber in it rather than returning NULL.
    To fix this, gen_lowpart_no_emit should return NULL when there was a failure
    instead of a clobber. This only changes the gen_lowpart_no_emit hook and not
    the generic gen_lowpart hook, as parts of combine just pass the gen_lowpart
    result directly without checking the return value.

    Bootstrapped and tested on x86_64-linux-gnu.

    PR rtl-optimization/120090

    gcc/ChangeLog:

    * combine.cc (gen_lowpart_for_combine_no_emit): New function.
    (RTL_HOOKS_GEN_LOWPART_NO_EMIT): Set to gen_lowpart_for_combine_no_emit.

    Signed-off-by: Andrew Pinski
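The contract mismatch described in the commit message can be illustrated with a small, self-contained toy model. This is only a sketch and not GCC code: `Node`, `Code`, `lowpart_or_clobber`, `lowpart_or_null` and `simplify_and` are hypothetical names invented for the illustration. The first hook signals failure with a CLOBBER-like sentinel (as combine's generic hook does), the wrapper turns that sentinel into null (as the new `gen_lowpart_for_combine_no_emit` does), and the caller tests only for null (as the simplify-rtx.cc call site does).

```cpp
#include <cassert>

// Toy stand-ins for rtx codes; these are NOT GCC's definitions.
enum class Code { Reg, Mem, Const, Clobber, And };

// Hypothetical miniature expression node (GCC's rtx is far richer).
struct Node {
  Code code;
  const Node *op0;
  const Node *op1;
  explicit Node(Code c, const Node *a = nullptr, const Node *b = nullptr)
      : code(c), op0(a), op1(b) {}
};

using Hook = const Node *(*)(const Node *);

// Sentinel playing the role of (clobber (const_int 0)).
static const Node kClobber(Code::Clobber);

// Models the generic hook: failure is reported as a CLOBBER node.
// Assumption for the toy: only plain registers have a usable low part.
const Node *lowpart_or_clobber(const Node *x) {
  return x->code == Code::Reg ? x : &kClobber;
}

// Models the wrapper: translate the CLOBBER sentinel into null.
const Node *lowpart_or_null(const Node *x) {
  const Node *tem = lowpart_or_clobber(x);
  if (!tem || tem->code == Code::Clobber)
    return nullptr;
  return tem;
}

// Models a caller that checks only for null before building an AND.
// Returns true and fills *out when it produced a result.
bool simplify_and(Hook hook, const Node *x, const Node *mask, Node *out) {
  const Node *tem = hook(x);
  if (!tem)
    return false;
  *out = Node(Code::And, tem, mask);
  return true;
}

void demo_hook_contract() {
  Node mem(Code::Mem), mask(Code::Const), out(Code::Const);
  // Sentinel-returning hook: the caller "succeeds" and embeds the clobber.
  assert(simplify_and(lowpart_or_clobber, &mem, &mask, &out));
  assert(out.op0->code == Code::Clobber);
  // Null-returning wrapper: the caller correctly gives up.
  assert(!simplify_and(lowpart_or_null, &mem, &mask, &out));
}
```

Wrapping the hook keeps every null-checking caller correct without touching the call sites, which is the trade-off the commit message describes for leaving the generic hook alone.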
[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090
Andrew Pinski changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED|RESOLVED
--- Comment #11 from Andrew Pinski ---
Fixed.
[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090
--- Comment #9 from Andrew Pinski ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2025-May/682674.html
[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090
--- Comment #8 from Andrew Pinski ---
Created attachment 61322
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61322&action=edit
Patch which I am testing
I did a quick test with the patch and compile.exp had no failures on x86_64.
Will do a full bootstrap/test before submitting.
[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090
--- Comment #7 from Andrew Pinski ---
(In reply to Andrew Pinski from comment #6)
>
> But nowhere in simplify-rtx.cc checks that gen_lowpart_no_emit will return
> CLOBBER. Or should we wrap gen_lowpart_for_combine and return NULL when it
> is a clobber ...
>
That is this:
```
diff --git a/gcc/combine.cc b/gcc/combine.cc
index 67cf0447607..366886020e7 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -458,6 +458,7 @@ static rtx simplify_shift_const (rtx, enum rtx_code,
machine_mode, rtx,
int);
static int recog_for_combine (rtx *, rtx_insn *, rtx *, unsigned = 0, unsigned
= 0);
static rtx gen_lowpart_for_combine (machine_mode, rtx);
+static rtx gen_lowpart_for_combine_no_emit (machine_mode, rtx);
static enum rtx_code simplify_compare_const (enum rtx_code, machine_mode,
rtx *, rtx *);
static enum rtx_code simplify_comparison (enum rtx_code, rtx *, rtx *);
@@ -491,7 +492,7 @@ static rtx gen_lowpart_or_truncate (machine_mode, rtx);
/* Our implementation of gen_lowpart never emits a new pseudo. */
#undef RTL_HOOKS_GEN_LOWPART_NO_EMIT
-#define RTL_HOOKS_GEN_LOWPART_NO_EMIT gen_lowpart_for_combine
+#define RTL_HOOKS_GEN_LOWPART_NO_EMIT gen_lowpart_for_combine_no_emit
#undef RTL_HOOKS_REG_NONZERO_REG_BITS
#define RTL_HOOKS_REG_NONZERO_REG_BITS reg_nonzero_bits_for_combine
@@ -11890,6 +11891,16 @@ gen_lowpart_for_combine (machine_mode omode, rtx x)
fail:
return gen_rtx_CLOBBER (omode, const0_rtx);
}
+
+static rtx
+gen_lowpart_for_combine_no_emit (machine_mode omode, rtx x)
+{
+  rtx tem = gen_lowpart_for_combine (omode, x);
+  if (!tem || GET_CODE (tem) == CLOBBER)
+    return NULL_RTX;
+  return tem;
+}
+
^L
/* Try to simplify a comparison between OP0 and a constant OP1,
where CODE is the comparison code that will be tested, into a
```
[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090
Andrew Pinski changed:
What|Removed |Added
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot
gnu.org
--- Comment #6 from Andrew Pinski ---
(In reply to Andrew Pinski from comment #5)
> before:(and:SI (subreg:SI (unspec:QI [
> (mem:V8HI (reg/f:SI 98 [ pi128.0_1 ]) [0 *pi128.0_1+0 S16
> A128])
> (reg:V8HI 110 [ _16 ])
> (const_int 0 [0])
> ] UNSPEC_UNSIGNED_PCMP) 0)
> (const_int 255 [0xff]))
> after:(and:DI (clobber:DI (const_int 0 [0]))
> (const_int 255 [0xff]))
>
>
> clobber means we can't do it ...
Instead of rtl_hooks.gen_lowpart_no_emit returning NULL, combine's gen_lowpart
returns a clobber.
This fixes the issue at hand:
```
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index 7bcbe11370f..ea730c4d9aa 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -1716,7 +1716,7 @@ simplify_context::simplify_unary_operation_1 (rtx_code
code, machine_mode mode,
&& INTVAL (XEXP (op, 1)) > 0)
{
rtx tem = rtl_hooks.gen_lowpart_no_emit (mode, XEXP (op, 0));
- if (tem)
+ if (tem && GET_CODE (tem) != CLOBBER)
return simplify_gen_binary (AND, mode, tem, XEXP (op, 1));
}
```
But nothing in simplify-rtx.cc checks whether gen_lowpart_no_emit returned a
CLOBBER. Or should we wrap gen_lowpart_for_combine and return NULL when it is a
clobber ...
Still deciding.
[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090
--- Comment #5 from Andrew Pinski ---
before:(and:SI (subreg:SI (unspec:QI [
                (mem:V8HI (reg/f:SI 98 [ pi128.0_1 ]) [0 *pi128.0_1+0 S16 A128])
                (reg:V8HI 110 [ _16 ])
                (const_int 0 [0])
            ] UNSPEC_UNSIGNED_PCMP) 0)
    (const_int 255 [0xff]))
after:(and:DI (clobber:DI (const_int 0 [0]))
    (const_int 255 [0xff]))
clobber means we can't do it ...
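The "after" expression above has a CLOBBER as an operand of the AND rather than at the top level, so a test on the outermost code alone would not notice it. The following is a minimal sketch of that point using a hypothetical toy tree, not GCC's rtx or any GCC API: `Node`, `Code` and `contains_clobber` are names invented here, and GCC's own handling of such expressions is not modelled.

```cpp
#include <cassert>

// Toy stand-ins for rtx codes; these are NOT GCC's definitions.
enum class Code { Const, Clobber, And };

// Hypothetical miniature expression node with at most two operands.
struct Node {
  Code code;
  const Node *op0;
  const Node *op1;
  explicit Node(Code c, const Node *a = nullptr, const Node *b = nullptr)
      : code(c), op0(a), op1(b) {}
};

// Recursively report whether any node in the tree is a CLOBBER.
bool contains_clobber(const Node *x) {
  if (!x)
    return false;
  if (x->code == Code::Clobber)
    return true;
  return contains_clobber(x->op0) || contains_clobber(x->op1);
}

void demo_nested_clobber() {
  // Shape of: (and (clobber (const_int 0)) (const_int 255))
  Node zero(Code::Const), mask(Code::Const);
  Node clobber(Code::Clobber, &zero);
  Node after(Code::And, &clobber, &mask);

  assert(after.code != Code::Clobber);  // the outermost code looks fine
  assert(contains_clobber(&after));     // the failure marker is nested inside
}
```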
[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090
--- Comment #4 from Andrew Pinski ---
Created attachment 61321
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61321&action=edit
reduced test which show the kmov
This removes the other functions and leaves one which shows the issue.
[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090
--- Comment #3 from Andrew Pinski ---
Before, combine could do this:
```
Trying 15 -> 16:
15: {r121:SI=zero_extend(unspec[[r98:SI],r112:V8HI,0]
157);r115:QI=unspec[[r98:SI],r112:V8HI,0] 157;}
REG_DEAD r112:V8HI
REG_DEAD r98:SI
16: r120:DI=zero_extend(r121:SI)
REG_DEAD r121:SI
Successfully matched this instruction:
(parallel [
(set (reg:DI 120 [ _6 ])
(zero_extend:DI (unspec:QI [
(mem:V8HI (reg/f:SI 98 [ pi128.0_1 ]) [0 *pi128.0_1+0
S16 A128])
(reg:V8HI 112 [ _18 ])
(const_int 0 [0])
] UNSPEC_UNSIGNED_PCMP)))
(set (reg:QI 115)
(unspec:QI [
(mem:V8HI (reg/f:SI 98 [ pi128.0_1 ]) [0 *pi128.0_1+0 S16
A128])
(reg:V8HI 112 [ _18 ])
(const_int 0 [0])
] UNSPEC_UNSIGNED_PCMP))
])
allowing combination of insns 15 and 16
original costs 0 + 4 = 0
replacement cost 0
deferring deletion of insn with uid = 15.
deferring deletion of insn with uid = 15.
modifying insn i3 16: {r120:DI=zero_extend(unspec[[r98:SI],r112:V8HI,0]
157);r115:QI=unspec[[r98:SI],r112:V8HI,0] 157;}
REG_DEAD r98:SI
REG_DEAD r112:V8HI
deferring rescan insn with uid = 16.
```
But after the change, combine no longer does this, and the dump does not say why.
[Bug target/120090] [16 Regression] gcc.target/i386/avx512bw-pr103750-2.c since r16-160
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120090
Jakub Jelinek changed:
What|Removed |Added
Keywords|needs-bisection |
CC||jakub at gcc dot gnu.org,
||pinskia at gcc dot gnu.org
Summary|[16 Regression] |[16 Regression]
|gcc.target/i386/avx512bw-pr |gcc.target/i386/avx512bw-pr
|103750-2.c |103750-2.c since r16-160
--- Comment #2 from Jakub Jelinek ---
Started with r16-160-ge6f89d78c1a7528e93458278e35d365544a18c26
