Avinash Jayakar <[email protected]> writes:
> Hi,
>
> This is a patch proposed to fix the latent scheduler issue seen in PR123144,
> PR80357 and PR94014.
>
> I was not able to generate a test case for this specific scenario. All the
> test
> cases mentioned in this PR mostly compile fine due to changes in gimple
> passes.
> I tried to generate gimple of the optimized and compile using -fgimple, but
> there were many parsing issues. Any suggestions on how to test this robustly
> would be welcome.
>
> Bootstrapped and regtested on powerpc64le and x86_64 without any regressions.
> Also tested on gcc-14 which fixed PR113390. Ok for trunk and release-14?
>
> Thanks and regards,
> Avinash Jayakar
>
> A latent issue in haifa scheduler cause ICE sometimes due to
> inconsistent register pressure values. PR123144, PR80357 and PR94014
> showed how this issue caused ICE.
>
> The problem was in the model_recompute function. During region
> scheduling, the ready list may have instruction from multiple basic
> blocks, but the model_schedule always has insns within the target_bb.
> So when model_recompute is run for instruction that is not in the
> model_schedule, the register pressure values would go negative in some
> cases leading to assertion error in model_update_limit_points_in_group.
>
> This patch just adds a guard in model_recompute function to not do it
> when insn is not in model.
>
> gcc/ChangeLog:
> PR 123144
> PR 80357
> PR 94014
> * haifa-sched.cc (model_recompute): Return if insn not in
> model_schedule.
> ---
> gcc/haifa-sched.cc | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
> index 4daa4d7b18a..b641d7a21a2 100644
> --- a/gcc/haifa-sched.cc
> +++ b/gcc/haifa-sched.cc
> @@ -2130,6 +2130,14 @@ model_recompute (rtx_insn *insn)
> /* The destinations of INSN were previously live from POINT onwards, but
> are
> now live from model_curr_point onwards. Set up DELTA accordingly. */
> point = model_index (insn);
> +
> + /* In case if insn is not in the model, no need to recompute.
> + This may happen in region scheduling, where the model schedule only has
> insns
> + from the basic block, but scheduled insn may be in a different block in
> the
> + same region. Recompute in this case would result in inconsistent pressure
> + values. */
> + if (point == model_num_insns)
> + return;
> reg_pressure = INSN_REG_PRESSURE (insn);
> for (pci = 0; pci < ira_pressure_classes_num; pci++)
> {
Thanks for looking at this. I'm not sure that the patch is the right
approach though. IIRC, the idea is that, when insns are pulled from
other blocks, the pressures should be updated to account for any change
in the live registers. This can be seen with the first insn in the block
that triggers the ICE in PR 123144:
;; Ready list (t = 0): 303/b4:108(cost=0:prio=39:delay=1:idx=5)
301/b4:106(cost=0:prio=39:delay=1:idx=5) 129:92(cost=0:prio=13:idx=0)
308/b4:118(cost=0:prio=25:idx=5) 306/b4:115(cost=0:prio=29:idx=5)
304/b4:112(cost=0:prio=33:idx=5) 160/b4:110(cost=0:prio=35:idx=5)
// rs6000_variable_issue (more = 7) = 6
;; 0--> b 4: i 160 {r244=vec_select(r234#0,parallel);clobber
scratch;}:DU_any_power8,VSU_power8:GENERAL_REGS+1(1)ALTIVEC_REGS+0(0)VSX_REGS+0(0)CR_REGS+0(0)SPECIAL_REGS+0(0)
;; +------------------------------------------------------
;; | New pressure for model schedule
;; +------------------------------------------------------
;; | 4 133 pc={(r229==0)?L138:pc}
GENERAL_REGS:[19->20] ALTIVEC_REGS:[2->2] VSX_REGS:[0->0] CR_REGS:[0->0]
SPECIAL_REGS:[0->0]
;; | 3 132 r229=cmp(r228,0)
GENERAL_REGS:[19->20] ALTIVEC_REGS:[2->2] VSX_REGS:[0->0] CR_REGS:[0->0]
SPECIAL_REGS:[0->0]
;; | 2 131 r228=zxn([r227+0x60])
GENERAL_REGS:[19->20] ALTIVEC_REGS:[2->2] VSX_REGS:[0->0] CR_REGS:[0->0]
SPECIAL_REGS:[0->0]
;; | 1 130 r227=r161+r226
GENERAL_REGS:[19->20] ALTIVEC_REGS:[2->2] VSX_REGS:[0->0] CR_REGS:[0->0]
SPECIAL_REGS:[0->0]
;; | 0 129 r226=sxn(r132)
GENERAL_REGS:[19->20] ALTIVEC_REGS:[2->2] VSX_REGS:[0->0] CR_REGS:[0->0]
SPECIAL_REGS:[0->0]
;; +------------------------------------------------------
;; GENERAL_REGS:20(-7) ALTIVEC_REGS:2(-29) VSX_REGS:0(-61)
CR_REGS:0(-8) SPECIAL_REGS:0(-2)
That is, r244 (the destination of insn 160) is assumed to occupy
GENERAL_REGS. Since insn 160 came from a different block, there is an
increase in the GENERAL_REGS pressure estimates for each unscheduled
insn in the block. The loop:
reg_pressure = INSN_REG_PRESSURE (insn);
for (pci = 0; pci < ira_pressure_classes_num; pci++)
{
cl = ira_pressure_classes[pci];
delta[cl] = reg_pressure[pci].set_increase;
}
sets up this increase. If, later, the last use of r244 in the region
is scheduled, the following loop:
/* Record which registers previously died at POINT, but which now die
before POINT. Adjust DELTA so that it represents the effect of
this change after POINT - 1. Set NUM_PENDING_BIRTHS to the number of
registers that will be born in the range [model_curr_point, POINT). */
num_uses = 0;
num_pending_births = 0;
bitmap_clear (tmp_bitmap);
for (use = INSN_REG_USE_LIST (insn); use != NULL; use = use->next_insn_use)
{
new_last = model_last_use_except (use);
if (new_last < point && bitmap_set_bit (tmp_bitmap, use->regno))
{
gcc_assert (num_uses < ARRAY_SIZE (uses));
uses[num_uses].last_use = new_last;
uses[num_uses].regno = use->regno;
/* This register is no longer live after POINT - 1. */
mark_regno_birth_or_death (NULL, delta, use->regno, false);
num_uses++;
if (new_last >= 0)
num_pending_births++;
}
}
should bring about a corresponding decrease in pressure.
Similarly, if the region initially has uses of a register both in the
block being scheduled ("block B") and in other blocks, scheduling the
last use in other blocks would make the last use in block B a dying use.
This would likewise affect the model schedule, with the pressure
decreasing after the last use in block B.
Like you say in the covering note, something has gone wrong if the
pressures end up negative. I think the problem is that the initial
start-of-block pressures are based on the original live-in sets for
the basic block, with no updates for insns that were moved by previous
schedules in the same region. model_recompute can therefore count
a dying use for something that wasn't assumed to be live.
The patch below tries to work around that. Bootstrapped &
regression-tested on powerpc64le-linux-gnu (gcc120). OK to install?
Although the testcase no longer triggers an ICE, I think it's still
worth including for defensive reasons.
Thanks,
Richard
>From 72182d9561875ec12294adc67e0fa56e019e0eae Mon Sep 17 00:00:00 2001
From: Richard Sandiford <[email protected]>
Date: Thu, 22 Jan 2026 18:41:14 +0000
Subject: [PATCH] sched: Make model scheduler more robust against stale live-in
sets
As the comment in the patch says, previous inter-block insn movement
can mean that the current block's live-in becomes stale. This is
somewhat undesirable, since it'll make estimates less conservative
than intended. However, a fully accurate update would be too expensive
for something that is only supposed to be a heuristic.
gcc/
PR rtl-optimization/80357
PR rtl-optimization/94014
PR rtl-optimization/123144
* haifa-sched.cc (model_recompute): Ignore dying uses of registers
that are not assumed to be live.
gcc/testsuite/
PR rtl-optimization/123144
* gcc.dg/torture/pr123144.c: New file.
---
gcc/haifa-sched.cc | 8 ++++-
gcc/testsuite/gcc.dg/torture/pr123144.c | 41 +++++++++++++++++++++++++
2 files changed, 48 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.dg/torture/pr123144.c
diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
index 4daa4d7b18a..58cda3af839 100644
--- a/gcc/haifa-sched.cc
+++ b/gcc/haifa-sched.cc
@@ -2147,7 +2147,13 @@ model_recompute (rtx_insn *insn)
for (use = INSN_REG_USE_LIST (insn); use != NULL; use = use->next_insn_use)
{
new_last = model_last_use_except (use);
- if (new_last < point && bitmap_set_bit (tmp_bitmap, use->regno))
+ if (new_last < point
+ && bitmap_set_bit (tmp_bitmap, use->regno)
+ /* df_get_live_in has not necessarily been updated to reflect the
+ effect of inter-block movement performed by earlier schedules.
+ Cope with stale live-in sets by ignoring registers that are not
+ currently assumed to be live. */
+ && bitmap_bit_p (curr_reg_live, use->regno))
{
gcc_assert (num_uses < ARRAY_SIZE (uses));
uses[num_uses].last_use = new_last;
diff --git a/gcc/testsuite/gcc.dg/torture/pr123144.c b/gcc/testsuite/gcc.dg/torture/pr123144.c
new file mode 100644
index 00000000000..a91df549dc5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr123144.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-w" } */
+/* { dg-additional-options "-mcpu=power8" { target power64*-*-* } } */
+
+#include <stdint.h>
+#define BS_VEC(type, num) type __attribute__((vector_size(num * sizeof(type))))
+int8_t backsmith_snippet_893(int8_t BS_ARG_1, BS_VEC(uint8_t, 32) BS_ARG_5)
+{
+ BS_VEC(uint16_t, 8) BS_VAR_0;
+ for (;;)
+ for (uint64_t BS_INC_1 = 0; BS_INC_1 < 13; BS_INC_1 += 1)
+ {
+ if (BS_ARG_5[2])
+ for (;;)
+ __builtin_convertvector((BS_VEC(int8_t, 16)){},
+ BS_VEC(uint8_t, 16));
+ uint8_t BS_TEMP_59 = __builtin_convertvector(
+ __builtin_shufflevector(
+ __builtin_convertvector((BS_VEC(int8_t, 4)){},
+ BS_VEC(uint16_t, 4)),
+ __builtin_convertvector((BS_VEC(int8_t, 4)){ BS_ARG_1 },
+ BS_VEC(uint16_t, 4)),
+ 4, 2, 2, 2, 6, 3, 0, 0, 3, 2, 1, 2, 4, 0, 1, 4, 2, 0, 3, 6,
+ 4, 3, 1, 0, 2, 5, 3, 7, 4, 2, 4, 2),
+ BS_VEC(uint8_t, 32))[(BS_INC_1 ? BS_VAR_0[3] : 2) ?: 2];
+ uint8_t BS_TEMP_60 = BS_TEMP_59;
+ for (; BS_TEMP_60;)
+ ;
+ BS_VEC(uint32_t, 8)
+ BS_TEMP_68 = __builtin_convertvector(
+ __builtin_convertvector(
+ (BS_VEC(uint64_t, 8)){ BS_INC_1, BS_INC_1, BS_INC_1,
+ BS_INC_1, BS_INC_1, BS_INC_1,
+ BS_INC_1, BS_INC_1 },
+ BS_VEC(uint16_t, 8)),
+ BS_VEC(uint32_t, 8));
+ BS_VAR_0[BS_INC_1 < 8 ? BS_INC_1 : 0] = BS_TEMP_68[0]
+ * BS_TEMP_68[1] * BS_TEMP_68[2] * BS_TEMP_68[3] * BS_TEMP_68[4]
+ * 6 * BS_TEMP_68[7];
+ }
+}
--
2.52.0