Hi, Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634201.html BR, Kewen on 2023/10/25 10:45, Kewen.Lin wrote: > Hi, > > This is almost a repost for v2 which was posted at[1] in March > excepting for: > 1) rebased from r14-4810 which is relatively up-to-date, > some conflicts on "int to bool" return type change have > been resolved; > 2) adjust commit log a bit; > 3) fix misspelled "articial" with "artificial" somewhere; > > -- > *v2 comments*: > > By addressing Alexander's comments, against v1 this > patch v2 mainly: > > - Rename no_real_insns_p to no_real_nondebug_insns_p; > - Introduce enum rgn_bb_deps_free_action for three > kinds of actions to free deps; > - Change function free_deps_for_bb_no_real_insns_p to > resolve_forw_deps which only focuses on forward deps; > - Extend the handlings to cover dbg-cnt sched_block, > add one test case for it; > - Move free_trg_info call in schedule_region to an > appropriate place. > > One thing I'm not sure about is the change in function > sched_rgn_local_finish, currently the invocation to > sched_rgn_local_free is guarded with !sel_sched_p (), > so I just follow it, but the initialization of those > structures (in sched_rgn_local_init) isn't guarded > with !sel_sched_p (), it looks odd. > > -- > > As PR108273 shows, when there is one block which only has > NOTE_P and LABEL_P insns at non-debug mode while has some > extra DEBUG_INSN_P insns at debug mode, after scheduling > it, the DFA states would be different between debug mode > and non-debug mode. Since at non-debug mode, the block > meets no_real_insns_p, it gets skipped; while at debug > mode, it gets scheduled, even it only has NOTE_P, LABEL_P > and DEBUG_INSN_P, the call of function advance_one_cycle > will change the DFA state. PR108519 also shows this issue > can be exposed by some scheduler changes. > > This patch is to change function no_real_insns_p to > function no_real_nondebug_insns_p by taking debug insn into > account, which make us not try to schedule for the block > having only NOTE_P, LABEL_P and DEBUG_INSN_P insns, > resulting in consistent DFA states between non-debug and > debug mode. > > Changing no_real_insns_p to no_real_nondebug_insns_p caused > ICE when doing free_block_dependencies, the root cause is > that we create dependencies for debug insns, those > dependencies are expected to be resolved during scheduling > insns, but they get skipped after this change. > By checking the code, it looks it's reasonable to skip to > compute block dependences for no_real_nondebug_insns_p > blocks. There is also another issue, which gets exposed > in SPEC2017 bmks build at option -O2 -g, is that we could > skip to schedule some block, which already gets dependency > graph built so has dependencies computed and rgn_n_insns > accumulated, then the later verification on if the graph > becomes exhausted by scheduling would fail as follow: > > /* Sanity check: verify that all region insns were > scheduled. */ > gcc_assert (sched_rgn_n_insns == rgn_n_insns); > > , and also some forward deps aren't resovled. > > As Alexander pointed out, the current debug count handling > also suffers the similar issue, so this patch handles these > two cases together: one is for some block gets skipped by > !dbg_cnt (sched_block), the other is for some block which > is not no_real_nondebug_insns_p initially but becomes > no_real_nondebug_insns_p due to speculative scheduling. > > This patch can be bootstrapped and regress-tested on > x86_64-redhat-linux, aarch64-linux-gnu and > powerpc64{,le}-linux-gnu. > > I also verified this patch can pass SPEC2017 both intrate > and fprate bmks building at -g -O2/-O3. > > Any thoughts? Is it ok for trunk? > > [1] v2: https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614818.html > [2] v1: https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614224.html > > BR, > Kewen > ----- > PR rtl-optimization/108273 > > gcc/ChangeLog: > > * haifa-sched.cc (no_real_insns_p): Rename to ... > (no_real_nondebug_insns_p): ... this, and consider DEBUG_INSN_P insn. > * sched-ebb.cc (schedule_ebb): Replace no_real_insns_p with > no_real_nondebug_insns_p. > * sched-int.h (no_real_insns_p): Rename to ... > (no_real_nondebug_insns_p): ... this. > * sched-rgn.cc (enum rgn_bb_deps_free_action): New enum. > (bb_deps_free_actions): New static variable. > (compute_block_dependences): Skip for no_real_nondebug_insns_p. > (resolve_forw_deps): New function. > (free_block_dependencies): Check bb_deps_free_actions and call > function resolve_forw_deps for RGN_BB_DEPS_FREE_ARTIFICIAL. > (compute_priorities): Replace no_real_insns_p with > no_real_nondebug_insns_p. > (schedule_region): Replace no_real_insns_p with > no_real_nondebug_insns_p, set RGN_BB_DEPS_FREE_ARTIFICIAL if the block > get dependencies computed before but skipped now, fix up count > sched_rgn_n_insns for it too. Call free_trg_info when the block > gets scheduled, and move sched_rgn_local_finish after the loop > of free_block_dependencies loop. > (sched_rgn_local_init): Allocate and compute bb_deps_free_actions. > (sched_rgn_local_finish): Free bb_deps_free_actions. > * sel-sched.cc (sel_region_target_finish): Replace no_real_insns_p with > no_real_nondebug_insns_p. > > gcc/testsuite/ChangeLog: > > * gcc.target/powerpc/pr108273.c: New test. > --- > gcc/haifa-sched.cc | 9 +- > gcc/sched-ebb.cc | 2 +- > gcc/sched-int.h | 2 +- > gcc/sched-rgn.cc | 148 +++++++++++++++----- > gcc/sel-sched.cc | 3 +- > gcc/testsuite/gcc.target/powerpc/pr108273.c | 26 ++++ > 6 files changed, 150 insertions(+), 40 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108273.c > > diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc > index 8e8add709b3..30cc90ec49f 100644 > --- a/gcc/haifa-sched.cc > +++ b/gcc/haifa-sched.cc > @@ -5033,14 +5033,17 @@ get_ebb_head_tail (basic_block beg, basic_block end, > *tailp = end_tail; > } > > -/* Return true if there are no real insns in the range [ HEAD, TAIL ]. */ > +/* Return true if there are no real nondebug insns in the range > + [ HEAD, TAIL ]. */ > > bool > -no_real_insns_p (const rtx_insn *head, const rtx_insn *tail) > +no_real_nondebug_insns_p (const rtx_insn *head, const rtx_insn *tail) > { > while (head != NEXT_INSN (tail)) > { > - if (!NOTE_P (head) && !LABEL_P (head)) > + if (!NOTE_P (head) > + && !LABEL_P (head) > + && !DEBUG_INSN_P (head)) > return false; > head = NEXT_INSN (head); > } > diff --git a/gcc/sched-ebb.cc b/gcc/sched-ebb.cc > index 110fcdbca4d..03d96290a7c 100644 > --- a/gcc/sched-ebb.cc > +++ b/gcc/sched-ebb.cc > @@ -491,7 +491,7 @@ schedule_ebb (rtx_insn *head, rtx_insn *tail, bool > modulo_scheduling) > first_bb = BLOCK_FOR_INSN (head); > last_bb = BLOCK_FOR_INSN (tail); > > - if (no_real_insns_p (head, tail)) > + if (no_real_nondebug_insns_p (head, tail)) > return BLOCK_FOR_INSN (tail); > > gcc_assert (INSN_P (head) && INSN_P (tail)); > diff --git a/gcc/sched-int.h b/gcc/sched-int.h > index 64a2f0bcff9..adca494ade5 100644 > --- a/gcc/sched-int.h > +++ b/gcc/sched-int.h > @@ -1397,7 +1397,7 @@ extern void free_global_sched_pressure_data (void); > extern int haifa_classify_insn (const_rtx); > extern void get_ebb_head_tail (basic_block, basic_block, > rtx_insn **, rtx_insn **); > -extern bool no_real_insns_p (const rtx_insn *, const rtx_insn *); > +extern bool no_real_nondebug_insns_p (const rtx_insn *, const rtx_insn *); > > extern int insn_sched_cost (rtx_insn *); > extern int dep_cost_1 (dep_t, dw_t); > diff --git a/gcc/sched-rgn.cc b/gcc/sched-rgn.cc > index e5964f54ead..2549e834aa8 100644 > --- a/gcc/sched-rgn.cc > +++ b/gcc/sched-rgn.cc > @@ -213,6 +213,22 @@ static int rgn_nr_edges; > /* Array of size rgn_nr_edges. */ > static edge *rgn_edges; > > +/* Possible actions for dependencies freeing. */ > +enum rgn_bb_deps_free_action > +{ > + /* This block doesn't get dependencies computed so don't need to free. */ > + RGN_BB_DEPS_FREE_NO, > + /* This block gets scheduled normally so free dependencies as usual. */ > + RGN_BB_DEPS_FREE_NORMAL, > + /* This block gets skipped in scheduling but has dependencies computed > early, > + need to free the forward list artificially. */ > + RGN_BB_DEPS_FREE_ARTIFICIAL > +}; > + > +/* For basic block i, bb_deps_free_actions[i] indicates which action needs > + to be taken for freeing its dependencies. */ > +static enum rgn_bb_deps_free_action *bb_deps_free_actions; > + > /* Mapping from each edge in the graph to its number in the rgn. */ > #define EDGE_TO_BIT(edge) ((int)(size_t)(edge)->aux) > #define SET_EDGE_TO_BIT(edge,nr) ((edge)->aux = (void *)(size_t)(nr)) > @@ -2735,6 +2751,15 @@ compute_block_dependences (int bb) > gcc_assert (EBB_FIRST_BB (bb) == EBB_LAST_BB (bb)); > get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, &tail); > > + /* Don't compute block dependences if there are no real nondebug insns. */ > + if (no_real_nondebug_insns_p (head, tail)) > + { > + if (current_nr_blocks > 1) > + propagate_deps (bb, &tmp_deps); > + free_deps (&tmp_deps); > + return; > + } > + > sched_analyze (&tmp_deps, head, tail); > > add_branch_dependences (head, tail); > @@ -2749,6 +2774,24 @@ compute_block_dependences (int bb) > targetm.sched.dependencies_evaluation_hook (head, tail); > } > > +/* Artificially resolve forward dependencies for instructions HEAD to TAIL. > */ > + > +static void > +resolve_forw_deps (rtx_insn *head, rtx_insn *tail) > +{ > + rtx_insn *insn; > + rtx_insn *next_tail = NEXT_INSN (tail); > + sd_iterator_def sd_it; > + dep_t dep; > + > + /* There could be some insns which get skipped in scheduling but we compute > + dependencies for them previously, so make them resolved. */ > + for (insn = head; insn != next_tail; insn = NEXT_INSN (insn)) > + for (sd_it = sd_iterator_start (insn, SD_LIST_FORW); > + sd_iterator_cond (&sd_it, &dep);) > + sd_resolve_dep (sd_it); > +} > + > /* Free dependencies of instructions inside BB. */ > static void > free_block_dependencies (int bb) > @@ -2758,9 +2801,12 @@ free_block_dependencies (int bb) > > get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, &tail); > > - if (no_real_insns_p (head, tail)) > + if (bb_deps_free_actions[bb] == RGN_BB_DEPS_FREE_NO) > return; > > + if (bb_deps_free_actions[bb] == RGN_BB_DEPS_FREE_ARTIFICIAL) > + resolve_forw_deps (head, tail); > + > sched_free_deps (head, tail, true); > } > > @@ -3024,7 +3070,7 @@ compute_priorities (void) > gcc_assert (EBB_FIRST_BB (bb) == EBB_LAST_BB (bb)); > get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, &tail); > > - if (no_real_insns_p (head, tail)) > + if (no_real_nondebug_insns_p (head, tail)) > continue; > > rgn_n_insns += set_priorities (head, tail); > @@ -3158,7 +3204,7 @@ schedule_region (int rgn) > > get_ebb_head_tail (first_bb, last_bb, &head, &tail); > > - if (no_real_insns_p (head, tail)) > + if (no_real_nondebug_insns_p (head, tail)) > { > gcc_assert (first_bb == last_bb); > continue; > @@ -3178,44 +3224,62 @@ schedule_region (int rgn) > > get_ebb_head_tail (first_bb, last_bb, &head, &tail); > > - if (no_real_insns_p (head, tail)) > + if (no_real_nondebug_insns_p (head, tail)) > { > gcc_assert (first_bb == last_bb); > save_state_for_fallthru_edge (last_bb, bb_state[first_bb->index]); > - continue; > + > + if (bb_deps_free_actions[bb] == RGN_BB_DEPS_FREE_NO) > + continue; > + > + /* As it's not no_real_nondebug_insns_p initially, then it has some > + dependencies computed so free it artificially. */ > + bb_deps_free_actions[bb] = RGN_BB_DEPS_FREE_ARTIFICIAL; > } > + else > + { > + current_sched_info->prev_head = PREV_INSN (head); > + current_sched_info->next_tail = NEXT_INSN (tail); > > - current_sched_info->prev_head = PREV_INSN (head); > - current_sched_info->next_tail = NEXT_INSN (tail); > + remove_notes (head, tail); > > - remove_notes (head, tail); > + unlink_bb_notes (first_bb, last_bb); > > - unlink_bb_notes (first_bb, last_bb); > + target_bb = bb; > > - target_bb = bb; > + gcc_assert (flag_schedule_interblock || current_nr_blocks == 1); > + current_sched_info->queue_must_finish_empty = current_nr_blocks == 1; > > - gcc_assert (flag_schedule_interblock || current_nr_blocks == 1); > - current_sched_info->queue_must_finish_empty = current_nr_blocks == 1; > + curr_bb = first_bb; > + if (dbg_cnt (sched_block)) > + { > + int saved_last_basic_block = last_basic_block_for_fn (cfun); > > - curr_bb = first_bb; > - if (dbg_cnt (sched_block)) > - { > - int saved_last_basic_block = last_basic_block_for_fn (cfun); > + schedule_block (&curr_bb, bb_state[first_bb->index]); > + gcc_assert (EBB_FIRST_BB (bb) == first_bb); > + sched_rgn_n_insns += sched_n_insns; > + realloc_bb_state_array (saved_last_basic_block); > + save_state_for_fallthru_edge (last_bb, curr_state); > > - schedule_block (&curr_bb, bb_state[first_bb->index]); > - gcc_assert (EBB_FIRST_BB (bb) == first_bb); > - sched_rgn_n_insns += sched_n_insns; > - realloc_bb_state_array (saved_last_basic_block); > - save_state_for_fallthru_edge (last_bb, curr_state); > - } > - else > - { > - sched_rgn_n_insns += rgn_n_insns; > - } > + /* Clean up. */ > + if (current_nr_blocks > 1) > + free_trg_info (); > + } > + else > + bb_deps_free_actions[bb] = RGN_BB_DEPS_FREE_ARTIFICIAL; > + } > > - /* Clean up. */ > - if (current_nr_blocks > 1) > - free_trg_info (); > + /* We have counted this block when computing rgn_n_insns > + previously, so need to fix up sched_rgn_n_insns now. */ > + if (bb_deps_free_actions[bb] == RGN_BB_DEPS_FREE_ARTIFICIAL) > + { > + while (head != NEXT_INSN (tail)) > + { > + if (INSN_P (head)) > + sched_rgn_n_insns++; > + head = NEXT_INSN (head); > + } > + } > } > > /* Sanity check: verify that all region insns were scheduled. */ > @@ -3223,13 +3287,13 @@ schedule_region (int rgn) > > sched_finish_ready_list (); > > - /* Done with this region. */ > - sched_rgn_local_finish (); > - > /* Free dependencies. */ > for (bb = 0; bb < current_nr_blocks; ++bb) > free_block_dependencies (bb); > > + /* Done with this region. */ > + sched_rgn_local_finish (); > + > gcc_assert (haifa_recovery_bb_ever_added_p > || deps_pools_are_empty_p ()); > } > @@ -3450,6 +3514,19 @@ sched_rgn_local_init (int rgn) > e->aux = NULL; > } > } > + > + /* Initialize bb_deps_free_actions. */ > + bb_deps_free_actions > + = XNEWVEC (enum rgn_bb_deps_free_action, current_nr_blocks); > + for (bb = 0; bb < current_nr_blocks; bb++) > + { > + rtx_insn *head, *tail; > + get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, &tail); > + if (no_real_nondebug_insns_p (head, tail)) > + bb_deps_free_actions[bb] = RGN_BB_DEPS_FREE_NO; > + else > + bb_deps_free_actions[bb] = RGN_BB_DEPS_FREE_NORMAL; > + } > } > > /* Free data computed for the finished region. */ > @@ -3467,9 +3544,12 @@ sched_rgn_local_free (void) > void > sched_rgn_local_finish (void) > { > - if (current_nr_blocks > 1 && !sel_sched_p ()) > + if (!sel_sched_p ()) > { > - sched_rgn_local_free (); > + if (current_nr_blocks > 1) > + sched_rgn_local_free (); > + > + free (bb_deps_free_actions); > } > } > > diff --git a/gcc/sel-sched.cc b/gcc/sel-sched.cc > index 1925f4a9461..8310c892e13 100644 > --- a/gcc/sel-sched.cc > +++ b/gcc/sel-sched.cc > @@ -7213,7 +7213,8 @@ sel_region_target_finish (bool reset_sched_cycles_p) > > find_ebb_boundaries (EBB_FIRST_BB (i), scheduled_blocks); > > - if (no_real_insns_p (current_sched_info->head, > current_sched_info->tail)) > + if (no_real_nondebug_insns_p (current_sched_info->head, > + current_sched_info->tail)) > continue; > > if (reset_sched_cycles_p) > diff --git a/gcc/testsuite/gcc.target/powerpc/pr108273.c > b/gcc/testsuite/gcc.target/powerpc/pr108273.c > new file mode 100644 > index 00000000000..937224eaa69 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr108273.c > @@ -0,0 +1,26 @@ > +/* { dg-options "-O2 -fdbg-cnt=sched_block:1" } */ > +/* { dg-prune-output {\*\*\*dbgcnt:.*limit.*reached} } */ > + > +/* Verify there is no ICE. */ > + > +int a, b, c, e, f; > +float d; > + > +void > +g () > +{ > + float h, i[1]; > + for (; f;) > + if (c) > + { > + d *e; > + if (b) > + { > + float *j = i; > + j[0] += 0; > + } > + h += d; > + } > + if (h) > + a = i[0]; > +} > -- > 2.39.1