Hi,

I'd like to gentle ping this patch:

https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614818.html

BR,
Kewen

>> on 2023/3/29 15:18, Kewen.Lin via Gcc-patches wrote:
>>> Hi,
>>>
>>> By addressing Alexander's comments, against v1 this
>>> patch v2 mainly:
>>>
>>>   - Rename no_real_insns_p to no_real_nondebug_insns_p;
>>>   - Introduce enum rgn_bb_deps_free_action for three
>>>     kinds of actions to free deps;
>>>   - Change function free_deps_for_bb_no_real_insns_p to
>>>     resolve_forw_deps which only focuses on forward deps;
>>>   - Extend the handlings to cover dbg-cnt sched_block,
>>>     add one test case for it;
>>>   - Move free_trg_info call in schedule_region to an
>>>     appropriate place.
>>>
>>> One thing I'm not sure about is the change in function
>>> sched_rgn_local_finish, currently the invocation to
>>> sched_rgn_local_free is guarded with !sel_sched_p (),
>>> so I just follow it, but the initialization of those
>>> structures (in sched_rgn_local_init) isn't guarded
>>> with !sel_sched_p (), it looks odd.
>>>
>>> ----
>>>
>>> As PR108273 shows, when there is one block which only has
>>> NOTE_P and LABEL_P insns at non-debug mode while has some
>>> extra DEBUG_INSN_P insns at debug mode, after scheduling
>>> it, the DFA states would be different between debug mode
>>> and non-debug mode.  Since at non-debug mode, the block
>>> meets no_real_insns_p, it gets skipped; while at debug
>>> mode, it gets scheduled, even it only has NOTE_P, LABEL_P
>>> and DEBUG_INSN_P, the call of function advance_one_cycle
>>> will change the DFA state.  PR108519 also shows this issue
>>> issue can be exposed by some scheduler changes.
>>>
>>> This patch is to change function no_real_insns_p to
>>> function no_real_nondebug_insns_p by taking debug insn into
>>> account, which make us not try to schedule for the block
>>> having only NOTE_P, LABEL_P and DEBUG_INSN_P insns,
>>> resulting in consistent DFA states between non-debug and
>>> debug mode.
>>>
>>> Changing no_real_insns_p to no_real_nondebug_insns_p caused
>>> ICE when doing free_block_dependencies, the root cause is
>>> that we create dependencies for debug insns, those
>>> dependencies are expected to be resolved during scheduling
>>> insns, but which gets skipped after this change.
>>> By checking the code, it looks it's reasonable to skip to
>>> compute block dependences for no_real_nondebug_insns_p
>>> blocks.  There is also another issue, which gets exposed
>>> in SPEC2017 bmks build at option -O2 -g, is that we could
>>> skip to schedule some block, which already gets dependency
>>> graph built so has dependencies computed and rgn_n_insns
>>> accumulated, then the later verification on if the graph
>>> becomes exhausted by scheduling would fail as follow:
>>>
>>>   /* Sanity check: verify that all region insns were
>>>      scheduled.  */
>>>     gcc_assert (sched_rgn_n_insns == rgn_n_insns);
>>>
>>> , and also some forward deps aren't resovled.
>>>
>>> As Alexander pointed out, the current debug count handling
>>> also suffers the similar issue, so this patch handles these
>>> two cases together: one is for some block gets skipped by
>>> !dbg_cnt (sched_block), the other is for some block which
>>> is not no_real_nondebug_insns_p initially but becomes
>>> no_real_nondebug_insns_p due to speculative scheduling.
>>>
>>> This patch can be bootstrapped and regress-tested on
>>> x86_64-redhat-linux, aarch64-linux-gnu and
>>> powerpc64{,le}-linux-gnu.
>>>
>>> I also verified this patch can pass SPEC2017 both intrate
>>> and fprate bmks building at -g -O2/-O3.
>>>
>>> Any thoughts?
>>>
>>> BR,
>>> Kewen
>>> ----
>>>     PR rtl-optimization/108273
>>>
>>> gcc/ChangeLog:
>>>
>>>     * haifa-sched.cc (no_real_insns_p): Rename to ...
>>>     (no_real_nondebug_insns_p): ... this, and consider DEBUG_INSN_P insn.
>>>     * sched-ebb.cc (schedule_ebb): Replace no_real_insns_p with
>>>     no_real_nondebug_insns_p.
>>>     * sched-int.h (no_real_insns_p): Rename to ...
>>>     (no_real_nondebug_insns_p): ... this.
>>>     * sched-rgn.cc (enum rgn_bb_deps_free_action): New enum.
>>>     (bb_deps_free_actions): New static variable.
>>>     (compute_block_dependences): Skip for no_real_nondebug_insns_p.
>>>     (resolve_forw_deps): New function.
>>>     (free_block_dependencies): Check bb_deps_free_actions and call
>>>     function resolve_forw_deps for RGN_BB_DEPS_FREE_ARTICIAL.
>>>     (compute_priorities): Replace no_real_insns_p with
>>>     no_real_nondebug_insns_p.
>>>     (schedule_region): Replace no_real_insns_p with
>>>     no_real_nondebug_insns_p, set RGN_BB_DEPS_FREE_ARTICIAL if the block
>>>     get dependencies computed before but skipped now, fix up count
>>>     sched_rgn_n_insns for it too.  Call free_trg_info when the block
>>>     gets scheduled, and move sched_rgn_local_finish after the loop
>>>     of free_block_dependencies loop.
>>>     (sched_rgn_local_init): Allocate and compute bb_deps_free_actions.
>>>     (sched_rgn_local_finish): Free bb_deps_free_actions.
>>>     * sel-sched.cc (sel_region_target_finish): Replace no_real_insns_p with
>>>     no_real_nondebug_insns_p.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>     * gcc.target/powerpc/pr108273.c: New test.
>>> ---
>>>  gcc/haifa-sched.cc                          |   9 +-
>>>  gcc/sched-ebb.cc                            |   2 +-
>>>  gcc/sched-int.h                             |   2 +-
>>>  gcc/sched-rgn.cc                            | 148 +++++++++++++++-----
>>>  gcc/sel-sched.cc                            |   3 +-
>>>  gcc/testsuite/gcc.target/powerpc/pr108273.c |  26 ++++
>>>  6 files changed, 150 insertions(+), 40 deletions(-)
>>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108273.c
>>>
>>> diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
>>> index 48b53776fa9..371de486eb0 100644
>>> --- a/gcc/haifa-sched.cc
>>> +++ b/gcc/haifa-sched.cc
>>> @@ -5033,14 +5033,17 @@ get_ebb_head_tail (basic_block beg, basic_block end,
>>>    *tailp = end_tail;
>>>  }
>>>
>>> -/* Return nonzero if there are no real insns in the range [ HEAD, TAIL ].  
>>> */
>>> +/* Return nonzero if there are no real nondebug insns in the range
>>> +   [ HEAD, TAIL ].  */
>>>
>>>  int
>>> -no_real_insns_p (const rtx_insn *head, const rtx_insn *tail)
>>> +no_real_nondebug_insns_p (const rtx_insn *head, const rtx_insn *tail)
>>>  {
>>>    while (head != NEXT_INSN (tail))
>>>      {
>>> -      if (!NOTE_P (head) && !LABEL_P (head))
>>> +      if (!NOTE_P (head)
>>> +     && !LABEL_P (head)
>>> +     && !DEBUG_INSN_P (head))
>>>     return 0;
>>>        head = NEXT_INSN (head);
>>>      }
>>> diff --git a/gcc/sched-ebb.cc b/gcc/sched-ebb.cc
>>> index 3eb6a24f187..e3bb0d57159 100644
>>> --- a/gcc/sched-ebb.cc
>>> +++ b/gcc/sched-ebb.cc
>>> @@ -491,7 +491,7 @@ schedule_ebb (rtx_insn *head, rtx_insn *tail, bool 
>>> modulo_scheduling)
>>>    first_bb = BLOCK_FOR_INSN (head);
>>>    last_bb = BLOCK_FOR_INSN (tail);
>>>
>>> -  if (no_real_insns_p (head, tail))
>>> +  if (no_real_nondebug_insns_p (head, tail))
>>>      return BLOCK_FOR_INSN (tail);
>>>
>>>    gcc_assert (INSN_P (head) && INSN_P (tail));
>>> diff --git a/gcc/sched-int.h b/gcc/sched-int.h
>>> index 97b7d2d319b..eb4e8acec9f 100644
>>> --- a/gcc/sched-int.h
>>> +++ b/gcc/sched-int.h
>>> @@ -1397,7 +1397,7 @@ extern void free_global_sched_pressure_data (void);
>>>  extern int haifa_classify_insn (const_rtx);
>>>  extern void get_ebb_head_tail (basic_block, basic_block,
>>>                            rtx_insn **, rtx_insn **);
>>> -extern int no_real_insns_p (const rtx_insn *, const rtx_insn *);
>>> +extern int no_real_nondebug_insns_p (const rtx_insn *, const rtx_insn *);
>>>
>>>  extern int insn_sched_cost (rtx_insn *);
>>>  extern int dep_cost_1 (dep_t, dw_t);
>>> diff --git a/gcc/sched-rgn.cc b/gcc/sched-rgn.cc
>>> index f2751f62450..3508a26aeb6 100644
>>> --- a/gcc/sched-rgn.cc
>>> +++ b/gcc/sched-rgn.cc
>>> @@ -213,6 +213,22 @@ static int rgn_nr_edges;
>>>  /* Array of size rgn_nr_edges.  */
>>>  static edge *rgn_edges;
>>>
>>> +/* Possible actions for dependencies freeing.  */
>>> +enum rgn_bb_deps_free_action
>>> +{
>>> +  /* This block doesn't get dependencies computed so don't need to free.  
>>> */
>>> +  RGN_BB_DEPS_FREE_NO,
>>> +  /* This block gets scheduled normally so free dependencies as usual.  */
>>> +  RGN_BB_DEPS_FREE_NORMAL,
>>> +  /* This block gets skipped in scheduling but has dependencies computed 
>>> early,
>>> +     need to free the forward list articially.  */
>>> +  RGN_BB_DEPS_FREE_ARTICIAL
>>> +};
>>> +
>>> +/* For basic block i, bb_deps_free_actions[i] indicates which action needs
>>> +   to be taken for freeing its dependencies.  */
>>> +static enum rgn_bb_deps_free_action *bb_deps_free_actions;
>>> +
>>>  /* Mapping from each edge in the graph to its number in the rgn.  */
>>>  #define EDGE_TO_BIT(edge) ((int)(size_t)(edge)->aux)
>>>  #define SET_EDGE_TO_BIT(edge,nr) ((edge)->aux = (void *)(size_t)(nr))
>>> @@ -2730,6 +2746,15 @@ compute_block_dependences (int bb)
>>>    gcc_assert (EBB_FIRST_BB (bb) == EBB_LAST_BB (bb));
>>>    get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, &tail);
>>>
>>> +  /* Don't compute block dependencies if there are no real nondebug insns. 
>>>  */
>>> +  if (no_real_nondebug_insns_p (head, tail))
>>> +    {
>>> +      if (current_nr_blocks > 1)
>>> +   propagate_deps (bb, &tmp_deps);
>>> +      free_deps (&tmp_deps);
>>> +      return;
>>> +    }
>>> +
>>>    sched_analyze (&tmp_deps, head, tail);
>>>
>>>    add_branch_dependences (head, tail);
>>> @@ -2744,6 +2769,24 @@ compute_block_dependences (int bb)
>>>      targetm.sched.dependencies_evaluation_hook (head, tail);
>>>  }
>>>
>>> +/* Artificially resolve forward dependencies for instructions HEAD to 
>>> TAIL.  */
>>> +
>>> +static void
>>> +resolve_forw_deps (rtx_insn *head, rtx_insn *tail)
>>> +{
>>> +  rtx_insn *insn;
>>> +  rtx_insn *next_tail = NEXT_INSN (tail);
>>> +  sd_iterator_def sd_it;
>>> +  dep_t dep;
>>> +
>>> +  /* There could be some insns which get skipped in scheduling but we 
>>> compute
>>> +     dependencies for them previously, so make them resolved.  */
>>> +  for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
>>> +    for (sd_it = sd_iterator_start (insn, SD_LIST_FORW);
>>> +    sd_iterator_cond (&sd_it, &dep);)
>>> +      sd_resolve_dep (sd_it);
>>> +}
>>> +
>>>  /* Free dependencies of instructions inside BB.  */
>>>  static void
>>>  free_block_dependencies (int bb)
>>> @@ -2753,9 +2796,12 @@ free_block_dependencies (int bb)
>>>
>>>    get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, &tail);
>>>
>>> -  if (no_real_insns_p (head, tail))
>>> +  if (bb_deps_free_actions[bb] == RGN_BB_DEPS_FREE_NO)
>>>      return;
>>>
>>> +  if (bb_deps_free_actions[bb] == RGN_BB_DEPS_FREE_ARTICIAL)
>>> +    resolve_forw_deps (head, tail);
>>> +
>>>    sched_free_deps (head, tail, true);
>>>  }
>>>
>>> @@ -3019,7 +3065,7 @@ compute_priorities (void)
>>>        gcc_assert (EBB_FIRST_BB (bb) == EBB_LAST_BB (bb));
>>>        get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, 
>>> &tail);
>>>
>>> -      if (no_real_insns_p (head, tail))
>>> +      if (no_real_nondebug_insns_p (head, tail))
>>>     continue;
>>>
>>>        rgn_n_insns += set_priorities (head, tail);
>>> @@ -3153,7 +3199,7 @@ schedule_region (int rgn)
>>>
>>>       get_ebb_head_tail (first_bb, last_bb, &head, &tail);
>>>
>>> -     if (no_real_insns_p (head, tail))
>>> +     if (no_real_nondebug_insns_p (head, tail))
>>>         {
>>>           gcc_assert (first_bb == last_bb);
>>>           continue;
>>> @@ -3173,44 +3219,62 @@ schedule_region (int rgn)
>>>
>>>        get_ebb_head_tail (first_bb, last_bb, &head, &tail);
>>>
>>> -      if (no_real_insns_p (head, tail))
>>> +      if (no_real_nondebug_insns_p (head, tail))
>>>     {
>>>       gcc_assert (first_bb == last_bb);
>>>       save_state_for_fallthru_edge (last_bb, bb_state[first_bb->index]);
>>> -     continue;
>>> +
>>> +     if (bb_deps_free_actions[bb] == RGN_BB_DEPS_FREE_NO)
>>> +       continue;
>>> +
>>> +     /* As it's not no_real_nondebug_insns_p initially, then it has some
>>> +        dependencies computed so free it articially.  */
>>> +     bb_deps_free_actions[bb] = RGN_BB_DEPS_FREE_ARTICIAL;
>>>     }
>>> +      else
>>> +   {
>>> +     current_sched_info->prev_head = PREV_INSN (head);
>>> +     current_sched_info->next_tail = NEXT_INSN (tail);
>>>
>>> -      current_sched_info->prev_head = PREV_INSN (head);
>>> -      current_sched_info->next_tail = NEXT_INSN (tail);
>>> +     remove_notes (head, tail);
>>>
>>> -      remove_notes (head, tail);
>>> +     unlink_bb_notes (first_bb, last_bb);
>>>
>>> -      unlink_bb_notes (first_bb, last_bb);
>>> +     target_bb = bb;
>>>
>>> -      target_bb = bb;
>>> +     gcc_assert (flag_schedule_interblock || current_nr_blocks == 1);
>>> +     current_sched_info->queue_must_finish_empty = current_nr_blocks == 1;
>>>
>>> -      gcc_assert (flag_schedule_interblock || current_nr_blocks == 1);
>>> -      current_sched_info->queue_must_finish_empty = current_nr_blocks == 1;
>>> +     curr_bb = first_bb;
>>> +     if (dbg_cnt (sched_block))
>>> +       {
>>> +         int saved_last_basic_block = last_basic_block_for_fn (cfun);
>>>
>>> -      curr_bb = first_bb;
>>> -      if (dbg_cnt (sched_block))
>>> -        {
>>> -     int saved_last_basic_block = last_basic_block_for_fn (cfun);
>>> +         schedule_block (&curr_bb, bb_state[first_bb->index]);
>>> +         gcc_assert (EBB_FIRST_BB (bb) == first_bb);
>>> +         sched_rgn_n_insns += sched_n_insns;
>>> +         realloc_bb_state_array (saved_last_basic_block);
>>> +         save_state_for_fallthru_edge (last_bb, curr_state);
>>>
>>> -     schedule_block (&curr_bb, bb_state[first_bb->index]);
>>> -     gcc_assert (EBB_FIRST_BB (bb) == first_bb);
>>> -     sched_rgn_n_insns += sched_n_insns;
>>> -     realloc_bb_state_array (saved_last_basic_block);
>>> -     save_state_for_fallthru_edge (last_bb, curr_state);
>>> -        }
>>> -      else
>>> -        {
>>> -          sched_rgn_n_insns += rgn_n_insns;
>>> -        }
>>> +         /* Clean up.  */
>>> +         if (current_nr_blocks > 1)
>>> +           free_trg_info ();
>>> +       }
>>> +     else
>>> +       bb_deps_free_actions[bb] = RGN_BB_DEPS_FREE_ARTICIAL;
>>> +   }
>>>
>>> -      /* Clean up.  */
>>> -      if (current_nr_blocks > 1)
>>> -   free_trg_info ();
>>> +      /* We have counted this block when computing rgn_n_insns
>>> +    previously, so need to fix up sched_rgn_n_insns now.  */
>>> +      if (bb_deps_free_actions[bb] == RGN_BB_DEPS_FREE_ARTICIAL)
>>> +   {
>>> +     while (head != NEXT_INSN (tail))
>>> +       {
>>> +         if (INSN_P (head))
>>> +           sched_rgn_n_insns++;
>>> +         head = NEXT_INSN (head);
>>> +       }
>>> +   }
>>>      }
>>>
>>>    /* Sanity check: verify that all region insns were scheduled.  */
>>> @@ -3218,13 +3282,13 @@ schedule_region (int rgn)
>>>
>>>    sched_finish_ready_list ();
>>>
>>> -  /* Done with this region.  */
>>> -  sched_rgn_local_finish ();
>>> -
>>>    /* Free dependencies.  */
>>>    for (bb = 0; bb < current_nr_blocks; ++bb)
>>>      free_block_dependencies (bb);
>>>
>>> +  /* Done with this region.  */
>>> +  sched_rgn_local_finish ();
>>> +
>>>    gcc_assert (haifa_recovery_bb_ever_added_p
>>>           || deps_pools_are_empty_p ());
>>>  }
>>> @@ -3445,6 +3509,19 @@ sched_rgn_local_init (int rgn)
>>>         e->aux = NULL;
>>>          }
>>>      }
>>> +
>>> +  /* Initialize bb_deps_free_actions.  */
>>> +  bb_deps_free_actions
>>> +    = XNEWVEC (enum rgn_bb_deps_free_action, current_nr_blocks);
>>> +  for (bb = 0; bb < current_nr_blocks; bb++)
>>> +    {
>>> +      rtx_insn *head, *tail;
>>> +      get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, 
>>> &tail);
>>> +      if (no_real_nondebug_insns_p (head, tail))
>>> +   bb_deps_free_actions[bb] = RGN_BB_DEPS_FREE_NO;
>>> +      else
>>> +   bb_deps_free_actions[bb] = RGN_BB_DEPS_FREE_NORMAL;
>>> +    }
>>>  }
>>>
>>>  /* Free data computed for the finished region.  */
>>> @@ -3462,9 +3539,12 @@ sched_rgn_local_free (void)
>>>  void
>>>  sched_rgn_local_finish (void)
>>>  {
>>> -  if (current_nr_blocks > 1 && !sel_sched_p ())
>>> +  if (!sel_sched_p ())
>>>      {
>>> -      sched_rgn_local_free ();
>>> +      if (current_nr_blocks > 1)
>>> +   sched_rgn_local_free ();
>>> +
>>> +      free (bb_deps_free_actions);
>>>      }
>>>  }
>>>
>>> diff --git a/gcc/sel-sched.cc b/gcc/sel-sched.cc
>>> index cb1cf75bfe4..04415590455 100644
>>> --- a/gcc/sel-sched.cc
>>> +++ b/gcc/sel-sched.cc
>>> @@ -7213,7 +7213,8 @@ sel_region_target_finish (bool reset_sched_cycles_p)
>>>
>>>        find_ebb_boundaries (EBB_FIRST_BB (i), scheduled_blocks);
>>>
>>> -      if (no_real_insns_p (current_sched_info->head, 
>>> current_sched_info->tail))
>>> +      if (no_real_nondebug_insns_p (current_sched_info->head,
>>> +                               current_sched_info->tail))
>>>     continue;
>>>
>>>        if (reset_sched_cycles_p)
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr108273.c 
>>> b/gcc/testsuite/gcc.target/powerpc/pr108273.c
>>> new file mode 100644
>>> index 00000000000..937224eaa69
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr108273.c
>>> @@ -0,0 +1,26 @@
>>> +/* { dg-options "-O2 -fdbg-cnt=sched_block:1" } */
>>> +/* { dg-prune-output {\*\*\*dbgcnt:.*limit.*reached} } */
>>> +
>>> +/* Verify there is no ICE.  */
>>> +
>>> +int a, b, c, e, f;
>>> +float d;
>>> +
>>> +void
>>> +g ()
>>> +{
>>> +  float h, i[1];
>>> +  for (; f;)
>>> +    if (c)
>>> +      {
>>> +   d *e;
>>> +   if (b)
>>> +     {
>>> +       float *j = i;
>>> +       j[0] += 0;
>>> +     }
>>> +   h += d;
>>> +      }
>>> +  if (h)
>>> +    a = i[0];
>>> +}
>>> --
>>> 2.25.1
>>


BR,
Kewen

Reply via email to