Hello,

On Mon, Jan 05 2026, Jan Hubicka wrote:
>> 2025-07-08  Martin Jambor  <[email protected]>
>> 
>>      * params.opt (param_ipa_cp_sweeps): New.
>>      * ipa-cp.cc (max_number_sweeps): New.
>>      (get_max_overall_size): New parameter cur_sweep, use it and the total
>>      number of sweeps from the NODE to calculate the result too.
>>      (ipcp_propagate_stage): Get the maximum number of sweeps specified in
>>      the corresponding parameter of any possibly affected node.
>>      (good_cloning_opportunity_p): Add parameter cur_sweep, adjust the
>>      threshold according to it.
>>      (decide_about_value): New parameter cur_sweep, pass it to
>>      get_max_overall_size and to good_cloning_opportunity_p.
>>      (decide_whether_version_node): New parameter cur_sweep, pass it to
>>      decide_about_value and get_max_overall_size.  Make sure the node is
>>      not dead.
>>      (ipcp_decision_stage): Make multiple sweeps over the call-graph.
> OK except...
>> diff --git a/gcc/params.opt b/gcc/params.opt
>> index beaa61c48ab..3ff41c20f13 100644
>> --- a/gcc/params.opt
>> +++ b/gcc/params.opt
>> @@ -265,6 +265,10 @@ Percentage penalty the recursive functions will receive 
>> when they are evaluated
>>  Common Joined UInteger Var(param_ipa_cp_single_call_penalty) Init(15) 
>> IntegerRange(0, 100) Param Optimization
>>  Percentage penalty functions containing a single call to another function 
>> will receive when they are evaluated for cloning.
>>  
>> +-param=ipa-cp-sweeps=
>> +Common Joined UInteger Var(param_ipa_cp_sweeps) Init(3) IntegerRange(1, 
>> 100) Param Optimization
>> +The number of times the interprocedural constant propagation will traverse 
>> all functions to make cloning decisions.
>> +
> Shouldn't there also be a documentation update?

Thanks, you are of course right.

I have just pushed the following after rebasing, and re-testing, this
time also with make info and make pdf.

Martin


2025-07-08  Martin Jambor  <[email protected]>

        * params.opt (param_ipa_cp_sweeps): New.
        * doc/invoke.texi (ipa-cp-sweeps): New.
        * ipa-cp.cc (max_number_sweeps): New.
        (get_max_overall_size): New parameter cur_sweep, use it and the total
        number of sweeps from the NODE to calculate the result too.
        (ipcp_propagate_stage): Get the maximum number of sweeps specified in
        the corresponding parameter of any possibly affected node.
        (good_cloning_opportunity_p): Add parameter cur_sweep, adjust the
        threshold according to it.
        (decide_about_value): New parameter cur_sweep, pass it to
        get_max_overall_size and to good_cloning_opportunity_p.
        (decide_whether_version_node): New parameter cur_sweep, pass it to
        decide_about_value and get_max_overall_size.  Make sure the node is
        not dead.
        (ipcp_decision_stage): Make multiple sweeps over the call-graph.
---
 gcc/doc/invoke.texi |   4 ++
 gcc/ipa-cp.cc       | 106 +++++++++++++++++++++++++++++---------------
 gcc/params.opt      |   4 ++
 3 files changed, 78 insertions(+), 36 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f6c79b051d0..cf96d43ba11 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -17522,6 +17522,10 @@ are evaluated for cloning.
 Percentage penalty functions containing a single call to another
 function will receive when they are evaluated for cloning.
 
+@item ipa-cp-sweeps
+The number of times the interprocedural constant propagation will traverse
+all functions to make cloning decisions.
+
 @item ipa-max-agg-items
 IPA-CP is also capable to propagate a number of scalar values passed
 in an aggregate. @option{ipa-max-agg-items} controls the maximum
diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 6c2fdc8c05b..2d91c907596 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -151,6 +151,10 @@ object_allocator<ipcp_agg_lattice> ipcp_agg_lattice_pool
 
 static long overall_size, orig_overall_size;
 
+/* The maximum number of IPA-CP decision sweeps that any node requested in its
+   param.  */
+static int max_number_sweeps;
+
 /* Node name to unique clone suffix number map.  */
 static hash_map<const char *, unsigned> *clone_num_suffixes;
 
@@ -3376,12 +3380,14 @@ incorporate_penalties (cgraph_node *node, 
ipa_node_params *info,
 
 /* Return true if cloning NODE is a good idea, given the estimated TIME_BENEFIT
    and SIZE_COST and with the sum of frequencies of incoming edges to the
-   potential new clone in FREQUENCIES.  */
+   potential new clone in FREQUENCIES.  CUR_SWEEP is the number of the current
+   sweep of IPA-CP over the call-graph in the decision stage.  */
 
 static bool
 good_cloning_opportunity_p (struct cgraph_node *node, sreal time_benefit,
                            sreal freq_sum, profile_count count_sum,
-                           int size_cost, bool called_without_ipa_profile)
+                           int size_cost, bool called_without_ipa_profile,
+                           int cur_sweep)
 {
   gcc_assert (count_sum.ipa () == count_sum);
   if (count_sum.quality () == AFDO)
@@ -3396,7 +3402,9 @@ good_cloning_opportunity_p (struct cgraph_node *node, 
sreal time_benefit,
   gcc_assert (size_cost > 0);
 
   ipa_node_params *info = ipa_node_params_sum->get (node);
+  int num_sweeps = opt_for_fn (node->decl, param_ipa_cp_sweeps);
   int eval_threshold = opt_for_fn (node->decl, param_ipa_cp_eval_threshold);
+  eval_threshold = (eval_threshold * num_sweeps) / cur_sweep;
   /* If we know the execution IPA execution counts, we can estimate overall
      speedup of the program.  */
   if (count_sum.nonzero_p ())
@@ -3551,20 +3559,25 @@ perform_estimation_of_a_value (cgraph_node *node,
   val->local_size_cost = size;
 }
 
-/* Get the overall limit oof growth based on parameters extracted from growth.
-   it does not really make sense to mix functions with different overall growth
-   limits but it is possible and if it happens, we do not want to select one
-   limit at random.  */
+/* Get the overall limit of growth based on parameters extracted from growth,
+   and CUR_SWEEP, which is the number of the current sweep of IPA-CP over the
+   call-graph in the decision stage.  It does not really make sense to mix
+   functions with different overall growth limits or even number of sweeps but
+   it is possible and if it happens, we do not want to select one limit at
+   random, so get the limits from NODE.  */
 
 static long
-get_max_overall_size (cgraph_node *node)
+get_max_overall_size (cgraph_node *node, int cur_sweep)
 {
   long max_new_size = orig_overall_size;
   long large_unit = opt_for_fn (node->decl, param_ipa_cp_large_unit_insns);
   if (max_new_size < large_unit)
     max_new_size = large_unit;
+  int num_sweeps = opt_for_fn (node->decl, param_ipa_cp_sweeps);
+  gcc_assert (cur_sweep <= num_sweeps);
   int unit_growth = opt_for_fn (node->decl, param_ipa_cp_unit_growth);
-  max_new_size += max_new_size * unit_growth / 100 + 1;
+  max_new_size += ((max_new_size * unit_growth * cur_sweep)
+                  / num_sweeps) / 100 + 1;
   return max_new_size;
 }
 
@@ -4022,6 +4035,10 @@ ipcp_propagate_stage (class ipa_topo_info *topo)
        unsigned nlattices = ipa_get_param_count (info);
        info->lattices.safe_grow_cleared (nlattices, true);
        initialize_node_lattices (node);
+
+       int num_sweeps = opt_for_fn (node->decl, param_ipa_cp_sweeps);
+       if (max_number_sweeps < num_sweeps)
+         max_number_sweeps = num_sweeps;
       }
     ipa_size_summary *s = ipa_size_summaries->get (node);
     if (node->definition && !node->alias && s != NULL)
@@ -5783,13 +5800,14 @@ ipcp_val_agg_replacement_ok_p (vec<ipa_argagg_value, 
va_gc> *,
    parameter itself, otherwise it is stored at the given OFFSET of the
    parameter.  AVALS describes the other already known values.  SELF_GEN_CLONES
    is a vector which contains clones created for self-recursive calls with an
-   arithmetic pass-through jump function.  */
+   arithmetic pass-through jump function.  CUR_SWEEP is the number of the
+   current sweep of the call-graph during the decision stage.  */
 
 template <typename valtype>
 static bool
 decide_about_value (struct cgraph_node *node, int index, HOST_WIDE_INT offset,
                    ipcp_value<valtype> *val, ipa_auto_call_arg_values *avals,
-                   vec<cgraph_node *> *self_gen_clones)
+                   vec<cgraph_node *> *self_gen_clones, int cur_sweep)
 {
   int caller_count;
   sreal freq_sum;
@@ -5802,7 +5820,8 @@ decide_about_value (struct cgraph_node *node, int index, 
HOST_WIDE_INT offset,
       perhaps_add_new_callers (node, val);
       return false;
     }
-  else if (val->local_size_cost + overall_size > get_max_overall_size (node))
+  else if (val->local_size_cost + overall_size
+          > get_max_overall_size (node, cur_sweep))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file, "   Ignoring candidate value because "
@@ -5850,10 +5869,10 @@ decide_about_value (struct cgraph_node *node, int 
index, HOST_WIDE_INT offset,
   if (!good_cloning_opportunity_p (node, val->local_time_benefit,
                                   freq_sum, count_sum,
                                   val->local_size_cost,
-                                  called_without_ipa_profile)
+                                  called_without_ipa_profile, cur_sweep)
       && !good_cloning_opportunity_p (node, val->prop_time_benefit,
                                      freq_sum, count_sum, val->prop_size_cost,
-                                     called_without_ipa_profile))
+                                     called_without_ipa_profile, cur_sweep))
     return false;
 
   if (dump_file)
@@ -5914,16 +5933,18 @@ ipa_range_contains_p (const vrange &r, tree val)
   return r.contains_p (val);
 }
 
-/* Decide whether and what specialized clones of NODE should be created.  */
+/* Decide whether and what specialized clones of NODE should be created.
+   CUR_SWEEP is the number of the current sweep of the call-graph during the
+   decision stage.  */
 
 static bool
-decide_whether_version_node (struct cgraph_node *node)
+decide_whether_version_node (struct cgraph_node *node, int cur_sweep)
 {
   ipa_node_params *info = ipa_node_params_sum->get (node);
   int i, count = ipa_get_param_count (info);
   bool ret = false;
 
-  if (count == 0)
+  if (info->node_dead || count == 0)
     return false;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -5974,7 +5995,7 @@ decide_whether_version_node (struct cgraph_node *node)
                  continue;
                }
              ret |= decide_about_value (node, i, -1, val, &avals,
-                                        &self_gen_clones);
+                                        &self_gen_clones, cur_sweep);
            }
        }
 
@@ -5990,7 +6011,7 @@ decide_whether_version_node (struct cgraph_node *node)
                    || !aglat->is_single_const ()))
              for (val = aglat->values; val; val = val->next)
                ret |= decide_about_value (node, i, aglat->offset, val, &avals,
-                                          &self_gen_clones);
+                                          &self_gen_clones, cur_sweep);
        }
 
       if (!ctxlat->bottom
@@ -5999,7 +6020,7 @@ decide_whether_version_node (struct cgraph_node *node)
          ipcp_value<ipa_polymorphic_call_context> *val;
          for (val = ctxlat->values; val; val = val->next)
            ret |= decide_about_value (node, i, -1, val, &avals,
-                                      &self_gen_clones);
+                                      &self_gen_clones, cur_sweep);
        }
     }
 
@@ -6049,9 +6070,10 @@ decide_whether_version_node (struct cgraph_node *node)
        }
       else if (good_cloning_opportunity_p (node, time, stats.freq_sum,
                                           stats.count_sum, size,
-                                          stats.called_without_ipa_profile))
+                                          stats.called_without_ipa_profile,
+                                          cur_sweep))
        {
-         if (size + overall_size <= get_max_overall_size (node))
+         if (size + overall_size <= get_max_overall_size (node, cur_sweep))
            {
              if (!dbg_cnt (ipa_cp_values))
                return ret;
@@ -6277,26 +6299,38 @@ ipcp_decision_stage (class ipa_topo_info *topo)
   int i;
 
   if (dump_file)
-    fprintf (dump_file, "\nIPA decision stage:\n\n");
+    fprintf (dump_file, "\nIPA decision stage (%i sweeps):\n",
+            max_number_sweeps);
 
-  for (i = topo->nnodes - 1; i >= 0; i--)
+  for (int cur_sweep = 1; cur_sweep <= max_number_sweeps; cur_sweep++)
     {
-      struct cgraph_node *node = topo->order[i];
-      bool change = false, iterate = true;
+      if (dump_file && (dump_flags & TDF_DETAILS))
+       fprintf (dump_file, "\nIPA decision sweep number %i (out of %i):\n",
+                cur_sweep, max_number_sweeps);
 
-      while (iterate)
+      for (i = topo->nnodes - 1; i >= 0; i--)
        {
-         struct cgraph_node *v;
-         iterate = false;
-         for (v = node; v; v = ((struct ipa_dfs_info *) v->aux)->next_cycle)
-           if (v->has_gimple_body_p ()
-               && ipcp_versionable_function_p (v))
-             iterate |= decide_whether_version_node (v);
-
-         change |= iterate;
+         struct cgraph_node *node = topo->order[i];
+         bool change = false, iterate = true;
+
+         while (iterate)
+           {
+             struct cgraph_node *v;
+             iterate = false;
+             for (v = node;
+                  v;
+                  v = ((struct ipa_dfs_info *) v->aux)->next_cycle)
+               if (v->has_gimple_body_p ()
+                   && ipcp_versionable_function_p (v)
+                   && (cur_sweep
+                       <= opt_for_fn (node->decl, param_ipa_cp_sweeps)))
+                 iterate |= decide_whether_version_node (v, cur_sweep);
+
+             change |= iterate;
+           }
+         if (change)
+           identify_dead_nodes (node);
        }
-      if (change)
-       identify_dead_nodes (node);
     }
 
   /* Currently, the primary use of callback edges is constant propagation.
diff --git a/gcc/params.opt b/gcc/params.opt
index e2b3930d5a7..6248c73b89d 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -273,6 +273,10 @@ Percentage penalty the recursive functions will receive 
when they are evaluated
 Common Joined UInteger Var(param_ipa_cp_single_call_penalty) Init(15) 
IntegerRange(0, 100) Param Optimization
 Percentage penalty functions containing a single call to another function will 
receive when they are evaluated for cloning.
 
+-param=ipa-cp-sweeps=
+Common Joined UInteger Var(param_ipa_cp_sweeps) Init(3) IntegerRange(1, 100) 
Param Optimization
+The number of times the interprocedural constant propagation will traverse all 
functions to make cloning decisions.
+
 -param=ipa-cp-unit-growth=
 Common Joined UInteger Var(param_ipa_cp_unit_growth) Init(10) Param 
Optimization
 How much can given compilation unit grow because of the interprocedural 
constant propagation (in percent).
-- 
2.52.0

Reply via email to