I've updated the patch to check it at ipa-inline:

Index: gcc/ipa-inline.c
===================================================================
--- gcc/ipa-inline.c (revision 199593)
+++ gcc/ipa-inline.c (working copy)
@@ -434,6 +434,16 @@ want_early_inline_function_p (struct cgraph_edge *

       if (growth <= PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_ANY))
  ;
+      else if (flag_auto_profile)
+ {
+  if (dump_file)
+    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
+     "call is cold in profiling and code would grow by %i\n",
+     xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
+     xstrdup (cgraph_node_name (callee)), callee->uid,
+     growth);
+    want_inline = false;
+ }
       else if (!cgraph_maybe_hot_edge_p (e))
  {
   if (dump_file)

Thanks,
Dehao

On Sun, Jun 2, 2013 at 9:08 PM, Xinliang David Li <davi...@google.com> wrote:
> If the purpose of the fix is to filter early inlinings with code
> growth in autoFDO, the proposed fix is the wrong way to do -- it
> changes the meaning of cgraph_maybe_hot_edge_p.
>
> David
>
> On Sun, Jun 2, 2013 at 7:25 PM, Dehao Chen <de...@google.com> wrote:
>> On Sun, Jun 2, 2013 at 7:14 PM, Xinliang David Li <davi...@google.com> wrote:
>>>
>>> auto profile info is not available yet in early inlining, why would
>>> this change make any difference?
>>
>> Because the check of PARAM_EARLY_INLINING_INSNS is after the check of
>> cgraph_maybe_hot_edge_p in early inline. If
>> cgraph_maybe_hot_edge_p fails, the early inline will not happen even
>> if growth is less than PARAM_EARLY_INLINING_INSNS.
>>
>>>
>>> Can you just reset the max_iters to a
>>> higher value for autoFDO?
>>
>> We could do that, but it could still lead to some code bloat because
>> recursive inlines can happen for at most, say 10, iterations.
>>
>> Dehao
>>
>>>
>>> David
>>>
>>> On Sun, Jun 2, 2013 at 6:21 PM, Dehao Chen <de...@google.com> wrote:
>>> > The patch was committed to google-4_8, but it causes problem because
>>> > einline sets PARAM_EARLY_INLINING_INSNS = 11. This will cause
>>> > recursive inlining at einline stage (e.g. main->foo, foo->bar,
>>> > bar->foo) when autofdo is enabled.
>>> >
>>> > The following patch can fix the problem by doing more targetted early 
>>> > inlining:
>>> >
>>> > Index: gcc/predict.c
>>> > ===================================================================
>>> > --- gcc/predict.c (revision 199593)
>>> > +++ gcc/predict.c (working copy)
>>> > @@ -175,6 +175,8 @@ cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
>>> >        && !maybe_hot_count_p (NULL,
>>> >                               edge->count))
>>> >      return false;
>>> > +  if (flag_auto_profile)
>>> > +    return false;
>>> >    if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
>>> >        || (edge->callee
>>> >    && edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED))
>>> >
>>> > Performance testing on-going...
>>> >
>>> > Dehao
>>> >
>>> > On Wed, May 29, 2013 at 3:44 PM, Dehao Chen <de...@google.com> wrote:
>>> >> OK, I'll commit the early inline part.
>>> >>
>>> >> Dehao
>>> >>
>>> >> On Wed, May 29, 2013 at 10:00 AM, Xinliang David Li <davi...@google.com> 
>>> >> wrote:
>>> >>> The early inlining part is ok. The tracer optimization should be
>>> >>> revisited -- we should have more fine grain control on it (for
>>> >>> instance, based on FDO summary -- but that should be common to
>>> >>> FDO/LIPO).
>>> >>>
>>> >>> David
>>> >>>
>>> >>> On Wed, May 29, 2013 at 9:39 AM, Dehao Chen <de...@google.com> wrote:
>>> >>>> In gcc4-8, the max einline iterations are restricted to 1. For
>>> >>>> AutoFDO, this is bad because early inline is not size restricted. This
>>> >>>> patch allows einline to do multiple iterations in AutoFDO. It also
>>> >>>> enables tracer optimization in AutoFDO.
>>> >>>>
>>> >>>> Bootstrapped and passed regression test.
>>> >>>>
>>> >>>> OK for googel-4_8?
>>> >>>>
>>> >>>> Thanks,
>>> >>>> Dehao
>>> >>>>
>>> >>>> Index: gcc/ipa-inline.c
>>> >>>> ===================================================================
>>> >>>> --- gcc/ipa-inline.c (revision 199416)
>>> >>>> +++ gcc/ipa-inline.c (working copy)
>>> >>>> @@ -2161,7 +2161,8 @@ early_inliner (void)
>>> >>>>      {
>>> >>>>        /* We iterate incremental inlining to get trivial cases of 
>>> >>>> indirect
>>> >>>>   inlining.  */
>>> >>>> -      while (iterations < PARAM_VALUE 
>>> >>>> (PARAM_EARLY_INLINER_MAX_ITERATIONS)
>>> >>>> +      while ((flag_auto_profile
>>> >>>> +      || iterations < PARAM_VALUE 
>>> >>>> (PARAM_EARLY_INLINER_MAX_ITERATIONS))
>>> >>>>       && early_inline_small_functions (node))
>>> >>>>   {
>>> >>>>    timevar_push (TV_INTEGRATION);
>>> >>>> Index: gcc/opts.c
>>> >>>> ===================================================================
>>> >>>> --- gcc/opts.c (revision 199416)
>>> >>>> +++ gcc/opts.c (working copy)
>>> >>>> @@ -1644,6 +1644,8 @@ common_handle_option (struct gcc_options *opts,
>>> >>>>   opts->x_flag_peel_loops = value;
>>> >>>>        if (!opts_set->x_flag_value_profile_transformations)
>>> >>>>   opts->x_flag_value_profile_transformations = value;
>>> >>>> +      if (!opts_set->x_flag_tracer)
>>> >>>> + opts->x_flag_tracer = value;
>>> >>>>        if (!opts_set->x_flag_inline_functions)
>>> >>>>   opts->x_flag_inline_functions = value;
>>> >>>>        if (!opts_set->x_flag_ipa_cp)

Reply via email to