Ok for now as a workraround, but this is probably not a long term fix. David
On Mon, Nov 17, 2014 at 12:47 PM, Dehao Chen <de...@google.com> wrote: > The patch was updated to ignore comdat einline tuning for AutoFDO. > Performance testing is green. > > OK for google-4_9? > > Thanks, > Dehao > > Index: gcc/auto-profile.c > =================================================================== > --- gcc/auto-profile.c (revision 217523) > +++ gcc/auto-profile.c (working copy) > @@ -1771,6 +1771,7 @@ auto_profile (void) > free_dominance_info (CDI_DOMINATORS); > free_dominance_info (CDI_POST_DOMINATORS); > rebuild_cgraph_edges (); > + compute_inline_parameters (cgraph_get_node > (current_function_decl), true); > pop_cfun (); > } > > Index: gcc/ipa-inline.c > =================================================================== > --- gcc/ipa-inline.c (revision 217523) > +++ gcc/ipa-inline.c (working copy) > @@ -501,7 +501,7 @@ want_early_inline_function_p (struct cgraph_edge * > growth); > want_inline = false; > } > - else if (DECL_COMDAT (callee->decl) > + else if (!flag_auto_profile && DECL_COMDAT (callee->decl) > && growth <= PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_COMDAT)) > ; > else if ((n = num_calls (callee)) != 0 > > On Thu, Nov 13, 2014 at 3:42 PM, Dehao Chen <de...@google.com> wrote: >> We do not do sophisticated recursive call detection in einline phase. >> It only happens in ipa-inline phase. >> >> Dehao >> >> On Thu, Nov 13, 2014 at 3:18 PM, Xinliang David Li <davi...@google.com> >> wrote: >>> On Thu, Nov 13, 2014 at 2:57 PM, Dehao Chen <de...@google.com> wrote: >>>> IIRC, AutoFDO the actual iteration for AutoFDO is mostly <3. But it >>>> should not harm to set max iter as 10. >>>> >>>> On Thu, Nov 13, 2014 at 2:51 PM, Xinliang David Li <davi...@google.com> >>>> wrote: >>>>> After inline summary is recomputed, the large code growth problem will >>>>> also be better controlled, right? >>>> >>>> For this case, recomputing inline summary does not help because the >>>> code was bloated in first einline phase. >>> >>> For recursive inlining, the inline summary for the cloned edges need >>> to be updated to prevent the growth? >>> >>> david >>> >>>> >>>> Dehao >>>> >>>>> >>>>> David >>>>> >>>>> On Thu, Nov 13, 2014 at 2:48 PM, Xinliang David Li <davi...@google.com> >>>>> wrote: >>>>>> Is there a need to have 10 iterations of early inline for autofdo? >>>>>> >>>>>> David >>>>>> >>>>>> On Thu, Nov 13, 2014 at 2:25 PM, Dehao Chen <de...@google.com> wrote: >>>>>>> In AutoFDO, we increase einline iterations. This could lead to >>>>>>> extensive code bloat if we have recursive calls like: >>>>>>> >>>>>>> dtor() { >>>>>>> destroy(node); >>>>>>> } >>>>>>> >>>>>>> destroy(node) { >>>>>>> destroy(left) >>>>>>> destroy(right) >>>>>>> } >>>>>>> >>>>>>> In this case, the size growth will be around 8 which is smaller than >>>>>>> threshold (11). However, if we allow this to happen for 2 iterations, >>>>>>> it will expand the size by 1024X. To fix this problem, we want to set >>>>>>> a much smaller threshold in the AutoFDO case. This is because AutoFDO >>>>>>> do not not rely on aggressive einline to gain more profile context. >>>>>>> >>>>>>> And also, in AutoFDO pass, after we processed a function, we need to >>>>>>> recompute inline parameters because rebuild_cgraph_edges will zero out >>>>>>> all inline parameters. >>>>>>> >>>>>>> The patch is attached below, bootstrapped and perf test on-going. OK >>>>>>> for google-4_9? >>>>>>> >>>>>>> Thanks, >>>>>>> Dehao >>>>>>> >>>>>>> Index: gcc/auto-profile.c >>>>>>> =================================================================== >>>>>>> --- gcc/auto-profile.c (revision 217523) >>>>>>> +++ gcc/auto-profile.c (working copy) >>>>>>> @@ -1771,6 +1771,7 @@ auto_profile (void) >>>>>>> free_dominance_info (CDI_DOMINATORS); >>>>>>> free_dominance_info (CDI_POST_DOMINATORS); >>>>>>> rebuild_cgraph_edges (); >>>>>>> + compute_inline_parameters (cgraph_get_node >>>>>>> (current_function_decl), true); >>>>>>> pop_cfun (); >>>>>>> } >>>>>>> >>>>>>> Index: gcc/opts.c >>>>>>> =================================================================== >>>>>>> --- gcc/opts.c (revision 217523) >>>>>>> +++ gcc/opts.c (working copy) >>>>>>> @@ -1853,6 +1853,12 @@ common_handle_option (struct gcc_options *opts, >>>>>>> maybe_set_param_value ( >>>>>>> PARAM_EARLY_INLINER_MAX_ITERATIONS, 10, >>>>>>> opts->x_param_values, opts_set->x_param_values); >>>>>>> + maybe_set_param_value ( >>>>>>> + PARAM_EARLY_INLINING_INSNS, 4, >>>>>>> + opts->x_param_values, opts_set->x_param_values); >>>>>>> + maybe_set_param_value ( >>>>>>> + PARAM_EARLY_INLINING_INSNS_COMDAT, 4, >>>>>>> + opts->x_param_values, opts_set->x_param_values); >>>>>>> value = true; >>>>>>> /* No break here - do -fauto-profile processing. */ >>>>>>> case OPT_fauto_profile: