The patch was committed to google-4_8, but it causes a problem because einline sets PARAM_EARLY_INLINING_INSNS = 11. This will cause recursive inlining at the einline stage (e.g. main->foo, foo->bar, bar->foo) when AutoFDO is enabled.
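To make the call shape concrete, here is a minimal sketch of such a cycle. The function bodies are made up for illustration and do not come from GCC or from any benchmark; `entry` is a hypothetical stand-in for `main`. Only the edges (main->foo, foo->bar, bar->foo) match the example above. Each function is small, so each call site is the sort of thing a size-based early inliner would likely consider, and inlining one edge exposes the next.

```c
/* Hypothetical illustration: only the call edges mirror the example
   (main->foo, foo->bar, bar->foo); the bodies are invented.  */
#include <assert.h>

static int bar (int n);

/* foo -> bar */
static int
foo (int n)
{
  if (n <= 0)
    return 0;
  return 1 + bar (n - 1);
}

/* bar -> foo, which closes the cycle.  */
static int
bar (int n)
{
  if (n <= 0)
    return 0;
  return 1 + foo (n - 1);
}

/* Stands in for main in the main->foo edge.  */
int
entry (int n)
{
  assert (n >= 0);
  return foo (n);
}
```

With a bounded iteration count the inliner stops after a fixed number of rounds; without a bound, each round can uncover another foo/bar call site to inline.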
The following patch can fix the problem by doing more targeted early inlining:

Index: gcc/predict.c
===================================================================
--- gcc/predict.c (revision 199593)
+++ gcc/predict.c (working copy)
@@ -175,6 +175,8 @@ cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
       && !maybe_hot_count_p (NULL, edge->count))
     return false;
+  if (flag_auto_profile)
+    return false;
   if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
       || (edge->callee
           && edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED))

Performance testing on-going...

Dehao

On Wed, May 29, 2013 at 3:44 PM, Dehao Chen <de...@google.com> wrote:
> OK, I'll commit the early inline part.
>
> Dehao
>
> On Wed, May 29, 2013 at 10:00 AM, Xinliang David Li <davi...@google.com>
> wrote:
>> The early inlining part is ok. The tracer optimization should be
>> revisited -- we should have more fine grain control on it (for
>> instance, based on FDO summary -- but that should be common to
>> FDO/LIPO).
>>
>> David
>>
>> On Wed, May 29, 2013 at 9:39 AM, Dehao Chen <de...@google.com> wrote:
>>> In gcc4-8, the max einline iterations are restricted to 1. For
>>> AutoFDO, this is bad because early inline is not size restricted. This
>>> patch allows einline to do multiple iterations in AutoFDO. It also
>>> enables tracer optimization in AutoFDO.
>>>
>>> Bootstrapped and passed regression test.
>>>
>>> OK for google-4_8?
>>>
>>> Thanks,
>>> Dehao
>>>
>>> Index: gcc/ipa-inline.c
>>> ===================================================================
>>> --- gcc/ipa-inline.c (revision 199416)
>>> +++ gcc/ipa-inline.c (working copy)
>>> @@ -2161,7 +2161,8 @@ early_inliner (void)
>>>      {
>>>        /* We iterate incremental inlining to get trivial cases of indirect
>>>           inlining.  */
>>> -      while (iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
>>> +      while ((flag_auto_profile
>>> +              || iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS))
>>>               && early_inline_small_functions (node))
>>>          {
>>>            timevar_push (TV_INTEGRATION);
>>> Index: gcc/opts.c
>>> ===================================================================
>>> --- gcc/opts.c (revision 199416)
>>> +++ gcc/opts.c (working copy)
>>> @@ -1644,6 +1644,8 @@ common_handle_option (struct gcc_options *opts,
>>>          opts->x_flag_peel_loops = value;
>>>        if (!opts_set->x_flag_value_profile_transformations)
>>>          opts->x_flag_value_profile_transformations = value;
>>> +      if (!opts_set->x_flag_tracer)
>>> +        opts->x_flag_tracer = value;
>>>        if (!opts_set->x_flag_inline_functions)
>>>          opts->x_flag_inline_functions = value;
>>>        if (!opts_set->x_flag_ipa_cp)