Re: Relax limits of early inliner for the forwarder functions
> So this counts all calls in the function we want to inline (!?). That's
> completely backward to me. In fact for forwarder functions you still only
> allow half of the early-inlining-insns growth. Previously for non-leafs we
> didn't allow any growth (hm, why?).

Well, the idea is that inlining leaf functions is almost always a good idea
(i.e. you can assume that the function's body will optimize well with the
surrounding code, and eliminating a call is a good thing). Inlining functions
that contain calls is less attractive. I introduced the leaf/non-leaf logic
in about the 4.6 timeframe, after late inlining became more informed about
anticipated optimizations, but it caused quite some trouble with C++
abstraction, so relaxing this logic somewhat seemed like a reasonable idea.

> Now with relaxing that and allowing functions with calls to be inlined more
> frequently we run into PR55797 which shows that we cannot limit recursive
> inlining anymore if it is indirect one level. By means of early inlining
> iteration we blow up completely (8 iterations at most?!). Also because we
> do not compute overall function growth (because we rely on early inlining
> only shrinking code size ...).

Well, we compute function growth, but for each iteration separately.

> I believe we at least need to track recursive inlining during early inliner
> iteration by means of some ->aux marking or so.

Hmm, I guess we want to disable recursive inlining in the early inliner
completely. I will take a look.

Honza

> Honza - please have a look at the ICE in PR55797 and the issues with
> this patch enabling more inlining.
>
> Thanks,
> Richard.
>
> >         {
> >           if (dump_file)
> >             fprintf (dump_file, " will not early inline: %s/%i->%s/%i, "
> > -                    "growth %i exceeds --param early-inlining-insns\n",
> > +                    "growth %i exceeds --param early-inlining-insns "
> > +                    "divided by number of calls\n",
> >                      xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
> >                      xstrdup (cgraph_node_name (callee)), callee->uid,
> >                      growth);
Re: Relax limits of early inliner for the forwarder functions
On Mon, Nov 5, 2012 at 12:23 PM, Jan Hubicka wrote:
> Hi,
> in the 4.6 timeframe I limited early inliner growth to apply only for leaf
> functions. This does not work really well, because with less propagation
> of address expressions we are really not 100% successful at detecting C++
> forwarders and predicting them as zero cost. This patch simply makes the
> cost be divided by the number of callees, similarly as in LLVM.
>
> Bootstrapped/regtested x86_64-linux, benchmarked and committed.
> The patch seems a consistent win in all benchmarks, most noticeably in
> tramp3d.
>
>         * ipa-inline.c (leaf_node_p): Rename to ...
>         (num_calls) ... this one.
>         (want_early_inline_function_p): Allow small growth on non-leafs.
>
> Index: ipa-inline.c
> ===
> --- ipa-inline.c (revision 193134)
> +++ ipa-inline.c (working copy)
> @@ -380,17 +380,18 @@ can_early_inline_edge_p (struct cgraph_e
>  }
>
>
> -/* Return true when N is leaf function. Accept cheap builtins
> -   in leaf functions. */
> +/* Return number of calls in N. Ignore cheap builtins. */
>
> -static bool
> -leaf_node_p (struct cgraph_node *n)
> +static int
> +num_calls (struct cgraph_node *n)
>  {
>    struct cgraph_edge *e;
> +  int num = 0;
> +
>    for (e = n->callees; e; e = e->next_callee)
>      if (!is_inexpensive_builtin (e->callee->symbol.decl))
> -      return false;
> -  return true;
> +      num++;
> +  return num;
>  }

This counts all calls in 'n'

> @@ -414,6 +415,8 @@ want_early_inline_function_p (struct cgr
>    else
>      {
>        int growth = estimate_edge_growth (e);
> +      int n;
> +
>        if (growth <= 0)
>          ;
>        else if (!cgraph_maybe_hot_edge_p (e)
> @@ -427,22 +430,23 @@ want_early_inline_function_p (struct cgr
>                      growth);
>           want_inline = false;
>         }
> -      else if (!leaf_node_p (callee)
> -               && growth > 0)
> +      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
>         {
>           if (dump_file)
>             fprintf (dump_file, " will not early inline: %s/%i->%s/%i, "
> -                    "callee is not leaf and code would grow by %i\n",
> +                    "growth %i exceeds --param early-inlining-insns\n",
>                      xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
>                      xstrdup (cgraph_node_name (callee)), callee->uid,
>                      growth);
>           want_inline = false;
>         }
> -      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
> +      else if ((n = num_calls (callee)) != 0
> +               && growth * (n + 1) > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))

So this counts all calls in the function we want to inline (!?). That's
completely backward to me. In fact for forwarder functions you still only
allow half of the early-inlining-insns growth. Previously for non-leafs we
didn't allow any growth (hm, why?).

Now with relaxing that and allowing functions with calls to be inlined more
frequently we run into PR55797 which shows that we cannot limit recursive
inlining anymore if it is indirect one level. By means of early inlining
iteration we blow up completely (8 iterations at most?!). Also because we
do not compute overall function growth (because we rely on early inlining
only shrinking code size ...).

I believe we at least need to track recursive inlining during early inliner
iteration by means of some ->aux marking or so.

Honza - please have a look at the ICE in PR55797 and the issues with
this patch enabling more inlining.

Thanks,
Richard.

>         {
>           if (dump_file)
>             fprintf (dump_file, " will not early inline: %s/%i->%s/%i, "
> -                    "growth %i exceeds --param early-inlining-insns\n",
> +                    "growth %i exceeds --param early-inlining-insns "
> +                    "divided by number of calls\n",
>                      xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
>                      xstrdup (cgraph_node_name (callee)), callee->uid,
>                      growth);
Relax limits of early inliner for the forwarder functions
Hi,
in the 4.6 timeframe I limited early inliner growth to apply only for leaf
functions. This does not work really well, because with less propagation of
address expressions we are really not 100% successful at detecting C++
forwarders and predicting them as zero cost. This patch simply makes the
cost be divided by the number of callees, similarly as in LLVM.

Bootstrapped/regtested x86_64-linux, benchmarked and committed.
The patch seems a consistent win in all benchmarks, most noticeably in
tramp3d.

        * ipa-inline.c (leaf_node_p): Rename to ...
        (num_calls) ... this one.
        (want_early_inline_function_p): Allow small growth on non-leafs.

Index: ipa-inline.c
===
--- ipa-inline.c (revision 193134)
+++ ipa-inline.c (working copy)
@@ -380,17 +380,18 @@ can_early_inline_edge_p (struct cgraph_e
 }


-/* Return true when N is leaf function. Accept cheap builtins
-   in leaf functions. */
+/* Return number of calls in N. Ignore cheap builtins. */

-static bool
-leaf_node_p (struct cgraph_node *n)
+static int
+num_calls (struct cgraph_node *n)
 {
   struct cgraph_edge *e;
+  int num = 0;
+
   for (e = n->callees; e; e = e->next_callee)
     if (!is_inexpensive_builtin (e->callee->symbol.decl))
-      return false;
-  return true;
+      num++;
+  return num;
 }


@@ -414,6 +415,8 @@ want_early_inline_function_p (struct cgr
   else
     {
       int growth = estimate_edge_growth (e);
+      int n;
+
       if (growth <= 0)
         ;
       else if (!cgraph_maybe_hot_edge_p (e)
@@ -427,22 +430,23 @@ want_early_inline_function_p (struct cgr
                     growth);
          want_inline = false;
        }
-      else if (!leaf_node_p (callee)
-               && growth > 0)
+      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
        {
          if (dump_file)
            fprintf (dump_file, " will not early inline: %s/%i->%s/%i, "
-                    "callee is not leaf and code would grow by %i\n",
+                    "growth %i exceeds --param early-inlining-insns\n",
                     xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
                     xstrdup (cgraph_node_name (callee)), callee->uid,
                     growth);
          want_inline = false;
        }
-      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
+      else if ((n = num_calls (callee)) != 0
+               && growth * (n + 1) > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
        {
          if (dump_file)
            fprintf (dump_file, " will not early inline: %s/%i->%s/%i, "
-                    "growth %i exceeds --param early-inlining-insns\n",
+                    "growth %i exceeds --param early-inlining-insns "
+                    "divided by number of calls\n",
                     xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
                     xstrdup (cgraph_node_name (callee)), callee->uid,
                     growth);
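
For readers following the thread, here is a minimal sketch of the growth
check before and after the patch, reduced to plain integers. It is not GCC
code: the function names old_want_early_inline and new_want_early_inline are
hypothetical, the parameter limit stands in for
PARAM_VALUE (PARAM_EARLY_INLINING_INSNS), num_calls stands in for the result
of num_calls (callee), and the cgraph_maybe_hot_edge_p check is left out.

```c
#include <stdbool.h>

/* Pre-patch rule (hypothetical standalone form): a callee that contains
   any non-builtin call may not grow the caller at all; a leaf callee may
   grow it by up to LIMIT.  */
bool
old_want_early_inline (int growth, int num_calls, int limit)
{
  if (growth <= 0)
    return true;                /* Shrinking or neutral: always inline.  */
  if (num_calls != 0)
    return false;               /* Non-leaf and the code would grow.  */
  return growth <= limit;
}

/* Post-patch rule (hypothetical standalone form): growth is first checked
   against LIMIT, then, for callees with calls, against LIMIT divided by
   (number of calls + 1), written as a multiplication as in the patch.  */
bool
new_want_early_inline (int growth, int num_calls, int limit)
{
  if (growth <= 0)
    return true;
  if (growth > limit)
    return false;
  if (num_calls != 0 && growth * (num_calls + 1) > limit)
    return false;
  return true;
}
```

With an illustrative limit of 10, a callee containing one call (the typical
forwarder) is accepted up to a growth of 5, which is the "half of the
early-inlining-insns growth" case raised in the review, while the old rule
rejects any positive growth for that callee.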