http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58555
--- Comment #25 from Jan Hubicka <hubicka at ucw dot cz> ---
> It is easier to just return at the beginning instead of duplicating the check.
> I have a patch for it, just for some reason
> I wanted to look deeper into why we inline here.  I forgot the reason, but will
> work it out again today.

OK, I remember the reason: the inlining decisions did not make sense :)
It turned out to be a very basic and apparently ages-old profile updating bug
in clone_inlined_nodes.

This function is used when one inlines call a->b and it produces a clone of b
to be inlined into a.  It updates frequencies in the clone, so if the edge is,
say, executed twice per invocation of a, the frequencies in b are doubled.

Now it may happen that b has functions inlined into it.  In that case the
function recurses and inlines further.  It still updates frequencies on the
same basis: i.e. it takes the average number of executions of the edge per
invocation and multiplies.  This is completely wrong.  If there is a chain
b->c->d->e, all inlined, and c is executed on average 10 times per execution
of b, then the frequencies of the edges c->d and d->e are already scaled up
10-fold.  The current logic sees edge c->d already multiplied and scales
again, causing quite an explosion in frequencies.

Bootstrapping/regtesting x86_64-linux.

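To make the compounding concrete, here is a small standalone sketch of the two scaling strategies. It is a toy model, not GCC code: FREQ_BASE, Chain, scale_compounding and scale_fixed are hypothetical names, and the fixed-point base of 1000 is only assumed to mirror how call graph frequencies are represented.

```cpp
#include <cassert>
#include <vector>

// Assumption: frequencies are fixed-point, with FREQ_BASE meaning
// "executed once per invocation of the caller".
constexpr long long FREQ_BASE = 1000;

// Frequencies of the already-inlined edges below b, in order: b->c, c->d, d->e.
using Chain = std::vector<long long>;

static long long scale(long long freq, long long by) {
  assert(by >= 0);
  return freq * by / FREQ_BASE;
}

// Simplified model of the old behaviour: each recursion level rescales by the
// frequency of the edge it was called on, which has itself just been scaled.
Chain scale_compounding(Chain chain, long long call_freq) {
  long long by = call_freq;
  for (long long &f : chain) {
    f = scale(f, by);
    by = f;  // the next level reuses the already-multiplied frequency
  }
  return chain;
}

// Simplified model of the patched behaviour: the scale of the outermost
// inlined call (a->b) is passed down unchanged through the recursion.
Chain scale_fixed(Chain chain, long long call_freq) {
  for (long long &f : chain)
    f = scale(f, call_freq);
  return chain;
}
```

With every edge in the chain at 10x (10000) and a->b executed twice per invocation of a (2000), scale_fixed gives 20x for each edge, while scale_compounding gives 20x, 200x and 2000x.
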
Index: ipa-inline.c
===================================================================
--- ipa-inline.c	(revision 207870)
+++ ipa-inline.c	(working copy)
@@ -708,6 +684,12 @@
   if (outer_node->global.inlined_to)
     caller_freq = outer_node->callers->frequency;
 
+  if (!caller_freq)
+    {
+      reason = "function is inlined and ulikely";
+      want_inline = false;
+    }
+
   if (!want_inline)
     ;
   /* Inlining of self recursive function into copy of itself within other function
Index: ipa-inline.h
===================================================================
--- ipa-inline.h	(revision 207870)
+++ ipa-inline.h	(working copy)
@@ -233,7 +234,8 @@
 /* In ipa-inline-transform.c  */
 bool inline_call (struct cgraph_edge *, bool, vec<cgraph_edge_p> *, int *, bool);
 unsigned int inline_transform (struct cgraph_node *);
-void clone_inlined_nodes (struct cgraph_edge *e, bool, bool, int *);
+void clone_inlined_nodes (struct cgraph_edge *e, bool, bool, int *,
+                          int freq_scale = -1);
 
 extern int ncalls_inlined;
 extern int nfunctions_inlined;
Index: ipa-inline-transform.c
===================================================================
--- ipa-inline-transform.c	(revision 207870)
+++ ipa-inline-transform.c	(working copy)
@@ -127,11 +127,16 @@
    the edge and redirect it to the new clone.
    DUPLICATE is used for bookkeeping on whether we are actually creating new
    clones or re-using node originally representing out-of-line function call.
-   */
+   By default the offline copy is removed, when it appers dead after inlining.
+   UPDATE_ORIGINAL prevents this transformation.
+   If OVERALL_SIZE is non-NULL, the size is updated to reflect the
+   transformation.
+   FREQ_SCALE is implicit parameter used for internal bookeeping when
+   recursively copying functions inlined into the clone.  */
 
 void
 clone_inlined_nodes (struct cgraph_edge *e, bool duplicate,
-                     bool update_original, int *overall_size)
+                     bool update_original, int *overall_size, int freq_scale)
 {
   struct cgraph_node *inlining_into;
   struct cgraph_edge *next;
@@ -171,12 +176,16 @@
           duplicate = false;
           e->callee->externally_visible = false;
           update_noncloned_frequencies (e->callee, e->frequency);
+          gcc_assert (freq_scale == -1);
         }
       else
         {
           struct cgraph_node *n;
+
+          if (freq_scale == -1)
+            freq_scale = e->frequency;
           n = cgraph_clone_node (e->callee, e->callee->decl,
-                                 e->count, e->frequency, update_original,
+                                 e->count, freq_scale, update_original,
                                  vNULL, true, inlining_into);
           cgraph_redirect_edge_callee (e, n);
         }
@@ -191,7 +200,7 @@
     {
       next = e->next_callee;
       if (!e->inline_failed)
-        clone_inlined_nodes (e, duplicate, update_original, overall_size);
+        clone_inlined_nodes (e, duplicate, update_original, overall_size, freq_scale);
       if (e->speculative && !speculation_useful_p (e, true))
         {
           cgraph_resolve_speculation (e, NULL);