This fixes the quadratic behavior in the IPA inline analysis that arises when we inline a function into the same caller many times via inline_to_all_callers, as we do in the testcase for PR37448.  In that case the overall summary of the caller is updated for each inlined call instead of just once.
The solution is to keep track of which nodes we inlined to and delay the summary update.  Honza, I believe this is safe as the callers' summary should only matter for further inline decisions and not this one (inlining into all callers).  Double-checking is appreciated - there should be no code changes by this patch (I verified this on the PR37448 testcase).

This brings down compile-time of IPA inlining heuristics from

 ipa inlining heuristics : 260.14 (84%) usr   0.50 (23%) sys 260.91 (84%) wall  102217 kB (10%) ggc

(that's an optimized, non-checking compiler) to

 ipa inlining heuristics :   3.52 ( 3%) usr   0.55 (12%) sys   4.21 ( 3%) wall  102216 kB (12%) ggc

(that's an unoptimized, checking-enabled compiler; "before" on that unoptimized compiler is ipa inlining heuristics : 935.06 (89%)).

We still do a lot of inlining here, so we see

 integration             :  21.22 (17%) usr   0.45 (10%) sys  21.68 (17%) wall  142979 kB (17%) ggc

which is expected (same unoptimized, checking-enabled compiler).

Bootstrap & regtest are running on x86_64-unknown-linux-gnu; ok for trunk and branch(es)?

Thanks,
Richard.

2016-02-18  Richard Biener  <rguent...@suse.de>

	PR ipa/37448
	* ipa-inline.c (inline_to_all_callers_1): Add to the callers
	hash-set, do not update overall summaries here.  Renamed from ...
	(inline_to_all_callers): ... this which is now wrapping the
	above and performing delayed overall summary update.

Index: gcc/ipa-inline.c
===================================================================
*** gcc/ipa-inline.c	(revision 233498)
--- gcc/ipa-inline.c	(working copy)
*************** flatten_function (struct cgraph_node *no
*** 2163,2169 ****
     recursion.  */
  
  static bool
! inline_to_all_callers (struct cgraph_node *node, void *data)
  {
    int *num_calls = (int *)data;
    bool callee_removed = false;
--- 2163,2170 ----
     recursion.  */
  
  static bool
! inline_to_all_callers_1 (struct cgraph_node *node, void *data,
! 			 hash_set<cgraph_node *> *callers)
  {
    int *num_calls = (int *)data;
    bool callee_removed = false;
*************** inline_to_all_callers (struct cgraph_nod
*** 2193,2199 ****
  	       inline_summaries->get (node->callers->caller)->size);
  	}
  
!       inline_call (node->callers, true, NULL, NULL, true, &callee_removed);
        if (dump_file)
  	fprintf (dump_file,
  		 " Inlined into %s which now has %i size\n",
--- 2194,2203 ----
  	       inline_summaries->get (node->callers->caller)->size);
  	}
  
!       /* Remember which callers we inlined to, delaying updating the
! 	 overall summary.  */
!       callers->add (node->callers->caller);
!       inline_call (node->callers, true, NULL, NULL, false, &callee_removed);
        if (dump_file)
  	fprintf (dump_file,
  		 " Inlined into %s which now has %i size\n",
*************** inline_to_all_callers (struct cgraph_nod
*** 2211,2216 ****
--- 2215,2237 ----
        return false;
      }
  
+ /* Wrapper around inline_to_all_callers_1 doing delayed overall summary
+    update.  */
+ 
+ static bool
+ inline_to_all_callers (struct cgraph_node *node, void *data)
+ {
+   hash_set<cgraph_node *> callers;
+   bool res = inline_to_all_callers_1 (node, data, &callers);
+   /* Perform the delayed update of the overall summary of all callers
+      processed.  This avoids quadratic behavior in the cases where
+      we have a lot of calls to the same function.  */
+   for (hash_set<cgraph_node *>::iterator i = callers.begin ();
+        i != callers.end (); ++i)
+     inline_update_overall_summary (*i);
+   return res;
+ }
+ 
  /* Output overall time estimate.  */
  
  static void
  dump_overall_stats (void)
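
For illustration only, below is a minimal standalone sketch of the delayed-update pattern the patch uses.  This is not GCC code: the toy_* names are made up and std::unordered_set stands in for GCC's hash_set; it merely shows recomputing each distinct caller's summary once after all inlining instead of once per inlined call.

#include <unordered_set>
#include <vector>

/* Toy stand-ins; these names are hypothetical and only illustrate the
   pattern, they are not the GCC API.  */
struct toy_node { int size; };

static void
toy_update_overall_summary (toy_node *)
{
  /* Imagine an expensive recomputation of the caller's summary here.  */
}

static void
toy_inline_one_call (toy_node *caller)
{
  caller->size += 1;
}

/* Inline a callee into all of its call sites, recomputing each caller's
   overall summary exactly once afterwards instead of once per call.  */
static void
toy_inline_to_all_callers (const std::vector<toy_node *> &callers_of_calls)
{
  std::unordered_set<toy_node *> touched;
  for (toy_node *caller : callers_of_calls)
    {
      toy_inline_one_call (caller);
      /* Remember the caller; do not recompute its summary here.  */
      touched.insert (caller);
    }
  /* Delayed update: one recomputation per distinct caller.  */
  for (toy_node *caller : touched)
    toy_update_overall_summary (caller);
}

With N calls from the same caller this does one summary recomputation instead of N, which is where the quadratic behavior goes away.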