Re: [PATCH] Fix PR50823 and its gazillion dups

2011-12-07 Thread Jan Hubicka
Am Tue 06 Dec 2011 03:41:36 PM CET schrieb Richard Guenther  
rguent...@suse.de:




This removes accounting for the number of remaining calls in
the inlining edge badness calculation (as discussed in private
with Honza a long time ago - and yes, we disagreed).  This


Well, the main reason for disagreement was that is makes programs bigger.
The basic idea of including badness in the cost is to make inliner  
preffer to fully inline functions that can be fully inline with little  
effort.  This seems

to work well to limit size growths when inlining.


fixes the various ICEs of the edge badness update and caching
code checking which are now present for over one month.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
A SPEC 2k6 LTO run is also in progress where I hope it will fix
the 403.gcc and 435.gromacs builds which fail with the same
ICE since a month.

Honza, I guess you have some 12h to come up with a different fix now ;)


I have patch for maintaining the growth via edge removal/addition  
hooks for some time now that  speeds things up, too, but needs some  
cleanups (in particular it messes up heap updating somewhere and I did  
not work out where, yet).  I think I can do it over weekend, but I am  
fine if this patch gets into tree beforehand. At least we will see how  
important the global metric still is.


Thanks!
Honza



Re: [PATCH] Fix PR50823 and its gazillion dups

2011-12-07 Thread Richard Guenther
On Wed, 7 Dec 2011, Jan Hubicka wrote:

 Am Tue 06 Dec 2011 03:41:36 PM CET schrieb Richard Guenther
 rguent...@suse.de:
 
  
  This removes accounting for the number of remaining calls in
  the inlining edge badness calculation (as discussed in private
  with Honza a long time ago - and yes, we disagreed).  This
 
 Well, the main reason for disagreement was that is makes programs bigger.
 The basic idea of including badness in the cost is to make inliner preffer to
 fully inline functions that can be fully inline with little effort.  This
 seems
 to work well to limit size growths when inlining.

Well, the theory might work for functions called once, but I doubt
it works for a function called 3 times and one called 2 times.
For functions called once we have the inline-functions-called-once
stuff, idependent on the fibheap priority driven inlining.

  fixes the various ICEs of the edge badness update and caching
  code checking which are now present for over one month.
  
  Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
  A SPEC 2k6 LTO run is also in progress where I hope it will fix
  the 403.gcc and 435.gromacs builds which fail with the same
  ICE since a month.
  
  Honza, I guess you have some 12h to come up with a different fix now ;)
 
 I have patch for maintaining the growth via edge removal/addition hooks for
 some time now that  speeds things up, too, but needs some cleanups (in
 particular it messes up heap updating somewhere and I did not work out where,
 yet).  I think I can do it over weekend, but I am fine if this patch gets into
 tree beforehand. At least we will see how important the global metric still
 is.

SPEC works, shows no regressions (in fact, binary size improves slightly).

I have applied the patch for now.  You can resort to reverting it
with a more proper fix, but there is no reason to keep this broken
during stage3.

Thanks,
Richard.


[PATCH] Fix PR50823 and its gazillion dups

2011-12-06 Thread Richard Guenther

This removes accounting for the number of remaining calls in
the inlining edge badness calculation (as discussed in private
with Honza a long time ago - and yes, we disagreed).  This
fixes the various ICEs of the edge badness update and caching
code checking which are now present for over one month.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
A SPEC 2k6 LTO run is also in progress where I hope it will fix
the 403.gcc and 435.gromacs builds which fail with the same
ICE since a month.

Honza, I guess you have some 12h to come up with a different fix now ;)

The patch enables simplifying the edge badness recalculation code
and possibly will speed it up.

Thanks,
Richard.

2011-12-06  Richard Guenther  rguent...@suse.de

PR tree-optimization/50823
* ipa-inline.c (edge_badness): Do not account for the number of
remaining calls.

* gcc.dg/torture/pr50823.c: New testcase.

Index: gcc/ipa-inline.c
===
--- gcc/ipa-inline.c(revision 182044)
+++ gcc/ipa-inline.c(working copy)
@@ -808,7 +808,6 @@ edge_badness (struct cgraph_edge *edge,
   else if (flag_guess_branch_prob)
 {
   int div = edge-frequency * (110) / CGRAPH_FREQ_MAX;
-  int growth_for_all;
 
   div = MAX (div, 1);
   gcc_checking_assert (edge-frequency = CGRAPH_FREQ_MAX);
@@ -846,14 +845,12 @@ edge_badness (struct cgraph_edge *edge,
  if (dump)
fprintf (dump_file, Badness overflow\n);
}
-  growth_for_all = estimate_growth (callee);
-  badness += growth_for_all;
   if (dump)
{
  fprintf (dump_file,
-%i: guessed profile. frequency %f, overall growth %i,
+%i: guessed profile. frequency %f,
benefit %f%%, divisor %i\n,
-  (int) badness, (double)edge-frequency / CGRAPH_FREQ_BASE, 
growth_for_all,
+  (int) badness, (double)edge-frequency / CGRAPH_FREQ_BASE,
   relative_time_benefit (callee_info, edge, time_growth) * 100 
/ 256.0, div);
}
 }
Index: gcc/testsuite/gcc.dg/torture/pr50823.c
===
--- gcc/testsuite/gcc.dg/torture/pr50823.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr50823.c  (revision 0)
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options -finline-functions } */
+
+int k1, k2, k3, k4, k5, k6, k7, k8;
+
+void set_first_insn (int);
+void set_last_insn (void);
+
+static int make_insn_raw (void) 
+{
+  set_first_insn (0);
+  set_last_insn ();
+  return k1;
+}
+
+static void add_insn_after (void)
+{
+  if (k2)
+k3 = k4;
+
+  if (k5)
+k6 = k7;
+}
+
+void emit_pattern_after_noloc (int (make_raw) (void)) 
+{
+  if (k8)
+{
+  make_raw ();
+  add_insn_after ();
+}
+}
+
+void emit_insn_after_noloc (void)
+{
+  emit_pattern_after_noloc (make_insn_raw);
+}
+
+void emit_debug_insn_before_setloc (int k9)
+{
+  if (k9)
+make_insn_raw ();
+}