[Bug debug/59515] -Og doesn't generate out-of-line copies of inline functions like -O0 does

2017-10-26 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59515

--- Comment #3 from Andrew Pinski  ---
Maybe -fkeep-inline-functions ?

[Bug debug/59515] -Og doesn't generate out-of-line copies of inline functions like -O0 does

2017-10-27 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59515

--- Comment #4 from Richard Biener  ---
-fkeep-inline-functions results for

extern void foobar();
struct Foo {
  void baz () {
    foobar();
    foobar();
  }
  void bar () {
    baz();
    baz();
  }
};
int main()
{
  Foo foo;
  foo.bar();
  return 0;
}

in

_ZN3Foo3bazEv:
.LFB0:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        call    _Z6foobarv
        call    _Z6foobarv
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret

_ZN3Foo3barEv:
.LFB1:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        call    _Z6foobarv
        call    _Z6foobarv
        call    _Z6foobarv
        call    _Z6foobarv
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret

main:
.LFB2:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        call    _Z6foobarv
        call    _Z6foobarv
        call    _Z6foobarv
        call    _Z6foobarv
        movl    $0, %eax
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret


that's undesirable and for some testcases can lead to an exponential increase
of compile time and binary size.  So for the functions that are kept only
because of -fkeep-inline-functions we want to restrict inlining even more
(_really_ only optimize if the size shrinks), or maybe even disable inlining
completely.
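
To see where the exponential growth comes from, here is a rough sketch. It is
an illustration only, not GCC's size accounting. It assumes a hypothetical
chain of inline functions in which level 0 makes two leaf calls, as baz does
above, and every higher level calls the level below twice, as bar does. All
helper names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Leaf calls in the kept out-of-line copy at `level` once every callee has
// been inlined into it: level 0 has 2 calls, and each level doubles that.
constexpr std::uint64_t calls_when_inlined(unsigned level) {
    return std::uint64_t{2} << level;
}

// Total calls emitted across the kept copies of levels 0..depth when
// inlining also happens inside those copies.
constexpr std::uint64_t total_when_inlined(unsigned depth) {
    std::uint64_t total = 0;
    for (unsigned level = 0; level <= depth; ++level)
        total += calls_when_inlined(level);
    return total;
}

// Total calls if the kept copies are left alone: two calls per function.
constexpr std::uint64_t total_when_not_inlined(unsigned depth) {
    return 2 * (std::uint64_t{depth} + 1);
}

// The two-level case matches the assembly above: baz has 2 calls, bar has 4.
static_assert(calls_when_inlined(0) == 2, "baz");
static_assert(calls_when_inlined(1) == 4, "bar");
```

For a chain of depth 10 this model gives 4094 calls across the kept copies
with inlining, against 22 without.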

As said, the implementation would require keeping offline-copy clones
because of the way early inlining works.

Another option would be to reduce --param early-inlining-insns for -Og
and live with the theoretical exponential explosion issue.
With that --param set to zero we do not inline into main but we do inline
into bar () (weird).  The threshold to inline into main is 10.
Something is wrong with size accounting it seems?

Analyzing function body size: void Foo::baz()
Accounting size:2.00, time:0.00 on new predicate:(not inlined)

 BB 2 predicate:(true)
  foobar ();
freq:1.00 size:  1 time: 10
  foobar ();
freq:1.00 size:  1 time: 10
  return;
freq:1.00 size:  1 time:  2
Will be eliminated by inlining
Accounting size:1.00, time:2.00 on predicate:(not inlined)

Inline summary for void Foo::baz()/0 inlinable
  self time:   22
  global time: 0
  self size:   5
  global size: 0
  min size:   0
  self stack:  0
  global stack:0
size:0.00, time:0.00, predicate:(true)
size:3.00, time:2.00, predicate:(not inlined)

  calls:
void Foo::baz()/0 function not considered for inlining
  loop depth: 0 freq:1000 size: 2 time: 11 callee size: 2 stack: 0
void Foo::baz()/0 function not considered for inlining
  loop depth: 0 freq:1000 size: 2 time: 11 callee size: 2 stack: 0

so it seems the call with the this parameter is size 2 and the inlined
function with two calls w/o parameters is 2 as well.  Makes sense in
a simplistic way...
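
The numbers in the dump can be plugged into a very simplified model of the
early inliner's decision. This is an assumption for illustration only, not
GCC's actual heuristic: it treats the growth from inlining one call as the
callee size minus the size of the call statement, and accepts the call when
the growth does not exceed the --param early-inlining-insns limit. CallSite,
growth and would_early_inline are hypothetical names.

```cpp
#include <cassert>

// One call edge, using the sizes printed in the dump above.
struct CallSite {
    int call_size;    // "size: 2" of the call statement in the caller
    int callee_size;  // "callee size: 2" of the function being inlined
};

// Assumed model: inlining replaces the call statement with the callee body.
constexpr int growth(CallSite c) { return c.callee_size - c.call_size; }

// Assumed model: accept the call when the growth fits within the limit.
constexpr bool would_early_inline(CallSite c, int limit) {
    return growth(c) <= limit;
}

// The bar () -> baz () edge as reported in the dump.
constexpr CallSite bar_calls_baz{2, 2};
```

Under this model growth(bar_calls_baz) is 0, so the call is accepted even
with a limit of 0, which is consistent with baz still being inlined into
bar () above.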

That said, a bandaid fix would be the following - some benchmarking
(compile-time / runtime / code-size) for whether adjusting the
early inlining param is necessary would be nice.  Documentation should
be adjusted to reflect the -fkeep-inline-functions default for -Og.

Index: gcc/opts.c
===
--- gcc/opts.c  (revision 254074)
+++ gcc/opts.c  (working copy)
@@ -673,11 +673,17 @@ default_options_optimization (struct gcc
   default_param_value (PARAM_MIN_CROSSJUMP_INSNS),
   opts->x_param_values, opts_set->x_param_values);

-  /* Restrict the amount of work combine does at -Og while retaining
- most of its useful transforms.  */
   if (opts->x_optimize_debug)
-maybe_set_param_value (PARAM_MAX_COMBINE_INSNS, 2,
-  opts->x_param_values, opts_set->x_param_values);
+{
+  /* Restrict the amount of work combine does at -Og while retaining
+most of its useful transforms.  */
+  maybe_set_param_value (PARAM_MAX_COMBINE_INSNS, 2,
+opts->x_param_values, opts_set->x_param_values);
+  /* Restrict early inlining to avoid -fkeep-inline-functions kept
+ functions to grow too large.  */
+  maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 4,
+opts->x_param_values, opts_set->x_param_values);
+}

   /* Allow default optimizations to be specified on a per-machine basis.  */
   maybe_default_options (opts, opts_set,
@@ -948,6 +954,11 @@ finish_options (struct gcc_options *opts
 maybe_set_param_value (PARAM_MAX_STORES_TO_SINK, 0,
opts->x_param_values, opts_set->x_param_values);

+  /* When using -Og enable -fkeep-inline-functions.  */
+  if (opts->x_optimize_debug
+  && !opts_set->x_flag_keep_inline_functions)
+opts->x_flag_keep_inline_functions = 1;
+
   /* The -

[Bug debug/59515] -Og doesn't generate out-of-line copies of inline functions like -O0 does

2013-12-15 Thread naesten at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59515

--- Comment #1 from Samuel Bronson  ---
Oh, I meant to delete this part:

> Traceback (most recent call last):
>   File "/usr/lib/debug//usr/lib/i386-linux-gnu/libstdc++.so.6.0.18-gdb.py", line 63, in <module>
>     from libstdcxx.v6.printers import register_libstdcxx_printers
> ImportError: No module named libstdcxx.v6.printers

It's totally irrelevant to the present bug.

(It is an artefact of a bug in either gdb or Debian's libstdc++ debug package:
the file in question is getting loaded once for
"/usr/lib/debug//usr/lib/i386-linux-gnu/libstdc++.so" and once for
"/usr/lib/i386-linux-gnu/libstdc++.so", and fails the first time 'round because
it's not expecting to be called in that context...)


[Bug debug/59515] -Og doesn't generate out-of-line copies of inline functions like -O0 does

2013-12-19 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59515

Richard Biener  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2013-12-19
                 CC|                              |hubicka at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
     Ever confirmed|0                             |1
           Severity|normal                        |enhancement

--- Comment #2 from Richard Biener  ---
Thanks for the report - I agree that -Og should preserve an OOL copy of inlined
functions.  I'll see what I can do here (with C++ and a lot of abstraction
penalty the code size impact of that could be quite big though, and via that
also the compile-time increase).  We definitely need to avoid inlining into the
unused offline copy - but usually that's too late because we inline from the
leafs - if we don't the code-size impact is exponential ... :/

Honza, is there a way to do this in a clean way?  That is, create a clone
of all initially reachable functions that we don't inline into, but remove
the clone if the original function prevails?  That is, for

inline int bar () { return 42; }
inline int foo () { return bar() + bar(); }
int main()
{
  return foo ();
}

have main return 84 but still emit offline copies of bar and foo, with the
offline copy of foo still calling bar twice?

Not sure if the exponential size consideration matters for -Og which
inlines for size improvements only.
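
For reference, a source-level workaround (not what this bug asks GCC to do)
is to take the address of each inline function, which makes the compiler emit
an out-of-line definition because the function is odr-used. A minimal sketch,
assuming the two functions from the example above; keep_bar, keep_foo and
result are hypothetical names, and whether the out-of-line foo still calls
bar or has it inlined remains up to the optimizer.

```cpp
#include <cassert>

inline int bar() { return 42; }
inline int foo() { return bar() + bar(); }

// Storing the addresses in volatile pointers with external linkage
// odr-uses bar and foo, so out-of-line copies have to be emitted.
int (*volatile keep_bar)() = &bar;
int (*volatile keep_foo)() = &foo;

// What main() in the example above would return.
inline int result() { return foo(); }
```

The pointers are only there to force emission. Calls through them behave the
same as direct calls to bar and foo.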