https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119387
Richard Biener <rguenth at gcc dot gnu.org> changed:
What    |Removed |Added
----------------------------------------------------------------------------
See Also|        |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114563
--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Patrick Palka from comment #6)
> Strangely, it seems to have started with r14-5979 "c++: P2280R4, Using
> unknown refs in constant expr [PR106650]".
>
> GCC trunk -ftime-report (with -O2 -g):
>
> callgraph construction         : 257.75 ( 66%)   444M (  5%)
> template instantiation         :  18.99 (  5%)  2082M ( 22%)
> constant expression evaluation :  35.33 (  9%)  5799M ( 61%)
> TOTAL                          : 391.71          9484M
>
> GCC trunk -ftime-report (with -O2 -g), r14-5979 reverted:
>
> callgraph construction         :  11.57 ( 23%)   444M ( 12%)
> template instantiation         :  18.66 ( 38%)  2082M ( 54%)
> constant expression evaluation :   1.59 (  3%)   147M (  4%)
> TOTAL                          :  49.53          3839M
>
> With just -O2 -fsyntax-only, there's also >3x increase in peak memory usage,
> 2.2GB vs 6.8GB.
For me, with a not quite up-to-date trunk built with release checking, -O2 gives:
phase opt and generate         :  12.86 ( 20%)   164M (  2%)
callgraph construction         :   8.28 ( 13%)    55M (  1%)
template instantiation         :  13.50 ( 21%)  2078M ( 24%)
constant expression evaluation :  32.17 ( 51%)  5799M ( 68%)
TOTAL                          :  63.13          8504M
and with -O2 -g:
phase opt and generate         : 297.82 ( 69%)   593M (  6%)
callgraph construction         : 292.23 ( 67%)   444M (  5%)
template instantiation         :  14.54 (  3%)  2083M ( 22%)
constant expression evaluation :  32.20 (  7%)  5799M ( 61%)
symout                         :  85.34 ( 20%)   535M (  6%)
TOTAL                          : 434.68          9485M
To me the increased -g time is simply debug info generation (we have no
timevar for that; symout captures only some of it), possibly because very
many templates are being instantiated?
Interestingly we have again (I've seen this elsewhere, PR114563):
Samples: 1M of event 'cycles:P', Event count (approx.): 1867236850006
Overhead  Samples  Command  Shared Object  Symbol
  85.81%  1500713  cc1plus  cc1plus        [.] ggc_internal_alloc(un
   1.55%    28292  cc1plus  cc1plus        [.] cxx_eval_constant_exp
   1.44%    25472  cc1plus  cc1plus        [.] find_substitution(tre
Timevar coverage can be improved by putting early debug generation under
TV_SYMOUT as well:
diff --git a/gcc/cgraphunit.cc b/gcc/cgraphunit.cc
index 82f205488e9..fa54a59d02b 100644
--- a/gcc/cgraphunit.cc
+++ b/gcc/cgraphunit.cc
@@ -2588,6 +2588,8 @@ symbol_table::finalize_compilation_unit (void)
if (!seen_error ())
{
+ timevar_push (TV_SYMOUT);
+
/* Give the frontends the chance to emit early debug based on
what is still reachable in the TU. */
(*lang_hooks.finalize_early_debug) ();
@@ -2597,6 +2599,8 @@ symbol_table::finalize_compilation_unit (void)
debuginfo_early_start ();
(*debug_hooks->early_finish) (main_input_filename);
debuginfo_early_stop ();
+
+ timevar_pop (TV_SYMOUT);
}
/* Finally drive the pass manager. */
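
To illustrate what the patch changes in the report, here is a toy model of a
push/pop timing stack. It is only a sketch, not GCC's timevar implementation:
ToyTimevarStack, simulate_finalize and the tick counts are all made up. It
assumes that time is charged only to the topmost entry of the stack, and that
the early debug calls currently run while "callgraph construction" is on top,
which is what the -g numbers above suggest.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model only (hypothetical names); NOT GCC's timevar code.
class ToyTimevarStack {
public:
  // Start charging time to NAME; the previous top stops accumulating.
  void push(const std::string &name) { stack_.push_back(name); }

  // Stop charging time to NAME; the entry below it resumes accumulating.
  void pop(const std::string &name) {
    assert(!stack_.empty() && stack_.back() == name);
    stack_.pop_back();
  }

  // Simulate TICKS units of work, charged to the innermost entry only.
  void work(long ticks) {
    if (!stack_.empty())
      totals_[stack_.back()] += ticks;
  }

  // Total ticks charged to NAME so far (0 if it was never on top).
  long total(const std::string &name) const {
    auto it = totals_.find(name);
    return it == totals_.end() ? 0 : it->second;
  }

private:
  std::vector<std::string> stack_;
  std::map<std::string, long> totals_;
};

// Hypothetical stand-in for the region touched by the patch.
// The tick counts are invented for illustration.
ToyTimevarStack simulate_finalize(bool with_patch) {
  ToyTimevarStack tv;
  tv.push("callgraph construction");
  tv.work(10);                        // callgraph work proper
  if (with_patch) tv.push("symout");
  tv.work(250);                       // early debug generation
  if (with_patch) tv.pop("symout");
  tv.work(5);                         // remaining callgraph work
  tv.pop("callgraph construction");
  return tv;
}
```

In this model, simulate_finalize(false) charges all 265 ticks to "callgraph
construction", while simulate_finalize(true) moves the 250 early-debug ticks
to "symout" and leaves 15 for callgraph construction. The total is unchanged;
only the attribution moves.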