Re: [PR58315] reset inlined debug vars at return-to point
On Wed, Jun 3, 2015 at 11:55 PM, Alexandre Oliva aol...@redhat.com wrote: On Feb 25, 2015, Alexandre Oliva aol...@redhat.com wrote: This patch fixes a problem that has been with us for several years. Variable tracking has no idea about the end of the lifetime of inlined variables, so it keeps on computing locations for them over and over, even though the computed locations make no sense whatsoever because the variable can't even be accessed any more. With this patch, we unbind all inlined variables at the point the inlined function returns to, so that the locations for those variables will not be touched any further. In theory, we could do something similar to non-inlined auto variables, when they go out of scope, but their decls apply to the entire function and I'm told gdb sort-of expects the variables to be accessible throughout the function, so I'm not tackling that in this patch, for I'm happy enough with what this patch gets us: - almost 99% reduction in the output asm for the PR testcase - more than 90% reduction in the peak memory use compiling that testcase - 63% reduction in the compile time for that testcase What's scary is that the testcase is not particularly pathological. Any function that calls a longish sequence of inlined functions, that in turn call other inline functions, and so on, something that's not particularly unusual in C++, will likely observe significant improvement, as we won't see growing sequences of var_location notes after each call or so, as var-tracking computes a new in-stack location for the implicit this argument of each previously-inlined function. Regstrapped on x86_64-linux-gnu and i686-linux-gnu. Ok to install? Ping? Ok for trunk and 5.2 after a while with no issues popping up. Thanks, Richard. for gcc/ChangeLog PR debug/58315 * tree-inline.c (reset_debug_binding): New. (reset_debug_bindings): Likewise. (expand_call_inline): Call it. --- gcc/tree-inline.c | 56 + 1 file changed, 56 insertions(+) diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index 71d75d9..c1578e5 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -4346,6 +4346,60 @@ add_local_variables (struct function *callee, struct function *caller, } } +/* Add to BINDINGS a debug stmt resetting SRCVAR if inlining might + have brought in or introduced any debug stmts for SRCVAR. */ + +static inline void +reset_debug_binding (copy_body_data *id, tree srcvar, gimple_seq *bindings) +{ + tree *remappedvarp = id-decl_map-get (srcvar); + + if (!remappedvarp) +return; + + if (TREE_CODE (*remappedvarp) != VAR_DECL) +return; + + if (*remappedvarp == id-retvar || *remappedvarp == id-retbnd) +return; + + tree tvar = target_for_debug_bind (*remappedvarp); + if (!tvar) +return; + + gdebug *stmt = gimple_build_debug_bind (tvar, NULL_TREE, + id-call_stmt); + gimple_seq_add_stmt (bindings, stmt); +} + +/* For each inlined variable for which we may have debug bind stmts, + add before GSI a final debug stmt resetting it, marking the end of + its life, so that var-tracking knows it doesn't have to compute + further locations for it. */ + +static inline void +reset_debug_bindings (copy_body_data *id, gimple_stmt_iterator gsi) +{ + tree var; + unsigned ix; + gimple_seq bindings = NULL; + + if (!gimple_in_ssa_p (id-src_cfun)) +return; + + if (!opt_for_fn (id-dst_fn, flag_var_tracking_assignments)) +return; + + for (var = DECL_ARGUMENTS (id-src_fn); + var; var = DECL_CHAIN (var)) +reset_debug_binding (id, var, bindings); + + FOR_EACH_LOCAL_DECL (id-src_cfun, ix, var) +reset_debug_binding (id, var, bindings); + + gsi_insert_seq_before_without_update (gsi, bindings, GSI_SAME_STMT); +} + /* If STMT is a GIMPLE_CALL, replace it with its inline expansion. */ static bool @@ -4659,6 +4713,8 @@ expand_call_inline (basic_block bb, gimple stmt, copy_body_data *id) GCOV_COMPUTE_SCALE (cg_edge-frequency, CGRAPH_FREQ_BASE), bb, return_block, NULL); + reset_debug_bindings (id, stmt_gsi); + /* Reset the escaped solution. */ if (cfun-gimple_df) pt_solution_reset (cfun-gimple_df-escaped); -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
On Feb 25, 2015, Alexandre Oliva aol...@redhat.com wrote: This patch fixes a problem that has been with us for several years. Variable tracking has no idea about the end of the lifetime of inlined variables, so it keeps on computing locations for them over and over, even though the computed locations make no sense whatsoever because the variable can't even be accessed any more. With this patch, we unbind all inlined variables at the point the inlined function returns to, so that the locations for those variables will not be touched any further. In theory, we could do something similar to non-inlined auto variables, when they go out of scope, but their decls apply to the entire function and I'm told gdb sort-of expects the variables to be accessible throughout the function, so I'm not tackling that in this patch, for I'm happy enough with what this patch gets us: - almost 99% reduction in the output asm for the PR testcase - more than 90% reduction in the peak memory use compiling that testcase - 63% reduction in the compile time for that testcase What's scary is that the testcase is not particularly pathological. Any function that calls a longish sequence of inlined functions, that in turn call other inline functions, and so on, something that's not particularly unusual in C++, will likely observe significant improvement, as we won't see growing sequences of var_location notes after each call or so, as var-tracking computes a new in-stack location for the implicit this argument of each previously-inlined function. Regstrapped on x86_64-linux-gnu and i686-linux-gnu. Ok to install? Ping? for gcc/ChangeLog PR debug/58315 * tree-inline.c (reset_debug_binding): New. (reset_debug_bindings): Likewise. (expand_call_inline): Call it. --- gcc/tree-inline.c | 56 + 1 file changed, 56 insertions(+) diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index 71d75d9..c1578e5 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -4346,6 +4346,60 @@ add_local_variables (struct function *callee, struct function *caller, } } +/* Add to BINDINGS a debug stmt resetting SRCVAR if inlining might + have brought in or introduced any debug stmts for SRCVAR. */ + +static inline void +reset_debug_binding (copy_body_data *id, tree srcvar, gimple_seq *bindings) +{ + tree *remappedvarp = id-decl_map-get (srcvar); + + if (!remappedvarp) +return; + + if (TREE_CODE (*remappedvarp) != VAR_DECL) +return; + + if (*remappedvarp == id-retvar || *remappedvarp == id-retbnd) +return; + + tree tvar = target_for_debug_bind (*remappedvarp); + if (!tvar) +return; + + gdebug *stmt = gimple_build_debug_bind (tvar, NULL_TREE, + id-call_stmt); + gimple_seq_add_stmt (bindings, stmt); +} + +/* For each inlined variable for which we may have debug bind stmts, + add before GSI a final debug stmt resetting it, marking the end of + its life, so that var-tracking knows it doesn't have to compute + further locations for it. */ + +static inline void +reset_debug_bindings (copy_body_data *id, gimple_stmt_iterator gsi) +{ + tree var; + unsigned ix; + gimple_seq bindings = NULL; + + if (!gimple_in_ssa_p (id-src_cfun)) +return; + + if (!opt_for_fn (id-dst_fn, flag_var_tracking_assignments)) +return; + + for (var = DECL_ARGUMENTS (id-src_fn); + var; var = DECL_CHAIN (var)) +reset_debug_binding (id, var, bindings); + + FOR_EACH_LOCAL_DECL (id-src_cfun, ix, var) +reset_debug_binding (id, var, bindings); + + gsi_insert_seq_before_without_update (gsi, bindings, GSI_SAME_STMT); +} + /* If STMT is a GIMPLE_CALL, replace it with its inline expansion. */ static bool @@ -4659,6 +4713,8 @@ expand_call_inline (basic_block bb, gimple stmt, copy_body_data *id) GCOV_COMPUTE_SCALE (cg_edge-frequency, CGRAPH_FREQ_BASE), bb, return_block, NULL); + reset_debug_bindings (id, stmt_gsi); + /* Reset the escaped solution. */ if (cfun-gimple_df) pt_solution_reset (cfun-gimple_df-escaped); -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
On Mar 9, 2015, Jeff Law l...@redhat.com wrote: On 03/09/15 08:38, Richard Biener wrote: On Fri, Mar 6, 2015 at 7:04 PM, Alexandre Oliva aol...@redhat.com wrote: On Feb 26, 2015, Alexandre Oliva aol...@redhat.com wrote: So far, all the differences I looked at were caused by padding at the end of BBs, and by jump stmts without line numbers at the end of BBs, both right after the debug reset stmts the proposed patch introduces. Further investigation showed there were other sources of spurious differences: - copies arising from the expansion of PHI nodes; source code information associated with these copies points at the source of the copy, which is hardly useful and sensible. Care to explain? We spend quite some resources to maintain them (locations on PHI args, that is). The situation I remember, for looking more extensively into, involved a variable local to the inlined function, a copy of that variable to a caller's variable, and that caller's variable being further modified elsewhere. We coalesced the two variables into one, removed the copy statement altogether, and the location from the initial set within the inline function made it to PHI nodes related with the caller's variable, and that survived various PHI recomputations due to various CFG reorganizations. When the time came to expand the PHI nodes, copies were required and line number info internal to the inlined function was added to the copy arising from the edge that had in it the SSA name set inside the inline function. And so the ranges of the inline function were extended far past its actual end, because a coalesced copy from one of its variables was optimized away and then reintroduced with line number info pertaining to the inlined function. I almost responded to that claim as well, but then thought better of it as that patch (AFAICT) wasn't proposed for inclusion, but was being used for testing purposes. The patch that removed line number annotation from copies arising from expanding PHI nodes was included only for reference, yes. The upthread patch for PR58315 still stands. Expansion of the PHI into copies should have locations which point to the source of the various PHI args. Those are quite meaningful. I can see those can be meaningful in many cases. Not in the ones I looked at as part of investigating the coverage changes, though. We have a substantial number of copies that arise from processes like the one I described above, and those are most certainly not sensible, mainly because the relevant line number info was dropped on the floor along with copies optimized away due to coalescing of variables from different scopes. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
On Fri, Mar 6, 2015 at 7:04 PM, Alexandre Oliva aol...@redhat.com wrote: On Feb 26, 2015, Alexandre Oliva aol...@redhat.com wrote: So far, all the differences I looked at were caused by padding at the end of BBs, and by jump stmts without line numbers at the end of BBs, both right after the debug reset stmts the proposed patch introduces. Further investigation showed there were other sources of spurious differences: - copies arising from the expansion of PHI nodes; source code information associated with these copies points at the source of the copy, which is hardly useful and sensible. Care to explain? We spend quite some resources to maintain them (locations on PHI args, that is). Btw, I'd expect tree-ssa-copyrename.c to be a source of bogus base variables still. Richard. - optimization of single-range location lists to a location expression covering the entire function, which extends, often incorrectly, the range of the expression, and thus the coverage of the function. By patching GCC so as to eliminate these differences (patches attached for reference), I managed to reduce the coverage differences in libgcc_s.so to a manageable size (about a dozen variables in 3 functions in 2 object files), so I could manually investigate each one of them. All remaining differences amounted to single insns, originally before the reset debug stmt introduced by the patch, computing subexpressions of expressions expanded after the return point of the inlined function. Because of TER, this often causes stmts to be moved past corresponding debug stmts. This is a long-standing and long-known issue to me, thus nothing particularly new brought about or worsened by the proposed patch, and that would be fixed with stmt frontiers. Ultimately, the issue is not so much that the insn gets emitted at a different spot, but rather the possibility that no insn will remain at the point where it was expected, which might make it impossible (with current compiler and debugger) to inspect the results of that computation. The worst the reset debug stmts introduced by this patch could do is to make those effects not visible at other points of the computation, where they are not guaranteed or expected to be visible to begin with. That said, it might be nice if we emitted partial of full expansions of subexpressions at their original location, relative to debug stmts expanded to debug insns, rather than all at the point of the full expression. I recall having started something along these lines years ago, but that project never got to a working state. The differences in libstdc++.so, even after the patches intended to minimize differences, are too many to investigate in depth, but from what I can tell from a quick glance at the diff in dwlocstat per-variable per-range coverage dumps, nearly all of the differences amount to one insn in the final range, which likely means TER-introduced differences. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
On 03/09/15 08:38, Richard Biener wrote: On Fri, Mar 6, 2015 at 7:04 PM, Alexandre Oliva aol...@redhat.com wrote: On Feb 26, 2015, Alexandre Oliva aol...@redhat.com wrote: So far, all the differences I looked at were caused by padding at the end of BBs, and by jump stmts without line numbers at the end of BBs, both right after the debug reset stmts the proposed patch introduces. Further investigation showed there were other sources of spurious differences: - copies arising from the expansion of PHI nodes; source code information associated with these copies points at the source of the copy, which is hardly useful and sensible. Care to explain? We spend quite some resources to maintain them (locations on PHI args, that is). I almost responded to that claim as well, but then thought better of it as that patch (AFAICT) wasn't proposed for inclusion, but was being used for testing purposes. Expansion of the PHI into copies should have locations which point to the source of the various PHI args. Those are quite meaningful. Jeff
Re: [PR58315] reset inlined debug vars at return-to point
On Feb 26, 2015, Alexandre Oliva aol...@redhat.com wrote: So far, all the differences I looked at were caused by padding at the end of BBs, and by jump stmts without line numbers at the end of BBs, both right after the debug reset stmts the proposed patch introduces. Further investigation showed there were other sources of spurious differences: - copies arising from the expansion of PHI nodes; source code information associated with these copies points at the source of the copy, which is hardly useful and sensible. - optimization of single-range location lists to a location expression covering the entire function, which extends, often incorrectly, the range of the expression, and thus the coverage of the function. By patching GCC so as to eliminate these differences (patches attached for reference), I managed to reduce the coverage differences in libgcc_s.so to a manageable size (about a dozen variables in 3 functions in 2 object files), so I could manually investigate each one of them. All remaining differences amounted to single insns, originally before the reset debug stmt introduced by the patch, computing subexpressions of expressions expanded after the return point of the inlined function. Because of TER, this often causes stmts to be moved past corresponding debug stmts. This is a long-standing and long-known issue to me, thus nothing particularly new brought about or worsened by the proposed patch, and that would be fixed with stmt frontiers. Ultimately, the issue is not so much that the insn gets emitted at a different spot, but rather the possibility that no insn will remain at the point where it was expected, which might make it impossible (with current compiler and debugger) to inspect the results of that computation. The worst the reset debug stmts introduced by this patch could do is to make those effects not visible at other points of the computation, where they are not guaranteed or expected to be visible to begin with. That said, it might be nice if we emitted partial of full expansions of subexpressions at their original location, relative to debug stmts expanded to debug insns, rather than all at the point of the full expression. I recall having started something along these lines years ago, but that project never got to a working state. The differences in libstdc++.so, even after the patches intended to minimize differences, are too many to investigate in depth, but from what I can tell from a quick glance at the diff in dwlocstat per-variable per-range coverage dumps, nearly all of the differences amount to one insn in the final range, which likely means TER-introduced differences. End scopes before location-less insns From: Alexandre Oliva aol...@redhat.com --- gcc/final.c | 25 - 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/gcc/final.c b/gcc/final.c index 1fa93d9..6fab871 100644 --- a/gcc/final.c +++ b/gcc/final.c @@ -1634,13 +1634,19 @@ choose_inner_scope (tree s1, tree s2) /* Emit lexical block notes needed to change scope from S1 to S2. */ static void -change_scope (rtx_insn *orig_insn, tree s1, tree s2) +change_scope (rtx_insn *prev_insn, rtx_insn *orig_insn, tree s1, tree s2) { rtx_insn *insn = orig_insn; + rtx_insn *einsn; tree com = NULL_TREE; tree ts1 = s1, ts2 = s2; tree s; + if (!prev_insn) +einsn = insn; + else +einsn = NEXT_INSN (prev_insn); + while (ts1 != ts2) { gcc_assert (ts1 ts2); @@ -1660,7 +1666,7 @@ change_scope (rtx_insn *orig_insn, tree s1, tree s2) s = s1; while (s != com) { - rtx_note *note = emit_note_before (NOTE_INSN_BLOCK_END, insn); + rtx_note *note = emit_note_before (NOTE_INSN_BLOCK_END, einsn); NOTE_BLOCK (note) = s; s = BLOCK_SUPERCONTEXT (s); } @@ -1684,6 +1690,7 @@ reemit_insn_block_notes (void) tree cur_block = DECL_INITIAL (cfun-decl); rtx_insn *insn; rtx_note *note; + rtx_insn *prev_insn = NULL; insn = get_insns (); for (; insn; insn = NEXT_INSN (insn)) @@ -1693,10 +1700,16 @@ reemit_insn_block_notes (void) /* Prevent lexical blocks from straddling section boundaries. */ if (NOTE_P (insn) NOTE_KIND (insn) == NOTE_INSN_SWITCH_TEXT_SECTIONS) { + rtx_insn *einsn; + if (prev_insn) + einsn = NEXT_INSN (prev_insn); + else + einsn = insn; + prev_insn = NULL; for (tree s = cur_block; s != DECL_INITIAL (cfun-decl); s = BLOCK_SUPERCONTEXT (s)) { - rtx_note *note = emit_note_before (NOTE_INSN_BLOCK_END, insn); + rtx_note *note = emit_note_before (NOTE_INSN_BLOCK_END, einsn); NOTE_BLOCK (note) = s; note = emit_note_after (NOTE_INSN_BLOCK_BEG, insn); NOTE_BLOCK (note) = s; @@ -1732,14 +1745,16 @@ reemit_insn_block_notes (void) if (this_block != cur_block) { - change_scope (insn, cur_block, this_block); + change_scope
Re: [PR58315] reset inlined debug vars at return-to point
On March 5, 2015 8:26:42 PM CET, Alexandre Oliva aol...@redhat.com wrote: On Mar 4, 2015, Richard Biener richard.guent...@gmail.com wrote: Compile-time was slightly faster with the patch, 45s vs. 47s, but the machine wasn't completely un-loaded. var-tracking parts: unpatched: variable tracking : 0.63 ( 1%) usr 0.03 ( 1%) sys 0.82 ( 2%) wall 28641 kB ( 2%) ggc var-tracking dataflow : 3.72 ( 8%) usr 0.04 ( 1%) sys 3.65 ( 7%) wall1337 kB ( 0%) ggc var-tracking emit : 2.63 ( 6%) usr 0.02 ( 1%) sys 2.55 ( 5%) wall 148835 kB ( 8%) ggc patched: variable tracking : 0.64 ( 1%) usr 0.01 ( 0%) sys 0.72 ( 1%) wall 24202 kB ( 1%) ggc var-tracking dataflow : 1.96 ( 4%) usr 0.01 ( 0%) sys 1.94 ( 4%) wall1326 kB ( 0%) ggc var-tracking emit : 1.46 ( 3%) usr 0.02 ( 0%) sys 1.49 ( 3%) wall 46980 kB ( 3%) ggc we have at the point of RTL expansion 56% more debug statements (988231 lines with # DEBUG in the .optimized dump out of 1212518 lines in total vs. 630666 out of 854953). So we go from roughly 1 debug stmt per real stmt to 1.5 debug stmts per real stmt. So, if I got this right, all these extra debug stmts and insns had no effect whatsoever on compile time proper. The reduction in compile time can be entirely accounted for by the time they save in the var-tracking parts, and any compile time increase they bring about in other passes is negligible. Does this match your assessment of the impact of the patch? For the effect on tramp3d, yes. The positive effect on var-tracking compile time really looks good. So I'm tempted to approve the patch for 5.0. we have two pairs of DEBUG stmts for dD.570173 here, binding to _25 and then immediately resetting. They're at different lines, and associated with different statements, so once we have stmt frontiers support in GCC and GDB, you will be able to stop between them an inspect the state, regardless of the absence of executable code between them. I wonder why we use the same decl uid for two different inline instances though. Do we not remap them during inlining? Richard.
Re: [PR58315] reset inlined debug vars at return-to point
On Mar 4, 2015, Richard Biener richard.guent...@gmail.com wrote: Compile-time was slightly faster with the patch, 45s vs. 47s, but the machine wasn't completely un-loaded. var-tracking parts: unpatched: variable tracking : 0.63 ( 1%) usr 0.03 ( 1%) sys 0.82 ( 2%) wall 28641 kB ( 2%) ggc var-tracking dataflow : 3.72 ( 8%) usr 0.04 ( 1%) sys 3.65 ( 7%) wall1337 kB ( 0%) ggc var-tracking emit : 2.63 ( 6%) usr 0.02 ( 1%) sys 2.55 ( 5%) wall 148835 kB ( 8%) ggc patched: variable tracking : 0.64 ( 1%) usr 0.01 ( 0%) sys 0.72 ( 1%) wall 24202 kB ( 1%) ggc var-tracking dataflow : 1.96 ( 4%) usr 0.01 ( 0%) sys 1.94 ( 4%) wall1326 kB ( 0%) ggc var-tracking emit : 1.46 ( 3%) usr 0.02 ( 0%) sys 1.49 ( 3%) wall 46980 kB ( 3%) ggc we have at the point of RTL expansion 56% more debug statements (988231 lines with # DEBUG in the .optimized dump out of 1212518 lines in total vs. 630666 out of 854953). So we go from roughly 1 debug stmt per real stmt to 1.5 debug stmts per real stmt. So, if I got this right, all these extra debug stmts and insns had no effect whatsoever on compile time proper. The reduction in compile time can be entirely accounted for by the time they save in the var-tracking parts, and any compile time increase they bring about in other passes is negligible. Does this match your assessment of the impact of the patch? we have two pairs of DEBUG stmts for dD.570173 here, binding to _25 and then immediately resetting. They're at different lines, and associated with different statements, so once we have stmt frontiers support in GCC and GDB, you will be able to stop between them an inspect the state, regardless of the absence of executable code between them. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
On Wed, Feb 25, 2015 at 10:40 AM, Alexandre Oliva aol...@redhat.com wrote: This patch fixes a problem that has been with us for several years. Variable tracking has no idea about the end of the lifetime of inlined variables, so it keeps on computing locations for them over and over, even though the computed locations make no sense whatsoever because the variable can't even be accessed any more. With this patch, we unbind all inlined variables at the point the inlined function returns to, so that the locations for those variables will not be touched any further. In theory, we could do something similar to non-inlined auto variables, when they go out of scope, but their decls apply to the entire function and I'm told gdb sort-of expects the variables to be accessible throughout the function, so I'm not tackling that in this patch, for I'm happy enough with what this patch gets us: - almost 99% reduction in the output asm for the PR testcase - more than 90% reduction in the peak memory use compiling that testcase - 63% reduction in the compile time for that testcase What's scary is that the testcase is not particularly pathological. Any function that calls a longish sequence of inlined functions, that in turn call other inline functions, and so on, something that's not particularly unusual in C++, will likely observe significant improvement, as we won't see growing sequences of var_location notes after each call or so, as var-tracking computes a new in-stack location for the implicit this argument of each previously-inlined function. Regstrapped on x86_64-linux-gnu and i686-linux-gnu. Ok to install? Some numbers for tramp3d at -Ofast -g with an optimized, release checking (but not bootstrapped) trunk. Compile-time memory usage seems unchanged (~980MB, to show differences we'd probably need to ggc-collect more often). Compile-time was slightly faster with the patch, 45s vs. 47s, but the machine wasn't completely un-loaded. var-tracking parts: unpatched: variable tracking : 0.63 ( 1%) usr 0.03 ( 1%) sys 0.82 ( 2%) wall 28641 kB ( 2%) ggc var-tracking dataflow : 3.72 ( 8%) usr 0.04 ( 1%) sys 3.65 ( 7%) wall1337 kB ( 0%) ggc var-tracking emit : 2.63 ( 6%) usr 0.02 ( 1%) sys 2.55 ( 5%) wall 148835 kB ( 8%) ggc patched: variable tracking : 0.64 ( 1%) usr 0.01 ( 0%) sys 0.72 ( 1%) wall 24202 kB ( 1%) ggc var-tracking dataflow : 1.96 ( 4%) usr 0.01 ( 0%) sys 1.94 ( 4%) wall1326 kB ( 0%) ggc var-tracking emit : 1.46 ( 3%) usr 0.02 ( 0%) sys 1.49 ( 3%) wall 46980 kB ( 3%) ggc we have at the point of RTL expansion 56% more debug statements (988231 lines with # DEBUG in the .optimized dump out of 1212518 lines in total vs. 630666 out of 854953). So we go from roughly 1 debug stmt per real stmt to 1.5 debug stmts per real stmt. I didn't try to inspect generated debug information or asses its quality. But I wonder for example what we do for _25 = MEM[(const struct DomainD.47403 *)this_11(D) + 8B].D.47690.domain_mD.47464; # DEBUG dD.570173 = _25 # DEBUG dD.570173 = NULL # DEBUG thisD.572552 = NULL npc_14 = _25 / _27; # DEBUG npcD.171771 = npc_14 # DEBUG D#447 = this_11(D)-blocks_mD.64412.D.47809 # DEBUG thisD.572554 = D#447 # DEBUG dD.570173 = _25 # DEBUG dD.570173 = NULL # DEBUG thisD.572554 = NULL remainder_16 = _25 % _27; we have two pairs of DEBUG stmts for dD.570173 here, binding to _25 and then immediately resetting. Apart from that we might possibly DCE those I wonder if this isn't breaking debug experience vs. what we had previously: _25 = MEM[(const struct DomainD.47403 *)this_11(D) + 8B].D.47690.domain_mD.47464; # DEBUG dD.570173 = _25 npc_14 = _25 / _27; # DEBUG npcD.171771 = npc_14 # DEBUG D#447 = this_11(D)-blocks_mD.64412.D.47809 # DEBUG thisD.572554 = D#447 # DEBUG dD.570173 = _25 remainder_16 = _25 % _27; dD.570173 is never reset anywhere in the function without the patch. Just for curiosity here is the last dump piece again, this time with location information (we should dump the associated BLOCK someow) [tramp3d-v4.cpp:3457:45] _25 = [tramp3d-v4.cpp:3457:45] MEM[(const struct DomainD.47403 *)this_11(D) + 8B].D.47690.domain_mD.47464; [tramp3d-v4.cpp:3457:45] # DEBUG dD.570173 = _25 [tramp3d-v4.cpp:53176:32] npc_14 = _25 / _27; [tramp3d-v4.cpp:53176:32] # DEBUG npcD.171771 = npc_14 [tramp3d-v4.cpp:53177:33] # DEBUG D#447 = [tramp3d-v4.cpp:53177:33] [tramp3d-v4.cpp:53177:19] this_11(D)-blocks_mD.64412.D.47809 [tramp3d-v4.cpp:53177:33] # DEBUG thisD.572554 = D#447 [tramp3d-v4.cpp:3457:45] # DEBUG dD.570173 = _25 [tramp3d-v4.cpp:53177:38] remainder_16 = _25 % _27; like 3457 is inline Element_t first() const { return DT::first(this-domain_m); } and 5317[67] are int npc = blocks_m.first() / contexts; int remainder = blocks_m.first() % contexts; 'd' comes from DT::first(this-domain_m); which is on line 4604:
Re: [PR58315] reset inlined debug vars at return-to point
On Feb 27, 2015, Petr Machata pmach...@redhat.com wrote: Alexandre Oliva aol...@redhat.com writes: Ok, I looked into it further, after patching dwlocstat to dump per-variable per-range coverage/length info, so as to be able to compare object files more easily. If you send me those patches, I can finish them, bind the functionality to a command line option, and merge upstream. Here's what I've got so far. It's ugly and needs polishing, that I planned to do once this issue was resolved, but if you feel like beating me to it, you're surely welcome to ;-) I'm sure there must be a better way to compute the CIE offset, or even the base address of the debug info section (at least the - 11 needs to be replaced by something more sensible and more portable), but I didn't try very hard to find it out :-) Anyway, ideally, dumping it would be optional; I've added and removed it numerous times depending on the exact object file I was comparing and what info I sought. diff --git a/locstats.cc b/locstats.cc index 8a458cb..0b71ceb 100644 --- a/locstats.cc +++ b/locstats.cc @@ -240,7 +240,8 @@ static die_action process_location (Dwarf_Attribute *locattr, bool full_implicit, std::bitsetdt__count die_type, mutability_t mut, -int coverage); +int coverage, +bool dump_partials = false); static die_action process_implicit_pointer (Dwarf_Attribute *locattr, Dwarf_Op *op, @@ -400,7 +401,8 @@ process_location (Dwarf_Attribute *locattr, bool full_implicit, std::bitsetdt__count die_type, mutability_t mut, - int coverage) + int coverage, + bool dump_partials) { Dwarf_Op *expr; size_t len; @@ -459,6 +461,7 @@ process_location (Dwarf_Attribute *locattr, { Dwarf_Addr low = rit-first; Dwarf_Addr high = rit-second; + size_t const covered_before = covered; length += high - low; //std::cerr low .. high std::endl; @@ -525,6 +528,10 @@ process_location (Dwarf_Attribute *locattr, if (cover) covered++; } + + if (dump_partials) + std::cout ' ' (covered - covered_before) + '/' (high - low); } if (length == 0 || covered == 0) @@ -562,6 +569,8 @@ is_inlined (Dwarf_Die *die) void process (Dwarf *dw) { + const bool dump_partials = true; + // map percentage-occurrences. Percentage is cov_00..100, where // 0..100 is rounded-down integer division. std::mapint, unsigned long tally; @@ -586,11 +595,16 @@ process (Dwarf *dw) for (cu_iterator it = cu_iterator (dw); it != cu_iterator::end (); ++it) last_cit = it; + char *base = NULL; + for (all_dies_iterator it (dw); it != all_dies_iterator::end (); ++it) { std::bitsetdt__count die_type; Dwarf_Die *die = *it; + if (!base) + base = (char*)die-addr - 11; + if (show_progress) { cu_iterator cit = it.cu (); @@ -677,17 +691,29 @@ process (Dwarf *dw) if (locattr == NULL) locattr = dwarf_attr (die, DW_AT_const_value, locattr_mem); - /* - Dwarf_Attribute name_attr_mem, - *name_attr = dwarf_attr_integrate (die, DW_AT_name, name_attr_mem); - std::string name = name_attr != NULL - ? dwarf_formstring (name_attr) - : (dwarf_hasattr_integrate (die, DW_AT_artificial) - ? artificial : ???); + if (dump_partials) + { + std::string name; - std::cerr die= std::hex die.offset () - ' name '\''; - */ + for (auto pit = it; pit != all_dies_iterator::end (); + pit = pit.parent ()) + { + Dwarf_Attribute name_attr_mem, + *name_attr = dwarf_attr_integrate (*pit, DW_AT_name, name_attr_mem); + std::string thisname = name_attr != NULL + ? dwarf_formstring (name_attr) + : (dwarf_hasattr_integrate (*pit, DW_AT_artificial) + ? artificial : ???); + + if (name != ) + name = thisname + :: + name; + else + name = thisname; + } + + std::cout // die= std::hex (char*)die-addr - base ' ' + name; + } int coverage; mutability_t mut; @@ -697,7 +723,7 @@ process (Dwarf *dw) interested_mutability, interested_implicit, full_implicit, -die_type, mut, coverage) != da_ok) +die_type, mut, coverage, dump_partials) != da_ok) continue; } catch (::error e) @@ -751,7 +777,7 @@ process (Dwarf *dw) tally[coverage]++; total++; - //std::cerr std::endl; + std::cout std::endl; } if (show_progress) -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
Alexandre Oliva aol...@redhat.com writes: Ok, I looked into it further, after patching dwlocstat to dump per-variable per-range coverage/length info, so as to be able to compare object files more easily. If you send me those patches, I can finish them, bind the functionality to a command line option, and merge upstream. Thanks, Petr
Re: [PR58315] reset inlined debug vars at return-to point
Jakub Jelinek ja...@redhat.com writes: it counts on what percentage of bytes in those ranges (or instructions?) the variable has defined location. Yes, it counts bytes. It doesn't know anything about instruction lengths etc. Thanks, Petr
Re: [PR58315] reset inlined debug vars at return-to point
On Feb 25, 2015, Alexandre Oliva aol...@redhat.com wrote: if a function is called within a loop and we inline it, bindings from the call in one iteration of the loop will carry over onto the subsequent iteration until a new binding occurs. Wait, I have to take that back and revisit the code I looked at. var-tracking handles confluences differently between VTA-tracked and old-style var-tracking vars, and I was probably confused by that: it occurred to me this morning that confluence points of VTA-tracked variables perform a set intersection, whereas old-style VTA performs set union across locations at all incoming edges. So, what I have concluded should only apply to old-style var-tracking vars, or to VTA-tracked vars in unrolled loops whose intermediate conditions were optimized out (so that there isn't any confluence with incoming edges without bindings for the var). I'm afraid I'll have to look again and figure out what I got wrong. Apologies for the noise. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
On Feb 26, 2015, Richard Biener richard.guent...@gmail.com wrote: After all if the inliner inserts resets just for vars that already have debug stmts then I cook up a testcase where those debug stmts only appear after inlining. Please try that. Hint: the actual requirement is that the VTA-trackable var has been remapped during inlining of executable code. Its having debug stmts is a consequence of its being there. I wonder, thus, how the compiler would bring a var that hadn't even been remapped back from the dead, so as to introduce a debug stmt for it. Tough! :-) Indeed if we want to be as close to the source as possible we should insert debug stmts from the start (where the values are still computed) so that code-motion will simply make it unavailable (and also reset locations so you don't get gdb jumping around). *nod* Statement frontiers notes are a proposed solution for the jumping around and eliminating code motion issues from debug info. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
On Feb 26, 2015, Richard Biener richard.guent...@gmail.com wrote: and more concentrating on the effect of your patch as opposed to debug stmt philosophy. (which looks reasonable minus code-motion issues). (but we might still want to compute it during var-tracking if at a later PC range the scope will be active again). By taking debug stmt philosophy out of your reasoning, you come to a conclusion that directly contradicts the very philosophy. I.e., you're not setting the philosophy aside, you're setting yourself up for failure within that philosophy. Your note on code-motion issues shows another weakness, not in the philosophy I endorse, but on rather on the one you do. If annotations don't move, and they carry all the information debuggers need to care about, there's no code-motion issue whatsoever, and moving executable instructions would not have any effects whatsoever. Sure, we're not there yet, but if we want to get there, taking steps in the opposite direction won't take us there any time soon. Thus consider the suggestion to insert reset debug insns at the beginning of var-tracking and at points where a scope becomes dead (thus after points where in a backward CFG walk a scope becomes live). The point where the scope becomes dead under what philosophical line? I believe we've already established that trying to recover that information after throwing it away is cheaper but error prone. I believe we've already agreed that we don't want debug information to be incorrect. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
On Thu, Feb 26, 2015 at 1:01 AM, Alexandre Oliva aol...@redhat.com wrote: On Feb 25, 2015, Jakub Jelinek ja...@redhat.com wrote: On Wed, Feb 25, 2015 at 06:17:33PM -0300, Alexandre Oliva wrote: My measurements, for a not particularly unusual testcase, showed an overall reduction of 63% in compile time, as indicated yesterday. Now, who should bear the burden of collecting evidence to back up the claims against the change? Are those concerns enough to hold it up? Can you e.g. run dwlocstat on some larger C++ binaries built without and with your patch? I believe dwlocstat is supposed to count only the instructions where the variables or parameters are in scope, so should be exactly what we care about here. Erhm... I don't think that would cover the case you were worried about, namely inspecting variables of an inlined function while at a statement moved out of the function ranges. Anyway, I've run dwlocstat and inspected the results. There is indeed a significant reduction in the coverage, so I looked into that. What I found out is that the reduction is an improvement on yet another long-standing var-tracking issue: if a function is called within a loop and we inline it, bindings from the call in one iteration of the loop will carry over onto the subsequent iteration until a new binding occurs. This accounts for all of the coverage reductions I've investigated. This, in turn, suggests that introducing reset stmts when variables go out of scope even in local blocks will further reduce debug info, although in some cases it might have the opposite effect. E.g., if the actual live range of a variable is scattered across multiple non-contiguous blocks, stopping the binding from being used in intervening blocks where the variable is not live will cause additional entries in the location list. I've observed this effect with the proposed patch, too, but it removes a lot more nonsensical entries than it adds entries to account for not covering intervening ranges by accident. Answering your questions on my mail here (because it fits I think), and more concentrating on the effect of your patch as opposed to debug stmt philosophy. Your patch ends up pruning the var-tracking sets at the additional reset-points you introduce. You introduce them at inline boundaries (which looks reasonable minus code-motion issues). I claim you can achieve the same result by effectively inserting those reset debug insns right before var-tracking itself and by re-computing BLOCK liveness. That will automatically deal with code motion that extends the life-range of an inlined BLOCK by moving stmts (still associated with that BLOCK) across the return point of the inlined call (and thus across your debug resets). And it will allow var-tracking to eventually compute locations for vars at the entry of that BLOCK piece. After all you say correctly that what matters is location info for X in places where a debug consumer can successfully lookup X (that is, in PC ranges where the scope X was declared in is active). In other places there is no reason to emit location info for X (but we might still want to compute it during var-tracking if at a later PC range the scope will be active again). Now I said you could do var-tracking uops for those resets somehow, but I now realize that I have not much internal knowledge of var-tracking. Thus consider the suggestion to insert reset debug insns at the beginning of var-tracking and at points where a scope becomes dead (thus after points where in a backward CFG walk a scope becomes live). Richard. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
On Thu, Feb 26, 2015 at 11:29:35AM +0100, Richard Biener wrote: I claim you can achieve the same result by effectively inserting those reset debug insns right before var-tracking itself and by re-computing BLOCK liveness. That will automatically deal with code motion that extends the life-range of an inlined BLOCK by moving stmts (still associated with that BLOCK) across the return point of the inlined call (and thus across your debug resets). And it will allow var-tracking to eventually compute locations for vars at the entry of that BLOCK piece. If that would work, it would be nice, but I'm not sure how you can do that. Using something like final.c (reemit_insn_block_notes) you can discover the ranges of scopes (inline functions and lexical blocks). If some scope has a single range, it is the easy case, you know where it starts and where it ends. For scopes with multiple ranges, how can you find out what case it is though? I mean, either it can be just the case that some instruction(s) with different scope got scheduled (or sunk / whatever) in between the instructions of the scope, in that case resetting the vars there is highly undesirable (would majorly grow the .debug_loc lists and break debuggers, e.g. when you try to watch some variable through the live of some inline, if you reset it any time a single unrelated insn is encountered, the debugger would need to tell you the var got lost/changed). Or it can be the case of unrolled loop which has lexical scopes or inlines in it, in that case you will have multiple same scopes, but in that case it is unnecessary to preserve variables in between those, it is really different inlines. Jakub
Re: [PR58315] reset inlined debug vars at return-to point
On Thu, Feb 26, 2015 at 11:42 AM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Feb 26, 2015 at 11:29:35AM +0100, Richard Biener wrote: I claim you can achieve the same result by effectively inserting those reset debug insns right before var-tracking itself and by re-computing BLOCK liveness. That will automatically deal with code motion that extends the life-range of an inlined BLOCK by moving stmts (still associated with that BLOCK) across the return point of the inlined call (and thus across your debug resets). And it will allow var-tracking to eventually compute locations for vars at the entry of that BLOCK piece. If that would work, it would be nice, but I'm not sure how you can do that. Using something like final.c (reemit_insn_block_notes) you can discover the ranges of scopes (inline functions and lexical blocks). If some scope has a single range, it is the easy case, you know where it starts and where it ends. For scopes with multiple ranges, how can you find out what case it is though? I mean, either it can be just the case that some instruction(s) with different scope got scheduled (or sunk / whatever) in between the instructions of the scope, in that case resetting the vars there is highly undesirable (would majorly grow the .debug_loc lists and break debuggers, e.g. when you try to watch some variable through the live of some inline, if you reset it any time a single unrelated insn is encountered, the debugger would need to tell you the var got lost/changed). Or it can be the case of unrolled loop which has lexical scopes or inlines in it, in that case you will have multiple same scopes, but in that case it is unnecessary to preserve variables in between those, it is really different inlines. I don't claim you can get the maximal pruning of the var-tracking solutions. I just claim that you can do something sensible by computing where BLOCKs go dead. And that this would be better than simply ignoring the code motion issue alltogether. If Alexs solution would be acceptable in terms of that issue then we can as well just insert debug resets for each variable at initial BLOCK boundaries. After all if the inliner inserts resets just for vars that already have debug stmts then I cook up a testcase where those debug stmts only appear after inlining. Indeed if we want to be as close to the source as possible we should insert debug stmts from the start (where the values are still computed) so that code-motion will simply make it unavailable (and also reset locations so you don't get gdb jumping around). Richard. Jakub
Re: [PR58315] reset inlined debug vars at return-to point
On Feb 26, 2015, Jakub Jelinek ja...@redhat.com wrote: On Wed, Feb 25, 2015 at 09:01:09PM -0300, Alexandre Oliva wrote: On Wed, Feb 25, 2015 at 06:17:33PM -0300, Alexandre Oliva wrote: My measurements, for a not particularly unusual testcase, showed an overall reduction of 63% in compile time, as indicated yesterday. Now, who should bear the burden of collecting evidence to back up the claims against the change? Are those concerns enough to hold it up? Can you e.g. run dwlocstat on some larger C++ binaries built without and with your patch? I believe dwlocstat is supposed to count only the instructions where the variables or parameters are in scope, so should be exactly what we care about here. Erhm... I don't think that would cover the case you were worried about, namely inspecting variables of an inlined function while at a statement moved out of the function ranges. Anyway, I've run dwlocstat and inspected the results. There is indeed a significant reduction in the coverage, so I looked into that. Significant reduction in the coverage should be a red flag. Ok, I looked into it further, after patching dwlocstat to dump per-variable per-range coverage/length info, so as to be able to compare object files more easily. So far, all the differences I looked at were caused by padding at the end of BBs, and by jump stmts without line numbers at the end of BBs, both right after the debug reset stmts the proposed patch introduces. I'll dig further, because so far I've only looked at a few cases. I have to figure out some way to automate the investigation of these differences, because it has been too time-intensive, and not really fruitful in terms of finding the scenarios you're concerned with. The good news is that, looking deeper into cases that appeared to leak info across loop iterations, I confirmed I was mistaken in yesterday's analysis. Phew! :-) -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
On Wed, Feb 25, 2015 at 11:54:16AM +0100, Richard Biener wrote: Regstrapped on x86_64-linux-gnu and i686-linux-gnu. Ok to install? But code-motion could still move stmts from the inlined functions across these resets? That said - shouldn't this simply performed by proper var-tracking u-ops produced by a backward scan over the function for live scope-blocks? That is, when you see a scope block becoming live from exit then add u-ops resetting all vars from that scope block? Yeah, wanted to say the same, I'm afraid such a change will very much affect debugging experience close to the end of inlined functions, as sinking, scheduling and various other passes may move statements from the inline functions across those resets. And, various tools and users really want to be able to inspect variables and parameters on the return statement. So, IMHO to do something like this, we'd need to mark those debug stmts some way to say that they aren't normal debug binds, but debug binds at the end of scopes (whether inlined functions or just lexical blocks), and optimization passes that perform code motion should try to detect the case when they are moving downward some statements across such debug stmts and move those debug stmts along with those if possible. And another thing is the amount of the added debug stmts, right now we don't add debug stmts all the time for everything, just when something is needed, while your patch adds it unconditionally, even when debug stmts for those won't be really emitted. As they are just resets, that hopefully will not drastically affect var-tracking time, but might affect other optimization passes, which would need to deal with much more statements than before. Jakub
Re: [PR58315] reset inlined debug vars at return-to point
On Feb 25, 2015, Richard Biener richard.guent...@gmail.com wrote: But code-motion could still move stmts from the inlined functions across these resets? Sure, just like it could still move stmts across any other debug stmts. Once you return from a function, it's as if all of its variables ceased to exist, so what is the problem with this? The real error, IMHO, is to assume the moved instruction is still inside the inline function. It isn't. If you wanted to inspect the variable before it went out of scope, the debugger should have helped you stop there, not stop you at an instruction that is outside the expected flow. That said - shouldn't this simply performed by proper var-tracking u-ops produced by a backward scan over the function for live scope-blocks? Please elaborate (ideally with a patch); I have no idea of how you plan to map scope blocks to their original (and thus correct) position (i.e., before any code motion). That is, when you see a scope block becoming live from exit then add u-ops resetting all vars from that scope block? Oh, you want code motion artifacts to leak into the language VM execution modeled in debug info? That would be fundamentally against the model implemented with debug stmts and insns. /me mumbles something about mixing metaphors and leaky screwdrivers ;-D IOW, I don't think that would be appropriate at all. Remember the VTA presentation at the GCC Summit years ago, when I showed you just can't hope to recover debug info from code already mangled by the compiler, because optimizations are lossy? You get to a point in which you don't know what you don't know, so you're bound to make wrong decisions. So, take note of what you're going to need when you know it's still accurate. Your patch as-is would add very many debug stmts to for example tramp3d. Do you have any evidence that this would have a negative impact on compile time, memory use or any other relevant metric? Odds are that, if it makes any difference whatsoever, it will be a very positive one. The absolute worst case of this patch is doubling the debug stmt count (proof: it will add at most one debug stmt per inlined variable that had at least one debug stmt). Now, if you're concerned about debug stmt count, we could introduce another debug stmt/insn form that can reset multiple variables in a single swoop. It does not seem to me like this would be worth the effort. And as you say, the same reasoning applies to scopes in general, not just for inlines. I actually said the opposite. We turn block-local variables into function-wide declarations very early, so apparently we *can* reference the variables even when they're out of scope. But we *cannot* do that for inlined variables. That's why I drew the line where I did. (Plus, introducing the debug temps at the point I did was easy to try, and it had a huge positive impact :-) Sure we *could* introduce debug unbind stmts at the end of scopes. If you have evidence or even a hunch that this will have a positive effect on relevant metrics, go for it! -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
On Wed, Feb 25, 2015 at 06:17:33PM -0300, Alexandre Oliva wrote: My measurements, for a not particularly unusual testcase, showed an overall reduction of 63% in compile time, as indicated yesterday. Now, who should bear the burden of collecting evidence to back up the claims against the change? Are those concerns enough to hold it up? Can you e.g. run dwlocstat on some larger C++ binaries built without and with your patch? I believe dwlocstat is supposed to count only the instructions where the variables or parameters are in scope, so should be exactly what we care about here. E.g. cc1plus and libstdc++.so.6 might be good candidates from gcc itself, perhaps firefox or similar as something even larger. Jakub
Re: [PR58315] reset inlined debug vars at return-to point
On Feb 25, 2015, Jakub Jelinek ja...@redhat.com wrote: various tools and users really want to be able to inspect variables and parameters on the return statement. This patch won't affect the return statement. The resets are at the return-to statement; if you stop at the return statement (assuming you have code generated for it in place), you should still get the same bindings. Now, if *all* the code for the return statement was optimized away, and you stop at the subsequent instruction, that happens to be past the return, then, yeah, you're out of luck, but then you were already out of luck before. Now, there is of course the case in which there is some code left in place for the return stmt, but it is no longer in its natural position in the code flow. You stop there, you're technically out of the inline scope, but now you're also past the clobbering of the inlined variables. The real solution for this is to implement the stmt frontiers I presented at some GCC Summit, so that, when you stop at a statement, you get the state you expect regardless of code motion, because you get the state at the natural flow of control, even if node actual code remained at that point. And another thing is the amount of the added debug stmts, right now we don't add debug stmts all the time for everything, just when something is needed, We add debug stmts whenever a trackable (auto gimple register) variable is modified. They are clobbered at the end of the inline function they expanded out of, so this just corrects an long-standing and quite expensive oversight. You won't get debug stmts for unused inlined variables, for example: you should only get them for variables that were remapped while inlining the code to begin with. If you got code for them and they are trackable, you certainly got debug stmts for them as well. while your patch adds it unconditionally, even when debug stmts for those won't be really emitted. It shouldn't. Please show me? As they are just resets, that hopefully will not drastically affect var-tracking time They will. But in a positive way :-) but might affect other optimization passes, which would need to deal with much more statements than before. It shouldn't be hard to test this hypothesis one way or another. Tweak the code that introduces the new debug temps so that they all refer to the same fake variable, as opposed to resetting the intended variables, and then you'll have an exact measure of the compile-time impact *without* the savings afforded by the resets. Then we can compare whether it's an overall win or loss. My measurements, for a not particularly unusual testcase, showed an overall reduction of 63% in compile time, as indicated yesterday. Now, who should bear the burden of collecting evidence to back up the claims against the change? Are those concerns enough to hold it up? -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
On Feb 25, 2015, Jakub Jelinek ja...@redhat.com wrote: On Wed, Feb 25, 2015 at 06:17:33PM -0300, Alexandre Oliva wrote: My measurements, for a not particularly unusual testcase, showed an overall reduction of 63% in compile time, as indicated yesterday. Now, who should bear the burden of collecting evidence to back up the claims against the change? Are those concerns enough to hold it up? Can you e.g. run dwlocstat on some larger C++ binaries built without and with your patch? I believe dwlocstat is supposed to count only the instructions where the variables or parameters are in scope, so should be exactly what we care about here. Erhm... I don't think that would cover the case you were worried about, namely inspecting variables of an inlined function while at a statement moved out of the function ranges. Anyway, I've run dwlocstat and inspected the results. There is indeed a significant reduction in the coverage, so I looked into that. What I found out is that the reduction is an improvement on yet another long-standing var-tracking issue: if a function is called within a loop and we inline it, bindings from the call in one iteration of the loop will carry over onto the subsequent iteration until a new binding occurs. This accounts for all of the coverage reductions I've investigated. This, in turn, suggests that introducing reset stmts when variables go out of scope even in local blocks will further reduce debug info, although in some cases it might have the opposite effect. E.g., if the actual live range of a variable is scattered across multiple non-contiguous blocks, stopping the binding from being used in intervening blocks where the variable is not live will cause additional entries in the location list. I've observed this effect with the proposed patch, too, but it removes a lot more nonsensical entries than it adds entries to account for not covering intervening ranges by accident. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR58315] reset inlined debug vars at return-to point
On Wed, Feb 25, 2015 at 09:01:09PM -0300, Alexandre Oliva wrote: On Wed, Feb 25, 2015 at 06:17:33PM -0300, Alexandre Oliva wrote: My measurements, for a not particularly unusual testcase, showed an overall reduction of 63% in compile time, as indicated yesterday. Now, who should bear the burden of collecting evidence to back up the claims against the change? Are those concerns enough to hold it up? Can you e.g. run dwlocstat on some larger C++ binaries built without and with your patch? I believe dwlocstat is supposed to count only the instructions where the variables or parameters are in scope, so should be exactly what we care about here. Erhm... I don't think that would cover the case you were worried about, namely inspecting variables of an inlined function while at a statement moved out of the function ranges. Anyway, I've run dwlocstat and inspected the results. There is indeed a significant reduction in the coverage, so I looked into that. Significant reduction in the coverage should be a red flag. I admit I haven't read dwlocstat sources recently, but: dwlocstat = This is a tool for examining Dwarf location info coverage. It goes through DIEs of given binary's debug info that represent variables and function parameters. For each such DIE, it computes coverage of that DIE's range (list of addresses of DIE's scope) by location expressions (a description of where given variable is located at given location: e.g. a variable can be in a register). matches what I wanted the tool to do, namely, if you have say some DW_TAG_variable that is in scope of say an inline function and that DW_TAG_inlined_subroutine has DW_AT_ranges L1-L2, L3-L4, L5-L6, it counts on what percentage of bytes in those ranges (or instructions?) the variable has defined location. The fear I have is that due to scheduling or any other code motion toward the end of function, you simply have instructions like L5-L6 above that are considered part of the function, but that the newly added resets are say on L4 or so and thus whenever you stop on the L5-L6 instructions, the debugger will show you you are in the inline function, but you won't be able to inspect most if not any variables. What I found out is that the reduction is an improvement on yet another long-standing var-tracking issue: if a function is called within a loop and we inline it, bindings from the call in one iteration of the loop will carry over onto the subsequent iteration until a new binding occurs. This accounts for all of the coverage reductions I've investigated. It really depends. Say if you have some inline void foo (void) { int somevar; LARGE CODE NOT INITIALIZING somevar; somevar = VALUE; USE somevar; } and for (...) { foo (); } and the loop is unrolled, then indeed, lost of coverage between the start of the second foo and somevar = VALUE; is acceptable, if it previously just contained the value from previous iteration. But as I said above, if it is about the last few stmts in the inline scope, it is undesirable, and if e.g. the instructions from different unrolled iterations are intermixed, then it is undesirable too. Which is why I was suggesting special debug binds that the optimizers would try to move downward (towards the exit block) on (as much as possible) any code motions across them, unless the debug binds would be moved across a debug bind for the same decl - in that case the end of scope debug bind should be removed rather than moved over. In most cases, there won't be much of code motion across the whole function, but only limited scope, so in the common case you'll still get the effect you are looking for or close to that, but it won't actually penalize the debug info quality. Jakub
Re: [PR58315] reset inlined debug vars at return-to point
On Wed, Feb 25, 2015 at 10:40 AM, Alexandre Oliva aol...@redhat.com wrote: This patch fixes a problem that has been with us for several years. Variable tracking has no idea about the end of the lifetime of inlined variables, so it keeps on computing locations for them over and over, even though the computed locations make no sense whatsoever because the variable can't even be accessed any more. With this patch, we unbind all inlined variables at the point the inlined function returns to, so that the locations for those variables will not be touched any further. In theory, we could do something similar to non-inlined auto variables, when they go out of scope, but their decls apply to the entire function and I'm told gdb sort-of expects the variables to be accessible throughout the function, so I'm not tackling that in this patch, for I'm happy enough with what this patch gets us: - almost 99% reduction in the output asm for the PR testcase - more than 90% reduction in the peak memory use compiling that testcase - 63% reduction in the compile time for that testcase What's scary is that the testcase is not particularly pathological. Any function that calls a longish sequence of inlined functions, that in turn call other inline functions, and so on, something that's not particularly unusual in C++, will likely observe significant improvement, as we won't see growing sequences of var_location notes after each call or so, as var-tracking computes a new in-stack location for the implicit this argument of each previously-inlined function. Regstrapped on x86_64-linux-gnu and i686-linux-gnu. Ok to install? But code-motion could still move stmts from the inlined functions across these resets? That said - shouldn't this simply performed by proper var-tracking u-ops produced by a backward scan over the function for live scope-blocks? That is, when you see a scope block becoming live from exit then add u-ops resetting all vars from that scope block? Your patch as-is would add very many debug stmts to for example tramp3d. And as you say, the same reasoning applies to scopes in general, not just for inlines. Richard. Reset inlined debug variables at the end of the inlined function From: Alexandre Oliva aol...@redhat.com for gcc/ChangeLog PR debug/58315 * tree-inline.c (reset_debug_binding): New. (reset_debug_bindings): Likewise. (expand_call_inline): Call it. --- gcc/tree-inline.c | 56 + 1 file changed, 56 insertions(+) diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index d8abe03..5b58d8b 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -4345,6 +4345,60 @@ add_local_variables (struct function *callee, struct function *caller, } } +/* Add to BINDINGS a debug stmt resetting SRCVAR if inlining might + have brought in or introduced any debug stmts for SRCVAR. */ + +static inline void +reset_debug_binding (copy_body_data *id, tree srcvar, gimple_seq *bindings) +{ + tree *remappedvarp = id-decl_map-get (srcvar); + + if (!remappedvarp) +return; + + if (TREE_CODE (*remappedvarp) != VAR_DECL) +return; + + if (*remappedvarp == id-retvar || *remappedvarp == id-retbnd) +return; + + tree tvar = target_for_debug_bind (*remappedvarp); + if (!tvar) +return; + + gdebug *stmt = gimple_build_debug_bind (tvar, NULL_TREE, + id-call_stmt); + gimple_seq_add_stmt (bindings, stmt); +} + +/* For each inlined variable for which we may have debug bind stmts, + add before GSI a final debug stmt resetting it, marking the end of + its life, so that var-tracking knows it doesn't have to compute + further locations for it. */ + +static inline void +reset_debug_bindings (copy_body_data *id, gimple_stmt_iterator gsi) +{ + tree var; + unsigned ix; + gimple_seq bindings = NULL; + + if (!gimple_in_ssa_p (id-src_cfun)) +return; + + if (!opt_for_fn (id-dst_fn, flag_var_tracking_assignments)) +return; + + for (var = DECL_ARGUMENTS (id-src_fn); + var; var = DECL_CHAIN (var)) +reset_debug_binding (id, var, bindings); + + FOR_EACH_LOCAL_DECL (id-src_cfun, ix, var) +reset_debug_binding (id, var, bindings); + + gsi_insert_seq_before_without_update (gsi, bindings, GSI_SAME_STMT); +} + /* If STMT is a GIMPLE_CALL, replace it with its inline expansion. */ static bool @@ -4650,6 +4704,8 @@ expand_call_inline (basic_block bb, gimple stmt, copy_body_data *id) GCOV_COMPUTE_SCALE (cg_edge-frequency, CGRAPH_FREQ_BASE), bb, return_block, NULL); + reset_debug_bindings (id, stmt_gsi); + /* Reset the escaped solution. */ if (cfun-gimple_df) pt_solution_reset (cfun-gimple_df-escaped); -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in