Re: Merge switch statements in tree-cfgcleanup
On 07/22/2016 07:13 AM, Richard Biener wrote: + if (dom_info_available_p (CDI_DOMINATORS)) I think DOM info is always available during CFG cleanup My recollection is that the cfg cleanup routines start by removing unreachables, then recomputing the dominator tree if it's not available. That sequencing is vital as the dominator tree builder will assert there's no unreachables in the CFG. jeff
Re: Merge switch statements in tree-cfgcleanup
On Wed, Jul 20, 2016 at 6:26 PM, Bernd Schmidtwrote: > On 07/20/2016 06:09 PM, Jeff Law wrote: >> >> So I'm going to let Richi run with the review on this one since the two >> of you are already iterating. But I did have one comment on the >> placement of the pass. >> >> I believe one of the key things to consider for whether or not something >> like this belongs in the cfgcleanup code is whether or not the >> optimization is likely exposed repeatedly through the optimization >> pipeline. If it's mostly a source level issue or only usually exposed >> by a limited set of optimizers, then a separate pass might be better. > > > It can trigger before switchconv, and could expose optimization > opportunities there, but I've also seen it trigger much later. Since I think > it's cheap I don't see a reason not to put it in cfgcleanup, IMO it's the > best fit conceptually. I'm not convinced. We do have CFG matching passes like phi-opt, if-combine. With your logic even simple jump-threading like if (a) if (a) ... would be a job for CFG cleanup? + gsi2 = gsi_start_nondebug_after_labels_bb (bb); + if (gsi_end_p (gsi2)) +return false; + + gimple *stmt = gsi_stmt (gsi2); + if (gimple_code (stmt) != GIMPLE_SWITCH) +return false; stmt = last_stmt (bb); if (!stmt || gimple_code (stmt) != GIMPLE_SWITHC) return false; is shorter and has the same effect. + edge e; + edge_iterator ei; + FOR_EACH_EDGE (e, ei, bb->succs) +{ + basic_block dest = e->dest; + if (find_edge (pred_bb, dest)) + { + /* If a destination block is reached from both switches, any +phi nodes there would become corrupted. */ + gphi_iterator psi = gsi_start_phis (dest); + if (!gsi_end_p (psi)) + return false; + } +} I wonder if this tests all required blocks. Why can't there be a forwarder between pred_bb and dest for example? I think you want sth like if (! dominated_by_p (CDI_DOMINATORS, dest, bb)) ... verify no PHIs ... + auto_vec new_labels; + please pre-reserve enough space to hold labels from both switches and use quick_push below. + /* Merge adjacent ranges. */ + size_t new_count = new_labels.length (); + for (i1 = i2 = 1; i1 < new_count; ) +{ + ce1 = new_labels[i1++]; + tree high = CASE_HIGH (ce1); + if (high == NULL_TREE) + high = CASE_LOW (ce1); + while (i1 < new_count) + { + ce2 = new_labels[i1]; what about re-using the code from tree-cfg.c instead? (group_case_labels_stmt) + if (dom_info_available_p (CDI_DOMINATORS)) I think DOM info is always available during CFG cleanup +{ + basic_block dombb = get_immediate_dominator (CDI_DOMINATORS, bb); + gcc_assert (dombb == pred_bb); + vec fix_bbs; + fix_bbs = get_all_dominated_blocks (CDI_DOMINATORS, pred_bb); + iterate_fix_dominators (CDI_DOMINATORS, fix_bbs, false); + fix_bbs.release (); +} Richard. > > Bernd >
Re: Merge switch statements in tree-cfgcleanup
On 07/20/2016 06:09 PM, Jeff Law wrote: So I'm going to let Richi run with the review on this one since the two of you are already iterating. But I did have one comment on the placement of the pass. I believe one of the key things to consider for whether or not something like this belongs in the cfgcleanup code is whether or not the optimization is likely exposed repeatedly through the optimization pipeline. If it's mostly a source level issue or only usually exposed by a limited set of optimizers, then a separate pass might be better. It can trigger before switchconv, and could expose optimization opportunities there, but I've also seen it trigger much later. Since I think it's cheap I don't see a reason not to put it in cfgcleanup, IMO it's the best fit conceptually. Bernd
Re: Merge switch statements in tree-cfgcleanup
On 07/20/2016 05:14 AM, Bernd Schmidt wrote: On 07/19/2016 01:18 PM, Richard Biener wrote: On Tue, Jul 19, 2016 at 1:07 PM, Bernd Schmidtwrote: On 07/19/2016 12:35 PM, Richard Biener wrote: I think that start/end_recording_case_labels also merged adjacent labels via group_case_labels_stmt. Not sure why you need to stop recording case labels during the transform. Is this because you are building a new switch stmt? It's because the cached mapping gets invalidated. Look in tree-cfg, it has a edge_to_cases map which I think cannot be maintained if you modify the structure. I certainly got lots of internal errors until I added that pair of calls. Yeah, I see that. OTOH cfgcleanup relies on this cache to be efficient and you (repeatedly) clear it. Clearing parts of it should be sufficient and if you used redirect_edge_and_branch instead of redirect_edge_pred it would have maintained the cache as far as I can see, I don't think that would work, since we're modifying and/or discarding case labels as well and they can't remain part of the cache. or you can make sure to maintain it yourself or just clear the info associated with the edges you redirect from one switch to another. How's this? Tested as before. So I'm going to let Richi run with the review on this one since the two of you are already iterating. But I did have one comment on the placement of the pass. I believe one of the key things to consider for whether or not something like this belongs in the cfgcleanup code is whether or not the optimization is likely exposed repeatedly through the optimization pipeline. If it's mostly a source level issue or only usually exposed by a limited set of optimizers, then a separate pass might be better. jeff
Re: Merge switch statements in tree-cfgcleanup
On 07/19/2016 01:18 PM, Richard Biener wrote: On Tue, Jul 19, 2016 at 1:07 PM, Bernd Schmidtwrote: On 07/19/2016 12:35 PM, Richard Biener wrote: I think that start/end_recording_case_labels also merged adjacent labels via group_case_labels_stmt. Not sure why you need to stop recording case labels during the transform. Is this because you are building a new switch stmt? It's because the cached mapping gets invalidated. Look in tree-cfg, it has a edge_to_cases map which I think cannot be maintained if you modify the structure. I certainly got lots of internal errors until I added that pair of calls. Yeah, I see that. OTOH cfgcleanup relies on this cache to be efficient and you (repeatedly) clear it. Clearing parts of it should be sufficient and if you used redirect_edge_and_branch instead of redirect_edge_pred it would have maintained the cache as far as I can see, I don't think that would work, since we're modifying and/or discarding case labels as well and they can't remain part of the cache. or you can make sure to maintain it yourself or just clear the info associated with the edges you redirect from one switch to another. How's this? Tested as before. Bernd * tree-cfgcleanup.c (try_merge_switches): New static function. (cleanup_tree_cfg_bb): Call it. * tree-cfg.c (discard_case_labels_for): New function. * tree-cfg.h (discard_case_labels_for): Declare it. * c-c++-common/merge-switch-1.c: New test. * c-c++-common/merge-switch-2.c: New test. * c-c++-common/merge-switch-3.c: New test. Index: gcc/tree-cfg.c === --- gcc/tree-cfg.c (revision 237797) +++ gcc/tree-cfg.c (working copy) @@ -1185,6 +1185,32 @@ end_recording_case_labels (void) BITMAP_FREE (touched_switch_bbs); } +/* Discard edge information for a single switch. */ +void +discard_case_labels_for (gswitch *t) +{ + if (!recording_case_labels_p ()) +return; + + basic_block bb = gimple_bb (t); + edge_iterator ei; + edge e; + FOR_EACH_EDGE (e, ei, bb->succs) +{ + tree *slot = edge_to_cases->get (e); + if (!slot) + continue; + edge_to_cases->remove (e); + tree t, next; + for (t = *slot; t; t = next) + { + next = CASE_CHAIN (t); + CASE_CHAIN (t) = NULL; + } + *slot = NULL; +} +} + /* If we are inside a {start,end}_recording_cases block, then return a chain of CASE_LABEL_EXPRs from T which reference E. Index: gcc/tree-cfg.h === --- gcc/tree-cfg.h (revision 237797) +++ gcc/tree-cfg.h (working copy) @@ -33,6 +33,7 @@ extern void init_empty_tree_cfg_for_func extern void init_empty_tree_cfg (void); extern void start_recording_case_labels (void); extern void end_recording_case_labels (void); +extern void discard_case_labels_for (gswitch *); extern basic_block label_to_block_fn (struct function *, tree); #define label_to_block(t) (label_to_block_fn (cfun, t)) extern void cleanup_dead_labels (void); Index: gcc/tree-cfgcleanup.c === --- gcc/tree-cfgcleanup.c (revision 237797) +++ gcc/tree-cfgcleanup.c (working copy) @@ -630,6 +630,233 @@ fixup_noreturn_call (gimple *stmt) return changed; } +/* Look for situations where we have a switch inside the default case of + another, and they switch on the same condition. We look for the + second switch in BB. If we find such a situation, merge the two + switch statements. */ + +static bool +try_merge_switches (basic_block bb) +{ + if (!single_pred_p (bb)) +return false; + basic_block pred_bb = single_pred (bb); + + /* Look for a structure with two switch statements on the same value. */ + gimple_stmt_iterator gsi1, gsi2; + gsi1 = gsi_last_nondebug_bb (pred_bb); + gimple *pred_end = last_stmt (pred_bb); + if (! pred_end || gimple_code (pred_end) != GIMPLE_SWITCH) +return false; + + gsi2 = gsi_start_nondebug_after_labels_bb (bb); + if (gsi_end_p (gsi2)) +return false; + + gimple *stmt = gsi_stmt (gsi2); + if (gimple_code (stmt) != GIMPLE_SWITCH) +return false; + + gswitch *sw1 = as_a (pred_end); + gswitch *sw2 = as_a (stmt); + tree idx1 = gimple_switch_index (sw1); + tree idx2 = gimple_switch_index (sw2); + if (TREE_CODE (idx1) != SSA_NAME || idx1 != idx2) +return false; + size_t n1 = gimple_switch_num_labels (sw1); + size_t n2 = gimple_switch_num_labels (sw2); + if (n1 <= 1 || n2 <= 1) +return false; + tree sw1_default = gimple_switch_default_label (sw1); + if (label_to_block (CASE_LABEL (sw1_default)) != bb) +return false; + + /* We know we have the basic structure of what we are looking for. Sort + out some special cases regarding phi nodes. */ + if (!gsi_end_p (gsi_start_phis (bb))) +return false; + + edge e; + edge_iterator ei; + FOR_EACH_EDGE (e, ei, bb->succs) +{ + basic_block dest = e->dest; + if (find_edge
Re: Merge switch statements in tree-cfgcleanup
On Tue, Jul 19, 2016 at 1:07 PM, Bernd Schmidtwrote: > On 07/19/2016 12:35 PM, Richard Biener wrote: > >> I think that start/end_recording_case_labels also merged adjacent labels >> via group_case_labels_stmt. Not sure why you need to stop recording >> case labels during the transform. Is this because you are building a new >> switch stmt? > > > It's because the cached mapping gets invalidated. Look in tree-cfg, it has a > edge_to_cases map which I think cannot be maintained if you modify the > structure. I certainly got lots of internal errors until I added that pair > of calls. Yeah, I see that. OTOH cfgcleanup relies on this cache to be efficient and you (repeatedly) clear it. Clearing parts of it should be sufficient and if you used redirect_edge_and_branch instead of redirect_edge_pred it would have maintained the cache as far as I can see, or you can make sure to maintain it yourself or just clear the info associated with the edges you redirect from one switch to another. Btw, + gimple_stmt_iterator gsi1, gsi2; + gsi1 = gsi_last_nondebug_bb (pred_bb); + if (gsi_end_p (gsi1)) +return false; + gimple *pred_end = gsi_stmt (gsi1); + if (gimple_code (pred_end) != GIMPLE_SWITCH) this is just gimple *pred_end = last_stmt (pred_bb); if (! pred_end || gimple_code (pred_end) != GIMPLE_SWITCH) ... Richard. Richard. > > Bernd >
Re: Merge switch statements in tree-cfgcleanup
On 07/19/2016 12:35 PM, Richard Biener wrote: I think that start/end_recording_case_labels also merged adjacent labels via group_case_labels_stmt. Not sure why you need to stop recording case labels during the transform. Is this because you are building a new switch stmt? It's because the cached mapping gets invalidated. Look in tree-cfg, it has a edge_to_cases map which I think cannot be maintained if you modify the structure. I certainly got lots of internal errors until I added that pair of calls. Bernd
Re: Merge switch statements in tree-cfgcleanup
On Tue, Jul 19, 2016 at 12:25 PM, Bernd Schmidtwrote: > On 07/19/2016 12:22 PM, Marc Glisse wrote: >> >> On Tue, 19 Jul 2016, Bernd Schmidt wrote: >> >>> On 07/19/2016 12:09 PM, Richard Biener wrote: >>> I saw walks over stmts of a BB. IMHO that's a no-go. >>> >>> >>> Only to find the first or last nondebug one. Is that unacceptable? >> >> >> Does gsi_start_nondebug_after_labels_bb not fit? > > > It might, if one realizes that such a thing exists. Will try. I think that start/end_recording_case_labels also merged adjacent labels via group_case_labels_stmt. Not sure why you need to stop recording case labels during the transform. Is this because you are building a new switch stmt? Richard. > Bernd >
Re: Merge switch statements in tree-cfgcleanup
On 07/19/2016 12:22 PM, Marc Glisse wrote: On Tue, 19 Jul 2016, Bernd Schmidt wrote: On 07/19/2016 12:09 PM, Richard Biener wrote: I saw walks over stmts of a BB. IMHO that's a no-go. Only to find the first or last nondebug one. Is that unacceptable? Does gsi_start_nondebug_after_labels_bb not fit? It might, if one realizes that such a thing exists. Will try. Bernd
Re: Merge switch statements in tree-cfgcleanup
On Tue, 19 Jul 2016, Bernd Schmidt wrote: On 07/19/2016 12:09 PM, Richard Biener wrote: I saw walks over stmts of a BB. IMHO that's a no-go. Only to find the first or last nondebug one. Is that unacceptable? Does gsi_start_nondebug_after_labels_bb not fit? -- Marc Glisse
Re: Merge switch statements in tree-cfgcleanup
On 07/19/2016 12:09 PM, Richard Biener wrote: I saw walks over stmts of a BB. IMHO that's a no-go. Only to find the first or last nondebug one. Is that unacceptable? I'm thinking of switch (a) { ... case n: do-stuff; default: switch (a) { case n: do-stuff; ... } } yes, you can simply drop cases when there is no code in the outer switch. We check that the second switch has a single predecessor block so this case can't happen. Bernd
Re: Merge switch statements in tree-cfgcleanup
On Tue, Jul 19, 2016 at 11:52 AM, Bernd Schmidtwrote: > On 07/19/2016 10:07 AM, Richard Biener wrote: >> >> >> This is not appropriate for CFG cleanup due to its complexity not >> being O(# bbs + # edges). >> I tried hard in the past to make it so (at least when no transform is >> done). > > > Why wouldn't it be, if no transform is done? Assuming we visit each bb once, > we have at worst one walk over the successor edges of the predecessor (if we > even find two switches on the same variable), and then we can decide whether > or not to do the transformation. I saw walks over stmts of a BB. IMHO that's a no-go. That said, CFG cleanup is not the place for this optimization. The only trivial CFG cleanup transform for switches I can see is transforming them to a simple if / else in case there is a single non-default label. And it's not even doing that currently. > When performing the transformation I could imagine one could construct a > testcase where lots of these switches are nested inside each other, but I'm > not convinved that's really a realistic worry. > >> Please move this transform elsewhere. I suggest the switch-conversion >> pass or if that >> is not a good fit then maybe if-combine (whose transforms are remotely >> related). > > > One problem is that this triggers rarely, but when it does, it occurs at > various stages in the compilation after other optimizations have been done. > Moving it to any given point is likely to limit the effectiveness. Well, that's true for all optimization passes we have. The opportunity once it arises is not likely to be removed by any pass. >> Not looking closer at the patch but missing some comments on how it deals >> with >> common cases (you see to handle fallthrus to the default label by >> ignoring them?) > > > If you are thinking of > > switch (a) > { > case n: > case m: > default: >switch (a) { } > } > > then the cases for n and m can simply be dropped when merging from the > second switch into the first one. That's what happens, and there's a comment > for it. So please elaborate what you mean. I'm thinking of switch (a) { ... case n: do-stuff; default: switch (a) { case n: do-stuff; ... } } yes, you can simply drop cases when there is no code in the outer switch. Richard. > > > Bernd >
Re: Merge switch statements in tree-cfgcleanup
On 07/19/2016 10:07 AM, Richard Biener wrote: This is not appropriate for CFG cleanup due to its complexity not being O(# bbs + # edges). I tried hard in the past to make it so (at least when no transform is done). Why wouldn't it be, if no transform is done? Assuming we visit each bb once, we have at worst one walk over the successor edges of the predecessor (if we even find two switches on the same variable), and then we can decide whether or not to do the transformation. When performing the transformation I could imagine one could construct a testcase where lots of these switches are nested inside each other, but I'm not convinved that's really a realistic worry. Please move this transform elsewhere. I suggest the switch-conversion pass or if that is not a good fit then maybe if-combine (whose transforms are remotely related). One problem is that this triggers rarely, but when it does, it occurs at various stages in the compilation after other optimizations have been done. Moving it to any given point is likely to limit the effectiveness. Not looking closer at the patch but missing some comments on how it deals with common cases (you see to handle fallthrus to the default label by ignoring them?) If you are thinking of switch (a) { case n: case m: default: switch (a) { } } then the cases for n and m can simply be dropped when merging from the second switch into the first one. That's what happens, and there's a comment for it. So please elaborate what you mean. Bernd
Re: Merge switch statements in tree-cfgcleanup
On Mon, Jul 18, 2016 at 6:07 PM, Bernd Schmidtwrote: > The motivating example for this patch was a change that was submitted for > genattrtab last year, which would have made us generate > > switch (type = get_attr_type (insn)) > { >... some cases ... >default: > switch (type = get_attr_type (insn))) >{ >... some other cases ... >} > } > > The idea was to optimize this by merging the code into a single switch. My > expectation was that this was most likely to occur in machine-generated > code, but there are a few instances of this pattern in the gcc sources > themselves. One case is > >code = gimple_code (stmt); >switch (code) > { > > default: >if (is_gimple_omp (code)) > { > } > } > > where is_gimple_omp expands into another switch. More cases exist in the > compiler as shown by various bootstrap failures along the way; sometimes > these are exposed after other optimizations. One is in the Ada runtime > library somewhere, and another (which currently cannot be transformed by the > patch) is in the Fortran frontend. > > In the future we could also look for if statements making another comparison > of the variable in the default branch, that would be a minor extension. > > The motivating example currently can't be transformed because get_attr_type > calls are in the way. > > Bootstrapped and tested on x86_64-linux. Ok? This is not appropriate for CFG cleanup due to its complexity not being O(# bbs + # edges). I tried hard in the past to make it so (at least when no transform is done). Please move this transform elsewhere. I suggest the switch-conversion pass or if that is not a good fit then maybe if-combine (whose transforms are remotely related). Not looking closer at the patch but missing some comments on how it deals with common cases (you see to handle fallthrus to the default label by ignoring them?) Thanks, Richard. > > Bernd
Merge switch statements in tree-cfgcleanup
The motivating example for this patch was a change that was submitted for genattrtab last year, which would have made us generate switch (type = get_attr_type (insn)) { ... some cases ... default: switch (type = get_attr_type (insn))) { ... some other cases ... } } The idea was to optimize this by merging the code into a single switch. My expectation was that this was most likely to occur in machine-generated code, but there are a few instances of this pattern in the gcc sources themselves. One case is code = gimple_code (stmt); switch (code) { default: if (is_gimple_omp (code)) { } } where is_gimple_omp expands into another switch. More cases exist in the compiler as shown by various bootstrap failures along the way; sometimes these are exposed after other optimizations. One is in the Ada runtime library somewhere, and another (which currently cannot be transformed by the patch) is in the Fortran frontend. In the future we could also look for if statements making another comparison of the variable in the default branch, that would be a minor extension. The motivating example currently can't be transformed because get_attr_type calls are in the way. Bootstrapped and tested on x86_64-linux. Ok? Bernd * tree-cfgcleanup.c (try_merge_switches): New function. (cleanup_tree_cfg_bb): Call it. * c-c++-common/merge-switch-1.c: New test. * c-c++-common/merge-switch-2.c: New test. * c-c++-common/merge-switch-3.c: New test. Index: gcc/tree-cfgcleanup.c === --- gcc/tree-cfgcleanup.c (revision 237797) +++ gcc/tree-cfgcleanup.c (working copy) @@ -630,6 +630,242 @@ fixup_noreturn_call (gimple *stmt) return changed; } +/* Look for situations where we have a switch inside the default case of + another, and they switch on the same condition. We look for the + second switch in BB. If we find such a situation, merge the two + switch statements. */ + +static bool +try_merge_switches (basic_block bb) +{ + if (!single_pred_p (bb)) +return false; + basic_block pred_bb = single_pred (bb); + + /* Look for a structure with two switch statements on the same value. */ + gimple_stmt_iterator gsi1, gsi2; + gsi1 = gsi_last_nondebug_bb (pred_bb); + if (gsi_end_p (gsi1)) +return false; + gimple *pred_end = gsi_stmt (gsi1); + if (gimple_code (pred_end) != GIMPLE_SWITCH) +return false; + + gsi2 = gsi_after_labels (bb); + if (gsi_end_p (gsi2)) +return false; + + gimple *stmt = gsi_stmt (gsi2); + while (is_gimple_debug (stmt)) +{ + gsi_next (); + if (gsi_end_p (gsi2)) + return false; + stmt = gsi_stmt (gsi2); +} + + if (gimple_code (stmt) != GIMPLE_SWITCH) +return false; + gswitch *sw1 = as_a (pred_end); + gswitch *sw2 = as_a (stmt); + tree idx1 = gimple_switch_index (sw1); + tree idx2 = gimple_switch_index (sw2); + if (TREE_CODE (idx1) != SSA_NAME || idx1 != idx2) +return false; + size_t n1 = gimple_switch_num_labels (sw1); + size_t n2 = gimple_switch_num_labels (sw2); + if (n1 <= 1 || n2 <= 1) +return false; + tree sw1_default = gimple_switch_default_label (sw1); + if (label_to_block (CASE_LABEL (sw1_default)) != bb) +return false; + + /* We know we have the basic structure of what we are looking for. Sort + out some special cases regarding phi nodes. */ + if (!gsi_end_p (gsi_start_phis (bb))) +return false; + + edge e; + edge_iterator ei; + FOR_EACH_EDGE (e, ei, bb->succs) +{ + basic_block dest = e->dest; + if (find_edge (pred_bb, dest)) + { + /* If a destination block is reached from both switches, any + phi nodes there would become corrupted. */ + gphi_iterator psi = gsi_start_phis (dest); + if (!gsi_end_p (psi)) + return false; + } +} + + /* At this point we know we are making the transformation. + Clear the CASE_CHAIN values to avoid inconsistencies. */ + end_recording_case_labels (); + + /* Start at 1 to skip default labels. */ + size_t i1 = 1; + size_t i2 = 1; + tree ce1 = gimple_switch_label (sw1, i1); + tree ce2 = gimple_switch_label (sw2, i2); + auto_vec new_labels; + + /* Keep track of the blocks that were reachable from the second switch, + and whose edges should be redirected to the first. Sometimes we + eliminate cases from the second switch entirely since they are + unreachable; in such cases, the bit for the destination block + remains clear. */ + bitmap_head redirect_bbs; + bitmap_initialize (_bbs, _default_obstack); + + /* Merge case labels from sw2 into those of sw1. */ + while (i1 < n1 && i2 < n2) +{ + tree min1 = CASE_LOW (ce1); + tree min2 = CASE_LOW (ce2); + tree max1 = CASE_HIGH (ce1); + tree max2 = CASE_HIGH (ce2); + if (max1 == NULL_TREE) + max1 = min1; + if (max2 == NULL_TREE) + max2 = min2; + + if