This adds the missing code that sets the profile counts of the exit
blocks we create when building the CFG for a vectorized early-break loop.
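
As a rough illustration of the intended arithmetic only (this is not GCC
code): the sketch below models counts as plain integers, whereas GCC's
profile_count also tracks quality and uninitialized states. The names
exit_counts and split_exit_counts are hypothetical and do not appear in
the patch.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the two new exit blocks' counts.  Hypothetical type,
// used only to illustrate the split; GCC uses profile_count instead.
struct exit_counts
{
  std::uint64_t main_exit; // models main_loop_exit_block->count
  std::uint64_t alt_exit;  // models alt_loop_exit_block->count
};

// The main exit block takes the count of the main exit edge; whatever is
// left of the preheader's count goes to the block that collects all the
// alternative (early break) exits.
exit_counts
split_exit_counts (std::uint64_t preheader_count,
                   std::uint64_t main_exit_edge_count)
{
  // Assumption of this sketch: the main exit edge cannot be taken more
  // often than the preheader is reached.
  assert (main_exit_edge_count <= preheader_count);
  return { main_exit_edge_count, preheader_count - main_exit_edge_count };
}
```

For example, with a preheader count of 1000 and a main exit edge count of
900, the main exit block gets 900 and the alternative exit block gets the
remaining 100, so the two counts still sum to the preheader's count.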
Tested as a series on aarch64-linux-gnu, arm-linux-gnueabihf, and
x86_64-linux-gnu. OK for trunk?
Thanks,
Alex
gcc/ChangeLog:
PR tree-optimization/117790
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Set profile counts for {main,alt}_loop_exit_block.
---
gcc/tree-vect-loop-manip.cc | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 5d1b70aea43..53d36eaa25f 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -1686,6 +1686,16 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
set_immediate_dominator (CDI_DOMINATORS, new_preheader,
loop->header);
+
+ /* Fix up the profile counts of the new exit blocks.
+ main_loop_exit_block was created by duplicating the
+ preheader, so needs its count scaling according to the main
+ exit edge's probability. The remaining count from the
+ preheader goes to the alt_loop_exit_block, since all
+ alternative exits have been redirected there. */
+ main_loop_exit_block->count = loop_exit->count ();
+ alt_loop_exit_block->count
+ = preheader->count - main_loop_exit_block->count;
}
/* Adjust the epilog loop PHI entry values to continue iteration.