This patch fixes a performance regression for AutoFDO. The entry block count of a function can be 0, which is quite possible in AutoFDO, and with this patch the compiler can still make the right optimization decisions in that case.
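To make the intent easier to review, here is a minimal standalone sketch of the two decisions the patch changes. It is a toy model, not GCC code: `toy_block`, `toy_is_cold`, `toy_use_counts` and the `auto_profile` parameter are hypothetical names standing in for the basic block, the partitioning check in bb-reorder.c, the guard in counts_to_freqs, and flag_auto_profile. It also assumes that the loop cut off at the end of the bb-reorder.c hunk clears cold_bb when it finds a non-cold incoming edge, as the comment there describes.

```c
#include <stdbool.h>

/* Toy model only; every name in this block is hypothetical.  */
struct toy_block
{
  /* Stands in for probably_never_executed_bb_p (cfun, bb).  */
  bool never_executed;
  /* True if some incoming edge is not "probably never executed".  */
  bool has_non_cold_pred;
};

/* Decide whether a block goes into the cold partition.  Without
   AutoFDO, a non-cold incoming edge vetoes the cold marking; with
   AutoFDO the block-level answer is used as is.  */
static bool
toy_is_cold (const struct toy_block *bb, bool auto_profile)
{
  bool cold = bb->never_executed;

  if (!auto_profile && cold && bb->has_non_cold_pred)
    cold = false;
  return cold;
}

/* Decide whether frequencies are derived from counts.  Without
   AutoFDO, a zero entry count is treated as a missing profile and
   the estimated frequencies are kept; with AutoFDO the counts are
   used even when the entry count is 0.  */
static bool
toy_use_counts (long entry_count, bool auto_profile)
{
  return auto_profile || entry_count != 0;
}
```

In this model, a never-executed block with a non-cold predecessor is kept out of the cold section only when auto_profile is false, and a zero entry count stops blocking the use of counts once auto_profile is true.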
Bootstrapped and passed regression tests and performance tests (0.5% improvement on average).

OK for google-4_8?

Thanks,
Dehao

Index: gcc/bb-reorder.c
===================================================================
--- gcc/bb-reorder.c	(revision 207066)
+++ gcc/bb-reorder.c	(working copy)
@@ -1564,15 +1564,14 @@ find_rarely_executed_basic_blocks_and_crossing_edg
   /* Mark which partition (hot/cold) each basic block belongs in. */
   FOR_EACH_BB (bb)
     {
-      bool cold_bb = false;
+      bool cold_bb = probably_never_executed_bb_p (cfun, bb);
 
-      if (probably_never_executed_bb_p (cfun, bb))
+      if (!flag_auto_profile && cold_bb)
         {
           /* Handle profile insanities created by upstream optimizations
              by also checking the incoming edge weights. If there is a non-cold
              incoming edge, conservatively prevent this block from being split
              into the cold section. */
-          cold_bb = true;
           FOR_EACH_EDGE (e, ei, bb->preds)
             if (!probably_never_executed_edge_p (cfun, e))
               {
Index: gcc/predict.c
===================================================================
--- gcc/predict.c	(revision 207066)
+++ gcc/predict.c	(working copy)
@@ -2902,7 +2902,7 @@ counts_to_freqs (void)
   /* Don't overwrite the estimated frequencies when the profile for
      the function is missing. We may drop this function PROFILE_GUESSED
      later in drop_profile (). */
-  if (!ENTRY_BLOCK_PTR->count)
+  if (!flag_auto_profile && !ENTRY_BLOCK_PTR->count)
     return 0;
 
   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
@@ -3161,7 +3161,8 @@ rebuild_frequencies (void)
     count_max = MAX (bb->count, count_max);
 
   if (profile_status == PROFILE_GUESSED
-      || (profile_status == PROFILE_READ && count_max < REG_BR_PROB_BASE/10))
+      || (!flag_auto_profile && profile_status == PROFILE_READ
+	  && count_max < REG_BR_PROB_BASE/10))
     {
       loop_optimizer_init (0);
       add_noreturn_fake_exit_edges ();