Can someone review and ok the attached patch for trunk? It has been bootstrapped and tested on x86-64-unknown-linux-gnu, and tested by enabling -freorder-blocks-and-partition enabled for a profiledbootstrap as well.
(Honza, see more responses inlined below. Rong, please see note below as well). Thanks, Teresa On Fri, Aug 30, 2013 at 2:14 AM, Jan Hubicka <hubi...@ucw.cz> wrote: >> Great! Is this the LTO merging you were talking about in an earlier >> message, or the gcov runtime fix (that would presumably not be >> lto-specific)? > > It is the LTO path - we need to merge profiles there anyway for his code > unification > work. Rong - can you send a summary of the approach you are working on? Is it LIPO-specific? > >> > I have patch to track this. Moreover vforks seems to produce repeated >> > merging of results. >> >> Aha, does this explain the gimp issue as well? > > Not really - we still need to debug why we hit cold section so many times with > partitioning. I sitll think easier approach will be to lock the cold section > and > then start probably with testsuite (i.e. write script to compile the small > testcases > with FDO + partitioning and see what crash by hitting cold section). Ok, that is on my todo list. > >> > >> > Is it really necessary to run this from internal loop of the cfgcleanup? >> >> The reason I added it here is that just below there is a call to >> verify_flow_info, and that will fail with the new verification. > > Hmm, OK, I suppose we run the cleanup after partitioning just once or twice, > right? > We can track this incrementally - I am not sure if we put it from the > internal iteration > loop we would get anything substantial either. > Removing unreachable blocks twice is however ugly. When I was debugging the issue that led to this change I seemed to see 1-2 iterations typically. Although I haven't measured it scientifically. It would be good to revisit that and see if we can pull both parts out of the loop, but as a separate patch. > >> +/* Ensure that all hot bbs are included in a hot path through the >> + procedure. This is done by calling this function twice, once >> + with WALK_UP true (to look for paths from the entry to hot bbs) and >> + once with WALK_UP false (to look for paths from hot bbs to the exit). >> + Returns the updated value of COLD_BB_COUNT and adds newly-hot bbs >> + to BBS_IN_HOT_PARTITION. */ >> + >> +static unsigned int >> +sanitize_hot_paths (bool walk_up, unsigned int cold_bb_count, >> + vec<basic_block> *bbs_in_hot_partition) >> +{ >> + /* Callers check this. */ >> + gcc_checking_assert (cold_bb_count); >> + >> + /* Keep examining hot bbs while we still have some left to check >> + and there are remaining cold bbs. */ >> + vec<basic_block> hot_bbs_to_check = bbs_in_hot_partition->copy (); >> + while (! hot_bbs_to_check.is_empty () >> + && cold_bb_count) >> + { >> + basic_block bb = hot_bbs_to_check.pop (); >> + vec<edge, va_gc> *edges = walk_up ? bb->preds : bb->succs; >> + edge e; >> + edge_iterator ei; >> + int highest_probability = 0; >> + int highest_freq = 0; >> + gcov_type highest_count = 0; >> + bool found = false; >> + >> + /* Walk the preds/succs and check if there is at least one already >> + marked hot. Keep track of the most frequent pred/succ so that we >> + can mark it hot if we don't find one. */ >> + FOR_EACH_EDGE (e, ei, edges) >> + { >> + basic_block reach_bb = walk_up ? e->src : e->dest; >> + >> + if (e->flags & EDGE_DFS_BACK) >> + continue; >> + >> + if (BB_PARTITION (reach_bb) != BB_COLD_PARTITION) >> + { >> + found = true; >> + break; >> + } >> + /* The following loop will look for the hottest edge via >> + the edge count, if it is non-zero, then fallback to the edge >> + frequency and finally the edge probability. */ >> + if (e->count > highest_count) >> + highest_count = e->count; >> + int edge_freq = EDGE_FREQUENCY (e); >> + if (edge_freq > highest_freq) >> + highest_freq = edge_freq; >> + if (e->probability > highest_probability) >> + highest_probability = e->probability; >> + } >> + >> + /* If bb is reached by (or reaches, in the case of !WALK_UP) another >> hot >> + block (or unpartitioned, e.g. the entry block) then it is ok. If >> not, >> + then the most frequent pred (or succ) needs to be adjusted. In the >> + case where multiple preds/succs have the same frequency (e.g. a >> + 50-50 branch), then both will be adjusted. */ >> + if (found) >> + continue; >> + >> + FOR_EACH_EDGE (e, ei, edges) >> + { >> + if (e->flags & EDGE_DFS_BACK) >> + continue; >> + /* Select the hottest edge using the edge count, if it is >> non-zero, >> + then fallback to the edge frequency and finally the edge >> + probability. */ >> + if (highest_count) >> + { >> + if (e->count < highest_count) >> + continue; >> + } >> + else if (highest_freq) > > The frequency condition needs to be done only when you walk predecestors - > when > you walk down the edge probabilities are just fine. True. For simplicity I think it should be fine to leave as-is so there isn't more special casing as the current approach works in both directions. > > The patch seems OK to me now. I will make our FDO tester to use partitioning > so we get > this benchmarked a bit. Ok thanks. > > Honza -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
patch.diff
Description: Binary data