The recent basic block profiling changes broke a couple of libgomp OpenACC execution tests involving reductions with nvptx offloading. For gang and worker reductions, the nvptx BE updates the original reduction variable using a lock-free atomic algorithm. This algorithm uses a polling loop to check the state of the variable being updated. The loop introduces a new basic block edge, but that edge wasn't assigned a branch probability. Because of the highly threaded nature of CUDA accelerators, I set the branch probability for that edge to even.
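For anyone unfamiliar with the pattern, here is a minimal sketch of a lock-free update in plain C++ with std::atomic. It is not the GIMPLE that the nvptx BE emits, and lockless_update below is a hypothetical helper that only shares its name with the function in the patch. It shows the compare-and-swap polling loop whose retry branch corresponds to the back edge that needed a probability.

```cpp
#include <atomic>

// Hypothetical helper for illustration only (not GCC code): combine
// `value` into `target` using a compare-and-swap polling loop.
template <typename T, typename Op>
T lockless_update (std::atomic<T> &target, T value, Op op)
{
  T expected = target.load ();

  // Polling loop: retry until no other thread modified `target` between
  // our read and our write.  On failure, compare_exchange_weak stores the
  // current contents of `target` into `expected`, so the next attempt
  // recomputes the result from a fresh value.  The retry branch is the
  // loop's back edge.
  while (!target.compare_exchange_weak (expected, op (expected, value)))
    ;

  // `expected` still holds the value that was replaced, so this is the
  // value that was stored.
  return op (expected, value);
}
```

How often the retry branch is taken depends on how many threads contend for the same variable, which is hard to predict statically.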
Similarly, for nvptx vector reductions, when it comes time to initialize the reduction variable, the nvptx BE constructs a branch so that only vector lanes 1 to vector_length-1 are initialized to the default value for a given reduction type, while vector lane 0 retains the original value of the reduction variable. For similar reasons to the gang and worker reductions, I set the probability of the new edge introduced for the vector reduction to even.

Is this OK for trunk?

Cesar
2017-07-13  Cesar Philippidis  <ce...@codesourcery.com>

	gcc/
	* config/nvptx/nvptx.c (nvptx_lockless_update): Update edge profiling
	information.
	(nvptx_lockfull_update): Likewise.
	(nvptx_goacc_reduction_init): Likewise.

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index c8847a5dbba..3a24bd375ca 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4985,6 +4985,7 @@ nvptx_lockless_update (location_t loc, gimple_stmt_iterator *gsi,
   post_edge->flags ^= EDGE_TRUE_VALUE | EDGE_FALLTHRU;
   edge loop_edge = make_edge (loop_bb, loop_bb, EDGE_FALSE_VALUE);
+  loop_edge->probability = profile_probability::even ();
   set_immediate_dominator (CDI_DOMINATORS, loop_bb, pre_bb);
   set_immediate_dominator (CDI_DOMINATORS, post_bb, loop_bb);
@@ -5057,7 +5058,8 @@ nvptx_lockfull_update (location_t loc, gimple_stmt_iterator *gsi,
   /* Create the lock loop ... */
   locked_edge->flags ^= EDGE_TRUE_VALUE | EDGE_FALLTHRU;
-  make_edge (lock_bb, lock_bb, EDGE_FALSE_VALUE);
+  edge e = make_edge (lock_bb, lock_bb, EDGE_FALSE_VALUE);
+  e->probability = profile_probability::even ();
   set_immediate_dominator (CDI_DOMINATORS, lock_bb, entry_bb);
   set_immediate_dominator (CDI_DOMINATORS, update_bb, lock_bb);
@@ -5211,6 +5213,7 @@ nvptx_goacc_reduction_init (gcall *call)
   /* Create false edge from call_bb to dst_bb.  */
   edge nop_edge = make_edge (call_bb, dst_bb, EDGE_FALSE_VALUE);
+  nop_edge->probability = profile_probability::even ();
   /* Create phi node in dst block.  */
   gphi *phi = create_phi_node (lhs, dst_bb);