https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87615
--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #15)
> Still:
> tree FRE : 43.43 ( 50%) 3299k ( 1%)
It's the usual problematic dominated_by_p_w_unex:
Samples: 121K of event 'cycles:Pu', Event count (approx.): 157797213236
Overhead  Samples  Command  Shared Object  Symbol
  18.77%    23425  cc1      cc1            [.] dominated_by_p_w_unex
  15.33%    19144  cc1      cc1            [.] dominated_by_p(cdi_di
   6.85%     8087  cc1      cc1            [.] bitmap_set_bit(bitmap
   3.36%     4080  cc1      cc1            [.] back_jt_path_registry
In this case it is the two-step single-succ skip:
  /* Iterate to the single executable bb2 successor. */
  if (EDGE_COUNT (bb2->succs) > 1)
    {
      edge succe = NULL;
      FOR_EACH_EDGE (e, ei, bb2->succs)
        if ((e->flags & EDGE_EXECUTABLE)
            || (!allow_back && (e->flags & EDGE_DFS_BACK)))
          {
            if (succe)
              {
                succe = NULL;
                break;
              }
            succe = e;
          }
      if (succe)
        {
          /* Verify the reached block is only reached through succe.
             If there is only one edge we can spare us the dominator
             check and iterate directly. */
          if (EDGE_COUNT (succe->dest->preds) > 1)
            {
              FOR_EACH_EDGE (e, ei, succe->dest->preds)
                if (e != succe
                    && ((e->flags & EDGE_EXECUTABLE)
                        || (!allow_back && (e->flags & EDGE_DFS_BACK))))
                  {
                    succe = NULL;
                    break;
                  }
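For readers without the GCC tree at hand, the two steps in the excerpt above can be modelled on a toy CFG. This is only a sketch: `Edge`, `Cfg`, `counts` and `single_succ_skip` are made-up stand-ins, not GCC types or functions, and unlike the excerpt the toy does not special-case blocks with exactly one successor edge.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for GCC's edge and basic_block (hypothetical, not the real types).
struct Edge {
  int src;
  int dest;
  bool executable;  // plays the role of EDGE_EXECUTABLE
  bool dfs_back;    // plays the role of EDGE_DFS_BACK
};

struct Cfg {
  std::vector<Edge> edges;
  std::vector<std::vector<int>> succs;  // outgoing edge ids per block
  std::vector<std::vector<int>> preds;  // incoming edge ids per block

  explicit Cfg(int nblocks) : succs(nblocks), preds(nblocks) {}

  int add_edge(int src, int dest, bool executable, bool dfs_back = false) {
    assert(src >= 0 && src < (int)succs.size());
    assert(dest >= 0 && dest < (int)preds.size());
    int id = (int)edges.size();
    edges.push_back({src, dest, executable, dfs_back});
    succs[src].push_back(id);
    preds[dest].push_back(id);
    return id;
  }
};

// Same edge condition as in the excerpt: executable, or a DFS back edge
// when allow_back is false.
static bool counts(const Edge &e, bool allow_back) {
  return e.executable || (!allow_back && e.dfs_back);
}

// Step 1: find the single counting successor edge of bb.
// Step 2: check that its destination has no other counting predecessor.
// Returns the destination block, or -1 if either step fails.
int single_succ_skip(const Cfg &g, int bb, bool allow_back) {
  int succe = -1;
  for (int id : g.succs[bb]) {
    if (!counts(g.edges[id], allow_back)) continue;
    if (succe != -1) return -1;  // a second counting successor edge
    succe = id;
  }
  if (succe == -1) return -1;
  int dest = g.edges[succe].dest;
  for (int id : g.preds[dest])
    if (id != succe && counts(g.edges[id], allow_back)) return -1;
  return dest;
}
```

Both loops walk full edge lists on every query, so a block with many successors or a destination with many predecessors makes each call linear in the edge count.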
I do have some ideas for caching and updating the "single executable succ/pred"
state during the RPO iteration, but I have not gotten around to implementing
them yet (there's also the initial state of no executable edge when iterating
and the case where succe->dest isn't visited yet).
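One possible shape for such a cache, purely as an illustration and not the design referred to above: keep per-block counts of executable incoming and outgoing edges and bump them when an edge is first marked executable, so the query becomes constant time. Every name here (`ExecEdgeCache`, `mark_executable`, `single_succ_skip`) is hypothetical. The sketch assumes edges only ever go from non-executable to executable, and it ignores EDGE_DFS_BACK and unvisited destination blocks.

```cpp
#include <cassert>
#include <vector>

// Hypothetical cache of executable-edge counts per block (not GCC code).
struct ExecEdgeCache {
  struct Edge {
    int src;
    int dest;
    bool executable;
  };

  std::vector<Edge> edges;
  std::vector<int> exec_succs;      // executable outgoing edges per block
  std::vector<int> exec_preds;      // executable incoming edges per block
  std::vector<int> last_exec_succ;  // id of the last outgoing edge marked executable

  explicit ExecEdgeCache(int nblocks)
      : exec_succs(nblocks, 0), exec_preds(nblocks, 0),
        last_exec_succ(nblocks, -1) {}

  int add_edge(int src, int dest) {
    assert(src >= 0 && src < (int)exec_succs.size());
    assert(dest >= 0 && dest < (int)exec_preds.size());
    edges.push_back({src, dest, false});
    return (int)edges.size() - 1;
  }

  // Update the counters once, when the edge first becomes executable.
  void mark_executable(int id) {
    assert(id >= 0 && id < (int)edges.size());
    Edge &e = edges[id];
    if (e.executable) return;
    e.executable = true;
    ++exec_succs[e.src];
    ++exec_preds[e.dest];
    last_exec_succ[e.src] = id;
  }

  // Constant-time query: the block reached over bb's single executable
  // successor edge, provided that edge is also the destination's only
  // executable predecessor edge; otherwise -1.
  int single_succ_skip(int bb) const {
    if (exec_succs[bb] != 1) return -1;
    int dest = edges[last_exec_succ[bb]].dest;
    return exec_preds[dest] == 1 ? dest : -1;
  }
};
```

With no edge marked executable yet, the query simply returns -1, which is one way of treating the initial state.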
I have a patch that applies the limit imposed on the single-succ case here as well.