http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53958
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #6)
> As usual, the iteration order imposed by pre_and_rev_postorder isn't the
> best one for a forward dataflow problem.  Using inverted_post_order_compute
> improves compile-time somewhat.  But I can still confirm
>
> variable tracking     :   0.39 ( 1%) usr   0.00 ( 0%) sys   0.41 ( 1%) wall    5504 kB ( 6%) ggc
> var-tracking dataflow :  34.96 (84%) usr   0.16 (53%) sys  35.10 (84%) wall      47 kB ( 0%) ggc
>
> As a general observation, I wonder why the dataflow problem computes the
> _union_ as opposed to the intersection of the info on the preds
> in the !MAY_HAVE_DEBUG_INSNS case.
>
> Also, if the MAY_HAVE_DEBUG_INSNS case really computes an intersection
> (as documented), then it can avoid repeated clearing of the in set
> and only needs to dataflow_set_merge from changed edges.
>
> Now the discrepancy wrt the !MAY_HAVE_DEBUG_INSNS case makes me not
> trust that comment blindly ...

I think it isn't about MAY_HAVE_DEBUG_INSNS or not; whether it is union
vs. intersection depends on whether the var is tracked by VALUEs or not,
i.e. onepart decls vs. others.

> From a quick look, var-tracking doesn't seem to take the opportunity of
> pruning its sets based on variable scopes (it would need to compute
> scope liveness in vt_initialize).

That has been discussed and refused by the GDB folks AFAIK.  E.g. see
http://gcc.gnu.org/ml/gcc-patches/2010-03/msg00960.html (though I think it
was mostly IRC based later on).
> Anyway, here's a patch improving compile-time for this testcase by ~6%
>
> Index: gcc/var-tracking.c
> ===================================================================
> --- gcc/var-tracking.c  (revision 206599)
> +++ gcc/var-tracking.c  (working copy)
> @@ -6934,12 +6934,12 @@ vt_find_locations (void)
>    bool success = true;
>
>    timevar_push (TV_VAR_TRACKING_DATAFLOW);
> -  /* Compute reverse completion order of depth first search of the CFG
> +  /* Compute reverse top sort order of the inverted CFG
>       so that the data-flow runs faster.  */
> -  rc_order = XNEWVEC (int, n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS);
> +  rc_order = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
>    bb_order = XNEWVEC (int, last_basic_block_for_fn (cfun));
> -  pre_and_rev_post_order_compute (NULL, rc_order, false);
> -  for (i = 0; i < n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS; i++)
> +  int num = inverted_post_order_compute (rc_order);
> +  for (i = 0; i < num; i++)
>      bb_order[rc_order[i]] = i;
>    free (rc_order);

If it doesn't regress on other testcases (var-tracking speed wise), then
that LGTM.