On Fri, Jan 13, 2012 at 6:18 PM, Michael Matz <[email protected]> wrote:
> Hi,
>
> the stack-var conflict generation code needs 13 (out of 34) seconds, with
> -O0 on the second testcase of PR46590. Most of the time is spent in
> generating the same conflicts again and again at each basic block (the
> time right now is O(nr-of-bbs * N^2) where the number of conflicts is
> O(N^2)).
>
> If we simply remember for which partitions we already generated the lower
> triangular conflict matrix we don't have to do that again, lowering the
> overall time to O(N^2).
>
> This patch does that. Time for variable expansion now is 0.4 seconds (of
> overall 22).
>
> Regstrapping in progress on x86_64-linux, okay for trunk?
Ok.
Thanks,
Richard.
>
> Ciao,
> Michael.
>
> * cfgexpand.c (add_scope_conflicts_1): New old_conflicts argument,
> use it in remembering which conflicts we already created.
> (add_scope_conflicts): Adjust call to above, (de)allocate helper
> bitmap.
>
> Index: cfgexpand.c
> ===================================================================
> *** cfgexpand.c (revision 183155)
> --- cfgexpand.c (working copy)
> *************** visit_conflict (gimple stmt ATTRIBUTE_UN
> *** 441,451 ****
>
> /* Helper routine for add_scope_conflicts, calculating the active partitions
> at the end of BB, leaving the result in WORK. We're called to generate
> ! conflicts when FOR_CONFLICT is true, otherwise we're just tracking
> ! liveness. */
>
> static void
> ! add_scope_conflicts_1 (basic_block bb, bitmap work, bool for_conflict)
> {
> edge e;
> edge_iterator ei;
> --- 441,452 ----
>
> /* Helper routine for add_scope_conflicts, calculating the active partitions
> at the end of BB, leaving the result in WORK. We're called to generate
> ! conflicts when OLD_CONFLICTS is non-null, otherwise we're just tracking
> ! liveness. If we generate conflicts then OLD_CONFLICTS stores the bits
> ! for which we generated conflicts already. */
>
> static void
> ! add_scope_conflicts_1 (basic_block bb, bitmap work, bitmap old_conflicts)
> {
> edge e;
> edge_iterator ei;
> *************** add_scope_conflicts_1 (basic_block bb, b
> *** 482,488 ****
> }
> else if (!is_gimple_debug (stmt))
> {
> ! if (for_conflict
> && visit == visit_op)
> {
> /* If this is the first real instruction in this BB we need
> --- 483,489 ----
> }
> else if (!is_gimple_debug (stmt))
> {
> ! if (old_conflicts
> && visit == visit_op)
> {
> /* If this is the first real instruction in this BB we need
> *************** add_scope_conflicts_1 (basic_block bb, b
> *** 490,505 ****
> Unlike classical liveness for named objects we can't
> rely on seeing a def/use of the names we're interested in.
> There might merely be indirect loads/stores. We'd not add any
> ! conflicts for such partitions. */
> bitmap_iterator bi;
> unsigned i;
> ! EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
> {
> unsigned j;
> bitmap_iterator bj;
> ! EXECUTE_IF_SET_IN_BITMAP (work, i + 1, j, bj)
> add_stack_var_conflict (i, j);
> }
> visit = visit_conflict;
> }
> walk_stmt_load_store_addr_ops (stmt, work, visit, visit, visit);
> --- 491,521 ----
> Unlike classical liveness for named objects we can't
> rely on seeing a def/use of the names we're interested in.
> There might merely be indirect loads/stores. We'd not add any
> ! conflicts for such partitions. We know that we generated
> ! conflicts between all partitions in old_conflicts already,
> ! so we need to generate only the new ones, avoiding to
> ! repeatedly pay the O(N^2) cost for each basic block. */
> bitmap_iterator bi;
> unsigned i;
> !
> ! /* First the conflicts between new and old_conflicts members.
> */
> ! EXECUTE_IF_AND_COMPL_IN_BITMAP (work, old_conflicts, 0, i, bi)
> {
> unsigned j;
> bitmap_iterator bj;
> ! EXECUTE_IF_SET_IN_BITMAP (old_conflicts, 0, j, bj)
> add_stack_var_conflict (i, j);
> }
> + /* Then the conflicts between only the new members. */
> + EXECUTE_IF_AND_COMPL_IN_BITMAP (work, old_conflicts, 0, i, bi)
> + {
> + unsigned j;
> + bitmap_iterator bj;
> + EXECUTE_IF_AND_COMPL_IN_BITMAP (work, old_conflicts, 0, j,
> bj)
> + add_stack_var_conflict (i, j);
> + }
> + /* And remember for the next basic block. */
> + bitmap_ior_into (old_conflicts, work);
> visit = visit_conflict;
> }
> walk_stmt_load_store_addr_ops (stmt, work, visit, visit, visit);
> *************** add_scope_conflicts (void)
> *** 516,521 ****
> --- 532,538 ----
> basic_block bb;
> bool changed;
> bitmap work = BITMAP_ALLOC (NULL);
> + bitmap old_conflicts;
>
> /* We approximate the live range of a stack variable by taking the first
> mention of its name as starting point(s), and by the end-of-scope
> *************** add_scope_conflicts (void)
> *** 537,551 ****
> FOR_EACH_BB (bb)
> {
> bitmap active = (bitmap)bb->aux;
> ! add_scope_conflicts_1 (bb, work, false);
> if (bitmap_ior_into (active, work))
> changed = true;
> }
> }
>
> FOR_EACH_BB (bb)
> ! add_scope_conflicts_1 (bb, work, true);
>
> BITMAP_FREE (work);
> FOR_ALL_BB (bb)
> BITMAP_FREE (bb->aux);
> --- 554,571 ----
> FOR_EACH_BB (bb)
> {
> bitmap active = (bitmap)bb->aux;
> ! add_scope_conflicts_1 (bb, work, NULL);
> if (bitmap_ior_into (active, work))
> changed = true;
> }
> }
>
> + old_conflicts = BITMAP_ALLOC (NULL);
> +
> FOR_EACH_BB (bb)
> ! add_scope_conflicts_1 (bb, work, old_conflicts);
>
> + BITMAP_FREE (old_conflicts);
> BITMAP_FREE (work);
> FOR_ALL_BB (bb)
> BITMAP_FREE (bb->aux);