On Fri, Nov 13, 2015 at 7:31 AM, Bin.Cheng <amker.ch...@gmail.com> wrote: > On Fri, Nov 13, 2015 at 2:13 PM, Jeff Law <l...@redhat.com> wrote: >> On 10/07/2015 10:32 PM, Ajit Kumar Agarwal wrote: >> >>> >>> 0001-RFC-Patch-Optimized-changes-in-the-register-used-ins.patch >>> >>> >>> From f164fd80953f3cffd96a492c8424c83290cd43cc Mon Sep 17 00:00:00 2001 >>> From: Ajit Kumar Agarwal<ajit...@xilix.com> >>> Date: Wed, 7 Oct 2015 20:50:40 +0200 >>> Subject: [PATCH] [RFC, Patch]: Optimized changes in the register used >>> inside >>> loop for LICM and IVOPTS. >>> >>> Changes are done in the Loop Invariant(LICM) at RTL level and also the >>> Induction variable optimization based on SSA representation. The current >>> logic used in LICM for register used inside the loops is changed. The >>> Live Out of the loop latch node and the Live in of the destination of >>> the exit nodes is used to set the Loops Liveness at the exit of the Loop. >>> The register used is the number of live variables at the exit of the >>> Loop calculated above. >>> >>> For Induction variable optimization on tree SSA representation, the >>> register >>> used logic is based on the number of phi nodes at the loop header to >>> represent >>> the liveness at the loop. Current Logic used only the number of phi nodes >>> at >>> the loop header. Changes are made to represent the phi operands also live >>> at >>> the loop. Thus number of phi operands also gets incremented in the number >>> of >>> registers used. >>> >>> ChangeLog: >>> 2015-10-09 Ajit Agarwal<ajit...@xilinx.com> >>> >>> * loop-invariant.c (compute_loop_liveness): New. >>> (determine_regs_used): New. >>> (find_invariants_to_move): Use of determine_regs_used. >>> * tree-ssa-loop-ivopts.c (determine_set_costs): Consider the phi >>> arguments for register used. >> >> I think Bin rejected the tree-ssa-loop-ivopts change. However, the >> loop-invariant change is still pending, right? > Ah, reject is a strong word, I am just being dumb and don't understand > why it's a general better estimation yet. > Maybe Richard have some inputs here?
Not really. I agree with Bin that the change doesn't look like an improvement by design (might be one by accident for some benchmarks). Richard. > Thanks, > bin >> >> >>> >>> Signed-off-by:Ajit agarwalajit...@xilinx.com >>> --- >>> gcc/loop-invariant.c | 72 >>> +++++++++++++++++++++++++++++++++++++--------- >>> gcc/tree-ssa-loop-ivopts.c | 4 +-- >>> 2 files changed, 60 insertions(+), 16 deletions(-) >>> >>> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c >>> index 52c8ae8..e4291c9 100644 >>> --- a/gcc/loop-invariant.c >>> +++ b/gcc/loop-invariant.c >>> @@ -1413,6 +1413,19 @@ set_move_mark (unsigned invno, int gain) >>> } >>> } >>> >>> +static int >>> +determine_regs_used() >>> +{ >>> + unsigned int j; >>> + unsigned int reg_used = 2; >>> + bitmap_iterator bi; >>> + >>> + EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (curr_loop)->regs_live, 0, j, bi) >>> + (reg_used) ++; >>> + >>> + return reg_used; >>> +} >> >> Isn't this just bitmap_count_bits (regs_live) + 2? >> >> >>> @@ -2055,9 +2057,43 @@ calculate_loop_reg_pressure (void) >>> } >>> } >>> >>> - >>> +static void >>> +calculate_loop_liveness (void) >> >> Needs a function comment. >> >> >>> +{ >>> + basic_block bb; >>> + struct loop *loop; >>> >>> -/* Move the invariants out of the loops. */ >>> + FOR_EACH_LOOP (loop, 0) >>> + if (loop->aux == NULL) >>> + { >>> + loop->aux = xcalloc (1, sizeof (struct loop_data)); >>> + bitmap_initialize (&LOOP_DATA (loop)->regs_live, ®_obstack); >>> + } >>> + >>> + FOR_EACH_BB_FN (bb, cfun) >> >> Why loop over blocks here? Why not just iterate through all the loops in >> the loop structure. Order isn't particularly important AFAICT for this >> code. >> >> >> >>> + { >>> + int i; >>> + edge e; >>> + vec<edge> edges; >>> + edges = get_loop_exit_edges (loop); >>> + FOR_EACH_VEC_ELT (edges, i, e) >>> + { >>> + bitmap_ior_into (&LOOP_DATA (loop)->regs_live, >>> DF_LR_OUT(e->src)); >>> + bitmap_ior_into (&LOOP_DATA (loop)->regs_live, >>> DF_LR_IN(e->dest)); >> >> Space before the open-paren in the previous two lines >> DF_LR_OUT (e->src) and FD_LR_INT (e->dest)) >> >> >>> + } >>> + } >>> + } >>> +} >>> + >>> +/* Move the invariants ut of the loops. */ >> >> Looks like you introduced a typo. >> >> I'd like to see testcases which show the change in # regs used computation >> helping generate better code. >> >> And I'd also like to see some background information on why you think this >> is a more accurate measure for the number of registers used in the loop. >> regs_used AFAICT is supposed to be an estimate of the registers live around >> the loop. So ISTM that you get that value by live-out set on the backedge >> of the loop. I guess you get somethign similar by looking at the exit >> edge's source block's live-out set. But I don't see any value in including >> stuff live at the block outside the loop. >> >> It also seems fairly non-intuitive. Get the block's latch and use its >> live-out set. That seems more intuitive. >>