This patch adds a few key messages summarizing loop unrolls and peels using the new dump infrastructure, along with an appropriate location for each loop. See also the following threads for context:
http://gcc.gnu.org/ml/gcc/2012-12/msg00056.html
http://gcc.gnu.org/ml/gcc-patches/2012-12/msg01000.html

The new high-level optimization summary messages are emitted for both the tree complete unroller and the RTL loop unroller/peeler. They go both to the dump files under the appropriate -fdump-* option and to the output enabled by -fopt-info. A few existing messages were removed because they are subsumed by the new summary messages.

Also included is a new routine (get_loop_location), ported from the Google branches, which identifies an appropriate location for the loop to use with the summary message (usually the line containing the loop condition).

I also changed dump_loc to print the filename when dumping the source location, as we already did when the location was unknown and we fell back to the function location. This makes it feasible to grep the output for messages and see the full source location in each message.

Sharad plans to submit a follow-on patch that will convert the bulk of the existing dump messages to the new dump infrastructure.

Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?

2012-12-17  Teresa Johnson  <tejohn...@google.com>

        * dumpfile.c (dump_loc): Print filename with location.
        * tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Use new
        location_t parameter to emit complete unroll message with new
        dump framework.
        (canonicalize_loop_induction_variables): Compute loop's location
        and pass to try_unroll_loop_completely.
        * loop-unroll.c (report_unroll_peel): New function.
        (peel_loops_completely): Use new dump format with location
        for main dumpfile message, and invoke report_unroll_peel on success.
        (decide_unrolling_and_peeling): Ditto.
        (decide_peel_once_rolling): Remove old dumpfile message subsumed
        by report_unroll_peel.
        (decide_peel_completely): Ditto.
        (decide_unroll_constant_iterations): Ditto.
        (decide_unroll_runtime_iterations): Ditto.
        (decide_peel_simple): Ditto.
        (decide_unroll_stupid): Ditto.
        * cfgloop.c (get_loop_location): New function.
        * cfgloop.h (get_loop_location): Declare.

testsuite/
        * gcc.dg/tree-ssa/loop-1.c: Update expected dump message.
        * gcc.dg/tree-ssa/loop-23.c: Ditto.
        * gcc.dg/tree-ssa/cunroll-1.c: Ditto.
        * gcc.dg/tree-ssa/cunroll-2.c: Ditto.
        * gcc.dg/tree-ssa/cunroll-3.c: Ditto.
        * gcc.dg/tree-ssa/cunroll-4.c: Ditto.
        * gcc.dg/tree-ssa/cunroll-5.c: Ditto.
        * gcc.dg/unroll_1.c: Ditto.
        * gcc.dg/unroll_2.c: Ditto.
        * gcc.dg/unroll_3.c: Ditto.
        * gcc.dg/unroll_4.c: Ditto.

Index: dumpfile.c
===================================================================
--- dumpfile.c (revision 194516)
+++ dumpfile.c (working copy)
@@ -265,7 +265,9 @@ dump_loc (int dump_kind, FILE *dfile, source_locat
                  DECL_SOURCE_FILE (current_function_decl),
                  DECL_SOURCE_LINE (current_function_decl));
       else
-        fprintf (dfile, "\n%d: ", LOCATION_LINE (loc));
+        fprintf (dfile, "\n%s:%d: note: ",
+                 LOCATION_FILE (loc),
+                 LOCATION_LINE (loc));
     }
 }

Index: testsuite/gcc.dg/unroll_2.c
===================================================================
--- testsuite/gcc.dg/unroll_2.c (revision 194516)
+++ testsuite/gcc.dg/unroll_2.c (working copy)
@@ -28,6 +28,6 @@ int foo2(void)
   return 1;
 }

-/* { dg-final { scan-rtl-dump-times "Decided to peel loop completely" 1 "loop2_unroll" } } */
+/* { dg-final { scan-rtl-dump-times "Turned loop into non-loop; it never loops" 1 "loop2_unroll" } } */
 /* { dg-final { cleanup-rtl-dump "loop2_unroll" } } */
 /* { dg-excess-errors "extra notes" } */
Index: testsuite/gcc.dg/unroll_3.c
===================================================================
--- testsuite/gcc.dg/unroll_3.c (revision 194516)
+++ testsuite/gcc.dg/unroll_3.c (working copy)
@@ -28,6 +28,6 @@ int foo2(void)
   return 1;
 }

-/* { dg-final { scan-rtl-dump-times "Decided to peel loop completely" 1 "loop2_unroll" } } */
+/* { dg-final { scan-rtl-dump-times "Turned loop into non-loop; it never loops" 1 "loop2_unroll" } } */
 /* { dg-final { cleanup-rtl-dump "loop2_unroll" } } */
 /* { dg-excess-errors "extra notes" } */
Index: testsuite/gcc.dg/tree-ssa/loop-1.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/loop-1.c (revision 194516)
+++ testsuite/gcc.dg/tree-ssa/loop-1.c (working copy)
@@ -33,7 +33,7 @@ int xxx(void)

 /* { dg-final { scan-tree-dump-times "Added canonical iv to loop 1, 4 iterations" 1 "ivcanon"} } */
 /* { dg-final { cleanup-tree-dump "ivcanon" } } */
-/* { dg-final { scan-tree-dump-times "Unrolled loop 1 completely" 1 "cunroll"} } */
+/* { dg-final { scan-tree-dump-times "Completely unroll loop 4 times" 1 "cunroll"} } */
 /* { dg-final { cleanup-tree-dump "cunroll" } } */
 /* { dg-final { scan-tree-dump-times "foo" 5 "optimized"} } */
 /* { dg-final { cleanup-tree-dump "optimized" } } */
Index: testsuite/gcc.dg/tree-ssa/cunroll-1.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/cunroll-1.c (revision 194516)
+++ testsuite/gcc.dg/tree-ssa/cunroll-1.c (working copy)
@@ -8,6 +8,6 @@ test(int c)
     a[i]=5;
 }
 /* Array bounds says the loop will not roll much. */
-/* { dg-final { scan-tree-dump "Unrolled loop 1 completely .duplicated 2 times.." "cunrolli"} } */
+/* { dg-final { scan-tree-dump "Completely unroll loop 2 times" "cunrolli"} } */
 /* { dg-final { scan-tree-dump "Last iteration exit edge was proved true." "cunrolli"} } */
 /* { dg-final { cleanup-tree-dump "cunrolli" } } */
Index: testsuite/gcc.dg/tree-ssa/cunroll-2.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/cunroll-2.c (revision 194516)
+++ testsuite/gcc.dg/tree-ssa/cunroll-2.c (working copy)
@@ -12,5 +12,5 @@ test(int c)
     }
 }
 /* We are not able to get rid of the final conditional because the loop has two exits. */
-/* { dg-final { scan-tree-dump "Unrolled loop 1 completely .duplicated 1 times.." "cunroll"} } */
+/* { dg-final { scan-tree-dump "Completely unroll loop 1 times" "cunroll"} } */
 /* { dg-final { cleanup-tree-dump "cunroll" } } */
Index: testsuite/gcc.dg/tree-ssa/loop-23.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/loop-23.c (revision 194516)
+++ testsuite/gcc.dg/tree-ssa/loop-23.c (working copy)
@@ -24,6 +24,6 @@ int foo(void)
   return sum;
 }

-/* { dg-final { scan-tree-dump-times "Unrolled loop 1 completely" 1 "cunroll" } } */
+/* { dg-final { scan-tree-dump-times "Completely unroll loop 3 times" 1 "cunroll" } } */
 /* { dg-final { cleanup-tree-dump "cunroll" } } */

Index: testsuite/gcc.dg/tree-ssa/cunroll-3.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/cunroll-3.c (revision 194516)
+++ testsuite/gcc.dg/tree-ssa/cunroll-3.c (working copy)
@@ -11,5 +11,5 @@ test(int c)
 }
 /* If we start duplicating headers prior curoll, this loop will have 0 iterations. */

-/* { dg-final { scan-tree-dump "Unrolled loop 1 completely .duplicated 1 times.." "cunrolli"} } */
+/* { dg-final { scan-tree-dump "Completely unroll loop 1 times" "cunrolli"} } */
 /* { dg-final { cleanup-tree-dump "cunrolli" } } */
Index: testsuite/gcc.dg/tree-ssa/cunroll-4.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/cunroll-4.c (revision 194516)
+++ testsuite/gcc.dg/tree-ssa/cunroll-4.c (working copy)
@@ -16,6 +16,6 @@ test(int c)

 /* We should do this as part of cunrolli, but our cost model do not take into account early exit
    from the last iteration. */
-/* { dg-final { scan-tree-dump "Turned loop 1 to non-loop; it never loops." "ivcanon"} } */
+/* { dg-final { scan-tree-dump "Turned loop into non-loop; it never loops." "ivcanon"} } */
 /* { dg-final { scan-tree-dump "Last iteration exit edge was proved true." "ivcanon"} } */
 /* { dg-final { cleanup-tree-dump "ivcanon" } } */
Index: testsuite/gcc.dg/tree-ssa/cunroll-5.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/cunroll-5.c (revision 194516)
+++ testsuite/gcc.dg/tree-ssa/cunroll-5.c (working copy)
@@ -8,7 +8,7 @@ test(int c)
     a[i]=5;
 }
 /* Basic testcase for complette unrolling. */
-/* { dg-final { scan-tree-dump "Unrolled loop 1 completely .duplicated 5 times.." "cunroll"} } */
+/* { dg-final { scan-tree-dump "Completely unroll loop 5 times" "cunroll"} } */
 /* { dg-final { scan-tree-dump "Exit condition of peeled iterations was eliminated." "cunroll"} } */
 /* { dg-final { scan-tree-dump "Last iteration exit edge was proved true." "cunroll"} } */
 /* { dg-final { cleanup-tree-dump "cunroll" } } */
Index: testsuite/gcc.dg/unroll_4.c
===================================================================
--- testsuite/gcc.dg/unroll_4.c (revision 194516)
+++ testsuite/gcc.dg/unroll_4.c (working copy)
@@ -28,6 +28,6 @@ int foo2(void)
   return 1;
 }

-/* { dg-final { scan-rtl-dump-times "Decided to peel loop completely" 1 "loop2_unroll" } } */
+/* { dg-final { scan-rtl-dump-times "Turned loop into non-loop; it never loops" 1 "loop2_unroll" } } */
 /* { dg-final { cleanup-rtl-dump "loop2_unroll" } } */
 /* { dg-excess-errors "extra notes" } */
Index: testsuite/gcc.dg/unroll_1.c
===================================================================
--- testsuite/gcc.dg/unroll_1.c (revision 194516)
+++ testsuite/gcc.dg/unroll_1.c (working copy)
@@ -28,5 +28,5 @@ int foo2(void)
   return 1;
 }

-/* { dg-final { scan-rtl-dump-times "Decided to peel loop completely" 2 "loop2_unroll" } } */
+/* { dg-final { scan-rtl-dump-times "Turned loop into non-loop; it never loops" 2 "loop2_unroll" } } */
 /* { dg-final { cleanup-rtl-dump "loop2_unroll" } } */
Index: tree-ssa-loop-ivcanon.c
===================================================================
--- tree-ssa-loop-ivcanon.c (revision 194516)
+++ tree-ssa-loop-ivcanon.c (working copy)
@@ -639,22 +639,24 @@ unloop_loops (bitmap loop_closed_ssa_invalidated,

 /* Tries to unroll LOOP completely, i.e. NITER times.
    UL determines which loops we are allowed to unroll.
-   EXIT is the exit of the loop that should be eliminated.
+   EXIT is the exit of the loop that should be eliminated.
    MAXITER specfy bound on number of iterations, -1 if it is
-   not known or too large for HOST_WIDE_INT. */
+   not known or too large for HOST_WIDE_INT. The location
+   LOCUS corresponding to the loop is used when emitting
+   a summary of the unroll to the dump file. */

 static bool
 try_unroll_loop_completely (struct loop *loop,
                             edge exit, tree niter,
                             enum unroll_level ul,
-                            HOST_WIDE_INT maxiter)
+                            HOST_WIDE_INT maxiter,
+                            location_t locus)
 {
   unsigned HOST_WIDE_INT n_unroll, ninsns, max_unroll, unr_insns;
   gimple cond;
   struct loop_size size;
   bool n_unroll_found = false;
   edge edge_to_cancel = NULL;
-  int num = loop->num;

   /* See if we proved number of iterations to be low constant.
@@ -862,14 +864,25 @@ try_unroll_loop_completely (struct loop *loop,
   loops_to_unloop.safe_push (loop);
   loops_to_unloop_nunroll.safe_push (n_unroll);

-  if (dump_file && (dump_flags & TDF_DETAILS))
+  if (dump_enabled_p ())
     {
       if (!n_unroll)
-        fprintf (dump_file, "Turned loop %d to non-loop; it never loops.\n",
-                 num);
+        dump_printf_loc (MSG_OPTIMIZED_LOCATIONS | TDF_DETAILS, locus,
+                         "Turned loop into non-loop; it never loops.\n");
       else
-        fprintf (dump_file, "Unrolled loop %d completely "
-                 "(duplicated %i times).\n", num, (int)n_unroll);
+        {
+          dump_printf_loc (MSG_OPTIMIZED_LOCATIONS | TDF_DETAILS, locus,
+                           "Completely unroll loop %d times", (int)n_unroll);
+          if (profile_info)
+            dump_printf (MSG_OPTIMIZED_LOCATIONS | TDF_DETAILS,
+                         " (header execution count %d)",
+                         (int)loop->header->count);
+          dump_printf (MSG_OPTIMIZED_LOCATIONS | TDF_DETAILS, "\n");
+        }
+    }
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
       if (exit)
         fprintf (dump_file, "Exit condition of peeled iterations was "
                  "eliminated.\n");
@@ -898,15 +911,19 @@ canonicalize_loop_induction_variables (struct loop
   tree niter;
   HOST_WIDE_INT maxiter;
   bool modified = false;
+  location_t locus = UNKNOWN_LOCATION;

   niter = number_of_latch_executions (loop);
   if (TREE_CODE (niter) == INTEGER_CST)
-    exit = single_exit (loop);
+    {
+      exit = single_exit (loop);
+      locus = gimple_location (last_stmt (exit->src));
+    }
   else
     {
       /* If the loop has more than one exit, try checking all of them
          for # of iterations determinable through scev. */
-      if (!single_exit (loop))
+      if (!(exit = single_exit (loop)))
         niter = find_loop_niter (loop, &exit);

       /* Finally if everything else fails, try brute force evaluation. */
@@ -915,6 +932,9 @@ canonicalize_loop_induction_variables (struct loop
               || TREE_CODE (niter) != INTEGER_CST))
         niter = find_loop_niter_by_eval (loop, &exit);

+      if (exit)
+        locus = gimple_location (last_stmt (exit->src));
+
       if (TREE_CODE (niter) != INTEGER_CST)
         exit = NULL;
     }
@@ -949,7 +969,7 @@ canonicalize_loop_induction_variables (struct loop
      populates the loop bounds. */
   modified |= remove_redundant_iv_tests (loop);

-  if (try_unroll_loop_completely (loop, exit, niter, ul, maxiter))
+  if (try_unroll_loop_completely (loop, exit, niter, ul, maxiter, locus))
     return true;

   if (create_iv
Index: loop-unroll.c
===================================================================
--- loop-unroll.c (revision 194516)
+++ loop-unroll.c (working copy)
@@ -148,6 +148,61 @@ static void combine_var_copies_in_loop_exit (struc
                                              basic_block);
 static rtx get_expansion (struct var_to_expand *);

+/* Emit a message summarizing the unroll or peel that will be
+   performed for LOOP, along with the loop's location LOCUS, if
+   appropriate given the dump or -fopt-info settings. */
+
+static void
+report_unroll_peel(struct loop *loop, location_t locus)
+{
+  struct niter_desc *desc;
+  int niters = 0;
+  int report_flags = MSG_OPTIMIZED_LOCATIONS | TDF_RTL | TDF_DETAILS;
+
+  if (!dump_enabled_p ())
+    return;
+
+  /* In the special case where the loop never iterated, emit
+     a different message so that we don't report an unroll by 0.
+     This matches the equivalent message emitted during tree unrolling. */
+  if (loop->lpt_decision.decision == LPT_PEEL_COMPLETELY
+      && !loop->lpt_decision.times)
+    {
+      dump_printf_loc (report_flags, locus,
+                       "Turned loop into non-loop; it never loops.\n");
+      return;
+    }
+
+  desc = get_simple_loop_desc (loop);
+
+  if (desc->const_iter)
+    niters = desc->niter;
+  else if (loop->header->count)
+    niters = expected_loop_iterations (loop);
+
+  dump_printf_loc (report_flags, locus,
+                   "%s loop %d times",
+                   (loop->lpt_decision.decision == LPT_PEEL_COMPLETELY
+                    ? "Completely unroll"
+                    : (loop->lpt_decision.decision == LPT_PEEL_SIMPLE
+                       ? "Peel" : "Unroll")),
+                   loop->lpt_decision.times);
+  if (profile_info)
+    dump_printf (report_flags,
+                 " (header execution count %d",
+                 (int)loop->header->count);
+  if (loop->lpt_decision.decision == LPT_PEEL_COMPLETELY)
+    dump_printf (report_flags,
+                 "%s%s iterations %d)",
+                 profile_info ? ", " : " (",
+                 desc->const_iter ? "const" : "average",
+                 niters);
+  else if (profile_info)
+    dump_printf (report_flags, ")");
+
+  dump_printf (report_flags, "\n");
+}
+
 /* Unroll and/or peel (depending on FLAGS) LOOPS. */
 void
 unroll_and_peel_loops (int flags)
@@ -234,11 +289,13 @@ peel_loops_completely (int flags)
   FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST)
     {
       loop->lpt_decision.decision = LPT_NONE;
+      location_t locus = get_loop_location(loop);

-      if (dump_file)
-        fprintf (dump_file,
-                 "\n;; *** Considering loop %d for complete peeling ***\n",
-                 loop->num);
+      if (dump_enabled_p ())
+        dump_printf_loc (TDF_RTL, locus,
+                         ";; *** Considering loop %d at BB %d for "
+                         "complete peeling ***\n",
+                         loop->num, loop->header->index);

       loop->ninsns = num_loop_insns (loop);

@@ -248,6 +305,7 @@ peel_loops_completely (int flags)

       if (loop->lpt_decision.decision == LPT_PEEL_COMPLETELY)
         {
+          report_unroll_peel(loop, locus);
           peel_loop_completely (loop);
 #ifdef ENABLE_CHECKING
           verify_loop_structure ();
@@ -267,9 +325,13 @@ decide_unrolling_and_peeling (int flags)
   FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST)
     {
       loop->lpt_decision.decision = LPT_NONE;
+      location_t locus = get_loop_location(loop);

-      if (dump_file)
-        fprintf (dump_file, "\n;; *** Considering loop %d ***\n", loop->num);
+      if (dump_enabled_p ())
+        dump_printf_loc (TDF_RTL, locus,
+                         ";; *** Considering loop %d at BB %d for "
+                         "unrolling and peeling ***\n",
+                         loop->num, loop->header->index);

       /* Do not peel cold areas. */
       if (optimize_loop_for_size_p (loop))
@@ -309,6 +371,8 @@ decide_unrolling_and_peeling (int flags)
         decide_unroll_stupid (loop, flags);
       if (loop->lpt_decision.decision == LPT_NONE)
         decide_peel_simple (loop, flags);
+
+      report_unroll_peel(loop, locus);
     }
 }

@@ -348,8 +412,6 @@ decide_peel_once_rolling (struct loop *loop, int f
     }

   /* Success. */
-  if (dump_file)
-    fprintf (dump_file, ";; Decided to peel exactly once rolling loop\n");
   loop->lpt_decision.decision = LPT_PEEL_COMPLETELY;
 }

@@ -429,8 +491,6 @@ decide_peel_completely (struct loop *loop, int fla
     }

   /* Success. */
-  if (dump_file)
-    fprintf (dump_file, ";; Decided to peel loop completely\n");
   loop->lpt_decision.decision = LPT_PEEL_COMPLETELY;
 }

@@ -608,10 +668,6 @@ decide_unroll_constant_iterations (struct loop *lo

   loop->lpt_decision.decision = LPT_UNROLL_CONSTANT;
   loop->lpt_decision.times = best_unroll;
-
-  if (dump_file)
-    fprintf (dump_file, ";; Decided to unroll the loop %d times (%d copies).\n",
-             loop->lpt_decision.times, best_copies);
 }

 /* Unroll LOOP with constant number of iterations LOOP->LPT_DECISION.TIMES times.
@@ -893,10 +949,6 @@ decide_unroll_runtime_iterations (struct loop *loo

   loop->lpt_decision.decision = LPT_UNROLL_RUNTIME;
   loop->lpt_decision.times = i - 1;
-
-  if (dump_file)
-    fprintf (dump_file, ";; Decided to unroll the loop %d times.\n",
-             loop->lpt_decision.times);
 }

 /* Splits edge E and inserts the sequence of instructions INSNS on it, and
@@ -1305,10 +1357,6 @@ decide_peel_simple (struct loop *loop, int flags)
   /* Success. */
   loop->lpt_decision.decision = LPT_PEEL_SIMPLE;
   loop->lpt_decision.times = npeel;
-
-  if (dump_file)
-    fprintf (dump_file, ";; Decided to simply peel the loop %d times.\n",
-             loop->lpt_decision.times);
 }

 /* Peel a LOOP LOOP->LPT_DECISION.TIMES times. The transformation does this:
@@ -1460,10 +1508,6 @@ decide_unroll_stupid (struct loop *loop, int flags

   loop->lpt_decision.decision = LPT_UNROLL_STUPID;
   loop->lpt_decision.times = i - 1;
-
-  if (dump_file)
-    fprintf (dump_file, ";; Decided to unroll the loop stupidly %d times.\n",
-             loop->lpt_decision.times);
 }

 /* Unroll a LOOP LOOP->LPT_DECISION.TIMES times. The transformation does this:
Index: cfgloop.c
===================================================================
--- cfgloop.c (revision 194516)
+++ cfgloop.c (working copy)
@@ -1666,3 +1666,59 @@ loop_exits_from_bb_p (struct loop *loop, basic_blo

   return false;
 }
+
+/* Return location corresponding to the loop control condition if possible. */
+
+location_t
+get_loop_location (struct loop *loop)
+{
+  rtx insn = NULL;
+  struct niter_desc *desc = NULL;
+  edge exit;
+
+  /* For a for or while loop, we would like to return the location
+   * of the for or while statement, if possible. To do this, look
+   * for the branch guarding the loop back-edge.
+   */
+
+  /* If this is a simple loop with an in_edge, then the loop control
+   * branch is typically at the end of its source.
+   */
+  desc = get_simple_loop_desc (loop);
+  if (desc->in_edge)
+    {
+      FOR_BB_INSNS_REVERSE (desc->in_edge->src, insn)
+        {
+          if (INSN_P (insn) && INSN_HAS_LOCATION (insn))
+            return INSN_LOCATION (insn);
+        }
+    }
+  /* If loop has a single exit, then the loop control branch
+   * must be at the end of its source.
+   */
+  if ((exit = single_exit(loop)))
+    {
+      FOR_BB_INSNS_REVERSE (exit->src, insn)
+        {
+          if (INSN_P (insn) && INSN_HAS_LOCATION (insn))
+            return INSN_LOCATION (insn);
+        }
+    }
+  /* Next check the latch, to see if it is non-empty. */
+  FOR_BB_INSNS_REVERSE (loop->latch, insn)
+    {
+      if (INSN_P (insn) && INSN_HAS_LOCATION (insn))
+        return INSN_LOCATION (insn);
+    }
+  /* Finally, if none of the above identifies the loop control branch,
+   * return the first location in the loop header.
+   */
+  FOR_BB_INSNS (loop->header, insn)
+    {
+      if (INSN_P (insn) && INSN_HAS_LOCATION (insn))
+        return INSN_LOCATION (insn);
+    }
+  /* If all else fails, simply return the current function location. */
+  return DECL_SOURCE_LOCATION (current_function_decl);
+}
+
Index: cfgloop.h
===================================================================
--- cfgloop.h (revision 194516)
+++ cfgloop.h (working copy)
@@ -239,6 +239,7 @@ extern bool loop_exit_edge_p (const struct loop *,
 extern bool loop_exits_to_bb_p (struct loop *, basic_block);
 extern bool loop_exits_from_bb_p (struct loop *, basic_block);
 extern void mark_loop_exit_edges (void);
+extern location_t get_loop_location (struct loop *loop);

 /* Loops & cfg manipulation. */
 extern basic_block *get_loop_body (const struct loop *);

--
This patch is available for review at http://codereview.appspot.com/6941070
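
For anyone who wants to eyeball the message shapes without building the compiler, here is a rough standalone sketch of the text that report_unroll_peel and the patched dump_loc combine to produce. It is not part of the patch. The struct loop_summary type, the lpt_dec enum, and the append and format_unroll_peel helpers are hypothetical stand-ins invented for illustration; only the format strings and the branch structure are taken from the diff above, and the leading newline that dump_loc prints is left out.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the GCC state that report_unroll_peel reads.
   None of these are the real GCC types.  */
enum lpt_dec { LPT_PEEL_COMPLETELY, LPT_PEEL_SIMPLE, LPT_UNROLL_OTHER };

struct loop_summary
{
  enum lpt_dec decision;   /* loop->lpt_decision.decision */
  unsigned times;          /* loop->lpt_decision.times */
  int have_profile;        /* nonzero when profile_info is set */
  int header_count;        /* loop->header->count */
  int const_iter;          /* desc->const_iter */
  int niters;              /* constant or expected iteration count */
};

/* Append formatted text to the NUL-terminated string in BUF,
   never writing past SIZE bytes.  */
static void
append (char *buf, size_t size, const char *fmt, ...)
{
  size_t len = strlen (buf);
  va_list ap;

  if (len + 1 >= size)
    return;
  va_start (ap, fmt);
  vsnprintf (buf + len, size - len, fmt, ap);
  va_end (ap);
}

/* Build the summary line for LOOP at FILE:LINE into BUF and return BUF.
   The branches follow report_unroll_peel from the patch.  */
static char *
format_unroll_peel (char *buf, size_t size, const char *file, int line,
                    const struct loop_summary *loop)
{
  if (size == 0)
    return buf;
  buf[0] = '\0';

  /* Prefix as printed by the patched dump_loc, minus its leading newline.  */
  append (buf, size, "%s:%d: note: ", file, line);

  /* A complete peel of zero iterations gets its own message.  */
  if (loop->decision == LPT_PEEL_COMPLETELY && loop->times == 0)
    {
      append (buf, size, "Turned loop into non-loop; it never loops.\n");
      return buf;
    }

  append (buf, size, "%s loop %u times",
          loop->decision == LPT_PEEL_COMPLETELY ? "Completely unroll"
          : loop->decision == LPT_PEEL_SIMPLE ? "Peel" : "Unroll",
          loop->times);
  if (loop->have_profile)
    append (buf, size, " (header execution count %d", loop->header_count);
  if (loop->decision == LPT_PEEL_COMPLETELY)
    append (buf, size, "%s%s iterations %d)",
            loop->have_profile ? ", " : " (",
            loop->const_iter ? "const" : "average", loop->niters);
  else if (loop->have_profile)
    append (buf, size, ")");
  append (buf, size, "\n");
  return buf;
}
```

Calling format_unroll_peel with a 256-byte buffer, "t.c", line 12, and a summary of { LPT_UNROLL_OTHER, 3, 0, 0, 0, 0 } yields "t.c:12: note: Unroll loop 3 times" followed by a newline. Because every message starts with file:line, grepping the -fopt-info output for "loop .* times" gives one self-describing line per transformed loop.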