On Sat, Jul 4, 2015 at 2:39 PM, Ajit Kumar Agarwal <ajit.kumar.agar...@xilinx.com> wrote: > > > -----Original Message----- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: Tuesday, June 30, 2015 4:42 PM > To: Ajit Kumar Agarwal > Cc: l...@redhat.com; GCC Patches; Vinod Kathail; Shail Aditya Gupta; > Vidhumouli Hunsigida; Nagaraju Mekala > Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree > ssa representation > > On Tue, Jun 30, 2015 at 10:16 AM, Ajit Kumar Agarwal > <ajit.kumar.agar...@xilinx.com> wrote: >> All: >> >> The below patch added a new path Splitting optimization pass on SSA >> representation. The Path Splitting optimization Pass moves the join >> block of if-then-else same as loop latch to its predecessors and get merged >> with the predecessors Preserving the SSA representation. >> >> The patch is tested for Microblaze and i386 target. The EEMBC/Mibench >> benchmarks is run with the Microblaze target And the performance gain >> of 9.15% and rgbcmy01_lite(EEMBC benchmarks). The Deja GNU tests is run for >> Mircroblaze Target and no regression is seen for Microblaze target and the >> new testcase attached are passed. >> >> For i386 bootstrapping goes through fine and the Spec cpu2000 >> benchmarks is run with this patch. Following observation were seen with spec >> cpu2000 benchmarks. >> >> Ratio of path splitting change vs Ratio of not having path splitting change >> is 3653.353 vs 3652.14 for INT benchmarks. >> Ratio of path splitting change vs Ratio of not having path splitting change >> is 4353.812 vs 4345.351 for FP benchmarks. >> >> Based on comments from RFC patch following changes were done. >> >> 1. Added a new pass for path splitting changes. >> 2. Placed the new path Splitting Optimization pass before the copy >> propagation pass. >> 3. The join block same as the Loop latch is wired into its >> predecessors so that the CFG Cleanup pass will merge the blocks Wired >> together. >> 4. Copy propagation routines added for path splitting changes is not >> needed as suggested by Jeff. They are removed in the patch as The copy >> propagation in the copied join blocks will be done by the existing copy >> propagation pass and the update ssa pass. >> 5. Only the propagation of phi results of the join block with the phi >> argument is done which will not be done by the existing update_ssa Or copy >> propagation pass on tree ssa representation. >> 6. Added 2 tests. >> a) compilation check tests. >> b) execution tests. >> 7. Refactoring of the code for the feasibility check and finding the join >> block same as loop latch node. >> >> [Patch,tree-optimization]: Add new path Splitting pass on tree ssa >> representation. >> >> Added a new pass on path splitting on tree SSA representation. The path >> splitting optimization does the CFG transformation of join block of the >> if-then-else same as the loop latch node is moved and merged with the >> predecessor blocks after preserving the SSA representation. >> >> ChangeLog: >> 2015-06-30 Ajit Agarwal <ajit...@xilinx.com> >> >> * gcc/Makefile.in: Add the build of the new file >> tree-ssa-path-split.c >> * gcc/common.opt: Add the new flag ftree-path-split. >> * gcc/opts.c: Add an entry for Path splitting pass >> with optimization flag greater and equal to O2. >> * gcc/passes.def: Enable and add new pass path splitting. >> * gcc/timevar.def: Add the new entry for TV_TREE_PATH_SPLIT. >> * gcc/tree-pass.h: Extern Declaration of make_pass_path_split. >> * gcc/tree-ssa-path-split.c: New file for path splitting pass. >> * gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c: New testcase. >> * gcc/testsuite/gcc.dg/path-split-1.c: New testcase. > >>>I'm not 100% sure I understand the transform but what I see from the >>>testcases it tail-duplicates from a conditional up to a loop latch block >>>(not sure if it >>includes it and thus ends up creating a loop nest or not). > >>>An observation I have is that the pass should at least share the transform >>>stage to some extent with the existing tracer pass (tracer.c) which >>>essentially does >>the same but not restricted to loops in any way. > > The following piece of code from tracer.c can be shared with the existing > path splitting pass. > > { > e = find_edge (bb, bb2); > > copy = duplicate_block (bb2, e, bb); > flush_pending_stmts (e); > > add_phi_args_after_copy (©, 1, NULL); > } > > Sharing the above code of the transform stage of tracer.c with the path > splitting pass has the following limitation. > > 1. The duplicated loop latch node is wired to its predecessors and the > existing phi node in the loop latch node with the > Phi arguments from its corresponding predecessors is moved to the duplicated > loop latch node that is wired into its predecessors. Due > To this, the duplicated loop latch nodes wired into its predecessors will not > be merged with the original predecessors by CFG cleanup phase . > >>> So I wonder if your pass could be simply another heuristic to compute paths >>> to trace in the existing tracer pass. > > Sorry, I am not very clear when you say the above. I am trying to figure out > whether you expect the existing pass of tracer.c should be modified > Or the path splitting pass should coexist.
Yes, I was wondering whether tracer.c could be simply modified. Both transforms are doing something very similar. > My understanding of existing tracer pass is to find out the traces based on > the frequency with the profile data. > Based on the traces, tail duplication is done in order to enable the > superblock regions. So I wonder the path splitting pass could be > incorporated in the > existing pass to compute the path of the traces. Yes, your pass would simply compute extra traces based on the new heuristic. Richard. > Thanks & Regards > Ajit > > Thanks, > Richard. > >> Signed-off-by:Ajit Agarwal ajit...@xilinx.com. >> >> gcc/Makefile.in | 1 + >> gcc/common.opt | 4 + >> gcc/opts.c | 1 + >> gcc/passes.def | 1 + >> gcc/testsuite/gcc.dg/path-split-1.c | 65 ++++ >> gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c | 62 ++++ >> gcc/timevar.def | 1 + >> gcc/tree-pass.h | 1 + >> gcc/tree-ssa-path-split.c | 462 >> +++++++++++++++++++++++++++ >> 9 files changed, 598 insertions(+) >> create mode 100644 gcc/testsuite/gcc.dg/path-split-1.c >> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c >> create mode 100644 gcc/tree-ssa-path-split.c >> >> diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 5f9261f..35ac363 >> 100644 >> --- a/gcc/Makefile.in >> +++ b/gcc/Makefile.in >> @@ -1476,6 +1476,7 @@ OBJS = \ >> tree-vect-slp.o \ >> tree-vectorizer.o \ >> tree-vrp.o \ >> + tree-ssa-path-split.o \ >> tree.o \ >> valtrack.o \ >> value-prof.o \ >> diff --git a/gcc/common.opt b/gcc/common.opt index e104269..c63b100 >> 100644 >> --- a/gcc/common.opt >> +++ b/gcc/common.opt >> @@ -2328,6 +2328,10 @@ ftree-vrp >> Common Report Var(flag_tree_vrp) Init(0) Optimization Perform Value >> Range Propagation on trees >> >> +ftree-path-split >> +Common Report Var(flag_tree_path_split) Init(0) Optimization Perform >> +Path Splitting >> + >> funit-at-a-time >> Common Report Var(flag_unit_at_a_time) Init(1) Optimization Compile >> whole compilation unit at a time diff --git a/gcc/opts.c b/gcc/opts.c >> index 8a16116..31947ff 100644 >> --- a/gcc/opts.c >> +++ b/gcc/opts.c >> @@ -508,6 +508,7 @@ static const struct default_options >> default_options_table[] = >> { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 >> }, >> { OPT_LEVELS_2_PLUS, OPT_fipa_ra, NULL, 1 }, >> { OPT_LEVELS_2_PLUS, OPT_flra_remat, NULL, 1 }, >> + { OPT_LEVELS_2_PLUS, OPT_ftree_path_split, NULL, 1 }, >> >> /* -O3 optimizations. */ >> { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 >> }, diff --git a/gcc/passes.def b/gcc/passes.def index c0ddee4..43618eb >> 100644 >> --- a/gcc/passes.def >> +++ b/gcc/passes.def >> @@ -155,6 +155,7 @@ along with GCC; see the file COPYING3. If not see >> NEXT_PASS (pass_ccp); >> /* After CCP we rewrite no longer addressed locals into SSA >> form if possible. */ >> + NEXT_PASS (pass_path_split); >> NEXT_PASS (pass_copy_prop); >> NEXT_PASS (pass_complete_unrolli); >> NEXT_PASS (pass_phiprop); >> diff --git a/gcc/testsuite/gcc.dg/path-split-1.c >> b/gcc/testsuite/gcc.dg/path-split-1.c >> new file mode 100644 >> index 0000000..075dc87 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/path-split-1.c >> @@ -0,0 +1,65 @@ >> +/* { dg-do run } */ >> +/* { dg-options "-O2 " } */ >> + >> +#include <stdio.h> >> +#include <stdlib.h> >> + >> +#define RGBMAX 255 >> + >> +int >> +test() >> +{ >> + int i, Pels; >> + unsigned char sum = 0; >> + unsigned char xr, xg, xb; >> + unsigned char xc, xm, xy, xk; >> + unsigned char *ReadPtr, *EritePtr; >> + >> + ReadPtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100); >> + EritePtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100); >> + >> + for (i = 0; i < 100;i++) >> + { >> + ReadPtr[i] = 100 - i; >> + } >> + >> + for (i = 0; i < 100; i++) >> + { >> + xr = *ReadPtr++; >> + xg = *ReadPtr++; >> + xb = *ReadPtr++; >> + >> + xc = (unsigned char) (RGBMAX - xr); >> + xm = (unsigned char) (RGBMAX - xg); >> + xy = (unsigned char) (RGBMAX - xb); >> + >> + if (xc < xm) >> + { >> + xk = (unsigned char) (xc < xy ? xc : xy); >> + } >> + else >> + { >> + xk = (unsigned char) (xm < xy ? xm : xy); >> + } >> + >> + xc = (unsigned char) (xc - xk); >> + xm = (unsigned char) (xm - xk); >> + xy = (unsigned char) (xy - xk); >> + >> + *EritePtr++ = xc; >> + *EritePtr++ = xm; >> + *EritePtr++ = xy; >> + *EritePtr++ = xk; >> + sum += *EritePtr; >> + } >> + return sum; >> +} >> + >> +int >> +main() >> +{ >> + if (test() != 33) >> + abort(); >> + >> + return 0; >> +} >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c >> b/gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c >> new file mode 100644 >> index 0000000..19f277c >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c >> @@ -0,0 +1,62 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O2 -fdump-tree-path_split" } */ >> + >> +#include <stdio.h> >> +#include <stdlib.h> >> + >> +#define RGBMAX 255 >> + >> +int >> +test() >> +{ >> + int i, Pels; >> + unsigned char sum = 0; >> + unsigned char xr, xg, xb; >> + unsigned char xc, xm, xy, xk; >> + unsigned char *ReadPtr, *EritePtr; >> + >> + ReadPtr = (unsigned char *) malloc (sizeof (unsigned char) * 100); >> + EritePtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100); >> + >> + for (i = 0; i < 100;i++) >> + { >> + ReadPtr[i] = 100 - i; >> + } >> + >> + for (i = 0; i < 100; i++) >> + { >> + xr = *ReadPtr++; >> + xg = *ReadPtr++; >> + xb = *ReadPtr++; >> + >> + xc = ( unsigned char) (RGBMAX - xr); >> + xm = ( unsigned char) (RGBMAX - xg); >> + xy = ( unsigned char) (RGBMAX - xb); >> + >> + if (xc < xm) >> + { >> + xk = ( unsigned char) (xc < xy ? xc : xy); >> + } >> + else >> + { >> + xk = ( unsigned char) (xm < xy ? xm : xy); >> + } >> + >> + xc = (unsigned char) (xc - xk); >> + xm = (unsigned char) (xm - xk); >> + xy = (unsigned char) (xy - xk); >> + >> + *EritePtr++ = xc; >> + *EritePtr++ = xm; >> + *EritePtr++ = xy; >> + *EritePtr++ = xk; >> + sum += *EritePtr; >> + } >> + return sum; >> +} >> + >> +/* { dg-final { scan-tree-dump "xc_[0-9][0-9]* -> { xc_[0-9][0-9]* }" >> +"path_split"} } */ >> +/* { dg-final { scan-tree-dump "xm_[0-9][0-9]* -> { xm_[0-9][0-9]* }" >> +"path_split"} } */ >> +/* { dg-final { scan-tree-dump "xy_[0-9][0-9]* -> { xy_[0-9][0-9]* }" >> +"path_split"} } */ >> +/* { dg-final { scan-tree-dump "Merging blocks" "path_split"} } */ >> +/* { dg-final { cleanup-tree-dump "path_split" } } */ >> diff --git a/gcc/timevar.def b/gcc/timevar.def index 711bbed..6217a8e >> 100644 >> --- a/gcc/timevar.def >> +++ b/gcc/timevar.def >> @@ -288,3 +288,4 @@ DEFTIMEVAR (TV_JIT_REPLAY , "replay of JIT client >> activity") >> DEFTIMEVAR (TV_ASSEMBLE , "assemble JIT code") >> DEFTIMEVAR (TV_LINK , "link JIT code") >> DEFTIMEVAR (TV_LOAD , "load JIT result") >> +DEFTIMEVAR (TV_TREE_PATH_SPLIT , "tree path_split") >> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 398ab83..e00639e >> 100644 >> --- a/gcc/tree-pass.h >> +++ b/gcc/tree-pass.h >> @@ -379,6 +379,7 @@ extern gimple_opt_pass *make_pass_iv_optimize >> (gcc::context *ctxt); extern gimple_opt_pass >> *make_pass_tree_loop_done (gcc::context *ctxt); extern >> gimple_opt_pass *make_pass_ch (gcc::context *ctxt); extern >> gimple_opt_pass *make_pass_ccp (gcc::context *ctxt); >> +extern gimple_opt_pass *make_pass_path_split (gcc::context *ctxt); >> extern gimple_opt_pass *make_pass_phi_only_cprop (gcc::context >> *ctxt); extern gimple_opt_pass *make_pass_build_ssa (gcc::context >> *ctxt); extern gimple_opt_pass *make_pass_build_alias (gcc::context >> *ctxt); diff --git a/gcc/tree-ssa-path-split.c >> b/gcc/tree-ssa-path-split.c new file mode 100644 index >> 0000000..3da7791 >> --- /dev/null >> +++ b/gcc/tree-ssa-path-split.c >> @@ -0,0 +1,462 @@ >> +/* Support routines for Path Splitting. >> + Copyright (C) 2015 Free Software Foundation, Inc. >> + Contributed by Ajit Kumar Agarwal <ajit...@xilinx.com>. >> + >> + This file is part of GCC. >> + >> + GCC is free software; you can redistribute it and/or modify it under >> + the terms of the GNU General Public License as published by the Free >> + Software Foundation; either version 3, or (at your option) any later >> + version. >> + >> +GCC is distributed in the hope that it will be useful, but WITHOUT >> +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or >> +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License >> +for more details. >> + >> +You should have received a copy of the GNU General Public License >> +along with GCC; see the file COPYING3. If not see >> +<http://www.gnu.org/licenses/>. */ >> + >> +#include "config.h" >> +#include "system.h" >> +#include "coretypes.h" >> +#include "tm.h" >> +#include "flags.h" >> +#include "tree.h" >> +#include "stor-layout.h" >> +#include "calls.h" >> +#include "predict.h" >> +#include "vec.h" >> +#include "hashtab.h" >> +#include "hash-set.h" >> +#include "machmode.h" >> +#include "hard-reg-set.h" >> +#include "input.h" >> +#include "function.h" >> +#include "dominance.h" >> +#include "cfg.h" >> +#include "cfganal.h" >> +#include "basic-block.h" >> +#include "tree-ssa-alias.h" >> +#include "internal-fn.h" >> +#include "gimple-fold.h" >> +#include "tree-eh.h" >> +#include "gimple-expr.h" >> +#include "is-a.h" >> +#include "gimple.h" >> +#include "gimple-iterator.h" >> +#include "gimple-walk.h" >> +#include "gimple-ssa.h" >> +#include "tree-cfg.h" >> +#include "tree-phinodes.h" >> +#include "ssa-iterators.h" >> +#include "stringpool.h" >> +#include "tree-ssanames.h" >> +#include "tree-ssa-loop-manip.h" >> +#include "tree-ssa-loop-niter.h" >> +#include "tree-ssa-loop.h" >> +#include "tree-into-ssa.h" >> +#include "tree-ssa.h" >> +#include "tree-pass.h" >> +#include "tree-dump.h" >> +#include "gimple-pretty-print.h" >> +#include "diagnostic-core.h" >> +#include "intl.h" >> +#include "cfgloop.h" >> +#include "tree-scalar-evolution.h" >> +#include "tree-ssa-propagate.h" >> +#include "tree-chrec.h" >> +#include "tree-ssa-threadupdate.h" >> +#include "expr.h" >> +#include "insn-codes.h" >> +#include "optabs.h" >> +#include "tree-ssa-threadedge.h" >> +#include "wide-int.h" >> + >> +/* Replace_uses_phi function propagates the phi results with the >> + first phi argument into each of the copied join blocks wired into >> + its predecessors. This function is called from the replace_uses_phi >> + to replace the uses of first phi arguments with the second >> + phi arguments in the next copy of join block. */ >> + >> +static void >> +replace_use_phi_operand1_with_operand2 (basic_block b, >> + tree use1, >> + tree use2) { >> + use_operand_p use; >> + ssa_op_iter iter; >> + gimple_stmt_iterator gsi; >> + >> + for (gsi = gsi_start_bb (b); !gsi_end_p (gsi);) >> + { >> + gimple stmt = gsi_stmt (gsi); >> + FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE) >> + { >> + tree tuse = USE_FROM_PTR (use); >> + if (use1 == tuse || use1 == NULL_TREE) >> + { >> + propagate_value (use, use2); >> + update_stmt(stmt); >> + } >> + } >> + gsi_next(&gsi); >> + } >> +} >> + >> +/* This function propagates the phi result into the use points with >> + the phi arguments. The join block is copied and wired into the >> + predecessors. Since the use points of the phi results will be same >> + in the each of the copy join blocks in the predecessors, it >> + propagates the phi arguments in the copy of the join blocks wired >> + into its predecessor. */ >> + >> +static >> +void replace_uses_phi (basic_block b, basic_block temp_bb) { >> + gimple_seq phis = phi_nodes (b); >> + gimple phi = gimple_seq_first_stmt (phis); >> + tree def = gimple_phi_result (phi), use = gimple_phi_arg_def >> +(phi,0); >> + tree use2 = gimple_phi_arg_def (phi,1); >> + >> + if (virtual_operand_p (def)) >> + { >> + imm_use_iterator iter; >> + use_operand_p use_p; >> + gimple stmt; >> + >> + FOR_EACH_IMM_USE_STMT (stmt, iter, def) >> + FOR_EACH_IMM_USE_ON_STMT (use_p, iter) >> + SET_USE (use_p, use); >> + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (def)) >> + SSA_NAME_OCCURS_IN_ABNORMAL_PHI (use) = 1; >> + } >> + else >> + replace_uses_by (def, use); >> + replace_use_phi_operand1_with_operand2 (temp_bb, use, use2); } >> + >> +/* Returns true if the block bb has label or call statements. >> + Otherwise return false. */ >> + >> +static bool >> +is_block_has_label_call (basic_block bb) { >> + gimple_stmt_iterator gsi; >> + >> + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) >> + { >> + gimple stmt = gsi_stmt(gsi); >> + if (dyn_cast <glabel *> (stmt)) >> + { >> + return true; >> + } >> + if (is_gimple_call (stmt)) >> + return true; >> + } >> + return false; >> +} >> + >> +/* This function performs the feasibility tests for path splitting >> + to perform. Return false if the feasibility for path splitting >> + is not done and returns true if the feasbility for path splitting >> + is done. Following feasibility tests are performed. >> + >> + 1. Return false if the join block has call gimple statements. >> + 2. Return false if the join block has rhs casting for assign >> + gimple statements. >> + 3. If the number of phis is greater than 1 or the phi node in >> + the join block has virtual operand return false. >> + 4. Return false if the number of sequential statements is >> + greater than 2. >> + 5. If the predecessors blocks has labels and call statements >> + return false. >> + 6. If the phi result in the phi node of the join block is not >> + used inside the same join block return false. >> + 7. Otherwise returns true. */ >> + >> +static bool >> +is_feasible_path_splitting (basic_block join_node, basic_block pred1, >> + basic_block pred2) { >> + int num_stmt = 0, num_phis = 0; >> + gimple_stmt_iterator psi, gsi; >> + >> + for (gsi = gsi_start_bb (join_node); !gsi_end_p (gsi); gsi_next (&gsi)) >> + { >> + gimple stmt = gsi_stmt(gsi); >> + >> + if (gimple_assign_cast_p (stmt)) >> + return false; >> + >> + if (is_gimple_call (stmt)) >> + return false; >> + >> + if (!is_gimple_debug(stmt)) >> + { >> + num_stmt++; >> + } >> + } >> + >> + if (pred1 && pred2 && (num_stmt > 2)) >> + { >> + bool found_virtual_result = false; >> + >> + for (psi = gsi_start_phis (join_node); !gsi_end_p (psi); ) >> + { >> + use_operand_p use_p; >> + imm_use_iterator iter; >> + gimple stmt = gsi_stmt(psi); >> + >> + if (!virtual_operand_p (gimple_phi_result (stmt))) >> + num_phis++; >> + else >> + found_virtual_result = true; >> + >> + FOR_EACH_IMM_USE_FAST (use_p, iter, gimple_phi_result (stmt)) >> + { >> + gimple use_stmt = USE_STMT (use_p); >> + >> + if (gimple_bb (use_stmt) != join_node) >> + return false; >> + } >> + >> + gsi_next(&psi); >> + } >> + >> + if ((num_phis >1) || found_virtual_result) >> + return false; >> + >> + if(is_block_has_label_call(pred1) || is_block_has_label_call(pred2)) >> + return false; >> + >> + return true; >> + } >> + return false; >> +} >> + >> +/* Update the statements in the basic block with the basic >> + basic block. */ >> + >> +static void >> +update_stmt_bb(basic_block b) >> +{ >> + gimple_stmt_iterator gsi; >> + for(gsi = gsi_start_bb(b); !gsi_end_p(gsi); gsi_next(&gsi)) >> + { >> + gimple stmt = gsi_stmt(gsi); >> + gimple_set_bb(stmt,b); >> + } >> +} >> + >> +/* This function gets the join blocks same as the source >> + node of the loop latch nodes and the predecessors of >> + the join block is updated in the pred1 and pred2 passed >> + as the reference arguments into the function. Return >> + the join block. */ >> + >> +static basic_block >> +get_join_blk_same_as_loop_latch (basic_block bb, >> + basic_block &pred1, >> + basic_block &pred2) { >> + vec<basic_block> bbs; >> + basic_block bb1; >> + unsigned int i; >> + edge_iterator ei; >> + edge e1; >> + bool found = false ,found1; >> + bbs = get_all_dominated_blocks (CDI_DOMINATORS, >> + bb ); >> + FOR_EACH_VEC_ELT (bbs, i, bb1) >> + { >> + found1 = false; >> + FOR_EACH_EDGE (e1, ei, bb->succs) >> + { >> + if ( bb1 == e1->dest) >> + { >> + found = true; >> + found1 = true; >> + } >> + } >> + if (!found1 && found) >> + { >> + found = false; >> + FOR_EACH_EDGE (e1, ei, bb1->succs) >> + { >> + if (e1->flags & (EDGE_DFS_BACK)) >> + found = true; >> + } >> + >> + if (found && EDGE_COUNT(bb1->preds) == 2) >> + { >> + unsigned int k = 0; >> + FOR_EACH_EDGE (e1, ei, bb1->preds) >> + { >> + if ((e1->flags & (EDGE_DFS_BACK))) >> + continue; >> + >> + if ( k == 1) >> + { >> + if (single_succ_p(e1->src) && >> + single_succ_edge (e1->src)->flags & EDGE_FALLTHRU) >> + { >> + pred2 = e1->src; >> + } >> + } >> + else >> + { >> + if (single_succ_p(e1->src) && >> + single_succ_edge (e1->src)->flags & EDGE_FALLTHRU) >> + { >> + pred1 = e1->src; >> + } >> + } >> + k++; >> + } >> + bbs.release(); >> + return bb1; >> + } >> + } >> + } >> + bbs.release(); >> + return NULL; >> +} >> + >> +/* This is the core function to perform path splitting. The join >> + same as the source of the loop latch node is identified along >> + with their predecessors. Based on the feasibility tests for >> + path splitting the path splitting is performed by wiring the >> + copy of join blocks into the predecessors and propagating the phi >> + result with the corresponding phi arguments into each of the copy >> + of join blocks wired with the original predecessors of the join >> + block. >> + >> + The tree-cfg-cleanup will merge the blocks in the predecessors >> + path and the update-ssa will update the ssa representation after >> + the path splitting is performed. */ >> + >> +static void >> +perform_path_splitting (basic_block bb) { >> + basic_block pred1 = NULL, pred2 = NULL, join_block = NULL; >> + >> + join_block = get_join_blk_same_as_loop_latch (bb, pred1, pred2); >> + >> + if (join_block && >> + is_feasible_path_splitting (join_block, pred1, pred2)) >> + { >> + basic_block new_bb1 = NULL, new_bb2 = NULL; >> + gimple_stmt_iterator last; >> + basic_block temp_bb = NULL; >> + edge_iterator ei; >> + edge e1; >> + >> + temp_bb = duplicate_block (join_block, NULL, NULL); >> + >> + FOR_EACH_EDGE (e1, ei, pred1->succs) >> + new_bb1 = split_edge (e1); >> + >> + FOR_EACH_EDGE (e1, ei, pred2->succs) >> + new_bb2 = split_edge (e1); >> + >> + last = gsi_start_bb (new_bb1); >> + gsi_insert_seq_after (&last, bb_seq (join_block), GSI_NEW_STMT); >> + last = gsi_start_bb (new_bb2); >> + gsi_insert_seq_after (&last, bb_seq (temp_bb), GSI_NEW_STMT); >> + update_stmt_bb (new_bb1); >> + update_stmt_bb (new_bb2); >> + >> + replace_uses_phi (join_block, new_bb2); >> + >> + set_bb_seq (join_block, NULL); >> + set_bb_seq(temp_bb,NULL); >> + delete_basic_block (temp_bb); >> + return; >> + } >> +} >> + >> +static unsigned int >> +execute_path_split (void) >> +{ >> + basic_block bb; >> + >> + loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS); >> + initialize_original_copy_tables(); >> + >> + calculate_dominance_info (CDI_DOMINATORS); >> + calculate_dominance_info (CDI_POST_DOMINATORS); >> + >> + mark_dfs_back_edges (); >> + >> + FOR_EACH_BB_FN (bb, cfun) >> + { >> + gimple last; >> + >> + /* We only care about blocks ending in a COND_EXPR. */ >> + >> + last = gsi_stmt (gsi_last_bb (bb)); >> + >> + /* We're basically looking for a switch or any kind of conditional with >> + integral or pointer type arguments. Note the type of the second >> + argument will be the same as the first argument, so no need to >> + check it explicitly. */ >> + if ((last && (gimple_code (last) == GIMPLE_COND >> + && TREE_CODE (gimple_cond_lhs (last)) == SSA_NAME >> + && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (last))) >> + || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (last)))) >> + && (TREE_CODE (gimple_cond_rhs (last)) == SSA_NAME >> + || is_gimple_min_invariant (gimple_cond_rhs (last)))))) >> + { >> + >> + if (gimple_code(last) == GIMPLE_COND) >> + { >> + perform_path_splitting (bb); >> + } >> + } >> + } >> + >> + loop_optimizer_finalize (); >> + free_original_copy_tables (); >> + free_dominance_info (CDI_DOMINATORS); >> + free_dominance_info (CDI_POST_DOMINATORS); >> + return 0; >> +} >> + >> +namespace { >> + >> +const pass_data pass_data_path_split = { >> + GIMPLE_PASS, /* type */ >> + "path_split", /* name */ >> + OPTGROUP_NONE, /* optinfo_flags */ >> + TV_TREE_PATH_SPLIT, /* tv_id */ >> + PROP_ssa, /* properties_required */ >> + 0, /* properties_provided */ >> + 0, /* properties_destroyed */ >> + 0, /* todo_flags_start */ >> + ( TODO_cleanup_cfg | TODO_update_ssa ), /* todo_flags_finish */ >> +}; >> + >> +class pass_path_split : public gimple_opt_pass { >> + public: >> + pass_path_split (gcc::context *ctxt) >> + : gimple_opt_pass (pass_data_path_split, ctxt) >> + {} >> + >> + /* opt_pass methods: */ >> + opt_pass * clone () { return new pass_path_split (m_ctxt); } >> + virtual bool gate (function *) { return flag_tree_path_split != 0; } >> + virtual unsigned int execute (function *) { return >> + execute_path_split (); } >> + >> +}; // class pass_path_split >> + >> +} // anon namespace >> + >> +gimple_opt_pass * >> +make_pass_path_split (gcc::context *ctxt) { >> + return new pass_path_split (ctxt); >> +} >> -- >> 1.8.2.1 >> >> Thanks & Regards >> Ajit