On Sat, Jul 4, 2015 at 2:39 PM, Ajit Kumar Agarwal
<ajit.kumar.agar...@xilinx.com> wrote:
>
>
> -----Original Message-----
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Tuesday, June 30, 2015 4:42 PM
> To: Ajit Kumar Agarwal
> Cc: l...@redhat.com; GCC Patches; Vinod Kathail; Shail Aditya Gupta; 
> Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree 
> ssa representation
>
> On Tue, Jun 30, 2015 at 10:16 AM, Ajit Kumar Agarwal 
> <ajit.kumar.agar...@xilinx.com> wrote:
>> All:
>>
>> The below patch added a new path Splitting optimization pass on SSA
>> representation. The Path Splitting optimization Pass moves the join
>> block of if-then-else same as loop latch to its predecessors and get merged 
>> with the predecessors Preserving the SSA representation.
>>
>> The patch is tested for Microblaze and i386 target. The EEMBC/Mibench
>> benchmarks is run with the Microblaze target And the performance gain
>> of 9.15% and rgbcmy01_lite(EEMBC benchmarks). The Deja GNU tests is run for 
>> Mircroblaze Target and no regression is seen for Microblaze target and the 
>> new testcase attached are passed.
>>
>> For i386 bootstrapping goes through fine and the Spec cpu2000
>> benchmarks is run with this patch. Following observation were seen with spec 
>> cpu2000 benchmarks.
>>
>> Ratio of path splitting change vs Ratio of not having path splitting change 
>> is 3653.353 vs 3652.14 for INT benchmarks.
>> Ratio of path splitting change vs Ratio of not having path splitting change 
>> is  4353.812 vs 4345.351 for FP benchmarks.
>>
>> Based on comments from RFC patch following changes were done.
>>
>> 1. Added a new pass for path splitting changes.
>> 2. Placed the new path  Splitting Optimization pass before the copy 
>> propagation pass.
>> 3. The join block same as the Loop latch is wired into its
>> predecessors so that the CFG Cleanup pass will merge the blocks Wired 
>> together.
>> 4. Copy propagation routines added for path splitting changes is not
>> needed as suggested by Jeff. They are removed in the patch as The copy 
>> propagation in the copied join blocks will be done by the existing copy 
>> propagation pass and the update ssa pass.
>> 5. Only the propagation of phi results of the join block with the phi
>> argument is done which will not be done by the existing update_ssa Or copy 
>> propagation pass on tree ssa representation.
>> 6. Added 2 tests.
>>     a) compilation check  tests.
>>    b) execution tests.
>> 7. Refactoring of the code for the feasibility check and finding the join 
>> block same as loop latch node.
>>
>>     [Patch,tree-optimization]: Add new path Splitting pass on tree ssa 
>> representation.
>>
>>     Added a new pass on path splitting on tree SSA representation. The path
>>     splitting optimization does the CFG transformation of join block of the
>>     if-then-else same as the loop latch node is moved and merged with the
>>     predecessor blocks after preserving the SSA representation.
>>
>>     ChangeLog:
>>     2015-06-30  Ajit Agarwal  <ajit...@xilinx.com>
>>
>>         * gcc/Makefile.in: Add the build of the new file
>>         tree-ssa-path-split.c
>>         * gcc/common.opt: Add the new flag ftree-path-split.
>>         * gcc/opts.c: Add an entry for Path splitting pass
>>         with optimization flag greater and equal to O2.
>>         * gcc/passes.def: Enable and add new pass path splitting.
>>         * gcc/timevar.def: Add the new entry for TV_TREE_PATH_SPLIT.
>>         * gcc/tree-pass.h: Extern Declaration of make_pass_path_split.
>>         * gcc/tree-ssa-path-split.c: New file for path splitting pass.
>>         * gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c: New testcase.
>>         * gcc/testsuite/gcc.dg/path-split-1.c: New testcase.
>
>>>I'm not 100% sure I understand the transform but what I see from the 
>>>testcases it tail-duplicates from a conditional up to a loop latch block 
>>>(not sure if it >>includes it and thus ends up creating a loop nest or not).
>
>>>An observation I have is that the pass should at least share the transform 
>>>stage to some extent with the existing tracer pass (tracer.c) which 
>>>essentially does >>the same but not restricted to loops in any way.
>
> The following piece of code from tracer.c can be shared with the existing 
> path splitting pass.
>
> {
>              e = find_edge (bb, bb2);
>
>               copy = duplicate_block (bb2, e, bb);
>               flush_pending_stmts (e);
>
>               add_phi_args_after_copy (&copy, 1, NULL);
> }
>
> Sharing the above code of the transform stage of tracer.c with the path 
> splitting pass has the following limitation.
>
> 1. The duplicated loop latch node is wired to its predecessors and the 
> existing phi node in the loop latch node with the
> Phi arguments from its corresponding predecessors is moved to the duplicated 
> loop latch node that is wired into its predecessors. Due
> To this, the duplicated loop latch nodes wired into its predecessors will not 
> be merged with the original predecessors by CFG cleanup phase .
>
>>> So I wonder if your pass could be simply another heuristic to compute paths 
>>> to trace in the existing tracer pass.
>
> Sorry, I am not very clear when you say the above.  I am trying to figure out 
> whether you expect the existing pass of tracer.c should be modified
> Or the path splitting pass should coexist.

Yes, I was wondering whether tracer.c could be simply modified.  Both
transforms are doing something very similar.

> My understanding of existing tracer pass is to  find out the traces  based on 
> the frequency  with the profile data.
> Based on the traces,  tail duplication is done in order to enable the 
> superblock regions. So I wonder the path splitting pass could be  
> incorporated in the
> existing  pass  to compute the path of the traces.

Yes, your pass would simply compute extra traces based on the new heuristic.

Richard.

> Thanks & Regards
> Ajit
>
> Thanks,
> Richard.
>
>>     Signed-off-by:Ajit Agarwal ajit...@xilinx.com.
>>
>> gcc/Makefile.in                              |   1 +
>>  gcc/common.opt                               |   4 +
>>  gcc/opts.c                                   |   1 +
>>  gcc/passes.def                               |   1 +
>>  gcc/testsuite/gcc.dg/path-split-1.c          |  65 ++++
>>  gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c |  62 ++++
>>  gcc/timevar.def                              |   1 +
>>  gcc/tree-pass.h                              |   1 +
>>  gcc/tree-ssa-path-split.c                    | 462 
>> +++++++++++++++++++++++++++
>>  9 files changed, 598 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.dg/path-split-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c
>>  create mode 100644 gcc/tree-ssa-path-split.c
>>
>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 5f9261f..35ac363
>> 100644
>> --- a/gcc/Makefile.in
>> +++ b/gcc/Makefile.in
>> @@ -1476,6 +1476,7 @@ OBJS = \
>>         tree-vect-slp.o \
>>         tree-vectorizer.o \
>>         tree-vrp.o \
>> +        tree-ssa-path-split.o \
>>         tree.o \
>>         valtrack.o \
>>         value-prof.o \
>> diff --git a/gcc/common.opt b/gcc/common.opt index e104269..c63b100
>> 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -2328,6 +2328,10 @@ ftree-vrp
>>  Common Report Var(flag_tree_vrp) Init(0) Optimization  Perform Value
>> Range Propagation on trees
>>
>> +ftree-path-split
>> +Common Report Var(flag_tree_path_split) Init(0) Optimization Perform
>> +Path Splitting
>> +
>>  funit-at-a-time
>>  Common Report Var(flag_unit_at_a_time) Init(1) Optimization  Compile
>> whole compilation unit at a time diff --git a/gcc/opts.c b/gcc/opts.c
>> index 8a16116..31947ff 100644
>> --- a/gcc/opts.c
>> +++ b/gcc/opts.c
>> @@ -508,6 +508,7 @@ static const struct default_options 
>> default_options_table[] =
>>      { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 
>> },
>>      { OPT_LEVELS_2_PLUS, OPT_fipa_ra, NULL, 1 },
>>      { OPT_LEVELS_2_PLUS, OPT_flra_remat, NULL, 1 },
>> +    { OPT_LEVELS_2_PLUS, OPT_ftree_path_split, NULL, 1 },
>>
>>      /* -O3 optimizations.  */
>>      { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1
>> }, diff --git a/gcc/passes.def b/gcc/passes.def index c0ddee4..43618eb
>> 100644
>> --- a/gcc/passes.def
>> +++ b/gcc/passes.def
>> @@ -155,6 +155,7 @@ along with GCC; see the file COPYING3.  If not see
>>        NEXT_PASS (pass_ccp);
>>        /* After CCP we rewrite no longer addressed locals into SSA
>>          form if possible.  */
>> +      NEXT_PASS (pass_path_split);
>>        NEXT_PASS (pass_copy_prop);
>>        NEXT_PASS (pass_complete_unrolli);
>>        NEXT_PASS (pass_phiprop);
>> diff --git a/gcc/testsuite/gcc.dg/path-split-1.c
>> b/gcc/testsuite/gcc.dg/path-split-1.c
>> new file mode 100644
>> index 0000000..075dc87
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/path-split-1.c
>> @@ -0,0 +1,65 @@
>> +/* { dg-do run } */
>> +/* { dg-options "-O2 " } */
>> +
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +
>> +#define RGBMAX 255
>> +
>> +int
>> +test()
>> +{
>> +  int i, Pels;
>> +  unsigned char sum = 0;
>> +  unsigned char xr, xg, xb;
>> +  unsigned char xc, xm, xy, xk;
>> +  unsigned char *ReadPtr, *EritePtr;
>> +
>> +  ReadPtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100);
>> + EritePtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100);
>> +
>> +  for (i = 0; i < 100;i++)
>> +     {
>> +       ReadPtr[i] = 100 - i;
>> +     }
>> +
>> +  for (i = 0; i < 100; i++)
>> +     {
>> +       xr = *ReadPtr++;
>> +       xg = *ReadPtr++;
>> +       xb = *ReadPtr++;
>> +
>> +       xc = (unsigned char) (RGBMAX - xr);
>> +       xm = (unsigned char) (RGBMAX - xg);
>> +       xy = (unsigned char) (RGBMAX - xb);
>> +
>> +       if (xc < xm)
>> +         {
>> +           xk = (unsigned char) (xc < xy ? xc : xy);
>> +         }
>> +       else
>> +        {
>> +          xk = (unsigned char) (xm < xy ? xm : xy);
>> +        }
>> +
>> +       xc = (unsigned char) (xc - xk);
>> +       xm = (unsigned char) (xm - xk);
>> +       xy = (unsigned char) (xy - xk);
>> +
>> +       *EritePtr++ = xc;
>> +       *EritePtr++ = xm;
>> +       *EritePtr++ = xy;
>> +       *EritePtr++ = xk;
>> +       sum += *EritePtr;
>> +    }
>> +  return sum;
>> +}
>> +
>> +int
>> +main()
>> +{
>> +  if (test() != 33)
>> +    abort();
>> +
>> +  return 0;
>> +}
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c
>> b/gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c
>> new file mode 100644
>> index 0000000..19f277c
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c
>> @@ -0,0 +1,62 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-path_split" } */
>> +
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +
>> +#define RGBMAX 255
>> +
>> +int
>> +test()
>> +{
>> +  int i, Pels;
>> +  unsigned char sum = 0;
>> +  unsigned char xr, xg, xb;
>> +  unsigned char xc, xm, xy, xk;
>> +  unsigned char *ReadPtr, *EritePtr;
>> +
>> +  ReadPtr = (unsigned char *) malloc (sizeof (unsigned char) * 100);
>> + EritePtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100);
>> +
>> +  for (i = 0; i < 100;i++)
>> +     {
>> +       ReadPtr[i] = 100 - i;
>> +     }
>> +
>> +  for (i = 0; i < 100; i++)
>> +     {
>> +       xr = *ReadPtr++;
>> +       xg = *ReadPtr++;
>> +       xb = *ReadPtr++;
>> +
>> +       xc = ( unsigned char) (RGBMAX - xr);
>> +       xm = ( unsigned char) (RGBMAX - xg);
>> +       xy = ( unsigned char) (RGBMAX - xb);
>> +
>> +       if (xc < xm)
>> +         {
>> +           xk = ( unsigned char) (xc < xy ? xc : xy);
>> +         }
>> +       else
>> +         {
>> +           xk = ( unsigned char) (xm < xy ? xm : xy);
>> +         }
>> +
>> +       xc = (unsigned char) (xc - xk);
>> +       xm = (unsigned char) (xm - xk);
>> +       xy = (unsigned char) (xy - xk);
>> +
>> +       *EritePtr++ = xc;
>> +       *EritePtr++ = xm;
>> +       *EritePtr++ = xy;
>> +       *EritePtr++ = xk;
>> +       sum += *EritePtr;
>> +    }
>> +  return sum;
>> +}
>> +
>> +/* { dg-final { scan-tree-dump "xc_[0-9][0-9]* -> { xc_[0-9][0-9]* }"
>> +"path_split"} } */
>> +/* { dg-final { scan-tree-dump "xm_[0-9][0-9]* -> { xm_[0-9][0-9]* }"
>> +"path_split"} } */
>> +/* { dg-final { scan-tree-dump "xy_[0-9][0-9]* -> { xy_[0-9][0-9]* }"
>> +"path_split"} } */
>> +/* { dg-final { scan-tree-dump "Merging blocks" "path_split"} } */
>> +/* { dg-final { cleanup-tree-dump "path_split" } } */
>> diff --git a/gcc/timevar.def b/gcc/timevar.def index 711bbed..6217a8e
>> 100644
>> --- a/gcc/timevar.def
>> +++ b/gcc/timevar.def
>> @@ -288,3 +288,4 @@ DEFTIMEVAR (TV_JIT_REPLAY        , "replay of JIT client 
>> activity")
>>  DEFTIMEVAR (TV_ASSEMBLE             , "assemble JIT code")
>>  DEFTIMEVAR (TV_LINK                 , "link JIT code")
>>  DEFTIMEVAR (TV_LOAD                 , "load JIT result")
>> +DEFTIMEVAR (TV_TREE_PATH_SPLIT  , "tree path_split")
>> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 398ab83..e00639e
>> 100644
>> --- a/gcc/tree-pass.h
>> +++ b/gcc/tree-pass.h
>> @@ -379,6 +379,7 @@ extern gimple_opt_pass *make_pass_iv_optimize
>> (gcc::context *ctxt);  extern gimple_opt_pass
>> *make_pass_tree_loop_done (gcc::context *ctxt);  extern
>> gimple_opt_pass *make_pass_ch (gcc::context *ctxt);  extern
>> gimple_opt_pass *make_pass_ccp (gcc::context *ctxt);
>> +extern gimple_opt_pass *make_pass_path_split (gcc::context *ctxt);
>>  extern gimple_opt_pass *make_pass_phi_only_cprop (gcc::context
>> *ctxt);  extern gimple_opt_pass *make_pass_build_ssa (gcc::context
>> *ctxt);  extern gimple_opt_pass *make_pass_build_alias (gcc::context
>> *ctxt); diff --git a/gcc/tree-ssa-path-split.c
>> b/gcc/tree-ssa-path-split.c new file mode 100644 index
>> 0000000..3da7791
>> --- /dev/null
>> +++ b/gcc/tree-ssa-path-split.c
>> @@ -0,0 +1,462 @@
>> +/* Support routines for Path Splitting.
>> +   Copyright (C) 2015 Free Software Foundation, Inc.
>> +   Contributed by Ajit Kumar Agarwal <ajit...@xilinx.com>.
>> +
>> + This file is part of GCC.
>> +
>> + GCC is free software; you can redistribute it and/or modify it under
>> + the terms of the GNU General Public License as published by the Free
>> + Software Foundation; either version 3, or (at your option) any later
>> + version.
>> +
>> +GCC is distributed in the hope that it will be useful, but WITHOUT
>> +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>> +for more details.
>> +
>> +You should have received a copy of the GNU General Public License
>> +along with GCC; see the file COPYING3.  If not see
>> +<http://www.gnu.org/licenses/>.  */
>> +
>> +#include "config.h"
>> +#include "system.h"
>> +#include "coretypes.h"
>> +#include "tm.h"
>> +#include "flags.h"
>> +#include "tree.h"
>> +#include "stor-layout.h"
>> +#include "calls.h"
>> +#include "predict.h"
>> +#include "vec.h"
>> +#include "hashtab.h"
>> +#include "hash-set.h"
>> +#include "machmode.h"
>> +#include "hard-reg-set.h"
>> +#include "input.h"
>> +#include "function.h"
>> +#include "dominance.h"
>> +#include "cfg.h"
>> +#include "cfganal.h"
>> +#include "basic-block.h"
>> +#include "tree-ssa-alias.h"
>> +#include "internal-fn.h"
>> +#include "gimple-fold.h"
>> +#include "tree-eh.h"
>> +#include "gimple-expr.h"
>> +#include "is-a.h"
>> +#include "gimple.h"
>> +#include "gimple-iterator.h"
>> +#include "gimple-walk.h"
>> +#include "gimple-ssa.h"
>> +#include "tree-cfg.h"
>> +#include "tree-phinodes.h"
>> +#include "ssa-iterators.h"
>> +#include "stringpool.h"
>> +#include "tree-ssanames.h"
>> +#include "tree-ssa-loop-manip.h"
>> +#include "tree-ssa-loop-niter.h"
>> +#include "tree-ssa-loop.h"
>> +#include "tree-into-ssa.h"
>> +#include "tree-ssa.h"
>> +#include "tree-pass.h"
>> +#include "tree-dump.h"
>> +#include "gimple-pretty-print.h"
>> +#include "diagnostic-core.h"
>> +#include "intl.h"
>> +#include "cfgloop.h"
>> +#include "tree-scalar-evolution.h"
>> +#include "tree-ssa-propagate.h"
>> +#include "tree-chrec.h"
>> +#include "tree-ssa-threadupdate.h"
>> +#include "expr.h"
>> +#include "insn-codes.h"
>> +#include "optabs.h"
>> +#include "tree-ssa-threadedge.h"
>> +#include "wide-int.h"
>> +
>> +/* Replace_uses_phi function propagates the phi results with the
>> +   first phi argument into each of the copied join blocks wired into
>> +   its predecessors. This function is called from the replace_uses_phi
>> +   to replace the uses of first phi arguments with the second
>> +   phi arguments in the next copy of join block.  */
>> +
>> +static void
>> +replace_use_phi_operand1_with_operand2 (basic_block b,
>> +                                        tree use1,
>> +                                        tree use2) {
>> +  use_operand_p use;
>> +  ssa_op_iter iter;
>> +  gimple_stmt_iterator gsi;
>> +
>> +  for (gsi = gsi_start_bb (b); !gsi_end_p (gsi);)
>> +     {
>> +       gimple stmt = gsi_stmt (gsi);
>> +       FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
>> +       {
>> +         tree tuse = USE_FROM_PTR (use);
>> +          if (use1 == tuse || use1 == NULL_TREE)
>> +            {
>> +              propagate_value (use, use2);
>> +              update_stmt(stmt);
>> +            }
>> +        }
>> +       gsi_next(&gsi);
>> +     }
>> +}
>> +
>> +/* This function propagates the phi result into the use points with
>> +   the phi arguments. The join block is copied and wired into the
>> +   predecessors. Since the use points of the phi results will be same
>> +   in the each of the copy join blocks in the  predecessors, it
>> +   propagates the phi arguments in the copy of the join blocks wired
>> +   into its predecessor.  */
>> +
>> +static
>> +void replace_uses_phi (basic_block b, basic_block temp_bb) {
>> +  gimple_seq phis = phi_nodes (b);
>> +  gimple phi = gimple_seq_first_stmt (phis);
>> +  tree def = gimple_phi_result (phi), use = gimple_phi_arg_def
>> +(phi,0);
>> +  tree use2 = gimple_phi_arg_def (phi,1);
>> +
>> +  if (virtual_operand_p (def))
>> +    {
>> +      imm_use_iterator iter;
>> +      use_operand_p use_p;
>> +      gimple stmt;
>> +
>> +      FOR_EACH_IMM_USE_STMT (stmt, iter, def)
>> +        FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
>> +          SET_USE (use_p, use);
>> +      if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (def))
>> +        SSA_NAME_OCCURS_IN_ABNORMAL_PHI (use) = 1;
>> +    }
>> +   else
>> +     replace_uses_by (def, use);
>> +   replace_use_phi_operand1_with_operand2 (temp_bb, use, use2); }
>> +
>> +/* Returns true if the block bb has label or call statements.
>> +   Otherwise return false.  */
>> +
>> +static bool
>> +is_block_has_label_call (basic_block bb) {
>> +  gimple_stmt_iterator gsi;
>> +
>> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>> +     {
>> +       gimple stmt = gsi_stmt(gsi);
>> +       if (dyn_cast <glabel *> (stmt))
>> +         {
>> +           return true;
>> +         }
>> +       if (is_gimple_call (stmt))
>> +         return true;
>> +     }
>> +  return false;
>> +}
>> +
>> +/* This function performs the feasibility tests for path splitting
>> +   to perform. Return false if the feasibility for path splitting
>> +   is not done and returns true if the feasbility for path splitting
>> +   is done. Following feasibility tests are performed.
>> +
>> +   1. Return false if the join block has call gimple statements.
>> +   2. Return false if the join block has rhs casting for assign
>> +      gimple statements.
>> +   3. If the number of phis is greater than 1 or the phi node in
>> +      the join block has virtual operand return false.
>> +   4. Return false if the number of sequential statements is
>> +      greater than 2.
>> +   5. If the predecessors blocks has labels and call statements
>> +      return false.
>> +   6. If the phi result in the phi node of the join block is not
>> +      used inside the same join block return false.
>> +   7. Otherwise returns true.  */
>> +
>> +static bool
>> +is_feasible_path_splitting (basic_block join_node, basic_block pred1,
>> +                           basic_block pred2) {
>> +  int num_stmt = 0, num_phis = 0;
>> +  gimple_stmt_iterator psi, gsi;
>> +
>> +  for (gsi = gsi_start_bb (join_node); !gsi_end_p (gsi); gsi_next (&gsi))
>> +     {
>> +       gimple stmt = gsi_stmt(gsi);
>> +
>> +       if (gimple_assign_cast_p (stmt))
>> +         return false;
>> +
>> +       if (is_gimple_call (stmt))
>> +         return false;
>> +
>> +       if (!is_gimple_debug(stmt))
>> +         {
>> +           num_stmt++;
>> +         }
>> +     }
>> +
>> +   if (pred1 && pred2 && (num_stmt > 2))
>> +     {
>> +       bool found_virtual_result = false;
>> +
>> +       for (psi = gsi_start_phis (join_node); !gsi_end_p (psi); )
>> +          {
>> +            use_operand_p use_p;
>> +            imm_use_iterator iter;
>> +            gimple stmt = gsi_stmt(psi);
>> +
>> +            if (!virtual_operand_p (gimple_phi_result (stmt)))
>> +              num_phis++;
>> +            else
>> +              found_virtual_result = true;
>> +
>> +            FOR_EACH_IMM_USE_FAST (use_p, iter, gimple_phi_result (stmt))
>> +            {
>> +              gimple use_stmt = USE_STMT (use_p);
>> +
>> +              if (gimple_bb (use_stmt) != join_node)
>> +                return false;
>> +            }
>> +
>> +            gsi_next(&psi);
>> +         }
>> +
>> +       if ((num_phis >1) || found_virtual_result)
>> +          return false;
>> +
>> +       if(is_block_has_label_call(pred1) || is_block_has_label_call(pred2))
>> +         return false;
>> +
>> +       return true;
>> +    }
>> +  return false;
>> +}
>> +
>> +/* Update the statements in the basic block with the basic
>> +   basic block.  */
>> +
>> +static void
>> +update_stmt_bb(basic_block b)
>> +{
>> +  gimple_stmt_iterator gsi;
>> +  for(gsi = gsi_start_bb(b); !gsi_end_p(gsi); gsi_next(&gsi))
>> +   {
>> +     gimple stmt = gsi_stmt(gsi);
>> +     gimple_set_bb(stmt,b);
>> +   }
>> +}
>> +
>> +/* This function gets the join blocks same as the source
>> +   node of the loop latch nodes and the predecessors of
>> +   the join block is updated in the pred1 and pred2 passed
>> +   as the reference arguments into the function. Return
>> +   the join block.  */
>> +
>> +static basic_block
>> +get_join_blk_same_as_loop_latch (basic_block bb,
>> +                                 basic_block &pred1,
>> +                                 basic_block &pred2) {
>> +  vec<basic_block> bbs;
>> +  basic_block bb1;
>> +  unsigned int i;
>> +  edge_iterator ei;
>> +  edge e1;
>> +  bool found = false ,found1;
>> +  bbs = get_all_dominated_blocks (CDI_DOMINATORS,
>> +                                  bb );
>> +  FOR_EACH_VEC_ELT (bbs, i, bb1)
>> +  {
>> +    found1 = false;
>> +    FOR_EACH_EDGE (e1, ei, bb->succs)
>> +    {
>> +      if ( bb1 == e1->dest)
>> +        {
>> +          found = true;
>> +          found1 = true;
>> +        }
>> +    }
>> +    if (!found1 && found)
>> +      {
>> +        found = false;
>> +        FOR_EACH_EDGE (e1, ei, bb1->succs)
>> +        {
>> +          if (e1->flags & (EDGE_DFS_BACK))
>> +            found = true;
>> +        }
>> +
>> +        if (found && EDGE_COUNT(bb1->preds) == 2)
>> +          {
>> +            unsigned int k = 0;
>> +            FOR_EACH_EDGE (e1, ei, bb1->preds)
>> +            {
>> +              if ((e1->flags & (EDGE_DFS_BACK)))
>> +                continue;
>> +
>> +              if ( k == 1)
>> +                {
>> +                  if (single_succ_p(e1->src) &&
>> +                      single_succ_edge (e1->src)->flags & EDGE_FALLTHRU)
>> +                    {
>> +                      pred2 = e1->src;
>> +                    }
>> +                }
>> +                else
>> +                  {
>> +                    if (single_succ_p(e1->src) &&
>> +                        single_succ_edge (e1->src)->flags & EDGE_FALLTHRU)
>> +                      {
>> +                        pred1 = e1->src;
>> +                      }
>> +                  }
>> +                k++;
>> +            }
>> +            bbs.release();
>> +            return bb1;
>> +          }
>> +       }
>> +   }
>> +   bbs.release();
>> +   return NULL;
>> +}
>> +
>> +/* This is the core function to perform path splitting. The join
>> +   same as the source of the loop latch node is identified along
>> +   with their predecessors. Based on the feasibility tests for
>> +   path splitting the path splitting is performed by wiring the
>> +   copy of join blocks into the predecessors and propagating the phi
>> +   result with the corresponding phi arguments into each of the copy
>> +   of join blocks wired with the original predecessors of the join
>> +   block.
>> +
>> +   The  tree-cfg-cleanup will merge the blocks in the predecessors
>> +   path and the update-ssa will update the ssa representation after
>> +   the path splitting is performed.  */
>> +
>> +static void
>> +perform_path_splitting (basic_block bb) {
>> +  basic_block pred1 = NULL, pred2 = NULL, join_block = NULL;
>> +
>> +  join_block = get_join_blk_same_as_loop_latch (bb, pred1, pred2);
>> +
>> +  if (join_block  &&
>> +      is_feasible_path_splitting (join_block, pred1, pred2))
>> +    {
>> +      basic_block new_bb1 = NULL, new_bb2 = NULL;
>> +      gimple_stmt_iterator last;
>> +      basic_block temp_bb = NULL;
>> +      edge_iterator ei;
>> +      edge e1;
>> +
>> +      temp_bb = duplicate_block (join_block, NULL, NULL);
>> +
>> +      FOR_EACH_EDGE (e1, ei, pred1->succs)
>> +        new_bb1 = split_edge (e1);
>> +
>> +      FOR_EACH_EDGE (e1, ei, pred2->succs)
>> +        new_bb2 = split_edge (e1);
>> +
>> +      last = gsi_start_bb (new_bb1);
>> +      gsi_insert_seq_after (&last, bb_seq (join_block), GSI_NEW_STMT);
>> +      last = gsi_start_bb (new_bb2);
>> +      gsi_insert_seq_after (&last, bb_seq (temp_bb), GSI_NEW_STMT);
>> +      update_stmt_bb (new_bb1);
>> +      update_stmt_bb (new_bb2);
>> +
>> +      replace_uses_phi (join_block, new_bb2);
>> +
>> +      set_bb_seq (join_block, NULL);
>> +      set_bb_seq(temp_bb,NULL);
>> +      delete_basic_block (temp_bb);
>> +      return;
>> +    }
>> +}
>> +
>> +static unsigned int
>> +execute_path_split (void)
>> +{
>> +  basic_block bb;
>> +
>> +  loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
>> + initialize_original_copy_tables();
>> +
>> +  calculate_dominance_info (CDI_DOMINATORS);
>> + calculate_dominance_info (CDI_POST_DOMINATORS);
>> +
>> +  mark_dfs_back_edges ();
>> +
>> +  FOR_EACH_BB_FN (bb, cfun)
>> +  {
>> +    gimple last;
>> +
>> +    /* We only care about blocks ending in a COND_EXPR. */
>> +
>> +    last = gsi_stmt (gsi_last_bb (bb));
>> +
>> +    /* We're basically looking for a switch or any kind of conditional with
>> +       integral or pointer type arguments.  Note the type of the second
>> +       argument will be the same as the first argument, so no need to
>> +       check it explicitly.  */
>> +    if ((last && (gimple_code (last) == GIMPLE_COND
>> +            && TREE_CODE (gimple_cond_lhs (last)) == SSA_NAME
>> +            && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (last)))
>> +            || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (last))))
>> +            && (TREE_CODE (gimple_cond_rhs (last)) == SSA_NAME
>> +            || is_gimple_min_invariant (gimple_cond_rhs (last))))))
>> +      {
>> +
>> +         if (gimple_code(last) == GIMPLE_COND)
>> +           {
>> +              perform_path_splitting (bb);
>> +           }
>> +      }
>> +   }
>> +
>> +   loop_optimizer_finalize ();
>> +   free_original_copy_tables ();
>> +   free_dominance_info (CDI_DOMINATORS);
>> +   free_dominance_info (CDI_POST_DOMINATORS);
>> +   return 0;
>> +}
>> +
>> +namespace {
>> +
>> +const pass_data pass_data_path_split = {
>> +   GIMPLE_PASS, /* type */
>> +   "path_split", /* name */
>> +    OPTGROUP_NONE, /* optinfo_flags */
>> +    TV_TREE_PATH_SPLIT, /* tv_id */
>> +    PROP_ssa, /* properties_required */
>> +    0, /* properties_provided */
>> +    0, /* properties_destroyed */
>> +    0, /* todo_flags_start */
>> +    ( TODO_cleanup_cfg | TODO_update_ssa ), /* todo_flags_finish */
>> +};
>> +
>> +class pass_path_split : public gimple_opt_pass {
>> +   public:
>> +    pass_path_split (gcc::context *ctxt)
>> +      : gimple_opt_pass (pass_data_path_split, ctxt)
>> +    {}
>> +
>> +   /* opt_pass methods: */
>> +   opt_pass * clone () { return new pass_path_split (m_ctxt); }
>> +   virtual bool gate (function *) { return flag_tree_path_split != 0; }
>> +   virtual unsigned int execute (function *) { return
>> + execute_path_split (); }
>> +
>> +}; // class pass_path_split
>> +
>> +} // anon namespace
>> +
>> +gimple_opt_pass *
>> +make_pass_path_split (gcc::context *ctxt) {
>> +  return new pass_path_split (ctxt);
>> +}
>> --
>> 1.8.2.1
>>
>> Thanks & Regards
>> Ajit

Reply via email to