On Tue, Dec 10, 2024 at 11:04 AM Feng Xue OS
<[email protected]> wrote:
>
> Thanks, and please see my comments as below:
>
> >> Currently, if could, scev-cprop unconditionally replaces loop closed ssa
> >> with
> >> an expression built from loop initial value and loop niter, which might
> >> cause
> >> redundant code-gen when all interior computations related to IV inside loop
> >> are also neccessary. As example, for the below case:
> >>
> >> p = init_addr;
> >>
> >> for (i = 0; i < N; i++)
> >> {
> >> p++;
> >> *p = ...;
> >> }
> >>
> >> . = p;
> >>
> >> Then scev-cprop would end up with code:
> >>
> >> p = init_addr;
> >>
> >> for (i = 0; i < N; i++)
> >> {
> >> p++;
> >> *p = ...;
> >> }
> >>
> >> . = init_addr + N; // Redundant computation
> >>
> >> For bitmask-manipulation loop, it may result in more and costy
> >> re-evaluation,
> >> such as popcount. To target the issue, we need a means as statement
> >> necessity
> >> propagation used in DCE, to figure out if impacted IVs are really needed.
> >> As
> >> pointed out by Richard, we could wire scev-cprop into DCE, here this patch
> >> makes the thing.
> >>
> >> But one difference is that we consider retaining scev-cprop pass, and
> >> extends
> >> its opt flag to support both this new (-ftree-scev-cprop[=1] by default)
> >> and
> >> the original (-ftree-scev-cprop=2). In reality, I think the new way could
> >> get us more compact and faster code at most occasions, however, it is
> >> possible
> >> the original handling might be better, because replacement could impact
> >> folding
> >> of statements following loop, for example,
> >>
> >> p = init_addr;
> >>
> >> for (i = 0; i < N; i++)
> >> {
> >> p++;
> >> *p = ...;
> >> }
> >>
> >> p1 = p;
> >>
> >> ...
> >>
> >> a = p1 - init_addr; // a = (init_addr + N) - init_addr = N
> >> b = p1 - N; // b = (init_addr + N) - N = init_addr
> >>
> >> It is hard to take this into cost-model consideration, in that global-wide
> >> check on folding opportunities might be time-consuming, and the above case
> >> is
> >> not that common. Therefore, as a backup, we leave the original means still
> >> there, so that give user an ability to enable it when some case matches
> >> with
> >> the scenario.
> >
> > Thanks for working on this. I don't think we want to preserve the "old" way
> > in a conditional way. There's also already pass_cd_dce two passes after
> > the old scev_cprop, so adding another effective DCE pass looks excessive to
> > me.
> >
> > As Honza said in another PR (wrt loop split), passes might invoke final
> > value
> > replacement on select IVs themselves.
> >
> >> Thanks,
> >> Feng
> >> ---
> >> gcc/
> >> PR tree-optimization/90594
> >> * common.opt (ftree-scev-cprop=): New option.
> >> (ftree-scev-cprop): Change it to be alias of ftree-scev-cprop=.
> >
> > So this would become a gate on the DCE functionality.
>
> OK. And I think we need one more control on triggering times of scev-cprop in
> DCE, as I could image, there are three schemes:
>
> 1. just only once, as originally. We need to add another pass parameter
> for DCE,
> and explicitly specifies it in some pass_dce or pass_cd_dce.
Yes. I'd only ever do it for cd_dce since only that pass would be
able to elide a loop
completely.
> 2. only enabled when loop structure is initialized. We could do that by
> relying on
> existing check of "scev_initialized_p"
If you do it from the CD-DCE instance next to SCCP currently that only
gets run when
loops are (were) there.
> 3. every time when DCE is called. This would lead to compile-time increase
> since
> as a basic utility pass, DCE is called quite a few times.
>
> >
> > Some more comments inline - note I think this should now wait for next
> > stage1,
>
> stage1 of gcc-16?
Yes.
> > but the refactored API could be used for example to solve the
>
> It is OK to check-in the first patch in this release?
Yes.
> > loop-split issue.
> >
> >> * tree-scalar-evolution.cc (simple_scev_final_value): Make it be
> >> global function.
> >> (apply_scev_final_value_replacement): Likewise.
> >> * tree-scalar-evolution.h (scev_const_prop): Remove declaration.
> >> (simple_scev_final_value): Add new declaration.
> >> (apply_scev_final_value_replacement): Likewise.
> >> * tree-ssa-dce.cc (stmt_stats): Add new field sccp_replaced_phis.
> >> (scev_cprop_entry): New struct.
> >> (scev_cprop_level): New static variable.
> >> (scev_cprop_map): Likewise.
> >> (mark_expr_operand_necessary): New function.
> >> (get_loop_closed_phi_scev_replacement): Likewise.
> >> (propagate_necessity): Change neccssity propagation for loop closed
> >> phi when scev-cpropr is enabled.
> >> (fold_scev_cprop_entry): New function.
> >> (remove_dead_phis): Rename to replace_or_remove_phis. And do scev
> >> final value replacement for loop closed phi.
> >> (eliminate_unnecessary_stmts): Changed to call
> >> replace_or_remove_phis.
> >> (print_stats): Print stats for replaced phi.
> >> (tree_dce_init): Initialize scev_cprop_map.
> >> (tree_dce_done): Delete scev_cprop_map.
> >> (perform_tree_ssa_dce): Make it be global function. Add scev-cprop
> >> specific handling.
> >> * tree-ssa-dce.h (perform_tree_ssa_dce): Add new declaration.
> >> * tree-ssa-loop.cc (pass_scev_cprop::execute): Changed to call
> >> perform_tree_ssa_dce.
> >> ---
> >> gcc/common.opt | 9 +-
> >> gcc/tree-scalar-evolution.cc | 6 +-
> >> gcc/tree-scalar-evolution.h | 3 +-
> >> gcc/tree-ssa-dce.cc | 251 ++++++++++++++++++++++++++++++++---
> >> gcc/tree-ssa-dce.h | 1 +
> >> gcc/tree-ssa-loop.cc | 8 +-
> >> 6 files changed, 249 insertions(+), 29 deletions(-)
> >>
> >> diff --git a/gcc/common.opt b/gcc/common.opt
> >> index a42537c5f1e..98210ed72fd 100644
> >> --- a/gcc/common.opt
> >> +++ b/gcc/common.opt
> >> @@ -3425,8 +3425,15 @@ ftree-vect-loop-version
> >> Common Ignore
> >> Does nothing. Preserved for backward compatibility.
> >>
> >> +; If this option is 1, only perform scev cprop when all statements to
> >> evaluate
> >> +; related IV inside loop could be eliminated, if it is 2, perform scev
> >> cprop
> >> +; unconditionally.
> >> +ftree-scev-cprop=
> >> +Common Joined RejectNegative UInteger Var(flag_tree_scev_cprop) Init(1)
> >> Optimization IntegerRange(0, 2)
> >> +Enable copy propagation of scalar-evolution information.
> >> +
> >> ftree-scev-cprop
> >> -Common Var(flag_tree_scev_cprop) Init(1) Optimization
> >> +Common Alias(ftree-scev-cprop=,1,0)
> >> Enable copy propagation of scalar-evolution information.
> >>
> >> ftrivial-auto-var-init=
> >> diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> >> index 5733632aa78..9e51b18b237 100644
> >> --- a/gcc/tree-scalar-evolution.cc
> >> +++ b/gcc/tree-scalar-evolution.cc
> >> @@ -3381,7 +3381,7 @@ scev_finalize (void)
> >> }
> >>
> >> /* Returns true if the expression EXPR is considered to be too expensive
> >> - for scev_const_prop. Sets *COND_OVERFLOW_P to true when the
> >> + for scev const prop. Sets *COND_OVERFLOW_P to true when the
> >> expression might contain a sub-expression that is subject to undefined
> >> overflow behavior and conditionally evaluated. */
> >>
> >> @@ -3782,7 +3782,7 @@ analyze_and_compute_bitop_with_inv_effect (class
> >> loop* loop, tree phidef,
> >> final value to avoid overflow UB when replacement would really happen
> >> later. */
> >>
> >> -static bool
> >> +bool
> >> simple_scev_final_value (class loop *loop, tree value, tree *final_value,
> >> bool *rewrite_overflow)
> >
> > It might be useful to externalize the cost decision - for example if
> > we can elide
> > the full loop we might be more forgiving. Or if analysis merely wants to
> > use
> > the final value for analysis purposes rather than emit it, it might
> > not care for costs
> > at all.
> >
> >> {
> >> @@ -3877,7 +3877,7 @@ simple_scev_final_value (class loop *loop, tree
> >> value, tree *final_value,
> >> have to leave an unused copy assignment around, if so, ALWAYS_KEEP is
> >> set
> >> to true. */
> >>
> >> -static void
> >> +void
> >> apply_scev_final_value_replacement (gphi *phi, tree final_value,
> >> bool rewrite_overflow, bool
> >> always_keep)
> >> {
> >> diff --git a/gcc/tree-scalar-evolution.h b/gcc/tree-scalar-evolution.h
> >> index 41ba6b237b5..4510e0c52e9 100644
> >> --- a/gcc/tree-scalar-evolution.h
> >> +++ b/gcc/tree-scalar-evolution.h
> >> @@ -35,7 +35,8 @@ extern tree instantiate_scev (edge, class loop *, tree);
> >> extern tree resolve_mixers (class loop *, tree, bool *);
> >> extern void gather_stats_on_scev_database (void);
> >> extern bool final_value_replacement_loop (class loop *);
> >> -extern unsigned int scev_const_prop (void);
> >> +extern bool simple_scev_final_value (class loop *, tree, tree *, bool *);
> >> +extern void apply_scev_final_value_replacement (gphi *, tree, bool, bool);
> >> extern bool expression_expensive_p (tree, bool *);
> >> extern bool simple_iv_with_niters (class loop *, class loop *, tree,
> >> struct affine_iv *, tree *, bool);
> >> diff --git a/gcc/tree-ssa-dce.cc b/gcc/tree-ssa-dce.cc
> >> index ad3ac2785cf..5a667505dfb 100644
> >> --- a/gcc/tree-ssa-dce.cc
> >> +++ b/gcc/tree-ssa-dce.cc
> >> @@ -59,8 +59,10 @@ along with GCC; see the file COPYING3. If not see
> >> #include "tree-eh.h"
> >> #include "gimplify.h"
> >> #include "gimple-iterator.h"
> >> +#include "gimplify-me.h"
> >> #include "tree-cfg.h"
> >> #include "tree-ssa-loop-niter.h"
> >> +#include "tree-ssa-loop-manip.h"
> >> #include "tree-into-ssa.h"
> >> #include "tree-dfa.h"
> >> #include "cfgloop.h"
> >> @@ -77,8 +79,16 @@ static struct stmt_stats
> >> int total_phis;
> >> int removed;
> >> int removed_phis;
> >> + int sccp_replaced_phis;
> >> } stats;
> >>
> >> +struct scev_cprop_entry
> >> +{
> >> + tree value;
> >> + bool rewrite_overflow;
> >> + bool visited;
> >> +};
> >> +
> >> #define STMT_NECESSARY GF_PLF_1
> >>
> >> static vec<gimple *> worklist;
> >> @@ -94,6 +104,14 @@ static sbitmap last_stmt_necessary;
> >> /* Vector indicating that BB contains statements that are live. */
> >> static sbitmap bb_contains_live_stmts;
> >>
> >> +/* Control level of scev cprop. */
> >> +static int scev_cprop_level;
> >> +
> >> +/* For a loop closed PHI, if the induction variable at loop exit could be
> >> + calculated using initial value and loop niter, we would add a map
> >> between
> >> + definition of the PHI and the induction final value. */
> >> +static hash_map<tree, scev_cprop_entry> *scev_cprop_map;
> >> +
> >> /* Before we can determine whether a control branch is dead, we need to
> >> compute which blocks are control dependent on which edges.
> >>
> >> @@ -241,6 +259,18 @@ mark_operand_necessary (tree op)
> >> worklist.safe_push (stmt);
> >> }
> >>
> >> +/* Mark operand stored in TP as necessary if it is a ssa name. */
> >> +
> >> +tree
> >> +mark_expr_operand_necessary (tree *tp, int *walk_subtrees
> >> ATTRIBUTE_UNUSED,
> >> + void *data ATTRIBUTE_UNUSED)
> >> +{
> >> + if (TREE_CODE (*tp) == SSA_NAME)
> >> + mark_operand_necessary (*tp);
> >> +
> >> + return NULL_TREE;
> >> +}
> >> +
> >> /* Return true if STMT is a call to allocation function that can be
> >> optimized out if the memory block is never used for anything else
> >> then NULL pointer check or free.
> >> @@ -765,6 +795,49 @@ valid_new_delete_pair_p (gimple *new_call, gimple
> >> *delete_call)
> >> return valid_new_delete_pair_p (new_asm, delete_asm);
> >> }
> >>
> >> +/* Given a PHI, check if it is a loop closed PHI, and related induction
> >> + variable has a simple final value that could be directly calculated
> >> + with its initial value and loop niter. If satisfied, insert a new
> >> + entry to describe this when ADD_NEW is true, or return the existing
> >> + matched entry when ADD_NEW is false, otherwise, return NULL. */
> >> +
> >> +static scev_cprop_entry *
> >> +get_loop_closed_phi_scev_replacement (gphi *phi, bool add_new)
> >> +{
> >> + /* Do nothing if scep prop is not enabled or this is definitely
> >> + not a loop closed PHI. */
> >> + if (scev_cprop_level < 1 || gimple_phi_num_args (phi) != 1)
> >
> > Note LC PHIs can have multiple preds - but indeed it gets difficult
>
> Do you mean that a LC PHI may contain def that is not in a loop, as
>
> v = PHI (v1<In loop>, v2<not in loop>);
Yes, or two defs from two different loops.
> > here so I'd only change the comment. All PHIs on loop exits are
> > LC PHI nodes [if their arg is defined in the loop].
> >
> > So ...
> >
> >> + return NULL;
> >> +
> >> + tree result = gimple_phi_result (phi);
> >> +
> >> + if (!add_new)
> >> + return scev_cprop_map->get (result);
> >> +
> >> + edge exit = single_pred_edge (gimple_bb (phi));
> >> + tree arg = gimple_phi_arg_def (phi, 0);
> >> +
> >> + if (TREE_CODE (arg) != SSA_NAME)
> >> + return NULL;
> >> +
> >> + class loop *loop = exit->src->loop_father;
> >> + scev_cprop_entry entry = {};
> >> + bool existed;
> >> +
> >> + if (!simple_scev_final_value (loop, arg, &entry.value,
> >> + &entry.rewrite_overflow))
> >> + return NULL;
> >> +
> >> + auto &new_entry = scev_cprop_map->get_or_insert (result, &existed);
> >> +
> >> + /* Based on the way of DCE propagation, an expression would not be
> >> handled
> >> + more than once. */
> >> + gcc_assert (!existed);
> >> + new_entry = entry;
> >> +
> >> + return &new_entry;
> >> +}
> >> +
> >> /* Propagate necessity using the operands of necessary statements.
> >> Process the uses on each statement in the worklist, and add all
> >> feeding statements which contribute to the calculation of this
> >> @@ -817,12 +890,33 @@ propagate_necessity (bool aggressive)
> >> gphi *phi = as_a <gphi *> (stmt);
> >> size_t k;
> >>
> >> - for (k = 0; k < gimple_phi_num_args (stmt); k++)
> >> - {
> >> - tree arg = PHI_ARG_DEF (stmt, k);
> >> - if (TREE_CODE (arg) == SSA_NAME)
> >> - mark_operand_necessary (arg);
> >> - }
> >> + /* When scev cprop is enabled, computation of induction variable
> >> + might not be really needed, if its final value at loop exit
> >> could
> >> + be directly calculated using initial value and loop niter, and
> >> + any interior evaluation around the induction variable does not
> >> + participate in other necessary statements in the loop. So we
> >> only
> >> + propagate necessity through ssa operands in the final value.
> >> + An example:
> >> +
> >> + v = init;
> >> +
> >> + for (i = 0; i < N; i++)
> >> + v += step;
> >> +
> >> + . = v;
> >> +
> >> + The loop could be completely removed, and replaced with a
> >> simple
> >> + evaluation as: v = init + step * N, therefore, only "init",
> >> "step"
> >> + and "N" are actually necessary. */
> >> + if (auto *entry = get_loop_closed_phi_scev_replacement (phi,
> >> true))
> >> + walk_tree (&entry->value, mark_expr_operand_necessary, NULL,
> >> NULL);
> >> + else
> >> + for (k = 0; k < gimple_phi_num_args (stmt); k++)
> >> + {
> >
> > ... I'd say you want to do get_loop_closed_phi_scev_replacement iff
> > PHI_ARG_EDGE is a loop exit edge only. Having not all edges
> > loop exits or being replaceable might turn out not easily supported
> > (you'd need to split the edge, re-creating unrelated LC PHIs on the
> > original loop exit), so only doing it for single_pred blocks might make
> > sense for simplicity.
>
> In get_loop_closed_phi_scev_replacement, I requires gimple_phi_num_args(phi)
> must be 1,
> which is equivalent to single_pred block.
Yes, that's because "code generation" only can deal with that case
since it emits code
in the block of the PHI rather than on the exit edge.
I still think that get_loop_closed_phi_scev_replacement should be
engineered in a less
restrictive way and the additional limitation imposed by the caller.
> >
> >> + tree arg = PHI_ARG_DEF (stmt, k);
> >> + if (TREE_CODE (arg) == SSA_NAME)
> >> + mark_operand_necessary (arg);
> >> + }
> >>
> >> /* For PHI operands it matters from where the control flow
> >> arrives
> >> to the BB. Consider the following example:
> >> @@ -1104,10 +1198,80 @@ propagate_necessity (bool aggressive)
> >> }
> >> }
> >>
> >> -/* Remove dead PHI nodes from block BB. */
> >> +/* Try to fold the final value of ENTRY to a constant or ssa name. Since
> >> + one entry may refer ssa name that is described by other entry, so this
> >> + function would recursively process folding along the def/use relation.
> >> + If the processed value does not belong to any scev_cprop entry, it is
> >> + stored in VALUE_PTR, and ENTRY is set to NULL. Return true if the
> >> value
> >> + does be simplified. */
> >>
> >> static bool
> >> -remove_dead_phis (basic_block bb)
> >> +fold_scev_cprop_entry (scev_cprop_entry *entry, tree *value_ptr = NULL)
> >> +{
> >> + if (entry)
> >> + {
> >> + if (entry->visited || CONSTANT_CLASS_P (entry->value))
> >> + return true;
> >> +
> >
> > Hmm, how does this work? You are using pointer equality here so
> > rely on the SCEV final value of two PHIs that are dependent on each
> > other to have pointer-equivalent sub-expressions? The final value
> > of a PHI does obviously not refer to the PHI result of another LC PHI
> > here?
>
> See this case:
>
> int foo()
> {
> int i, j, m, n;
>
> m = 0;
> for (i = 0; i < 1000; i++)
> m += 3;
>
> n = 0;
> for (j = 0; j < 2000; j++)
> n += m;
>
> return n;
> }
>
> The final value of LC PHI that defines "n" would refer to result of LC PHI
> that
> defines "m".
>
> When this function tries to fold the final value of "n", it will
> recursively fold
> that of "m", finally we could get a constant for "n".
Ah, I see. So this requires more comments I think. I thought this was for
for (j = 0; j < m; j++)
n = j * 200;
return n + j;
where the final value of n is 200 * m and the final value of j is m
(now imagine more complex 'm'), and the re-use be used to make
the replacement cheaper when you need to preserve multiple
related final values.
> >> + entry->visited = true;
> >> + entry->value = unshare_expr (entry->value);
> >> + value_ptr = &entry->value;
> >> + }
> >> +
> >> + tree value = *value_ptr;
> >> +
> >> + if (TREE_CODE (value) == SSA_NAME)
> >> + {
> >> + scev_cprop_entry *def_entry;
> >> +
> >> + if (!SSA_NAME_OCCURS_IN_ABNORMAL_PHI (value)
> >> + && (def_entry = scev_cprop_map->get (value)))
> >> + {
> >> + fold_scev_cprop_entry (def_entry);
> >> +
> >> + if (CONSTANT_CLASS_P (def_entry->value) ||
> >
> > ||s go to the next line.
> >
> >> + TREE_CODE (def_entry->value) == SSA_NAME)
> >> + {
> >> + *value_ptr = def_entry->value;
> >> + return true;
> >> + }
> >> + }
> >> +
> >> + return false;
> >> + }
> >> +
> >> + bool changed = false;
> >> +
> >> + if (EXPR_P (value))
> >> + {
> >> + for (int i = 0; i < TREE_OPERAND_LENGTH (value); i++)
> >> + {
> >> + tree &opnd = TREE_OPERAND (value, i);
> >> +
> >> + if (opnd && !CONSTANT_CLASS_P (opnd))
> >> + changed |= fold_scev_cprop_entry (NULL, &opnd);
> >> + }
> >> + }
> >> +
> >> + if (changed)
> >> + {
> >> + /* Only fold the value if any of its operand has been folded. */
> >> + tree new_value = fold (value);
> >> +
> >> + if (new_value == value)
> >> + changed = false;
> >> + else
> >> + *value_ptr = new_value;
> >> + }
> >> +
> >> + return changed;
> >> +}
> >> +
> >> +/* Remove dead PHI nodes from block BB, and try to replace loop closed
> >> PHIs
> >> + with scev final values. */
> >> +
> >> +static bool
> >> +replace_or_remove_phis (basic_block bb)
> >> {
> >> bool something_changed = false;
> >> gphi *phi;
> >> @@ -1158,6 +1322,38 @@ remove_dead_phis (basic_block bb)
> >> stats.removed_phis++;
> >> continue;
> >> }
> >> + else if (auto *entry = get_loop_closed_phi_scev_replacement (phi,
> >> false))
> >> + {
> >> + tree arg = gimple_phi_arg_def (phi, 0);
> >> +
> >> + fold_scev_cprop_entry (entry);
> >> +
> >> + if (scev_cprop_level > 1
> >> + || !bitmap_bit_p (processed, SSA_NAME_VERSION (arg))
> >> + || CONSTANT_CLASS_P (entry->value)
> >> + || TREE_CODE (entry->value) == SSA_NAME)
> >> + {
> >> + something_changed = true;
> >> + stats.sccp_replaced_phis++;
> >> + if (dump_file && (dump_flags & TDF_DETAILS))
> >> + {
> >> + fprintf (dump_file, "Replacing : ");
> >> + print_gimple_stmt (dump_file, phi, 0, TDF_SLIM);
> >> + fprintf (dump_file, " with : ");
> >> + print_generic_expr (dump_file, gimple_phi_result (phi));
> >> + fprintf (dump_file, " = ");
> >> + print_generic_expr (dump_file, entry->value);
> >> + fprintf (dump_file, ";\n\n");
> >> + }
> >> +
> >> + /* Advance iterator before the PHI is removed. */
> >> + gsi_next (&gsi);
> >> + apply_scev_final_value_replacement (phi, entry->value,
> >> + entry->rewrite_overflow,
> >> + false);
> >> + continue;
> >> + }
> >> + }
> >>
> >> gsi_next (&gsi);
> >> }
> >> @@ -1611,7 +1807,7 @@ eliminate_unnecessary_stmts (bool aggressive)
> >> }
> >>
> >> /* Remove dead PHI nodes. */
> >> - something_changed |= remove_dead_phis (bb);
> >> + something_changed |= replace_or_remove_phis (bb);
> >> }
> >>
> >> /* First remove queued edges. */
> >> @@ -1755,6 +1951,11 @@ print_stats (void)
> >>
> >> fprintf (dump_file, "Removed %d of %d PHI nodes (%d%%)\n",
> >> stats.removed_phis, stats.total_phis, (int) percg);
> >> +
> >> + if (stats.sccp_replaced_phis)
> >> + fprintf (dump_file, "Replaced %d PHI node%s by SCCP\n",
> >> + stats.sccp_replaced_phis,
> >> + stats.sccp_replaced_phis > 1 ? "s" : "");
> >> }
> >>
> >> /* Initialization for this pass. Set up the used data structures. */
> >> @@ -1775,6 +1976,11 @@ tree_dce_init (bool aggressive)
> >> processed = sbitmap_alloc (num_ssa_names + 1);
> >> bitmap_clear (processed);
> >>
> >> + if (scev_cprop_level > 0)
> >> + scev_cprop_map = new hash_map<tree, scev_cprop_entry>();
> >> + else
> >> + scev_cprop_map = NULL;
> >> +
> >> worklist.create (64);
> >> cfg_altered = false;
> >> }
> >> @@ -1795,6 +2001,9 @@ tree_dce_done (bool aggressive)
> >>
> >> sbitmap_free (processed);
> >>
> >> + if (scev_cprop_map)
> >> + delete scev_cprop_map;
> >> +
> >> worklist.release ();
> >> }
> >>
> >> @@ -2005,23 +2214,29 @@ make_forwarders_with_degenerate_phis (function *fn)
> >> In aggressive mode, control dependences are taken into account, which
> >> results in more dead code elimination, but at the cost of some time.
> >> */
> >>
> >> -static unsigned int
> >> -perform_tree_ssa_dce (bool aggressive)
> >> +unsigned int
> >> +perform_tree_ssa_dce (bool aggressive, int scev_cprop = 0)
> >> {
> >> bool something_changed = 0;
> >> unsigned todo = 0;
> >> + bool need_loop = aggressive || scev_cprop > 0;
> >>
> >> /* Preheaders are needed for SCEV to work.
> >> Simple lateches and recorded exits improve chances that loop will
> >> proved to be finite in testcases such as in loop-15.c and loop-24.c
> >> */
> >> bool in_loop_pipeline = scev_initialized_p ();
> >> - if (aggressive && ! in_loop_pipeline)
> >> + if (need_loop && ! in_loop_pipeline)
> >> {
> >> loop_optimizer_init (LOOPS_NORMAL
> >> | LOOPS_HAVE_RECORDED_EXITS);
> >
> > add LOOP_CLOSED_SSA here iff scev_cprop, in_loop_pipeline ensures
> > we're in LC SSA.
> >
>
> Is it possible that in_loop_pipeline is true while LOOP_CLOSED_SSA form of
> a loop is broken? If this happen, it is a bug?
Yes, currently we only have SCEV initialized during the loop pipeline and
there LOOP_CLOSED_SSA is active and upon entry and exit to a pass
in the loop pipeline the form should be valid.
> >> scev_initialize ();
> >> }
> >>
> >> + scev_cprop_level = scev_cprop;
> >> +
> >> + if (scev_cprop > 0 && !loops_state_satisfies_p (LOOP_CLOSED_SSA))
> >> + rewrite_into_loop_closed_ssa (NULL, 0);
> >> +
> >> if (aggressive)
> >> todo |= make_forwarders_with_degenerate_phis (cfun);
> >>
> >> @@ -2044,12 +2259,6 @@ perform_tree_ssa_dce (bool aggressive)
> >>
> >> find_obviously_necessary_stmts (aggressive);
> >>
> >> - if (aggressive && ! in_loop_pipeline)
> >> - {
> >> - scev_finalize ();
> >> - loop_optimizer_finalize ();
> >> - }
> >> -
> >> longest_chain = 0;
> >> total_chain = 0;
> >> nr_walks = 0;
> >> @@ -2058,6 +2267,12 @@ perform_tree_ssa_dce (bool aggressive)
> >> propagate_necessity (aggressive);
> >> BITMAP_FREE (visited);
> >>
> >> + if (need_loop && ! in_loop_pipeline)
> >> + {
> >> + scev_finalize ();
> >> + loop_optimizer_finalize ();
> >> + }
> >> +
> >> something_changed |= eliminate_unnecessary_stmts (aggressive);
> >> something_changed |= cfg_altered;
> >>
> >> diff --git a/gcc/tree-ssa-dce.h b/gcc/tree-ssa-dce.h
> >> index b0e92a58ea8..ef5b77ce36a 100644
> >> --- a/gcc/tree-ssa-dce.h
> >> +++ b/gcc/tree-ssa-dce.h
> >> @@ -18,5 +18,6 @@ along with GCC; see the file COPYING3. If not see
> >>
> >> #ifndef TREE_SSA_DCE_H
> >> #define TREE_SSA_DCE_H
> >> +extern unsigned perform_tree_ssa_dce (bool, int = 0);
> >> extern void simple_dce_from_worklist (bitmap, bitmap = nullptr);
> >> #endif
> >> diff --git a/gcc/tree-ssa-loop.cc b/gcc/tree-ssa-loop.cc
> >> index 1d9afd98015..04fd58c0dc5 100644
> >> --- a/gcc/tree-ssa-loop.cc
> >> +++ b/gcc/tree-ssa-loop.cc
> >> @@ -32,6 +32,7 @@ along with GCC; see the file COPYING3. If not see
> >> #include "tree-ssa-loop-manip.h"
> >> #include "tree-ssa-loop-niter.h"
> >> #include "tree-ssa-loop.h"
> >> +#include "tree-ssa-dce.h"
> >> #include "cfgloop.h"
> >> #include "tree-inline.h"
> >> #include "tree-scalar-evolution.h"
> >> @@ -402,14 +403,9 @@ public:
> >> unsigned
> >> pass_scev_cprop::execute (function *)
> >> {
> >> - bool any = false;
> >> -
> >> /* Perform final value replacement in loops, in case the replacement
> >> expressions are cheap. */
> >> - for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
> >> - any |= final_value_replacement_loop (loop);
> >> -
> >> - return any ? TODO_cleanup_cfg | TODO_update_ssa_only_virtuals : 0;
> >> + return perform_tree_ssa_dce (false, flag_tree_scev_cprop);
> >> }
> >>
> >> } // anon namespace
> >> --
> >> 2.17.1
> >