On Wed, May 3, 2017 at 6:32 PM, Jeff Law <l...@redhat.com> wrote:
> [ With the patch attached... ]
>
>
> On 05/03/2017 10:31 AM, Jeff Law wrote:
>>
>> This is the first of 3-5 patches to address pr78496.
>>
>> The goal of these patches is to catch jump threads earlier in the pipeline
>> to avoid undesirable behavior in PRE and more generally be able to exploit
>> the secondary opportunities exposed by jump threading.
>>
>> One of the more serious issues I found while investigating 78496 was VRP
>> failing to find what should have been obvious jump threads.  The fundamental
>> issue is VRP will simplify conditionals which are fed by a typecast prior to
>> jump threading.   So something like this:
>>
>> x = (typecast) y;
>> if (x == 42)
>>
>> Can often be transformed into:
>>
>> if (y == 42)
>>
>>
>> The problem is any ASSERT_EXPRS after the conditional will reference "x"
>> rather than "y".  That in turn makes it impossible for VRP to use those
>> ASSERT_EXPRs to thread later jumps that use x == <whatever>
>>
>>
>> More concretely consider this gimple code:
>>
>>
>> ;;   basic block 5, loop depth 0, count 0, freq 10000, maybe hot
>> ;;    prev block 4, next block 12, flags: (NEW, REACHABLE, VISITED)
>> ;;    pred:       3 [50.0%]  (TRUE_VALUE,EXECUTABLE)
>> ;;                4 [100.0%]  (FALLTHRU,EXECUTABLE)
>>    # iftmp.0_2 = PHI <1(3), 0(4)>
>>    in_loop_7 = (unsigned char) iftmp.0_2;
>>    if (in_loop_7 != 0)
>>      goto <bb 6>; [33.00%]
>>    else
>>      goto <bb 12>; [67.00%]
>>
>> ;;    succ:       6 [33.0%]  (TRUE_VALUE,EXECUTABLE)
>> ;;                12 [67.0%]  (FALSE_VALUE,EXECUTABLE)
>>
>> ;;   basic block 12, loop depth 0, count 0, freq 6700, maybe hot
>> ;;    prev block 5, next block 6, flags: (NEW)
>> ;;    pred:       5 [67.0%]  (FALSE_VALUE,EXECUTABLE)
>>    in_loop_15 = ASSERT_EXPR <in_loop_7, in_loop_7 == 0>;
>>    goto <bb 7>; [100.00%]
>> ;;    succ:       7 [100.0%]  (FALLTHRU)
>>
>> ;;   basic block 6, loop depth 0, count 0, freq 3300, maybe hot
>> ;;    prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED)
>> ;;    pred:       5 [33.0%]  (TRUE_VALUE,EXECUTABLE)
>>    in_loop_14 = ASSERT_EXPR <in_loop_7, in_loop_7 != 0>;
>>    simple_iv ();
>> ;;    succ:       7 [100.0%]  (FALLTHRU,EXECUTABLE)
>>
>> And later we have:
>>
>> ;;   basic block 9, loop depth 0, count 0, freq 8476, maybe hot
>> ;;    prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED)
>> ;;    pred:       7 [84.8%]  (FALSE_VALUE,EXECUTABLE)
>>    if (in_loop_7 == 0)
>>      goto <bb 10>; [36.64%]
>>    else
>>      goto <bb 11>; [63.36%]
>>
>> VRP knows it can replace the uses of in_loop_7 in the conditionals in
>> blocks 5 and 9 with iftmp.0_2 and happily does so *before* jump threading
>> (but well after ASSERT_EXPR insertion).
>>
>> As a result VRP is unable to utilize the ASSERT_EXPRs in blocks 12 and 6
>> (which reference in_loop_7) to thread the jump at bb9 (which now references
>> iftmp.0_2).
>>
>>
>> The cases in pr78496 are slightly more complex, but boil down to the same
>> core issue -- simplifying the conditional too early.
>>
>> Thankfully this is easy to fix.  We just split the conditional
>> simplification into two steps so that the transformation noted above occurs
>> after jump threading (the other simplifications we want to occur before jump
>> threading).
>>
>> This allows VRP1 to pick up 27 missed jump threads in the testcase from
>> 78496.  It could well be enough to address 78496, but since we don't have a
>> solid description of the desired end result I won't consider 78496 fixed
>> quite yet as there's significant further improvements we can make.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu.  Installing on the
>> trunk.

I think this is a hack ;)  Basically the issue is that jump-threading
uses ASSERT_EXPRs
at all (which are an implementation detail of VRP).  As far as I
understand it does that
because VRP can do "fancy" things and create ASSERT_EXPRs that do not directly
map to the conditional but to its operand def stmts.

I have meanwhile factored this "fancieness" out into (ok, bad name...)
register_edge_assert_for which records all these fancy asserts in a
vec.  This is
now used from EVRP:

      gimple *stmt = last_stmt (pred_e->src);
      if (stmt
          && gimple_code (stmt) == GIMPLE_COND
          && (op0 = gimple_cond_lhs (stmt))
          && TREE_CODE (op0) == SSA_NAME
          && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt)))
              || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt)))))
        {
          if (dump_file && (dump_flags & TDF_DETAILS))
            {
              fprintf (dump_file, "Visiting controlling predicate ");
              print_gimple_stmt (dump_file, stmt, 0, 0);
            }
          /* Entering a new scope.  Try to see if we can find a VR
             here.  */
          tree op1 = gimple_cond_rhs (stmt);
          if (TREE_OVERFLOW_P (op1))
            op1 = drop_tree_overflow (op1);
          tree_code code = gimple_cond_code (stmt);

          auto_vec<assert_info, 8> asserts;
          register_edge_assert_for (op0, pred_e, code, op0, op1, asserts);
          if (TREE_CODE (op1) == SSA_NAME)
            register_edge_assert_for (op1, pred_e, code, op0, op1, asserts);

regular VRP transforms those into its own assert representation in
finish_register_edge_assert_for (though assert_info should be enough
for jump threading).

So I think it should be possible for jump threading to use this new
machinery (and even DOM based jump threading or backwards threading
could make use of this!)

Richard.

>>
>> Jeff
>>
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 92a4e395ba8..32cc81978df 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,14 @@
> +2017-05-03  Jeff Law  <l...@redhat.com>
> +
> +       PR tree-optimization/78496
> +       * tree-vrp.c (simplify_cond_using_ranges_1): Renamed
> +       from simplify_cond_using_ranges.  Split off code to walk
> +       backwards through casts into ...
> +       (simplify_cond_using_ranges_2): New function.
> +       (simplify_stmt_using_ranges): Call simplify_cond_using_ranges_1.
> +       (execute_vrp): After identifying jump threads, call
> +       simplify_cond_using_ranges_2.
> +
>  2017-05-03  Jeff Downs  <heydo...@somuchpressure.net>
>             Rainer Orth  <r...@cebitec.uni-bielefeld.de>
>
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 596468d33e9..55a44e4635a 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2017-05-03  Jeff Law  <l...@redhat.com>
> +
> +       PR tree-optimization/78496
> +       * gcc.dg/tree-ssa/ssa-thread-15.c: New test.
> +
>  2017-05-03  Uros Bizjak  <ubiz...@gmail.com>
>
>         * g++.dg/lto/pr79671_0.C (foo): Fix asm constraints.
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c
> new file mode 100644
> index 00000000000..f73268cacbb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c
> @@ -0,0 +1,51 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-vrp1" } */
> +
> +/* We should thread the if (!in_loop) completely leaving
> +   just two conditionals.  */
> +/* { dg-final { scan-tree-dump-times "if \\(" 2 "vrp1" } } */
> +
> +
> +union tree_node;
> +typedef union tree_node *tree;
> +
> +enum size_type_kind
> +{
> +  SIZETYPE,
> +  SSIZETYPE,
> +  BITSIZETYPE,
> +  SBITSIZETYPE,
> +  TYPE_KIND_LAST
> +};
> +extern tree size_int_kind (long, enum size_type_kind);
> +
> +
> +
> +typedef struct
> +{
> +
> +  tree base, step;
> +
> +} affine_iv;
> +
> +struct loop
> +{
> +
> +  int num;
> +};
> +extern unsigned char simple_iv ();
> +
> +unsigned char
> +dr_analyze_innermost (struct loop *loop, tree poffset)
> +{
> +  affine_iv offset_iv;
> +  unsigned char in_loop = (loop && loop->num);
> +
> +
> +  if (in_loop)
> +    simple_iv ();
> +
> +  if (!in_loop)
> +    offset_iv.step = size_int_kind (0, SSIZETYPE);
> +
> +}
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index df5e4267491..6a4035dc924 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -9977,7 +9977,7 @@ range_fits_type_p (value_range *vr, unsigned
> dest_precision, signop dest_sgn)
>     the original conditional.  */
>
>  static bool
> -simplify_cond_using_ranges (gcond *stmt)
> +simplify_cond_using_ranges_1 (gcond *stmt)
>  {
>    tree op0 = gimple_cond_lhs (stmt);
>    tree op1 = gimple_cond_rhs (stmt);
> @@ -10080,6 +10080,22 @@ simplify_cond_using_ranges (gcond *stmt)
>             }
>         }
>      }
> +  return false;
> +}
> +
> +/* STMT is a conditional at the end of a basic block.
> +
> +   If the conditional is of the form SSA_NAME op constant and the SSA_NAME
> +   was set via a type conversion, try to replace the SSA_NAME with the RHS
> +   of the type conversion.  Doing so makes the conversion dead which helps
> +   subsequent passes.  */
> +
> +static void
> +simplify_cond_using_ranges_2 (gcond *stmt)
> +{
> +  tree op0 = gimple_cond_lhs (stmt);
> +  tree op1 = gimple_cond_rhs (stmt);
> +  enum tree_code cond_code = gimple_cond_code (stmt);
>
>    /* If we have a comparison of an SSA_NAME (OP0) against a constant,
>       see if OP0 was set by a type conversion where the source of
> @@ -10097,7 +10113,7 @@ simplify_cond_using_ranges (gcond *stmt)
>
>        if (!is_gimple_assign (def_stmt)
>           || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt)))
> -       return false;
> +       return;
>
>        innerop = gimple_assign_rhs1 (def_stmt);
>
> @@ -10142,12 +10158,16 @@ simplify_cond_using_ranges (gcond *stmt)
>               tree newconst = fold_convert (TREE_TYPE (innerop), op1);
>               gimple_cond_set_lhs (stmt, innerop);
>               gimple_cond_set_rhs (stmt, newconst);
> -             return true;
> +             update_stmt (stmt);
> +             if (dump_file && (dump_flags & TDF_DETAILS))
> +               {
> +                 fprintf (dump_file, "Folded into: ");
> +                 print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> +                 fprintf (dump_file, "\n");
> +               }
>             }
>         }
>      }
> -
> -  return false;
>  }
>
>  /* Simplify a switch statement using the value range of the switch
> @@ -10746,7 +10766,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator
> *gsi)
>         }
>      }
>    else if (gimple_code (stmt) == GIMPLE_COND)
> -    return simplify_cond_using_ranges (as_a <gcond *> (stmt));
> +    return simplify_cond_using_ranges_1 (as_a <gcond *> (stmt));
>    else if (gimple_code (stmt) == GIMPLE_SWITCH)
>      return simplify_switch_using_ranges (as_a <gswitch *> (stmt));
>    else if (is_gimple_call (stmt)
> @@ -11759,6 +11779,21 @@ execute_vrp (bool warn_array_bounds_p)
>       the datastructures built by VRP.  */
>    identify_jump_threads ();
>
> +  /* A comparison of an SSA_NAME against a constant where the SSA_NAME
> +     was set by a type conversion can often be rewritten to use the
> +     RHS of the type conversion.
> +
> +     However, doing so inhibits jump threading through the comparison.
> +     So that transformation is not performed until after jump threading
> +     is complete.  */
> +  basic_block bb;
> +  FOR_EACH_BB_FN (bb, cfun)
> +    {
> +      gimple *last = last_stmt (bb);
> +      if (last && gimple_code (last) == GIMPLE_COND)
> +       simplify_cond_using_ranges_2 (as_a <gcond *> (last));
> +    }
> +
>    vrp_free_lattice ();
>
>    free_numbers_of_iterations_estimates (cfun);
>

Reply via email to