On Fri, Sep 27, 2024 at 6:39 PM Artemiy Volkov
<artemiy.vol...@synopsys.com> wrote:
>
> On 9/27/2024 1:24 PM, Richard Biener wrote:
> > On Mon, 23 Sep 2024, Artemiy Volkov wrote:
> >
> >> Implement a match.pd transformation inverting the sign of X in
> >> C1 - X cmp C2, where C1 and C2 are integer constants and X is
> >> of a wrapping signed type, by observing that:
> >>
> >> (a) If cmp is == or !=, simply move X and C2 to opposite sides of
> >> the comparison to arrive at X cmp C1 - C2.
> >>
> >> (b) If cmp is <:
> >>      - C1 - X < C2 means that C1 - X spans the values of -INF,
> >>        -INF + 1, ..., C2 - 1;
> >>          - Therefore, X is one of C1 - -INF, C1 - (-INF + 1), ...,
> >>        C1 - C2 + 1;
> >>      - Subtracting (C1 + 1), X - (C1 + 1) is one of - (-INF) - 1,
> >>            - (-INF) - 2, ..., -C2;
> >>          - Using the fact that - (-INF) - 1 is +INF, derive that
> >>            X - (C1 + 1) spans the values +INF, +INF - 1, ..., -C2;
> >>          - Thus, the original expression can be simplified to
> >>            X - (C1 + 1) > -C2 - 1.
> >>
> >> (c) Similarly, C1 - X <= C2 is equivalent to X - (C1 + 1) >= -C2 - 1.
> >>
> >> (d) The >= and > cases are negations of (b) and (c), respectively.
> >>
> >> (e) In all cases, the expression -C2 - 1 can be shortened to
> >> bit_not (C2).
> >>
> >> This transformation allows to occasionally save load-immediate /
> >> subtraction instructions, e.g. the following statement:
> >>
> >> 10 - (int)f() >= 20;
> >>
> >> now compiles to
> >>
> >> addi    a0,a0,-11
> >> slti    a0,a0,-20
> >>
> >> instead of
> >>
> >> li      a5,10
> >> sub     a0,a5,a0
> >> slti    t0,a0,20
> >> xori    a0,t0,1
> >>
> >> on 32-bit RISC-V when compiled with -fwrapv.
> >>
> >> Additional examples can be found in the newly added test file.  This
> >> patch has been bootstrapped and regtested on aarch64, x86_64, and i386,
> >> and additionally regtested on riscv32.
> >>
> >> gcc/ChangeLog:
> >>
> >>          PR tree-optimization/116024
> >>          * match.pd: New transformation around integer comparison.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>          * gcc.dg/tree-ssa/pr116024-1-fwrapv.c: New test.
> >>
> >> Signed-off-by: Artemiy Volkov <arte...@synopsys.com>
> >> ---
> >>   gcc/match.pd                                  | 20 ++++-
> >>   .../gcc.dg/tree-ssa/pr116024-1-fwrapv.c       | 73 +++++++++++++++++++
> >>   2 files changed, 92 insertions(+), 1 deletion(-)
> >>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c
> >>
> >> diff --git a/gcc/match.pd b/gcc/match.pd
> >> index d0489789527..bf3b4a2e3fe 100644
> >> --- a/gcc/match.pd
> >> +++ b/gcc/match.pd
> >> @@ -8970,7 +8970,25 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >>       (cmp (plus @1 (minus @2 @0)) @2))
> >>          (if (cmp == LT_EXPR || cmp == GE_EXPR)
> >>       (cmp (plus @1 (minus @2
> >> -               (plus @0 { build_one_cst (TREE_TYPE (@1)); }))) @2)))))))
> >> +               (plus @0 { build_one_cst (TREE_TYPE (@1)); }))) @2)))
> >> +/* For wrapping signed types (-fwrapv), transform like so (using < as 
> >> example):
> >> +     C1 - X < C2
> >> +      ==>  C1 - X = { -INF, -INF + 1, ..., C2 - 1 }
> >> +      ==>  X = { C1 - (-INF), C1 - (-INF + 1), ..., C1 - C2 + 1 }
> >> +      ==>  X - (C1 + 1) = { - (-INF) - 1, - (-INF) - 2, ..., -C2 }
> >> +      ==>  X - (C1 + 1) = { +INF, +INF - 1, ..., -C2 }
> >> +      ==>  X - (C1 + 1) > -C2 - 1
> >> +      ==>  X - (C1 + 1) > bit_not (C2)
> >> +
> >> +      Similarly,
> >> +     C1 - X <= C2 ==> X - (C1 + 1) >= bit_not (C2);
> >> +     C1 - X >= C2 ==> X - (C1 + 1) <= bit_not (C2);
> >> +     C1 - X > C2 ==> X - (C1 + 1) < bit_not (C2).  */
> >> +   (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
> >
> > You need to add && !TYPE_UNSIGNED (TREE_TYPE (@1)) here,
> > TYPE_OVERFLOW_WRAPS is also true for unsigned types.
>
> This pattern is written as a series of nested ifs, and this if is
> actually the else arm of the
>      if (TYPE_UNSIGNED (TREE_TYPE (@1))
> check on line 9078 (introduced in patch #2), so if we got here then
> TYPE_UNSIGNED is already necessarily false (and thus this code already
> only deals with signed wrapping types).  I don't think the extra check
> is necessary, but I can add it if it makes things clearer.  What do you
> think?

I think it's good for documenting to put in a redundant check.  Large
if-elseif-elseif chains can be difficult to understand.  Note there's also

(switch
 (if (...) ...)
 (if (...) ...)
 (if (...) ...))

which for a chain of cases is often better to use.

Richard.

>
> Thanks,
> Artemiy
>
> >
> > OK with that change.
> >
> > Thanks,
> > Richard.
> >
> >> +     (if (cmp == EQ_EXPR || cmp == NE_EXPR)
> >> +    (cmp @1 (minus @0 @2))
> >> +     (rcmp (minus @1 (plus @0 { build_one_cst (TREE_TYPE (@1)); }))
> >> +         (bit_not @2))))))))
> >>
> >>   /* Canonicalizations of BIT_FIELD_REFs.  */
> >>
> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c 
> >> b/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c
> >> new file mode 100644
> >> index 00000000000..c2bf1d17234
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c
> >> @@ -0,0 +1,73 @@
> >> +/* PR tree-optimization/116024 */
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-O1 -fdump-tree-forwprop1-details -fwrapv" } */
> >> +
> >> +#include <stdint.h>
> >> +
> >> +uint32_t f(void);
> >> +
> >> +int32_t i2(void)
> >> +{
> >> +  int32_t l = 2;
> >> +  l = 10 - (int32_t)f();
> >> +  return l <= 20; // f() - 11 >= -21
> >> +}
> >> +
> >> +int32_t i2a(void)
> >> +{
> >> +  int32_t l = 2;
> >> +  l = 10 - (int32_t)f();
> >> +  return l < 30; // f() - 11 > -31
> >> +}
> >> +
> >> +int32_t i2b(void)
> >> +{
> >> +  int32_t l = 2;
> >> +  l = 200 - (int32_t)f();
> >> +  return l <= 100; // f() - 201 >= -101
> >> +}
> >> +
> >> +int32_t i2c(void)
> >> +{
> >> +  int32_t l = 2;
> >> +  l = 300 - (int32_t)f();
> >> +  return l < 100; // f() - 301 > -101
> >> +}
> >> +
> >> +int32_t i2d(void)
> >> +{
> >> +  int32_t l = 2;
> >> +  l = 1000 - (int32_t)f();
> >> +  return l >= 2000; // f() - 1001 <= -2001
> >> +}
> >> +
> >> +int32_t i2e(void)
> >> +{
> >> +  int32_t l = 2;
> >> +  l = 1000 - (int32_t)f();
> >> +  return l > 3000; // f() - 1001 < -3001
> >> +}
> >> +
> >> +int32_t i2f(void)
> >> +{
> >> +  int32_t l = 2;
> >> +  l = 20000 - (int32_t)f();
> >> +  return l >= 10000; // f() - 20001 <= -10001
> >> +}
> >> +
> >> +int32_t i2g(void)
> >> +{
> >> +  int32_t l = 2;
> >> +  l = 30000 - (int32_t)f();
> >> +  return l > 10000; // f() - 30001 < -10001
> >> +}
> >> +
> >> +/* { dg-final { scan-tree-dump-times "Removing dead stmt:.*?- _" 8 
> >> "forwprop1" } } */
> >> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
> >> -11.*\n.*>= -21" 1 "forwprop1" } } */
> >> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
> >> -11.*\n.*>= -30" 1 "forwprop1" } } */
> >> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
> >> -201.*\n.*>= -101" 1 "forwprop1" } } */
> >> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
> >> -301.*\n.*>= -100" 1 "forwprop1" } } */
> >> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
> >> -1001.*\n.*< -2000" 1 "forwprop1" } } */
> >> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
> >> -1001.*\n.*< -3001" 1 "forwprop1" } } */
> >> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
> >> -20001.*\n.*< -10000" 1 "forwprop1" } } */
> >> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
> >> -30001.*\n.*< -10001" 1 "forwprop1" } } */
> >>
> >
>

Reply via email to