On Fri, 17 May 2024, Manolis Tsamis wrote: > Hi Richard, > > While I was re-testing the latest version of this patch I noticed that > it FAILs an AArch64 test, gcc.target/aarch64/subsp.c. With the patch > we generate one instruction more: > > sbfiz x1, x1, 4, 32 > stp x29, x30, [sp, -16]! > add x1, x1, 16 > mov x29, sp > sub sp, sp, x1 > mov x0, sp > bl foo > > Instead of: > > stp x29, x30, [sp, -16]! > add w1, w1, 1 > mov x29, sp > sub sp, sp, w1, sxtw 4 > mov x0, sp > bl foo > > I've looked at it but can't really find a way to solve the regression. > Any thoughts on this?
Can you explain what goes wrong? As I said rewriting parts of address calculation is tricky, there's always the chance that some cases regress (see your observation in comment#4 of the PR). Note that I still believe that avoiding the early and premature promotion of the addition to unsigned is a good thing. Note the testcase in the PR is fixed with -fwrapv because then we do _not_ perform this premature optimization. Without -fwrapv the optimization is valid but as you note we do not perform it consistently - otherwise we wouldn't regress. Richard. > Thanks, > Manolis > > > > On Thu, May 16, 2024 at 11:15 AM Richard Biener > <richard.guent...@gmail.com> wrote: > > > > On Tue, May 14, 2024 at 10:58 AM Manolis Tsamis <manolis.tsa...@vrull.eu> > > wrote: > > > > > > New patch with the requested changes can be found below. > > > > > > I don't know how much this affects SCEV, but I do believe that we > > > should incorporate this change somehow. I've seen various cases of > > > suboptimal address calculation codegen that boil down to this. > > > > This misses the ChangeLog (I assume it's unchanged) and indent > > of the match.pd part is now off. > > > > Please fix that, the patch is OK with that change. > > > > Thanks, > > Richard. > > > > > gcc/match.pd | 31 +++++++++++++++++++++++++++++++ > > > gcc/testsuite/gcc.dg/pr109393.c | 16 ++++++++++++++++ > > > 2 files changed, 47 insertions(+) > > > create mode 100644 gcc/testsuite/gcc.dg/pr109393.c > > > > > > diff --git a/gcc/match.pd b/gcc/match.pd > > > index 07e743ae464..1d642c205f0 100644 > > > --- a/gcc/match.pd > > > +++ b/gcc/match.pd > > > @@ -3650,6 +3650,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > > > (plus (convert @0) (op @2 (convert @1)))))) > > > #endif > > > +/* ((T)(A + CST1)) * CST2 + CST3 > > > + -> ((T)(A) * CST2) + ((T)CST1 * CST2 + CST3) > > > + Where (A + CST1) doesn't need to have a single use. */ > > > +#if GIMPLE > > > + (for op (plus minus) > > > + (simplify > > > + (plus (mult:s (convert:s (op @0 INTEGER_CST@1)) INTEGER_CST@2) > > > + INTEGER_CST@3) > > > + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) > > > + && INTEGRAL_TYPE_P (type) > > > + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) > > > + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) > > > + && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@0)) > > > + && TYPE_OVERFLOW_WRAPS (type)) > > > + (op (mult (convert @0) @2) (plus (mult (convert @1) @2) @3))))) > > > +#endif > > > + > > > +/* ((T)(A + CST1)) * CST2 -> ((T)(A) * CST2) + ((T)CST1 * CST2) */ > > > +#if GIMPLE > > > + (for op (plus minus) > > > + (simplify > > > + (mult (convert:s (op:s @0 INTEGER_CST@1)) INTEGER_CST@2) > > > + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) > > > + && INTEGRAL_TYPE_P (type) > > > + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) > > > + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) > > > + && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@0)) > > > + && TYPE_OVERFLOW_WRAPS (type)) > > > + (op (mult (convert @0) @2) (mult (convert @1) @2))))) > > > +#endif > > > + > > > /* (T)(A) +- (T)(B) -> (T)(A +- B) only when (A +- B) could be simplified > > > to a simple value. */ > > > (for op (plus minus) > > > diff --git a/gcc/testsuite/gcc.dg/pr109393.c > > > b/gcc/testsuite/gcc.dg/pr109393.c > > > new file mode 100644 > > > index 00000000000..e9051273672 > > > --- /dev/null > > > +++ b/gcc/testsuite/gcc.dg/pr109393.c > > > @@ -0,0 +1,16 @@ > > > +/* PR tree-optimization/109393 */ > > > +/* { dg-do compile } */ > > > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > > > +/* { dg-final { scan-tree-dump-times "return 1;" 2 "optimized" } } */ > > > + > > > +int foo(int *a, int j) > > > +{ > > > + int k = j - 1; > > > + return a[j - 1] == a[k]; > > > +} > > > + > > > +int bar(int *a, int j) > > > +{ > > > + int k = j - 1; > > > + return (&a[j + 1] - 2) == &a[k]; > > > +} > > > -- > > > 2.44.0 > > > > > > > > > On Tue, Apr 23, 2024 at 1:33 PM Manolis Tsamis <manolis.tsa...@vrull.eu> > > > wrote: > > > > > > > > The original motivation for this pattern was that the following > > > > function does > > > > not fold to 'return 1': > > > > > > > > int foo(int *a, int j) > > > > { > > > > int k = j - 1; > > > > return a[j - 1] == a[k]; > > > > } > > > > > > > > The expression ((unsigned long) (X +- C1) * C2) appears frequently as > > > > part of > > > > address calculations (e.g. arrays). These patterns help fold and > > > > simplify more > > > > expressions. > > > > > > > > PR tree-optimization/109393 > > > > > > > > gcc/ChangeLog: > > > > > > > > * match.pd: Add new patterns for ((T)(A +- CST1)) * CST2 and > > > > ((T)(A +- CST1)) * CST2 + CST3. > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > * gcc.dg/pr109393.c: New test. > > > > > > > > Signed-off-by: Manolis Tsamis <manolis.tsa...@vrull.eu> > > > > --- > > > > > > > > gcc/match.pd | 30 ++++++++++++++++++++++++++++++ > > > > gcc/testsuite/gcc.dg/pr109393.c | 16 ++++++++++++++++ > > > > 2 files changed, 46 insertions(+) > > > > create mode 100644 gcc/testsuite/gcc.dg/pr109393.c > > > > > > > > diff --git a/gcc/match.pd b/gcc/match.pd > > > > index d401e7503e6..13c828ba70d 100644 > > > > --- a/gcc/match.pd > > > > +++ b/gcc/match.pd > > > > @@ -3650,6 +3650,36 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > > > > (plus (convert @0) (op @2 (convert @1)))))) > > > > #endif > > > > > > > > +/* ((T)(A + CST1)) * CST2 + CST3 > > > > + -> ((T)(A) * CST2) + ((T)CST1 * CST2 + CST3) > > > > + Where (A + CST1) doesn't need to have a single use. */ > > > > +#if GIMPLE > > > > + (for op (plus minus) > > > > + (simplify > > > > + (plus (mult (convert:s (op @0 INTEGER_CST@1)) INTEGER_CST@2) > > > > INTEGER_CST@3) > > > > + (if (TREE_CODE (TREE_TYPE (@0)) == INTEGER_TYPE > > > > + && TREE_CODE (type) == INTEGER_TYPE > > > > + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) > > > > + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) > > > > + && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@0)) > > > > + && TYPE_OVERFLOW_WRAPS (type)) > > > > + (op (mult @2 (convert @0)) (plus (mult @2 (convert @1)) @3))))) > > > > +#endif > > > > + > > > > +/* ((T)(A + CST1)) * CST2 -> ((T)(A) * CST2) + ((T)CST1 * CST2) */ > > > > +#if GIMPLE > > > > + (for op (plus minus) > > > > + (simplify > > > > + (mult (convert:s (op:s @0 INTEGER_CST@1)) INTEGER_CST@2) > > > > + (if (TREE_CODE (TREE_TYPE (@0)) == INTEGER_TYPE > > > > + && TREE_CODE (type) == INTEGER_TYPE > > > > + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) > > > > + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) > > > > + && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@0)) > > > > + && TYPE_OVERFLOW_WRAPS (type)) > > > > + (op (mult @2 (convert @0)) (mult @2 (convert @1)))))) > > > > +#endif > > > > + > > > > /* (T)(A) +- (T)(B) -> (T)(A +- B) only when (A +- B) could be > > > > simplified > > > > to a simple value. */ > > > > (for op (plus minus) > > > > diff --git a/gcc/testsuite/gcc.dg/pr109393.c > > > > b/gcc/testsuite/gcc.dg/pr109393.c > > > > new file mode 100644 > > > > index 00000000000..e9051273672 > > > > --- /dev/null > > > > +++ b/gcc/testsuite/gcc.dg/pr109393.c > > > > @@ -0,0 +1,16 @@ > > > > +/* PR tree-optimization/109393 */ > > > > +/* { dg-do compile } */ > > > > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > > > > +/* { dg-final { scan-tree-dump-times "return 1;" 2 "optimized" } } */ > > > > + > > > > +int foo(int *a, int j) > > > > +{ > > > > + int k = j - 1; > > > > + return a[j - 1] == a[k]; > > > > +} > > > > + > > > > +int bar(int *a, int j) > > > > +{ > > > > + int k = j - 1; > > > > + return (&a[j + 1] - 2) == &a[k]; > > > > +} > > > > -- > > > > 2.34.1 > > > > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)