Hi!

While looking at a bitint ICE, I've noticed that in the f1 and f5 functions
below we don't optimize the 2 casts into just one at GIMPLE, even though
convert_to_integer optimizes them when they appear in the same stmt.
The large match.pd simplification of two conversions in a row has many
complex rules and, as the testcase shows, all the other
narrowest -> widest -> prec_in_between all-integer conversions
are already handled, either because the inside_unsignedp == inter_unsignedp
rule kicks in, or the
         && ((inter_unsignedp && inter_prec > inside_prec)
             == (final_unsignedp && final_prec > inter_prec))
one, but there is no reason why sign extension from the narrowest to the
widest type followed by truncation to something in between can't be
done just as sign extension from the narrowest to the final type.  After all,
if the widest type is signed rather than unsigned, regardless of the final
type signedness we already handle it that way.
And since PR93044 we also handle it if the final precision is not wider
than the inside precision.
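
To illustrate the equivalence the new rule relies on (this is not part of
the patch), here is a minimal sketch that compares the two-cast and
one-cast forms of f1 and f5 for every signed char value.  The helper names
(two_casts_u, one_cast_u, two_casts_s, one_cast_s, count_mismatches) are
made up for this example, and it assumes int is narrower than
unsigned long long and that out-of-range conversion to int reduces modulo
2^N, as GCC documents.

```c
#include <limits.h>

/* Hypothetical helpers, named for this sketch only.  */

/* f1 shape: sign-extend to an unsigned widest type, then truncate.  */
static unsigned int two_casts_u (signed char x)
{
  unsigned long long y = x;
  return y;
}

/* Single sign-extension straight to the final unsigned type.  */
static unsigned int one_cast_u (signed char x)
{
  return x;
}

/* f5 shape: same intermediate type, signed final type.  The narrowing
   conversion is implementation-defined in ISO C; GCC reduces it modulo
   2^N, which is what this sketch assumes.  */
static int two_casts_s (signed char x)
{
  unsigned long long y = x;
  return y;
}

/* Single sign-extension straight to the final signed type.  */
static int one_cast_s (signed char x)
{
  return x;
}

/* Count the signed char values for which the two forms disagree.  */
static int count_mismatches (void)
{
  int mismatches = 0;
  for (int i = SCHAR_MIN; i <= SCHAR_MAX; i++)
    {
      signed char x = (signed char) i;
      if (two_casts_u (x) != one_cast_u (x))
        mismatches++;
      if (two_casts_s (x) != one_cast_s (x))
        mismatches++;
    }
  return mismatches;
}
```

Under those assumptions count_mismatches () should report no differences,
which is why dropping the intermediate conversion is safe for these shapes.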

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-12-14  Jakub Jelinek  <ja...@redhat.com>

        PR tree-optimization/113024
        * match.pd (two conversions in a row): Simplify scalar integer
        sign-extension followed by truncation.

        * gcc.dg/tree-ssa/pr113024.c: New test.

--- gcc/match.pd.jj     2023-12-14 11:59:28.000000000 +0100
+++ gcc/match.pd        2023-12-14 18:25:00.457961975 +0100
@@ -4754,11 +4754,14 @@ (define_operator_list SYNC_FETCH_AND_AND
     /* If we have a sign-extension of a zero-extended value, we can
        replace that by a single zero-extension.  Likewise if the
        final conversion does not change precision we can drop the
-       intermediate conversion.  */
+       intermediate conversion.  Similarly truncation of a sign-extension
+       can be replaced by a single sign-extension.  */
     (if (inside_int && inter_int && final_int
         && ((inside_prec < inter_prec && inter_prec < final_prec
              && inside_unsignedp && !inter_unsignedp)
-            || final_prec == inter_prec))
+            || final_prec == inter_prec
+            || (inside_prec < inter_prec && inter_prec > final_prec
+                && !inside_unsignedp && inter_unsignedp)))
      (ocvt @0))
 
     /* Two conversions in a row are not needed unless:
--- gcc/testsuite/gcc.dg/tree-ssa/pr113024.c.jj 2023-12-14 18:35:30.652225327 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr113024.c    2023-12-14 18:37:42.056403418 +0100
@@ -0,0 +1,22 @@
+/* PR tree-optimization/113024 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-forwprop1" } */
+/* Make sure we have just a single cast per function rather than 2 casts in some cases.  */
+/* { dg-final { scan-tree-dump-times " = \\\(\[a-z \]*\\\) \[xy_\]" 16 "forwprop1" { target { ilp32 || lp64 } } } } */
+
+unsigned int f1 (signed char x) { unsigned long long y = x; return y; }
+unsigned int f2 (unsigned char x) { unsigned long long y = x; return y; }
+unsigned int f3 (signed char x) { long long y = x; return y; }
+unsigned int f4 (unsigned char x) { long long y = x; return y; }
+int f5 (signed char x) { unsigned long long y = x; return y; }
+int f6 (unsigned char x) { unsigned long long y = x; return y; }
+int f7 (signed char x) { long long y = x; return y; }
+int f8 (unsigned char x) { long long y = x; return y; }
+unsigned int f9 (signed char x) { return (unsigned long long) x; }
+unsigned int f10 (unsigned char x) { return (unsigned long long) x; }
+unsigned int f11 (signed char x) { return (long long) x; }
+unsigned int f12 (unsigned char x) { return (long long) x; }
+int f13 (signed char x) { return (unsigned long long) x; }
+int f14 (unsigned char x) { return (unsigned long long) x; }
+int f15 (signed char x) { return (long long) x; }
+int f16 (unsigned char x) { return (long long) x; }

        Jakub
