On Fri, 8 Dec 2023, Jakub Jelinek wrote:

> Hi!
> 
> Before bitint lowering, the IL has:
>   b.0_1 = b;
>   _2 = -b.0_1;
>   _3 = (unsigned _BitInt(512)) _2;
>   a.1_4 = a;
>   a.2_5 = (unsigned _BitInt(512)) a.1_4;
>   _6 = _3 * a.2_5;
> on the first function.  Now, gimple_lower_bitint has an optimization
> (when not at -O0) that avoids assigning underlying VAR_DECLs to certain
> SSA_NAMEs when it is possible to lower them in a single loop (or in
> straight-line code) rather than in multiple loops.
> So, e.g. the multiplication above uses handle_operand_addr, which can deal
> with INTEGER_CST arguments, loads and also casts, so it is fine not to
> assign underlying VAR_DECLs to the SSA_NAMEs a.1_4 and a.2_5; the
> multiplication handles them directly.
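> For illustration (a hypothetical source-level example, not part of the
> patch), in something like
>   unsigned _BitInt(512) a, d;
>   void f (void) { d = a * a; }
> the load of a needs no temporary at all: handle_operand_addr can simply
> pass the address of a to the out-of-line multiplication routine.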
> The more problematic case is the other multiplication operand.
> It is again the result of a (in this case narrowing) cast, so it is fine
> not to assign a VAR_DECL to _3.  Normally we can merge the load (b.0_1)
> with the negation (_2) and even with the following cast (_3).  If _3
> were used in a mergeable operation like addition, subtraction, negation,
> &|^ or equality comparison, all of b.0_1, _2 and _3 could do without
> underlying VAR_DECLs.
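> E.g. (again just an illustrative snippet, not from the patch) in
>   _BitInt(2048) b;
>   unsigned _BitInt(512) c, d;
>   void g (void) { d = c + (unsigned _BitInt(512)) -b; }
> the load of b, the negation and the narrowing cast all feed a mergeable
> addition, so the whole statement can be emitted as a single limb loop
> with no underlying VAR_DECLs for the intermediate SSA_NAMEs.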
> The problem is that the current code does that even when the cast is used
> by a non-mergeable operation, and handle_operand_addr certainly can't
> handle the mergeable operations feeding the rhs1 of the cast: for
> multiplication we don't emit any loop in which they could appear, and for
> other operations like shifts or non-equality comparisons we do emit
> loops, but either in the reverse direction or with unpredictable indexes
> (for shifts).
> So, in order to lower the above correctly, we need to have an underlying
> VAR_DECL for either _2 or _3.  If we choose _2, the load and negation
> are done in one loop and the cast is handled as part of the
> multiplication; if we choose _3, the load, negation and cast are all done
> in one loop and the multiplication just uses the underlying VAR_DECL
> computed by that loop.
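> As a rough C-level sketch of that second choice (hand-written, with a
> hypothetical bitint_mul helper standing in for the actual libgcc
> multiplication routine, and assuming 64-bit limbs):
>   #define LIMBS (512 / 64)
>   extern void bitint_mul (unsigned long long *, const unsigned long long *,
>                           const unsigned long long *, int);
>   void
>   lowered (unsigned long long b[2048 / 64], unsigned long long a[LIMBS],
>            unsigned long long res[LIMBS])
>   {
>     unsigned long long tmp[LIMBS]; /* the underlying VAR_DECL for _3 */
>     unsigned long long carry = 1;  /* -x is ~x + 1, limb by limb */
>     for (int i = 0; i < LIMBS; i++)
>       {
>         /* Load, negation and narrowing cast merged into one loop; the
>            narrowing just means only the low LIMBS limbs of b are read.  */
>         tmp[i] = ~b[i] + carry;
>         carry &= (tmp[i] == 0);
>       }
>     bitint_mul (res, tmp, a, LIMBS); /* only needs tmp's address */
>   }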
> It is far easier to do this for _3, which is what the following patch
> implements.
> The code actually already handled most of this, but only for widening
> casts (optimize unless the cast rhs1 is not an SSA_NAME, or is an
> SSA_NAME defined in some other bb, or has more than one use, etc.).
> The patch falls through into that code even for narrowing or same
> precision casts, unless the cast is used in a mergeable operation.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> 2023-12-08  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR tree-optimization/112902
>       * gimple-lower-bitint.cc (gimple_lower_bitint): For a narrowing
>       or same precision cast don't set SSA_NAME_VERSION in m_names only
>       if use_stmt is mergeable_op or fall through into the check that
>       use is a store or rhs1 is not mergeable or other reasons prevent
>       merging.
> 
>       * gcc.dg/bitint-52.c: New test.
> 
> --- gcc/gimple-lower-bitint.cc.jj     2023-12-06 09:55:18.522993378 +0100
> +++ gcc/gimple-lower-bitint.cc        2023-12-07 18:05:17.183692049 +0100
> @@ -5989,10 +5989,11 @@ gimple_lower_bitint (void)
>                   {
>                     if (TREE_CODE (TREE_TYPE (rhs1)) != BITINT_TYPE
>                         || (bitint_precision_kind (TREE_TYPE (rhs1))
> -                           < bitint_prec_large)
> -                       || (TYPE_PRECISION (TREE_TYPE (rhs1))
> -                           >= TYPE_PRECISION (TREE_TYPE (s)))
> -                       || mergeable_op (SSA_NAME_DEF_STMT (s)))
> +                           < bitint_prec_large))
> +                     continue;
> +                   if ((TYPE_PRECISION (TREE_TYPE (rhs1))
> +                        >= TYPE_PRECISION (TREE_TYPE (s)))
> +                       && mergeable_op (use_stmt))
>                       continue;
>                     /* Prevent merging a widening non-mergeable cast
>                        on result of some narrower mergeable op
> @@ -6011,7 +6012,9 @@ gimple_lower_bitint (void)
>                         || !mergeable_op (SSA_NAME_DEF_STMT (rhs1))
>                         || gimple_store_p (use_stmt))
>                       continue;
> -                   if (gimple_assign_cast_p (SSA_NAME_DEF_STMT (rhs1)))
> +                   if ((TYPE_PRECISION (TREE_TYPE (rhs1))
> +                        < TYPE_PRECISION (TREE_TYPE (s)))
> +                       && gimple_assign_cast_p (SSA_NAME_DEF_STMT (rhs1)))
>                       {
>                         /* Another exception is if the widening cast is
>                            from mergeable same precision cast from something
> --- gcc/testsuite/gcc.dg/bitint-52.c.jj       2023-12-08 00:35:39.970953164 +0100
> +++ gcc/testsuite/gcc.dg/bitint-52.c  2023-12-08 00:35:21.983205440 +0100
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/112902 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-std=c23 -O2" } */
> +
> +double c;
> +#if __BITINT_MAXWIDTH__ >= 2048
> +_BitInt (512) a;
> +_BitInt (2048) b;
> +
> +void
> +foo (void)
> +{
> +  b = __builtin_mul_overflow_p (40, (_BitInt (512)) (-b * a), 0);
> +}
> +
> +
> +void
> +bar (void)
> +{
> +  c -= (unsigned _BitInt (512)) (a | a << b);
> +}
> +#endif
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)