On Fri, 8 Dec 2023, Jakub Jelinek wrote:

> Hi!
>
> Before bitint lowering, the IL has:
>   b.0_1 = b;
>   _2 = -b.0_1;
>   _3 = (unsigned _BitInt(512)) _2;
>   a.1_4 = a;
>   a.2_5 = (unsigned _BitInt(512)) a.1_4;
>   _6 = _3 * a.2_5;
> on the first function.  Now, gimple_lower_bitint has an optimization
> (when not at -O0) that avoids assigning underlying VAR_DECLs for certain
> SSA_NAMEs where it is possible to lower them in a single loop (or in
> straight-line code) rather than in multiple loops.
> So, e.g., the multiplication above uses handle_operand_addr, which can
> deal with INTEGER_CST arguments, loads, and also casts, so it is fine
> not to assign an underlying VAR_DECL for the SSA_NAMEs a.1_4 and a.2_5,
> as the multiplication can handle them.
> The more problematic case is the other multiplication operand.
> It is again the result of a (in this case narrowing) cast, so it would
> be fine not to assign a VAR_DECL for _3.  Normally we can merge the load
> (b.0_1) with the negation (_2) and even with the following cast (_3).
> If _3 were used in a mergeable operation such as addition, subtraction,
> negation, &|^, or an equality comparison, all of b.0_1, _2 and _3 could
> do without underlying VAR_DECLs.
> The problem is that the current code does that even when the cast is
> used by a non-mergeable operation, and handle_operand_addr certainly
> can't handle the mergeable operations feeding the rhs1 of the cast: for
> multiplication we don't emit any loop in which they could appear, and
> for other operations like shifts or non-equality comparisons we emit
> loops, but either in the reverse direction or with unpredictable
> indexes (for shifts).
> So, in order to lower the above correctly, we need an underlying
> VAR_DECL for either _2 or _3; if we choose _2, then the load and
> negation are done in one loop and the extension is handled as part of
> the multiplication; if we choose _3, then the load, negation and cast
> are done in one loop and the multiplication just uses the underlying
> VAR_DECL computed by that loop.
> It is far easier to do this for _3, which is what the following patch
> implements.
> The code actually already had most of this, it just did it for widening
> casts only (optimize unless the cast rhs1 is not an SSA_NAME, or is an
> SSA_NAME defined in some other bb, or has more than one use, etc.).
> The patch falls through into that code even for narrowing or
> same-precision casts, unless the cast is used in a mergeable operation.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
OK.

> 2023-12-08  Jakub Jelinek  <ja...@redhat.com>
>
> 	PR tree-optimization/112902
> 	* gimple-lower-bitint.cc (gimple_lower_bitint): For a narrowing
> 	or same precision cast don't set SSA_NAME_VERSION in m_names only
> 	if use_stmt is mergeable_op or fall through into the check that
> 	use is a store or rhs1 is not mergeable or other reasons prevent
> 	merging.
>
> 	* gcc.dg/bitint-52.c: New test.
>
> --- gcc/gimple-lower-bitint.cc.jj	2023-12-06 09:55:18.522993378 +0100
> +++ gcc/gimple-lower-bitint.cc	2023-12-07 18:05:17.183692049 +0100
> @@ -5989,10 +5989,11 @@ gimple_lower_bitint (void)
>  	    {
>  	      if (TREE_CODE (TREE_TYPE (rhs1)) != BITINT_TYPE
>  		  || (bitint_precision_kind (TREE_TYPE (rhs1))
> -		      < bitint_prec_large)
> -		  || (TYPE_PRECISION (TREE_TYPE (rhs1))
> -		      >= TYPE_PRECISION (TREE_TYPE (s)))
> -		  || mergeable_op (SSA_NAME_DEF_STMT (s)))
> +		      < bitint_prec_large))
> +		continue;
> +	      if ((TYPE_PRECISION (TREE_TYPE (rhs1))
> +		   >= TYPE_PRECISION (TREE_TYPE (s)))
> +		  && mergeable_op (use_stmt))
>  		continue;
>  	      /* Prevent merging a widening non-mergeable cast
>  		 on result of some narrower mergeable op
> @@ -6011,7 +6012,9 @@ gimple_lower_bitint (void)
>  		  || !mergeable_op (SSA_NAME_DEF_STMT (rhs1))
>  		  || gimple_store_p (use_stmt))
>  		continue;
> -	      if (gimple_assign_cast_p (SSA_NAME_DEF_STMT (rhs1)))
> +	      if ((TYPE_PRECISION (TREE_TYPE (rhs1))
> +		   < TYPE_PRECISION (TREE_TYPE (s)))
> +		  && gimple_assign_cast_p (SSA_NAME_DEF_STMT (rhs1)))
>  		{
>  		  /* Another exception is if the widening cast is
>  		     from mergeable same precision cast from something
> --- gcc/testsuite/gcc.dg/bitint-52.c.jj	2023-12-08 00:35:39.970953164 +0100
> +++ gcc/testsuite/gcc.dg/bitint-52.c	2023-12-08 00:35:21.983205440 +0100
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/112902 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-std=c23 -O2" } */
> +
> +double c;
> +#if __BITINT_MAXWIDTH__ >= 2048
> +_BitInt (512) a;
> +_BitInt (2048) b;
> +
> +void
> +foo (void)
> +{
> +  b = __builtin_mul_overflow_p (40, (_BitInt (512)) (-b * a), 0);
> +}
> +
> +
> +void
> +bar (void)
> +{
> +  c -= (unsigned _BitInt (512)) (a | a << b);
> +}
> +#endif
>
> 	Jakub

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)