On Wed, 10 Jul 2024, Tamar Christina wrote:
> > > > I might also point back to the idea I threw in somewhere, adding
> > > > OEP_VALUE (or a better name) to the set of flags accepted by
> > > > operand_equal_p. You mentioned hashing IIRC but I don't see the patches
> > > > touching hashing?
> > > >
> > >
> > > Yes, That can indeed be done with this approach. The hashing was that
> > > before I
> > > was trying to prevent the "duplicate" IV expressions from being created
> > > in the
> > > first place by modifying get_loop_invariant_expr.
> > >
> > > This function looks up if we have already seen a particular IV expression
> > > and if
> > > we have it just returns that expression. However after reading more of
> > > the code
> > > I realized this wasn't the right approach, as without also dealing with
> > > the
> > candidates
> > > we'd end up creating IV expression that can't be handled by any candidate.
> > >
> > > IVops would just give up then. Reading the code it seems that
> > get_loop_invariant_expr
> > > is just there to prevent blatant duplicates. i.e. it treats `(signed) a`
> > > and `a` as the
> > same.
> > >
> > > This is also why I think that everywhere else *has* to continue stripping
> > > the
> > expression.
> > >
> > > On a note from Richard S that he thought IVopts already had some code to
> > > deal
> > with
> > > expressions that differ only in signs led me to take a different approach.
> > >
> > > The goal wasn't to remove the different sign/unsigned IV expressions, but
> > instead get
> > > Then to be servable by the same candidates. i.e. we want them in the same
> > candidate
> > > groups and then candidate optimization will just do its thing.
> > >
> > > That seemed a more natural fit to how it worked.
> >
> > Yeah, I agree that sounds like the better strathegy.
> >
> > > Do you want me to try the operand_equal_p approach? Though in this case
> > > the
> > issue
> > > is we not only need to know they're equal, but also need to know the scale
> > factor.
> >
> > For this case yes, but if you'd keep the code as-is, the equal with scale
> > factor one case would be fixed. Not a case with different scale factors
> > though - but conversions "elsewhere" should be handled via the stripping.
> > So it would work to simply adjust the operand_equal_p check here?
> >
> > > get_computation_aff_1 scales the common type IV by the scale we
> > > determined,
> > > so I think operand_equal_p would not be very useful here. But it does
> > > look like
> > > constant_multiple_of can just be implemented with
> > aff_combination_constant_multiple_p.
> > >
> > > Should I try?
> >
> > You've had the other places where you replace operand_equal_p with
> > affine-compute and compare. As said that has some associated cost
> > as well as a limit on the number of elements after which it resorts
> > back to operand_equal_p. So for strict equality tests implementing
> > a weaker operand_equal_p might be a better solution.
> >
>
> The structural comparison is implemented as a new mode for operand_equal_p
> which
> compares two expressions ignoring NOP converts (unless their bitsizes differ)
> and ignoring constant values, but requiring both operands be a constant.
>
> There is one downside compared to affine comparison, in that this approach
> does
> not deal well with commutative operations. i.e. it does not see a + (b + c) as
> equivalent to c + (b + a).
>
> This means we lose out on some of the more complicated addressing modes, but
> with so many operations the address will likely be split anyway and we'll deal
> with it then.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> x86_64-pc-linux-gnu -m32, -m64 and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> PR tree-optimization/114932
> * fold-const.cc (operand_compare::operand_equal_p): Use it.
> (operand_compare::verify_hash_value): Likewise.
> * tree-core.h (enum operand_equal_flag): Add OEP_STRUCTURAL_EQ.
> * tree-ssa-loop-ivopts.cc (record_group_use): Check for structural eq.
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/114932
> * gfortran.dg/addressing-modes_2.f90: New test.
>
> -- inline copy of --
>
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index
> 710d697c0217c784b34f9f9f7b00b1945369076a..3d43020541c082c094164724da9d17fbb5793237
> 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -3191,6 +3191,9 @@ operand_compare::operand_equal_p (const_tree arg0,
> const_tree arg1,
> precision differences. */
> if (TREE_CODE (arg0) == INTEGER_CST && TREE_CODE (arg1) == INTEGER_CST)
> {
> + if (flags & OEP_STRUCTURAL_EQ)
> + return true;
> +
Hmm, so you ignore all constants?
> /* Address of INTEGER_CST is not defined; check that we did not forget
> to drop the OEP_ADDRESS_OF flags. */
> gcc_checking_assert (!(flags & OEP_ADDRESS_OF));
> @@ -3204,7 +3207,8 @@ operand_compare::operand_equal_p (const_tree arg0,
> const_tree arg1,
> because they may change the signedness of the arguments. As pointers
> strictly don't have a signedness, require either two pointers or
> two non-pointers as well. */
> - if (TYPE_UNSIGNED (TREE_TYPE (arg0)) != TYPE_UNSIGNED (TREE_TYPE
> (arg1))
> + if ((TYPE_UNSIGNED (TREE_TYPE (arg0)) != TYPE_UNSIGNED (TREE_TYPE
> (arg1))
> + && !(flags & OEP_STRUCTURAL_EQ))
> || POINTER_TYPE_P (TREE_TYPE (arg0))
> != POINTER_TYPE_P (TREE_TYPE (arg1)))
and all sign-conversions?
I was thinking to implement a "value equivalence" check, so
(int)((unsigned)i + (unsigned)j) would be equal to i + j for int i and j.
Doesn't your patch still treat (int)(unsigned)i and i as different?
And won't it treat (unsigned)i / 4 and (int)i / 4 as the same? I'm not
sure this is the desired semantic for IVOPTs (or a useful one in general).
That said, when OEP_STRUCTURAL_EQ ignores a sign-change it can only
do that for operations where that maintains value-equivalence
(in terms of twos-complement bit representation).
OTOH if you ignore all differences in constants that's another thing.
> return false;
> @@ -4204,7 +4208,7 @@ operand_compare::verify_hash_value (const_tree arg0,
> const_tree arg1,
> {
> if (operand_equal_p (arg0, arg1, flags | OEP_NO_HASH_CHECK))
> {
> - if (arg0 != arg1 && !(flags & OEP_DECL_NAME))
> + if (arg0 != arg1 && !(flags & (OEP_DECL_NAME | OEP_STRUCTURAL_EQ)))
> {
> inchash::hash hstate0 (0), hstate1 (0);
> hash_operand (arg0, hstate0, flags | OEP_HASH_CHECK);
> diff --git a/gcc/testsuite/gfortran.dg/addressing-modes_2.f90
> b/gcc/testsuite/gfortran.dg/addressing-modes_2.f90
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..8eee4be3dc4d69fecfacd4c2e24a4973c8539fae
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/addressing-modes_2.f90
> @@ -0,0 +1,20 @@
> +! { dg-do compile { target aarch64-*-* } }
> +! { dg-additional-options "-w -Ofast -fdump-tree-ivopts-all" }
> +
> +module a
> +integer, parameter :: b=3, c=b
> +contains
> +subroutine d(block)
> +integer f, col , block(:, :, :), e
> +do f = 1, c
> + do col = 1, c
> + block(:f, :, e()) = do
> + end do
> + end do
> + end
> + end
> +
> +! { dg-final { scan-tree-dump-not {Selected IV set for loop .+ niters, 3
> IVs:} ivopts } }
> +! { dg-final { scan-tree-dump-times {Selected IV set for loop .+ niters, 2
> IVs:} 1 ivopts } }
> +! { dg-final { scan-tree-dump-times {Selected IV set for loop .+ niters, 1
> IVs:} 1 ivopts } }
> +
> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> index
> 27c569c77022227643151fa4a909a95f3d45bf55..fe398d99bcaa9e8fcec45e5e07d133bc393aa4a7
> 100644
> --- a/gcc/tree-core.h
> +++ b/gcc/tree-core.h
> @@ -962,7 +962,9 @@ enum operand_equal_flag {
> /* In conjunction with OEP_LEXICOGRAPHIC considers names of declarations
> of the same kind. Used to compare VLA bounds involving parameters
> across redeclarations of the same function. */
> - OEP_DECL_NAME = 512
> + OEP_DECL_NAME = 512,
> + /* Check for structural equivalence and ignore NOP_CONVERT casts. */
> + OEP_STRUCTURAL_EQ = 1024
> };
>
> /* Enum and arrays used for tree allocation stats.
> diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
> index
> c3218a3e8eedbb8d0a7f14c01eeb069cb6024c29..b05e4ac1e63b5c110707a8a3ed5e614b7ccc702f
> 100644
> --- a/gcc/tree-ssa-loop-ivopts.cc
> +++ b/gcc/tree-ssa-loop-ivopts.cc
> @@ -1623,8 +1623,8 @@ record_group_use (struct ivopts_data *data, tree *use_p,
>
> /* Check if it has the same stripped base and step. */
> if (operand_equal_p (iv->base_object, use->iv->base_object, 0)
> - && operand_equal_p (iv->step, use->iv->step, 0)
> - && operand_equal_p (addr_base, use->addr_base, 0))
> + && operand_equal_p (iv->step, use->iv->step, OEP_STRUCTURAL_EQ)
> + && operand_equal_p (addr_base, use->addr_base, OEP_STRUCTURAL_EQ))
> break;
> }
> if (i == data->vgroups.length ())
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)