On Thu, Oct 19, 2023 at 12:32:49PM -0400, Jason Merrill wrote:
> On 10/19/23 10:14, Marek Polacek wrote:
> > On Thu, Oct 19, 2023 at 10:06:01AM -0400, Jason Merrill wrote:
> > > On 10/19/23 09:39, Patrick Palka wrote:
> > > > On Tue, 17 Oct 2023, Marek Polacek wrote:
> > > > 
> > > > > On Tue, Oct 17, 2023 at 04:49:52PM -0400, Jason Merrill wrote:
> > > > > > On 10/16/23 20:39, Marek Polacek wrote:
> > > > > > > On Sat, Oct 14, 2023 at 01:13:22AM -0400, Jason Merrill wrote:
> > > > > > > > On 10/13/23 14:53, Marek Polacek wrote:
> > > > > > > > > On Thu, Oct 12, 2023 at 09:41:43PM -0400, Jason Merrill wrote:
> > > > > > > > > > On 10/12/23 17:04, Marek Polacek wrote:
> > > > > > > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for 
> > > > > > > > > > > trunk?
> > > > > > > > > > > 
> > > > > > > > > > > -- >8 --
> > > > > > > > > > > My recent patch introducing cp_fold_immediate_r caused 
> > > > > > > > > > > exponential
> > > > > > > > > > > compile time with nested COND_EXPRs.  The problem is that 
> > > > > > > > > > > the COND_EXPR
> > > > > > > > > > > case recursively walks the arms of a COND_EXPR, but after 
> > > > > > > > > > > processing
> > > > > > > > > > > both arms it doesn't end the walk; it proceeds to walk the
> > > > > > > > > > > sub-expressions of the outermost COND_EXPR, triggering 
> > > > > > > > > > > again walking
> > > > > > > > > > > the arms of the nested COND_EXPR, and so on.  This patch 
> > > > > > > > > > > brings the
> > > > > > > > > > > compile time down to about 0m0.033s.
> > > > > > > > > > > 
> > > > > > > > > > > I've added some debug prints to make sure that the rest 
> > > > > > > > > > > of cp_fold_r
> > > > > > > > > > > is still performed as before.
> > > > > > > > > > > 
> > > > > > > > > > >              PR c++/111660
> > > > > > > > > > > 
> > > > > > > > > > > gcc/cp/ChangeLog:
> > > > > > > > > > > 
> > > > > > > > > > >              * cp-gimplify.cc (cp_fold_immediate_r) <case 
> > > > > > > > > > > COND_EXPR>: Return
> > > > > > > > > > >              integer_zero_node instead of break;.
> > > > > > > > > > >              (cp_fold_immediate): Return true if 
> > > > > > > > > > > cp_fold_immediate_r returned
> > > > > > > > > > >              error_mark_node.
> > > > > > > > > > > 
> > > > > > > > > > > gcc/testsuite/ChangeLog:
> > > > > > > > > > > 
> > > > > > > > > > >              * g++.dg/cpp0x/hog1.C: New test.
> > > > > > > > > > > ---
> > > > > > > > > > >       gcc/cp/cp-gimplify.cc             |  9 ++--
> > > > > > > > > > >       gcc/testsuite/g++.dg/cpp0x/hog1.C | 77 
> > > > > > > > > > > +++++++++++++++++++++++++++++++
> > > > > > > > > > >       2 files changed, 82 insertions(+), 4 deletions(-)
> > > > > > > > > > >       create mode 100644 gcc/testsuite/g++.dg/cpp0x/hog1.C
> > > > > > > > > > > 
> > > > > > > > > > > diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
> > > > > > > > > > > index bdf6e5f98ff..ca622ca169a 100644
> > > > > > > > > > > --- a/gcc/cp/cp-gimplify.cc
> > > > > > > > > > > +++ b/gcc/cp/cp-gimplify.cc
> > > > > > > > > > > @@ -1063,16 +1063,16 @@ cp_fold_immediate_r (tree 
> > > > > > > > > > > *stmt_p, int *walk_subtrees, void *data_)
> > > > > > > > > > >           break;
> > > > > > > > > > >             if (TREE_OPERAND (stmt, 1)
> > > > > > > > > > >             && cp_walk_tree (&TREE_OPERAND (stmt, 1), 
> > > > > > > > > > > cp_fold_immediate_r, data,
> > > > > > > > > > > -                    nullptr))
> > > > > > > > > > > +                    nullptr) == error_mark_node)
> > > > > > > > > > >           return error_mark_node;
> > > > > > > > > > >             if (TREE_OPERAND (stmt, 2)
> > > > > > > > > > >             && cp_walk_tree (&TREE_OPERAND (stmt, 2), 
> > > > > > > > > > > cp_fold_immediate_r, data,
> > > > > > > > > > > -                    nullptr))
> > > > > > > > > > > +                    nullptr) == error_mark_node)
> > > > > > > > > > >           return error_mark_node;
> > > > > > > > > > >             /* We're done here.  Don't clear 
> > > > > > > > > > > *walk_subtrees here though: we're called
> > > > > > > > > > >            from cp_fold_r and we must let it recurse on 
> > > > > > > > > > > the expression with
> > > > > > > > > > >            cp_fold.  */
> > > > > > > > > > > -      break;
> > > > > > > > > > > +      return integer_zero_node;
> > > > > > > > > > 
> > > > > > > > > > I'm concerned this will end up missing something like
> > > > > > > > > > 
> > > > > > > > > > 1 ? 1 : ((1 ? 1 : 1), immediate())
> > > > > > > > > > 
> > > > > > > > > > as the integer_zero_node from the inner ?: will prevent 
> > > > > > > > > > walk_tree from
> > > > > > > > > > looking any farther.
> > > > > > > > > 
> > > > > > > > > You are right.  The line above works as expected, but
> > > > > > > > > 
> > > > > > > > >       1 ? 1 : ((1 ? 1 : id (42)), id (i));
> > > > > > > > > 
> > > > > > > > > shows the problem (when the expression isn't used as an 
> > > > > > > > > initializer).
> > > > > > > > > 
> > > > > > > > > > Maybe we want to handle COND_EXPR in cp_fold_r instead of 
> > > > > > > > > > here?
> > > > > > > > > 
> > > > > > > > > I hope this version is better.
> > > > > > > > > 
> > > > > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > > > > > > > 
> > > > > > > > > -- >8 --
> > > > > > > > > My recent patch introducing cp_fold_immediate_r caused 
> > > > > > > > > exponential
> > > > > > > > > compile time with nested COND_EXPRs.  The problem is that the 
> > > > > > > > > COND_EXPR
> > > > > > > > > case recursively walks the arms of a COND_EXPR, but after 
> > > > > > > > > processing
> > > > > > > > > both arms it doesn't end the walk; it proceeds to walk the
> > > > > > > > > sub-expressions of the outermost COND_EXPR, triggering again 
> > > > > > > > > walking
> > > > > > > > > the arms of the nested COND_EXPR, and so on.  This patch 
> > > > > > > > > brings the
> > > > > > > > > compile time down to about 0m0.033s.
> > > > > > > > 
> > > > > > > > Is this number still accurate for this version?
> > > > > > > 
> > > > > > > It is.  I ran time(1) a few more times and the results were 
> > > > > > > 0m0.033s - 0m0.035s.
> > > > > > > That said, ...
> > > > > > > 
> > > > > > > > This change seems algorithmically better than the current code, 
> > > > > > > > but still
> > > > > > > > problematic: if we have nested COND_EXPR A/B/C/D/E, it looks 
> > > > > > > > like we will
> > > > > > > > end up cp_fold_immediate_r walking the arms of E five times, 
> > > > > > > > once for each
> > > > > > > > COND_EXPR.
> > > > > > > 
> > > > > > > ...this is accurate.  I should have addressed the redundant 
> > > > > > > folding in v2
> > > > > > > even though the compilation is pretty much immediate.
> > > > > > > > What I was thinking by handling COND_EXPR in cp_fold_r was to 
> > > > > > > > cp_fold_r walk
> > > > > > > > its subtrees (or cp_fold_immediate_r if it's clear from op0 
> > > > > > > > that the branch
> > > > > > > > isn't taken) so we can clear *walk_subtrees and we don't 
> > > > > > > > fold_imm walk a
> > > > > > > > node more than once.
> > > > > > > 
> > > > > > > I agree I should do better here.  How's this, then?  I've added
> > > > > > > debug_generic_expr to cp_fold_immediate_r to see if it gets the 
> > > > > > > same
> > > > > > > expr multiple times and it doesn't seem to.
> > > > > > > 
> > > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > > > > > 
> > > > > > > -- >8 --
> > > > > > > My recent patch introducing cp_fold_immediate_r caused exponential
> > > > > > > compile time with nested COND_EXPRs.  The problem is that the 
> > > > > > > COND_EXPR
> > > > > > > case recursively walks the arms of a COND_EXPR, but after 
> > > > > > > processing
> > > > > > > both arms it doesn't end the walk; it proceeds to walk the
> > > > > > > sub-expressions of the outermost COND_EXPR, triggering again 
> > > > > > > walking
> > > > > > > the arms of the nested COND_EXPR, and so on.  This patch brings 
> > > > > > > the
> > > > > > > compile time down to about 0m0.030s.
> > > > > > > 
> > > > > > > The ff_fold_immediate flag is unused after this patch but since 
> > > > > > > I'm
> > > > > > > using it in the P2564 patch, I'm not removing it now.  Maybe 
> > > > > > > at_eof
> > > > > > > can be used instead and then we can remove ff_fold_immediate.
> > > > > > > 
> > > > > > >            PR c++/111660
> > > > > > > 
> > > > > > > gcc/cp/ChangeLog:
> > > > > > > 
> > > > > > >            * cp-gimplify.cc (cp_fold_immediate_r) <case 
> > > > > > > COND_EXPR>: Don't
> > > > > > >   handle it here.
> > > > > > >            (cp_fold_r): Handle COND_EXPR here.
> > > > > > > 
> > > > > > > gcc/testsuite/ChangeLog:
> > > > > > > 
> > > > > > >            * g++.dg/cpp0x/hog1.C: New test.
> > > > > > >   * g++.dg/cpp2a/consteval36.C: New test.
> > > > > > > ---
> > > > > > >     gcc/cp/cp-gimplify.cc                    | 52 +++++++++-------
> > > > > > >     gcc/testsuite/g++.dg/cpp0x/hog1.C        | 77 
> > > > > > > ++++++++++++++++++++++++
> > > > > > >     gcc/testsuite/g++.dg/cpp2a/consteval36.C | 22 +++++++
> > > > > > >     3 files changed, 128 insertions(+), 23 deletions(-)
> > > > > > >     create mode 100644 gcc/testsuite/g++.dg/cpp0x/hog1.C
> > > > > > >     create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval36.C
> > > > > > > 
> > > > > > > diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
> > > > > > > index bdf6e5f98ff..a282c3930a3 100644
> > > > > > > --- a/gcc/cp/cp-gimplify.cc
> > > > > > > +++ b/gcc/cp/cp-gimplify.cc
> > > > > > > @@ -1052,27 +1052,6 @@ cp_fold_immediate_r (tree *stmt_p, int 
> > > > > > > *walk_subtrees, void *data_)
> > > > > > >       switch (TREE_CODE (stmt))
> > > > > > >         {
> > > > > > > -    /* Unfortunately we must handle code like
> > > > > > > -  false ? bar () : 42
> > > > > > > -       where we have to check bar too.  The cp_fold call in 
> > > > > > > cp_fold_r could
> > > > > > > -       fold the ?: into a constant before we see it here.  */
> > > > > > > -    case COND_EXPR:
> > > > > > > -      /* If we are called from cp_fold_immediate, we don't need 
> > > > > > > to worry about
> > > > > > > -  cp_fold folding away the COND_EXPR.  */
> > > > > > > -      if (data->flags & ff_fold_immediate)
> > > > > > > - break;
> > > > > > > -      if (TREE_OPERAND (stmt, 1)
> > > > > > > -   && cp_walk_tree (&TREE_OPERAND (stmt, 1), 
> > > > > > > cp_fold_immediate_r, data,
> > > > > > > -                    nullptr))
> > > > > > > - return error_mark_node;
> > > > > > > -      if (TREE_OPERAND (stmt, 2)
> > > > > > > -   && cp_walk_tree (&TREE_OPERAND (stmt, 2), 
> > > > > > > cp_fold_immediate_r, data,
> > > > > > > -                    nullptr))
> > > > > > > - return error_mark_node;
> > > > > > > -      /* We're done here.  Don't clear *walk_subtrees here 
> > > > > > > though: we're called
> > > > > > > -  from cp_fold_r and we must let it recurse on the expression 
> > > > > > > with
> > > > > > > -  cp_fold.  */
> > > > > > > -      break;
> > > > > > >         case PTRMEM_CST:
> > > > > > >           if (TREE_CODE (PTRMEM_CST_MEMBER (stmt)) == 
> > > > > > > FUNCTION_DECL
> > > > > > >             && DECL_IMMEDIATE_FUNCTION_P (PTRMEM_CST_MEMBER 
> > > > > > > (stmt)))
> > > > > > > @@ -1162,8 +1141,35 @@ cp_fold_r (tree *stmt_p, int 
> > > > > > > *walk_subtrees, void *data_)
> > > > > > >       tree stmt = *stmt_p;
> > > > > > >       enum tree_code code = TREE_CODE (stmt);
> > > > > > > -  if (cxx_dialect > cxx17)
> > > > > > > -    cp_fold_immediate_r (stmt_p, walk_subtrees, data);
> > > > > > > +  if (cxx_dialect >= cxx20)
> > > > > > > +    {
> > > > > > > +      /* Unfortunately we must handle code like
> > > > > > > +    false ? bar () : 42
> > > > > > > +  where we have to check bar too.  The cp_fold call below could
> > > > > > > +  fold the ?: into a constant before we've checked it.  */
> > > > > > > +      if (code == COND_EXPR)
> > > > > > > + {
> > > > > > > +   auto then_fn = cp_fold_r, else_fn = cp_fold_r;
> > > > > > > +   /* See if we can figure out if either of the branches is 
> > > > > > > dead.  If it
> > > > > > > +      is, we don't need to do everything that cp_fold_r does.  */
> > > > > > > +   tree cond = maybe_constant_value (TREE_OPERAND (stmt, 0));
> > > > > > > +   if (integer_zerop (cond))
> > > > > > > +     then_fn = cp_fold_immediate_r;
> > > > > > > +   else if (TREE_CODE (cond) == INTEGER_CST)
> > > > > > > +     else_fn = cp_fold_immediate_r;
> > > > > > > +
> > > > > > > +   cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, 
> > > > > > > nullptr);
> > > > > > 
> > > > > > I wonder about doing this before maybe_constant_value, to hopefully 
> > > > > > reduce
> > > > > > the redundant calculations?  OK either way.
> > > > > 
> > > > > Yeah, I was toying with that, I had
> > > > > 
> > > > >     foo() ? x : y
> > > > > 
> > > > > where foo was a constexpr function but the cp_fold_r on op0 didn't 
> > > > > eval it
> > > > > to a constant :(.
> > > > 
> > > > IIUC that's because cp_fold evaluates constexpr calls only with -O, so
> > > > doing cp_fold_r before maybe_constant_value on the condition should
> > > > still be desirable in that case?
> > > 
> > > Hmm, and if the cp_fold_r doesn't reduce the test to a constant, I would
> > > think that folding the COND_EXPR also won't discard the other branch, so 
> > > we
> > > shouldn't need to work harder to get a constant here, so we don't need to
> > > call maybe_constant_value at all?
> > 
> > Sorry, I hadn't seen this message when I posted the tweak.  But the
> > maybe_constant_value call on TREE_OPERAND (stmt, 0) should still make
> > sense because it can render a branch dead even on -O0.  Right?
> 
> But if op0 isn't constant after cp_fold, folding the COND_EXPR won't discard
> the branch, so we don't need to do the extra work to find out that it's
> actually dead.

Hmm, so how about this?  Thus far dg.exp passed.

-- >8 --
This patch is an optimization tweak for cp_fold_r.  If we cp_fold_r the
COND_EXPR's op0 first, we may be able to evaluate it to a constant if -O.
cp_fold has:

3143         if (callee && DECL_DECLARED_CONSTEXPR_P (callee)
3144             && !flag_no_inline)
...
3151             r = maybe_constant_value (x, /*decl=*/NULL_TREE,

flag_no_inline is 1 for -O0:

1124   if (opts->x_optimize == 0)
1125     {
1126       /* Inlining does not work if not optimizing,
1127          so force it not to be done.  */
1128       opts->x_warn_inline = 0;
1129       opts->x_flag_no_inline = 1;
1130     }

but otherwise it's 0 and cp_fold will maybe_constant_value calls to
constexpr functions.  And if it doesn't, then folding the COND_EXPR
will keep both arms, and we can avoid calling maybe_constant_value.

gcc/cp/ChangeLog:

        * cp-gimplify.cc (cp_fold_r): Don't call maybe_constant_value.
---
 gcc/cp/cp-gimplify.cc | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index a282c3930a3..385042aeef2 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -1152,13 +1152,12 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void 
*data_)
          auto then_fn = cp_fold_r, else_fn = cp_fold_r;
          /* See if we can figure out if either of the branches is dead.  If it
             is, we don't need to do everything that cp_fold_r does.  */
-         tree cond = maybe_constant_value (TREE_OPERAND (stmt, 0));
-         if (integer_zerop (cond))
+         cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, nullptr);
+         if (integer_zerop (TREE_OPERAND (stmt, 0)))
            then_fn = cp_fold_immediate_r;
-         else if (TREE_CODE (cond) == INTEGER_CST)
+         else if (TREE_CODE (TREE_OPERAND (stmt, 0)) == INTEGER_CST)
            else_fn = cp_fold_immediate_r;
 
-         cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, nullptr);
          if (TREE_OPERAND (stmt, 1))
            cp_walk_tree (&TREE_OPERAND (stmt, 1), then_fn, data,
                          nullptr);

base-commit: 00e7c49fa04a3766e4726322b427621a74b78c71
-- 
2.41.0

Reply via email to