On Tue, Oct 17, 2023 at 04:49:52PM -0400, Jason Merrill wrote:
> On 10/16/23 20:39, Marek Polacek wrote:
> > On Sat, Oct 14, 2023 at 01:13:22AM -0400, Jason Merrill wrote:
> > > On 10/13/23 14:53, Marek Polacek wrote:
> > > > On Thu, Oct 12, 2023 at 09:41:43PM -0400, Jason Merrill wrote:
> > > > > On 10/12/23 17:04, Marek Polacek wrote:
> > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > > > >
> > > > > > -- >8 --
> > > > > > My recent patch introducing cp_fold_immediate_r caused exponential
> > > > > > compile time with nested COND_EXPRs. The problem is that the
> > > > > > COND_EXPR
> > > > > > case recursively walks the arms of a COND_EXPR, but after processing
> > > > > > both arms it doesn't end the walk; it proceeds to walk the
> > > > > > sub-expressions of the outermost COND_EXPR, triggering again walking
> > > > > > the arms of the nested COND_EXPR, and so on. This patch brings the
> > > > > > compile time down to about 0m0.033s.
> > > > > >
> > > > > > I've added some debug prints to make sure that the rest of cp_fold_r
> > > > > > is still performed as before.
> > > > > >
> > > > > > PR c++/111660
> > > > > >
> > > > > > gcc/cp/ChangeLog:
> > > > > >
> > > > > > * cp-gimplify.cc (cp_fold_immediate_r) <case COND_EXPR>:
> > > > > > Return
> > > > > > integer_zero_node instead of break;.
> > > > > > (cp_fold_immediate): Return true if cp_fold_immediate_r
> > > > > > returned
> > > > > > error_mark_node.
> > > > > >
> > > > > > gcc/testsuite/ChangeLog:
> > > > > >
> > > > > > * g++.dg/cpp0x/hog1.C: New test.
> > > > > > ---
> > > > > > gcc/cp/cp-gimplify.cc | 9 ++--
> > > > > > gcc/testsuite/g++.dg/cpp0x/hog1.C | 77
> > > > > > +++++++++++++++++++++++++++++++
> > > > > > 2 files changed, 82 insertions(+), 4 deletions(-)
> > > > > > create mode 100644 gcc/testsuite/g++.dg/cpp0x/hog1.C
> > > > > >
> > > > > > diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
> > > > > > index bdf6e5f98ff..ca622ca169a 100644
> > > > > > --- a/gcc/cp/cp-gimplify.cc
> > > > > > +++ b/gcc/cp/cp-gimplify.cc
> > > > > > @@ -1063,16 +1063,16 @@ cp_fold_immediate_r (tree *stmt_p, int
> > > > > > *walk_subtrees, void *data_)
> > > > > > break;
> > > > > > if (TREE_OPERAND (stmt, 1)
> > > > > > && cp_walk_tree (&TREE_OPERAND (stmt, 1),
> > > > > > cp_fold_immediate_r, data,
> > > > > > - nullptr))
> > > > > > + nullptr) == error_mark_node)
> > > > > > return error_mark_node;
> > > > > > if (TREE_OPERAND (stmt, 2)
> > > > > > && cp_walk_tree (&TREE_OPERAND (stmt, 2),
> > > > > > cp_fold_immediate_r, data,
> > > > > > - nullptr))
> > > > > > + nullptr) == error_mark_node)
> > > > > > return error_mark_node;
> > > > > > /* We're done here. Don't clear *walk_subtrees here
> > > > > > though: we're called
> > > > > > from cp_fold_r and we must let it recurse on the
> > > > > > expression with
> > > > > > cp_fold. */
> > > > > > - break;
> > > > > > + return integer_zero_node;
> > > > >
> > > > > I'm concerned this will end up missing something like
> > > > >
> > > > > 1 ? 1 : ((1 ? 1 : 1), immediate())
> > > > >
> > > > > as the integer_zero_node from the inner ?: will prevent walk_tree from
> > > > > looking any farther.
> > > >
> > > > You are right. The line above works as expected, but
> > > >
> > > > 1 ? 1 : ((1 ? 1 : id (42)), id (i));
> > > >
> > > > shows the problem (when the expression isn't used as an initializer).
> > > >
> > > > > Maybe we want to handle COND_EXPR in cp_fold_r instead of here?
> > > >
> > > > I hope this version is better.
> > > >
> > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > >
> > > > -- >8 --
> > > > My recent patch introducing cp_fold_immediate_r caused exponential
> > > > compile time with nested COND_EXPRs. The problem is that the COND_EXPR
> > > > case recursively walks the arms of a COND_EXPR, but after processing
> > > > both arms it doesn't end the walk; it proceeds to walk the
> > > > sub-expressions of the outermost COND_EXPR, triggering again walking
> > > > the arms of the nested COND_EXPR, and so on. This patch brings the
> > > > compile time down to about 0m0.033s.
> > >
> > > Is this number still accurate for this version?
> >
> > It is. I ran time(1) a few more times and the results were 0m0.033s -
> > 0m0.035s.
> > That said, ...
> >
> > > This change seems algorithmically better than the current code, but still
> > > problematic: if we have nested COND_EXPR A/B/C/D/E, it looks like we will
> > > end up cp_fold_immediate_r walking the arms of E five times, once for each
> > > COND_EXPR.
> >
> > ...this is accurate. I should have addressed the redundant folding in v2
> > even though the compilation is pretty much immediate.
> > > What I was thinking by handling COND_EXPR in cp_fold_r was to cp_fold_r
> > > walk
> > > its subtrees (or cp_fold_immediate_r if it's clear from op0 that the
> > > branch
> > > isn't taken) so we can clear *walk_subtrees and we don't fold_imm walk a
> > > node more than once.
> >
> > I agree I should do better here. How's this, then? I've added
> > debug_generic_expr to cp_fold_immediate_r to see if it gets the same
> > expr multiple times and it doesn't seem to.
> >
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> >
> > -- >8 --
> > My recent patch introducing cp_fold_immediate_r caused exponential
> > compile time with nested COND_EXPRs. The problem is that the COND_EXPR
> > case recursively walks the arms of a COND_EXPR, but after processing
> > both arms it doesn't end the walk; it proceeds to walk the
> > sub-expressions of the outermost COND_EXPR, triggering again walking
> > the arms of the nested COND_EXPR, and so on. This patch brings the
> > compile time down to about 0m0.030s.
> >
> > The ff_fold_immediate flag is unused after this patch but since I'm
> > using it in the P2564 patch, I'm not removing it now. Maybe at_eof
> > can be used instead and then we can remove ff_fold_immediate.
> >
> > PR c++/111660
> >
> > gcc/cp/ChangeLog:
> >
> > * cp-gimplify.cc (cp_fold_immediate_r) <case COND_EXPR>: Don't
> > handle it here.
> > (cp_fold_r): Handle COND_EXPR here.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * g++.dg/cpp0x/hog1.C: New test.
> > * g++.dg/cpp2a/consteval36.C: New test.
> > ---
> > gcc/cp/cp-gimplify.cc | 52 +++++++++-------
> > gcc/testsuite/g++.dg/cpp0x/hog1.C | 77 ++++++++++++++++++++++++
> > gcc/testsuite/g++.dg/cpp2a/consteval36.C | 22 +++++++
> > 3 files changed, 128 insertions(+), 23 deletions(-)
> > create mode 100644 gcc/testsuite/g++.dg/cpp0x/hog1.C
> > create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval36.C
> >
> > diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
> > index bdf6e5f98ff..a282c3930a3 100644
> > --- a/gcc/cp/cp-gimplify.cc
> > +++ b/gcc/cp/cp-gimplify.cc
> > @@ -1052,27 +1052,6 @@ cp_fold_immediate_r (tree *stmt_p, int
> > *walk_subtrees, void *data_)
> > switch (TREE_CODE (stmt))
> > {
> > - /* Unfortunately we must handle code like
> > - false ? bar () : 42
> > - where we have to check bar too. The cp_fold call in cp_fold_r could
> > - fold the ?: into a constant before we see it here. */
> > - case COND_EXPR:
> > - /* If we are called from cp_fold_immediate, we don't need to worry
> > about
> > - cp_fold folding away the COND_EXPR. */
> > - if (data->flags & ff_fold_immediate)
> > - break;
> > - if (TREE_OPERAND (stmt, 1)
> > - && cp_walk_tree (&TREE_OPERAND (stmt, 1), cp_fold_immediate_r, data,
> > - nullptr))
> > - return error_mark_node;
> > - if (TREE_OPERAND (stmt, 2)
> > - && cp_walk_tree (&TREE_OPERAND (stmt, 2), cp_fold_immediate_r, data,
> > - nullptr))
> > - return error_mark_node;
> > - /* We're done here. Don't clear *walk_subtrees here though: we're
> > called
> > - from cp_fold_r and we must let it recurse on the expression with
> > - cp_fold. */
> > - break;
> > case PTRMEM_CST:
> > if (TREE_CODE (PTRMEM_CST_MEMBER (stmt)) == FUNCTION_DECL
> > && DECL_IMMEDIATE_FUNCTION_P (PTRMEM_CST_MEMBER (stmt)))
> > @@ -1162,8 +1141,35 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void
> > *data_)
> > tree stmt = *stmt_p;
> > enum tree_code code = TREE_CODE (stmt);
> > - if (cxx_dialect > cxx17)
> > - cp_fold_immediate_r (stmt_p, walk_subtrees, data);
> > + if (cxx_dialect >= cxx20)
> > + {
> > + /* Unfortunately we must handle code like
> > + false ? bar () : 42
> > + where we have to check bar too. The cp_fold call below could
> > + fold the ?: into a constant before we've checked it. */
> > + if (code == COND_EXPR)
> > + {
> > + auto then_fn = cp_fold_r, else_fn = cp_fold_r;
> > + /* See if we can figure out if either of the branches is dead. If it
> > + is, we don't need to do everything that cp_fold_r does. */
> > + tree cond = maybe_constant_value (TREE_OPERAND (stmt, 0));
> > + if (integer_zerop (cond))
> > + then_fn = cp_fold_immediate_r;
> > + else if (TREE_CODE (cond) == INTEGER_CST)
> > + else_fn = cp_fold_immediate_r;
> > +
> > + cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, nullptr);
>
> I wonder about doing this before maybe_constant_value, to hopefully reduce
> the redundant calculations? OK either way.
Yeah, I was toying with that, I had
foo() ? x : y
where foo was a constexpr function but the cp_fold_r on op0 didn't eval it
to a constant :(.
Thanks,
Marek