On Thu, Oct 19, 2023 at 12:32:49PM -0400, Jason Merrill wrote: > On 10/19/23 10:14, Marek Polacek wrote: > > On Thu, Oct 19, 2023 at 10:06:01AM -0400, Jason Merrill wrote: > > > On 10/19/23 09:39, Patrick Palka wrote: > > > > On Tue, 17 Oct 2023, Marek Polacek wrote: > > > > > > > > > On Tue, Oct 17, 2023 at 04:49:52PM -0400, Jason Merrill wrote: > > > > > > On 10/16/23 20:39, Marek Polacek wrote: > > > > > > > On Sat, Oct 14, 2023 at 01:13:22AM -0400, Jason Merrill wrote: > > > > > > > > On 10/13/23 14:53, Marek Polacek wrote: > > > > > > > > > On Thu, Oct 12, 2023 at 09:41:43PM -0400, Jason Merrill wrote: > > > > > > > > > > On 10/12/23 17:04, Marek Polacek wrote: > > > > > > > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for > > > > > > > > > > > trunk? > > > > > > > > > > > > > > > > > > > > > > -- >8 -- > > > > > > > > > > > My recent patch introducing cp_fold_immediate_r caused > > > > > > > > > > > exponential > > > > > > > > > > > compile time with nested COND_EXPRs. The problem is that > > > > > > > > > > > the COND_EXPR > > > > > > > > > > > case recursively walks the arms of a COND_EXPR, but after > > > > > > > > > > > processing > > > > > > > > > > > both arms it doesn't end the walk; it proceeds to walk the > > > > > > > > > > > sub-expressions of the outermost COND_EXPR, triggering > > > > > > > > > > > again walking > > > > > > > > > > > the arms of the nested COND_EXPR, and so on. This patch > > > > > > > > > > > brings the > > > > > > > > > > > compile time down to about 0m0.033s. > > > > > > > > > > > > > > > > > > > > > > I've added some debug prints to make sure that the rest > > > > > > > > > > > of cp_fold_r > > > > > > > > > > > is still performed as before. > > > > > > > > > > > > > > > > > > > > > > PR c++/111660 > > > > > > > > > > > > > > > > > > > > > > gcc/cp/ChangeLog: > > > > > > > > > > > > > > > > > > > > > > * cp-gimplify.cc (cp_fold_immediate_r) <case > > > > > > > > > > > COND_EXPR>: Return > > > > > > > > > > > integer_zero_node instead of break;. > > > > > > > > > > > (cp_fold_immediate): Return true if > > > > > > > > > > > cp_fold_immediate_r returned > > > > > > > > > > > error_mark_node. > > > > > > > > > > > > > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > > > > > > > > > > > > > * g++.dg/cpp0x/hog1.C: New test. > > > > > > > > > > > --- > > > > > > > > > > > gcc/cp/cp-gimplify.cc | 9 ++-- > > > > > > > > > > > gcc/testsuite/g++.dg/cpp0x/hog1.C | 77 > > > > > > > > > > > +++++++++++++++++++++++++++++++ > > > > > > > > > > > 2 files changed, 82 insertions(+), 4 deletions(-) > > > > > > > > > > > create mode 100644 gcc/testsuite/g++.dg/cpp0x/hog1.C > > > > > > > > > > > > > > > > > > > > > > diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc > > > > > > > > > > > index bdf6e5f98ff..ca622ca169a 100644 > > > > > > > > > > > --- a/gcc/cp/cp-gimplify.cc > > > > > > > > > > > +++ b/gcc/cp/cp-gimplify.cc > > > > > > > > > > > @@ -1063,16 +1063,16 @@ cp_fold_immediate_r (tree > > > > > > > > > > > *stmt_p, int *walk_subtrees, void *data_) > > > > > > > > > > > break; > > > > > > > > > > > if (TREE_OPERAND (stmt, 1) > > > > > > > > > > > && cp_walk_tree (&TREE_OPERAND (stmt, 1), > > > > > > > > > > > cp_fold_immediate_r, data, > > > > > > > > > > > - nullptr)) > > > > > > > > > > > + nullptr) == error_mark_node) > > > > > > > > > > > return error_mark_node; > > > > > > > > > > > if (TREE_OPERAND (stmt, 2) > > > > > > > > > > > && cp_walk_tree (&TREE_OPERAND (stmt, 2), > > > > > > > > > > > cp_fold_immediate_r, data, > > > > > > > > > > > - nullptr)) > > > > > > > > > > > + nullptr) == error_mark_node) > > > > > > > > > > > return error_mark_node; > > > > > > > > > > > /* We're done here. Don't clear > > > > > > > > > > > *walk_subtrees here though: we're called > > > > > > > > > > > from cp_fold_r and we must let it recurse on > > > > > > > > > > > the expression with > > > > > > > > > > > cp_fold. */ > > > > > > > > > > > - break; > > > > > > > > > > > + return integer_zero_node; > > > > > > > > > > > > > > > > > > > > I'm concerned this will end up missing something like > > > > > > > > > > > > > > > > > > > > 1 ? 1 : ((1 ? 1 : 1), immediate()) > > > > > > > > > > > > > > > > > > > > as the integer_zero_node from the inner ?: will prevent > > > > > > > > > > walk_tree from > > > > > > > > > > looking any farther. > > > > > > > > > > > > > > > > > > You are right. The line above works as expected, but > > > > > > > > > > > > > > > > > > 1 ? 1 : ((1 ? 1 : id (42)), id (i)); > > > > > > > > > > > > > > > > > > shows the problem (when the expression isn't used as an > > > > > > > > > initializer). > > > > > > > > > > > > > > > > > > > Maybe we want to handle COND_EXPR in cp_fold_r instead of > > > > > > > > > > here? > > > > > > > > > > > > > > > > > > I hope this version is better. > > > > > > > > > > > > > > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > > > > > > > > > > > > > > > -- >8 -- > > > > > > > > > My recent patch introducing cp_fold_immediate_r caused > > > > > > > > > exponential > > > > > > > > > compile time with nested COND_EXPRs. The problem is that the > > > > > > > > > COND_EXPR > > > > > > > > > case recursively walks the arms of a COND_EXPR, but after > > > > > > > > > processing > > > > > > > > > both arms it doesn't end the walk; it proceeds to walk the > > > > > > > > > sub-expressions of the outermost COND_EXPR, triggering again > > > > > > > > > walking > > > > > > > > > the arms of the nested COND_EXPR, and so on. This patch > > > > > > > > > brings the > > > > > > > > > compile time down to about 0m0.033s. > > > > > > > > > > > > > > > > Is this number still accurate for this version? > > > > > > > > > > > > > > It is. I ran time(1) a few more times and the results were > > > > > > > 0m0.033s - 0m0.035s. > > > > > > > That said, ... > > > > > > > > > > > > > > > This change seems algorithmically better than the current code, > > > > > > > > but still > > > > > > > > problematic: if we have nested COND_EXPR A/B/C/D/E, it looks > > > > > > > > like we will > > > > > > > > end up cp_fold_immediate_r walking the arms of E five times, > > > > > > > > once for each > > > > > > > > COND_EXPR. > > > > > > > > > > > > > > ...this is accurate. I should have addressed the redundant > > > > > > > folding in v2 > > > > > > > even though the compilation is pretty much immediate. > > > > > > > > What I was thinking by handling COND_EXPR in cp_fold_r was to > > > > > > > > cp_fold_r walk > > > > > > > > its subtrees (or cp_fold_immediate_r if it's clear from op0 > > > > > > > > that the branch > > > > > > > > isn't taken) so we can clear *walk_subtrees and we don't > > > > > > > > fold_imm walk a > > > > > > > > node more than once. > > > > > > > > > > > > > > I agree I should do better here. How's this, then? I've added > > > > > > > debug_generic_expr to cp_fold_immediate_r to see if it gets the > > > > > > > same > > > > > > > expr multiple times and it doesn't seem to. > > > > > > > > > > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > > > > > > > > > > > -- >8 -- > > > > > > > My recent patch introducing cp_fold_immediate_r caused exponential > > > > > > > compile time with nested COND_EXPRs. The problem is that the > > > > > > > COND_EXPR > > > > > > > case recursively walks the arms of a COND_EXPR, but after > > > > > > > processing > > > > > > > both arms it doesn't end the walk; it proceeds to walk the > > > > > > > sub-expressions of the outermost COND_EXPR, triggering again > > > > > > > walking > > > > > > > the arms of the nested COND_EXPR, and so on. This patch brings > > > > > > > the > > > > > > > compile time down to about 0m0.030s. > > > > > > > > > > > > > > The ff_fold_immediate flag is unused after this patch but since > > > > > > > I'm > > > > > > > using it in the P2564 patch, I'm not removing it now. Maybe > > > > > > > at_eof > > > > > > > can be used instead and then we can remove ff_fold_immediate. > > > > > > > > > > > > > > PR c++/111660 > > > > > > > > > > > > > > gcc/cp/ChangeLog: > > > > > > > > > > > > > > * cp-gimplify.cc (cp_fold_immediate_r) <case > > > > > > > COND_EXPR>: Don't > > > > > > > handle it here. > > > > > > > (cp_fold_r): Handle COND_EXPR here. > > > > > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > > > > > * g++.dg/cpp0x/hog1.C: New test. > > > > > > > * g++.dg/cpp2a/consteval36.C: New test. > > > > > > > --- > > > > > > > gcc/cp/cp-gimplify.cc | 52 +++++++++------- > > > > > > > gcc/testsuite/g++.dg/cpp0x/hog1.C | 77 > > > > > > > ++++++++++++++++++++++++ > > > > > > > gcc/testsuite/g++.dg/cpp2a/consteval36.C | 22 +++++++ > > > > > > > 3 files changed, 128 insertions(+), 23 deletions(-) > > > > > > > create mode 100644 gcc/testsuite/g++.dg/cpp0x/hog1.C > > > > > > > create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval36.C > > > > > > > > > > > > > > diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc > > > > > > > index bdf6e5f98ff..a282c3930a3 100644 > > > > > > > --- a/gcc/cp/cp-gimplify.cc > > > > > > > +++ b/gcc/cp/cp-gimplify.cc > > > > > > > @@ -1052,27 +1052,6 @@ cp_fold_immediate_r (tree *stmt_p, int > > > > > > > *walk_subtrees, void *data_) > > > > > > > switch (TREE_CODE (stmt)) > > > > > > > { > > > > > > > - /* Unfortunately we must handle code like > > > > > > > - false ? bar () : 42 > > > > > > > - where we have to check bar too. The cp_fold call in > > > > > > > cp_fold_r could > > > > > > > - fold the ?: into a constant before we see it here. */ > > > > > > > - case COND_EXPR: > > > > > > > - /* If we are called from cp_fold_immediate, we don't need > > > > > > > to worry about > > > > > > > - cp_fold folding away the COND_EXPR. */ > > > > > > > - if (data->flags & ff_fold_immediate) > > > > > > > - break; > > > > > > > - if (TREE_OPERAND (stmt, 1) > > > > > > > - && cp_walk_tree (&TREE_OPERAND (stmt, 1), > > > > > > > cp_fold_immediate_r, data, > > > > > > > - nullptr)) > > > > > > > - return error_mark_node; > > > > > > > - if (TREE_OPERAND (stmt, 2) > > > > > > > - && cp_walk_tree (&TREE_OPERAND (stmt, 2), > > > > > > > cp_fold_immediate_r, data, > > > > > > > - nullptr)) > > > > > > > - return error_mark_node; > > > > > > > - /* We're done here. Don't clear *walk_subtrees here > > > > > > > though: we're called > > > > > > > - from cp_fold_r and we must let it recurse on the expression > > > > > > > with > > > > > > > - cp_fold. */ > > > > > > > - break; > > > > > > > case PTRMEM_CST: > > > > > > > if (TREE_CODE (PTRMEM_CST_MEMBER (stmt)) == > > > > > > > FUNCTION_DECL > > > > > > > && DECL_IMMEDIATE_FUNCTION_P (PTRMEM_CST_MEMBER > > > > > > > (stmt))) > > > > > > > @@ -1162,8 +1141,35 @@ cp_fold_r (tree *stmt_p, int > > > > > > > *walk_subtrees, void *data_) > > > > > > > tree stmt = *stmt_p; > > > > > > > enum tree_code code = TREE_CODE (stmt); > > > > > > > - if (cxx_dialect > cxx17) > > > > > > > - cp_fold_immediate_r (stmt_p, walk_subtrees, data); > > > > > > > + if (cxx_dialect >= cxx20) > > > > > > > + { > > > > > > > + /* Unfortunately we must handle code like > > > > > > > + false ? bar () : 42 > > > > > > > + where we have to check bar too. The cp_fold call below could > > > > > > > + fold the ?: into a constant before we've checked it. */ > > > > > > > + if (code == COND_EXPR) > > > > > > > + { > > > > > > > + auto then_fn = cp_fold_r, else_fn = cp_fold_r; > > > > > > > + /* See if we can figure out if either of the branches is > > > > > > > dead. If it > > > > > > > + is, we don't need to do everything that cp_fold_r does. */ > > > > > > > + tree cond = maybe_constant_value (TREE_OPERAND (stmt, 0)); > > > > > > > + if (integer_zerop (cond)) > > > > > > > + then_fn = cp_fold_immediate_r; > > > > > > > + else if (TREE_CODE (cond) == INTEGER_CST) > > > > > > > + else_fn = cp_fold_immediate_r; > > > > > > > + > > > > > > > + cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, > > > > > > > nullptr); > > > > > > > > > > > > I wonder about doing this before maybe_constant_value, to hopefully > > > > > > reduce > > > > > > the redundant calculations? OK either way. > > > > > > > > > > Yeah, I was toying with that, I had > > > > > > > > > > foo() ? x : y > > > > > > > > > > where foo was a constexpr function but the cp_fold_r on op0 didn't > > > > > eval it > > > > > to a constant :(. > > > > > > > > IIUC that's because cp_fold evaluates constexpr calls only with -O, so > > > > doing cp_fold_r before maybe_constant_value on the condition should > > > > still be desirable in that case? > > > > > > Hmm, and if the cp_fold_r doesn't reduce the test to a constant, I would > > > think that folding the COND_EXPR also won't discard the other branch, so > > > we > > > shouldn't need to work harder to get a constant here, so we don't need to > > > call maybe_constant_value at all? > > > > Sorry, I hadn't seen this message when I posted the tweak. But the > > maybe_constant_value call on TREE_OPERAND (stmt, 0) should still make > > sense because it can render a branch dead even on -O0. Right? > > But if op0 isn't constant after cp_fold, folding the COND_EXPR won't discard > the branch, so we don't need to do the extra work to find out that it's > actually dead.
Hmm, so how about this? Thus far dg.exp passed. -- >8 -- This patch is an optimization tweak for cp_fold_r. If we cp_fold_r the COND_EXPR's op0 first, we may be able to evaluate it to a constant if -O. cp_fold has: 3143 if (callee && DECL_DECLARED_CONSTEXPR_P (callee) 3144 && !flag_no_inline) ... 3151 r = maybe_constant_value (x, /*decl=*/NULL_TREE, flag_no_inline is 1 for -O0: 1124 if (opts->x_optimize == 0) 1125 { 1126 /* Inlining does not work if not optimizing, 1127 so force it not to be done. */ 1128 opts->x_warn_inline = 0; 1129 opts->x_flag_no_inline = 1; 1130 } but otherwise it's 0 and cp_fold will maybe_constant_value calls to constexpr functions. And if it doesn't, then folding the COND_EXPR will keep both arms, and we can avoid calling maybe_constant_value. gcc/cp/ChangeLog: * cp-gimplify.cc (cp_fold_r): Don't call maybe_constant_value. --- gcc/cp/cp-gimplify.cc | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc index a282c3930a3..385042aeef2 100644 --- a/gcc/cp/cp-gimplify.cc +++ b/gcc/cp/cp-gimplify.cc @@ -1152,13 +1152,12 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void *data_) auto then_fn = cp_fold_r, else_fn = cp_fold_r; /* See if we can figure out if either of the branches is dead. If it is, we don't need to do everything that cp_fold_r does. */ - tree cond = maybe_constant_value (TREE_OPERAND (stmt, 0)); - if (integer_zerop (cond)) + cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, nullptr); + if (integer_zerop (TREE_OPERAND (stmt, 0))) then_fn = cp_fold_immediate_r; - else if (TREE_CODE (cond) == INTEGER_CST) + else if (TREE_CODE (TREE_OPERAND (stmt, 0)) == INTEGER_CST) else_fn = cp_fold_immediate_r; - cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, nullptr); if (TREE_OPERAND (stmt, 1)) cp_walk_tree (&TREE_OPERAND (stmt, 1), then_fn, data, nullptr); base-commit: 00e7c49fa04a3766e4726322b427621a74b78c71 -- 2.41.0