> On 10/27/21 17:10, Patrick Palka wrote: > > On Wed, 27 Oct 2021, Jason Merrill wrote: > > > > > On 10/27/21 14:54, Patrick Palka wrote: > > > > On Tue, 26 Oct 2021, Jakub Jelinek wrote: > > > > > > > > > On Tue, Oct 26, 2021 at 05:07:43PM -0400, Patrick Palka wrote: > > > > > > The performance impact of the other calls to > > > > > > cxx_eval_outermost_const_expr > > > > > > from p_c_e_1 is probably already mostly mitigated by the constexpr > > > > > > call > > > > > > cache and the fact that we try to evaluate all calls to constexpr > > > > > > functions during cp_fold_function anyway (at least with -O). So > > > > > > trial > > > > > > > > > > constexpr function bodies don't go through cp_fold_function > > > > > (intentionally, > > > > > so that we don't optimize away UB), the bodies are copied before the > > > > > trees > > > > > of the > > > > > normal copy are folded. > > > > > > > > Ah right, I had forgotten about that.. > > > > > > > > Here's another approach that doesn't need to remove trial evaluation for > > > > &&/||. The idea is to first quietly check if the second operand is > > > > potentially constant _before_ performing trial evaluation of the first > > > > operand. This speeds up the case we care about (both operands are > > > > potentially constant) without regressing any diagnostics. We have to be > > > > careful about emitting bogus diagnostics when tf_error is set, hence the > > > > first hunk below which makes p_c_e_1 always proceed quietly first, and > > > > replay noisily in case of error (similar to how satisfaction works). > > > > > > > > Would something like this be preferable? > > > > > > Seems plausible, though doubling the number of stack frames is a downside. > > > > Whoops, good point.. The noisy -> quiet adjustment only needs to > > be performed during the outermost call to p_c_e_1, and not also during > > each recursive call. The revised diff below fixes this thinko, and so > > only a single extra stack frame is needed AFAICT. > > > > > What did you think of Jakub's suggestion of linearizing the terms? > > > > IIUC that would fix the quadraticness, but it wouldn't address that > > we end up evaluating the entire expression twice, once during the trial > > evaluation of each term from p_c_e_1 and again during the proper > > evaluation of the entire expression. It'd be nice if we could somehow > > avoid the double evaluation, as in the approach below (or in the first > > patch). > > OK with more comments to explain the tf_error hijinks.
Thanks a lot, here's the complete committed patch for the record: -- >8 -- Subject: [PATCH] c++: quadratic constexpr behavior for left-assoc logical exprs [PR102780] In the testcase below the two left fold expressions each expand into a constant logical expression with 1024 terms, for which potential_const_expr takes more than a minute to return true. This happens because p_c_e_1 performs trial evaluation of the first operand of a &&/|| in order to determine whether to consider the potentiality of the second operand. And because the expanded expression is left-associated, this trial evaluation causes p_c_e_1 to be quadratic in the number of terms of the expression. This patch fixes this quadratic behavior by making p_c_e_1 preemptively compute potentiality of the second operand of a &&/||, and perform trial evaluation of the first operand only if the second operand isn't potentially constant. We must be careful to avoid emitting bogus diagnostics during the preemptive computation; to that end, we perform this shortcut only when tf_error is cleared, and when tf_error is set we now first check potentiality of the whole expression quietly and replay the check noisily for diagnostics. Apart from fixing the quadraticness for left-associated logical exprs, this change also reduces compile time for the libstdc++ testcase 20_util/variant/87619.cc by about 15% even though our <variant> uses right folds instead of left folds. Likewise for the testcase in the PR, for which compile time is reduced by 30%. The reason for these speedups is that p_c_e_1 no longer performs expensive trial evaluation of each term of large constant logical expressions when determining their potentiality. PR c++/102780 gcc/cp/ChangeLog: * constexpr.c (potential_constant_expression_1) <case TRUTH_*_EXPR>: When tf_error isn't set, preemptively check potentiality of the second operand before performing trial evaluation of the first operand. (potential_constant_expression_1): When tf_error is set, first check potentiality quietly and return true if successful, otherwise proceed noisily to give errors. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/fold13.C: New test. --- gcc/cp/constexpr.c | 26 +++++++++++++++++++++----- gcc/testsuite/g++.dg/cpp1z/fold13.C | 29 +++++++++++++++++++++++++++++ 2 files changed, 50 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp1z/fold13.C diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c index a31d12b7451..40fe165c2b7 100644 --- a/gcc/cp/constexpr.c +++ b/gcc/cp/constexpr.c @@ -8892,13 +8892,18 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now, tmp = boolean_false_node; truth: { - tree op = TREE_OPERAND (t, 0); - if (!RECUR (op, rval)) + tree op0 = TREE_OPERAND (t, 0); + tree op1 = TREE_OPERAND (t, 1); + if (!RECUR (op0, rval)) return false; + if (!(flags & tf_error) && RECUR (op1, rval)) + /* When quiet, try to avoid expensive trial evaluation by first + checking potentiality of the second operand. */ + return true; if (!processing_template_decl) - op = cxx_eval_outermost_constant_expr (op, true); - if (tree_int_cst_equal (op, tmp)) - return RECUR (TREE_OPERAND (t, 1), rval); + op0 = cxx_eval_outermost_constant_expr (op0, true); + if (tree_int_cst_equal (op0, tmp)) + return (flags & tf_error) ? RECUR (op1, rval) : false; else return true; } @@ -9107,6 +9112,17 @@ bool potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now, tsubst_flags_t flags) { + if (flags & tf_error) + { + /* Check potentiality quietly first, as that could be performed more + efficiently in some cases (currently only for TRUTH_*_EXPR). If + that fails, replay the check noisily to give errors. */ + flags &= ~tf_error; + if (potential_constant_expression_1 (t, want_rval, strict, now, flags)) + return true; + flags |= tf_error; + } + tree target = NULL_TREE; return potential_constant_expression_1 (t, want_rval, strict, now, flags, &target); diff --git a/gcc/testsuite/g++.dg/cpp1z/fold13.C b/gcc/testsuite/g++.dg/cpp1z/fold13.C new file mode 100644 index 00000000000..9d7554f8999 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/fold13.C @@ -0,0 +1,29 @@ +// { dg-do compile { target c++17 } } +// Verify constexpr evaluation of a large left fold logical expression +// isn't quadratic in the size of the expanded expression. + +template<int> struct S { static constexpr bool value = true; }; + +template<class T, T...> struct integer_sequence { }; + +template<class T, T N> +using make_integer_sequence +#if __has_builtin(__make_integer_seq) + = __make_integer_seq<integer_sequence, T, N>; +#else + = integer_sequence<T, __integer_pack(N)...>; +#endif + +template<int... Is> +constexpr bool f_impl(integer_sequence<int, Is...>) { + return (... && S<Is>::value); +} + +static_assert(f_impl(make_integer_sequence<int, 1024>())); + +template<int... Is> +constexpr bool g_impl(integer_sequence<int, Is...>) { + return (... || !S<Is>::value); +} + +static_assert(!g_impl(make_integer_sequence<int, 1024>())); -- 2.33.1.711.g9d530dc002