> On 10/27/21 17:10, Patrick Palka wrote:
> > On Wed, 27 Oct 2021, Jason Merrill wrote:
> > 
> > > On 10/27/21 14:54, Patrick Palka wrote:
> > > > On Tue, 26 Oct 2021, Jakub Jelinek wrote:
> > > > 
> > > > > On Tue, Oct 26, 2021 at 05:07:43PM -0400, Patrick Palka wrote:
> > > > > > The performance impact of the other calls to
> > > > > > cxx_eval_outermost_const_expr
> > > > > > from p_c_e_1 is probably already mostly mitigated by the constexpr
> > > > > > call
> > > > > > cache and the fact that we try to evaluate all calls to constexpr
> > > > > > functions during cp_fold_function anyway (at least with -O).  So
> > > > > > trial
> > > > > 
> > > > > constexpr function bodies don't go through cp_fold_function
> > > > > (intentionally,
> > > > > so that we don't optimize away UB), the bodies are copied before the
> > > > > trees
> > > > > of the
> > > > > normal copy are folded.
> > > > 
> > > > Ah right, I had forgotten about that..
> > > > 
> > > > Here's another approach that doesn't need to remove trial evaluation for
> > > > &&/||.  The idea is to first quietly check if the second operand is
> > > > potentially constant _before_ performing trial evaluation of the first
> > > > operand.  This speeds up the case we care about (both operands are
> > > > potentially constant) without regressing any diagnostics.  We have to be
> > > > careful about emitting bogus diagnostics when tf_error is set, hence the
> > > > first hunk below which makes p_c_e_1 always proceed quietly first, and
> > > > replay noisily in case of error (similar to how satisfaction works).
> > > > 
> > > > Would something like this be preferable?
> > > 
> > > Seems plausible, though doubling the number of stack frames is a downside.
> > 
> > Whoops, good point..  The noisy -> quiet adjustment only needs to
> > be performed during the outermost call to p_c_e_1, and not also during
> > each recursive call.  The revised diff below fixes this thinko, and so
> > only a single extra stack frame is needed AFAICT.
> > 
> > > What did you think of Jakub's suggestion of linearizing the terms?
> > 
> > IIUC that would fix the quadraticness, but it wouldn't address that
> > we end up evaluating the entire expression twice, once during the trial
> > evaluation of each term from p_c_e_1 and again during the proper
> > evaluation of the entire expression.  It'd be nice if we could somehow
> > avoid the double evaluation, as in the approach below (or in the first
> > patch).
> 
> OK with more comments to explain the tf_error hijinks.

Thanks a lot, here's the complete committed patch for the record:

-- >8 --

Subject: [PATCH] c++: quadratic constexpr behavior for left-assoc logical
 exprs [PR102780]

In the testcase below the two left fold expressions each expand into a
constant logical expression with 1024 terms, for which potential_const_expr
takes more than a minute to return true.  This happens because p_c_e_1
performs trial evaluation of the first operand of a &&/|| in order to
determine whether to consider the potentiality of the second operand.
And because the expanded expression is left-associated, this trial
evaluation causes p_c_e_1 to be quadratic in the number of terms of the
expression.

This patch fixes this quadratic behavior by making p_c_e_1 preemptively
compute potentiality of the second operand of a &&/||, and perform trial
evaluation of the first operand only if the second operand isn't
potentially constant.  We must be careful to avoid emitting bogus
diagnostics during the preemptive computation; to that end, we perform
this shortcut only when tf_error is cleared, and when tf_error is set we
now first check potentiality of the whole expression quietly and replay
the check noisily for diagnostics.

Apart from fixing the quadraticness for left-associated logical exprs,
this change also reduces compile time for the libstdc++ testcase
20_util/variant/87619.cc by about 15% even though our <variant> uses
right folds instead of left folds.  Likewise for the testcase in the PR,
for which compile time is reduced by 30%.  The reason for these speedups
is that p_c_e_1 no longer performs expensive trial evaluation of each term
of large constant logical expressions when determining their potentiality.

        PR c++/102780

gcc/cp/ChangeLog:

        * constexpr.c (potential_constant_expression_1) <case TRUTH_*_EXPR>:
        When tf_error isn't set, preemptively check potentiality of the
        second operand before performing trial evaluation of the first
        operand.
        (potential_constant_expression_1): When tf_error is set, first check
        potentiality quietly and return true if successful, otherwise
        proceed noisily to give errors.

gcc/testsuite/ChangeLog:

        * g++.dg/cpp1z/fold13.C: New test.
---
 gcc/cp/constexpr.c                  | 26 +++++++++++++++++++++-----
 gcc/testsuite/g++.dg/cpp1z/fold13.C | 29 +++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/fold13.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index a31d12b7451..40fe165c2b7 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -8892,13 +8892,18 @@ potential_constant_expression_1 (tree t, bool 
want_rval, bool strict, bool now,
       tmp = boolean_false_node;
     truth:
       {
-       tree op = TREE_OPERAND (t, 0);
-       if (!RECUR (op, rval))
+       tree op0 = TREE_OPERAND (t, 0);
+       tree op1 = TREE_OPERAND (t, 1);
+       if (!RECUR (op0, rval))
          return false;
+       if (!(flags & tf_error) && RECUR (op1, rval))
+         /* When quiet, try to avoid expensive trial evaluation by first
+            checking potentiality of the second operand.  */
+         return true;
        if (!processing_template_decl)
-         op = cxx_eval_outermost_constant_expr (op, true);
-       if (tree_int_cst_equal (op, tmp))
-         return RECUR (TREE_OPERAND (t, 1), rval);
+         op0 = cxx_eval_outermost_constant_expr (op0, true);
+       if (tree_int_cst_equal (op0, tmp))
+         return (flags & tf_error) ? RECUR (op1, rval) : false;
        else
          return true;
       }
@@ -9107,6 +9112,17 @@ bool
 potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
                                 tsubst_flags_t flags)
 {
+  if (flags & tf_error)
+    {
+      /* Check potentiality quietly first, as that could be performed more
+        efficiently in some cases (currently only for TRUTH_*_EXPR).  If
+        that fails, replay the check noisily to give errors.  */
+      flags &= ~tf_error;
+      if (potential_constant_expression_1 (t, want_rval, strict, now, flags))
+       return true;
+      flags |= tf_error;
+    }
+
   tree target = NULL_TREE;
   return potential_constant_expression_1 (t, want_rval, strict, now,
                                          flags, &target);
diff --git a/gcc/testsuite/g++.dg/cpp1z/fold13.C 
b/gcc/testsuite/g++.dg/cpp1z/fold13.C
new file mode 100644
index 00000000000..9d7554f8999
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/fold13.C
@@ -0,0 +1,29 @@
+// { dg-do compile { target c++17 } }
+// Verify constexpr evaluation of a large left fold logical expression
+// isn't quadratic in the size of the expanded expression.
+
+template<int> struct S { static constexpr bool value = true; };
+
+template<class T, T...> struct integer_sequence { };
+
+template<class T, T N>
+using make_integer_sequence
+#if __has_builtin(__make_integer_seq)
+  = __make_integer_seq<integer_sequence, T, N>;
+#else
+  = integer_sequence<T, __integer_pack(N)...>;
+#endif
+
+template<int... Is>
+constexpr bool f_impl(integer_sequence<int, Is...>) {
+  return (... && S<Is>::value);
+}
+
+static_assert(f_impl(make_integer_sequence<int, 1024>()));
+
+template<int... Is>
+constexpr bool g_impl(integer_sequence<int, Is...>) {
+  return (... || !S<Is>::value);
+}
+
+static_assert(!g_impl(make_integer_sequence<int, 1024>()));
-- 
2.33.1.711.g9d530dc002

Reply via email to