This is the second patch in the series.

Ranger uses its own API to recursively satisfy dependencies. When range_of_stmt is called on _1482 = _1154 + _1177;  it picks up the ranges of _1154 and _1177 from it's cache. If those statements have not been seen yet, it recursively calls range_of_stmt on each one to resolve the answer.  Each main API call can trigger up to 5 other calls to get to the next API point:

   gimple_ranger::fold_range_internal (...)
   gimple_ranger::range_of_stmt (_1154,...)
   gimple_ranger::range_of_expr (_1154,....)
   fold_using_range::range_of_range_op (..)
   fold_using_range::fold_stmt (...)
   gimple_ranger::fold_range_internal (...)
   gimple_ranger::range_of_stmt (_1482,...)

For a normal forward walk, values tend to already be in the cache, but when we try to answer a range_on_edge question on a back edge, it can trigger a very long series of queries.  I spent some time analyzing these patterns, and found that regardless of which API entry point was used, ultimately range_of_stmt is invoked in a predictable order to initiate the cache values.

This patch implements a dependency resolver which when range_of_stmt uses when it is called on something which does not have a cache entry yet (thus the disambiguation of the temporal failure vs lack of cache entry in the previous patch)

This looks at each operand, and if that operand does not have a cache entry, pushes it on a stack.   Names are popped from the stack and fold_using_range() is invoked once all the operands have been resolved.   When we do get to call fold_using_range::fold_stmt(), we are sure the operands are cached and the value will simply be calculated.  This is ultimately the exact series of events that would have happened had the main API been used... except we don't involve the call stack anymore for each one.

Well, mostly :-).  For this fix, we only do this with operands of stmts which have a range-ops handler.. meaning we do not use the API for anything range-ops understands.  We will still use the main API for resolving PHIS and other statements as they are encountered.    We could do this for PHIS as well, but for the most part it was the chains of stmts within a block that were causing the vast majority of the issue.  If we later discover large chains of PHIs are causing issues as well, then I can easily add them to this as well.  I avoided them this time because there is extra overhead involved in traversing all the PHI arguments extra times.  Sticking with range-ops limits us to 2 operands to check, and the overhead is very minimal.

I have tested this with PHIs as well and we could just include them upfront. The overhead is more than doubled, but the increased compile time of a VRP pass is still under 1%.

Bootstrapped on x86_64-pc-linux-gnu with no regressions.  OK?

Andrew



From 28d1fea6e6c0c0368dbc04e895aaa0a6b47c19da Mon Sep 17 00:00:00 2001
From: Andrew MacLeod <amacl...@redhat.com>
Date: Mon, 22 Nov 2021 14:39:41 -0500
Subject: [PATCH 3/3] Directly resolve range_of_stmt dependencies.

All ranger API entries eventually call range_of_stmt to ensure there is an
initial global value to work with.  This can cause very deep call chains when
satisfied via the normal API.  Instead, push any dependencies onto a stack
and evaluate them in a depth first manner, mirroring what would have happened
via the normal API calls.

	PR tree-optimization/103231
	gcc/
	* gimple-range.cc (gimple_ranger::gimple_ranger): Create stmt stack.
	(gimple_ranger::gimple_ranger): Delete stmt stack.
	(gimple_ranger::range_of_stmt): Process depenedencies if they have no
	global cache entry.
	(gimple_ranger::prefill_name): New.
	(gimple_ranger::prefill_stmt_dependencies): New.
	* gimple-range.h (class gimple_ranger): Add prototypes.
---
 gcc/gimple-range.cc | 107 +++++++++++++++++++++++++++++++++++++++++++-
 gcc/gimple-range.h  |   4 ++
 2 files changed, 109 insertions(+), 2 deletions(-)

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index e3ab3a8bb48..178a470a419 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -46,6 +46,9 @@ gimple_ranger::gimple_ranger () :
   m_oracle = m_cache.oracle ();
   if (dump_file && (param_ranger_debug & RANGER_DEBUG_TRACE))
     tracer.enable_trace ();
+  m_stmt_list.create (0);
+  m_stmt_list.safe_grow (num_ssa_names);
+  m_stmt_list.truncate (0);
 
   // Ensure the not_executable flag is clear everywhere.
   if (flag_checking)
@@ -61,6 +64,11 @@ gimple_ranger::gimple_ranger () :
     }
 }
 
+gimple_ranger::~gimple_ranger ()
+{
+  m_stmt_list.release ();
+}
+
 bool
 gimple_ranger::range_of_expr (irange &r, tree expr, gimple *stmt)
 {
@@ -284,9 +292,10 @@ gimple_ranger::range_of_stmt (irange &r, gimple *s, tree name)
   else
     {
       bool current;
-      // Check if the stmt has already been processed, and is not stale.
+      // Check if the stmt has already been processed.
       if (m_cache.get_global_range (r, name, current))
 	{
+	  // If it isn't stale, use this cached value.
 	  if (current)
 	    {
 	      if (idx)
@@ -294,7 +303,10 @@ gimple_ranger::range_of_stmt (irange &r, gimple *s, tree name)
 	      return true;
 	    }
 	}
-      // Otherwise calculate a new value.
+      else
+	prefill_stmt_dependencies (name);
+
+      // Calculate a new value.
       int_range_max tmp;
       fold_range_internal (tmp, s, name);
 
@@ -311,6 +323,97 @@ gimple_ranger::range_of_stmt (irange &r, gimple *s, tree name)
   return res;
 }
 
+
+// Check if NAME is a dependency that needs resolving, and push it on the
+// stack if so.  R is a scratch range.
+
+inline void
+gimple_ranger::prefill_name (irange &r, tree name)
+{
+  if (!gimple_range_ssa_p (name))
+    return;
+  gimple *stmt = SSA_NAME_DEF_STMT (name);
+  if (!gimple_range_handler (stmt))
+    return;
+
+  bool current;
+  // If this op has not been processed yet, then push it on the stack
+  if (!m_cache.get_global_range (r, name, current))
+    m_stmt_list.safe_push (name);
+}
+
+// This routine will seed the global cache with most of the depnedencies of
+// NAME.  This prevents excessive call depth through the normal API.
+
+void
+gimple_ranger::prefill_stmt_dependencies (tree ssa)
+{
+  if (SSA_NAME_IS_DEFAULT_DEF (ssa))
+    return;
+
+  int_range_max r;
+  unsigned idx;
+  gimple *stmt = SSA_NAME_DEF_STMT (ssa);
+  gcc_checking_assert (stmt && gimple_bb (stmt));
+
+  // Only pre-process range-ops.
+  if (!gimple_range_handler (stmt))
+    return;
+
+  // Mark where on the stack we are starting.
+  unsigned start = m_stmt_list.length ();
+  m_stmt_list.safe_push (ssa);
+
+  idx = tracer.header ("ROS dependence fill\n");
+
+  // Loop until back at the start point.
+  while (m_stmt_list.length () > start)
+    {
+      tree name = m_stmt_list.last ();
+      // NULL is a marker which indicates the next name in the stack has now
+      // been fully resolved, so we can fold it.
+      if (!name)
+	{
+	  // Pop the NULL, then pop the name.
+	  m_stmt_list.pop ();
+	  name = m_stmt_list.pop ();
+	  // Don't fold initial request, it will be calculated upon return.
+	  if (m_stmt_list.length () > start)
+	    {
+	      // Fold and save the value for NAME.
+	      stmt = SSA_NAME_DEF_STMT (name);
+	      fold_range_internal (r, stmt, name);
+	      m_cache.set_global_range (name, r);
+	    }
+	  continue;
+	}
+
+      // Add marker indicating previous NAME in list should be folded
+      // when we get to this NULL.
+      m_stmt_list.safe_push (NULL_TREE);
+      stmt = SSA_NAME_DEF_STMT (name);
+
+      if (idx)
+	{
+	  tracer.print (idx, "ROS dep fill (");
+	  print_generic_expr (dump_file, name, TDF_SLIM);
+	  fputs (") at stmt ", dump_file);
+	  print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
+	}
+
+      gcc_checking_assert (gimple_range_handler (stmt));
+      tree op = gimple_range_operand2 (stmt);
+      if (op)
+	prefill_name (r, op);
+      op = gimple_range_operand1 (stmt);
+      if (op)
+	prefill_name (r, op);
+    }
+  if (idx)
+    tracer.trailer (idx, "ROS ", false, ssa, r);
+}
+
+
 // This routine will invoke the gimple fold_stmt routine, providing context to
 // range_of_expr calls via an private interal API.
 
diff --git a/gcc/gimple-range.h b/gcc/gimple-range.h
index 615496ec9b8..c2923c58b27 100644
--- a/gcc/gimple-range.h
+++ b/gcc/gimple-range.h
@@ -47,6 +47,7 @@ class gimple_ranger : public range_query
 {
 public:
   gimple_ranger ();
+  ~gimple_ranger ();
   virtual bool range_of_stmt (irange &r, gimple *, tree name = NULL) OVERRIDE;
   virtual bool range_of_expr (irange &r, tree name, gimple * = NULL) OVERRIDE;
   virtual bool range_on_edge (irange &r, edge e, tree name) OVERRIDE;
@@ -61,9 +62,12 @@ public:
   bool fold_stmt (gimple_stmt_iterator *gsi, tree (*) (tree));
 protected:
   bool fold_range_internal (irange &r, gimple *s, tree name);
+  void prefill_name (irange &r, tree name);
+  void prefill_stmt_dependencies (tree ssa);
   ranger_cache m_cache;
   range_tracer tracer;
   basic_block current_bb;
+  vec<tree> m_stmt_list;
 };
 
 /* Create a new ranger instance and associate it with a function.
-- 
2.17.2

Reply via email to