103231 - Directly resolve range_of_stmt dependencies.

Andrew MacLeod via Gcc-patches Wed, 24 Nov 2021 06:07:39 -0800

On 11/24/21 04:17, Richard Biener wrote:

On Tue, Nov 23, 2021 at 6:04 PM Andrew MacLeod via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:

This is the second patch in the series.


Ranger uses its own API to recursively satisfy dependencies. When
range_of_stmt is called on _1482 = _1154 + _1177;  it picks up the
ranges of _1154 and _1177 from it's cache. If those statements have not
been seen yet, it recursively calls range_of_stmt on each one to resolve
the answer.  Each main API call can trigger up to 5 other calls to get
to the next API point:

     gimple_ranger::fold_range_internal (...)
     gimple_ranger::range_of_stmt (_1154,...)
     gimple_ranger::range_of_expr (_1154,....)
     fold_using_range::range_of_range_op (..)
     fold_using_range::fold_stmt (...)
     gimple_ranger::fold_range_internal (...)
     gimple_ranger::range_of_stmt (_1482,...)

For a normal forward walk, values tend to already be in the cache, but
when we try to answer a range_on_edge question on a back edge, it can
trigger a very long series of queries.  I spent some time analyzing
these patterns, and found that regardless of which API entry point was
used, ultimately range_of_stmt is invoked in a predictable order to
initiate the cache values.

This patch implements a dependency resolver which when range_of_stmt
uses when it is called on something which does not have a cache entry
yet (thus the disambiguation of the temporal failure vs lack of cache
entry in the previous patch)

This looks at each operand, and if that operand does not have a cache
entry, pushes it on a stack.   Names are popped from the stack and
fold_using_range() is invoked once all the operands have been
resolved.   When we do get to call fold_using_range::fold_stmt(), we are
sure the operands are cached and the value will simply be calculated.
This is ultimately the exact series of events that would have happened
had the main API been used... except we don't involve the call stack
anymore for each one.

Well, mostly :-).  For this fix, we only do this with operands of stmts
which have a range-ops handler.. meaning we do not use the API for
anything range-ops understands.  We will still use the main API for
resolving PHIS and other statements as they are encountered.    We could
do this for PHIS as well, but for the most part it was the chains of
stmts within a block that were causing the vast majority of the issue.
If we later discover large chains of PHIs are causing issues as well,
then I can easily add them to this as well.  I avoided them this time
because there is extra overhead involved in traversing all the PHI
arguments extra times.  Sticking with range-ops limits us to 2 operands
to check, and the overhead is very minimal.

I have tested this with PHIs as well and we could just include them
upfront. The overhead is more than doubled, but the increased compile
time of a VRP pass is still under 1%.

Bootstrapped on x86_64-pc-linux-gnu with no regressions.  OK?

OK.

Richard.

Andrew

committed.

Re: [PATCH 2/2] PR tree-optimization/103231 - Directly resolve range_of_stmt dependencies.

Reply via email to