On 10/15/2021 8:21 AM, Aldy Hernandez wrote:
On 10/15/21 3:50 PM, Andrew MacLeod wrote:
I've been looking at the pathological time issue ranger has with the
testcase from, uuuuuh.. PR 97623 I think. I've lost the details,
but kept the file since it was showing unpleasant behaviour.
Most of the time is spent in callbacks from substitute_and_fold to
value_on_edge() dealing with PHI results and arguments. Turns out,
it's virtually all wasted time dealing with SSA_NAMEs with the
OCCURS_IN_ABNORMAL_PHI flag set.
This patch tells ranger not to consider any SSA_NAMEs which occur in
abnormal PHIs. This reduces the memory footprint of all the caches,
and also has a ripple effect with the new threader code which uses
the GORI exports and imports tables, making it faster as well, since no
SSA_NAME with the abnormal flag set will be entered into the tables.
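As a rough illustration of the filtering idea only (this is not the patch, and none of it is GCC code): the ToyName type and the toy_exports function below are made-up stand-ins, with a plain bool modelling the SSA_NAME_OCCURS_IN_ABNORMAL_PHI flag. Names with the flag set simply never make it into the table, so nothing downstream ever has to look at them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an SSA name; this is not GCC's tree type.
struct ToyName {
  std::string id;
  bool occurs_in_abnormal_phi;  // models SSA_NAME_OCCURS_IN_ABNORMAL_PHI
};

// Build a toy "exports" list for a block, skipping every name that has
// the abnormal flag set. Consumers of the list never see those names.
std::vector<std::string> toy_exports(const std::vector<ToyName> &names) {
  std::vector<std::string> out;
  for (const ToyName &n : names)
    if (!n.occurs_in_abnormal_phi)
      out.push_back(n.id);
  return out;
}
```

The point of filtering at table-build time rather than at each use is that every client of the table gets the saving for free.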
That alone was not quite enough, as the sheer volume of callbacks
still took time, so I added checks in the value_of_* class of
routines used by substitute_and_fold to indicate there is no constant
value available for any SSA_NAME with that flag set.
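A minimal sketch of that early-out, assuming a toy model rather than the real ranger interfaces: ToySsa, toy_range_query and toy_value_of are hypothetical names invented for this example, and the counter only exists to show that the expensive query is never reached for flagged names.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for an SSA name; this is not GCC's tree type.
struct ToySsa {
  bool occurs_in_abnormal_phi;      // models SSA_NAME_OCCURS_IN_ABNORMAL_PHI
  std::optional<long> known_value;  // set when the name has a single value
};

// Counts how often the (notionally expensive) range query is made.
static int toy_query_count = 0;

// Stand-in for the real range lookup; here it just returns the stored value.
static std::optional<long> toy_range_query(const ToySsa &n) {
  ++toy_query_count;
  return n.known_value;
}

// Toy analogue of a value_of_* callback: report "no constant available"
// for flagged names without ever issuing the range query.
std::optional<long> toy_value_of(const ToySsa &n) {
  if (n.occurs_in_abnormal_phi)
    return std::nullopt;
  return toy_range_query(n);
}
```

With one normal and one flagged name that both hold the same value, only the normal one yields a constant, and only calls on the normal one bump the counter.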
On my x86_64 box, before this change, that test case looked like:
 tree VRP           :   7.76 (  4%)   0.23 (  5%)   8.02 (  4%)   537k (  0%)
 tree VRP threader  :   7.20 (  4%)   0.08 (  2%)   7.28 (  4%)   392k (  0%)
 tree Early VRP     :  39.22 ( 22%)   0.07 (  2%)  39.44 ( 22%)  1142k (  0%)
And with this patch, the results are:
 tree VRP           :   7.57 (  6%)   0.26 (  5%)   7.85 (  6%)   537k (  0%)
 tree VRP threader  :   0.62 (  0%)   0.02 (  0%)   0.65 (  0%)   392k (  0%)
 tree Early VRP     :   4.00 (  3%)   0.01 (  0%)   4.03 (  3%)  1142k (  0%)
Which is a significant improvement, roughly 10x, for both EVRP and the threader.
The patch adjusts the ranger folder, as well as the hybrid folder.
Bootstrapped on x86_64-pc-linux-gnu with no regressions and no missed
cases that I have been able to find.
I don't want to push it quite yet as I wanted feedback to make sure
we don't actually do anything I'm not aware of with SSA_NAMEs which
have the ABNORMAL_PHI flag set. Most of the code I can find in VRP
and vr-values appears to punt, so I presume not even considering
those names is fine?
The backward threader skips both edges with EDGE_ABNORMAL set as well
as PHI results that have SSA_NAME_OCCURS_IN_ABNORMAL_PHI set.
The forward threader skips out on all abnormal edges as well. It
seems to even avoid threading through blocks where one of the two
outgoing edges is abnormal. Dunno if this was an oversight, or just
being extra careful.
Being extra careful. I couldn't convince myself that copying a block
with an abnormal edge (incoming or outgoing) was going to be reliably safe.
jeff