On 07/02/2018 07:08 AM, Christophe Lyon wrote:
On 11/07/2017 10:33 AM, Aldy Hernandez wrote:
While poking around in the backwards threader I noticed that we bail if
we have already seen a starting BB.
/* Do not jump-thread twice from the same block. */
if (bitmap_bit_p (threaded_blocks, entry->src->index)
This limitation discards paths that are sub-paths of paths that have
already been threaded.
The following patch scans the remaining to-be-threaded paths to identify
if any of them start from the same point, and are thus sub-paths of the
just-threaded path. By removing the common prefix of blocks in upcoming
threadable paths, and then rewiring first non-common block
appropriately, we expose new threading opportunities, since we are no
longer starting from the same BB. We also simplify the would-be
threaded paths, because we don't duplicate already duplicated paths.
[snip]
Hi,
I've noticed a regression on aarch64:
FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump thread3 "Jumps
threaded: 3"
very likely caused by this patch (appeared between 262282 and 262294)
Christophe
The test needs to be adjusted here.
The long story is that the aarch64 IL is different at thread3 time in
that it has 2 profitable sub-paths that can now be threaded with my
patch. This is causing the threaded count to be 5 for aarch64, versus 3
for x86 64. Previously we couldn't thread these in aarch64, so the
backwards threader would bail.
One can see the different threading opportunities by sticking
debug_all_paths() at the top of thread_through_all_blocks(). You will
notice that aarch64 has far more candidates to begin with. The IL on
the x86 backend, has no paths that start on the same BB. The aarch64,
on the other hand, has many to choose from:
path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11,
path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 19,
Some of these prove unprofitable, but 2 more than before are profitable now.
BTW, I see another threading related failure on aarch64 which is
unrelated to my patch, and was previously there:
FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump-not vrp2 "Jumps
threaded"
This is probably another IL incompatibility between architectures.
Anyways... the attached path fixes the regression. I have added a note
to the test explaining the IL differences. We really should rewrite all
the threading tests (I am NOT volunteering ;-)).
OK for trunk?
Aldy
curr.patch
gcc/testsuite/
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust test because aarch64
has a slightly different IL that provides more threading
opportunities.