On 9/23/21 6:10 PM, Jeff Law wrote:
On 9/23/2021 5:15 AM, Aldy Hernandez wrote:
My upcoming improvements to the forward jump threader make it thread
more aggressively. In investigating some "regressions", I noticed
that it has always allowed threading through empty latches and across
loop boundaries. As we have discussed recently, this should be avoided
until after loop optimizations have run their course.
Note that this wasn't much of a problem before because DOM/VRP
couldn't find these opportunities, but with a smarter solver, we trip
over them more easily.
We used to be much more aggressive in this space -- but we removed the
equivalency tracking on backedges in the main part of DOM which had the
side effect of reducing the number of threads related to back edges in
loops.
I thought we couldn't thread through back edges at all in the old
threader, or are we talking about the same thing? We have a hard fail
on backedge thread attempts for anything but the backward threader and
its custom copier.
Of course that was generally a positive thing given the issues we've
been discussing.
Yeah. These tweaks have reduced the number of jump threads in my
bootstrap .ii by 6%, so a considerable amount. But they were
problematic threading paths to begin with.
For example, it removed the regression introduced by the backward
threader rewrite in gcc.dg/vect/bb-slp-16.c. Interestingly, for all the
checks we do in the backward threader, some threading through loop
boundaries seeps through. In particular, the check for loop crossing
excludes the taken edge, which IMO is a mistake. If the entire path is
in a loop, but the taken edge points to another loop, that by definition
is a loop crossing. Note, we have an exception for the first block in a
path being in another loop, but that's something else ;-).
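
To make the taken-edge point concrete, here is a toy sketch, not the actual backward threader code. Block, Path, and both function names are made up for illustration, and the first-block exception mentioned above is deliberately not modelled. It only shows how a check that looks at the blocks on the path misses a crossing that the taken edge introduces:

```cpp
#include <vector>

// Toy stand-ins for illustration only; these are NOT GCC's basic_block,
// edge, or loop types.
struct Block {
  int loop_id;  // id of the innermost loop containing this block
};

struct Path {
  std::vector<const Block *> blocks;  // blocks on the threading path
  const Block *taken_dest;            // destination of the final taken edge
};

// Check that only looks at the blocks on the path and ignores the
// taken edge, which is the behaviour described above.
bool crosses_loops_blocks_only(const Path &p) {
  for (const Block *b : p.blocks)
    if (b->loop_id != p.blocks.front()->loop_id)
      return true;
  return false;
}

// Same check, but the destination of the taken edge counts too.
bool crosses_loops_with_taken_edge(const Path &p) {
  if (crosses_loops_blocks_only(p))
    return true;
  return !p.blocks.empty()
         && p.taken_dest->loop_id != p.blocks.front()->loop_id;
}
```

With two path blocks in loop 1 and a taken edge landing in loop 2, the first function reports no crossing and the second one does.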
Anywhoo... we're catching it now. We really should clean this up and
merge the differing implementations. But I'm way over my time budget
for this ;-).
Because the forward threader doesn't have an independent localized cost
model like the new threader (profitable_path_p), it is difficult to
catch these things at discovery. However, we can catch them at
registration time, with the added benefit that all the threaders
(forward and backward) can share the handcuffs.
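
As a rough illustration of the registration-time idea (a sketch only, with hypothetical names; Candidate, Registry, and register_path here are not the real registry interface), both threaders funnel their candidates through one function, and that function refuses loop-crossing or empty-latch paths until loop optimizations are done:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is not the GCC path registry API.
enum class Threader { Forward, Backward };  // which threader found the path

struct Candidate {
  Threader source;
  bool crosses_loop;         // path enters or leaves a loop
  bool through_empty_latch;  // path goes through an empty loop latch
};

class Registry {
 public:
  explicit Registry(bool loop_opts_done) : loop_opts_done_(loop_opts_done) {}

  // Single choke point: the same restrictions apply no matter which
  // threader discovered the path.
  bool register_path(const Candidate &c) {
    if (!loop_opts_done_ && (c.crosses_loop || c.through_empty_latch))
      return false;  // rejected until loop optimizations have run
    accepted_.push_back(c);
    return true;
  }

  std::size_t size() const { return accepted_.size(); }

 private:
  bool loop_opts_done_;
  std::vector<Candidate> accepted_;
};
```

The point of the shape is that discovery can stay as aggressive as it likes; only the shared registration step needs to know about the loop restrictions.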
In an ideal world profitability and correctness would be separated --
but they're still intertwined and I don't think this makes that
situation particularly worse. And I do like having a single choke
point.
Huh, I hadn't thought about it that way, but you're right. The
profitable_path_p code is catching both correctness as well as
profitability issues. It seems all the profitability stuff is keyed by
param*fsm* compile options, though. It should be easy enough to separate.
Obviously you're cleaning this up, so I think a significant degree of
freedom should be given here....
Much appreciated.
Aldy