On 9/9/21 12:15 PM, Richard Biener wrote:
On Thu, Sep 9, 2021 at 11:21 AM Aldy Hernandez <al...@redhat.com> wrote:



On 9/9/21 10:58 AM, Richard Biener wrote:

I ran some experiments a while back, and my current work on the enhanced
solver/threader, can fold virtually everything the DOM/threader gets
(even with its use of const_and_copies, avail_exprs, and
evrp_range_analyzer), while getting 5% more DOM threads and 1% more
overall threads.  That is, I've been testing if the path solver can
solve everything the DOM threader needs (the hybrid approach I mentioned).

Unfortunately, replacing the forward threader right now is not feasible
for a few reasons:

a) The const_and_copies/avail_exprs relation framework can do floats,
and that's next year's ranger work.

Hmm, it sounds like relying fully on ranger is a bit premature then.

For floats, most definitely. Ranger doesn't do them. They're for the next release. However, const_and_copies/avail_exprs for integers we can do just fine, that's why evrp/VRP are much easier targets.


b) Even though we can seemingly fold everything DOM/threader does, in
order to replace it with a backward threader instance we'd have to merge
the cost/profitability code scattered throughout the forward threader,
as well as the EDGE_FSM* / EDGE_COPY* business.

c) DOM changes the IL as it goes.  Though we could conceivably divorce
do the threading after DOM is done.

Yeah, it does not actually process/simplify the blocks copied by threading.
In fact I think it only registers jump threading opportunities during the DOM
walk and commits them only later.  But it of course uses its avail / copies
stack to find them - that part you cannot easily divorce.

Yes, all the threaders register paths ahead of time and only do the actual CFG shuffling at the end with thread_through_all_blocks().


DOM is also yet another value-numbering framework - one could think
of stripping it down from that side, keeping the threading bits only
(but you'd still have the avail / copies bits).

Ughh, yeah. I've seen some random missed cases because of the value-numbering it does :-/. It's very frustrating that we have various ad-hoc VN methods.


That said, it has one nice property it can leverage due to its incredibly
simple memory redundancy handling, in that it usually performs way less
alias queries than FRE (even when you run the latter in non-iterative mode).

But the same way DOM can register jump threading opportunities FRE
could do as well.

My plan for DOM/threading for this release is to replace its evrp analyzer with a path solver (path_range_query). We can still use the avail/copies, and everything else should just work. I'm hand waiving on a few missed cases due to IL changing, but last time I checked we got way more in return. Andrew even thought we should get most things even in the presence of changing IL, but I haven't distilled testcases for him.

For VRP/threading, the same hybrid thing, but replacing the ASSERT_EXPRs, and the avail/copies with a path solver. There are no floats here. Things are looking good in my tests.

Once we do floats, if we could merge the EDGE_*_THREAD and the cost/profit bits between both threaders, we should be able to replace the entire forward threader with a backward client. Hopefully I don't run out of steam by then ;-).

Thanks for your insight on DOM.  It's very helpful.

Aldy

Reply via email to