On May 31, 2016 4:55:36 PM GMT+02:00, Jeff Law <l...@redhat.com> wrote:
>On 05/30/2016 03:16 AM, Richard Biener wrote:
>>
>> Ok, but the placement (and number of) threading passes then no longer depends
>> on DOM/VRP passes - and as you placed the threading passes _before_ those
>> passes the threading itself does not benefit from DOM/VRP but only from
>> previous optimization passes.
>Right.  Note that number of passes now is actually the same as we had
>before, they're just occurring outside DOM/VRP.
>
>The backwards threader's only dependency on DOM/VRP was to propagate
>constants into PHI nodes and to propagate away copies.  That dependency
>was removed.
>
>>
>> I see this as opportunity to remove some of them ;)  I now see in the main
>> optimization pipeline
>>
>>       NEXT_PASS (pass_fre);
>>       NEXT_PASS (pass_thread_jumps);
>>       NEXT_PASS (pass_vrp, true /* warn_array_bounds_p */);
>>
>> position makes sense - FRE removed redundancies and fully copy/constant
>> propagated the IL.
>>
>>       NEXT_PASS (pass_sra);
>>       /* The dom pass will also resolve all __builtin_constant_p calls
>>          that are still there to 0.  This has to be done after some
>>          propagations have already run, but before some more dead code
>>          is removed, and this place fits nicely.  Remember this when
>>          trying to move or duplicate pass_dominator somewhere earlier.  */
>>       NEXT_PASS (pass_thread_jumps);
>>       NEXT_PASS (pass_dominator, true /* may_peel_loop_headers_p */);
>>
>> this position OTOH doesn't make much sense as IL cleanup is missing
>> after SRA and previous opts.  After loop we now have
>We should look at this one closely.  The backwards threader doesn't
>depend on IL cleanups.  It should do its job regardless of the state of
>the IL.
>
>>
>>       NEXT_PASS (pass_tracer);
>>       NEXT_PASS (pass_thread_jumps);
>>       NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
>>       NEXT_PASS (pass_strlen);
>>       NEXT_PASS (pass_thread_jumps);
>>       NEXT_PASS (pass_vrp, false /* warn_array_bounds_p */);
>>
>> I don't think we want two threadings so close together.  It makes some sense
>> to have a threading _after_ DOM but before VRP (DOM cleaned up the IL).
>That one is, IMHO, the least useful.  I haven't done any significant
>analysis of this specific instance though to be sure.  The step you saw
>was meant to largely preserve behavior.  Further cleanups are definitely
>possible.
>
>The most common case I've seen where the DOM/VRP make transformations
>that then expose something useful to the backward threader come from
>those pesky context sensitive equivalences.
>
>We (primarily Andrew, but Aldy and myself are also involved) are looking
>at ways to more generally expose range information created for these
>situations.  Exposing range information and getting it more precise by
>allowing "unnecessary copies" or some such would eliminate those cases
>where DOM/VRP expose new opportunities for the backwards jump threader.
>
>>
>> So that would leave two from your four passes and expose the opportunity
>> to re-add one during early-opts, also after FRE.  That one should be
>> throttled down to operate in "-Os" mode though.
>I'll take a look at them, but with some personal stuff and PTO it'll
>likely be a few weeks before I've got anything useful.
>
>>
>> So, can you see what removing the two threading passes that don't make
>> sense to me do to your statistics?  And check whether a -Os-like threading
>> can be done early?
>Interesting you should mention doing threading early -- that was one of
>the secondary motivations behind getting the backwards threading bits
>out into their own pass, I just failed to mention it.
>
>Essentially we want to limit the backwards substitution to single step
>within a single block for that case (which is trivially easy).  That
>would allow us to run a very cheap threader during early optimizations.

Just to double check - the pass does both forward and backward threading, and DOM/VRP now do neither themselves?

Richard.

>Jeff
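
For readers following the thread, here is a source-level sketch of the kind of jump a threader removes. It is an illustration only, not GCC's implementation. The function names (`path_taken`, `path_not_taken`, `before_threading`, `after_threading`, `check_equivalent`) are hypothetical and appear nowhere in the thread or in GCC. In SSA form, `flag` in the first function would be a PHI of the constants 1 and 0 in the block that tests it, so a single substitution step inside that one block is enough to resolve the second conditional on each incoming edge. That is assumed here to be the "single step within a single block" case described above.

```c
#include <assert.h>

/* Hypothetical stand-ins for whatever work each path performs.  */
static int path_taken (int a, int b) { return a - b; }
static int path_not_taken (int a, int b) { return b - a; }

/* Before threading: the second conditional re-tests a value that is a
   known constant on each incoming edge (conceptually, flag is a PHI of
   the constants 1 and 0).  */
int
before_threading (int a, int b)
{
  int flag;

  if (a > b)
    flag = 1;
  else
    flag = 0;

  if (flag)
    return path_taken (a, b);
  return path_not_taken (a, b);
}

/* After threading: each arm of the first conditional goes straight to
   the destination the second conditional would have chosen, so the
   second test disappears.  */
int
after_threading (int a, int b)
{
  if (a > b)
    return path_taken (a, b);
  return path_not_taken (a, b);
}

/* Sanity check: both forms agree on a small grid of inputs.  */
void
check_equivalent (void)
{
  for (int a = -3; a <= 3; a++)
    for (int b = -3; b <= 3; b++)
      assert (before_threading (a, b) == after_threading (a, b));
}
```

Calling `check_equivalent ()` from a test driver checks that the two forms return the same result for every pair in the grid. The example says nothing about how the pass itself discovers the opportunity or what it costs.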