Re: Moving backwards/FSM threader into its own pass

Jeff Law Tue, 31 May 2016 07:56:32 -0700

On 05/30/2016 03:16 AM, Richard Biener wrote:


Ok, but the placement (and number of) threading passes then no longer depends
on DOM/VRP passes - and as you placed the threading passes _before_ those
passes the threading itself does not benefit from DOM/VRP but only from
previous optimization passes.

Right. Note that the number of passes is now actually the same as we had before; they're just occurring outside DOM/VRP.

The backwards threader's only dependency on DOM/VRP was to propagate constants into PHI nodes and to propagate away copies. That dependency was removed.
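
A minimal source-level sketch of the kind of opportunity involved, assuming a made-up function (the names before_threading and after_threading are hypothetical and not taken from GCC): flag becomes a PHI of the constants 1 and 0 at the join point, so each incoming edge already determines the outcome of the second test. The second function is a hand-written picture of the threaded control flow, not actual compiler output.

```c
/* Hypothetical example: 'flag' is a PHI of constants at the join,
   and the second conditional depends only on that PHI.  */
static int before_threading(int a, int x)
{
    int flag;

    if (a > 0)
        flag = 1;
    else
        flag = 0;

    x += 10;                /* join block: flag = PHI <1, 0> */

    if (flag)               /* outcome is known per incoming edge */
        return x * 2;
    return x - 1;
}

/* Hand-threaded equivalent: the join block is duplicated so each
   path goes straight to the destination it is known to reach.  */
static int after_threading(int a, int x)
{
    if (a > 0) {
        x += 10;
        return x * 2;
    }
    x += 10;
    return x - 1;
}
```

Both functions compute the same result for every input; the second simply never re-tests a value that the first branch already decided.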


I see this as opportunity to remove some of them ;)  I now see in the main
optimization pipeline

      NEXT_PASS (pass_fre);
      NEXT_PASS (pass_thread_jumps);
      NEXT_PASS (pass_vrp, true /* warn_array_bounds_p */);

position makes sense - FRE removed redundancies and fully copy/constant
propagated the IL.

      NEXT_PASS (pass_sra);
      /* The dom pass will also resolve all __builtin_constant_p calls
         that are still there to 0.  This has to be done after some
         propagations have already run, but before some more dead code
         is removed, and this place fits nicely.  Remember this when
         trying to move or duplicate pass_dominator somewhere earlier.  */
      NEXT_PASS (pass_thread_jumps);
      NEXT_PASS (pass_dominator, true /* may_peel_loop_headers_p */);

this position OTOH doesn't make much sense as IL cleanup is missing
after SRA and previous opts.  After loop we now have

We should look at this one closely. The backwards threader doesn't depend on IL cleanups. It should do its job regardless of the state of the IL.


      NEXT_PASS (pass_tracer);
      NEXT_PASS (pass_thread_jumps);
      NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
      NEXT_PASS (pass_strlen);
      NEXT_PASS (pass_thread_jumps);
      NEXT_PASS (pass_vrp, false /* warn_array_bounds_p */);

I don't think we want two threadings so close together.  It makes some sense
to have a threading _after_ DOM but before VRP (DOM cleaned up the IL).

That one is, IMHO, the least useful. I haven't done any significant analysis of this specific instance though to be sure. The step you saw was meant to largely preserve behavior. Further cleanups are definitely possible.

The most common case I've seen where DOM/VRP make transformations that then expose something useful to the backward threader comes from those pesky context sensitive equivalences.
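
A minimal sketch of what a context sensitive equivalence looks like at the source level, assuming hypothetical function names (with_equivalence and simplified are invented for illustration): inside the guarded region v is known to equal 0, a fact that holds only on that path. Once a pass substitutes it, the inner test has a known outcome, which is the sort of simplified condition a later threading pass could then act on.

```c
/* Hypothetical example: v == 0 holds only inside the guarded region.  */
static int with_equivalence(int v, int w)
{
    if (v == 0) {
        int t = v + w;      /* on this path only, v == 0, so t == w */
        if (t == w)         /* always true here, but only in this context */
            return 1;
        return 2;           /* unreachable once the equivalence is used */
    }
    return 3;
}

/* Hand-simplified equivalent after applying the equivalence.  */
static int simplified(int v, int w)
{
    (void) w;               /* w no longer affects the result */
    if (v == 0)
        return 1;
    return 3;
}
```

The equivalence is not a global property of v, which is why it has to be discovered per path rather than by ordinary constant propagation.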

We (primarily Andrew, but Aldy and myself are also involved) are looking at ways to more generally expose range information created for these situations. Exposing range information and getting it more precise by allowing "unnecessary copies" or some such would eliminate those cases where DOM/VRP expose new opportunities for the backwards jump threader.


So that would leave two from your four passes and expose the opportunity
to re-add one during early-opts, also after FRE.  That one should be
throttled down to operate in "-Os" mode though.

I'll take a look at them, but with some personal stuff and PTO it'll likely be a few weeks before I've got anything useful.


So, can you see what removing the two threading passes that don't make
sense to me do to your statistics?  And check whether a -Os-like threading
can be done early?

Interesting you should mention doing threading early -- that was one of the secondary motivations behind getting the backwards threading bits out into their own pass, I just failed to mention it.

Essentially we want to limit the backwards substitution to a single step within a single block for that case (which is trivially easy). That would allow us to run a very cheap threader during early optimizations.


Jeff
