Re: [RFA] Handle target with no length attributes sanely in bb-reorder.c

Jeff Law Fri, 02 Dec 2016 18:21:10 -0800

On 12/02/2016 03:22 PM, Segher Boessenkool wrote:

On Fri, Dec 02, 2016 at 09:47:00AM +0100, Richard Biener wrote:

STC tries to make as large as possible consecutive "traces", mainly to
help with instruction cache utilization and hit rate etc.  It cannot do
a very good job if it isn't allowed to copy blocks.


"simple" tries to (dynamically) have as many fall-throughs as possible,
i.e. as few jumps as possible.


I hope it special-cases backedges, that is, not make those fallthru.


It doesn't, and that is a good thing.

Firstly, what is classified as backedge doesn't make too much sense in
some cases; quite often the cfg isn't reducible.

Secondly, for simple loops it does not matter: the backedge has lower
frequency than the forward edges, so everything works out as you want.

Thirdly, consider this loop:

    |
    S<----
    |     |
    A     |
   / \    |
  B   C   |
   \ /    |
    D     |
    |     |
    E-----
    |
    F

where the backedge now has higher frequency than A->B and A->C.
With simple this ends up as:

S: ...; goto A

D: ...
E: ...; if (not again) goto F
S: ...
A: ...; if () goto C
B: ...; goto D

C: ...; goto D

and this is cheaper than the alternative:

S: ...
A: ...; if () goto C
B: ...
D: ...
E: ...; if (again) goto A
F:

C: ...; goto D

If a fraction f of the loop iterations does C (so 1-f does B), the
alternative has 1+2f jumps per iteration; the code simple makes has 1+f.

In fact, the latter can be particularly bad with small icaches. Thefirst sequence is far better. IIRC there were spec92 cases theregetting this kind of thing right made a measurable difference onembedded targets.


Jeff

Re: [RFA] Handle target with no length attributes sanely in bb-reorder.c

Reply via email to