On Thu, Sep 10, 2015 at 8:14 AM, Segher Boessenkool
<seg...@kernel.crashing.org> wrote:
> This patch rewrites the shrink-wrapping algorithm, allowing non-linear
> pieces of CFG to be duplicated for use without prologue instead of just
> linear pieces.
>
> On PowerPC, this enables shrink-wrapping of about 2%-3% more functions.
> I expected more, but in most cases where this would help we cannot yet
> shrink-wrap because non-volatile registers are used, often already in
> the first block.
>
> Since with this patch you still get only one prologue, it doesn't do
> much either for the case where there are many no-return error paths
> (common in an enable-checking compiler build); all those paths end in
> a no-return call, and those need a prologue (are not sibling calls).
> There are PRs about this.  For shrink-wrapping, because all those
> paths want a prologue we put a prologue early in the function, although
> none of the "regular" code needs it.
>
> I instrumented things a bit (not in the patch).  We can get about 10%
> to 20% more functions shrink-wrapped by allowing multiple edges that
> need a prologue inserted (edges to one and the same block); this can be
> easily done by just inserting an extra block.  I'll work on this.
>
> Of the blocks chosen to have the prologue inserted, about 70% need a
> prologue because there is a call, 25% for other reasons (non-volatile
> register sets mostly), and only 5% do not themselves need a prologue.
>
> There are also cases where no block needs a prologue at all, but GCC
> thinks the function needs one nevertheless.  This happens for example
> if a stack frame was created for an address-taken local variable, but
> that variable was optimised away later.  This doesn't happen much in
> most cases (one in a thousand or so).  There are some cases (like -pg)
> where the compiler forces a stack frame even if nothing uses it.
>
> Shrink-wrapping is run at -O1, and basic block reordering is not.
> Shrink-wrapping would often benefit from some simple reordering.  There
> are quite a few targets that do not want the STC bbro at all, either;
> we should have a simple bbro that runs at -O1 as well, does not increase
> code size, and can be used for those targets that do not want STC.
>
> It also would be nice to get rid of the silly games shrink-wrapping
> plays (together with function.c) making fake edges for where the
> simple_returns should be inserted.  It would simplify a lot of code
> if we could just insert them directly.
>
> Bootstrapped and regression tested on powerpc64-linux.  Is this okay
> for mainline?
>
>
> Segher
>
>
> 2015-09-10  Segher Boessenkool  <seg...@kernel.crashing.org>
>
>         * shrink-wrap.c (requires_stack_frame_p): Fix formatting.
>         (dup_block_and_redirect): Delete function.
>         (can_dup_for_shrink_wrapping): New function.
>         (fix_fake_fallthrough_edge): New function.
>         (try_shrink_wrapping): Rewrite function.
>         (convert_to_simple_return): Call fix_fake_fallthrough_edge.
>

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67587


H.J.
