On Thu, Sep 10, 2015 at 8:14 AM, Segher Boessenkool
<seg...@kernel.crashing.org> wrote:
> This patch rewrites the shrink-wrapping algorithm, allowing non-linear
> pieces of CFG to be duplicated for use without prologue instead of just
> linear pieces.
>
> On PowerPC, this enables shrink-wrapping of about 2%-3% more functions.
> I expected more, but in most cases this would help we cannot yet
> shrink-wrap because there are non-volatile registers used, often in the
> first block already.
>
> Since with this patch you still get only one prologue, it doesn't do
> much either for the case where there are many no-return error paths
> (common in an enable-checking compiler build); all those paths end in
> a no-return call, and those need a prologue (are not sibling calls).
> There are PRs about this. For shrink-wrapping, because all those
> paths want a prologue we put a prologue early in the function, although
> none of the "regular" code needs it.
>
> I instrumented things a bit (not in the patch). We can get about 10%
> to 20% more functions shrink-wrapped by allowing multiple edges that
> need a prologue inserted (edges to one and the same block); this can be
> easily done by just inserting an extra block. I'll work on this.
>
> Of the blocks chosen to have the prologue inserted, about 70% need a
> prologue because there is a call, 25% for other reasons (non-volatile
> register sets mostly), and only 5% do not themselves need a prologue.
>
> There are also cases where no block needs a prologue at all, but GCC
> thinks the function needs one nevertheless. This happens for example
> if a stack frame was created for an address-taken local variable, but
> that variable was optimised away later. This doesn't happen much in
> most cases (one in a thousand or so). There are some cases (like -pg)
> where the compiler forces a stack frame even if nothing uses it.
>
> Shrink-wrapping is run at -O1, and basic block reordering is not.
> Shrink-wrapping would often benefit from some simple reordering. There
> are quite a few targets that do not want the STC bbro at all, either;
> we should have a simple bbro that runs at -O1 as well, does not increase
> code size, and can be used for those targets that do not want STC.
>
> It also would be nice to get rid of the silly games shrink-wrapping
> plays (together with function.c) making fake edges for where the
> simple_returns should be inserted. It would simplify a lot of code
> if we would (could) just insert them directly.
>
> Bootstrapped and regression tested on powerpc64-linux. Is this okay
> for mainline?
>
>
> Segher
>
>
> 2015-09-10  Segher Boessenkool  <seg...@kernel.crashing.org>
>
>         * shrink-wrap.c (requires_stack_frame_p): Fix formatting.
>         (dup_block_and_redirect): Delete function.
>         (can_dup_for_shrink_wrapping): New function.
>         (fix_fake_fallthrough_edge): New function.
>         (try_shrink_wrapping): Rewrite function.
>         (convert_to_simple_return): Call fix_fake_fallthrough_edge.
>
This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67587

H.J.