On Sat, Nov 05, 2011 at 10:50:44AM +0100, Jakub Jelinek wrote:
> From quick look, f1 isn't shrink-wrapped probably because of the set
> of bb's that need prologue/epilogue around it doesn't end in a return,
> but in a tail call.  Can't we just add a prologue before the bar call
> and throw the epilogue away (normally the epilogue in a function that
> ends only in a tail call is just emitted after the barrier and
> optimized away I think, we could do the same?).

http://gcc.gnu.org/ml/gcc-patches/2011-11/msg00046.html ought to cure
this particular problem.  With that patch, similar code on
powerpc-linux does result in shrink wrapping.

> And f2 is something that IMHO with especially AVX/AVX2 code happens very
> often, the prologue is expensive as it realigns the stack.  The reason
> for that is that until reload we don't know whether something won't be
> spilled on the stack and we need/want 32-byte aligned stack slots
> for that spilling.

Huh?  thread_prologue_and_epilogue is after reload.  So your backend
ought to be able to figure out whether an aligned stack is needed.

-- 
Alan Modra
Australia Development Lab, IBM