On Tue, Jul 25, 2017 at 7:54 AM, Uros Bizjak <ubiz...@gmail.com> wrote: > On Tue, Jul 25, 2017 at 3:52 PM, H.J. Lu <hjl.to...@gmail.com> wrote: >> On Fri, Jul 14, 2017 at 4:46 AM, H.J. Lu <hjl.to...@gmail.com> wrote: >>> On Fri, Jul 7, 2017 at 5:56 PM, H.J. Lu <hjl.to...@gmail.com> wrote: >>>> On Fri, Jul 07, 2017 at 09:58:42AM -0700, H.J. Lu wrote: >>>>> On Fri, Dec 20, 2013 at 8:06 AM, Jakub Jelinek <ja...@redhat.com> wrote: >>>>> > Hi! >>>>> > >>>>> > Honza recently changed the i?86 backend, so that it often doesn't >>>>> > do -maccumulate-outgoing-args by default on x86_64. >>>>> > Unfortunately, on some of the here included testcases this regressed >>>>> > quite a bit the generated code. As AVX vectors are used, the dynamic >>>>> > realignment code needs to assume e.g. that some of them will need to be >>>>> > spilled, and for -mno-accumulate-outgoing-args the code needs to set >>>>> > need_drap early as well. But in when emitting the prologue/epilogue, >>>>> > if need_drap is set, we don't perform the optimization for leaf >>>>> > functions >>>>> > which have zero size stack frame, thus we end up with uselessly doing >>>>> > dynamic stack realignment, setting up DRAP that nothing uses and later >>>>> > on >>>>> > restore everything back. >>>>> > >>>>> > This patch improves it, if the DRAP register isn't live at the start of >>>>> > entry bb successor and we aren't going to realign the stack, we don't >>>>> > need DRAP at all, and even if we need DRAP register, that can't be the >>>>> > sole >>>>> > reason for doing stack realignment, the prologue code is able to set up >>>>> > DRAP >>>>> > even without dynamic stack realignment. >>>>> > >>>>> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? >>>>> > >>>>> > 2013-12-20 Jakub Jelinek <ja...@redhat.com> >>>>> > >>>>> > PR target/59501 >>>>> > * config/i386/i386.c (ix86_save_reg): Don't return true for >>>>> > drap_reg >>>>> > if !crtl->stack_realign_needed. >>>>> > (ix86_finalize_stack_realign_flags): If drap_reg isn't live on >>>>> > entry >>>>> > and stack_realign_needed will be false, clear drap_reg and >>>>> > need_drap. >>>>> > Optimize leaf functions that don't need stack frame even if >>>>> > crtl->need_drap. >>>>> > >>>>> > * gcc.target/i386/pr59501-1.c: New test. >>>>> > * gcc.target/i386/pr59501-1a.c: New test. >>>>> > * gcc.target/i386/pr59501-2.c: New test. >>>>> > * gcc.target/i386/pr59501-2a.c: New test. >>>>> > * gcc.target/i386/pr59501-3.c: New test. >>>>> > * gcc.target/i386/pr59501-3a.c: New test. >>>>> > * gcc.target/i386/pr59501-4.c: New test. >>>>> > * gcc.target/i386/pr59501-4a.c: New test. >>>>> > * gcc.target/i386/pr59501-5.c: New test. >>>>> > * gcc.target/i386/pr59501-6.c: New test. > > LGTM, assuming Jakub is OK with the patch. > > Thanks, > Uros.
Jakub, can you take a look at this: https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00400.html Thanks. > >>>>> > >>>>> > --- gcc/testsuite/gcc.target/i386/pr59501-4a.c.jj 2013-12-20 >>>>> > 12:19:20.603212859 +0100 >>>>> > +++ gcc/testsuite/gcc.target/i386/pr59501-4a.c 2013-12-20 >>>>> > 12:23:33.647881672 +0100 >>>>> > @@ -0,0 +1,8 @@ >>>>> > +/* PR target/59501 */ >>>>> > +/* { dg-do compile { target { ! ia32 } } } */ >>>>> > +/* { dg-options "-O2 -mavx -maccumulate-outgoing-args" } */ >>>>> > + >>>>> > +#include "pr59501-3a.c" >>>>> > + >>>>> > +/* Verify no dynamic realignment is performed. */ >>>>> > +/* { dg-final { scan-assembler-not "and\[^\n\r]*sp" { xfail *-*-* } } >>>>> > } */ >>>>> > >>>>> >>>>> Since DRAP isn't used with -maccumulate-outgoing-args, pr59501-4a.c was >>>>> xfailed due to stack frame access via frame pointer instead of DARP. >>>>> This patch finds the maximum stack alignment from the stack frame access >>>>> instructions and avoids stack realignment if stack alignment needed is >>>>> less than incoming stack boundary. >>>>> >>>>> I am testing this patch. OK for trunk if there is no regression? >>>>> >>>>> >>>> >>>> We need to keep the preferred stack alignment as the minimum stack >>>> alignment. Here is the updated patch. Tested on x86-64. OK for >>>> trunk? >>>> >>>> Thanks. >>> >>> Hi Jakub, >>> >>> This patch fixes the xfailed testcase in your patch: >>> >>> https://gcc.gnu.org/ml/gcc-patches/2013-12/msg01767.html >>> >>> Your original patch: >>> >>> https://gcc.gnu.org/ml/gcc-patches/2011-11/msg01058.html >>> >>> assumes that any instructions accessing stack require stack >>> realignment: >>> >>> + FOR_EACH_BB (bb) >>> + { >>> + rtx insn; >>> + FOR_BB_INSNS (bb, insn) >>> + if (NONDEBUG_INSN_P (insn) >>> + && requires_stack_frame_p (insn, prologue_used, >>> + set_up_by_prologue)) >>> + { >>> + crtl->stack_realign_needed = stack_realign; >>> + crtl->stack_realign_finalized = true; >>> + return; >>> + } >>> + } >>> >>> This patch checks the actual alignment needed for any instructions >>> accessing stack. It skips stack realignment if the actual stack alignment >>> needed is less than or equal to incoming stack alignment. >>> >>> Can you take look at it? >>> >>> Thanks. >>> >> >> PING >> >> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00400.html >> >> -- >> H.J. -- H.J.