On 12/14/18 4:01 PM, H.J. Lu wrote: > On Thu, Dec 13, 2018 at 11:11 PM Uros Bizjak <ubiz...@gmail.com> wrote: >> On Thu, Dec 13, 2018 at 6:36 PM H.J. Lu <hongjiu...@intel.com> wrote: >>> get_frame_size () returns used stack slots during compilation, which >>> may be optimized out later. Since ix86_find_max_used_stack_alignment >>> is called by ix86_finalize_stack_frame_flags to check if stack frame >>> is required, there is no need to call get_frame_size () which may give >>> inaccurate final stack frame size. >>> >>> Tested on AVX512 machine configured with >>> >>> --with-arch=native --with-cpu=native >>> >>> OK for trunk? >>> >>> >>> H.J. >>> --- >>> gcc/ >>> >>> PR target/88483 >>> * config/i386/i386.c (ix86_finalize_stack_frame_flags): Don't >>> use get_frame_size (). >>> >>> gcc/testsuite/ >>> >>> PR target/88483 >>> * gcc.target/i386/stackalign/pr88483.c: New test. >> LGTM, but you know this part of the compiler better than I. >> >> Thanks, >> Uros. >> >>> --- >>> gcc/config/i386/i386.c | 1 - >>> .../gcc.target/i386/stackalign/pr88483.c | 17 +++++++++++++++++ >>> 2 files changed, 17 insertions(+), 1 deletion(-) >>> create mode 100644 gcc/testsuite/gcc.target/i386/stackalign/pr88483.c >>> >>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c >>> index caa701fe242..edc8f4f092e 100644 >>> --- a/gcc/config/i386/i386.c >>> +++ b/gcc/config/i386/i386.c >>> @@ -12876,7 +12876,6 @@ ix86_finalize_stack_frame_flags (void) >>> && flag_exceptions >>> && cfun->can_throw_non_call_exceptions) >>> && !ix86_frame_pointer_required () >>> - && get_frame_size () == 0 >>> && ix86_nsaved_sseregs () == 0 >>> && ix86_varargs_gpr_size + ix86_varargs_fpr_size == 0) >>> { >>> diff --git a/gcc/testsuite/gcc.target/i386/stackalign/pr88483.c >>> b/gcc/testsuite/gcc.target/i386/stackalign/pr88483.c >>> new file mode 100644 >>> index 00000000000..5aec8fd4cf6 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/i386/stackalign/pr88483.c >>> @@ -0,0 +1,17 @@ >>> +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ >>> +/* { dg-options "-O2 -mavx2" } */ >>> + >>> +struct B >>> +{ >>> + char a[12]; >>> + int b; >>> +}; >>> + >>> +struct B >>> +f2 (void) >>> +{ >>> + struct B x = {}; >>> + return x; >>> +} >>> + >>> +/* { dg-final { scan-assembler-not >>> "and\[lq\]?\[^\\n\]*-\[0-9\]+,\[^\\n\]*sp" } } */ >>> -- >>> 2.19.2 >>> > My fix triggered a latent bug in ix86_find_max_used_stack_alignment. > Here is the fix. OK for trunk? > > Thanks. > > -- H.J. > > > 0001-x86-Properly-check-stack-reference.patch > > From 83f0b37f287ed198a3b50e2be6b0f7f5c154020e Mon Sep 17 00:00:00 2001 > From: "H.J. Lu" <hjl.to...@gmail.com> > Date: Fri, 14 Dec 2018 12:21:02 -0800 > Subject: [PATCH] x86: Properly check stack reference > > A latent bug in ix86_find_max_used_stack_alignment was uncovered by the > fix for PR target/88483, which caused: > > FAIL: gcc.target/i386/incoming-8.c scan-assembler andl[\\t ]*\\$-16,[\\t > ]*%esp > > on i386. ix86_find_max_used_stack_alignment failed to notice stack > reference via non-stack/frame registers and missed stack alignment > requirement. We should track all registers which may reference stack > by checking registers set from stack referencing registers. > > Tested on i686 and x86-64 with > > --with-arch=native --with-cpu=native > > on AVX512 machine. Tested on i686 and x86-64 without > > --with-arch=native --with-cpu=native > > on x86-64 machine. > > PR target/88483 > * config/i386/i386.c (ix86_stack_referenced_p): New function. > (ix86_find_max_used_stack_alignment): Call ix86_stack_referenced_p > to check if stack is referenced. > --- > gcc/config/i386/i386.c | 43 ++++++++++++++++++++++++++++++++++++++---- > 1 file changed, 39 insertions(+), 4 deletions(-) > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index 4599ca2a7d5..bf93ec3722f 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -12777,6 +12777,18 @@ output_probe_stack_range (rtx reg, rtx end) > return ""; > } > > +/* Return true if OP references stack frame though one of registers > + in STACK_REF_REGS. */ > + > +static bool > +ix86_stack_referenced_p (const_rtx op, rtx *stack_ref_regs) > +{ > + for (int i = 0; i < LAST_REX_INT_REG; i++) > + if (stack_ref_regs[i] && reg_mentioned_p (stack_ref_regs[i], op)) > + return true; > + return false; > +} > + > /* Return true if stack frame is required. Update STACK_ALIGNMENT > to the largest alignment, in bits, of stack slot used if stack > frame is required and CHECK_STACK_SLOT is true. */ > @@ -12801,6 +12813,12 @@ ix86_find_max_used_stack_alignment (unsigned int > &stack_alignment, > > bool require_stack_frame = false; > > + /* Array of hard registers which reference stack frame. */ > + rtx stack_ref_regs[LAST_REX_INT_REG]; > + memset (stack_ref_regs, 0, sizeof (stack_ref_regs)); > + stack_ref_regs[STACK_POINTER_REGNUM] = stack_pointer_rtx; > + stack_ref_regs[FRAME_POINTER_REGNUM] = frame_pointer_rtx; > + > FOR_EACH_BB_FN (bb, cfun) > { > rtx_insn *insn; > @@ -12811,16 +12829,33 @@ ix86_find_max_used_stack_alignment (unsigned int > &stack_alignment, > { > require_stack_frame = true; > > + rtx body = PATTERN (insn); > + if (GET_CODE (body) == SET) > + { > + /* If a register is set from a stack referencing > + register, it is a stack referencing register. */ > + rtx dest = SET_DEST (body); > + if (REG_P (dest) && !MEM_P (SET_SRC (body))) > + { > + int regno = REGNO (dest); > + gcc_assert (regno < FIRST_PSEUDO_REGISTER); > + if (GENERAL_REGNO_P (regno)) > + { > + add_to_hard_reg_set (&set_up_by_prologue, Pmode, > + regno); > + stack_ref_regs[regno] = dest; > + } > + } > + } > + > if (check_stack_slot) > { > /* Find the maximum stack alignment. */ > subrtx_iterator::array_type array; > FOR_EACH_SUBRTX (iter, array, PATTERN (insn), ALL) > if (MEM_P (*iter) > - && (reg_mentioned_p (stack_pointer_rtx, > - *iter) > - || reg_mentioned_p (frame_pointer_rtx, > - *iter))) > + && ix86_stack_referenced_p (*iter, > + stack_ref_regs)) > { > unsigned int alignment = MEM_ALIGN (*iter); > if (alignment > stack_alignment) This just does a linear scan of the blocks and insns within the block building up STACK_REF_REGS as we go.
The problem is there's no guarantee that we're visiting the blocks in execution order. So we might see an insn that indirectly references the stack reg *before* we see the insn which had the copy from the stack pointer. Or am I missing something? Jeff