https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120839

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 15e0dd547a9..7d1e786a78b 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -8980,7 +8980,8 @@ ix86_finalize_stack_frame_flags (void)
     {
       /* After stack_realign_needed is finalized, we can't no longer
         change it.  */
-      gcc_assert (crtl->stack_realign_needed == stack_realign);
+      gcc_assert (crtl->stack_realign_needed == stack_realign
+                 || !stack_realign);
       return;
     }


avoids the ICE - the logic would be it's OK to use stack realign when
it later isn't needed.  But then this generates

e:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        andq    $-16, %rsp
        fldt    16(%rbp)
        fstpl   d(%rip)
        leave
        .cfi_def_cfa 7, 8
        ret

this looks to miss the frame cleanup.  At cmpelim time we have

(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 6 2 (set (reg:XF 8 st [orig:100 f.a ] [100])
        (mem/c:XF (plus:DI (reg/f:DI 6 bp)
                (const_int 16 [0x10])) [1 f.a+0 S16 A256])) "t.c":5:20 172
{*movxf_internal}
     (expr_list:REG_EQUIV (mem/c:XF (reg/f:DI 16 argp) [1 f.a+0 S16 A256])
        (nil)))
(insn 6 5 9 2 (set (mem/c:DF (symbol_ref:DI ("d") [flags 0x2]  <var_decl
0x7fce963d6e40 d>) [3 d+0 S8 A64])
        (float_truncate:DF (reg:XF 8 st [orig:100 f.a ] [100]))) "t.c":5:20 212
{truncxfdf2}
     (nil))
;;  succ:       EXIT [always]

the frame is setup by the caller, so the A256 access does not need
stack re-alignment.  That shoud(?) be reflected by incoming_stack_boundary
but this is computed as 128.

crtl->stack_realign_needed is already set when we first arrive at
ix86_finalize_stack_frame_flags, and that setting looks not optimal.

The setting is from expand_stack_alignment () computed via

6899      crtl->stack_realign_needed
6900        = INCOMING_STACK_BOUNDARY < crtl->stack_alignment_estimated;

and it seems to be "fine" to re-set that during the first
ix86_finalize_stack_frame_flags.  At first we compute 256, we run into
cfun->machine->stack_frame_required and update stack_realign to true.
But 'stack_alignment' as computed by

  unsigned int stack_alignment
    = (crtl->is_leaf && !ix86_current_function_calls_tls_descriptor
       ? crtl->max_used_stack_slot_alignment
       : crtl->stack_alignment_needed);

does not change, both crtl->max_used_stack_slot_alignment and
crtl->stack_alignment_needed are unchanged.  It seems we want to
update that.  But if I do

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 15e0dd547a9..194b2e0f78d 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -9040,6 +9040,15 @@ ix86_finalize_stack_frame_flags (void)
              crtl->preferred_stack_boundary
                = incoming_stack_boundary;
            }
+         else
+           {
+             crtl->max_used_stack_slot_alignment
+               = stack_alignment;
+             crtl->stack_alignment_needed
+               = stack_alignment;
+             crtl->preferred_stack_boundary
+               = stack_alignment;
+           }
        }
       else
        {

then I hit an ICE at

#1  0x0000000001ffc347 in ix86_expand_prologue ()
    at /space/rguenther/src/gcc-clean/gcc/config/i386/i386.cc:9528
9528          gcc_assert (m->fs.sp_realigned_offset ==
frame.stack_realign_offset);
(gdb) l
9523             Beyond this point, stack access should be done via
choose_baseaddr or
9524             by using sp_valid_at and fp_valid_at to determine the correct
base
9525             register.  Henceforth, any CFA offset should be thought of as
logical
9526             and not physical.  */
9527          gcc_assert (m->fs.sp_realigned_offset >=
m->fs.sp_realigned_fp_last);
9528          gcc_assert (m->fs.sp_realigned_offset ==
frame.stack_realign_offset);
9529          m->fs.sp_realigned = true;
9530
9531          /* SEH unwind emit doesn't currently support REG_CFA_EXPRESSION,
which
9532             is needed to describe where a register is saved using a
realigned
(gdb) p m->fs.sp_realigned_offset
$1 = 32
(gdb) p frame.stack_realign_offset
$2 = 16

this code is confusing.

The analysis is still not reflecting reality (bigger incoming stack
boundary, frame accesses are not something stack-realignment can fix).

I don't see HJ is working on this but I clearly do not know enough of
this code.  I believe it's a backend issue though and a fix must be
done in the backend.

Reply via email to