https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120839
--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #14)
> (In reply to Richard Biener from comment #12)
> > I don't see HJ is working on this but I clearly do not know enough of
> > this code.  I believe it's a backend issue though and a fix must be
> > done in the backend.
>
> My current patch is at
>
> https://patchwork.sourceware.org/project/gcc/list/?series=49364

But comment#4 and comment#5 are misguided.  The target cannot change
alignment of objects accessible by the user.  Iff the ABI really specifies
this should be passed only 128bit (which I doubt, see comment#10), then
we have to emit a callee copy so that accesses done in 'e' access an object
with specified alignment.

typedef struct {
  long double a, b;
} c __attribute__((aligned(32)));
double d;
void e(c f);// { d = f.a; }
c x;
void bar()
{
  e (x);
}

shows we do pass the object in a 128bit aligned stack slot only.  But

void e(c f)
{
  if ((unsigned long)&f & (1<<5 - 1))
    __builtin_abort ();
}

shows we elide the alignment test.  And

typedef int v8si __attribute__((vector_size(32)));
typedef struct {
  char a[32];
} c __attribute__((aligned(32)));
v8si d;
void __attribute__((noinline)) e(c f) { d = *(v8si *)f.a; }
c x;
void bar()
{
  e (x);
}

shows we pass 'x' in a 16 byte aligned stack slot, copy it to a local,
properly aligned storage and access that with large alignment.

This happens in assign_parm_setup_block

(insn 2 5 3 2 (set (reg:OI 99)
        (mem/c:OI (reg/f:DI 92 virtual-incoming-args) [0 f+0 S32 A64])) "t.c":6:39 -1
     (nil))
(insn 3 2 4 2 (set (mem/c:OI (plus:DI (reg/f:DI 93 virtual-stack-vars)
                (const_int -32 [0xffffffffffffffe0])) [0 f+0 S32 A256])
        (reg:OI 99)) "t.c":6:39 -1
     (nil))
(note 4 3 7 2 NOTE_INSN_FUNCTION_BEG)
(insn 7 4 8 2 (set (reg:V8SI 100)
        (mem/c:V8SI (plus:DI (reg/f:DI 93 virtual-stack-vars)
                (const_int -32 [0xffffffffffffffe0])) [1 MEM[(v8si *)&f]+0 S32 A256])) "t.c":6:43 -1
     (nil))

That is something we somehow fail to do for the testcase in question -
possibly XFmode is special here or some other code is confused about
the incoming argument alignment.

(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 6 2 (set (reg:XF 100)
        (mem/c:XF (reg/f:DI 92 virtual-incoming-args) [1 f.a+0 S16 A256])) "t.c":5:20 -1
     (nil))
(insn 6 5 0 2 (set (mem/c:DF (symbol_ref:DI ("d") [flags 0x2] <var_decl 0x7ffff740de40 d>) [3 d+0 S8 A64])
        (float_truncate:DF (reg:XF 100))) "t.c":5:20 -1
     (nil))

assign_parm_setup_block doesn't do this because data->stack_parm is
already assigned.  It gets cleared in assign_parm_adjust_stack_rtl for
the working case but not here:

  /* If we can't trust the parm stack slot to be aligned enough for its
     ultimate type, don't use that slot after entry.  We'll make another
     stack slot, if we need one.  */
  if (stack_parm
      && ((GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm)
           && ((optab_handler (movmisalign_optab, data->nominal_mode)
                != CODE_FOR_nothing)
               || targetm.slow_unaligned_access (data->nominal_mode,
                                                 MEM_ALIGN (stack_parm))))
          || (data->nominal_type
              && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
              && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
    stack_parm = NULL;

where the difference is data->stack_parm with A64 vs A128 and the
PREFERRED_STACK_BOUNDARY check which I do not understand.

I don't quite understand the movmisalign optab check either, but ...

the latter check was introduced in r0-64961-gbfc45551d5ace4

I believe the MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY check
needs to be dropped; changing it to <= also works and is less aggressive.

Anyway, dropping the check or changing it to <= fixes the testcase and
we emit

e:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        andq    $-32, %rsp
        movdqa  16(%rbp), %xmm0
        movaps  %xmm0, -32(%rsp)
        fldt    -32(%rsp)
        fstpl   d(%rip)
        leave
        .cfi_def_cfa 7, 8
        ret
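
To make the A64 vs A128 difference concrete, here is a rough standalone
model of only the second disjunct of the condition in
assign_parm_adjust_stack_rtl.  It is a sketch, not GCC code: the function
and enum names are hypothetical, the first (mode alignment) disjunct is
assumed to be false, and PREFERRED_STACK_BOUNDARY is assumed to be 128 bits,
the usual x86-64 default.

```c
#include <stdbool.h>

/* Assumption: 128 bits, i.e. the default 16-byte x86-64 stack boundary.  */
#define MODEL_PREFERRED_STACK_BOUNDARY 128u

/* Hypothetical labels for the three variants discussed above:
   the existing '<' test, the '<=' test, and no boundary test at all.  */
enum boundary_check { CHECK_LT, CHECK_LE, CHECK_DROPPED };

/* Return true when this model would clear stack_parm, so that a new,
   sufficiently aligned slot gets allocated and the argument is copied.
   All alignments are in bits, as with TYPE_ALIGN and MEM_ALIGN.  */
bool
model_drops_stack_parm (unsigned type_align, unsigned slot_align,
                        enum boundary_check variant)
{
  /* The incoming slot already satisfies the type: keep using it.  */
  if (type_align <= slot_align)
    return false;

  switch (variant)
    {
    case CHECK_LT:
      return slot_align < MODEL_PREFERRED_STACK_BOUNDARY;
    case CHECK_LE:
      return slot_align <= MODEL_PREFERRED_STACK_BOUNDARY;
    case CHECK_DROPPED:
      return true;
    }
  return false;
}
```

Under these assumptions, a 256-bit type in an A64 slot drops the slot with
the existing '<' test, while the same type in an A128 slot keeps it; with
'<=' or with the test removed, the A128 slot is dropped as well.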
