Repository : ssh://darcs.haskell.org//srv/darcs/ghc
On branch  : master
http://hackage.haskell.org/trac/ghc/changeset/a9ce36118f0de3aeb427792f8f2c5ae097c94d3f

>---------------------------------------------------------------

commit a9ce36118f0de3aeb427792f8f2c5ae097c94d3f
Author: David M Peixotto <[email protected]>
Date:   Wed Oct 19 15:49:06 2011 -0500

    Change stack alignment to 16+8 bytes in STG code

    This patch changes the STG code so that %rsp is aligned to a 16-byte
    boundary + 8. This is the alignment required by the x86_64 ABI on
    entry to a function. Previously we kept %rsp aligned to a 16-byte
    boundary, but this was causing problems for the LLVM backend (see
    #4211).

    Since the stack is now 16+8 byte aligned in STG land on x86_64, we no
    longer need to invoke the LLVM stack mangler to rewrite the stack
    manipulations on x86_64 targets. This patch only modifies the
    alignment for x86_64 backends.

    Signed-off-by: David Terei <[email protected]>

>---------------------------------------------------------------

 compiler/llvmGen/LlvmMangler.hs   |  6 +++-
 compiler/nativeGen/X86/CodeGen.hs | 16 ++++++++------
 rts/StgCRun.c                     | 42 +++++++++++++++++++++---------------
 3 files changed, 37 insertions(+), 27 deletions(-)

diff --git a/compiler/llvmGen/LlvmMangler.hs b/compiler/llvmGen/LlvmMangler.hs
index 68e92cf..981bbf2 100644
--- a/compiler/llvmGen/LlvmMangler.hs
+++ b/compiler/llvmGen/LlvmMangler.hs
@@ -143,11 +143,13 @@ fixTables ss = fixed
    have been pushed, so sub 4). GHC though since it always uses jumps keeps
    the stack 16 byte aligned on both function calls and function entry.

-   We correct the alignment here.
+   We correct the alignment here for Mac OS X i386. The x86_64 target already
+   has the correct alignment since we keep the stack 16+8 aligned throughout
+   STG land for 64-bit targets.
 -}

 fixupStack :: B.ByteString -> B.ByteString -> B.ByteString
-#if !darwin_TARGET_OS
+#if !darwin_TARGET_OS || x86_64_TARGET_ARCH
 fixupStack = const
 #else

diff --git a/compiler/nativeGen/X86/CodeGen.hs b/compiler/nativeGen/X86/CodeGen.hs
index 1efa327..458f379 100644
--- a/compiler/nativeGen/X86/CodeGen.hs
+++ b/compiler/nativeGen/X86/CodeGen.hs
@@ -1842,15 +1842,17 @@ genCCall64 target dest_regs args =
     tot_arg_size = arg_size * length stack_args

         -- On entry to the called function, %rsp should be aligned
-        -- on a 16-byte boundary +8 (i.e. the first stack arg after
-        -- the return address is 16-byte aligned). In STG land
-        -- %rsp is kept 16-byte aligned (see StgCRun.c), so we just
-        -- need to make sure we push a multiple of 16-bytes of args,
-        -- plus the return address, to get the correct alignment.
+        -- on a 16-byte boundary +8 (i.e. the first stack arg
+        -- above the return address is 16-byte aligned). In STG
+        -- land %rsp is kept 8-byte aligned (see StgCRun.c), so we
+        -- just need to make sure we pad by eight bytes after
+        -- pushing a multiple of 16-bytes of args to get the
+        -- correct alignment. If we push an odd number of eight byte
+        -- arguments then no padding is needed.
         -- Urg, this is hard. We need to feed the delta back into
         -- the arg pushing code.
         (real_size, adjust_rsp) <-
-            if tot_arg_size `rem` 16 == 0
+            if (tot_arg_size + 8) `rem` 16 == 0
                 then return (tot_arg_size, nilOL)
                 else do -- we need to adjust...
                     delta <- getDeltaNat
@@ -1865,7 +1867,7 @@ genCCall64 target dest_regs args =
     delta <- getDeltaNat

     -- deal with static vs dynamic call targets
-    (callinsns,cconv) <-
+    (callinsns,_cconv) <-
       case target of
         CmmCallee (CmmLit (CmmLabel lbl)) conv
            -> -- ToDo: stdcall arg sizes

diff --git a/rts/StgCRun.c b/rts/StgCRun.c
index 7251e64..11e0543 100644
--- a/rts/StgCRun.c
+++ b/rts/StgCRun.c
@@ -267,28 +267,35 @@ StgRunIsImplementedInAssembler(void)
         "addq %0, %%rsp\n\t"
         "retq"
-        : : "i"(RESERVED_C_STACK_BYTES+48+8 /*stack frame size*/));
+        : : "i"(RESERVED_C_STACK_BYTES+48 /*stack frame size*/));

     /*
-      HACK alert!
-
-      The x86_64 ABI specifies that on a procedure call, %rsp is
+      The x86_64 ABI specifies that on entry to a procedure, %rsp is
       aligned on a 16-byte boundary + 8. That is, the first argument on
       the stack after the return address will be 16-byte aligned.

-      Which should be fine: RESERVED_C_STACK_BYTES+48 is a multiple
-      of 16 bytes.
-
-      BUT... when we do a C-call from STG land, gcc likes to put the
-      stack alignment adjustment in the prolog. eg. if we're calling
-      a function with arguments in regs, gcc will insert 'subq $8,%rsp'
-      in the prolog, to keep %rsp aligned (the return address is 8
-      bytes, remember). The mangler throws away the prolog, so we
-      lose the stack alignment.
-
-      The hack is to add this extra 8 bytes to our %rsp adjustment
-      here, so that throughout STG code, %rsp is 16-byte aligned,
-      ready for a C-call.
+      We maintain the 16+8 stack alignment throughout the STG code.
+
+      When we call STG_RUN the stack will be aligned to 16+8. We used
+      to subtract an extra 8 bytes so that %rsp would be 16-byte
+      aligned at all times in STG land. This worked fine for the
+      native code generator, which knew that the stack was already
+      aligned on 16 bytes when it generated calls to C functions.
+
+      This arrangement caused problems for the LLVM backend.
+      The LLVM code generator would assume that on entry to each
+      function the stack is aligned to 16+8 as required by the ABI.
+      However, since we only enter STG functions by jumping to them
+      with tail calls, the stack was actually aligned to a 16-byte
+      boundary. The LLVM backend had its own mangler that would
+      post-process the assembly code to fix up the stack manipulation
+      code to maintain the correct alignment (see #4211).
+
+      Therefore, we now keep the stack aligned to 16+8 while in STG
+      land so that LLVM generates correct code without any mangling.
+      The native code generator can handle this alignment just fine
+      by making sure the stack is aligned to a 16-byte boundary
+      before it makes a C-call.

       A quick way to see if this is wrong is to compile this code:
@@ -300,7 +307,6 @@ StgRunIsImplementedInAssembler(void)
       stack isn't aligned, and calling exitWith from Haskell invokes
       shutdownHaskellAndExit using a C call.

-      Future gcc releases will almost certainly break this hack...
     */
 }

_______________________________________________
Cvs-ghc mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/cvs-ghc
