Stack manipulation in LLVM backend

Max Bolingbroke Fri, 02 Apr 2010 08:02:52 -0700

Hi,

I've narrowed down one of the problems with the LLVM backend that
David descried in his thesis. LLVM sometimes generates gratuitous
extra stack manipulations.


For example, consider this input program:

{{{
module Toy where

toy :: (Int -> Maybe Int) -> Int
toy f = case f 10 of Just x -> x; Nothing -> 20
}}}

Compiled using ghc -O (which runs llvl's "opt -O2"), we get the
following LLVM IR:

{{{
define cc10 void @Toy_toy_entry(i32 %stg_terei_baseArg, i32
%stg_terei_spArg, i32 %stg_terei_hpArg, i32 %stg_terei_r1Arg) nounwind
{
cdx:
  %ndz = add i32 %stg_terei_spArg, -4             ; <i32> [#uses=3]
  %ndB = add i32 %stg_terei_baseArg, 84           ; <i32> [#uses=1]
  %ndC = inttoptr i32 %ndB to i32*                ; <i32*> [#uses=1]
  %ndD = load i32* %ndC                           ; <i32> [#uses=1]
  %ndE = icmp ult i32 %ndz, %ndD                  ; <i1> [#uses=1]
  br i1 %ndE, label %cdG, label %ndH

ndH:                                              ; preds = %cdx
  %ndJ = inttoptr i32 %stg_terei_spArg to i32*    ; <i32*> [#uses=2]
  %ndK = load i32* %ndJ                           ; <i32> [#uses=1]
  %ndP = inttoptr i32 %ndz to i32*                ; <i32*> [#uses=1]
  store i32 add (i32 ptrtoint (%Toy_toy1_closure_struct*
@Toy_toy2_closure to i32), i32 1), i32* %ndP
  store i32 ptrtoint (%sbo_info_struct* @sbo_info to i32), i32* %ndJ
  tail call cc10 void @stg_ap_p_fast(i32 %stg_terei_baseArg, i32 %ndz,
i32 %stg_terei_hpArg, i32 %ndK) nounwind
  ret void

cdG:                                              ; preds = %cdx
  %ne1 = add i32 %stg_terei_baseArg, -4           ; <i32> [#uses=1]
  %ne2 = inttoptr i32 %ne1 to i32*                ; <i32*> [#uses=1]
  %ne3 = load i32* %ne2                           ; <i32> [#uses=1]
  %ne4 = inttoptr i32 %ne3 to void (i32, i32, i32, i32)* ; <void (i32,
i32, i32, i32)*> [#uses=1]
  tail call cc10 void %ne4(i32 %stg_terei_baseArg, i32
%stg_terei_spArg, i32 %stg_terei_hpArg, i32 ptrtoint
(%Toy_toy_closure_struct* @Toy_toy_closure to i32)) nounwind
  ret void
}
}}}

So far so good. However, when we compile this with "llc -tailcallopt
-O2 -march=x86", we get:

{{{
_Toy_toy_entry:                         ## @Toy_toy_entry
## BB#0:                                ## %cdx
        subl    $12, %esp
        leal    -4(%ebp), %eax
        cmpl    84(%ebx), %eax
        jb      LBB2_2
## BB#1:                                ## %ndH
        movl    (%ebp), %esi
        movl    $_Toy_toy2_closure+1, -4(%ebp)
        movl    $_sbo_info, (%ebp)
        movl    %eax, %ebp
        addl    $12, %esp
        jmp     _stg_ap_p_fast  # TAILCALL
LBB2_2:                                 ## %cdG
        movl    -4(%ebx), %eax
        movl    $_Toy_toy_closure, %esi
        addl    $12, %esp
        jmpl    *%eax  # TAILCALL

        .globl  ___stginit_Toy_
        .align  4, 0x90
}}}

Note the "subl  $12, %esp" and "addl    $12, %esp" pairs. %esp is never
actually used, so this fiddling is entirely pointless! This seems to
happen in every single function LLVM compiles, so I imagine fixing the
problem would be quite a big win.

Without -tailcallopt, we get nice code:

{{{
_Toy_toy_entry:                         ## @Toy_toy_entry
## BB#0:                                ## %cdx
        leal    -4(%ebp), %eax
        cmpl    84(%ebx), %eax
        jb      LBB2_2
## BB#1:                                ## %ndH
        movl    (%ebp), %esi
        movl    $_Toy_toy2_closure+1, -4(%ebp)
        movl    $_sbo_info, (%ebp)
        movl    %eax, %ebp
        jmp     _stg_ap_p_fast  # TAILCALL
LBB2_2:                                 ## %cdG
        movl    -4(%ebx), %eax
        movl    $_Toy_toy_closure, %esi
        jmpl    *%eax  # TAILCALL
}}}

Unfortunately -tailcallopt is apparently the only way to get
*guaranteed* tail calls. The LLVM developers appear to know that
tailcallopt causes this sort of rubbish in the output
(http://old.nabble.com/Removing--tailcallopt--td27475523.html) but
obviously haven't fixed it (my results are with the very recent LLVM
r100183).

I couldn't get the stack manipulations to go away with any amount of
"noreturn" and "unreachable" annotations around "tail call"s and
function definitions. It was worth a try though :-)

In summary, it looks like anyone wanting to fix the excess stack
manipulations issue is going to have to get their hands messy and
delve into LLVM's LLC!

Cheers,
Max

_______________________________________________
Cvs-ghc mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/cvs-ghc

Stack manipulation in LLVM backend

Reply via email to