On Sunday, 22 April 2018 at 05:29:30 UTC, Mike Franklin wrote:
On Sunday, 22 April 2018 at 00:41:34 UTC, Nicholas Wilson wrote:

You're not using the C library version of it, the compiler does the stack space reservation inline for you. There is no way around this.

I'm not convinced. I did some no-runtime testing and eventually found the implementation in druntime here: https://github.com/dlang/druntime/blob/master/src/rt/alloca.d

Mike

The first assertion ("the C library isn't called") is easily apparent from
that assembly dump. The second is interesting but not so evident.

It might be clearer looking at actual assembly.

The doSomething function starts as such:

; sym._D4test11doSomethingFmZv (int arg_1h);
    ; prologue, puts the old stack pointer on the stack
      0x563d809095ec      55             push rbp
      0x563d809095ed      488bec         mov rbp, rsp
    ; allocate stack memory
      0x563d809095f0      4883ec20       sub rsp, 0x20
    ; setup arguments for the alloca call
    ; that 0x20 in rcx is actually the size of the current stack allocation
      0x563d809095f4      48c745e82000.  mov qword [local_18h], 0x20 ; 32
      0x563d809095fc      48ffc7         inc rdi
      0x563d809095ff      48897de0       mov qword [local_20h], rdi
      0x563d80909603      488d4de8       lea rcx, [local_18h]
    ; calls alloca
      0x563d80909607      e830010000     call sym.__alloca

The alloca function works as such:

;-- __alloca:
    ; Note how we don't create a stack frame by "push rbp;mov rbp,rsp"
    ; Those instructions could be inlined, it's not a function per se
    ;
    ; At that point rcx points to the local holding the size of the calling
    ; function's stack allocation, and rdi holds how much we want to add
      0x563d8090973c      4889ca         mov rdx, rcx
      0x563d8090973f      4889f8         mov rax, rdi
    ; Round rax up to 16 bytes
      0x563d80909742      4883c00f       add rax, 0xf
      0x563d80909746      24f0           and al, 0xf0
      0x563d80909748      4885c0         test rax, rax
  ,=< 0x563d8090974b      7505           jne 0x563d80909752
  |   0x563d8090974d      b810000000     mov eax, 0x10
  `-> 0x563d80909752      4889c6         mov rsi, rax
    ; Do the subtraction in rax which holds the new address
      0x563d80909755      48f7d8         neg rax
      0x563d80909758      4801e0         add rax, rsp
    ; Check for overflows
  ,=< 0x563d8090975b      7321           jae 0x563d8090977e
  | ; Replace the old stack pointer by the new one
  |   0x563d8090975d      4889e9         mov rcx, rbp
  |   0x563d80909760      4829e1         sub rcx, rsp
  |   0x563d80909763      482b0a         sub rcx, qword [rdx]
  |   0x563d80909766      480132         add qword [rdx], rsi
  |   0x563d80909769      4889c4         mov rsp, rax
  |   0x563d8090976c      4801c8         add rax, rcx
  |   0x563d8090976f      4889e7         mov rdi, rsp
  |   0x563d80909772      4801e6         add rsi, rsp
  |   0x563d80909775      48c1e903       shr rcx, 3
  |   0x563d80909779      f348a5         rep movsq qword [rdi], qword ptr [rsi]
 ,==< 0x563d8090977c      eb02           jmp 0x563d80909780
 |`-> 0x563d8090977e      31c0           xor eax, eax
 |  ; Done!
 `--> 0x563d80909780      c3             ret

So as you can see, alloca isn't really a function, in that it doesn't create a stack frame of its own. It also needs help from the compiler to set up its arguments, since it needs the current allocation size (passed via rcx at the beginning of alloca), which isn't a parameter known to the programmer. The compiler has to detect that __alloca call and set up an additional argument by itself. Alloca then just ("just") modifies the calling frame.
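
To make the arithmetic in the listing above a bit more concrete, here is a rough C model of just two steps: rounding the requested size up to 16 bytes (with 0 becoming 16) and the overflow check before the stack pointer is replaced. The names round_request and new_stack_pointer are made up for illustration and are not part of druntime; this is a sketch of what the disassembly appears to do, not the actual implementation.

```c
#include <stdint.h>

/* Hypothetical helper: models "add rax, 0xf; and al, 0xf0" followed by
   the "test rax, rax / mov eax, 0x10" fallback for a zero result. */
uint64_t round_request(uint64_t nbytes)
{
    /* Clearing the low 4 bits of al is the same as clearing them in rax. */
    uint64_t size = (nbytes + 0xf) & ~(uint64_t)0xf;
    if (size == 0)
        size = 0x10;            /* never hand out a zero-sized block */
    return size;
}

/* Hypothetical helper: models "neg rax; add rax, rsp; jae fail".
   Returns the candidate new stack pointer, or 0 when sp - size would
   wrap around (the "xor eax, eax" path in the listing). */
uint64_t new_stack_pointer(uint64_t sp, uint64_t size)
{
    if (sp < size)
        return 0;               /* overflow: alloca returns null */
    return sp - size;
}
```

For instance, a request of 17 bytes rounds to 32, so the stack pointer would move down by 32. The real routine additionally copies whatever sits between rsp and the caller's locals (the return address here) down to the new top of stack and updates the caller's recorded allocation size; that part is not modelled.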


(I really hope I didn't mess something up)
