On Sunday, 22 April 2018 at 05:29:30 UTC, Mike Franklin wrote:
On Sunday, 22 April 2018 at 00:41:34 UTC, Nicholas Wilson wrote:
You're not using the C library version of it, the compiler
does the stack space reservation inline for you. There is no
way around this.
I'm not convinced. I did some no-runtime testing and
eventually found the implementation in druntime here:
https://github.com/dlang/druntime/blob/master/src/rt/alloca.d
Mike
The first assertion ("the C library isn't called") is readily
apparent from that assembly dump. The second is interesting but
less evident. It might be clearer looking at the actual assembly.
The doSomething function starts as such:
; sym._D4test11doSomethingFmZv (int arg_1h);
; prologue, saves the caller's frame pointer on the stack
0x563d809095ec 55 push rbp
0x563d809095ed 488bec mov rbp, rsp
; allocate stack memory
0x563d809095f0 4883ec20 sub rsp, 0x20
; setup arguments for the alloca call
; 0x20 is the size of the current stack allocation; it is stored in
; local_18h and rcx receives the address of that slot
0x563d809095f4 48c745e82000. mov qword [local_18h], 0x20 ; 32
0x563d809095fc 48ffc7 inc rdi
0x563d809095ff 48897de0 mov qword [local_20h], rdi
0x563d80909603 488d4de8 lea rcx, [local_18h]
; calls alloca
0x563d80909607 e830010000 call sym.__alloca
The alloca function works as such:
;-- __alloca:
; Note how we don't create a stack frame by "push rbp;mov rbp,rsp"
; Those instructions could be inlined, it's not a function per se
;
; At that point rcx holds the address of the slot containing the size
; of the calling function's stack frame, and rdi how much we want to add
0x563d8090973c 4889ca mov rdx, rcx
0x563d8090973f 4889f8 mov rax, rdi
; Round rax up to 16 bytes
0x563d80909742 4883c00f add rax, 0xf
0x563d80909746 24f0 and al, 0xf0
0x563d80909748 4885c0 test rax, rax
,=< 0x563d8090974b 7505 jne 0x563d80909752
| 0x563d8090974d b810000000 mov eax, 0x10
`-> 0x563d80909752 4889c6 mov rsi, rax
; Do the subtraction in rax which holds the new address
0x563d80909755 48f7d8 neg rax
0x563d80909758 4801e0 add rax, rsp
; Check for overflows
,=< 0x563d8090975b 7321 jae 0x563d8090977e
| ; Replace the old stack pointer by the new one
| 0x563d8090975d 4889e9 mov rcx, rbp
| 0x563d80909760 4829e1 sub rcx, rsp
| 0x563d80909763 482b0a sub rcx, qword [rdx]
| 0x563d80909766 480132 add qword [rdx], rsi
| 0x563d80909769 4889c4 mov rsp, rax
| 0x563d8090976c 4801c8 add rax, rcx
| 0x563d8090976f 4889e7 mov rdi, rsp
| 0x563d80909772 4801e6 add rsi, rsp
| 0x563d80909775 48c1e903 shr rcx, 3
| 0x563d80909779 f348a5 rep movsq qword [rdi], qword ptr [rsi]
,==< 0x563d8090977c eb02 jmp 0x563d80909780
|`-> 0x563d8090977e 31c0 xor eax, eax
| ; Done!
`--> 0x563d80909780 c3 ret
So as you can see, alloca isn't really a function in that it
doesn't create a stack frame. It also needs help from the compiler
to set up its arguments, since the current allocation size is needed
(passed through rcx at the beginning of alloca), which isn't a
parameter known to the programmer. The compiler has to detect that
__alloca call and set up an additional argument by itself. Alloca
then just ("just") modifies the calling frame.
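To make the arithmetic easier to follow, here is a rough C model of what the listing computes. It is only a sketch of my reading of the disassembly, not the druntime source: the struct, its fields and both function names are made up for illustration, addresses are plain integers, and the rep movsq copy is left out because the model has no real stack to move.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the registers and the hidden slot used by __alloca. */
struct frame_model {
    uint64_t rbp;         /* caller's frame pointer */
    uint64_t rsp;         /* stack pointer on entry to __alloca */
    uint64_t locals_size; /* compiler-maintained slot that rcx points to */
};

/* add rax, 0xf / and al, 0xf0, then the mov eax, 0x10 fallback for zero. */
uint64_t round_request(uint64_t size)
{
    uint64_t rounded = (size + 0xf) & ~(uint64_t)0xf;
    return rounded != 0 ? rounded : 16;
}

/* Returns the address handed back to the caller, or 0 on the failure path. */
uint64_t alloca_model(struct frame_model *f, uint64_t size)
{
    assert(f != NULL);
    uint64_t rounded = round_request(size);

    /* neg rax / add rax, rsp / jae: bail out if rsp - rounded would wrap. */
    if (f->rsp < rounded)
        return 0;

    /* mov rcx, rbp / sub rcx, rsp / sub rcx, [rdx]:
       bytes between rsp and the tracked locals (the real code moves them down). */
    uint64_t extra = f->rbp - f->rsp - f->locals_size;

    f->locals_size += rounded; /* add [rdx], rsi */
    f->rsp -= rounded;         /* mov rsp, rax */
    return f->rsp + extra;     /* add rax, rcx */
}
```

With rbp = 0x1000, rsp = 0xFD8 and a tracked size of 0x20, a request for 10 bytes lowers rsp to 0xFC8, bumps the tracked size to 0x30 and returns 0xFD0. The bumped size is what lets a second call in the same function compute the right offsets, which is why the compiler has to pass that slot behind the programmer's back.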
(I really hope I didn't mess something up)