On 12 March 2012 00:58, Robert Jacques <sandf...@jhu.edu> wrote:

> That's an argument for using the right register for the job. And we can /
> will be doing this on x86-64, as other compilers have already done. Manu
> was arguing that MRV were somehow special and had mystical optimization
> potential. That's simply not true.
>

Here are some tests for you:

// first test that the argument registers allocate as expected...

    int gprtest(int x, int y, int z)
    {
        return x+y+z;
    }

Perfect, ints pass in register sequence, return in r0, no memory access:

    add     r0, r0, r1
    add     r0, r0, r2
    bx      lr

    float fptest(float x, float y, float z)
    {
        return x+y+z;
    }

Same for floats:

    fadds   s0, s0, s1
    fadds   s0, s0, s2
    bx      lr

// Some MRV tests...

    auto mrv1(int x, int z)
    {
        return Tuple!(int, int)(x, z);
    }

Simple case, 2 ints.
FAIL, stores the 2 arguments it received in regs straight to the output struct pointer supplied:

    stmia   r0, {r1, r2}
    bx      lr

    auto mrv2(int x, float y, byte z)
    {
        return Tuple!(int, float, byte)(x, y, z);
    }

Different typed things.
EPIC FAIL:

    stmfd   sp!, {r4, r5}
    mov     ip, #0
    sub     sp, sp, #24
    mov     r4, r2
    str     ip, [sp, #12]
    str     ip, [sp, #20]
    ldr     r2, .L27
    add     ip, sp, #24
    mov     r3, r0
    mov     r5, r1
    str     r2, [sp, #16]   @ float
    ldmdb   ip, {r0, r1, r2}
    stmia   r3, {r0, r1, r2}
    fsts    s0, [r3, #4]
    stmia   sp, {r0, r1, r2}
    str     r5, [r3, #0]
    strb    r4, [r3, #8]
    mov     r0, r3
    add     sp, sp, #24
    ldmfd   sp!, {r4, r5}
    bx      lr

    auto range(int *p)
    {
        return p[0..1];
    }

Range.
SURPRISE FAIL, even a range is returned as a struct! O_O

    mov     r2, #1
    str     r2, [r0, #0]
    str     r1, [r0, #4]
    bx      lr

So the D ABI is a complete shambles on ARM!
Unsurprisingly, it all just follows the return struct by-val ABI, which is to write it to the stack unconditionally. And sadly, it even thinks the internal types like range+delegate are just a struct by-val, and completely ruins those!

Let's try again with x86...

    auto mrv1(int x, int z)
    {
        return Tuple!(int, int)(x, z);
    }

Returns in eax/edx as expected:

    movl    4(%esp), %eax
    movl    8(%esp), %edx

    auto mrv2(int x, float y, int z)
    {
        return Tuple!(int, float, int)(x, y, z);
    }

FAIL! All written to a struct rather than returning in eax, edx, st0.
This is C ABI baggage, D can do better.

    movl    4(%esp), %eax
    movl    8(%esp), %edx
    movl    %edx, (%eax)
    movl    12(%esp), %edx
    movl    %edx, 4(%eax)
    movl    16(%esp), %edx
    movl    %edx, 8(%eax)
    ret     $4

    auto range(int *p)
    {
        return p[0..1];
    }

Obviously, the small struct optimisation allows this to work properly:

    movl    $1, %eax
    movl    4(%esp), %edx
    ret

All that said, x86 isn't a good test case, since all args are ALWAYS passed on the stack. x64 would be a much better test since it actually has arg registers, but I'm on Windows, so no x64 for me...
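
For anyone who wants to repeat the comparison on another target (x64 in particular), here is a rough sketch that collects the test functions above into one compilable file. The file name abitest.d and the main() with its checks are additions for illustration only and were not part of the tests above. It assumes a GCC-based D compiler such as gdc, where something like `gdc -O2 -S abitest.d` should emit the assembly to inspect; the output will of course depend on the compiler version and target.

```d
// abitest.d (hypothetical file name): the test functions from this post,
// gathered into one module so the generated code can be inspected.
import std.typecons : Tuple;

// Argument-register baseline: plain ints and floats.
int gprtest(int x, int y, int z) { return x + y + z; }
float fptest(float x, float y, float z) { return x + y + z; }

// Multiple return values expressed as tuples.
auto mrv1(int x, int z) { return Tuple!(int, int)(x, z); }
auto mrv2(int x, float y, byte z) { return Tuple!(int, float, byte)(x, y, z); }

// Slice of a pointer: a (length, pointer) pair built into the language.
auto range(int* p) { return p[0 .. 1]; }

// main() is an addition: it only checks that the functions behave
// as written, it says nothing about how the values are returned.
void main()
{
    assert(gprtest(1, 2, 3) == 6);
    assert(fptest(1.0f, 2.0f, 3.0f) == 6.0f);

    auto a = mrv1(1, 2);
    assert(a[0] == 1 && a[1] == 2);

    auto b = mrv2(3, 1.5f, cast(byte) 4);
    assert(b[0] == 3 && b[1] == 1.5f && b[2] == 4);

    int value = 7;
    int[] s = range(&value);
    assert(s.length == 1 && s[0] == 7);
}
```

The checks in main() only confirm the functions compile and return the right values; whether each result comes back in registers or through a hidden struct pointer still has to be read off the assembly, as above.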