Seems I had misused volatile. I removed ‘volatile’ from the function argument on test_0 and it prevented the spill through the stack.
I added volatile because I was trying to avoid the compiler optimising away the call to test_0 (as it has no side effects) but it appeared that volatile was unnecessary and was a misuse of volatile (intended to indicate storage may change outside of the control of the compiler). However it is an interesting case… as a register arguments don’t have storage. GCC, Clang folk, any ideas on why there is a stack spill for a volatile register argument passed in esi? Does volatile force the argument to have storage allocated on the stack? Is this a corner case in the C standard? This argument in the x86_64 calling convention only has a register, so technically it can’t change outside the control of the C "virtual machine” so volatile has a vague meaning here. This seems to be a case of interpreting the C standard in such a was as to make sure that a volatile argument “can be changed” outside the control of the C "virtual machine” by explicitly giving it a storage location on the stack. I think volatile scalar arguments are a special case and that the volatile type label shouldn’t widen the scope beyond the register unless it actually *needs* storage to spill. This is not a volatile stack scoped variable unless the C standard interprets ABI register parameters as actually having ‘storage’ so this is questionable… Maybe I should have gotten a warning… or the volatile type qualifier on a scalar register argument should have been ignored… volatile for scalar function arguments seems to mean: “make this volatile and subject to change outside of the compiler” rather than being a qualifier for its storage (which is a register). # gcc test_0: mov DWORD PTR [rsp-4], esi mov ecx, DWORD PTR [rsp-4] mov eax, edi cdq idiv ecx mov eax, edx ret # clang test_0: mov dword ptr [rsp - 4], esi xor edx, edx mov eax, edi div dword ptr [rsp - 4] mov eax, edx ret /* Test program compiled on x86_64 with: cc -O3 -fomit-frame-pointer -masm=intel -S test.c -o test.S */ #include <stdio.h> #include <limits.h> static const int p = 8191; static const int s = 13; int __attribute__ ((noinline)) test_0(unsigned int k, volatile int p) { return k % p; } int __attribute__ ((noinline)) test_1(unsigned int k) { return k % p; } int __attribute__ ((noinline)) test_2(unsigned int k) { int i = (k&p) + (k>>s); i = (i&p) + (i>>s); if (i>=p) i -= p; return i; } int main() { test_0(1, 8191); /* control */ for (int i = INT_MIN; i < INT_MAX; i++) { int r1 = test_1(i), r2 = test_2(i); if (r1 != r2) printf("%d %d %d\n", i, r1, r2); } } > On 27 Mar 2016, at 2:32 PM, Andrew Waterman <and...@sifive.com> wrote: > > It would be good to figure out how to get rid of the spurious register spills. > > The strength reduction optimization isn't always profitable on Rocket, > as it increases instruction count and code size. The divider has an > early out and for small numbers is quite fast. > > On Fri, Mar 25, 2016 at 5:43 PM, Michael Clark <michaeljcl...@mac.com> wrote: >> Now considering I have no idea how many cycles it takes for an integer >> divide on the Rocket so the optimisation may not be a win. >> >> Trying to read MuDiv in multiplier.scala, and will at some point run some >> timings in the cycle-accurate simulator. >> >> In either case, the spurious stack moves emitted by GCC are curious... >> >>> On 26 Mar 2016, at 9:42 AM, Michael Clark <michaeljcl...@mac.com> wrote: >>> >>> Hi All, >>> >>> I have found an interesting case where an optimisation is not being applied >>> by GCC on RISC-V. And also some strange assembly output from GCC on RISC-V. >>> >>> Both GCC and Clang appear to optimise division by a constant Mersenne prime >>> on x86_64 however GCC on RISC-V is not applying this optimisation. >>> >>> See test program and assembly output for these platforms: >>> >>> * GCC -O3 on RISC-V >>> * GCC -O3 on x86_64 >>> * LLVM/Clang -O3 on x86_64 >>> >>> Another strange observation is GCC on RISC-V is moving a1 to a5 via a stack >>> store followed by a stack load. Odd? GCC 5 also seems to be doing odd stuff >>> with stack ‘moves' on x86_64, moving esi to ecx via the stack (I think >>> recent x86 micro-architecture treats tip of the stack like an extended >>> register file so this may only have a small penalty on x86). >>> >>> See GCC on RISC-V is emitting this: >>> >>> test_0: >>> add sp,sp,-16 >>> sw a1,12(sp) >>> lw a5,12(sp) >>> add sp,sp,16 >>> remuw a0,a0,a5 >>> jr ra >>> >>> instead of this: >>> >>> test_0: >>> remuw a0,a0,a1 >>> jr ra >>> >>> Compiler devs, please read Test program and assembly output. I have not yet >>> tested LLVM/Clang on RISC-V yet… I will do that next… I have not had time >>> to dig into compiler code yet... >>> >>> Regards, >>> Michael. >>> >>> >>> /* Test program */ >>> >>> #include <stdio.h> >>> #include <limits.h> >>> >>> static const int p = 8191; >>> static const int s = 13; >>> >>> int __attribute__ ((noinline)) test_0(unsigned int k, volatile int p) >>> { >>> return k % p; >>> } >>> >>> int __attribute__ ((noinline)) test_1(unsigned int k) >>> { >>> return k % p; >>> } >>> >>> int __attribute__ ((noinline)) test_2(unsigned int k) >>> { >>> int i = (k&p) + (k>>s); >>> i = (i&p) + (i>>s); >>> if (i>=p) i -= p; >>> return i; >>> } >>> >>> int main() >>> { >>> test_0(1, 8191); /* control */ >>> for (int i = INT_MIN; i < INT_MAX; i++) { >>> int r1 = test_1(i), r2 = test_2(i); >>> if (r1 != r2) printf("%d %d %d\n", i, r1, r2); >>> } >>> } >>> >>> >>> >>> /* RISC-V GCC */ >>> >>> $ riscv64-unknown-elf-gcc --version >>> riscv64-unknown-elf-gcc (GCC) 5.2.0 >>> >>> test_0: >>> add sp,sp,-16 >>> sw a1,12(sp) >>> lw a5,12(sp) >>> add sp,sp,16 >>> remuw a0,a0,a5 >>> jr ra >>> test_1: >>> li a5,8192 >>> addw a5,a5,-1 >>> remuw a0,a0,a5 >>> ret >>> test_2: >>> li a3,8192 >>> addw a2,a3,-1 >>> and a4,a0,a2 >>> srlw a0,a0,13 >>> addw a5,a4,a0 >>> and a0,a5,a2 >>> sraw a5,a5,13 >>> addw a0,a0,a5 >>> addw a3,a3,-2 >>> ble a0,a3,.L5 >>> subw a0,a0,a2 >>> .L5: >>> ret >>> >>> >>> /* Linux x86_64 GCC */ >>> >>> $ gcc --version >>> gcc (Debian 5.2.1-23) 5.2.1 20151028 >>> >>> test_0: >>> mov DWORD PTR [rsp-4], esi >>> mov ecx, DWORD PTR [rsp-4] >>> mov eax, edi >>> cdq >>> idiv ecx >>> mov eax, edx >>> ret >>> test_1: >>> mov eax, edi >>> mov rcx, rax >>> mov rdx, rax >>> sal rcx, 6 >>> sal rdx, 19 >>> add rdx, rcx >>> add rax, rdx >>> mov edx, edi >>> shr rax, 32 >>> sub edx, eax >>> shr edx >>> add eax, edx >>> shr eax, 12 >>> mov edx, eax >>> sal edx, 13 >>> sub edx, eax >>> sub edi, edx >>> mov eax, edi >>> ret >>> test_2: >>> mov eax, edi >>> shr edi, 13 >>> and eax, 8191 >>> add eax, edi >>> mov edx, eax >>> sar eax, 13 >>> and edx, 8191 >>> add eax, edx >>> lea edx, [rax-8191] >>> cmp eax, 8191 >>> cmovge eax, edx >>> ret >>> >>> >>> /* Darwin x86_64 LLVM Clang */ >>> >>> $ cc --version >>> Apple LLVM version 7.3.0 (clang-703.0.29) >>> >>> _test_0: >>> mov dword ptr [rsp - 4], esi >>> xor edx, edx >>> mov eax, edi >>> div dword ptr [rsp - 4] >>> mov eax, edx >>> ret >>> _test_1: >>> mov eax, edi >>> imul rax, rax, 524353 >>> shr rax, 32 >>> mov ecx, edi >>> sub ecx, eax >>> shr ecx >>> add ecx, eax >>> shr ecx, 12 >>> imul eax, ecx, 8191 >>> sub edi, eax >>> mov eax, edi >>> ret >>> _test_2: >>> mov eax, edi >>> and eax, 8191 >>> mov ecx, edi >>> shr ecx, 13 >>> add eax, ecx >>> add ecx, edi >>> and ecx, 8191 >>> shr eax, 13 >>> lea edx, [rcx + rax] >>> cmp edx, 8190 >>> lea eax, [rcx + rax - 8191] >>> cmovbe eax, edx >>> ret >>> >>> -- >>> You received this message because you are subscribed to the Google Groups >>> "RISC-V SW Dev" group. >>> To unsubscribe from this group and stop receiving emails from it, send an >>> email to sw-dev+unsubscr...@groups.riscv.org. >>> To post to this group, send email to sw-...@groups.riscv.org. >>> Visit this group at >>> https://groups.google.com/a/groups.riscv.org/group/sw-dev/. >>> To view this discussion on the web visit >>> https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/2600D96D-94BC-4259-9D39-DE4993859281%40mac.com. >> >> -- >> You received this message because you are subscribed to the Google Groups >> "RISC-V SW Dev" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to sw-dev+unsubscr...@groups.riscv.org. >> To post to this group, send email to sw-...@groups.riscv.org. >> Visit this group at >> https://groups.google.com/a/groups.riscv.org/group/sw-dev/. >> To view this discussion on the web visit >> https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/9F3C9DE6-F00B-4402-A83B-354455DEAFFA%40mac.com.