Re: [go-nuts] Register-based ABI benchmarks
Thank you - it makes sense.

The first time I read https://github.com/golang/go/blob/master/src/cmd/compile/abi-internal.md I thought there were plenty of registers for parameters, even for x86_64. But strings, slices, interfaces, etc. each use multiple registers, so it does not take many parameters before some of them have to spill onto the stack.

Regards,
Didier.

On Fri, Feb 4, 2022 at 2:27 AM Robert Engels wrote:
> +1. Sometimes the compiler optimizations are even worse if they change the
> behavior the chip was typically expecting.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/CAKmhntmATcj%2BtxH%2BQLsLH_W4iT5xy6t-aaSPeyzN1AV%2BMA4Vdw%40mail.gmail.com.
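To make the register-pressure point concrete, here is a minimal sketch that counts how many integer-register words a hypothetical four-parameter signature would need. The register counts (9 integer argument registers on amd64, 16 on arm64) are my reading of abi-internal.md and should be treated as assumptions, as should the simplified model: the real assignment algorithm has more rules (whole-argument spilling, floating-point registers, results). The names handleWords and fits are made up for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Integer argument registers per architecture. These counts are an
// assumption based on abi-internal.md, not something this sketch verifies.
const (
	amd64IntRegs = 9
	arm64IntRegs = 16
)

const wordSize = unsafe.Sizeof(uintptr(0))

// words returns how many machine words a value of the given size occupies.
func words(size uintptr) int {
	return int((size + wordSize - 1) / wordSize)
}

// handleWords counts the words needed by a hypothetical signature
//
//	func handle(key string, payload, scratch []byte, err error)
//
// where a string is (pointer, length), a slice is (pointer, length,
// capacity) and an interface is (type, data).
func handleWords() int {
	var (
		key     string
		payload []byte
		err     error
	)
	return words(unsafe.Sizeof(key)) +
		2*words(unsafe.Sizeof(payload)) +
		words(unsafe.Sizeof(err))
}

// fits reports whether total words fit in regs integer registers under
// this simplified model.
func fits(total, regs int) bool {
	return total <= regs
}

func main() {
	total := handleWords()
	fmt.Printf("handle needs %d integer words\n", total)
	fmt.Printf("fits in amd64 registers: %v\n", fits(total, amd64IntRegs))
	fmt.Printf("fits in arm64 registers: %v\n", fits(total, arm64IntRegs))
}
```

Under these assumptions, four ordinary-looking parameters already need 10 words, which is more than the amd64 count but well within the arm64 count.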
Re: [go-nuts] Register-based ABI benchmarks
+1. Sometimes compiler optimizations even make things worse if they change the behavior the chip typically expects.

On Feb 3, 2022, at 2:23 PM, Ian Lance Taylor wrote:
> My view is that the x86 architecture has fewer registers and has had a
> massive decades-long investment in performance, so stack operations
> are highly optimized in hardware, including things like forwarding
> values stored in the stack by the caller to the retrieval from the
> stack by the callee without waiting even for the memory cache. The
> ARM architecture has more registers and has historically focused more
> on power savings than on raw performance, so it has less optimization
> on stack handling and benefits more from a smarter compiler.
>
> In my experience testing compiler optimizations can be frustrating on
> x86 because the hardware is just so good. Almost every other
> processor architecture shows bigger benefits from compiler
> optimizations.
>
> Ian
Re: [go-nuts] Register-based ABI benchmarks
On Thu, Feb 3, 2022 at 7:21 AM Didier Spezia wrote:
>
> It seems Aarch64 benefits more from the register-based ABI than x86_64.
> I don't see really why. Does anyone have a clue?

My view is that the x86 architecture has fewer registers and has had a massive decades-long investment in performance, so stack operations are highly optimized in hardware, including things like forwarding values stored in the stack by the caller to the retrieval from the stack by the callee without waiting even for the memory cache. The ARM architecture has more registers and has historically focused more on power savings than on raw performance, so it has less optimization on stack handling and benefits more from a smarter compiler.

In my experience, testing compiler optimizations can be frustrating on x86 because the hardware is just so good. Almost every other processor architecture shows bigger benefits from compiler optimizations.

Ian
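One way to observe the effect described above is a call-heavy micro-benchmark built with different Go releases on each architecture. The following is only a sketch: sum8 is a hypothetical kernel, not part of any benchmark mentioned in this thread, and a loop this small says little about real workloads. It uses testing.Benchmark so that it runs as an ordinary program.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the result live so the loop is not optimized away.
var sink int

// sum8 is a hypothetical kernel with eight integer arguments. The
// go:noinline directive keeps the call, and therefore the argument
// passing, inside the measured loop.
//
//go:noinline
func sum8(a, b, c, d, e, f, g, h int) int {
	return a + b + c + d + e + f + g + h
}

// benchCall measures repeated calls to sum8.
func benchCall(b *testing.B) {
	acc := 0
	for i := 0; i < b.N; i++ {
		acc += sum8(i, 1, 2, 3, 4, 5, 6, 7)
	}
	sink = acc
}

func main() {
	res := testing.Benchmark(benchCall)
	if res.N == 0 {
		fmt.Println("benchmark did not run")
		return
	}
	nsPerOp := float64(res.T.Nanoseconds()) / float64(res.N)
	fmt.Printf("sum8 call: %.2f ns/op over %d iterations\n", nsPerOp, res.N)
}
```

Building the same file with a stack-based and a register-based Go release on the same machine, then comparing the ns/op figures, gives a rough per-architecture view of call overhead.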
Re: [go-nuts] Register-based ABI benchmarks
Usually Arm CPUs have a lot more registers to pass values in.

On Feb 3, 2022, at 9:21 AM, Didier Spezia wrote:
> We are using our own benchmark to evaluate the performance of different CPU
> models of cloud providers.
> https://github.com/AmadeusITGroup/cpubench1A
>
> One point we have realized is the results of such benchmark can be biased
> depending on the version of the Go compiler.
>
> For instance, the register-based ABI has a measurable positive impact on
> performance, but it does not come with the same version of Go depending on
> the CPU architecture. When we run different versions of Go against the same
> code base for recent Intel and ARM CPUs, we get:
> https://github.com/AmadeusITGroup/cpubench1A/issues/8
>
> It is about +10% throughput for x86_64 (from go 1.16.13 -> 1.17.6) and +17%
> for Aarch64 (from go 1.17.6 -> 1.18beta1). Yay!
>
> It seems Aarch64 benefits more from the register-based ABI than x86_64.
> I don't see really why. Does anyone have a clue?
> Thanks.
>
> Best regards,
> Didier.
[go-nuts] Register-based ABI benchmarks
We are using our own benchmark to evaluate the performance of different CPU models of cloud providers.
https://github.com/AmadeusITGroup/cpubench1A

One thing we have realized is that the results of such a benchmark can be biased depending on the version of the Go compiler.

For instance, the register-based ABI has a measurable positive impact on performance, but it does not arrive in the same version of Go for every CPU architecture. When we run different versions of Go against the same code base for recent Intel and ARM CPUs, we get:
https://github.com/AmadeusITGroup/cpubench1A/issues/8

It is about +10% throughput for x86_64 (from go 1.16.13 -> 1.17.6) and +17% for Aarch64 (from go 1.17.6 -> 1.18beta1). Yay!

It seems Aarch64 benefits more from the register-based ABI than x86_64. I don't really see why. Does anyone have a clue?
Thanks.

Best regards,
Didier.
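Since the compiler version can bias the numbers, one option is to record the toolchain alongside every result. Here is a minimal sketch, assuming only what is stated above: the register-based ABI arrives with Go 1.17 on x86_64 and with Go 1.18 on Aarch64. The helpers minorVersion and registerABI are hypothetical, other architectures are reported as unknown, and per-OS differences are ignored.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// minorVersion extracts the minor number from a release string such as
// "go1.17.6" or "go1.18beta1". It returns -1 for strings it cannot parse,
// for example development builds.
func minorVersion(v string) int {
	if !strings.HasPrefix(v, "go1.") {
		return -1
	}
	n, digits := 0, 0
	for _, r := range v[len("go1."):] {
		if r < '0' || r > '9' {
			break
		}
		n = n*10 + int(r-'0')
		digits++
	}
	if digits == 0 {
		return -1
	}
	return n
}

// registerABI reports whether the register-based ABI is expected for the
// given release and architecture. known is false when this sketch has no
// information (unparsed version or an architecture not covered here).
func registerABI(goVersion, arch string) (enabled, known bool) {
	minor := minorVersion(goVersion)
	if minor < 0 {
		return false, false
	}
	switch arch {
	case "amd64":
		return minor >= 17, true
	case "arm64":
		return minor >= 18, true
	}
	return false, false
}

func main() {
	enabled, known := registerABI(runtime.Version(), runtime.GOARCH)
	fmt.Printf("toolchain=%s os=%s arch=%s\n",
		runtime.Version(), runtime.GOOS, runtime.GOARCH)
	if known {
		fmt.Printf("register-based ABI expected: %v\n", enabled)
	} else {
		fmt.Println("register-based ABI expected: unknown")
	}
}
```

Printing this line at the top of each benchmark report makes it easier to tell a CPU difference from a toolchain difference when comparing runs.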