Re: [go-nuts] Register-based ABI benchmarks

2022-02-04 Thread Didier Spezia
Thank you - it makes sense.

The first time I read
https://github.com/golang/go/blob/master/src/cmd/compile/abi-internal.md
I thought there were plenty of registers for parameters, even for x86_64.

But strings, slices, interfaces, etc. each use multiple registers, so it
does not take many parameters before arguments have to spill to the stack.
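
To make that concrete, here is a rough sketch (my own, not taken from the
compiler) that counts integer argument registers the way abi-internal.md
describes: one for scalars and pointers, two for strings and interfaces,
three for slices, with structs flattened field by field. The limits of 9
(amd64) and 16 (arm64) integer registers come from that document. The names
intRegsFor, spilled and exampleArgs are made up for illustration, and
floating-point arguments and 32-bit targets are deliberately ignored.

```go
package main

import (
	"fmt"
	"reflect"
)

// Integer argument registers listed in abi-internal.md.
const (
	amd64IntRegs = 9
	arm64IntRegs = 16
)

// intRegsFor reports how many integer registers a value of type t needs.
// ok is false when the value cannot be register-assigned in this simplified
// model (arrays longer than one element, floats, complex numbers).
func intRegsFor(t reflect.Type) (n int, ok bool) {
	switch t.Kind() {
	case reflect.Bool,
		reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
		reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64,
		reflect.Uintptr, reflect.Ptr, reflect.Map, reflect.Chan, reflect.Func,
		reflect.UnsafePointer:
		return 1, true
	case reflect.String, reflect.Interface:
		return 2, true // (pointer, length) or (type word, data word)
	case reflect.Slice:
		return 3, true // pointer, length, capacity
	case reflect.Struct:
		for i := 0; i < t.NumField(); i++ {
			m, fieldOK := intRegsFor(t.Field(i).Type)
			if !fieldOK {
				return 0, false
			}
			n += m
		}
		return n, true
	case reflect.Array:
		switch t.Len() {
		case 0:
			return 0, true
		case 1:
			return intRegsFor(t.Elem())
		}
		return 0, false
	}
	return 0, false
}

// spilled returns the indexes of the arguments that end up on the stack when
// only avail integer registers exist. An argument that does not fit entirely
// goes to the stack as a whole; later arguments may still use registers.
func spilled(avail int, args ...reflect.Type) []int {
	var out []int
	used := 0
	for i, t := range args {
		n, ok := intRegsFor(t)
		if !ok || used+n > avail {
			out = append(out, i)
			continue
		}
		used += n
	}
	return out
}

// exampleArgs models a hypothetical signature:
// func f(name string, buf []byte, err error, key string, n int)
func exampleArgs() []reflect.Type {
	return []reflect.Type{
		reflect.TypeOf(""),
		reflect.TypeOf([]byte(nil)),
		reflect.TypeOf((*error)(nil)).Elem(),
		reflect.TypeOf(""),
		reflect.TypeOf(0),
	}
}

func main() {
	fmt.Println("amd64 spills:", spilled(amd64IntRegs, exampleArgs()...))
	fmt.Println("arm64 spills:", spilled(arm64IntRegs, exampleArgs()...))
}
```

In this model the five arguments need 2+3+2+2+1 = 10 integer registers, so
the last one no longer fits on amd64 while everything fits on arm64.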

Regards,
Didier.

On Fri, Feb 4, 2022 at 2:27 AM Robert Engels  wrote:

> +1. Sometimes compiler optimizations even make things worse when they
> change the behavior the chip typically expects.
>
> > On Feb 3, 2022, at 2:23 PM, Ian Lance Taylor  wrote:
> >
> > On Thu, Feb 3, 2022 at 7:21 AM Didier Spezia wrote:
> >>
> >> It seems AArch64 benefits more from the register-based ABI than x86_64.
> >> I don't really see why. Does anyone have a clue?
> >
> > My view is that the x86 architecture has fewer registers and has had a
> > massive decades-long investment in performance, so stack operations
> > are highly optimized in hardware, including things like forwarding
> > values stored in the stack by the caller to the retrieval from the
> > stack by the callee without waiting even for the memory cache.  The
> > ARM architecture has more registers and has historically focused more
> > on power savings than on raw performance, so it has less optimization
> > on stack handling and benefits more from a smarter compiler.
> >
> > In my experience testing compiler optimizations can be frustrating on
> > x86 because the hardware is just so good.  Almost every other
> > processor architecture shows bigger benefits from compiler
> > optimizations.
> >
> > Ian

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CAKmhntmATcj%2BtxH%2BQLsLH_W4iT5xy6t-aaSPeyzN1AV%2BMA4Vdw%40mail.gmail.com.


Re: [go-nuts] Register-based ABI benchmarks

2022-02-03 Thread Robert Engels
+1. Sometimes compiler optimizations even make things worse when they
change the behavior the chip typically expects.

> On Feb 3, 2022, at 2:23 PM, Ian Lance Taylor  wrote:
> 
> On Thu, Feb 3, 2022 at 7:21 AM Didier Spezia  wrote:
>> 
>> It seems AArch64 benefits more from the register-based ABI than x86_64.
>> I don't really see why. Does anyone have a clue?
> 
> My view is that the x86 architecture has fewer registers and has had a
> massive decades-long investment in performance, so stack operations
> are highly optimized in hardware, including things like forwarding
> values stored in the stack by the caller to the retrieval from the
> stack by the callee without waiting even for the memory cache.  The
> ARM architecture has more registers and has historically focused more
> on power savings than on raw performance, so it has less optimization
> on stack handling and benefits more from a smarter compiler.
> 
> In my experience testing compiler optimizations can be frustrating on
> x86 because the hardware is just so good.  Almost every other
> processor architecture shows bigger benefits from compiler
> optimizations.
> 
> Ian


Re: [go-nuts] Register-based ABI benchmarks

2022-02-03 Thread Ian Lance Taylor
On Thu, Feb 3, 2022 at 7:21 AM Didier Spezia  wrote:
>
> It seems AArch64 benefits more from the register-based ABI than x86_64.
> I don't really see why. Does anyone have a clue?

My view is that the x86 architecture has fewer registers and has had a
massive decades-long investment in performance, so stack operations
are highly optimized in hardware, including things like forwarding
values stored in the stack by the caller to the retrieval from the
stack by the callee without waiting even for the memory cache.  The
ARM architecture has more registers and has historically focused more
on power savings than on raw performance, so it has less optimization
on stack handling and benefits more from a smarter compiler.

In my experience testing compiler optimizations can be frustrating on
x86 because the hardware is just so good.  Almost every other
processor architecture shows bigger benefits from compiler
optimizations.
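
One way to look at this on your own hardware is a tiny call-overhead
microbenchmark, built once with a Go release that uses the stack-based ABI
for your architecture and once with a release that uses the register-based
ABI. The sketch below is only an illustration under that assumption: sum8
and callLoop are made-up names, the numbers depend heavily on the CPU, and
a loop like this measures call overhead only, not a realistic workload.

```go
package main

import (
	"fmt"
	"testing"
)

// sum8 is kept out of line so that every call really passes eight
// arguments through the calling convention of the toolchain in use.
//
//go:noinline
func sum8(a, b, c, d, e, f, g, h int) int {
	return a + b + c + d + e + f + g + h
}

// callLoop calls sum8 n times and returns the accumulated result so the
// calls cannot be discarded by the compiler.
func callLoop(n int) int {
	acc := 0
	for i := 0; i < n; i++ {
		acc += sum8(i, 1, 2, 3, 4, 5, 6, 7)
	}
	return acc
}

// sink keeps the benchmark result alive.
var sink int

func main() {
	res := testing.Benchmark(func(b *testing.B) {
		sink = callLoop(b.N)
	})
	fmt.Println("sum8 call loop:", res)
}
```

Comparing the ns/op figure between the two toolchains on x86_64 and on
AArch64 gives a rough idea of how much each CPU gains from keeping the
arguments in registers.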

Ian


Re: [go-nuts] Register-based ABI benchmarks

2022-02-03 Thread Robert Engels
ARM CPUs usually have a lot more registers to pass values in.

> On Feb 3, 2022, at 9:21 AM, Didier Spezia  wrote:
> 
> We are using our own benchmark to evaluate the performance of different CPU 
> models of cloud providers.
> https://github.com/AmadeusITGroup/cpubench1A
> 
> One thing we have realized is that the results of such a benchmark can be
> biased by the version of the Go compiler.
> 
> For instance, the register-based ABI has a measurable positive impact on
> performance, but it arrived in different Go versions depending on the CPU
> architecture. When we run different versions of Go against the same code
> base on recent Intel and ARM CPUs, we get:
> https://github.com/AmadeusITGroup/cpubench1A/issues/8
> 
> It is about +10% throughput for x86_64 (from Go 1.16.13 -> 1.17.6) and +17%
> for AArch64 (from Go 1.17.6 -> 1.18beta1). Yay!
> 
> It seems AArch64 benefits more from the register-based ABI than x86_64.
> I don't really see why. Does anyone have a clue?
> Thanks.
> 
> Best regards,
> Didier.


[go-nuts] Register-based ABI benchmarks

2022-02-03 Thread Didier Spezia
We are using our own benchmark to evaluate the performance of different CPU 
models of cloud providers.
https://github.com/AmadeusITGroup/cpubench1A

One thing we have realized is that the results of such a benchmark can be
biased by the version of the Go compiler.

For instance, the register-based ABI has a measurable positive impact on
performance, but it arrived in different Go versions depending on the CPU
architecture. When we run different versions of Go against the same code
base on recent Intel and ARM CPUs, we get:
https://github.com/AmadeusITGroup/cpubench1A/issues/8

It is about +10% throughput for x86_64 (from Go 1.16.13 -> 1.17.6) and +17%
for AArch64 (from Go 1.17.6 -> 1.18beta1). Yay!

It seems AArch64 benefits more from the register-based ABI than x86_64.
I don't really see why. Does anyone have a clue?
Thanks.

Best regards,
Didier.