I am assuming that there is an internal Go structure/process that, when there are fewer than N goroutines, fits in the L1 CPU cache, and beyond a certain point spills to L2 or higher - thus the nearly order-of-magnitude performance decrease, yet consistent times within each range.
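
A rough back-of-envelope (my assumptions, not measurements): Go's default initial goroutine stack is 2 KB, so 3400 goroutines x 2 KB ≈ 6.8 MB of stack alone - on the order of a typical 8 MB shared L3 in an 8-thread desktop CPU, and that is before counting the runtime's per-goroutine g structs and scheduler queues. A cache-capacity cliff near that goroutine count would at least be consistent with the numbers.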

You are seeing this because the worker code is so trivial. Most worker code is not, so the overhead of the locking/scheduler constructs has far less effect (or the worker is causing L1 evictions anyway - so you never see the optimum performance the scheduler could deliver).
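
To make "non-trivial" concrete, here is a minimal sketch (illustrative code of mine, not the original benchmark): pad each locked write with some unlocked busy work, then check whether the cliff at ~3400 goroutines flattens out.

package main

import "sync"

var (
    mu     sync.Mutex
    shared uint64
)

// worker pads each locked write with unlocked busy work, so the
// lock/scheduler overhead becomes a smaller fraction of total time.
func worker(wg *sync.WaitGroup, iters int) {
    defer wg.Done()
    var sum uint64
    for i := 0; i < iters; i++ {
        for j := uint64(0); j < 100; j++ { // simulated "real" work
            sum += j * j
        }
        mu.Lock()
        shared += sum
        mu.Unlock()
    }
}

func main() {
    var wg sync.WaitGroup
    for g := 0; g < 3400; g++ {
        wg.Add(1)
        go worker(&wg, 1000)
    }
    wg.Wait()
}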

-----Original Message-----
From: changkun
Sent: Aug 20, 2019 3:33 AM
To: golang-nuts
Subject: Re: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

Hi Robert,

Thanks for your explanation. But how could I "log the number of operations done per goroutine"? Which particular debug settings are you referring to?
It is reasonable that sync.Mutex relies on the runtime scheduler while channels do not. However, it is unclear why such a significant performance drop appears. Is it possible to predict when the drop will happen?

Best,
Changkun

On Monday, August 19, 2019 at 10:27:19 PM UTC+2, Robert Engels wrote:
I think you'll find the reason is that the Mutex uses the Go scheduler. The chan is controlled by a 'mutex' which eventually defers to the OS futex - and the OS futex is probably more efficient at scheduling in the face of large contention, although you would think it should be the other way around.

I am guessing that if you logged the number of operations done per Go routine, you will see that the Mutex version is very fair, and the chan/futex version is unfair - meaning many are starved.
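
Something like the following would do - a sketch with hypothetical names, not taken from your benchmark: give each goroutine its own counter and compare the spread once all the work is done.

package main

import (
    "fmt"
    "sync"
)

func main() {
    const (
        goroutines = 3400
        totalOps   = 1000000 // total lock acquisitions across all goroutines
    )
    var (
        mu     sync.Mutex
        done   int
        counts = make([]int, goroutines) // ops completed per goroutine
        wg     sync.WaitGroup
    )
    for g := 0; g < goroutines; g++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            for {
                mu.Lock()
                if done >= totalOps {
                    mu.Unlock()
                    return
                }
                done++
                counts[id]++ // guarded by mu; each goroutine touches only its own slot
                mu.Unlock()
            }
        }(g)
    }
    wg.Wait()
    min, max := counts[0], counts[0]
    for _, c := range counts[1:] {
        if c < min {
            min = c
        }
        if c > max {
            max = c
        }
    }
    fmt.Printf("ops/goroutine: min=%d max=%d (perfectly fair ≈ %d)\n",
        min, max, totalOps/goroutines)
}

A wide min/max spread in the chan version but a narrow one in the Mutex version would confirm the fairness difference.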

-----Original Message-----
From: changkun
Sent: Aug 19, 2019 12:50 PM
To: golang-nuts
Subject: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

I am comparing the performance regarding sync.Mutex and Go channels. Here is my benchmark: https://play.golang.org/p/zLjVtsSx9gd
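
Roughly, the benchmark has the following shape (a simplified sketch - the playground code differs in details):

package bench_test

import (
    "runtime"
    "sync"
    "testing"
)

const goroutines = 3400 // swept from 2400 to 4720 in the runs below

func BenchmarkMutexWrite(b *testing.B) {
    var mu sync.Mutex
    var v uint64
    b.SetParallelism(goroutines / runtime.GOMAXPROCS(0)) // ~goroutines workers in total
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            mu.Lock()
            v++
            mu.Unlock()
        }
    })
}

func BenchmarkChanWrite(b *testing.B) {
    ch := make(chan struct{}, 1) // 1-buffered channel used as a lock
    var v uint64
    b.SetParallelism(goroutines / runtime.GOMAXPROCS(0))
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            ch <- struct{}{} // acquire
            v++
            <-ch // release
        }
    })
}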

The performance comparison visualization is as follows:

[Attached image: "sync.Mutex performance (1).png" - plot of ns/op against goroutine count for the sync.Mutex and channel writes]

What are the reasons that 

1. sync.Mutex encounters a large performance drop when the number of goroutines goes higher than roughly 3400?
2. Go channels are pretty stable, but slower than sync.Mutex before the drop?



Raw bench data by benchstat (go test -bench=. -count=5):

MutexWrite/goroutines-2400-8  48.6ns ± 1%
MutexWrite/goroutines-2480-8  49.1ns ± 0%
MutexWrite/goroutines-2560-8  49.7ns ± 1%
MutexWrite/goroutines-2640-8  50.5ns ± 3%
MutexWrite/goroutines-2720-8  50.9ns ± 2%
MutexWrite/goroutines-2800-8  51.8ns ± 3%
MutexWrite/goroutines-2880-8  52.5ns ± 2%
MutexWrite/goroutines-2960-8  54.1ns ± 4%
MutexWrite/goroutines-3040-8  54.5ns ± 2%
MutexWrite/goroutines-3120-8  56.1ns ± 3%
MutexWrite/goroutines-3200-8  63.2ns ± 5%
MutexWrite/goroutines-3280-8  77.5ns ± 6%
MutexWrite/goroutines-3360-8   141ns ± 6%
MutexWrite/goroutines-3440-8   239ns ± 8%
MutexWrite/goroutines-3520-8   248ns ± 3%
MutexWrite/goroutines-3600-8   254ns ± 2%
MutexWrite/goroutines-3680-8   256ns ± 1%
MutexWrite/goroutines-3760-8   261ns ± 2%
MutexWrite/goroutines-3840-8   266ns ± 3%
MutexWrite/goroutines-3920-8   276ns ± 3%
MutexWrite/goroutines-4000-8   278ns ± 3%
MutexWrite/goroutines-4080-8   286ns ± 5%
MutexWrite/goroutines-4160-8   293ns ± 4%
MutexWrite/goroutines-4240-8   295ns ± 2%
MutexWrite/goroutines-4320-8   280ns ± 8%
MutexWrite/goroutines-4400-8   294ns ± 9%
MutexWrite/goroutines-4480-8   285ns ±10%
MutexWrite/goroutines-4560-8   290ns ± 8%
MutexWrite/goroutines-4640-8   271ns ± 3%
MutexWrite/goroutines-4720-8   271ns ± 4%

ChanWrite/goroutines-2400-8  158ns ± 3%
ChanWrite/goroutines-2480-8  159ns ± 2%
ChanWrite/goroutines-2560-8  161ns ± 2%
ChanWrite/goroutines-2640-8  161ns ± 1%
ChanWrite/goroutines-2720-8  163ns ± 1%
ChanWrite/goroutines-2800-8  166ns ± 3%
ChanWrite/goroutines-2880-8  168ns ± 1%
ChanWrite/goroutines-2960-8  176ns ± 4%
ChanWrite/goroutines-3040-8  176ns ± 2%
ChanWrite/goroutines-3120-8  180ns ± 1%
ChanWrite/goroutines-3200-8  180ns ± 1%
ChanWrite/goroutines-3280-8  181ns ± 2%
ChanWrite/goroutines-3360-8  183ns ± 2%
ChanWrite/goroutines-3440-8  188ns ± 3%
ChanWrite/goroutines-3520-8  190ns ± 2%
ChanWrite/goroutines-3600-8  193ns ± 2%
ChanWrite/goroutines-3680-8  196ns ± 3%
ChanWrite/goroutines-3760-8  199ns ± 2%
ChanWrite/goroutines-3840-8  206ns ± 2%
ChanWrite/goroutines-3920-8  209ns ± 2%
ChanWrite/goroutines-4000-8  206ns ± 2%
ChanWrite/goroutines-4080-8  209ns ± 2%
ChanWrite/goroutines-4160-8  208ns ± 2%
ChanWrite/goroutines-4240-8  209ns ± 3%
ChanWrite/goroutines-4320-8  213ns ± 2%
ChanWrite/goroutines-4400-8  209ns ± 2%
ChanWrite/goroutines-4480-8  211ns ± 1%
ChanWrite/goroutines-4560-8  213ns ± 2%
ChanWrite/goroutines-4640-8  215ns ± 1%
ChanWrite/goroutines-4720-8  218ns ± 3%

