As I expected, the test is not testing what you think it is. Many of the Go 
routines created do not perform the same number of iterations. The benchmark 
harness is only designed to perform enough iterations to get a time per op 
while “running N routines”.

You need to rework the test so that all Go routines run the same number of 
iterations - and time the entire process - then you can see how the 
concurrency/scheduling latency affects things (or the time of the operation 
when fully contended). The time per op is then total time / (iterations * 
nRoutines).
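
For example, a minimal sketch of that reworked measurement (the goroutine and 
iteration counts and the shared counter here are only illustrative, not your 
original benchmark) might look like:

	package main

	import (
		"fmt"
		"sync"
		"time"
	)

	func main() {
		const (
			nRoutines  = 3200  // number of contending Go routines
			iterations = 10000 // identical amount of work per routine
		)

		var mu sync.Mutex
		var counter int64
		var wg sync.WaitGroup

		start := time.Now()
		for g := 0; g < nRoutines; g++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				for i := 0; i < iterations; i++ {
					mu.Lock()
					counter++
					mu.Unlock()
				}
			}()
		}
		wg.Wait()
		elapsed := time.Since(start)

		// time per op = total time / (iterations * nRoutines)
		fmt.Println(elapsed / time.Duration(iterations*nRoutines))
	}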

Here is the code that shows the number of iterations per routine: 
https://play.golang.org/p/LkAvB39X3_Z



> On Aug 21, 2019, at 6:31 PM, Robert Engels <reng...@ix.netcom.com> wrote:
> 
> I don't think you've posted code for the atomic version...
> 
> Each Go routine has its own stack. So when you cycle through many Go routines 
> you will be destroying the cache as each touches N memory addresses (that are 
> obviously not shared).
> 
> That's my guess anyway - the performance profile certainly looks like a cache 
> issue to me. Once the cache is exhausted - the kernel based scheduler is more 
> efficient - so it does suggest to me that there are some optimizations that 
> can be done in the Go scheduler.
> 
> I will look at a few things this evening.
> 
> -----Original Message----- 
> From: changkun 
> Sent: Aug 21, 2019 4:51 PM 
> To: golang-nuts 
> Subject: Re: [go-nuts] sync.Mutex encounter large performance drop when 
> goroutine contention more than 3400 
> 
> "less than N Go routines it fits in the L1 CPU cache," I am guessing that you 
> are thinking of local queues on each M, the scheduler's local queue size is 
> strict to 256 goroutines. However, in our case, all blocking goroutines don't 
> go the run queue but blocked and stored on semtable, which is a forest and 
> each tree is an unlimited balanced tree. When a lock is released, only a 
> single goroutine will be detached and put into the local queue (so scheduler 
> only schedules runq with a single goroutine without content to globalq). 
> How could an L1/L2 problem appear here? Do you think this is still some kind 
> of "limited L1 cache to store large mount of goroutines" ?
> 
> What interests me is a newly created issue; I am not sure whether this question 
> is related to https://github.com/golang/go/issues/33747
> That issue talks about small contention on a large number of Ps, whereas the 
> full range of my benchmark looks as follows:
> 
> [inline image: full-range benchmark chart, not reproduced in plain text]
> 
> On Tuesday, August 20, 2019 at 6:10:32 PM UTC+2, Robert Engels wrote:
> I am assuming that there is an internal Go structure/process that when there 
> is less than N Go routines it fits in the L1 CPU cache, and beyond a certain 
> point it spills to the L2 or higher - thus the nearly order of magnitude 
> performance decrease, yet consistent times within a range.
> 
> Since the worker code is so trivial, you are seeing this effect. Most worker 
> code is not as trivial, so the overhead of the locking/scheduler constructs has 
> far less effect (or the worker is causing L1 evictions anyway - so you never 
> see the optimum performance possible from the scheduler).
> 
> -----Original Message----- 
> From: changkun 
> Sent: Aug 20, 2019 3:33 AM 
> To: golang-nuts 
> Subject: Re: [go-nuts] sync.Mutex encounter large performance drop when 
> goroutine contention more than 3400 
> 
> Hi Robert,
> 
> Thanks for your explanation. But how can I "log the number of operations done 
> per Go routine"? Which particular debug settings are you referring to?
> It is reasonable that sync.Mutex relies on the runtime scheduler while channels 
> do not. However, it is unclear why such a significant performance drop appears. 
> Is it possible to predict at which point the drop will appear?
> 
> Best,
> Changkun
> 
> On Monday, August 19, 2019 at 10:27:19 PM UTC+2, Robert Engels wrote:
> I think you'll find the reason is that the Mutex uses the Go scheduler. The 
> chan is controlled by a 'mutex' which eventually defers to the OS futex - and 
> the OS futex is probably more efficient at scheduling in the face of large 
> contention - although you would think it should be the other way around.
> 
> I am guessing that if you logged the number of operations done per Go 
> routine, you will see that the Mutex version is very fair, and the chan/futex 
> version is unfair - meaning many are starved.
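> 
> A minimal way to check that fairness (an illustrative sketch only - the 
> goroutine count, duration, and counter names here are made up, not code from 
> this thread) would be to give each Go routine its own counter and compare the 
> counts afterwards:
> 
> 	package main
> 
> 	import (
> 		"fmt"
> 		"sync"
> 		"time"
> 	)
> 
> 	func main() {
> 		const nRoutines = 64
> 
> 		var mu sync.Mutex
> 		var shared int64
> 		counts := make([]int64, nRoutines) // one counter slot per Go routine
> 		var wg sync.WaitGroup
> 
> 		deadline := time.Now().Add(time.Second)
> 		for g := 0; g < nRoutines; g++ {
> 			wg.Add(1)
> 			go func(id int) {
> 				defer wg.Done()
> 				for time.Now().Before(deadline) {
> 					mu.Lock()
> 					shared++
> 					mu.Unlock()
> 					counts[id]++ // written only by this Go routine
> 				}
> 			}(g)
> 		}
> 		wg.Wait()
> 
> 		// A fair lock gives a narrow min/max spread; starvation shows up
> 		// as a very wide one.
> 		min, max := counts[0], counts[0]
> 		for _, c := range counts {
> 			if c < min {
> 				min = c
> 			}
> 			if c > max {
> 				max = c
> 			}
> 		}
> 		fmt.Printf("ops per routine: min=%d max=%d (shared=%d)\n", min, max, shared)
> 	}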
> 
> -----Original Message----- 
> From: changkun 
> Sent: Aug 19, 2019 12:50 PM 
> To: golang-nuts 
> Subject: [go-nuts] sync.Mutex encounter large performance drop when goroutine 
> contention more than 3400 
> 
> I am comparing the performance of sync.Mutex and Go channels. Here is my 
> benchmark: https://play.golang.org/p/zLjVtsSx9gd
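> 
> (For readers without the link: the benchmark is roughly of the following 
> shape. This is only a sketch of the idea - the exact code, including the 
> channel variant, is in the playground link above.)
> 
> 	package bench
> 
> 	import (
> 		"fmt"
> 		"runtime"
> 		"sync"
> 		"testing"
> 	)
> 
> 	// Each sub-benchmark runs roughly n goroutines (SetParallelism multiplies
> 	// its argument by GOMAXPROCS), all contending on a single mutex.
> 	func BenchmarkMutexWrite(b *testing.B) {
> 		for n := 2400; n <= 4720; n += 80 {
> 			n := n
> 			b.Run(fmt.Sprintf("goroutines-%d", n), func(b *testing.B) {
> 				var mu sync.Mutex
> 				var v int64
> 				b.SetParallelism(n / runtime.GOMAXPROCS(0))
> 				b.RunParallel(func(pb *testing.PB) {
> 					for pb.Next() {
> 						mu.Lock()
> 						v++
> 						mu.Unlock()
> 					}
> 				})
> 			})
> 		}
> 	}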
> 
> The performance comparison visualization is as follows:
> 
> [inline image: Mutex vs channel performance comparison chart, not reproduced 
> in plain text]
> 
> What are the reasons that:
> 
> 1. sync.Mutex encounters a large performance drop when the number of 
> goroutines goes higher than roughly 3400?
> 2. Go channels are pretty stable, but slower than sync.Mutex before that point?
> 
> 
> 
> Raw bench data by benchstat (go test -bench=. -count=5):
> 
> MutexWrite/goroutines-2400-8  48.6ns ± 1%
> MutexWrite/goroutines-2480-8  49.1ns ± 0%
> MutexWrite/goroutines-2560-8  49.7ns ± 1%
> MutexWrite/goroutines-2640-8  50.5ns ± 3%
> MutexWrite/goroutines-2720-8  50.9ns ± 2%
> MutexWrite/goroutines-2800-8  51.8ns ± 3%
> MutexWrite/goroutines-2880-8  52.5ns ± 2%
> MutexWrite/goroutines-2960-8  54.1ns ± 4%
> MutexWrite/goroutines-3040-8  54.5ns ± 2%
> MutexWrite/goroutines-3120-8  56.1ns ± 3%
> MutexWrite/goroutines-3200-8  63.2ns ± 5%
> MutexWrite/goroutines-3280-8  77.5ns ± 6%
> MutexWrite/goroutines-3360-8   141ns ± 6%
> MutexWrite/goroutines-3440-8   239ns ± 8%
> MutexWrite/goroutines-3520-8   248ns ± 3%
> MutexWrite/goroutines-3600-8   254ns ± 2%
> MutexWrite/goroutines-3680-8   256ns ± 1%
> MutexWrite/goroutines-3760-8   261ns ± 2%
> MutexWrite/goroutines-3840-8   266ns ± 3%
> MutexWrite/goroutines-3920-8   276ns ± 3%
> MutexWrite/goroutines-4000-8   278ns ± 3%
> MutexWrite/goroutines-4080-8   286ns ± 5%
> MutexWrite/goroutines-4160-8   293ns ± 4%
> MutexWrite/goroutines-4240-8   295ns ± 2%
> MutexWrite/goroutines-4320-8   280ns ± 8%
> MutexWrite/goroutines-4400-8   294ns ± 9%
> MutexWrite/goroutines-4480-8   285ns ±10%
> MutexWrite/goroutines-4560-8   290ns ± 8%
> MutexWrite/goroutines-4640-8   271ns ± 3%
> MutexWrite/goroutines-4720-8   271ns ± 4%
> 
> ChanWrite/goroutines-2400-8  158ns ± 3%
> ChanWrite/goroutines-2480-8  159ns ± 2%
> ChanWrite/goroutines-2560-8  161ns ± 2%
> ChanWrite/goroutines-2640-8  161ns ± 1%
> ChanWrite/goroutines-2720-8  163ns ± 1%
> ChanWrite/goroutines-2800-8  166ns ± 3%
> ChanWrite/goroutines-2880-8  168ns ± 1%
> ChanWrite/goroutines-2960-8  176ns ± 4%
> ChanWrite/goroutines-3040-8  176ns ± 2%
> ChanWrite/goroutines-3120-8  180ns ± 1%
> ChanWrite/goroutines-3200-8  180ns ± 1%
> ChanWrite/goroutines-3280-8  181ns ± 2%
> ChanWrite/goroutines-3360-8  183ns ± 2%
> ChanWrite/goroutines-3440-8  188ns ± 3%
> ChanWrite/goroutines-3520-8  190ns ± 2%
> ChanWrite/goroutines-3600-8  193ns ± 2%
> ChanWrite/goroutines-3680-8  196ns ± 3%
> ChanWrite/goroutines-3760-8  199ns ± 2%
> ChanWrite/goroutines-3840-8  206ns ± 2%
> ChanWrite/goroutines-3920-8  209ns ± 2%
> ChanWrite/goroutines-4000-8  206ns ± 2%
> ChanWrite/goroutines-4080-8  209ns ± 2%
> ChanWrite/goroutines-4160-8  208ns ± 2%
> ChanWrite/goroutines-4240-8  209ns ± 3%
> ChanWrite/goroutines-4320-8  213ns ± 2%
> ChanWrite/goroutines-4400-8  209ns ± 2%
> ChanWrite/goroutines-4480-8  211ns ± 1%
> ChanWrite/goroutines-4560-8  213ns ± 2%
> ChanWrite/goroutines-4640-8  215ns ± 1%
> ChanWrite/goroutines-4720-8  218ns ± 3%
> 
