On Jan 13, 2021, at 7:21 PM, Peter Wilson <peter.wil...@bsc.es> wrote:

> So, after a long ramble, given that I am happy to waste CPU time in busy
> waits (rather than have the overhead of scheduling blocked goroutines), what
> is the recommendation for the signalling mechanism when all is done in go and
> everything's a goroutine, not a thread?
This is similar to something I'm working on for logic simulation, and I'd been thinking about clocked simulation as well. I'll be interested in your results. Since I'm also considering remote computation (and GPU computation, which might as well be remote), I'm currently going with futures driven by either channels or sync.Cond. That may not be as efficient for your use case.

> My guess is that creating specialist blocking 'barriers' using sync/atomic
> (atomic.Operation seems to be around 4nsec on my Mac Mini) is the highest
> performance mechanism. There's a dearth of performance information on channel
> communication, waitgroup, mutex etc use, but those I have seen seem to
> suggest that sending/receiving on a channel might be over the order of
> 100nsec; since in C we iterate twice through the list in 30-40nsec, this is a
> tad high (yes, fixable by modeling a bigger system, but)

My advice would be to implement the easiest method that's not likely to box you in, profile it, and see where your bottlenecks are. In my case, so far, the delays introduced by IPC mechanisms (and by allocations) are dwarfed by the "business logic" crunching the logical primitives. Shaving nanoseconds off the IPC isn't worth it yet (it would be a nice problem to have), because the work done in each "chunk" is large compared with the cost of handing it over.

That leads to the next point: if you have lots of little operations and you're worried about the time spent on IPC for each one, you'll probably get the easiest and biggest performance gains by batching them, so that you burn through lots of similar operations at once before sending a slice over a channel or something similar.

As always, do a POC implementation and then profile it. That's the only productive way to optimize things at this scale, and Go has EXCELLENT profiling capabilities built in.
- Dave