We're experiencing a problem with our program, which serves HTTP requests. Its clients have TCP connection timeouts set to 1 second, and under a certain pattern of heavy load the server fails to perform some net.netFD.Accept() calls in time, so a fraction of clients get I/O timeouts when attempting to connect.
Thanks to the Go runtime tracing facility, we were able to pinpoint the cause of those I/O timeouts: the goroutine which does the accept() call gets unblocked by the netpoller but then waits too long in a run queue before it actually gets a chance to run (in our case -- sometimes for over two seconds).

We hypothesize that this happens due to heavy load on the goroutine scheduler. Still, we have to invent a temporary workaround until the real cause of the degraded performance is understood and fixed.

We were able to come up with two workarounds:

- Run multiple instances of net/http.Server in several goroutines. This appears to alleviate the problem. We hypothesize that doing this can be viewed as raising the relative number of quanta the scheduler will grant to goroutines serving the accept() call.

- Lock the goroutine which runs the net/http.Server instance to its underlying OS thread (and maybe do a syscall to raise that thread's priority above normal). We have not yet tested this approach, but it appears to be better than the first, as it should lower possible contention between multiple instances of the net/http.Server type and makes the whole setup simpler.

I'd like to solicit insight on whether the latter approach is workable. I sadly lack a full understanding of how the running of goroutines by the runtime scheduler interacts with the scheduling of threads done by the OS. For instance, would the described setup do anything to help the goroutine doing accept() get more execution quanta?

I'm in doubt because of the interaction between the netpoller and the goroutines. Say a goroutine has itself locked to its underlying OS thread, and that thread has its OS-level priority raised. Now suppose that goroutine does the accept() syscall; the socket's backlog is empty, so the syscall would block, so the socket goes to the netpoller and the goroutine gets parked.
Since it has its thread locked, my understanding is that the thread becomes essentially dormant (and the runtime is free to spawn another one to fulfill GOMAXPROCS). Now what happens in these two cases:

- A client initiates the connection and the netpoller unblocks the goroutine doing accept. As I understand it, the runtime will merely figure out that that particular goroutine must run on its dedicated thread, so it will make its P execute it on that thread (I hope I got the terminology right). Will the fact that the goroutine is bound to its own thread help it get executed with priority compared to other goroutines (which essentially contend for GOMAXPROCS other threads)?

- What happens to the thread to which a goroutine is locked while that goroutine remains parked? Is this thread somehow suspended, so that the OS never wakes it until the goroutine gets unblocked? Or is the OS free to schedule it, and if so, what runs on it when it gets its execution quantum?

Any help or pointers would be greatly appreciated!
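
P.S. To make the two workarounds described above concrete, here is a minimal sketch of what I mean. It is an illustration under assumptions, not our production code: the helper names startAcceptors and demo are made up for this example, and I assume that all net/http.Server instances share a single listener (concurrent Accept calls on one net.Listener are allowed). The lockThread flag switches on the second workaround via runtime.LockOSThread(); raising the thread's OS-level priority is OS-specific and deliberately left out.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"runtime"
	"sync"
)

// startAcceptors (hypothetical helper) starts n goroutines, each running its
// own http.Server on the same listener, so n goroutines sit in Accept
// (workaround 1). The returned WaitGroup is done once every Serve has returned.
func startAcceptors(ln net.Listener, h http.Handler, n int, lockThread bool) *sync.WaitGroup {
	wg := new(sync.WaitGroup)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if lockThread {
				// Workaround 2: wire this goroutine to its OS thread.
				// It is never unlocked, so the thread stays dedicated to it.
				runtime.LockOSThread()
			}
			srv := &http.Server{Handler: h}
			_ = srv.Serve(ln) // returns once ln is closed
		}()
	}
	return wg
}

// demo (hypothetical helper) starts the acceptors on a loopback port,
// issues one request and returns the response body.
func demo(n int, lockThread bool) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	wg := startAcceptors(ln, h, n, lockThread)

	resp, err := http.Get("http://" + ln.Addr().String() + "/")
	if err != nil {
		ln.Close()
		wg.Wait()
		return "", err
	}
	body, err := io.ReadAll(resp.Body)
	resp.Body.Close()

	ln.Close() // makes every Serve call return
	wg.Wait()
	return string(body), err
}

func main() {
	for _, lock := range []bool{false, true} {
		body, err := demo(4, lock)
		fmt.Println(lock, body, err)
	}
}
```

Note that this only shows how the setups are wired together. It says nothing about whether locking the accepting goroutine to a thread actually gets it scheduled sooner, which is exactly what I am asking about.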