We're experiencing a problem with our program which serves HTTP requests.

Its clients have TCP connection timeouts set to 1 second, and under a
certain pattern of heavy load the server fails to complete some
net.netFD.Accept() calls in time, so a fraction of the clients get I/O
timeouts when attempting to connect.

Thanks to the Go runtime tracing facility, we were able to pinpoint that
those I/O timeouts happen because the goroutine which does the accept()
call is unblocked by the netpoller but then waits in a run queue for too
long before it actually gets a chance to run (in our case, sometimes for
over two seconds).  We hypothesize that this happens due to heavy load on
the goroutine scheduler.
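
For reference, a minimal sketch of how such a trace can be captured with
the runtime/trace package.  The helper name captureTrace, the toy
workload and the output file name trace.out are made up for
illustration; our real program is instrumented differently.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"runtime/trace"
)

// captureTrace (hypothetical helper) runs workload with the execution
// tracer enabled and returns the raw trace data.
func captureTrace(workload func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := trace.Start(&buf); err != nil {
		return nil, err
	}
	workload()
	// Stop returns only after all trace data has been written to buf.
	trace.Stop()
	return buf.Bytes(), nil
}

func main() {
	data, err := captureTrace(func() {
		// Toy workload: start one goroutine and wait for it, so the
		// trace contains at least one scheduling event.
		done := make(chan struct{})
		go func() { close(done) }()
		<-done
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, "trace:", err)
		os.Exit(1)
	}
	if err := os.WriteFile("trace.out", data, 0o644); err != nil {
		fmt.Fprintln(os.Stderr, "write:", err)
		os.Exit(1)
	}
	fmt.Printf("wrote %d bytes of trace data\n", len(data))
}
```

The resulting file can then be inspected with "go tool trace trace.out".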

Still, we have to invent a temporary workaround until the real cause of
the degraded performance is understood and fixed.  We were able to come
up with two workarounds:

- Run multiple instances of net/http.Server in several goroutines.

  This appears to alleviate the problem.
  We hypothesize that doing this can be viewed as raising the relative
  number of quanta the scheduler will grant to goroutines serving the
  accept() call.

- Lock the goroutine which runs the net/http.Server instance to its
  underlying OS thread (and maybe make a syscall to raise that thread's
  priority above normal).

  We have not tested this approach yet, but it appears to be better than
  the first, as it should reduce possible contention between multiple
  instances of the net/http.Server type and make the whole setup
  simpler.

I'd like to solicit insight on whether the latter approach is workable.

I sadly lack a full understanding of how the running of goroutines by
the runtime scheduler actually interacts with the scheduling of threads
done by the OS.  For instance, would the described setup do anything to
help the goroutine doing accept() get more execution quanta?

I'm in doubt because of the interaction between the netpoller and the
goroutines.  Say a goroutine has locked itself to its underlying OS
thread, and that thread has had its OS-level priority raised.  Now
suppose that goroutine does the accept() syscall; the socket backlog is
empty, so the syscall would block, and so it goes to the netpoller and
the goroutine gets parked.  Since it has its thread locked, my
understanding is that the thread becomes essentially dormant (and the
runtime is free to spawn another one to fulfill GOMAXPROCS).

Now what happens in these two cases:

- A client initiates the connection and the netpoller unblocks the
  goroutine doing accept.

  As I understand it, the runtime will merely figure out that this
  particular goroutine must run on its dedicated thread, so it will make
  its P execute it on that thread (hope I got the terminology right).

  Will the fact that the goroutine is bound to its own thread help it
  get executed with priority compared to other goroutines (which
  essentially contend for the GOMAXPROCS other threads)?

- What happens to the thread to which a goroutine is locked while that
  goroutine remains parked?  Is this thread somehow suspended, so that
  the OS never wakes it until the goroutine gets unblocked?
  Or is the OS free to schedule it, and if so, what runs on it when it
  gets its execution quantum?

Any help or pointers would be greatly appreciated!
