Hi,

Mathieu Othacehe <othac...@gnu.org> skribis:

> I hoped that it would decrease the load of the Cuirass publish server
> that was flooded by worker connections. While the situation is better,
> there are still a lot of substitute timeout errors on Cuirass.
>
> I have also observed those timeouts using the publish server behind
> ci.guix.gnu.org. This makes me think that there's something else that is
> preventing the server from honoring every connection request in less
> than 5 seconds (the timeout duration).

I don’t see this when substituting from https://ci.guix.gnu.org these
days.

You mentioned on IRC that nginx logs show that ‘guix publish’ times out.
Looking at /var/log/nginx/error.log, I see “Connection reset by peer”
and “Broken pipe”, which could indicate that the client closed the
connection (which was open) prematurely, maybe due to an internal
timeout.

What I’d like to know is whether those timeouts you mention appear
during connection establishment (connect(2) on the client side doesn’t
complete within 5 seconds) or after connection establishment.

Connection establishment is handled by the kernel and ‘guix publish’ is
not involved. However, the listen(2) call made in (web server http)
sets a maximum backlog of 128 connections; if at some point 128
connections are already queued, then the 129th client will have
connect(2) take some time. Seems unlikely, but who knows.

Ludo’.
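
To tell the two cases apart from the client side, something along these
lines might help. It is only a rough sketch in Python, not what Cuirass
or ‘guix substitute’ actually do: the ‘probe’ function, its parameters
and the returned keys are made up for illustration. It times connect(2)
separately from the wait for the first byte of the reply, using the same
5-second limit by default.

```python
import socket
import time


def probe(host, port, request=b"", timeout=5.0):
    """Hypothetical helper: report the phase in which a request stalls.

    Returns a dict whose "phase" is "connect", "response", "closed" or "ok".
    """
    start = time.monotonic()
    try:
        # Phase 1: TCP handshake. The server's kernel completes it on its
        # own as long as the listen(2) backlog has room.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:  # timeouts and refusals are both OSError subclasses
        return {"phase": "connect",
                "connect_s": time.monotonic() - start,
                "response_s": None}
    connect_s = time.monotonic() - start

    start = time.monotonic()
    try:
        with sock:  # the socket keeps the timeout set by create_connection
            if request:
                sock.sendall(request)
            # Phase 2: this needs the server process itself to accept(2)
            # the connection and write something back.
            first = sock.recv(1)
    except OSError:
        return {"phase": "response",
                "connect_s": connect_s,
                "response_s": time.monotonic() - start}

    # An empty read means the peer closed without sending anything.
    return {"phase": "ok" if first else "closed",
            "connect_s": connect_s,
            "response_s": time.monotonic() - start}
```

Run against the plain-HTTP port of the publish server with a small
request, a result of "connect" would point at the backlog or the
network, while "response" would point at the server process being slow
to accept or answer.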