I've come across a situation whereby file transfers consistently fail from
an httpd server. On the one hand it's a bit of an edge case, but on the
other it definitely seems to be incorrect behaviour. I'm sure this must
have been discussed before, but I couldn't find anything much on the list
with an admittedly fairly brief search.

Essentially, mpm/event/event.c waits in the lingering-close state for
MAX_SECS_TO_LINGER ( which is defined as 30 ) before forcibly closing the
connection - if there's still unacknowledged write data in the kernel's
socket buffer at that point, the connection fails. Luckily, the conditions
under which this can happen are fairly limited - essentially it amounts to
the receiver not being able to accept data quickly enough. On Linux at
least, the default write buffer for a socket seems to be 212992 bytes
( that's /proc/sys/net/core/wmem_(default|max); the actual usable value
will be less - the manpage suggests half, though my experiments don't bear
that out ). For that to drain in 30 seconds, the transfer speed needs to be
at least 57 kbit/s. Whilst that's pretty slow, remember that there could be
many simultaneous connections, so the aggregate bandwidth at which problems
start to appear could be considerably larger. Of course, the file(s) being
transferred also need to be big enough to fill that buffer - with smaller
files the link has to be even slower before the issue shows up.
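To put numbers on that for a given system, a quick standalone check along
these lines ( just a sketch, nothing httpd-specific ) reports the send
buffer a freshly created TCP socket actually gets, plus the minimum rate
needed to drain a full buffer within the 30-second window:

/* Report the send buffer size of a fresh TCP socket and the minimum
 * rate needed to drain a full buffer within MAX_SECS_TO_LINGER (30s).
 * Standalone sketch for Linux; build with e.g. "gcc -o sndbuf sndbuf.c". */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int sndbuf = 0;
    socklen_t len = sizeof(sndbuf);

    if (fd < 0 || getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, &len) < 0) {
        perror("socket/getsockopt");
        return 1;
    }

    printf("SO_SNDBUF: %d bytes\n", sndbuf);
    printf("minimum drain rate for a full buffer: %ld bit/s\n",
           (long)sndbuf * 8 / 30);

    close(fd);
    return 0;
}

( Note that getsockopt() reports the nominal buffer size; as above, the
usable payload capacity is somewhat less. )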

As a simple test case for all this, I set up a web server, a client
machine, and two routers in between to act as a WAN emulator. On each of
the ( Linux ) routers I did:

tc qdisc add dev eth2 root netem limit 100000 rate 1000kbit

( eth2 is obviously the "WAN" interface. ) Issuing 20 simultaneous "wget"
commands from the client machine to fetch a 1 MB file, with no retries,
resulted in 14 of them failing. It actually starts to struggle at 8
simultaneous connections and above - this is with a fairly default
compilation of httpd from source.

On Linux at least, you can see how much unsent data remains by querying the
SIOCOUTQ ioctl, so the mitigation would be to check whether ANY data is
draining at all and, if it is ( and some still remains ), extend the
lingering-close time and check again ( rough sketch of the idea below ).
This wouldn't be a cross-platform solution, but it would at least be the
"correct" thing to do in terms of network behaviour. I'm not sure whether
there's an equivalent on other systems.
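Something along these lines is what I have in mind - the function name,
the calling convention and the "is it still shrinking" test are just for
illustration, not a patch against event.c:

/* Rough sketch of the kind of check I mean. Returns 1 if it's worth
 * extending the linger time: there is still data queued in the kernel,
 * but less than at the previous look ( i.e. it is draining ). The caller
 * initialises *last_outq to INT_MAX before the first call. */
#include <sys/ioctl.h>
#include <linux/sockios.h>      /* SIOCOUTQ - Linux-specific */

static int linger_worth_extending(int fd, int *last_outq)
{
    int outq = 0;

    if (ioctl(fd, SIOCOUTQ, &outq) < 0)
        return 0;               /* can't tell - fall back to the fixed timeout */

    if (outq == 0)
        return 0;               /* nothing left unacknowledged, close now */

    if (outq < *last_outq) {
        *last_outq = outq;      /* still draining - give it longer */
        return 1;
    }

    return 0;                   /* queue isn't shrinking - give up */
}

The lingering-close loop would then consult something like this each time
its short poll interval expires, and keep waiting while it returns 1,
rather than giving up flat at MAX_SECS_TO_LINGER.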

Adam
