Hi,

On Tue, Mar 20, 2012 at 03:30:18PM +0100, Mostowiec Dominik wrote:
> Hi,
> When I stress test haproxy and reload it with the -sf option:
> "The server is now under siege...[error] socket: unable to connect sock.c:222:
> Connection reset by peer
> [error] socket: unable to connect sock.c:222: Connection >
> [error] socket: unable to connect sock.c:222: Connection >
>  ...
>  "
> It sends many TCP RST for a while.
> 
> my sysctl option:
> net.ipv4.tcp_tw_reuse = 1
> net.ipv4.tcp_tw_recycle = 1
> net.ipv4.tcp_max_tw_buckets = 631056
> net.ipv4.tcp_max_orphans = 631056
> 
> Is this a known problem?

Yes, this is a known limitation. When you use -sf, the new process and
the old one synchronize so that the old one releases the listening FD
and the new one starts listening. The window is very short, but closing
the old FD means that pending connection requests are dropped at the
same time, causing the RSTs you observe. The higher the RTT between
your clients and your box, the more likely you are to see them. It's
hard to observe them on a local network under normal loads.
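
To make the mechanism concrete, here is a minimal sketch in Python (not
haproxy code). It assumes a Unix-like kernel; the function name is
hypothetical. A client completes its handshake and sits in the accept
queue, then the listener is closed without ever calling accept(), which
stands in for the old process releasing its listening FD.

```python
import socket


def probe_pending_connection():
    """Report what a queued client sees once the listener is closed."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(8)
    port = listener.getsockname()[1]

    # The kernel completes the handshake on its own; the connection now
    # waits in the accept queue of the listening socket.
    client = socket.create_connection(("127.0.0.1", port), timeout=2)

    # Stand-in for the old process releasing its listening FD.
    listener.close()

    try:
        data = client.recv(1)
        return "eof" if data == b"" else "data"
    except ConnectionResetError:
        return "reset"
    except socket.timeout:
        return "timeout"
    finally:
        client.close()


if __name__ == "__main__":
    print(probe_pending_connection())
```

On Linux I would expect this to report a reset, which is the same thing
the stress tool logs as "Connection reset by peer"; other systems may
behave differently.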

> Can I do something to fix this?

Krisztian Ivancso is working on FD passing between the old and the new
process, which should address most of these issues. The difficulty remains
in identifying which FDs can be reused, and possibly adjusted, when a number
of options have been set (e.g. MSS, interface binding, ...).
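
As an illustration of the general technique only (this is not the work
in progress mentioned above), a listening FD can be handed over a UNIX
socket with SCM_RIGHTS. The sketch below assumes Python 3.9+ on a
Unix-like system and simulates both "processes" inside one script; the
function name is hypothetical.

```python
import socket


def handoff_listener():
    """Pass a listening FD over a UNIX socket pair and accept on the copy."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(8)
    port = listener.getsockname()[1]

    # A client connects before the handoff and waits in the accept queue.
    client = socket.create_connection(("127.0.0.1", port), timeout=2)

    # old_end plays the old process, new_end the new one.
    old_end, new_end = socket.socketpair()

    # Send the FD as ancillary data, then drop the old copy. The kernel
    # socket stays open because the in-flight FD still references it.
    socket.send_fds(old_end, [b"L"], [listener.fileno()])
    listener.close()

    _msg, fds, _flags, _addr = socket.recv_fds(new_end, 1, 1)
    inherited = socket.socket(fileno=fds[0])
    inherited.settimeout(2)

    # The connection queued before the handoff is still there.
    conn, _peer = inherited.accept()
    conn.settimeout(2)
    client.sendall(b"ping")
    data = conn.recv(4)

    for s in (conn, client, inherited, old_end, new_end):
        s.close()
    return data


if __name__ == "__main__":
    print(handoff_listener())
```

Since the listening socket itself is never closed, the queued connection
survives. The sketch ignores the hard part described above: deciding
whether an inherited FD still matches the new configuration.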

In the meantime, there is a kernel patch available on the site to enable
SO_REUSEPORT, which allows both processes to bind the port at the same
time. It eliminates the uncertainty window, since the new process binds
first and only then asks the old one to release the ports. A few RSTs
still remain, though, due to half-open connections that cannot be
transferred.
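
A rough sketch of the bind-then-release order, assuming a kernel and a
Python build that expose SO_REUSEPORT (behaviour of a patched kernel may
differ in detail). The function name is hypothetical and both listeners
live in one script for simplicity.

```python
import socket


def bind_both(host="127.0.0.1"):
    """Bind two listeners to one port, then release the old one."""

    def make_listener(port):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Must be set on every socket sharing the port, before bind().
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind((host, port))
        s.listen(8)
        return s

    old = make_listener(0)          # "old process", kernel-chosen port
    port = old.getsockname()[1]
    new = make_listener(port)       # "new process" binds the same port
    same_port = new.getsockname()[1] == port

    # Only once the new listener is in place does the old one go away,
    # so the port is never left without a listener.
    old.close()

    client = socket.create_connection((host, port), timeout=2)
    new.settimeout(2)
    conn, _peer = new.accept()

    for s in (conn, client, new):
        s.close()
    return same_port


if __name__ == "__main__":
    print(bind_both())
```

Connections already queued on the old socket when it closes are not
moved to the new one, so this narrows the problem rather than removing
every RST.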

Regards,
Willy

