Re: HAProxy and TIME_WAIT

2011-12-01 Thread Daniel Rankov
Thank you, works like a charm!

2011/11/30 Willy Tarreau 

> On Wed, Nov 30, 2011 at 06:10:29PM +0200, Daniel Rankov wrote:
> > Hi, thank you, these explanations are really helpful.
> > Now, maybe because of a bug or something, "option nolinger" is not
> > working for the backend. It works great for the frontend. And I've tested
> > putting this option all over the config file... That's what had confused me.
>
> Indeed you're right, I can reproduce this behaviour too. It happened when
> we introduced the systematic shutdown() before the close() to avoid
> resetting too many connections. Please apply the attached patch which fixes it.
>
> Thanks for your report,
> Willy
>
>


Re: HAProxy and TIME_WAIT

2011-11-30 Thread Willy Tarreau
On Wed, Nov 30, 2011 at 06:10:29PM +0200, Daniel Rankov wrote:
> Hi, thank you, these explanations are really helpful.
> Now, maybe because of a bug or something, "option nolinger" is not
> working for the backend. It works great for the frontend. And I've tested
> putting this option all over the config file... That's what had confused me.

Indeed you're right, I can reproduce this behaviour too. It happened when
we introduced the systematic shutdown() before the close() to avoid resetting
too many connections. Please apply the attached patch which fixes it.

Thanks for your report,
Willy

From b7e257fe61890e4edc839d76dc0223a8d5bdb0f2 Mon Sep 17 00:00:00 2001
From: Willy Tarreau 
Date: Wed, 30 Nov 2011 18:02:24 +0100
Subject: BUG: tcp: option nolinger does not work on backends

Daniel Rankov reported that "option nolinger" is ineffective on backends.
The reason is that it is set on the file descriptor only, which does not
prevent haproxy from performing a clean shutdown() before closing. We must
set the flag on the stream_interface instead if we want an RST to be emitted
upon active close.
---
 src/proto_tcp.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/proto_tcp.c b/src/proto_tcp.c
index 37d9054..5ccfb81 100644
--- a/src/proto_tcp.c
+++ b/src/proto_tcp.c
@@ -239,7 +239,7 @@ int tcpv4_connect_server(struct stream_interface *si,
 	setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, (char *) &one, sizeof(one));
 
 	if (be->options & PR_O_TCP_NOLING)
-		setsockopt(fd, SOL_SOCKET, SO_LINGER, (struct linger *) &nolinger, sizeof(struct linger));
+		si->flags |= SI_FL_NOLINGER;
 
 	/* allow specific binding :
 	 * - server-specific at first
-- 
1.7.2.3



Re: HAProxy and TIME_WAIT

2011-11-30 Thread Daniel Rankov
Hi, thank you, these explanations are really helpful.
Now, maybe because of a bug or something, "option nolinger" is not
working for the backend. It works great for the frontend. And I've tested
putting this option all over the config file... That's what had confused me.

The OS is CentOS 6, HA-Proxy version 1.4.18. Here is the simplified config
file; I guess there is nothing wrong with it, but still:
global
    maxconn 32000
    daemon
    log 127.0.0.1 local1 info
defaults
    log global
    option tcplog
    maxconn 32000
frontend https-in
    bind 192.168.2.38:443
    default_backend servers-https
    option nolinger
backend servers-https
    mode tcp
    balance source
    option redispatch
    option nolinger
    server jetty-1 127.0.0.1:8443

With this configuration haproxy closes the connection to the client with an
RST, but with the backend it does not. Putting "option nolinger" in the
defaults section works the same way.
Am I wrong, or is it a bug?

Thank you



2011/11/30 Willy Tarreau 

> On Wed, Nov 30, 2011 at 03:56:14PM +0200, Daniel Rankov wrote:
> > Ok, now I'm kind of stuck here.
> > Let me share my observations from my really simple environment:
> > for the client I use wget on a server with IP 192.168.2.30,
> > haproxy is set up on a server with IP 192.168.2.38,
> > and haproxy and the web server communicate on 127.0.0.1. haproxy is in
> > tcp mode. This is the captured tcpdump for the connection from client to
> > haproxy /just the closing part/ :
> >
> > 14:56:40.448210 IP 192.168.2.30.55867 > 192.168.2.38.443: . ack 7983 win 204
> > 14:56:40.448849 IP 192.168.2.30.55867 > 192.168.2.38.443: F 618:618(0) ack 7983 win 204
> > 14:56:40.449513 IP 192.168.2.38.443 > 192.168.2.30.55867: F 7983:7983(0) ack 619 win 62
> > 14:56:40.449656 IP 192.168.2.30.55867 > 192.168.2.38.443: . ack 7984 win 204
> >
> > and this is the tcpdump for 127.0.0.1 /just the closing part again/ :
> >
> > 14:56:40.447887 IP 127.0.0.1.59302 > 127.0.0.1.8443: . ack 7983 win 386
> > 14:56:40.448914 IP 127.0.0.1.59302 > 127.0.0.1.8443: F 618:618(0) ack 7983 win 386
> > 14:56:40.449236 IP 127.0.0.1.8443 > 127.0.0.1.59302: F 7983:7983(0) ack 619 win 273
> > 14:56:40.449272 IP 127.0.0.1.59302 > 127.0.0.1.8443: . ack 7984 win 386
> >
> > So that shows me that the connections from haproxy to the webserver are
> > closed with FIN/FIN-ACK/ACK.
> > Here is netstat -anpo | grep TIME:
> > tcp        0      0 127.0.0.1:59302    127.0.0.1:8443    TIME_WAIT   -   timewait (58.73/0/0)
> >
> > Is that the expected behaviour?
>
> Yes, if you're in TCP mode (I thought you were using HTTP mode), it's
> perfectly expected, because in TCP mode there is no way to know whether some
> important data were sent and not received by the other side, so you cannot
> use an RST to force a close.
>
> Also, in TCP mode, haproxy just relays on the other side what it sees. So as
> you can see, wget closes the connection to haproxy, then haproxy does the
> same with the server.
>
> If you want to force an RST, you can use "option nolinger" in the backend.
> But then again, this is really not recommended since it can lead to
> incomplete data being received by the server. In the case of HTTPS, it
> should not be an issue due to the SSL closing handshake, but this is
> something to keep in mind.
>
> Regards,
> Willy
>
>


Re: HAProxy and TIME_WAIT

2011-11-30 Thread Willy Tarreau
On Wed, Nov 30, 2011 at 03:56:14PM +0200, Daniel Rankov wrote:
> Ok, now I'm kind of stuck here.
> Let me share my observations from my really simple environment:
> for the client I use wget on a server with IP 192.168.2.30,
> haproxy is set up on a server with IP 192.168.2.38,
> and haproxy and the web server communicate on 127.0.0.1. haproxy is in
> tcp mode. This is the captured tcpdump for the connection from client to
> haproxy /just the closing part/ :
> 
> 14:56:40.448210 IP 192.168.2.30.55867 > 192.168.2.38.443: . ack 7983 win 204
> 14:56:40.448849 IP 192.168.2.30.55867 > 192.168.2.38.443: F 618:618(0) ack 7983 win 204
> 14:56:40.449513 IP 192.168.2.38.443 > 192.168.2.30.55867: F 7983:7983(0) ack 619 win 62
> 14:56:40.449656 IP 192.168.2.30.55867 > 192.168.2.38.443: . ack 7984 win 204
> 
> and this is the tcpdump for 127.0.0.1 /just the closing part again/ :
> 
> 14:56:40.447887 IP 127.0.0.1.59302 > 127.0.0.1.8443: . ack 7983 win 386
> 14:56:40.448914 IP 127.0.0.1.59302 > 127.0.0.1.8443: F 618:618(0) ack 7983 win 386
> 14:56:40.449236 IP 127.0.0.1.8443 > 127.0.0.1.59302: F 7983:7983(0) ack 619 win 273
> 14:56:40.449272 IP 127.0.0.1.59302 > 127.0.0.1.8443: . ack 7984 win 386
> 
> So that shows me that the connections from haproxy to the webserver are
> closed with FIN/FIN-ACK/ACK.
> Here is netstat -anpo | grep TIME:
> tcp        0      0 127.0.0.1:59302    127.0.0.1:8443    TIME_WAIT   -   timewait (58.73/0/0)
> 
> Is that the expected behaviour?

Yes, if you're in TCP mode (I thought you were using HTTP mode), it's perfectly
expected, because in TCP mode there is no way to know whether some important
data were sent and not received by the other side, so you cannot use an RST to
force a close.

Also, in TCP mode, haproxy just relays on the other side what it sees. So as
you can see, wget closes the connection to haproxy, then haproxy does the same
with the server.

If you want to force an RST, you can use "option nolinger" in the backend. But
then again, this is really not recommended since it can lead to incomplete data
being received by the server. In the case of HTTPS, it should not be an issue
due to the SSL closing handshake, but this is something to keep in mind.

Regards,
Willy




Re: HAProxy and TIME_WAIT

2011-11-30 Thread Daniel Rankov
Ok, now I'm kind of stuck here.
Let me share my observations from my really simple environment:
for the client I use wget on a server with IP 192.168.2.30,
haproxy is set up on a server with IP 192.168.2.38,
and haproxy and the web server communicate on 127.0.0.1. haproxy is in tcp
mode. This is the captured tcpdump for the connection from client to haproxy
/just the closing part/ :

14:56:40.448210 IP 192.168.2.30.55867 > 192.168.2.38.443: . ack 7983 win
204 
14:56:40.448849 IP 192.168.2.30.55867 > 192.168.2.38.443: F 618:618(0) ack
7983 win 204 
14:56:40.449513 IP 192.168.2.38.443 > 192.168.2.30.55867: F 7983:7983(0)
ack 619 win 62 
14:56:40.449656 IP 192.168.2.30.55867 > 192.168.2.38.443: . ack 7984 win
204 

and this is tcpdump for 127.0.0.1 /just the closing part again/ :

14:56:40.447887 IP 127.0.0.1.59302 > 127.0.0.1.8443: . ack 7983 win 386

14:56:40.448914 IP 127.0.0.1.59302 > 127.0.0.1.8443: F 618:618(0) ack 7983
win 386 
14:56:40.449236 IP 127.0.0.1.8443 > 127.0.0.1.59302: F 7983:7983(0) ack 619
win 273 
14:56:40.449272 IP 127.0.0.1.59302 > 127.0.0.1.8443: . ack 7984 win 386


So that shows me that the connections from haproxy to the webserver are closed
with FIN/FIN-ACK/ACK.
Here is netstat -anpo | grep TIME:
tcp        0      0 127.0.0.1:59302    127.0.0.1:8443    TIME_WAIT   -   timewait (58.73/0/0)

Is that the expected behaviour?

All the best !



2011/11/29 Willy Tarreau 

> Hi Daniel,
>
> On Tue, Nov 29, 2011 at 06:10:46PM +0200, Daniel Rankov wrote:
> > For sure TIME_WAIT connections are not an issue when they keep
> > information about sockets to clients, but when TIME_WAIT connections
> > keep sockets busy on the host where HAProxy is deployed, towards the
> > backend the limit can be reached - it's defined by ip_local_port_range.
> > Here is what I mean:
> > Client -(1)-> HAProxy -(2)-> Webserver
> >  / it doesn't matter if the web server and haproxy are on the same server. /
> > I) the client connects to haproxy;
> > a socket is taken - clientIP:random_port:haproxy_ip:haproxy_port
> >
> > II) haproxy connects to the webserver;
> > a socket is taken - haproxy_local_ip:random_port:backend_ip:backend_port
> >
> > III) the client closes connection (1) to haproxy in the normal way -
> > FIN/FIN-ACK/ACK.
> > This way we have one connection that goes from ESTABLISHED to TIME_WAIT
> > state. We don't really care about this TIME_WAIT connection because the
> > socket that is taken is between the client and haproxy
> > - clientIP:random_port:haproxy_ip:haproxy_port
> >
> > IV) haproxy closes connection (2) to the backend with FIN/FIN-ACK/ACK.
> > Now this ESTABLISHED connection goes to TIME_WAIT state, and the socket
> > that is taken is between haproxy and the backend server.
>
> I agree on this point and this is why it does not happen :-)
>
> Haproxy uses an RST to close the connection to the backend server precisely
> because of this, otherwise it would not work at all. You can strace it, you
> will notice that it does a setsockopt(SO_LINGER, {0}) before closing. In
> fact, you cannot even configure it not to do this because it would cause
> too much harm.
>
> (...)
> > I believe that the common architecture is that backend servers are
> > physically close to haproxy and are on high-speed networks where no
> > packet loss is expected. So we don't really need the TIME_WAIT state
> > here. It's not needed on localhost for sure.
>
> When haproxy closes the connection to a server, we never need the TIME_WAIT
> anyway, because if it closes, it means it has nothing left to say to the
> server and is not interested in getting its response. So even if some data
> were lost, it would not be an issue.
>
> For instance, one situation where you can observe this close is when you
> enable forceclose or http-server-close. You'll see that as soon as the
> last byte of payload is received, haproxy sends an RST to the server to
> release the connection so that another pending request may immediately
> reuse it.
>
> Even if the RST was lost, a packet from the server would reach the haproxy
> machine and match no known connection, causing an RST in return.
>
> That said, I completely agree with all your analysis and that's what has
> caused me a lot of gray hair when implementing the client-side keep-alive,
> precisely because I needed a way to make haproxy close the server
> connection without being affected by the TIME_WAIT on this side.
>
> Best regards,
> Willy
>
>


Re: HAProxy and TIME_WAIT

2011-11-29 Thread Willy Tarreau
Hi Daniel,

On Tue, Nov 29, 2011 at 06:10:46PM +0200, Daniel Rankov wrote:
> For sure TIME_WAIT connections are not an issue when they keep information
> about sockets to clients, but when TIME_WAIT connections keep sockets busy
> on the host where HAProxy is deployed, towards the backend the limit can be
> reached - it's defined by ip_local_port_range.
> Here is what I mean:
> Client -(1)-> HAProxy -(2)-> Webserver
>  / it doesn't matter if the web server and haproxy are on the same server. /
> I) the client connects to haproxy;
> a socket is taken - clientIP:random_port:haproxy_ip:haproxy_port
> 
> II) haproxy connects to the webserver;
> a socket is taken - haproxy_local_ip:random_port:backend_ip:backend_port
> 
> III) the client closes connection (1) to haproxy in the normal way -
> FIN/FIN-ACK/ACK.
> This way we have one connection that goes from ESTABLISHED to TIME_WAIT
> state. We don't really care about this TIME_WAIT connection because the
> socket that is taken is between the client and haproxy
> - clientIP:random_port:haproxy_ip:haproxy_port
> 
> IV) haproxy closes connection (2) to the backend with FIN/FIN-ACK/ACK.
> Now this ESTABLISHED connection goes to TIME_WAIT state, and the socket
> that is taken is between haproxy and the backend server.

I agree on this point and this is why it does not happen :-)

Haproxy uses an RST to close the connection to the backend server precisely
because of this, otherwise it would not work at all. You can strace it, you
will notice that it does a setsockopt(SO_LINGER, {0}) before closing. In
fact, you cannot even configure it not to do this because it would cause too
much harm.

(...)
> I believe that the common architecture is that backend servers are
> physically close to haproxy and are on high-speed networks where no packet
> loss is expected. So we don't really need the TIME_WAIT state here. It's not
> needed on localhost for sure.

When haproxy closes the connection to a server, we never need the TIME_WAIT
anyway, because if it closes, it means it has nothing left to say to the
server and is not interested in getting its response. So even if some data
were lost, it would not be an issue.

For instance, one situation where you can observe this close is when you
enable forceclose or http-server-close. You'll see that as soon as the
last byte of payload is received, haproxy sends an RST to the server to
release the connection so that another pending request may immediately
reuse it.
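For reference, a minimal sketch of the configuration Willy is describing here (haproxy 1.4 syntax; this fragment is illustrative, not taken from the thread):

```
defaults
    mode http
    option http-server-close    # haproxy actively closes the server side
                                # after each response (client side kept alive)
    # option forceclose         # alternative: actively close both sides
```

With either option set, the server-side close is done abortively (RST), which is exactly why no TIME_WAIT accumulates on the haproxy machine for those connections.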

Even if the RST was lost, a packet from the server would reach the haproxy
machine and match no known connection, causing an RST in return.

That said, I completely agree with all your analysis and that's what has
caused me a lot of gray hair when implementing the client-side keep-alive,
precisely because I needed a way to make haproxy close the server connection
without being affected by the TIME_WAIT on this side.

Best regards,
Willy




Re: HAProxy and TIME_WAIT

2011-11-29 Thread Daniel Rankov
For sure TIME_WAIT connections are not an issue when they keep information
about sockets to clients, but when TIME_WAIT connections keep sockets busy
on the host where HAProxy is deployed, towards the backend the limit can be
reached - it's defined by ip_local_port_range.
Here is what I mean:
Client -(1)-> HAProxy -(2)-> Webserver
 / it doesn't matter if the web server and haproxy are on the same server. /
I) the client connects to haproxy;
a socket is taken - clientIP:random_port:haproxy_ip:haproxy_port

II) haproxy connects to the webserver;
a socket is taken - haproxy_local_ip:random_port:backend_ip:backend_port

III) the client closes connection (1) to haproxy in the normal way -
FIN/FIN-ACK/ACK.
This way we have one connection that goes from ESTABLISHED to TIME_WAIT
state. We don't really care about this TIME_WAIT connection because the
socket that is taken is between the client and haproxy
- clientIP:random_port:haproxy_ip:haproxy_port

IV) haproxy closes connection (2) to the backend with FIN/FIN-ACK/ACK.
Now this ESTABLISHED connection goes to TIME_WAIT state, and the socket
that is taken is between haproxy and the backend server.
It looks like haproxy_local_ip:random_port:backend_ip:backend_port.
If we say that haproxy and the webserver communicate on 127.0.0.1 and the
web server is working on port 8080, then we have a socket like this taken:
127.0.0.1:RANDOM_PORT:127.0.0.1:8080

This RANDOM_PORT is in the range defined by the sysctl ip_local_port_range,
and this connection on CentOS will be kept for 60 seconds.
As you see, on a loaded server this limit of open ports might be reached.
(Some math: by default we have about 30000 open ports, each held for 60
seconds, which is about 500 new connections/second.)

That is why it would be great to be able to configure haproxy to reset the
connection to the backend.
I believe that the common architecture is that backend servers are
physically close to haproxy and are on high-speed networks where no packet
loss is expected. So we don't really need the TIME_WAIT state here. It's not
needed on localhost for sure.

All the best !



2011/11/29 Willy Tarreau 

> On Tue, Nov 29, 2011 at 09:41:30AM -0500, James Bardin wrote:
> > From looking into this, I don't see an option in HAProxy to RST all
> > closed connections on a backend, though the documentation makes it
> > sound like the nolinger options does do this. Hopefully one of the
> > devs (Willy?) can chime in with some advice.
>
> Indeed, nolinger does this but it's strongly advised not to use it,
> because it precisely kills the TCP connection (reason why there is no
> time_wait left), which causes truncated objects on the remote server
> if the last segments are lost. The reason is that these lost segments
> will not be retransmitted and the client will get an RST instead.
>
> TIME_WAIT sockets are not an issue on a server. The only trouble they're
> causing is that they pollute the "netstat -a" output. But that's all.
> These sockets are totally normal and expected. My record is 5 million
> on a heavily loaded server :-)
>
> There is absolutely no reason to worry about these sockets, they're
> closed and waiting for either the TCP timer, a SYN or an RST to expire.
>
> Best regards,
> Willy
>
>


Re: HAProxy and TIME_WAIT

2011-11-29 Thread Daniel Rankov
I would prefer not to use tw_reuse, because that will affect the whole
server's TCP communication, not just one process (HAProxy in this case).
So I've tested nolinger, but what it does isn't completely the solution.
Here is what happens when using it: let's say the client closes the
connection to HAProxy with an RST - then an RST is sent to the backend. But
when a client closes the connection normally with FIN/FIN-ACK/ACK,
then FIN/FIN-ACK/ACK is used by HAProxy to close the connection to the
backend. That way we hit the core problem again.

I'm looking for a solution as described in TCP/IP Illustrated:
"it's also possible to abort a connection by sending a reset instead of a
FIN. This is sometimes called an abortive release."
so that no matter how the client closes a connection, an RST is always sent
to the backend.
It would also be interesting whether, if the backend closes a connection
with a FIN, an RST could be sent from HAProxy to the backend.
This way no resources would be taken up needlessly.

Greetings


2011/11/28 James Bardin 

> On Mon, Nov 28, 2011 at 12:28 PM, Daniel Rankov 
> wrote:
> > Yeap, I'm aware of net.ipv4.tcp_tw_reuse and the need of TIME_WAIT state,
> > but still if there is a way to send a RST /either configuration or
> compile
> > parameter/ the connection will be destroyed.
> >
>
> TIME_WAIT is usually not a problem if port reuse is enabled (I haven't
> seen an example otherwise), and you will usually have FIN_WAIT1
> sockets if there is a problem with connections terminating badly.
>
> Now that I recall that the socket option to always send RST packets is
> SO_LINGER (with a zero timeout), I noticed that there is an 'option
> nolinger' for both frontends and backends in haproxy.
>
>
> -jim
>


Re: HAProxy and TIME_WAIT

2011-11-28 Thread James Bardin
On Mon, Nov 28, 2011 at 12:28 PM, Daniel Rankov  wrote:
> Yep, I'm aware of net.ipv4.tcp_tw_reuse and the need for the TIME_WAIT
> state, but still, if there were a way to send an RST /either a configuration
> or compile parameter/ the connection would be destroyed.
>

TIME_WAIT is usually not a problem if port reuse is enabled (I haven't
seen an example otherwise), and you will usually have FIN_WAIT1
sockets if there is a problem with connections terminating badly.

Now that I recall that the socket option to always send RST packets is
SO_LINGER (with a zero timeout), I noticed that there is an 'option
nolinger' for both frontends and backends in haproxy.


-jim



Re: HAProxy and TIME_WAIT

2011-11-28 Thread Daniel Rankov
Yep, I'm aware of net.ipv4.tcp_tw_reuse and the need for the TIME_WAIT state,
but still, if there were a way to send an RST /either a configuration or
compile parameter/ the connection would be destroyed.


2011/11/28 James Bardin 

> On Mon, Nov 28, 2011 at 11:50 AM, Daniel Rankov 
> wrote:
> > And on a loaded server this will cause trouble. Isn't there a chance for
> > HAProxy to send an RST, so that the connection will be dropped?
>
> An RST packet won't make the TIME_WAIT socket disappear. It's part of
> the TCP protocol, and a socket will sit in that state for 2 minutes
> after closing.
>
>
> You can put `net.ipv4.tcp_tw_reuse = 1` in your sysctl.conf to allow
> sockets in TIME_WAIT to be reused if needed.
>
> -jim
>


Re: HAProxy and TIME_WAIT

2011-11-28 Thread James Bardin
On Mon, Nov 28, 2011 at 11:50 AM, Daniel Rankov  wrote:
> And on a loaded server this will cause trouble. Isn't there a chance for
> HAProxy to send an RST, so that the connection will be dropped?

An RST packet won't make the TIME_WAIT socket disappear. It's part of
the TCP protocol, and a socket will sit in that state for 2 minutes
after closing.


You can put `net.ipv4.tcp_tw_reuse = 1` in your sysctl.conf to allow
sockets in TIME_WAIT to be reused if needed.

-jim