On Fri, Mar 10, 2017 at 04:09:10PM +0100, Willy Tarreau wrote:
> Hi again,
> 
> I'm having an issue with your reproducer, it doesn't work at
> all for me and I'm a bit surprised by this :
> 
> On Wed, Mar 08, 2017 at 10:09:25PM +0800, longhb wrote:
> >  [PATCH] BUG/MAJOR: stream: fix tcp half connection expire causes cpu 100%
> > 
> >  Reproduction conditions:
> >      haproxy config:         
> >          global:            
> >              tune.bufsize 10485760         
> >          defaults             
> >              timeout server-fin 90s    
> >              timeout client-fin 90s
> >          backend node2
> >              mode tcp
> >              timeout server 900s
> >              timeout connect 10s
> >              server def 127.0.0.1:3333
> >          frontend fe_api
> >              mode  tcp
> >              timeout client 900s
> >              bind :1990
> >              use_backend node2
> >     With timeout server-fin shorter than timeout server, the backend
> >     server sends data, which remains in haproxy's buffer, then the
> >     backend server sends a FIN packet, which haproxy receives. At this
> >     point the session information is as follows:
> >         0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2
> >         srv=def ts=08 age=1s calls=3 rq[f=848000h,i=0,an=00h,rx=14m58s,wx=,ax=]
> >         rp[f=8004c020h,i=0,an=00h,rx=,wx=14m58s,ax=] s0=[7,0h,fd=6,ex=]
> >         s1=[7,18h,fd=7,ex=] exp=14m58s
> >     rp now has the CF_SHUTR flag set. Next, the client sends a FIN
> >     packet, and the session information is as follows:
> >         0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2
> >         srv=def ts=08 age=38s calls=4 rq[f=84a020h,i=0,an=00h,rx=,wx=,ax=]
> >         rp[f=8004c020h,i=0,an=00h,rx=1m11s,wx=14m21s,ax=] s0=[7,0h,fd=6,ex=]
> >         s1=[9,10h,fd=7,ex=] exp=1m11s
> 
> Here, as you mentioned, both remotes have sent their FIN so once the
> data are transferred the sessions should close. So I'm definitely missing
> something. Does the server (or the client) send more data than the buffer
> can store ? Does either side refrain from reading all the data ?
> I've tried various such scenarios and I cannot reproduce your situation
> unfortunately. I have an idea of how to definitely get rid of all this
> mess but I have no way to validate that it will work in your case. Any
> help would be much appreciated. BTW, if you want to have more detailed
> session dumps, you can type "show sess <ptr>" or "show sess all", you'll
> get many more details about the internals.
> 
> Also, could you tell me what version you are using ?

OK don't waste your time, I finally managed to get it to work by filling
the buffer from the client to the server and preventing the server from
reading these data. I did it with tcploop (I've also reduced the timeouts) :

# config :

global
    tune.bufsize 10485760

defaults
    timeout server-fin 3s
    timeout client-fin 3s

backend node2
    mode tcp
    timeout server 90s
    timeout connect 1s
    server def 127.0.0.1:3333

frontend fe_api
    mode  tcp
    timeout client 90s
    bind :1990
    use_backend node2

$ tcploop 3333 L W N20 A P100 F P10000 &
$ tcploop 127.0.0.1:1990 C S1000000 F

strace shows that epoll_wait() starts looping after 3 seconds, which matches
the client-fin/server-fin timeouts configured above.

I think we can address it centrally in the shutw() functions by setting
the clientfin/serverfin values into the stream interface, which will
allow us to remove all the incomplete tests that are spread all over
the code.
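
To illustrate the idea, here is a minimal model of it in Python rather than
the real C code. Everything in it is hypothetical: the Channel and StreamSide
classes, their fields and this shutw() are simplified stand-ins and not
haproxy's actual structures. It only assumes that the clientfin/serverfin
value is copied into each side at setup time, so that the shutdown path can
apply it in a single place:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Channel:
    """One direction of a stream (hypothetical, much simplified)."""
    shut_w: bool = False                    # write side already shut down
    read_timeout_ms: Optional[int] = None   # None means "no timeout"
    read_expire_ms: Optional[int] = None    # absolute expiration date


@dataclass
class StreamSide:
    """Stand-in for one stream interface (client side or server side)."""
    incoming: Channel                       # data received from this side
    outgoing: Channel                       # data sent to this side
    fin_timeout_ms: Optional[int] = None    # clientfin/serverfin, set at setup


def shutw(side: StreamSide, now_ms: int) -> None:
    """Shut the write direction of <side>.

    If a FIN timeout is stored on the side, it replaces the read timeout
    of that same side, so a half-closed connection expires after the short
    FIN timeout instead of the long regular one.
    """
    if side.outgoing.shut_w:
        return                              # already done, nothing to change
    side.outgoing.shut_w = True
    if side.fin_timeout_ms is not None:
        side.incoming.read_timeout_ms = side.fin_timeout_ms
        side.incoming.read_expire_ms = now_ms + side.fin_timeout_ms


# Values taken from the test config: 90s regular timeout, 3s server-fin.
request = Channel(read_timeout_ms=90_000, read_expire_ms=90_000)
response = Channel(read_timeout_ms=90_000, read_expire_ms=90_000)
server_side = StreamSide(incoming=response, outgoing=request,
                         fin_timeout_ms=3_000)

shutw(server_side, now_ms=1_000)
print(response.read_timeout_ms, response.read_expire_ms)  # 3000 4000
```

With the timeout applied inside shutw() itself, the callers would no longer
need to check the fin timeouts on their own, which is the point of doing it
centrally.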

Willy
