On 2018/8/9 01:01, Patrick Hemmer wrote:
> There's an issue on current master (287527a) where haproxy is losing
> track of its connections, and they're getting stuck in CLOSE_WAIT. And
> it's counting these connections towards limits (such as maxconn).
> Eventually maxconn is reached and everything starts queuing, and
> timing out.
>
> Here's an example:
>         # lsof -p 25520
>         COMMAND   PID    USER   FD     TYPE             DEVICE  
> SIZE/OFF    NODE NAME
>         ....
>         haproxy 25520 phemmer    0u     CHR               16,5
> 0x1409380c    1797 /dev/ttys005
>         haproxy 25520 phemmer    1u     CHR               16,5
> 0x1409380c    1797 /dev/ttys005
>         haproxy 25520 phemmer    2u     CHR               16,5
> 0x1409380c    1797 /dev/ttys005
>         haproxy 25520 phemmer    3u 
> KQUEUE                                       count=0, state=0xa
>         haproxy 25520 phemmer    4u    unix 0x933b1c4c9ed396b7       
> 0t0         /tmp/haproxy.sock.25520.tmp
>         haproxy 25520 phemmer    5u    IPv4 0x933b1c4cbf5fc73f       
> 0t0     TCP *:8001 (LISTEN)
>         haproxy 25520 phemmer    6u    IPv4 0x933b1c4c925fe437       
> 0t0     UDP *:51977
>         haproxy 25520 phemmer    7     PIPE 0x933b1c4c9f89c457     
> 16384         ->0x933b1c4c9f89bf17
>         haproxy 25520 phemmer    8     PIPE 0x933b1c4c9f89bf17     
> 16384         ->0x933b1c4c9f89c457
>         haproxy 25520 phemmer    9u    unix 0x933b1c4c9548c96f       
> 0t0         /tmp/haproxy.sock.25520.tmp
>         haproxy 25520 phemmer   11u    IPv4 0x933b1c4cc0cca73f       
> 0t0     TCP 127.0.0.1:8001->127.0.0.1:59199 (CLOSE_WAIT)
>         haproxy 25520 phemmer   12u    IPv4 0x933b1c4cbfd20ddf       
> 0t0     TCP 127.0.0.1:59200->127.0.0.1:8081 (CLOSE_WAIT)
>
>
> FD 11 was the client making the request to haproxy, and FD 12 was
> haproxy to the server.
>
>         > show fd
>                     4 : st=0x05(R:PrA W:pra) ev=0x01(heopI) [lc]
> cnext=-13 cprev=-2 tmask=0x1 umask=0x0 owner=0x7f852af08430
> iocb=0x10bb89c40(listener_accept) l.st=RDY fe=GLOBAL
>                     5 : st=0x05(R:PrA W:pra) ev=0x01(heopI) [lc]
> cnext=-13 cprev=-2 tmask=0x1 umask=0x0 owner=0x7f852af08a40
> iocb=0x10bb89c40(listener_accept) l.st=RDY fe=f1
>                     7 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc]
> cnext=-3 cprev=0 tmask=0x1 umask=0x0 owner=0x10bbc2ca0
> iocb=0x10bbc2ca0(poller_pipe_io_handler)
>                     9 : st=0x20(R:pra W:pRa) ev=0x00(heopi) [lc]
> cnext=-3 cprev=-2 tmask=0x1 umask=0x1 owner=0x7f852ae14550
> iocb=0x10bba60a0(conn_fd_handler) cflg=0x00201300 fe=GLOBAL mux=PASS
> mux_ctx=0x7f852ca00000
>                  11 : st=0x22(R:pRa W:pRa) ev=0x10(Heopi) [lc]
> cnext=-3 cprev=-2 tmask=0x1 umask=0x0 owner=0x7f852ae13ef0
> iocb=0x10bba60a0(conn_fd_handler) cflg=0x80241300 fe=f1 mux=PASS
> mux_ctx=0x7f852ae14080
>                  12 : st=0x25(R:PrA W:pRa) ev=0x00(heopi) [Lc]
> cnext=-17 cprev=-2 tmask=0x1 umask=0x0 owner=0x7f852ca000c0
> iocb=0x10bba60a0(conn_fd_handler) cflg=0x00202306 sv=s1/b1 mux=PASS
> mux_ctx=0x7f852ae14bb0
>
>         > show sess
>         0x7f852ae14140: proto=tcpv4 src=127.0.0.1:59199 fe=f1 be=b1
> srv=s1 ts=00 age=2m9s calls=6 rq[f=8840020h,i=0,an=8000h,rx=,wx=,ax=]
> rp[f=80400000h,i=0,an=a00000h,rx=,wx=,ax=] s0=[7,8h,fd=11,ex=]
> s1=[7,118h,fd=12,ex=] exp=
>         0x7f852ae14830: proto=unix_stream src=unix:1 fe=GLOBAL
> be=<NONE> srv=<none> ts=00 age=20s calls=3
> rq[f=40c08202h,i=0,an=00h,rx=,wx=,ax=]
> rp[f=c0048202h,i=0,an=00h,rx=,wx=,ax=] s0=[7,ch,fd=9,ex=]
> s1=[7,4018h,fd=-1,ex=] exp=23h59m
>
>
>
> Here's the config I'm using:
>         global
>             stats socket /tmp/haproxy.sock level admin
>         defaults
>             log 127.0.0.1:1234 daemon
>             mode http
>             option httplog
>             timeout queue 5s
>         frontend f1
>             bind :8001
>             default_backend b1
>         backend b1
>             server s1 127.0.0.1:8081 maxconn 1
>
>
> Log output around the time it broke:
>         <30>Aug  9 00:45:56 haproxy[25520]: 127.0.0.1:59186
> [09/Aug/2018:00:45:56.620] f1 b1/s1 0/167/1/107/275 200 121 - - ----
> 2/2/1/1/0 0/2 "GET / HTTP/1.1"
>         <30>Aug  9 00:45:57 haproxy[25520]: 127.0.0.1:59189
> [09/Aug/2018:00:45:56.726] f1 b1/s1 0/169/1/108/278 200 121 - - ----
> 2/2/1/1/0 0/2 "GET / HTTP/1.1"
>         <30>Aug  9 00:45:57 haproxy[25520]: 127.0.0.1:59193
> [09/Aug/2018:00:45:56.948] f1 b1/s1 0/56/1/109/167 200 121 - - ----
> 2/2/1/1/0 0/1 "GET / HTTP/1.1"
>         <30>Aug  9 00:45:57 haproxy[25520]: 127.0.0.1:59196
> [09/Aug/2018:00:45:57.056] f1 b1/s1 0/58/1/101/161 200 121 - - ----
> 2/2/1/1/0 0/1 "GET / HTTP/1.1"
>         <30>Aug  9 00:46:02 haproxy[25520]: 127.0.0.1:59202
> [09/Aug/2018:00:45:57.351] f1 b1/<NOSRV> 0/5001/-1/-1/5007 503 212 - -
> sQ-- 3/3/2/0/0 0/1 "GET / HTTP/1.1"
>         <30>Aug  9 00:46:02 haproxy[25520]: 127.0.0.1:59204
> [09/Aug/2018:00:45:57.546] f1 b1/<NOSRV> 0/5005/-1/-1/5005 503 212 - -
> sQ-- 3/3/2/0/0 0/2 "GET / HTTP/1.1"
>         <30>Aug  9 00:46:07 haproxy[25520]: 127.0.0.1:59206
> [09/Aug/2018:00:46:02.414] f1 b1/<NOSRV> 0/5010/-1/-1/5011 503 212 - -
> sQ-- 3/3/2/0/0 0/2 "GET / HTTP/1.1"
>         <30>Aug  9 00:46:08 haproxy[25520]: 127.0.0.1:59208
> [09/Aug/2018:00:46:03.610] f1 b1/<NOSRV> 0/5010/-1/-1/5010 503 212 - -
> sQ-- 3/3/2/0/0 0/2 "GET / HTTP/1.1"
>
>
> Attached is the debug output (compiled with DEBUG=-DDEBUG_FULL)
>
> To reproduce I was hammering it with several clients simultaneously,
> and a server which basically slept for 100ms before responding. Even
> with this it's still rather difficult to trigger. I sometimes will
> have to run for a few hours before it happens.
>
> Version data:
>         # ./haproxy -vv     
>         Sharing comp_state with email_alert
>         Sharing filter with tcpcheck_ru
>         Sharing sig_handlers with pipe
>         Sharing tasklet with email_alert
>         Sharing notification with email_alert
>         Sharing session with task
>         Sharing conn_stream with task
>         Sharing pendconn with tcpcheck_ru
>         Sharing http_txn with spoe_ctx
>         Sharing hlua with uniqueid
>         HA-Proxy version 1.9-dev1-287527-26 2018/08/08
>         Copyright 2000-2018 Willy Tarreau <wi...@haproxy.org>
>
>         Build options :
>             TARGET  = osx
>             CPU     = generic
>             CC      = gcc
>             CFLAGS  = -O0 -g -fno-strict-aliasing
> -Wdeclaration-after-statement -fwrapv -fno-strict-overflow
> -Wno-address-of-packed-member -Wno-null-dereference -Wno-unused-label
>             OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1
>
>         Default settings :
>             maxconn = 2000, bufsize = 16384, maxrewrite = 1024,
> maxpollevents = 200
>
>         Built with OpenSSL version : OpenSSL 1.1.0h  27 Mar 2018
>         Running on OpenSSL version : OpenSSL 1.1.0h  27 Mar 2018
>         OpenSSL library supports TLS extensions : yes
>         OpenSSL library supports SNI : yes
>         OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2
>         Built with Lua version : Lua 5.3.4
>         Built with transparent proxy support using:
>         Encrypted password support via crypt(3): yes
>         Built with PCRE version : 8.42 2018-03-20
>         Running on PCRE version : 8.42 2018-03-20
>         PCRE library supports JIT : no (USE_PCRE_JIT not set)
>         Built with zlib version : 1.2.11
>         Running on zlib version : 1.2.11
>         Compression algorithms supported : identity("identity"),
> deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
>         Built with network namespace support.
>
>         Available polling systems :
>                  kqueue : pref=300,  test result OK
>                      poll : pref=200,  test result OK
>                  select : pref=150,  test result OK
>         Total: 3 (3 usable), will use kqueue.
>
>         Available multiplexer protocols :
>         (protocols markes as <default> cannot be specified using
> 'proto' keyword)
>                                     h2 : mode=HTTP       side=FE
>                      <default> : mode=TCP|HTTP   side=FE|BE
>
>         Available filters :
>             [SPOE] spoe
>             [COMP] compression
>             [TRACE] trace
>
>
> -Patrick

I think this is my last ping.

-Patrick

Reply via email to