Did a bit more digging on the most recent instance, and found that the haproxy pid doing the hogging was handling a connection to the stats port :
listen haproxy_stats :50000
    stats enable
    stats uri /
    no log

with this 'netstat -pantlu' entry:

tcp 0 99756 10.34.176.98:50000 10.255.247.189:54484 ESTABLISHED 9499/haproxy

I suspect that a connection to the stats port goes wonky after a '-sf'
reload, but I'll have to wait for it to reappear to poke further. I'll
look first for a stats port connection handled by the pegged process,
then use 'tcpkill' to kill just that connection (rather than the whole
process, which may be handling other connections).

It's been happening 2 to 3 times a week, and I now have alerting around
the event - I'll post more info as I get it ...

On Fri, Apr 15, 2016 at 4:28 PM, Cyril Bonté <cyril.bo...@free.fr> wrote:
> Hi Jim,
>
> On 15/04/2016 23:20, Jim Freeman wrote:
>>
>> I have haproxy slaved to 2d cpu (CPU1), with frequent config changes
>> and a '-sf' soft-stop with the now-old non-listening process nannying
>> old connections.
>>
>> Sometimes CPU1 goes to %100, and then a few minutes later request
>> latencies suffer across multiple haproxy peers.
>>
>> An strace of the nanny haproxy process shows a tight loop of :
>>
>> epoll_wait(0, {}, 200, 0) = 0
>> epoll_wait(0, {}, 200, 0) = 0
>> epoll_wait(0, {}, 200, 0) = 0
>>
>> I've searched the archives and found similar but old-ish complaints
>> about similar circumstances, but with fixes/patches mentioned.
>>
>> This has happened with both 1.5.3 and 1.5.17.
>>
>> Insights ?
>
>
> Can you provide your configuration (without sensible data) ?
> Are you using peers ?
>
> Also, do you have a reproductible testcase that we can play with, or is it
> absolutely random ?
>
>
>
>>
>> ===========
>>
>> # cat /proc/version
>> Linux version 3.16.0-0.bpo.4-amd64 (debian-ker...@lists.debian.org)
>> (gcc version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian
>> 3.16.7-ckt25-1~bpo70+1 (2016-04-02)
>>
>> # haproxy -vv
>> HA-Proxy version 1.5.17 2016/04/13
>> Copyright 2000-2016 Willy Tarreau <wi...@haproxy.org>
>>
>> Build options :
>>   TARGET = linux2628
>>   CPU = generic
>>   CC = gcc
>>   CFLAGS = -g -O2 -fstack-protector --param=ssp-buffer-size=4
>> -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2
>>   OPTIONS = USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_PCRE=1
>>
>> Default settings :
>>   maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200
>>
>> Encrypted password support via crypt(3): yes
>> Built with zlib version : 1.2.7
>> Compression algorithms supported : identity, deflate, gzip
>> Built with OpenSSL version : OpenSSL 1.0.1e 11 Feb 2013
>> Running on OpenSSL version : OpenSSL 1.0.1e 11 Feb 2013
>> OpenSSL library supports TLS extensions : yes
>> OpenSSL library supports SNI : yes
>> OpenSSL library supports prefer-server-ciphers : yes
>> Built with PCRE version : 8.30 2012-02-04
>> PCRE library supports JIT : no (USE_PCRE_JIT not set)
>> Built with transparent proxy support using: IP_TRANSPARENT
>> IPV6_TRANSPARENT IP_FREEBIND
>>
>> Available polling systems :
>>   epoll : pref=300, test result OK
>>   poll : pref=200, test result OK
>>   select : pref=150, test result OK
>> Total: 3 (3 usable), will use epoll.
>>
>
>
> --
> Cyril Bonté
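
For the check described at the top of this message (find a stats port
connection owned by the pegged pid, then kill only that connection), here
is a rough Python sketch. It assumes the seven-column tcp layout of the
'netstat -pantlu' entry quoted above; the function names are made up for
illustration, and the filter string is just a generic pcap-style
"host ... and port ..." expression of the kind tcpkill accepts (the
interface to pass with -i depends on the host and is not shown).

```python
def stats_connections(netstat_output, pid, port):
    """Return (local, remote, send_q) for ESTABLISHED tcp connections
    on local `port` that are owned by process `pid`."""
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed tcp layout:
        # proto recv-q send-q local foreign state pid/program
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        _proto, _recv_q, send_q, local, remote, state, owner = fields[:7]
        if state != "ESTABLISHED":
            continue
        owner_pid = owner.split("/", 1)[0]
        local_port = local.rsplit(":", 1)[-1]
        if owner_pid == str(pid) and local_port == str(port):
            matches.append((local, remote, int(send_q)))
    return matches


def pcap_filter_for(remote):
    """Build a pcap-style filter matching the remote end of one connection."""
    host, port = remote.rsplit(":", 1)
    return "host {} and port {}".format(host, port)


if __name__ == "__main__":
    sample = (
        "tcp 0 99756 10.34.176.98:50000 10.255.247.189:54484 "
        "ESTABLISHED 9499/haproxy"
    )
    for local, remote, send_q in stats_connections(sample, 9499, 50000):
        print(local, remote, send_q, "->", pcap_filter_for(remote))
```

A large, non-draining Send-Q on such a connection (99756 in the entry
above) is the thing to watch for on the next occurrence.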