Hi Willy and community developers,

I am not sure whether I am doing something wrong, but I wanted to report some issues I am seeing while comparing process mode and thread mode. Please let me know if these are known problems.
1. HAProxy system:
   Kernel: 4.17.13
   CPU: 48-core E5-2670 v3
   Memory: 128 GB
   NIC: Mellanox 40G with IRQ pinning

2. Client: 48 cores, similar to the server. Test command line:

   wrk -c 4800 -t 48 -d 30s http://<IP:80>/128

3. HAProxy version: I am testing both 1.8.14 and 1.9-dev3 (git checkout as of Oct 2nd).

# haproxy-git -vv
HA-Proxy version 1.9-dev3 2018/09/29
Copyright 2000-2018 Willy Tarreau <wi...@haproxy.org>

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -fno-strict-overflow -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-old-style-declaration -Wno-ignored-qualifiers -Wno-clobbered -Wno-missing-field-initializers -Wtype-limits
  OPTIONS = USE_ZLIB=yes USE_OPENSSL=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2g  1 Mar 2016
Running on OpenSSL version : OpenSSL 1.0.2g  1 Mar 2016
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.38 2015-11-23
Running on PCRE version : 8.38 2015-11-23
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with zlib version : 1.2.8
Running on zlib version : 1.2.8
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.
Available multiplexer protocols :
(protocols markes as <default> cannot be specified using 'proto' keyword)
              h2 : mode=HTTP       side=FE
       <default> : mode=TCP|HTTP   side=FE|BE

Available filters :
	[SPOE] spoe
	[COMP] compression
	[TRACE] trace

4. HAProxy results by number of processes/threads (requests per second):

    #     Threads RPS    Procs RPS
    1         20903          19280
    2         46400          51045
    4         96587         142801
    8        172224         254720
    12       210451         437488
    16       173034         437375
    24        79069         519367
    32        55607         586367
    48        31739         596148

5. Lock stats for 1.9-dev3: some write locks took much longer on average to acquire than others, e.g. "POOL" and "TASK_WQ". For 48 threads, I get:

Stats about Lock FD:
	# write lock  : 143933900
	# write unlock: 143933895 (-5)
	# wait time for write     : 11370.245 msec
	# wait time for write/lock: 78.996 nsec
	# read lock   : 0
	# read unlock : 0 (0)
	# wait time for read      : 0.000 msec
	# wait time for read/lock : 0.000 nsec
Stats about Lock TASK_RQ:
	# write lock  : 2062874
	# write unlock: 2062875 (1)
	# wait time for write     : 7820.234 msec
	# wait time for write/lock: 3790.941 nsec
	# read lock   : 0
	# read unlock : 0 (0)
	# wait time for read      : 0.000 msec
	# wait time for read/lock : 0.000 nsec
Stats about Lock TASK_WQ:
	# write lock  : 2601227
	# write unlock: 2601227 (0)
	# wait time for write     : 5019.811 msec
	# wait time for write/lock: 1929.786 nsec
	# read lock   : 0
	# read unlock : 0 (0)
	# wait time for read      : 0.000 msec
	# wait time for read/lock : 0.000 nsec
Stats about Lock POOL:
	# write lock  : 2823393
	# write unlock: 2823393 (0)
	# wait time for write     : 11984.706 msec
	# wait time for write/lock: 4244.788 nsec
	# read lock   : 0
	# read unlock : 0 (0)
	# wait time for read      : 0.000 msec
	# wait time for read/lock : 0.000 nsec
Stats about Lock LISTENER:
	# write lock  : 184
	# write unlock: 184 (0)
	# wait time for write     : 0.011 msec
	# wait time for write/lock: 60.554 nsec
	# read lock   : 0
	# read unlock : 0 (0)
	# wait time for read      : 0.000 msec
	# wait time for read/lock : 0.000 nsec
Stats about Lock PROXY:
	# write lock  : 291557
	# write unlock: 291557 (0)
	# wait time for write     : 109.694 msec
	# wait time for write/lock: 376.235 nsec
	# read lock   : 0
	# read unlock : 0 (0)
	# wait time for read      : 0.000 msec
	# wait time for read/lock : 0.000 nsec
Stats about Lock SERVER:
	# write lock  : 1188511
	# write unlock: 1188511 (0)
	# wait time for write     : 854.171 msec
	# wait time for write/lock: 718.690 nsec
	# read lock   : 0
	# read unlock : 0 (0)
	# wait time for read      : 0.000 msec
	# wait time for read/lock : 0.000 nsec
Stats about Lock LBPRM:
	# write lock  : 1184709
	# write unlock: 1184709 (0)
	# wait time for write     : 778.947 msec
	# wait time for write/lock: 657.501 nsec
	# read lock   : 0
	# read unlock : 0 (0)
	# wait time for read      : 0.000 msec
	# wait time for read/lock : 0.000 nsec
Stats about Lock BUF_WQ:
	# write lock  : 669247
	# write unlock: 669247 (0)
	# wait time for write     : 252.265 msec
	# wait time for write/lock: 376.939 nsec
	# read lock   : 0
	# read unlock : 0 (0)
	# wait time for read      : 0.000 msec
	# wait time for read/lock : 0.000 nsec
Stats about Lock STRMS:
	# write lock  : 9335
	# write unlock: 9335 (0)
	# wait time for write     : 0.910 msec
	# wait time for write/lock: 97.492 nsec
	# read lock   : 0
	# read unlock : 0 (0)
	# wait time for read      : 0.000 msec
	# wait time for read/lock : 0.000 nsec
Stats about Lock VARS:
	# write lock  : 901947
	# write unlock: 901947 (0)
	# wait time for write     : 299.224 msec
	# wait time for write/lock: 331.753 nsec
	# read lock   : 0
	# read unlock : 0 (0)
	# wait time for read      : 0.000 msec
	# wait time for read/lock : 0.000 nsec

6. CPU utilization after the test, processes vs. threads: haproxy-1.9-dev3 keeps running at 4800% CPU (48 CPUs) for 30 seconds after the test is done. This behavior was not seen with 1.8.14.
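As a sanity check on the lock stats in section 5, the per-acquisition figure is just the total write-wait time divided by the number of write locks; e.g. for the POOL lock (numbers taken from the stats above):

```shell
# POOL lock: total write wait of 11984.706 msec over 2823393 acquisitions,
# converted to nanoseconds per acquisition.
awk 'BEGIN { printf "%.1f\n", 11984.706 * 1e6 / 2823393 }'
# prints 4244.8
```

This matches the reported 4244.788 nsec wait per write lock.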
Ran the following command for both versions:

    ss -tnp | awk '{print $1}' | sort | uniq -c | sort -n

1.8.14 during the test:
      451 SYN-SENT
     9166 ESTAB
1.8.14 after the test:
        2 ESTAB
1.9-dev3 during the test:
      109 SYN-SENT
     9400 ESTAB
1.9-dev3 after the test:
     2185 CLOSE-WAIT
     2187 ESTAB

All connections in CLOSE-WAIT were from the client, while all connections in ESTAB state were to the server. This lasted for 30 seconds. On the client system, all sockets were in FIN-WAIT-2 state:

     2186 FIN-WAIT-2

The matching counts (2185/2186) seem to imply that the client closed its connections but haproxy did not close its side of the sockets for 30 seconds. This also results in high CPU utilization on haproxy for some reason (100% for each process for 30 seconds), which is also unexpected given that the remote side has already closed the socket.

7. Configuration file for process mode:

global
    daemon
    maxconn 26000
    nbproc 48
    stats socket /var/run/ha-1-admin.sock mode 600 level admin process 1
    # (and so on for 48 processes)

defaults
    option http-keep-alive
    balance leastconn
    retries 2
    option redispatch
    maxconn 25000
    option splice-response
    option tcp-smart-accept
    option tcp-smart-connect
    option splice-auto
    timeout connect 5000ms
    timeout client 30000ms
    timeout server 30000ms
    timeout client-fin 30000ms
    timeout http-request 10000ms
    timeout http-keep-alive 75000ms
    timeout queue 10000ms
    timeout tarpit 15000ms

frontend fk-fe-upgrade-80
    mode http
    default_backend fk-be-upgrade
    bind <VIP>:80 process 1
    # (and so on for 48 processes)

backend fk-be-upgrade
    mode http
    default-server maxconn 2000 slowstart
    # 58 server lines follow, e.g.: "server <name> <ip:80>"

8.
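As a side note, the state-counting pipeline used in section 6 above is independent of ss itself, so it can be sanity-checked on canned input (the states below are fabricated for illustration, not from the actual capture; the trailing awk just normalizes uniq's count/state spacing):

```shell
# Same sort | uniq -c | sort -n pipeline as used on the ss output,
# fed with fabricated state names instead of live sockets.
printf 'ESTAB\nESTAB\nCLOSE-WAIT\n' | sort | uniq -c | sort -n | awk '{print $1, $2}'
# prints:
# 1 CLOSE-WAIT
# 2 ESTAB
```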
Configuration file for thread mode:

global
    daemon
    maxconn 26000
    stats socket /var/run/ha-1-admin.sock mode 600 level admin
    nbproc 1
    nbthread 48
    # cpu-map auto:1/1-48 0-39

defaults
    option http-keep-alive
    balance leastconn
    retries 2
    option redispatch
    maxconn 25000
    option splice-response
    option tcp-smart-accept
    option tcp-smart-connect
    option splice-auto
    timeout connect 5000ms
    timeout client 30000ms
    timeout server 30000ms
    timeout client-fin 30000ms
    timeout http-request 10000ms
    timeout http-keep-alive 75000ms
    timeout queue 10000ms
    timeout tarpit 15000ms

frontend fk-fe-upgrade-80
    mode http
    bind <VIP>:80 process 1/1-48
    default_backend fk-be-upgrade

backend fk-be-upgrade
    mode http
    default-server maxconn 2000 slowstart
    # 58 server lines follow, e.g.: "server <name> <ip:80>"

I have also captured 'perf' output for the system in thread vs. process mode; I can send it later if required.

Thanks,
- Krishna