Hi,

On Wed, Apr 29, 2015 at 04:26:56PM +0530, Krishna Kumar (Engineering) wrote:
> ------------------------------------------------------------------------
> Request directly to 1 nginx backend server, size=256 bytes:
>
> Command: ab -k -n 100000 -c 1000 <nginx>:80/256
> Requests per second: 69749.02 [#/sec] (mean)
> Transfer rate: 34600.18 [Kbytes/sec] received
> ------------------------------------------------------------------------
> Request to haproxy configured with 4 nginx backends (nbproc=4), size=256
> bytes:
>
> Command: ab -k -n 100000 -c 1000 <haproxy>:80/256
> Requests per second: 19071.55 [#/sec] (mean)
> Transfer rate: 9461.28 [Kbytes/sec] received

These numbers are extremely low and very likely indicate an http close
mode combined with an untuned nf_conntrack.

> mpstat (first 4 processors only, rest are almost zero):
> Average: CPU  %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
> Average: all  0.44  0.00 1.59    0.00 0.00  2.96   0.00   0.00   0.00 95.01
> Average:   0  0.25  0.00 0.75    0.00 0.00 98.01   0.00   0.00   0.00  1.00

This CPU is spending its time in softirq, probably due to conntrack
spending a lot of time looking for the session for each packet in too
small a hash table.
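If that is indeed the case, it is easy to verify: compare the number of
tracked connections with the size of the conntrack hash table. A rough
sketch, assuming a reasonably recent kernel that exposes conntrack under
net.netfilter (the sizes below are only examples, adjust them to your
memory budget):

  # how many entries are currently tracked, and the configured maximum
  sysctl net.netfilter.nf_conntrack_count
  sysctl net.netfilter.nf_conntrack_max

  # number of hash buckets each lookup has to walk through
  cat /sys/module/nf_conntrack/parameters/hashsize

  # enlarge both the maximum and the hash table (example values)
  sysctl -w net.netfilter.nf_conntrack_max=1048576
  echo 262144 > /sys/module/nf_conntrack/parameters/hashsize

With 1000 concurrent connections in close mode, plus all the TIME_WAIT
entries they leave behind, a default hash size of a few thousand buckets
degrades very quickly.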
> ------------------------------------------------------------------------
> Request directly to 1 nginx backend server, size=64K
>
> Command: ab -k -n 100000 -c 1000 <nginx>:80/64K
> Requests per second: 3342.56 [#/sec] (mean)
> Transfer rate: 214759.11 [Kbytes/sec] received
> ------------------------------------------------------------------------

Note, this is about 2 Gbps. How is your network configured? You should
normally see either 1 Gbps with a gig NIC or 10 Gbps with a 10G NIC,
because retrieving a static file is very cheap. Would you happen to be
using bonding in round-robin mode maybe? If that's the case, it's a
performance disaster due to out-of-order packets and could explain some
of the high %softirq (see the P.S. at the end for a quick way to check).

> Request to haproxy configured with 4 nginx backends (nbproc=4), size=64K
>
> Command: ab -k -n 100000 -c 1000 <haproxy>:80/64K
>
> Requests per second: 1283.62 [#/sec] (mean)
> Transfer rate: 82472.35 [Kbytes/sec] received

That's terribly low. I'm doing more than that on a dockstar that fits in
my hand and is powered over USB!

> pidstat:
> Average:  UID  PID %usr %system %guest  %CPU CPU Command
> Average:  105  471 0.93   14.70   0.00 15.63   - haproxy
> Average:  105  472 1.12   21.55   0.00 22.67   - haproxy
> Average:  105  473 1.41   20.95   0.00 22.36   - haproxy
> Average:  105  475 0.22    4.85   0.00  5.07   - haproxy

Far too much time is spent in the system: the TCP stack is waiting for
the softirqs on CPU0 to do their job.

> ------------------------------------------------------------------------
> Configuration file:
>
> global
>     daemon
>     maxconn 60000
>     quiet
>     nbproc 4
>     maxpipes 16384
>     user haproxy
>     group haproxy
>     stats socket /var/run/haproxy.sock mode 600 level admin
>     stats timeout 2m
>
> defaults
>     option forwardfor
>     option http-server-close

Please retry without http-server-close so that keep-alive is maintained
to the servers; that will avoid the session setup/teardown for every
request (a sketch of the change is in the P.P.S. at the end). If that
improves things, there's definitely something to fix in the conntrack,
or maybe in iptables rules if you have some.

But in any case, don't put such a system in production like this, it
almost does not work: you should see roughly 10 times the numbers you're
currently getting.

It can also be interesting to see what ab does directly against nginx
without "-k", as ab will then perform part of the job haproxy is doing
with nginx, which can help troubleshoot the issue in a simplified setup
first.

Willy
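P.S. To check whether round-robin bonding is involved, assuming the bond
interface is named bond0 (just a guess on my side):

  grep "Bonding Mode" /proc/net/bonding/bond0

Anything other than "load balancing (round-robin)" is fine from a packet
ordering point of view.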
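P.P.S. Here is a minimal sketch of the defaults change suggested above,
assuming haproxy 1.5, where keep-alive to the server is already the
default mode once http-server-close is removed; "option http-keep-alive"
simply makes it explicit:

  defaults
      option forwardfor
      option http-keep-alive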