Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
On Wed, May 6, 2015 at 7:15 AM, Krishna Kumar (Engineering) krishna...@flipkart.com wrote:
> Hi Baptiste,
>
> On Wed, May 6, 2015 at 1:24 AM, Baptiste bed...@gmail.com wrote:
>>> Also, during the test, the status of various backends often changes from OK to DOWN, and then gets back to OK almost immediately:
>>>
>>> www-backend,nginx-3,0,0,0,10,3,184,23843,96517588,,0,,27,0,0,180,DOWN 1/2,1,1,0,7,3,6,39,,7,3,1,,220,,2,0,,37,L4CON,,0,0,184,0,0,0,0,00,0,6,Out of local source ports on the system,,0,2,3,92,
>>
>> This error is curious given the type of traffic you're generating! Maybe you should let HAProxy manage the source ports on behalf of the server. Try adding the "source 0.0.0.0:1024-65535" parameter in your backend description.
>
> Yes, this has fixed the issue - I no longer get state changes after an hour of testing. The performance didn't improve, though. I will check the sysctl parameters that were different between the haproxy/nginx nodes.
>
> Thanks,
> - Krishna Kumar

You have to investigate why this issue happened. I mean, it is not normal. As Pavlos mentioned, your connection rate is very low, since you use keep-alive and you opened only 500 ports.

Wait, I know - could you share the keep-alive configuration from your nginx servers? By default, they close connections every 100 requests... This might be the root of the issue.

The configuration I sent you just tells HAProxy to manage the source ports itself, on behalf of the kernel. It is much more efficient for this task. We never enable it by default, since in most cases the kernel is good enough.

Baptiste
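[For reference, a minimal sketch of the change Baptiste suggests; the backend name and server lines are illustrative, borrowed from the configuration posted later in the thread:

    backend www-backend
        balance roundrobin
        # let HAProxy pick outgoing source ports itself rather than the kernel
        source 0.0.0.0:1024-65535
        server nginx1 192.168.1.101:80 check
        server nginx2 192.168.1.102:80 check
]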
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
On Wed, May 06, 2015 at 12:03:12PM +0200, Baptiste wrote:
> On Wed, May 6, 2015 at 7:15 AM, Krishna Kumar (Engineering) krishna...@flipkart.com wrote:
>> [the exchange quoted in full above: the "Out of local source ports on the system" errors disappeared after adding "source 0.0.0.0:1024-65535" to the backend, but performance did not improve]
>
> You have to investigate why this issue happened. I mean, it is not normal. As Pavlos mentioned, your connection rate is very low, since you use keep-alive and you opened only 500 ports.
>
> Wait, I know - could you share the keep-alive configuration from your nginx servers? By default, they close connections every 100 requests... This might be the root of the issue.

But even then there is no reason why the local ports would remain in use. There definitely is a big problem. It also explains why servers are going up and down all the time and errors are reported.

Willy
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
On 06/05/2015 12:03 μμ, Baptiste wrote:
> On Wed, May 6, 2015 at 7:15 AM, Krishna Kumar (Engineering) krishna...@flipkart.com wrote:
>> [the exchange quoted in full above: the "Out of local source ports on the system" errors disappeared after adding "source 0.0.0.0:1024-65535" to the backend, but performance did not improve]
>
> You have to investigate why this issue happened. I mean, it is not normal. As Pavlos mentioned, your connection rate is very low, since you use keep-alive and you opened only 500 ports.
>
> Wait, I know - could you share the keep-alive configuration from your nginx servers? By default, they close connections every 100 requests... This might be the root of the issue.

That reminds me that in my setup I configured the nginx backend servers with keepalive_requests 10. You need to increase the keep-alive limit because the stress tool uses keep-alive as well. So, let's say you have 500 concurrent TCP connections open and the stress test does 5M requests in total: you need to allow 10K keep-alive HTTP requests per connection on nginx.

I have a suggestion: rerun the test with haproxy on your nginx server, and nginx on your haproxy server.

Cheers,
Pavlos
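[A minimal nginx sketch of the keep-alive tuning Pavlos describes; the 10000 value follows his 500-connections/5M-requests arithmetic (5,000,000 / 500 = 10,000 per connection) and is illustrative, not taken from anyone's actual config:

    http {
        # nginx closes a keep-alive connection after this many requests (default 100)
        keepalive_requests 10000;
        keepalive_timeout  60s;
    }
]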
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
The performance is really good now, thanks to the great responses on this list. I also increased nginx's keepalive to 1m as Pavlos suggested.

# ab -k -n 1000000 -c 500 http://haproxy:80/64
Requests per second:    181623.35 [#/sec] (mean)
Transfer rate:          53414.40 [Kbytes/sec] received
(both values are as good as going direct to the backend)

# ab -k -n 100000 -c 500 http://haproxy:80/256K
Requests per second:    4191.92 [#/sec] (mean)
Transfer rate:          1074111.06 [Kbytes/sec] received
(4.8% less for both numbers as compared to the direct backend)

I can post the various parameters that were set (system level + haproxy + backend) in case it is useful for someone else in the future.

Thanks,
- Krishna Kumar

On Thu, May 7, 2015 at 8:31 AM, Baptiste bed...@gmail.com wrote:
> On 7 May 2015 at 04:24, Krishna Kumar (Engineering) krishna...@flipkart.com wrote:
>> I found the source of the problem. One of the backends was being shared with another person who was testing iptables rules/tunnel setups, and that might have caused some connection drops. I have now removed that backend from my setup and use dedicated systems, after which the original configuration without specifying source ports works, with no connection flaps now.
>>
>> Thanks,
>> - Krishna Kumar
>
> How much performance do you have now?
>
> Baptiste
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
I found the source of the problem. One of the backends was being shared with another person who was testing iptables rules/tunnel setups, and that might have caused some connection drops. I have now removed that backend from my setup and use dedicated systems, after which the original configuration without specifying source ports works, with no connection flaps now.

Thanks,
- Krishna Kumar

On Wed, May 6, 2015 at 4:53 PM, Willy Tarreau w...@1wt.eu wrote:
> On Wed, May 06, 2015 at 12:03:12PM +0200, Baptiste wrote:
>> [the exchange quoted in full above: "Out of local source ports" errors fixed by the "source" parameter, nginx keep-alive limits suspected]
>
> But even then there is no reason why the local ports would remain in use. There definitely is a big problem. It also explains why servers are going up and down all the time and errors are reported.
>
> Willy
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
On 7 May 2015 at 04:24, Krishna Kumar (Engineering) krishna...@flipkart.com wrote:
> I found the source of the problem. One of the backends was being shared with another person who was testing iptables rules/tunnel setups, and that might have caused some connection drops. I have now removed that backend from my setup and use dedicated systems, after which the original configuration without specifying source ports works, with no connection flaps now.
>
> Thanks,
> - Krishna Kumar

How much performance do you have now?

Baptiste
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
Hi Willy, Pavlos,

Thank you once again for your advice.

>> Requests per second:    19071.55 [#/sec] (mean)
>> Transfer rate:          9461.28 [Kbytes/sec] received
>
> These numbers are extremely low and very likely indicate an http close mode combined with an untuned nf_conntrack.

Yes, it was due to http close mode, and wrong irq pinning (nf_conntrack_max was set to 640K).

>> mpstat (first 4 processors only, rest are almost zero):
>> Average:  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest %gnice  %idle
>> Average:    0   0.25   0.00   0.75    0.00   0.00  98.01   0.00   0.00   0.00   1.00
>
> This CPU is spending its time in softirq, probably due to conntrack spending a lot of time looking for the session for each packet in too small a hash table.

I had not done irq pinning. Today I am getting much better results with irq pinning and keepalive.

> Note, this is about 2 Gbps. How is your network configured ? You should normally see either 1 Gbps with a gig NIC or 10 Gbps with a 10G NIC, because retrieving a static file is very cheap. Would you happen to be using bonding in round-robin mode maybe ? If that's the case, it's a performance disaster due to out-of-order packets and could explain some of the high %softirq.

My setup is as follows (no bonding, etc., and "Sys" stands for a baremetal system, each with 48 cores, 128GB memory, and a single-port ixgbe card): Sys1-with-ab <-eth0-> Sys2-with-Haproxy, which uses two nginx backend systems over the same eth0 card (that is the current restriction: no extra ethernet interface for separate frontend/backend traffic). Today I am getting a high of 7.7 Gbps with your suggestions. Is it possible to get higher than that (direct to the server gets 8.6 Gbps)?

> Please retry without http-server-close to maintain keep-alive to the servers, that will avoid the session setup/teardown. If that becomes better, there's definitely something to fix in the conntrack or maybe in iptables rules if you have some. But in any case don't put such a system in production like this.

There are a few iptables rules, which seem clean.
The results now are:

ab -k -n 1000000 -c 500 http://haproxy:80/64 (I am getting some errors though, which are not present when running against the backend directly):

Document Length:        64 bytes
Concurrency Level:      500
Time taken for tests:   6.181 seconds
Complete requests:      1000000
Failed requests:        18991
   (Connect: 0, Receive: 0, Length: 9675, Exceptions: 9316)
Write errors:           0
Keep-Alive requests:    990330
Total transferred:      296554848 bytes
HTML transferred:       63381120 bytes
Requests per second:    161783.42 [#/sec] (mean)
Time per request:       3.091 [ms] (mean)
Time per request:       0.006 [ms] (mean, across all concurrent requests)
Transfer rate:          46853.18 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.5      0       8
Processing:     0    3   6.2      3    1005
Waiting:        0    3   6.2      3    1005
Total:          0    3   6.3      3    1010

Percentage of the requests served within a certain time (ms)
  50%      3
  66%      3
  75%      3
  80%      3
  90%      4
  95%      5
  98%      6
  99%      8
 100%   1010 (longest request)

pidstat (some system numbers are very high, >50%, maybe due to small packet sizes?):

Average:   UID     PID    %usr %system  %guest    %CPU   CPU  Command
Average:   110   52601    6.00    9.33    0.00   15.33     -  haproxy
Average:   110   52602    6.33   11.83    0.00   18.17     -  haproxy
Average:   110   52603   11.33   17.83    0.00   29.17     -  haproxy
Average:   110   52604   17.50   30.33    0.00   47.83     -  haproxy
Average:   110   52605   20.50   38.50    0.00   59.00     -  haproxy
Average:   110   52606   24.50   51.33    0.00   75.83     -  haproxy
Average:   110   52607   22.50   51.33    0.00   73.83     -  haproxy
Average:   110   52608   23.67   47.17    0.00   70.83     -  haproxy

mpstat (of interesting cpus only):

Average:  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest %gnice  %idle
Average:  all   2.58   0.00   4.36    0.00   0.00   0.89   0.00   0.00   0.00  92.17
Average:    0   6.84   0.00  11.46    0.00   0.00   2.03   0.00   0.00   0.00  79.67
Average:    1  11.15   0.00  19.85    0.00   0.00   5.29   0.00   0.00   0.00  63.71
Average:    2   8.32   0.00  12.20    0.00   0.00   2.22   0.00   0.00   0.00  77.26
Average:    3   7.92   0.00  11.97    0.00   0.00   2.39   0.00   0.00   0.00  77.72
Average:    4   8.81   0.00  13.76    0.00   0.00   2.39   0.00   0.00   0.00  75.05
Average:    5   6.96   0.00  12.27    0.00   0.00   2.38   0.00   0.00   0.00  78.39
Average:    6   9.21   0.00  12.52    0.00   0.00   3.31   0.00   0.00   0.00  74.95
Average:    7   7.56   0.00  13.65    0.00   0.00
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
Hi Baptiste,

On Wed, May 6, 2015 at 1:24 AM, Baptiste bed...@gmail.com wrote:
>> Also, during the test, the status of various backends often changes from OK to DOWN, and then gets back to OK almost immediately:
>>
>> www-backend,nginx-3,... [the "Out of local source ports on the system" CSV line quoted in full above]
>
> This error is curious given the type of traffic you're generating! Maybe you should let HAProxy manage the source ports on behalf of the server. Try adding the "source 0.0.0.0:1024-65535" parameter in your backend description.

Yes, this has fixed the issue - I no longer get state changes after an hour of testing. The performance didn't improve, though. I will check the sysctl parameters that were different between the haproxy/nginx nodes.

Thanks,
- Krishna Kumar
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
Hi Pavlos,

On Wed, May 6, 2015 at 1:24 AM, Pavlos Parissis pavlos.paris...@gmail.com wrote:

> Shall I assume that you have run the same tests without iptables and got the same results?

Yes, I had tried it yesterday and saw no measurable difference.

> May I suggest to try also httpress and wrk tool?

I tried it today, and will post the results below, after yours.

> Have you compared 'sysctl -a' between haproxy and nginx server?

Yes, the difference is very little:

11c11
< fs.dentry-state = 266125 130939 45 0 0 0
---
> fs.dentry-state = 19119 0 45 0 0 0
13,17c13,17
< fs.epoll.max_user_watches = 27046277
< fs.file-max = 1048576
< fs.file-nr = 1536 0 1048576
< fs.inode-nr = 262766 98714
< fs.inode-state = 262766 98714 0 0 0 0 0
---
> fs.epoll.max_user_watches = 27046297
> fs.file-max = 262144
> fs.file-nr = 1536 0 262144
> fs.inode-nr = 27290 8946
> fs.inode-state = 27290 8946 0 0 0 0 0
134c134
< kernel.sched_domain.cpu0.domain0.max_newidle_lb_cost = 2305
---
> kernel.sched_domain.cpu0.domain0.max_newidle_lb_cost = 3820
(and for each cpu, similar lb_cost)

> Have you checked if you got all backends reported down at the same time?

Yes, I checked; that has not happened. After Baptiste's suggestion of adding the port range, this has disappeared completely.

> How many workers do you use on your Nginx which acts as LB?

I was using the default of 4. Increasing to 16 seems to improve numbers 10-20%.

>> www-backend,nginx-3,... [the "Out of local source ports on the system" CSV line quoted in full above]
>
> Hold on a second, what is this 'Out of local source ports on the system' message? ab reports 'Concurrency Level: 500' and you said that HAProxy runs in keepalive mode (default on 1.5 releases), which means there will be only 500 TCP connections opened from HAProxy towards the backends. That isn't high, and you shouldn't get that message unless net.ipv4.ip_local_port_range is very small (I don't think so).

It was set to net.ipv4.ip_local_port_range = 32768 61000. I have not seen this issue after making the change Baptiste suggested, though I could increase the range above and check too.

> # wrk --timeout 3s --latency -c 1000 -d 5m -t 24 http://a.b.c.d
> Running 5m test @ http://a.b.c.d
>   24 threads and 1000 connections
>   Thread Stats   Avg      Stdev     Max   +/- Stdev
>     Latency    87.07ms  593.84ms   7.85s    95.63%
>     Req/Sec    16.45k     7.43k   60.89k    74.25%
>   Latency Distribution
>      50%    1.75ms
>      75%    2.40ms
>      90%    3.57ms
>      99%    3.27s
>   111452585 requests in 5.00m, 15.98GB read
>   Socket errors: connect 0, read 0, write 0, timeout 33520
> Requests/sec: 371504.85
> Transfer/sec:     54.56MB

I get a very strange result:

# wrk --timeout 3s --latency -c 1000 -d 1m -t 24 http://haproxy
Running 1m test @ http://haproxy
  24 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.40ms   26.64ms   1.02s    99.28%
    Req/Sec     8.77k     8.20k   26.98k    62.39%
  Latency Distribution
     50%    1.14ms
     75%    1.68ms
     90%    2.40ms
     99%    6.14ms
  98400 requests in 1.00m, 34.06MB read
Requests/sec:   1637.26
Transfer/sec:    580.36KB

# wrk --timeout 3s --latency -c 1000 -d 1m -t 24 http://nginx
Running 1m test @ http://nginx
  24 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.56ms   12.01ms 444.71ms   99.41%
    Req/Sec     8.53k   825.80    18.50k    90.91%
  Latency Distribution
     50%    4.81ms
     75%    6.80ms
     90%    8.58ms
     99%   11.92ms
  12175205 requests in 1.00m, 4.31GB read
Requests/sec: 202584.48
Transfer/sec:     73.41MB

Thank you,
Regards,
- Krishna Kumar
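[If one did want to widen the ephemeral port range mentioned above, a one-line sketch; the 1024-65535 range mirrors Baptiste's earlier "source" suggestion and is illustrative:

    sysctl -w net.ipv4.ip_local_port_range="1024 65535"
]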
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
On 05/05/2015 02:06 μμ, Krishna Kumar (Engineering) wrote:
> [Krishna's reply to Willy, quoted in full above: http close mode and wrong irq pinning were the cause of the low numbers; the setup is four 48-core baremetal systems with single-port ixgbe cards and no bonding; throughput is now 7.7 Gbps through haproxy vs 8.6 Gbps direct; there are a few iptables rules, which seem clean.]
>
> The results now are:

Shall I assume that you have run the same tests without iptables and got the same results?

May I suggest to try also the httpress and wrk tools?

Have you compared 'sysctl -a' between the haproxy and nginx servers?
> ab -k -n 1000000 -c 500 http://haproxy:80/64 (I am getting some errors though, which are not present when running against the backend directly):
>
> [the full ab, pidstat and mpstat output is quoted in Krishna's message above: 161783.42 requests/sec, 18991 failed requests, haproxy processes spending most of their CPU time in %system]
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
> Also, during the test, the status of various backends often changes from OK to DOWN, and then gets back to OK almost immediately:
>
> www-backend,nginx-3,0,0,0,10,3,184,23843,96517588,,0,,27,0,0,180,DOWN 1/2,1,1,0,7,3,6,39,,7,3,1,,220,,2,0,,37,L4CON,,0,0,184,0,0,0,0,00,0,6,Out of local source ports on the system,,0,2,3,92,

This error is curious given the type of traffic you're generating! Maybe you should let HAProxy manage the source ports on behalf of the server. Try adding the "source 0.0.0.0:1024-65535" parameter in your backend description.

> Please let me know if this can be fixed, as it might help performance even more. In short, for small file sizes, haproxy results are *much* better than running against a single backend server directly (with some failures as shown above). For big files, the numbers for haproxy are slightly lower.

The devil might be in your sysctls.

Baptiste
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
Dear all,

Sorry, my lab systems were down for many days and I could not get back on this earlier. After new systems were allocated, I managed to get all the requested information with a fresh run. (Sorry, this is a long mail too!)

There are now 4 physical servers, running Debian 3.2.0-4-amd64, connected directly to a common switch:

server1: Runs 'ab' in a container, no cpu/memory restriction.
server2: Runs haproxy in a container, configured with 4 nginx's, cpu/memory configured as shown below.
server3: Runs 2 different nginx containers, no cpu/mem restriction.
server4: Runs 2 different nginx containers, for a total of 4 nginx, no cpu/mem restriction.

The servers have 2 sockets, each with 24 cores. Socket 0 has cores 0,2,4,...,46 and Socket 1 has cores 1,3,5,...,47. The NIC (ixgbe) is bound to CPU 0. Haproxy is started on cpus 2,4,6,8,10,12,14,16, so that it is on the same socket/cache as the NIC (nginx runs on different servers as explained above). No tuning on the nginx servers, as the comparison is between 'ab' -> 'nginx' and 'ab' -> 'haproxy' -> nginx(s). The cpus are Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz. The containers are all configured with 8GB, the server having 128GB memory. mpstat and iostat were captured during the test, where the capture started after 'ab' started and ended just before 'ab' finished, so as to get warm numbers.

Request directly to 1 nginx backend server, size=256 bytes:
Command: ab -k -n 100000 -c 1000 nginx:80/256
Requests per second:    69749.02 [#/sec] (mean)
Transfer rate:          34600.18 [Kbytes/sec] received

Request to haproxy configured with 4 nginx backends (nbproc=4), size=256 bytes:
Command: ab -k -n 100000 -c 1000 haproxy:80/256
Requests per second:    19071.55 [#/sec] (mean)
Transfer rate:          9461.28 [Kbytes/sec] received

mpstat (first 4 processors only, rest are almost zero):
Average:  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest %gnice  %idle
Average:  all   0.44   0.00   1.59    0.00   0.00   2.96   0.00   0.00   0.00  95.01
Average:    0   0.25   0.00   0.75    0.00   0.00  98.01   0.00   0.00   0.00   1.00
Average:    1   1.26   0.00   5.28    0.00   0.00   2.51   0.00   0.00   0.00  90.95
Average:    2   2.76   0.00   8.79    0.00   0.00   5.78   0.00   0.00   0.00  82.66
Average:    3   1.51   0.00   6.78    0.00   0.00   3.02   0.00   0.00   0.00  88.69

pidstat:
Average:   UID     PID    %usr %system  %guest    %CPU   CPU  Command
Average:   105     471    5.00   33.50    0.00   38.50     -  haproxy
Average:   105     472    6.50   44.00    0.00   50.50     -  haproxy
Average:   105     473    8.50   40.00    0.00   48.50     -  haproxy
Average:   105     475    2.50   14.00    0.00   16.50     -  haproxy

Request directly to 1 nginx backend server, size=64K:
Command: ab -k -n 100000 -c 1000 nginx:80/64K
Requests per second:    3342.56 [#/sec] (mean)
Transfer rate:          214759.11 [Kbytes/sec] received

Request to haproxy configured with 4 nginx backends (nbproc=4), size=64K:
Command: ab -k -n 100000 -c 1000 haproxy:80/64K
Requests per second:    1283.62 [#/sec] (mean)
Transfer rate:          82472.35 [Kbytes/sec] received

mpstat (first 4 processors only, rest are almost zero):
Average:  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest %gnice  %idle
Average:  all   0.08   0.00   0.74    0.01   0.00   2.62   0.00   0.00   0.00  96.55
Average:    0   0.00   0.00   0.00    0.00   0.00 100.00   0.00   0.00   0.00   0.00
Average:    1   1.03   0.00   9.98    0.21   0.00   7.67   0.00   0.00   0.00  81.10
Average:    2   0.70   0.00   6.32    0.00   0.00   4.50   0.00   0.00   0.00  88.48
Average:    3   0.15   0.00   2.04    0.06   0.00   1.73   0.00   0.00   0.00  96.03

pidstat:
Average:   UID     PID    %usr %system  %guest    %CPU   CPU  Command
Average:   105     471    0.93   14.70    0.00   15.63     -  haproxy
Average:   105     472    1.12   21.55    0.00   22.67     -  haproxy
Average:   105     473    1.41   20.95    0.00   22.36     -  haproxy
Average:   105     475    0.22    4.85    0.00    5.07     -  haproxy

--
Build information:
HA-Proxy version 1.5.8 2014/10/31
Copyright 2000-2014 Willy Tarreau w...@1wt.eu
Build options :
  TARGET = linux2628
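[A sketch of how the CPU pinning described above could be expressed inside haproxy itself, assuming haproxy 1.5's cpu-map keyword; the process-to-core pairs are illustrative, following the even-core layout Krishna describes, and are not from his actual config:

    global
        nbproc 4
        # pin process N to an even-numbered core on socket 0
        cpu-map 1 2
        cpu-map 2 4
        cpu-map 3 6
        cpu-map 4 8
]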
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
On 29/04/2015 12:56 μμ, Krishna Kumar (Engineering) wrote:
> [setup details quoted in full above: 4 physical servers, ab / haproxy / 4 nginx containers...]
> ...The NIC (ixgbe) is bound to CPU 0.

It is considered a bad thing to bind all queues of a NIC to 1 CPU, as it creates a major bottleneck. HAProxy will have to wait for the interrupts to be processed by a single CPU which is saturated.

> Haproxy is started on cpus 2,4,6,8,10,12,14,16 ... No tuning on the nginx servers, as the comparison is between 'ab' -> 'nginx' and 'ab' -> 'haproxy' -> nginx(s).

How many workers do you run on Nginx?

> [256-byte results quoted above: 69749.02 requests/sec direct vs 19071.55 requests/sec through haproxy]
>
> mpstat (first 4 processors only, rest are almost zero):
> Average:  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest %gnice  %idle
> Average:  all   0.44   0.00   1.59    0.00   0.00   2.96   0.00   0.00   0.00  95.01
> Average:    0   0.25   0.00   0.75    0.00   0.00  98.01   0.00   0.00   0.00   1.00

All network interrupts are processed by CPU 0, which is saturated. You need to spread the queues of the NIC across different CPUs. Either use irqbalance or the following 'ugly' script, which you need to modify a bit, as I have 2 NICs and you have only 1. You also need to adjust the number of queues; grep eth /proc/interrupts and you will find out how many you have.
#!/bin/sh
awk '
function get_affinity(cpus) {
    split(cpus, list, /,/)
    mask = 0
    for (val in list) {
        mask += lshift(1, list[val])
    }
    return mask
}
BEGIN {
    # Interrupt -> CPU core(s) mapping
    map["eth0-q0"]=0
    map["eth0-q1"]=1
    map["eth0-q2"]=2
    map["eth0-q3"]=3
    map["eth0-q4"]=4
    map["eth0-q5"]=5
    map["eth0-q6"]=6
    map["eth0-q7"]=7
    map["eth1-q0"]=12
    map["eth1-q1"]=13
    map["eth1-q2"]=14
    map["eth1-q3"]=15
    map["eth1-q4"]=16
    map["eth1-q5"]=17
    map["eth1-q6"]=18
    map["eth1-q7"]=19
}
/eth/ {
    irq = substr($1, 0, length($1)-1)
    queue = $NF
    printf "%s (%s) -> %s (%08X)\n", queue, irq, map[queue], get_affinity(map[queue])
    system(sprintf("echo %08X > /proc/irq/%s/smp_affinity\n", get_affinity(map[queue]), irq))
}
' /proc/interrupts

> Average:    1   1.26   0.00   5.28    0.00   0.00   2.51   0.00   0.00   0.00  90.95
> Average:    2   2.76   0.00   8.79    0.00   0.00   5.78   0.00   0.00   0.00  82.66
> Average:    3   1.51   0.00   6.78    0.00   0.00   3.02   0.00   0.00   0.00  88.69
>
> pidstat:
> Average:   105     471    5.00   33.50    0.00   38.50     -  haproxy
> Average:   105     472    6.50   44.00    0.00   50.50     -  haproxy
> Average:   105     473    8.50   40.00    0.00   48.50     -  haproxy
> Average:   105     475    2.50   14.00    0.00   16.50     -  haproxy
>
> Request directly to 1 nginx backend server, size=64K

I would like to see pidstat and mpstat while you test nginx.

Cheers,
Pavlos
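[To verify the result of the script above, one can list the NIC's queue IRQs and their current affinity masks; a small shell sketch, with the interface name illustrative:

    grep eth0 /proc/interrupts
    for irq in $(awk '/eth0/ {sub(":","",$1); print $1}' /proc/interrupts); do
        echo "IRQ $irq -> mask $(cat /proc/irq/$irq/smp_affinity)"
    done
]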
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
Hi,

On Wed, Apr 29, 2015 at 04:26:56PM +0530, Krishna Kumar (Engineering) wrote:
> Request directly to 1 nginx backend server, size=256 bytes:
> Command: ab -k -n 100000 -c 1000 nginx:80/256
> Requests per second:    69749.02 [#/sec] (mean)
> Transfer rate:          34600.18 [Kbytes/sec] received
>
> Request to haproxy configured with 4 nginx backends (nbproc=4), size=256 bytes:
> Command: ab -k -n 100000 -c 1000 haproxy:80/256
> Requests per second:    19071.55 [#/sec] (mean)
> Transfer rate:          9461.28 [Kbytes/sec] received

These numbers are extremely low and very likely indicate an http close mode combined with an untuned nf_conntrack.

> mpstat (first 4 processors only, rest are almost zero):
> Average:  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest %gnice  %idle
> Average:    0   0.25   0.00   0.75    0.00   0.00  98.01   0.00   0.00   0.00   1.00

This CPU is spending its time in softirq, probably due to conntrack spending a lot of time looking for the session for each packet in too small a hash table.

> Request directly to 1 nginx backend server, size=64K:
> Command: ab -k -n 100000 -c 1000 nginx:80/64K
> Requests per second:    3342.56 [#/sec] (mean)
> Transfer rate:          214759.11 [Kbytes/sec] received

Note, this is about 2 Gbps. How is your network configured ? You should normally see either 1 Gbps with a gig NIC or 10 Gbps with a 10G NIC, because retrieving a static file is very cheap. Would you happen to be using bonding in round-robin mode maybe ? If that's the case, it's a performance disaster due to out-of-order packets and could explain some of the high %softirq.

> Request to haproxy configured with 4 nginx backends (nbproc=4), size=64K:
> Command: ab -k -n 100000 -c 1000 haproxy:80/64K
> Requests per second:    1283.62 [#/sec] (mean)
> Transfer rate:          82472.35 [Kbytes/sec] received

That's terribly low. I'm doing more than that on a dockstar that fits in my hand and is powered over USB!

> pidstat:
> Average:   UID     PID    %usr %system  %guest    %CPU   CPU  Command
> Average:   105     471    0.93   14.70    0.00   15.63     -  haproxy
> Average:   105     472    1.12   21.55    0.00   22.67     -  haproxy
> Average:   105     473    1.41   20.95    0.00   22.36     -  haproxy
> Average:   105     475    0.22    4.85    0.00    5.07     -  haproxy

Far too much time is spent in the system; the TCP stack is waiting for the softirqs on CPU0 to do their job.

> --
> Configuration file:
>
> global
>     daemon
>     maxconn 6
>     quiet
>     nbproc 4
>     maxpipes 16384
>     user haproxy
>     group haproxy
>     stats socket /var/run/haproxy.sock mode 600 level admin
>     stats timeout 2m
>
> defaults
>     option forwardfor
>     option http-server-close

Please retry without http-server-close to maintain keep-alive to the servers; that will avoid the session setup/teardown. If that becomes better, there's definitely something to fix in the conntrack or maybe in iptables rules if you have some. But in any case don't put such a system in production like this, it almost does not work; you should see roughly 10 times the numbers you're currently getting.

It can be interesting as well to see what ab to nginx does without -k, as it will do part of the job haproxy is doing with nginx as well, and can help troubleshoot the issue in a simplified setup first.

Willy
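[A sketch of what the retry would look like in the defaults section; haproxy 1.5 defaults to keep-alive when no close option is set, so spelling it out explicitly, as below, is optional:

    defaults
        option forwardfor
        # replace "option http-server-close" with keep-alive on both sides
        option http-keep-alive
]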
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
Hi,

On Mon, Mar 30, 2015 at 10:43:51AM +0530, Krishna Kumar Unnikrishnan (Engineering) wrote:
> Hi all,
>
> I am testing haproxy as follows:
>
> System1: 24 Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz, 64 GB. This system is running a 3.19.0 kernel, and hosts the following servers:
>
> 1. nginx1 server - cpu 1-2, 1G memory, runs as a Linux container using the cpuset.cpus feature.
> 2. nginx2 server - cpu 3-4, 1G memory, runs via LXC.
> 3. nginx3 server - cpu 5-6, 1G memory, runs via LXC.
> 4. nginx4 server - cpu 7-8, 1G memory, runs via LXC.
> 5. haproxy - cpu 9-10, 1G memory, runs via LXC. Runs haproxy ver 1.5.8, configured with the above 4 containers' ip addresses as the backend.
>
> System2: 56 Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz, 128 GB. This system is running 3.19.0, and runs 'ab' either to the haproxy node, or directly to an nginx container. System1 and System2 are locally connected via a switch with Intel 10G cards.
>
> With very small packets of 64 bytes, I am getting the following results:
>
> A. ab -n 100000 -c 4096 http://nginx1:80/64
> [full 'ab' output as in the original post below: 30943.26 requests/sec direct to nginx1]
>
> B. ab -n 100000 -c 4096 http://haproxy:80/64
> [full 'ab' output as in the original post below: 18172.96 requests/sec through haproxy]
>
> I expected haproxy to deliver better results with multiple connections, since haproxy will round-robin between the 4 servers. I have done no tuning, and have used the config file at the end of this mail. With a 256K file size, the times are slightly better for haproxy vs nginx. I notice that the %requests served is similar for both cases till about 90%.

I'm seeing a very simple and common explanation to this. You're stressing the TCP stack and it becomes the bottleneck. Both haproxy and nginx make very little use of userland and spend most of their time in the kernel, so by putting both of them on the same system image, you're still subject to the session table lookups, locking and whatever limits the processing. And in fact, by adding haproxy in front of nginx on the same system, you have effectively doubled the kernel's job, and you're measuring about half of the performance, so there's nothing much surprising here. Please check the CPU usage as Pavlos mentioned. I'm guessing that your system is spending most of its time in system and/or softirq. Also, maybe you have conntrack enabled on the system.
In this case, having both components on the same machine will triple the conntrack session rate, effectively increasing its work. There's something you can try in your config below to see if the connection rate is mostly responsible for the trouble:

> global
>     maxconn 65536
>     ulimit-n 65536

Please remove ulimit-n BTW, it's wrong and not needed.

>     daemon
>     quiet
>     nbproc 2
>     user haproxy
>     group haproxy
>
> defaults
>     #log global
>     mode http
>     option dontlognull
>     retries 3
>     option redispatch
>     maxconn 65536
>     timeout connect 5000
>     timeout client 50000
>     timeout server 50000
>
> listen my_ha_proxy 192.168.1.110:80
>     mode http
>     stats enable
>     stats auth
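[On the conntrack point above, a commonly paired tuning is to raise both the session limit and the lookup hash table size; a sketch with illustrative values, not figures from this thread:

    # raise the conntrack session limit
    sysctl -w net.netfilter.nf_conntrack_max=1048576
    # enlarge the lookup hash table (often sized at max/4 buckets)
    echo 262144 > /sys/module/nf_conntrack/parameters/hashsize
]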
Re: [haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
On 30/03/2015 07:13 πμ, Krishna Kumar Unnikrishnan (Engineering) wrote:
> [the original post, quoted in full below: the LXC test setup, ab results for nginx1 direct (30943.26 requests/sec) and through haproxy (18172.96 requests/sec), and the question of why haproxy is slower]

You haven't mentioned the CPU load on the host and on the guest systems. Use pidstat -p $(pgrep -d ',' haproxy) -u 1 to monitor CPU stats of the haproxy processes, and mpstat -P ALL 1 to check the CPU load for software interrupts.

Cheers,
Pavlos
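[A small sketch of running both monitors for the duration of a test window; the 60-second sample count is illustrative:

    # sample haproxy CPU usage and per-CPU softirq load once a second for 60s
    pidstat -p $(pgrep -d ',' haproxy) -u 1 60 > pidstat.log &
    mpstat -P ALL 1 60 > mpstat.log &
    wait
]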
[haproxy]: Performance of haproxy-to-4-nginx vs direct-to-nginx
Hi all,

I am testing haproxy as follows:

System1: 24 Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz, 64 GB. This system is running a 3.19.0 kernel, and hosts the following servers:

1. nginx1 server - cpu 1-2, 1G memory, runs as a Linux container using the cpuset.cpus feature.
2. nginx2 server - cpu 3-4, 1G memory, runs via LXC.
3. nginx3 server - cpu 5-6, 1G memory, runs via LXC.
4. nginx4 server - cpu 7-8, 1G memory, runs via LXC.
5. haproxy - cpu 9-10, 1G memory, runs via LXC. Runs haproxy ver 1.5.8, configured with the above 4 containers' ip addresses as the backend.

System2: 56 Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz, 128 GB. This system is running 3.19.0, and runs 'ab' either to the haproxy node, or directly to an nginx container. System1 and System2 are locally connected via a switch with Intel 10G cards.

With very small packets of 64 bytes, I am getting the following results:

A. ab -n 100000 -c 4096 http://nginx1:80/64
-------------------------------------------
Concurrency Level:      4096
Time taken for tests:   3.232 seconds
Complete requests:      100000
Failed requests:        0
Total transferred:      28800000 bytes
HTML transferred:       6400000 bytes
Requests per second:    30943.26 [#/sec] (mean)
Time per request:       132.371 [ms] (mean)
Time per request:       0.032 [ms] (mean, across all concurrent requests)
Transfer rate:          8702.79 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        9   65 137.4     45    1050
Processing:     4   52  25.3     51     241
Waiting:        3   37  19.2     35     234
Total:         16  117 146.1    111    1142

Percentage of the requests served within a certain time (ms)
  50%    111
  66%    119
  75%    122
  80%    124
  90%    133
  95%    215
  98%    254
  99%   1126
 100%   1142 (longest request)

B. ab -n 100000 -c 4096 http://haproxy:80/64
--------------------------------------------
Concurrency Level:      4096
Time taken for tests:   5.503 seconds
Complete requests:      100000
Failed requests:        0
Total transferred:      28800000 bytes
HTML transferred:       6400000 bytes
Requests per second:    18172.96 [#/sec] (mean)
Time per request:       225.390 [ms] (mean)
Time per request:       0.055 [ms] (mean, across all concurrent requests)
Transfer rate:          5111.15 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0  134 358.3     23    3033
Processing:     2   61  47.7     51     700
Waiting:        2   50  43.0     42     685
Total:          7  194 366.7     79    3122

Percentage of the requests served within a certain time (ms)
  50%     79
  66%    105
  75%    134
  80%    159
  90%    318
  95%   1076
  98%   1140
  99%   1240
 100%   3122 (longest request)

I expected haproxy to deliver better results with multiple connections, since haproxy will round-robin between the 4 servers. I have done no tuning, and have used the config file at the end of this mail. With a 256K file size, the times are slightly better for haproxy vs nginx. I notice that the %requests served is similar for both cases till about 90%.

Any help is very much appreciated.

--------------------------------
A. haproxy config file:
-----------------------
global
    maxconn 65536
    ulimit-n 65536
    daemon
    quiet
    nbproc 2
    user haproxy
    group haproxy

defaults
    #log global
    mode http
    option dontlognull
    retries 3
    option redispatch
    maxconn 65536
    timeout connect 5000
    timeout client 50000
    timeout server 50000

listen my_ha_proxy 192.168.1.110:80
    mode http
    stats enable
    stats auth someuser:somepassword
    balance roundrobin
    cookie JSESSIONID prefix
    option httpclose
    option forwardfor
    option httpchk HEAD /check.txt HTTP/1.0
    server nginx1 192.168.1.101:80 check
    server nginx2 192.168.1.102:80 check
    server nginx3 192.168.1.103:80 check
    server nginx4 192.168.1.104:80 check

B.
haproxy version/build-option:
-----------------------------
HA-Proxy version 1.5.8 2014/10/31
Copyright 2000-2014 Willy Tarreau w...@1wt.eu

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2
  OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.7
Compression algorithms supported : identity, deflate, gzip
Built with OpenSSL version : OpenSSL 1.0.1e 11 Feb 2013
Running on OpenSSL version : OpenSSL