Hi guys, I’m currently running a few tests on haproxy and nginx to determine the best software to terminate SSL early in a hosting stack, while also load balancing to varnish machines behind it.
I’d love to use haproxy for this setup, since haproxy does one thing really well: load balancing, and the way it determines whether backends are down. However, one issue I’m facing is SSL termination performance: on a single core I manage to terminate 21,000 connections per second with nginx, but only 748 connections per second with haproxy.

Both are using the exact same cipher suite (TLSv1.2, ECDHE-RSA-AES128-GCM-SHA256) to minimize SSL overhead. I decided to go for AES128 since security itself isn’t super important here; I mainly just want things somewhat encrypted (mostly static files or non-sensitive content).

I’m testing with a single apache benchmark (ab) client, run from the hypervisor where my VM lives, so network latency is minimal and networking can be ruled out as a cause, to get the highest possible numbers.

I generated a flame graph for both haproxy and nginx using the `perf` tool.

Haproxy flame graph: https://snaps.trollcdn.com/sadiZsJd96twAez0GUiWJdDiEbwsRPWUxJ3sRskLG4.svg
Nginx flame graph: https://snaps.trollcdn.com/P7PVyDkjhsxbsXCmK6bzVeqWsHHwnOxRucnCYG084f.svg

What I find odd is that in haproxy you’ll see libcrypto.so.1.0.2k with 81k samples, but the (unknown) function right below it only got 8.3k samples, whereas in nginx the gap is *a lot* smaller. I still haven’t really figured out what actually happens in haproxy that causes this gap.

My main concern, however, is the fact that nginx performs 28 times better at terminating SSL. I’ve tried running haproxy with either 10 threads or 10 processes on a 12-core machine, pinning each thread or process to a specific core and putting RX and TX queues on individual cores as well, to ensure the load would be evenly distributed. Doing the same with nginx, I still see 5.5k requests per second on haproxy but 125,000 requests per second on nginx (a 22x difference).
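For reference, the process pinning and cipher setup I mentioned looks roughly like this (a sketch of the relevant parts of my config, assuming HAProxy 1.8-style nbproc/cpu-map syntax; the certificate path is just a placeholder):

```
global
    nbproc 10
    # pin each worker process to its own core (cores 0-9)
    cpu-map 1 0
    cpu-map 2 1
    cpu-map 3 2
    cpu-map 4 3
    cpu-map 5 4
    cpu-map 6 5
    cpu-map 7 6
    cpu-map 8 7
    cpu-map 9 8
    cpu-map 10 9

frontend https-in
    # same single cipher as on nginx, to keep the comparison fair
    bind :443 ssl crt /etc/haproxy/cert.pem ciphers ECDHE-RSA-AES128-GCM-SHA256
```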
I got the absolute best performance on haproxy using processes rather than threads; with processes it isn’t maxing out the CPU, but with threads it is, and I’m not sure why that happens either.

Now, since nginx can serve static files directly, I wanted to replicate the same in haproxy so I wouldn’t need a backend making yet another connection behind the scenes, since that would surely degrade the overall requests per second. I did this by using `errorfile 200 /etc/haproxy/errorfiles/200.http` to serve a file directly from the frontend.

My haproxy config looks like this: https://gist.github.com/lucasRolff/36fc84ac44aad559c1d43ab6f30237c8

Does anyone have suggestions, or maybe insight into why haproxy seems to terminate SSL connections at a much lower rate per second than, for example, nginx? Is there some functionality missing in haproxy that causes nginx to come out ahead in terms of performance/scalability for terminations?

There are many things I absolutely love about haproxy, but if there’s a 22-28x difference in how many SSL terminations it can handle per second, then we’re talking about a lot of added hardware to be able to handle, let’s say, 500k requests per second. The VM has AES-NI available as well.

Thanks in advance!

Best Regards,
Lucas Rolff
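P.S. the back-of-envelope math behind the “a lot of added hardware” point, using the per-machine numbers from my 12-core VM tests (a rough sketch; simple ceilings, ignoring headroom and redundancy):

```python
import math

# measured per-machine rates from my tests above
target_rps = 500_000
haproxy_rps = 5_500
nginx_rps = 125_000

# machines needed to hit the target at each rate
print(math.ceil(target_rps / haproxy_rps))  # 91 machines with haproxy
print(math.ceil(target_rps / nginx_rps))    # 4 machines with nginx
```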