On Sun, Apr 29, 2012 at 05:25:01PM +0300, Bar Ziony wrote:
> Hi Willy,
>
> Thanks for your time.
>
> I really didn't know these were such low results.
>
> I ran 'ab' from a different machine than haproxy and nginx (which are
> different machines too). I also tried to run 'ab' from multiple machines
> (not the haproxy or nginx ones), and each of them gets roughly one third
> of the single-machine 'ab' result.

OK so this clearly means that the limitation comes from the tested
components and not from the machine running ab.
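For reference, the kind of test we're talking about would look like this
(the URL, request count and concurrency below are only placeholders, not
your actual values):

    # run simultaneously from each injection machine
    ab -n 10000 -c 100 http://test-host/index.html

When the total throughput stays constant regardless of how many injectors
run in parallel, the bottleneck is on the tested side and not on the
injection side.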
> I'm using VPS machines from Linode.com, they are quite powerful. They're
> based on Xen. I don't see the network card saturated.

OK I see now. There's no point searching anywhere else. Once again you're
a victim of the high overhead of virtualization, which vendors like to
pretend is almost unnoticeable :-(

> As for nf_conntrack, I have iptables enabled with rules as a firewall on
> each machine. I stopped it on all involved machines and I still get those
> results. nf_conntrack is compiled into the kernel (it's a kernel provided
> by Linode) so I don't think I can disable it completely, just not use it
> (and not use any firewall between them).

It's having the module loaded with its default settings which is harmful,
so removing the rules will not change anything. Anyway, I'm now pretty
sure that the overhead caused by the default conntrack settings is nothing
compared with the overhead of Xen.

> Even if 6-7K is very low (for nginx directly), why is haproxy doing half
> of that?

That's quite simple: haproxy has two sides, so it must process twice the
number of packets. Since you're virtualized, you're packet-bound. Most of
the time is spent communicating with the host and with the network, so the
more packets you process, the less performance you get. That's also why
you see a 2x increase even with nginx when enabling keep-alive.

I'd say that your numbers are more or less in line with a recent benchmark
we conducted at Exceliance, summarized here (each time the hardware was
running a single VM):

  http://blog.exceliance.fr/2012/04/24/hypervisors-virtual-network-performance-comparison-from-a-virtualized-load-balancer-point-of-view/

(BTW, you'll note that Xen was the worst performer there, with an 80% loss
compared to native performance.)

In your case it's very unlikely that you have dedicated hardware, and
since you don't have access to the host, you don't know what its settings
are. So I'd say that what you managed to reach is not that bad for such an
environment. You should be able to slightly increase performance by adding
the following options to your defaults section:

    defaults
        option tcp-smart-accept
        option tcp-smart-connect

Each of them saves one packet during the TCP handshake, which may slightly
compensate for the losses caused by virtualization.

Note that I have also encountered a situation once where conntrack was
loaded on the hypervisor and not tuned at all, resulting in extremely low
performance. The effect is that performance continuously drops as you add
requests, until your source ports roll over, at which point it stabilizes.
With only 10k requests per run, you cannot measure performance under such
conditions. You should have one injector running a constant load (e.g. 1M
requests in a loop) and another one running the 10k requests several times
in a row, to observe whether the results are stable or not; a sketch of
such a test follows below.

> About the nginx static backend maxconn - what is a high maxconn number?
> Just the limit I can see with 'ab'?

It depends on your load, but nginx will have no problem handling as many
concurrent requests as haproxy on static files, so not having a maxconn
there makes sense. Otherwise you can limit it to a few thousand if you
want, but the purpose of maxconn is to protect a server, and here there is
not really anything to protect.
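For illustration, this is what limiting a backend server would look like
(the server name, address and limit are placeholders, not a
recommendation):

    backend static
        server nginx1 10.0.0.2:80 maxconn 2000

With this, haproxy queues requests above 2000 concurrent connections
instead of piling them onto the server.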
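And here is the two-injector test I mentioned above, with made-up host
name and counts. From the first machine, keep a constant background load
running:

    # injector 1: long-running background load
    ab -n 1000000 -c 100 http://test-host/index.html

Meanwhile, repeat the short measurement from a second machine:

    # injector 2: repeat the short run and compare the rates
    for i in 1 2 3 4 5; do
        ab -n 10000 -c 100 http://test-host/index.html | grep 'Requests per second'
    done

If the reported rate keeps dropping from one run to the next, something on
the path (typically an untuned conntrack) is accumulating state.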
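One more thing on conntrack: even when it is built into the kernel, its
settings can usually be changed through sysctl rather than by disabling
it. Whether these knobs are exposed depends on the kernel build, so take
this as a sketch and adapt the values to your memory size:

    # enlarge the connection tracking table
    sysctl -w net.netfilter.nf_conntrack_max=1048576
    # expire TIME_WAIT entries faster so the table does not fill up
    sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=30

The defaults are sized for a desktop, not for a load generator or a load
balancer, which is why leaving them untouched hurts so much.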
Last point about virtualized environments: they're really fine if you put
cost before performance. However, if you're building a high traffic site
(>6k req/s might qualify as high traffic), you'd be better off with real
hardware; you would not want such a site to fail just to save a few
dollars a month. To give you an idea, even with a 15 EUR/month dedibox
consisting of a single-core VIA Nano processor, and which runs
nf_conntrack, I can achieve 14300 req/s.

Hoping this helps,
Willy