On Sun, Apr 29, 2012 at 05:25:01PM +0300, Bar Ziony wrote:
> Hi Willy,
> 
> Thanks for your time.
> 
> I really didn't know these are such low results.
> 
> I ran 'ab' from a different machine than haproxy and nginx (which are
> different machines too). I also tried to run 'ab' from multiple machines
> (not haproxy or nginx) and the results are pretty much the single 'ab'
> result / 3...

OK so this clearly means that the limitation comes from the tested
components and not the machine running ab.

> I'm using VPS machines from Linode.com, they are quite powerful. They're
> based on Xen. I don't see the network card saturated.

OK I see now. There's no point searching anywhere else. Once again you're
a victim of the high overhead of virtualization that vendors like to pretend
is almost unnoticeable :-(

> As for nf_conntrack, I have iptables enabled with rules as a firewall on
> each machine, I stopped it on all involved machines and I still get those
> results. nf_conntrack is compiled to the kernel (it's a kernel provided by
> Linode) so I don't think I can disable it completely. Just not use it (and
> not use any firewall between them).

It's having the module loaded with default settings which is harmful, so
even unloading the rules will not change anything. Anyway, now I'm pretty
sure that the overhead caused by the default conntrack settings is nothing
compared with the overhead of Xen.

> Even if 6-7K is very low (for nginx directly), why is haproxy doing half
> of that?

That's quite simple: it has two sides, so it must process twice the number
of packets. Since you're virtualized, you're packet-bound. Most of the time
is spent communicating with the host and with the network, so the more
packets you exchange, the less performance you get. That's why you're seeing
a 2x increase even with nginx when enabling keep-alive: reusing a connection
avoids the handshake and close packets for every request.

I'd say that your numbers are more or less in line with a recent benchmark
we conducted at Exceliance, which is summarized at the link below (in each
test the hardware was running a single VM):

   
   http://blog.exceliance.fr/2012/04/24/hypervisors-virtual-network-performance-comparison-from-a-virtualized-load-balancer-point-of-view/

(BTW you'll note that Xen was the worst performer here with 80% loss
 compared to native performance).

In your case it's very unlikely that you'd have dedicated hardware, and
since you don't have access to the host, you don't know what its settings
are, so I'd say that what you managed to reach is not that bad for such an
environment.

You should be able to slightly increase performance by adding the following
options in your defaults section :

   option tcp-smart-accept
   option tcp-smart-connect

Each of them will save one packet during the TCP handshake, which may
slightly compensate for the losses caused by virtualization. Note that
I have also encountered a situation once where conntrack was loaded
on the hypervisor and not tuned at all, resulting in extremely low
performance. The effect is that the performance continuously drops as
you add requests, until your source ports roll over, after which the
performance remains stable. In your case, you run with only 10k reqs,
which is not enough to measure the performance under such conditions. You
should have one injector running a constant load (e.g. 1M requests in
loops) and another one running the 10k reqs several times in a row to
observe whether the results are stable or not.

> about nginx static backend maxconn - what is a high maxconn number? Just
> the limit I can see with 'ab'?

It depends on your load, but nginx will have no problem handling as
many concurrent requests as haproxy on static files. So not having
a maxconn there makes sense. Otherwise you can limit it to a few
thousand if you want, but the purpose of maxconn is to protect a
server, so here there is not really anything to protect.
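
To make this concrete, here is a minimal sketch of how the two options
above and the maxconn choice could fit together in one configuration.
The section names, the server name, the address, the timeouts and the
2000 limit are made up for illustration and are not taken from your
setup, so adapt them to what you already have:

   defaults
      mode http
      # each option saves one packet during the TCP handshake
      option tcp-smart-accept
      option tcp-smart-connect
      # illustrative timeouts only
      timeout connect 5s
      timeout client  30s
      timeout server  30s

   frontend www
      bind :80
      default_backend static

   backend static
      # no maxconn here: nothing to protect on a static nginx server
      server nginx1 192.0.2.10:80 check
      # alternative if you prefer a cap (value is only an example):
      # server nginx1 192.0.2.10:80 check maxconn 2000

Only one of the two "server" lines should be active at a time. The
per-server maxconn is what limits concurrent connections sent to nginx,
so leaving it out means haproxy will not queue requests for that server.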

One last point about virtualized environments: they're really fine if
you put cost before performance. However, if you're building a high
traffic site (>6k req/s might qualify as a high traffic site), you'd be
better off with real hardware. You would not want such a site to fail
just to save a few dollars a month. To give you an idea, even with a
15EUR/month dedibox built on a single-core Via Nano processor and
running nf_conntrack, I can achieve 14300 req/s.

Hoping this helps,
Willy

