On Mon, Dec 09, 2013 at 03:43:09PM +0000, Annika Wickert wrote:
> - Two Intel(R) Xeon(R) CPU X6550 @ 2.00GHz in each cluster node
> - 2x Emulex Corporation OneConnect 10Gb NIC (rev 02) in each cluster node
> - 32gbit RAM in each cluster node
> - Two nodes per cluster (active-active in the new one)

I have not had the opportunity to test Emulex NICs yet. It is possible
that they disable some TCP optimizations by default, resulting in worse
performance with splice().

> - Debian Squeeze / 3.1.0-1-amd64 / Tickrate 250
> - CentOS release 6.4 (Final) / 3.11.5-1.el6 / Tickrate 1000
> 
> The higher the tickrate, the higher the CPU load. You quadrupled
> the tickrate, and your load what - quadrupled? I suggest you
> try a lower tickrate in the very same configuration.

250 is the best tick rate for network-related traffic: it allows a
number of timing conversions to milliseconds to be done with a simple
shift instead of a divide, while not hammering the system too often.
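
To illustrate the arithmetic behind that remark: at 250 ticks per second
one tick lasts exactly 4 ms, so converting between milliseconds and ticks
is a shift by 2 bits. The sketch below is only an illustration of that
arithmetic, not kernel code; the function names are made up for the example.

```python
def ms_to_ticks_shift(ms: int) -> int:
    """Assumes HZ=250: one tick is 4 ms, so ms -> ticks is a right shift by 2."""
    return ms >> 2


def ticks_to_ms_shift(ticks: int) -> int:
    """Assumes HZ=250: ticks -> ms is a left shift by 2 (multiply by 4)."""
    return ticks << 2


def ms_to_ticks_generic(ms: int, hz: int) -> int:
    """Generic conversion for any tick rate; needs a multiply and a divide."""
    return ms * hz // 1000


# The shift gives the same result as the generic formula when hz is 250.
same = all(ms_to_ticks_shift(ms) == ms_to_ticks_generic(ms, 250)
           for ms in range(10000))
```

With a tick rate such as 300, a tick is not a power-of-two number of
milliseconds, so only the generic multiply-and-divide form applies.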

> - We are forcing by splice-request / splice-response

OK so I suspect this is purely TCP.

> I believe splice is not always more efficient than recv/send;

Confirmed, especially with small transfers (less than a page = 4 kB).

> use splice-auto to use it less aggressively (doc: splice-auto):
> 
> For testing we disabled splicing on one of the cluster members on the new
> cluster (after successful tests). Now load drops below 8 from 16. So I may
> try it with splice-auto and, if that does not help, with a new haproxy build
> with the following git commits:
> http://haproxy.1wt.eu/git?p=haproxy.git;a=commit;h=61d39a0e2a047df78f7f3bfcf5584090913cdc65

Oh good point, I completely forgot about this one. Yes it could be a culprit!

> http://haproxy.1wt.eu/git?p=haproxy.git;a=commit;h=fa8e2bc68c583a227ebc78bab5779b84065b28da
> 
> Haproxy uses heuristics to estimate if kernel splicing might improve
> performance or not. Both directions are handled independently. Note
> that the heuristics used are not much aggressive in order to limit
> excessive use of splicing.

Yes, the heuristics consist of detecting whether haproxy manages to read a
full buffer at once and to purge it at once. If that works, then the traffic
is considered high enough to make good use of splice(). Otherwise, with
incomplete buffers, it sticks to recv/send. It tends to work really well in
web environments where you don't want favicon.ico to be spliced but you do
want your photos to be.
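
A minimal sketch of that decision rule, for illustration only. This is not
haproxy's actual code: the class name, the method name and the 16 kB buffer
size are all assumptions made for the example. Each direction keeps its own
state, as the documentation quoted above describes.

```python
BUF_SIZE = 16384  # assumed buffer size, chosen only for this illustration


class DirectionState:
    """Hypothetical per-direction state for a splice-auto style heuristic."""

    def __init__(self, buf_size: int = BUF_SIZE) -> None:
        self.buf_size = buf_size
        self.use_splice = False

    def record(self, bytes_read: int, bytes_sent: int) -> bool:
        """Update the decision after one recv/send round.

        Splicing is enabled only when the buffer was filled in a single
        read and then emptied in a single write; otherwise recv/send is kept.
        """
        filled_at_once = bytes_read >= self.buf_size
        purged_at_once = bytes_sent >= bytes_read
        self.use_splice = filled_at_once and purged_at_once
        return self.use_splice


photo = DirectionState()
favicon = DirectionState()
photo_spliced = photo.record(16384, 16384)   # full buffer in, full buffer out
favicon_spliced = favicon.record(1150, 1150)  # small object, partial buffer
```

In this sketch the large transfer switches to splicing while the small
object stays on recv/send, which matches the favicon.ico versus photos case.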

Regards,
Willy

