On Mon, 6 Aug 2012, Chastity Blackwell wrote:

On Fri, 2012-08-03 at 16:29 -0400, [email protected] wrote:
On Fri, 3 Aug 2012, Chastity Blackwell wrote:

We're looking at doing some load balancing for our rsyslog
infrastructure and as part of that, obviously, I'd like to use the
tcprebindinterval directive; however, I can't seem to find usage
examples or syntax on the wiki or elsewhere from a google search. Does
anyone have just a quick snippet from a conf file they can show me?

It's simple
$tcprebindinterval number
where number is the number of messages sent before disconnecting and
reconnecting.
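
As a minimal sketch of how this might look in a legacy-format config file: the rsyslog documentation spells the directive $ActionSendTCPRebindInterval, so the short name above is probably shorthand. The target host, port, and the value 10000 are placeholders for illustration, not values taken from this thread.

```
# Close and reopen the TCP connection after every 10000 messages sent
# (placeholder value; tune to your own message rate)
$ActionSendTCPRebindInterval 10000

# Forward everything over TCP (@@) to a hypothetical load-balanced address
*.* @@logs.example.com:514
```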

I would suggest setting this number relatively high; reconnecting once per
second or so is more than you normally need to load balance reasonably and
avoids wasting too much time reconnecting.

Thanks -- I'm assuming this is a per queue setting, so you'd want to set
it higher for, say, a queue handling access logs for a high-traffic
webserver and lower for system logs in general?

Like all other parameters in rsyslog, it affects the queue that you are currently configuring and any future queues, so if you have a whole bunch of separate action queues you can set them differently. If you don't have a lot of action queues, the parameter will apply across the board.
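
A rough sketch of that scoping in legacy format, assuming the documented directive name $ActionSendTCPRebindInterval. The facility used for access logs (local6), the hostname, and both values are hypothetical. The directive is set explicitly before each action so the example does not depend on which value carries forward.

```
# Hypothetical: high-volume web access logs arrive on facility local6
$ActionSendTCPRebindInterval 10000
local6.* @@logs.example.com:514

# Lower-volume system logs: a different interval for this action
$ActionSendTCPRebindInterval 1000
*.*;local6.none @@logs.example.com:514
```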

However, in practice, you really don't need to worry that much about it.

Rsyslog can receive messages _very_ quickly; the bottleneck is almost always in processing/delivering the messages. As long as your main queue on the receiving boxes can handle the burst size, you are in good shape.

And as far as lower volume traffic goes, if it's low volume, you shouldn't need to worry much about load balancing it, right :-)

This isn't trying to do 'perfect' load balancing where all receivers get exactly the same number of messages; it's allowing you to do 'statistical' load balancing where they will all receive about the same number of messages over time.

I have rsyslog load balancing across a farm of 10 machines (splunk servers), where I'm load balancing not because the receivers can't handle the rate of inbound messages, but because I want to have about the same number of messages on each system so that when I do searches across the logs, each system has about the same amount of work to do.

I have just one parameter set on the senders; I don't try to do different things for different types of logs. Over time (a day) the servers are so close to having the same number of log messages that on 150G of logs/day (15G per server) the difference in the size of the log files is well under 1M (0.001% variation).

In hindsight, there really isn't much need to have this as a configurable parameter. If this were a boolean switch that reconnected after every thousand (or even 10K) messages, it would work for well over 99% of cases. At one reconnect per 1K messages, the overhead of the reconnect is minimal, and reconnecting every 1K or 10K messages is more than good enough to spread the logs across the receiving boxes.
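
On rsyslog versions that support the newer action() syntax, the same idea would look roughly like the sketch below. It assumes the omfwd parameter is named RebindInterval, as in the module documentation; the target host and port are placeholders.

```
# Forward all messages over TCP to a hypothetical load-balanced address,
# reconnecting after every 10000 messages
action(type="omfwd"
       target="logs.example.com"
       port="514"
       protocol="tcp"
       RebindInterval="10000")
```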

Where this would fall down is if there is a HUGE overhead in processing each log file, AND the logs are arriving relatively slowly, so that one box would be unable to process the burst of messages, or the lag in processing that many messages on one box would become unacceptably large.

We didn't know this when we created this feature, so we went with the easier-to-implement and more flexible option of letting the user set the value.

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards