On Fri, 21 Aug 2015, Radu Gheorghe wrote:
Hello rsyslog users :)
We've seen a problem that is similar to the one reported here:
http://www.gossamer-threads.com/lists/rsyslog/users/17550 While that looks
like a bug, ours seems like a design issue.
Basically we see bulks of one document all over the place. I'm not 100% sure
what the root cause is, but I'm thinking: if you have many machines with
rsyslog installed that send logs to Elasticsearch, but most of them send few
logs, they would never get enough messages in the queue to push in large
batches. Unless you add a slowdown, in which case you restrict rsyslog's
ability to push data when it's under load.
If you have all your systems send to a central aggregation point, rather than
into ES directly, that aggregation point gets the combined traffic, and is much
more likely to have data available to send.
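A minimal sketch of that layout in rsyslog's RainerScript syntax (hostnames, port, queue size, and any index settings are placeholders, not values from this thread):

```
# On each client: forward everything to the aggregator
action(type="omfwd" target="aggregator.example.com" port="514" protocol="tcp")

# On the aggregator: bulk-insert the combined stream into ES
module(load="omelasticsearch")
action(type="omelasticsearch"
       server="es.example.com"
       bulkmode="on"
       queue.type="linkedlist"
       queue.dequeuebatchsize="1024")
```

With the combined traffic of all clients arriving at one box, the aggregator's queue is far more likely to have many messages waiting each time a send completes.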
If you have 10K docs/s coming in 1 doc batches (say, from 10K machines),
there's a lot of unnecessary load on ES. Sure, if ES is overloaded things
will get better (as documents will add up in queues, resulting in bigger
batches) but even then I'd imagine things will look quite inefficient.
Plus, I'd like to avoid ES being overloaded in the first place.
The solution, in my mind, was to add two options:
- one that says "if you don't have at least N items in the bulk, wait a bit
until you have"
- one that overrides it saying "if M seconds passed since the last bulk,
send the bulk anyway"
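For reference, the proposed min-batch/max-wait policy could be sketched like this. This is a hypothetical illustration, not rsyslog code; the parameter names `min_batch` and `max_wait` are made up for the sketch, and `None` on the queue is used as a shutdown sentinel:

```python
import queue
import time

def batched_flush(q, flush, min_batch=50, max_wait=2.0):
    # Sketch of the proposed policy: flush once min_batch messages are
    # buffered, or once max_wait seconds pass with a non-empty buffer,
    # whichever comes first. A None on the queue means "drain and stop".
    buf = []
    deadline = None                      # armed when the buffer becomes non-empty
    while True:
        timeout = None if deadline is None else max(0.0, deadline - time.monotonic())
        try:
            msg = q.get(timeout=timeout)
        except queue.Empty:
            msg = queue.Empty            # the max_wait timer fired
        if msg is None:                  # sentinel: drain and stop
            if buf:
                flush(buf)
            return
        if msg is not queue.Empty:
            buf.append(msg)
            if deadline is None:
                deadline = time.monotonic() + max_wait
        if buf and (len(buf) >= min_batch or time.monotonic() >= deadline):
            flush(buf)
            buf, deadline = [], None
```

Note that even this small sketch needs a timer armed and re-armed on every flush, which is exactly the complexity discussed below.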
This sort of logic tends to be rather fragile (setting timers, checking how
long it's been, etc. ends up really hurting you when you are under load). It's
also the sort of thing that is routinely misconfigured to really hurt you.
The approach that rsyslog takes is to send something as soon as it's available,
let things queue up while that's being processed, and then send what's queued up
(with a max limit).
This has the advantage of simplicity and performance. There are no timers to
set up, no timestamps to check, and the latency in message delivery is the
minimum possible.
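A sketch of that send-then-drain loop, as I understand the description above (illustrative only, not rsyslog's actual implementation; `None` again serves as a shutdown sentinel):

```python
import queue

def adaptive_batches(q, send, max_batch=1024):
    # Send the first available message immediately; everything that
    # arrived while the previous send was in progress becomes the next
    # batch (capped at max_batch). No timers or timestamps involved.
    while True:
        first = q.get()                  # block until something arrives
        if first is None:
            return
        batch = [first]
        while len(batch) < max_batch:    # grab whatever queued up meanwhile
            try:
                item = q.get_nowait()
            except queue.Empty:
                break
            if item is None:
                send(batch)
                return
            batch.append(item)
        send(batch)
```

The batch size adapts automatically: when the output is slow or traffic is heavy, more messages pile up during each send and batches grow on their own.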
As a result, the sort of change you are looking for will almost certainly not
go into the core. I believe that the ES module has its own buffer of messages
that it's sending, so it could go there (IIRC the omelasticsearch module was
contributed).
Now, where does this help?
When traffic is really slow, this won't help; everything will still be
singletons.
When traffic is heavy (just above the minimum batch size), this won't help
either; everything will be sent the same way with either set of logic.
There is a middle ground where fewer, but larger, batches are being sent, and
things will flow more efficiently. How much of a difference does this make?
I don't think it will make much difference, but I can be convinced by numbers.
Let's investigate what the best-case situation is (I don't have the numbers for
this, so we'll have to do some research).
The best case is where, without this setting, rsyslog would send singleton
messages, but with this setting, it would batch up exactly minbatch messages
and send them.
What sort of setting are you thinking of for your 'minimum size' batch?
On the sender side, each batch sent has a fairly small overhead; the request
being sent doesn't add much beyond the messages being inserted. There is going
to be some amount of additional RAM used to hold on to these logs, but the
system is idle, so it really shouldn't hurt. What I think is more likely to
hurt is that when things go wrong, more data will be lost.
On the receiver's side, how much of a performance benefit is there? (This
depends on the internals of ES.)
Batch mode was created in rsyslog because my testing was showing that on
low-end hardware I could insert ~1000 records into postgres as a batch in the
same time that it took to insert two records individually.
Can we get someone who has an ES setup to run a test? Force the batch size to 1
and hammer it until you reach the max rate, then set the batch size really
large and keep increasing the dequeue delay time until the total rate of
inserts drops back to the same level, and report how large the delay time needs
to be for them to even out. Also report the load on the ES server under the
'many small' vs 'few large' cases (vmstat and iostat output, and possibly
/proc/meminfo, so that we can see disk, RAM, and CPU utilization).
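One way to run the many-small vs few-large comparison without touching rsyslog at all is to hammer ES's `_bulk` endpoint directly with different batch sizes. A minimal sketch of the payload builder (the index name and document shape are placeholders, not values from this thread; the driver loop that POSTs it and times the runs is left out):

```python
import json

def bulk_body(docs, index="rsyslog-test"):
    # Build an Elasticsearch _bulk request body (NDJSON): one action
    # line plus one source line per document, newline-terminated.
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"
```

Varying the number of docs per call while keeping total docs/s constant would give exactly the 'many small' vs 'few large' numbers being asked for here.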
The recent request to add compression to the ES transaction will matter here as
well: a larger batch will compress better.
For example, if you are talking 1 vs 5 messages/batch, will that really make a
difference on the ES server? If so, how big a difference? If it's a 500%
improvement, but the 'bad' situation is only using 20% of a CPU on ES, do we
care? If ES does the insert into a data structure in RAM, and then pushes it
out to disk and updates its indexes to make things visible only every several
seconds, then it may be that there is no noticeable difference between the two
modes for quite a while. If we push the single-item rate until the server can't
keep up, we will be hitting some limit.
There's also the question of the value of what's being saved. Depending on
which resource on the ES server ends up being the bottleneck that larger
batches relieve, it may be that it's not something that would really make the
ES server noticeably better if it weren't being used. It's also possible that
we will find that it's a really critical resource and would make a huge
difference.
Also, should the minimum queue size be based on the number of messages, or the
size of the data being sent?
You could also test this by having a program that inserts into ES, reads from
stdin, and is set up via omprog, caching everything up until the minimum batch
size, possibly with a signal that forces it to flush its cache _now_. That way
you can experiment with timing by changing the rate at which you send signals
from an external script, and the sending code doesn't need to have any of the
clock logic in it.
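A sketch of that omprog-side helper, assuming SIGUSR1 as the flush signal (the class name and callback are invented for illustration; `ship()` is whatever performs the actual ES insert):

```python
import signal
import sys

class StdinBatcher:
    # Buffer lines read from stdin until min_batch are collected, and
    # let an external script force an early flush by sending SIGUSR1.
    # The timing policy thus lives entirely outside this program.
    def __init__(self, ship, min_batch=100):
        self.ship = ship
        self.min_batch = min_batch
        self.buf = []
        signal.signal(signal.SIGUSR1, self._on_signal)

    def _on_signal(self, signum, frame):
        self.flush()                     # SIGUSR1 forces a flush _now_

    def flush(self):
        if self.buf:
            self.ship(self.buf[:])
            self.buf.clear()

    def feed(self, line):
        self.buf.append(line.rstrip("\n"))
        if len(self.buf) >= self.min_batch:
            self.flush()

    def run(self, stream=sys.stdin):
        for line in stream:
            self.feed(line)
        self.flush()                     # ship any remainder at EOF
```

An external script can then `kill -USR1` this process at whatever interval is being tested, sweeping the flush period without recompiling or restarting the sender.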
We know the cost to rsyslog of doing something like this, but we don't know the
benefits.
Now the big questions:
- is this possible? where would one apply such a change?
- would it have a significant impact on the performance of outputs that
work well with the current design? Like omfwd, where I imagine the receiving
end wouldn't care how many docs it receives
- if it does have a significant impact, can we restrict such a change to
omelasticsearch, or does it have to go into rsyslog's core (in the way it
handles queues)?
- do you see better solutions?
I think the answer is that it would hurt in the general case, would be very
invasive, and would not be the right thing for many outputs, but it may be the
right thing for some outputs, so let's test and see.
David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE
THAT.