On Fri, 19 Dec 2014, Marc Fournier wrote:
Excerpts from David Lang's message of 2014-12-18 19:03:54 +0100:
On Wed, 17 Dec 2014, Marc Fournier wrote:
Excerpts from David Lang's message of 2014-12-16 20:41:49 +0100:
On Tue, 16 Dec 2014, Marc Fournier wrote:
Hello,
openldap writes its logs using syslog(3). Three log messages are emitted
for each ldap search operation.
On a given openldap setup, I reach about 2500 operations per second. The
simple fact of shutting rsyslog down icreases this number up to 5000
(logs are lost obviously).
Writing to a local file, to /dev/null, or to a remote server over UDP
doesn't significantly change this number. I tried tweaking various
settings related to the main queue but didn't see any huge difference.
ipmstats shows exactly the same numbers for "imuxsock: submitted" than
for "main Q: enqueued". All the discarded/full/failed/etc counters have
a value of 0.
This is rsyslog-8.6.0-2.el6.x86_64 from the adiscon RPM repo (but had
similar problems with version 5.8.10-9.el6_6 which comes with el6).
Here's the relevant bits of my config:
module(load="impstats" log.syslog="off" log.file="/var/log/impstats.log"
resetCounters="off" interval="5")
module(load="imuxsock" SysSock.RateLimit.Interval="0" SysSock.Annotate="off"
SysSock.FlowControl="off" SysSock.IgnoreTimestamp="off")
$MaxMessageSize 2k
$MainMsgQueueDequeueBatchSize 1024
$MainMsgQueueType LinkedList
$MainMsgQueueSize 100000
$MainMsgQueueWorkerThreads 1
$RepeatedMsgReduction off
$SystemLogRateLimitInterval 0
$ActionQueueType LinkedList
$ActionQueueSize 100000
$ActionQueueDiscardMark 95000
$ActionQueueDiscardSeverity 0
$ActionQueueTimeoutEnqueue 0
$ActionQueueDequeueSlowdown 1000
$ActionQueueWorkerThreads 2
$ActionQueueDequeueBatchSize 128
$ActionResumeRetryCount -1
#local4.* -/var/log/ldap/ldap.log
#local4.* ~
local4.* -/dev/null
#local4.* @localhost
So I'm wondering which one of rsyslog or /dev/log is actually the
bottleneck here and what can I do about it ?
/dev/log is just a socket to talk to rsyslog
One major problem here, why do you have ActionQueueDequeueSlowdown configured?
that deliberatly pauses before trying to process the next messages.
you really shouldn't need to have an action queue for writing to a local file,
you should remove all of the actionqueue parameters (and if you do decide that
you want a queue, use the new action() syntax, it makes it much clearer what's
happening)
Ok, right my config is a mess I agree ! Been iterating over it since
back when I used 5.8.10.
So I started over with a brand new config. The best results I got so far
was with these settings:
module(load="impstats" log.syslog="off" log.file="/var/log/impstats.log"
resetCounters="off" interval="5")
module(load="imuxsock" SysSock.RateLimit.Interval="0" SysSock.Annotate="off"
SysSock.FlowControl="off" SysSock.IgnoreTimestamp="off")
$MaxMessageSize 2k
$MainMsgQueueDequeueBatchSize 4096
$MainMsgQueueType LinkedList
$MainMsgQueueSize 250000
$MainMsgQueueWorkerThreads 4
$MainMsgQueueWorkerThreadMinimumMessages 100000
$RepeatedMsgReduction off
$SystemLogRateLimitInterval 0
*.* ~
The numbers are slightly better now: ~3500 ops/s vs 2500 yesterday. But
still a significant difference with the max I reach when running with
rsyslogd stopped (~5000 ops/s).
Is this an expected compromise ? Or are there possible improvements I
can do ?
I'm not seeing anything obviously wrong here, does impstats show anything
interesting (errors, etc)
No, no errors. The discard/full counters also stay at 0.
I would actually reduce the worker threads down to 1 (it's counter-intuitive,
but having too many threads in rsyslog can actually slow things down because of
the overhead involved with locking and unlocking the queues)
Ok, tried that. No significant change with the issue I'm facing though.
Meanwhile I improved my stress-test setup, and upgraded the target
machine to a quad-core (I was suspecting competition for cpu time
between openldap and rsyslog on the previous single-core machine).
The good news is that I can reproduce the problem much more reliably.
The bad news is that numbers are even worse and that I have proof that
the bottleneck lies in rsyslog. Here are my latest observations:
- no rsyslog running: 15k to 17k ldap operations/second
- rsyslog with above config, except for MainMsgQueueWorkerThreads back
to 1: down to 7.5k ops/s
- when monitoring openldap's threads state, a surge of threads in
"pending" state as soon as rsyslog is started.
- a trivial syslog-ng config, reading from /dev/log and sending
everything to /dev/null, results in only about 10% ops/s decrease.
- also the pending threads count stay close to zero when using
syslog-ng.
To recap, the starting point of this issue was that openldap temporarily
stopped responding when a sudden surge of requests came in (ie: critical
service disruption for users). And I found out this problem didn't
happen when logging was disabled.
As the performance numbers we can read on the web about rsyslog are an
order of magnitude bigger *but* usually relate to the imtcp/imudp
modules, I'm wondering if there possibly was a problem with the imuxsock
module (FWIW "in:imuxsock" is constantly in the top 3 of "top -H" during
my tests) ?
Also no idea what value this has, but strace shows that rsyslog spends
significantly more time in system calls than syslog-ng, in my setup:
# timeout 10 strace -c -fp $(pidof rsyslogd)
Process 27446 attached with 4 threads - interrupt to quit
Process 27446 detached
Process 27447 detached
Process 27448 detached
Process 27449 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
94.85 2.981049 49 61079 select
4.39 0.137935 0 287362 55920 futex
0.68 0.021477 0 61077 recvmsg
0.07 0.002301 0 61084 gettimeofday
0.00 0.000000 0 6 writev
0.00 0.000000 0 2 getrusage
------ ----------- ----------- --------- --------- ----------------
100.00 3.142762 470610 55920 total
# timeout 10 strace -c -fp $(pgrep -f ^syslog-ng)
Process 29477 attached - interrupt to quit
Process 29477 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
41.39 0.001228 0 359614 gettimeofday
23.12 0.000686 0 1774 1 read
15.67 0.000465 0 105768 lseek
15.57 0.000462 0 105767 write
4.25 0.000126 0 21153 poll
0.00 0.000000 0 114 stat
0.00 0.000000 0 1 restart_syscall
------ ----------- ----------- --------- --------- ----------------
100.00 0.002967 594191 1 total
One thing that showed up in some tests Rainer did a few days ago is that with an
empty ruleset like you have here, rsyslog spends a lot of time fighting itself
for access to the main queue. In the benchmark that he did, he had better
performance when rsyslog was writing to a file than when it was just discarding
the message (because discarding the message takes so little time that there is a
lot of locking contention on the queue). In his test, he got almost double the
speed when writing out to a file than when just discarding the message
This matches your strace numbers.
While you are running your test, use the H option in top to show individual
threads. It may be interesting to see which thread is taking all the time.
Rainer, does the uxsocket input do batches?
David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE
THAT.