Long story short: the problem is with your config. You configured rsyslog to 
never lose any message. At some point it runs out of queue space, then the 
inputs begin to block, so the log socket blocks, and finally so do the apps 
that write to the log socket.
There are many options to control under which circumstances you permit message 
loss, and you need to make sure you know what you want.
The current versions also have a lower enqueue timeout, which mitigates this 
config issue, at least for systems that are not super-busy. Note that the 
blocking is not really a bug: it may be the intended behavior in some 
environments (e.g. banking, where you absolutely need to log).
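
As a rough illustration only (untested, and every number is a placeholder 
rather than a recommendation), a variant of your action config that permits 
some loss under pressure might look like the sketch below. It uses the legacy 
action queue directives; please check the queue documentation for their exact 
semantics in your version.

```
$ActionQueueType LinkedList
$ActionQueueFileName remote
# cap the disk spool so it cannot fill the partition (placeholder size)
$ActionQueueMaxDiskSpace 1g
# in-memory queue size, in messages (placeholder)
$ActionQueueSize 10000
# wait at most 10 ms for queue space before giving up on a message
$ActionQueueTimeoutEnqueue 10
# once 9500 messages are queued, discard info and debug (severity >= 6)
$ActionQueueDiscardMark 9500
$ActionQueueDiscardSeverity 6
$ActionResumeRetryCount -1
$ActionQueueSaveOnShutdown on
*.* @@10.0.0.200
```

The idea is that a full queue should delay writers by roughly the enqueue 
timeout per message instead of blocking them for as long as the server is 
away, at the price of dropping messages during the outage.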

Rainer

> -----Original Message-----
> From: [email protected] [mailto:rsyslog-
> [email protected]] On Behalf Of Timur I. Bakeyev
> Sent: Tuesday, November 06, 2012 3:26 PM
> To: [email protected]
> Cc: Michael Biebl
> Subject: [rsyslog] Rsyslog freezes the box when can't send logs over
> TCP
> 
> Hi all!
> 
> Recently we experienced a "nice" lock up of our web nodes that do
> remote TCP logging to the centralized syslog server.
> 
> The central rsyslog server was shut down due to maintenance for a few
> hours. Very soon the first Web servers with nginx/php-fpm started to
> fail with timeouts, and later access to the nodes themselves over ssh
> stopped working, making the boxes inaccessible. With iLO it was
> possible to log in to the consoles of the nodes, but no problems were
> spotted: memory, CPU and TCP socket footprints were low, and everything
> looked normal.
> 
> An educated guess was that it might be related to rsyslog, as it was
> the only change in the servers' configuration that day, and killing
> rsyslogd on the nodes brought the machines back to normal.
> 
> A bit of googling brought up this story:
> 
> http://blog.bitbucket.org/2012/01/12/follow-up-on-our-downtime-last-week/
> 
> as well as a conversation from last year on this ML:
> 
> http://lists.adiscon.net/pipermail/rsyslog/2011-October/013944.html
> 
> as well as a bug report in RedHat Bugzilla:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=519201
> 
> Switching to UDP transport instead of TCP cured the problem, and
> running a simple test with:
> 
> # for n in `seq 1 1000`; do echo $n; echo "Syslog test $n" | logger; done
> 
> showed that with TCP remote logging enabled and the remote log server
> absent, after the first 250-300 messages all following messages were
> sent to syslog very slowly, with a 1-2 sec delay each.
> 
> All nodes are running Debian 6 (squeeze) with:
> 
> rsyslogd 5.8.11, compiled with:
>         FEATURE_REGEXP:                         Yes
>         FEATURE_LARGEFILE:                      No
>         GSSAPI Kerberos 5 support:              Yes
>         FEATURE_DEBUG (debug build, slow code): No
>         32bit Atomic operations supported:      Yes
>         64bit Atomic operations supported:      Yes
>         Runtime Instrumentation (slow code):    No
> 
> And configuration:
> 
> $MaxMessageSize 8k
> $ModLoad imuxsock # provides support for local system logging
> $ModLoad imklog   # provides kernel logging support
> 
> # Store PID of the process in the log
> $SystemLogUsePIDFromSystem on
> # Rate limit for imuxsock
> $SystemLogRateLimitInterval 1
> $SystemLogRateLimitBurst 500
> 
> $WorkDirectory /var/spool/rsyslog
> 
> $ModLoad imfile
> $InputFilePollInterval 5
> 
> $InputFileName /var/log/nginx/access.log
> $InputFilePersistStateInterval 100
> $InputFileTag nginx/access:
> $InputFileStateFile nginx_access_log_state
> $InputFileFacility local7
> $InputFileSeverity notice
> $InputRunFileMonitor
> 
> $ActionQueueType LinkedList      # enable a separate queue for this action
> $ActionQueueFileName remote      # set file name, also enables disk mode
> $ActionResumeRetryCount -1       # infinite retries on insert failure
> $ActionQueueSaveOnShutdown on
> *.* @@10.0.0.200
> 
> I went through the ChangeLog on the site for the versions of rsyslogd
> following 5.8.11, but didn't notice any fix for that problem. Nor is
> there an open ticket in Bugzilla.
> 
> Interestingly, Red Hat claims they fixed the problem in version 3.22:
> 
> http://rhn.redhat.com/errata/RHBA-2010-0213.html
> 
> Can you point me to the version where this problem is fixed? Or, if
> it's not fixed yet - can it be done?
> 
> BTW, I'm not sure how this would behave with the RELP protocol instead
> of TCP...
> 
> with best regards,
> Timur Bakeyev.
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST
> if you DON'T LIKE THAT.