I enabled debug-on-demand mode and found the lines noted earlier in this thread 
within the debug log.

I also found lots of messages like these:

$ sudo grep 'main Q' /var/log/rsyslog-debug-on-demand.log
7501.401841560:imuxsock.c     : main Q: queue.c: EnqueueMsg advised worker start
7502.328726922:imrelp.c       : main Q: queue.c: EnqueueMsg advised worker start
7502.329026161:imrelp.c       : main Q: queue.c: doEnqSingleObject: LightDelay mark reached for light delayable message - blocking a bit.
7502.329631733:imuxsock.c     : main Q: queue.c: EnqueueMsg advised worker start
7502.329664576:imuxsock.c     : main Q: queue.c: EnqueueMsg advised worker start
7503.329150535:imrelp.c       : main Q: queue.c: EnqueueMsg advised worker start
7503.329426002:imrelp.c       : main Q: queue.c: doEnqSingleObject: LightDelay mark reached for light delayable message - blocking a bit.
7503.329950433:imuxsock.c     : main Q: queue.c: EnqueueMsg advised worker start
7503.329989705:imuxsock.c     : main Q: queue.c: EnqueueMsg advised worker start
7504.329531389:imrelp.c       : main Q: queue.c: EnqueueMsg advised worker start
7504.329854248:imrelp.c       : main Q: queue.c: doEnqSingleObject: LightDelay mark reached for light delayable message - blocking a bit.
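
To see how often each input hits the light delay mark, the grep output can be
tallied per source file. This is only a rough Python sketch: the function name
tally_main_queue is made up for illustration, and the regular expression
assumes every debug line has the "<timestamp>:<source file> : main Q: queue.c:
<message>" shape seen in the excerpt above.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt above:
#   "<timestamp>:<source file>   : main Q: queue.c: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d+\.\d+):(?P<src>\S+)\s*: main Q: queue\.c: (?P<msg>.*)$"
)


def tally_main_queue(lines):
    """Count LightDelay and EnqueueMsg main-queue lines per source file.

    Returns a pair of Counters: (light_delay_hits, enqueue_messages).
    Lines that do not match the assumed shape are skipped.
    """
    light_delay = Counter()
    enqueues = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue
        if "LightDelay" in match["msg"]:
            light_delay[match["src"]] += 1
        elif "EnqueueMsg" in match["msg"]:
            enqueues[match["src"]] += 1
    return light_delay, enqueues


# Example use against the debug log named above (needs read access):
#
#   with open("/var/log/rsyslog-debug-on-demand.log", errors="replace") as fh:
#       delayed, enqueued = tally_main_queue(fh)
#   print("LightDelay hits:", dict(delayed))
#   print("EnqueueMsg lines:", dict(enqueued))
```

In the excerpt above, only imrelp.c would show up in the LightDelay tally,
while imuxsock.c appears only in the EnqueueMsg tally.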


Turning to Google, I landed on these pages:

https://www.rsyslog.com/doc/master/concepts/queues.html
https://www.rsyslog.com/doc/master/rainerscript/queue_parameters.html
https://github.com/rsyslog/rsyslog/issues/1778#issuecomment-353135527
https://github.com/rsyslog/rsyslog/pull/3642
https://github.com/rsyslog/rsyslog-doc/pull/819

Is the fix here to set queue.lightDelayMark to 0? If so, should it be set on 
the main queue or on the ruleset bound to the imrelp input?

For now, I've disabled port probes from Nagios on the receiver's imrelp port.

-----Original Message-----
From: rsyslog <rsyslog-boun...@lists.adiscon.com> On Behalf Of Adam Chalkley 
via rsyslog
Sent: Wednesday, August 19, 2020 11:38 AM
To: rsyslog-users <rsyslog@lists.adiscon.com>
Cc: Adam Chalkley <atc0...@auburn.edu>
Subject: [rsyslog] Upgraded receiver from Ubuntu 16.04 to 18.04, connections 
from clients failing with a high number of CLOSE_WAIT connections on receiver

Hi,

We upgraded the OS on our central receiver yesterday from Ubuntu 16.04 (4.4 
kernel) to 18.04 (4.15 kernel).

We are using the upstream PPA, so both the receivers and the endpoints are 
running 8.2006.0.

When our Nagios instance began reporting that the rsyslog forward queues on the 
endpoints were starting to fill, we checked our receiver (sawmill1) and saw 94 
open TCP connections. 40 of them were in CLOSE_WAIT from our Nagios server, 
most of them, I suspect, left over from the TCP port connection test performed 
every 5 minutes.
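
For anyone who wants to reproduce that kind of count, here is a rough Python
sketch that groups connections on the receiver by peer and TCP state. It
assumes the usual "ss -tan" column layout (State, Recv-Q, Send-Q, Local
Address:Port, Peer Address:Port) and that ss prints the state as CLOSE-WAIT;
the function name count_states_by_peer is made up for illustration.

```python
from collections import Counter


def count_states_by_peer(ss_output, local_port):
    """Count connections per (peer host, TCP state) for one local port.

    ss_output is the text printed by `ss -tan`; the column layout is an
    assumption and lines that do not fit it are skipped.
    """
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header row and anything too short to hold both addresses.
        if len(fields) < 5 or fields[0] == "State":
            continue
        state, local, peer = fields[0], fields[3], fields[4]
        # The port is whatever follows the last colon (also works for IPv6).
        if local.rsplit(":", 1)[-1] != str(local_port):
            continue
        peer_host = peer.rsplit(":", 1)[0]
        counts[(peer_host, state)] += 1
    return counts


# Example use on the receiver, with 2514 being the imrelp port:
#
#   import subprocess
#   text = subprocess.run(["ss", "-tan"], capture_output=True, text=True).stdout
#   for (peer, state), n in count_states_by_peer(text, 2514).most_common():
#       print(f"{n:4d} {state:<12} {peer}")
```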

Log samples from the receiver system (which are related to port probes from our 
Nagios instance):

2020-08-19T10:05:01.279416-05:00 lincoln rsyslogd: -- MARK --
2020-08-19T10:05:08.249358-05:00 lincoln rsyslogd: imrelp[2514]: error 'server closed relp session, session broken', object  'lstn 2514: conn to clt 192.168.2.10/192.168.2.10' - input may not work as intended [v8.2006.0 try https://www.rsyslog.com/e/2353 ]
2020-08-19T10:05:08.249626-05:00 lincoln rsyslogd: imrelp[2514]: error 'error sending relp: Bad file descriptor', object  'lstn 2514: conn to clt 192.168.2.10/192.168.2.10' - input may not work as intended [v8.2006.0 try https://www.rsyslog.com/e/2353 ]
2020-08-19T10:08:08.020625-05:00 lincoln rsyslogd: imrelp[2514]: error 'server closed relp session, session broken', object  'lstn 2514: conn to clt 192.168.2.10/192.168.2.10' - input may not work as intended [v8.2006.0 try https://www.rsyslog.com/e/2353 ]
2020-08-19T10:08:08.021253-05:00 lincoln rsyslogd: imrelp[2514]: error 'error sending relp: Bad file descriptor', object  'lstn 2514: conn to clt 192.168.2.10/192.168.2.10' - input may not work as intended [v8.2006.0 try https://www.rsyslog.com/e/2353 ]
2020-08-19T10:11:08.074712-05:00 lincoln rsyslogd: imrelp[2514]: error 'server closed relp session, session broken', object  'lstn 2514: conn to clt 192.168.2.10/192.168.2.10' - input may not work as intended [v8.2006.0 try https://www.rsyslog.com/e/2353 ]

Log samples from the Nagios instance:

2020-08-19T11:19:53.444953-05:00 nagios rsyslogd: omrelp[lincoln.lib.auburn.edu:2514]: error 'error waiting on required session state, session broken', object  'conn to srvr lincoln.lib.auburn.edu:2514' - action may not work as intended [v8.2006.0 try https://www.rsyslog.com/e/2353 ]
2020-08-19T11:19:53.445260-05:00 nagios rsyslogd: omrelp[lincoln.lib.auburn.edu:2514]: error 'error opening connection to remote peer', object  'conn to srvr lincoln.lib.auburn.edu:2514' - action may not work as intended [v8.2006.0 try https://www.rsyslog.com/e/2353 ]

Is there a setting I can apply to rsyslog to help resolve this?

Is this a known bug?

We didn't have the issue with v8.2006.0 on our receiver when it was running 
Ubuntu 16.04 (the prior OS release), even though it made the same complaints 
about the TCP port probes from Nagios.

Thanks in advance.

_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.