Ah ok, well that makes sense then - the behavior I'm experiencing is exactly 
like that. 

Without knowing the internals of rsyslog, I was just surprised that I could 
lose messages when having two actions configured with RELP. 
I was sure that the failover and 'action.execOnlyWhenPreviousIsSuspended' would 
take care of it. 

But I understand I was wrong in that assumption. 

So then I fall back to this: is the only way to not lose messages when Log 
Server A reboots/goes down/etc. to *always* send my logs to both Log Servers 
at the same time?

PS. I actually tried setting windowSize to 1 just to see if fewer messages 
were "lost", but I didn't notice any difference. 
The setting is supposed to be on the first action, right?

Thanks for the deep dive into rsyslog internals and RELP.

Patrik Martinsson - Linux System Administrator
Mäster Samuelsgatan 17, 111 44 Stockholm, Sweden
 E [email protected]
W www.trioptima.com
YT Genuine Happiness !

Please see the disclaimer: www.trioptima.com/email


________________________________________
From: rsyslog <[email protected]> on behalf of David Lang via 
rsyslog <[email protected]>
Sent: Thursday, September 5, 2019 12:23 PM
To: Rainer Gerhards
Cc: David Lang; rsyslog-users
Subject: Re: [rsyslog] Making sure I understand execOnlyWhenPreviousIsSuspended 
correctly,

On Thu, 5 Sep 2019, Rainer Gerhards wrote:

>> I suspect that the omrelp module is keeping some messages in its memory
>> before it suspends, but I'd need Rainer to comment on that and what can be
>> done there.
>
> Depends on config and when the OS notifies us. There is the RELP
> window size (basically the same as TCP window). This may fill if we do
> not get an OS error quickly enough. If detecting broken session
> immediately is ultra-vital, the RELP window can be set to 1, but be
> aware that it has performance consequences. Depending on network, they
> may be severe.
>
> doc: 
> https://www.rsyslog.com/doc/v8-stable/configuration/modules/omrelp.html#windowsize

rephrasing to be sure I understand what's happening here.

RELP takes the messages off of the queue and keeps track of them internally
(the window is essentially a mini queue inside RELP that has a configurable size).

When the connection is lost and the RELP output is suspended, these messages are
stuck inside RELP. If the connection is re-established, they will be sent, but
there is currently no way to get them saved to disk or otherwise back out of
RELP.
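
As a rough illustration of the window behaviour described above, here is a toy
model in Python. It is only a sketch under stated assumptions: it is not
rsyslog or librelp code, the function name `simulate` and its parameters are
made up for this example, and the receiver is assumed to acknowledge one
message at a time, in order.

```python
from collections import deque


def simulate(total_messages, window_size, fail_after_acks):
    """Toy model of a send window (hypothetical, not rsyslog/librelp code).

    Returns (acked, stuck_in_window, still_queued) at the moment the
    connection is lost, or when everything has been delivered.
    """
    queue = deque(range(total_messages))  # stands in for the action queue
    window = deque()                      # sent but not yet acknowledged
    acked = []

    while queue or window:
        # Fill the window: these messages have now left the queue.
        while queue and len(window) < window_size:
            window.append(queue.popleft())

        if len(acked) >= fail_after_acks:
            # Connection lost: unacknowledged messages stay in the window.
            break

        # Assumption: the receiver acknowledges the oldest message first.
        acked.append(window.popleft())

    return acked, list(window), list(queue)


if __name__ == "__main__":
    for size in (4, 1):
        acked, stuck, queued = simulate(10, size, fail_after_acks=3)
        print(f"window={size}: acked={acked} stuck={stuck} queued={queued}")
```

In this model a window of 4 leaves four messages stuck when the link drops
after three acknowledgements, while a window of 1 leaves only one. The
messages still in the queue are the ones a failover action could pick up.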

I'm assuming that it would take fairly significant surgery to change this (as a
brainstorming idea, could the internal window queue be replaced by calls to the
'normal' queue code, enabling disk assisted queues, save on shutdown, etc.?)

I suspect that this is a common problem with all the output modules that
support delayed delivery (elasticsearch, kafka, probably not the database
outputs since they do atomic commits), am I right?

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.