Hi guys,

Would this script ( http://git.adiscon.com/?p=rsyslog.git;a=blob;f=tools/recover_qi.pl;h=4e2cf9d561fb06bf0efdca73b90d7875a7b4102d;hb=HEAD ) be the one you mentioned?
So I would have to stop rsyslog and run this script, so that at the next
startup rsyslog notices all the queue files and starts processing them again?

For the record, we dealt with the TCP load-balancing issue with
rebindinterval (after upgrading to 7.4.10).

Regards

On 9 July 2015 at 20:49, David Lang <[email protected]> wrote:

> On Thu, 9 Jul 2015, Nicolas Guyomar wrote:
>
>> If there was such a script I could launch when rsyslog gets stuck, while
>> waiting for our V8 upgrade, that would be perfect!
>>
>> Yes, I'm sending logs directly to logstash using omrelp; we have 40
>> rsyslog instances sending logs to 3 logstash instances through a TCP
>> load balancer.
>> I do not know if logstash is the bottleneck here, because our log output
>> rate is pretty stable and the problem occurs from time to time (not
>> because of bursts, as I thought previously).
>
> Configure impstats and look at what the queue sizes look like over time.
> They really should be near zero unless something is wrong.
>
>> It looks like rsyslog is "kind of losing" its connection to logstash
>> after a certain time of inactivity. I can see (using netstat) that
>> rsyslog is "connected" to logstash over a TCP connection on port 5001,
>> but no messages are sent over this socket.
>
> "Losing" its connection but still showing as connected is an indication
> of a firewall/NAT device/load balancer timing the connection out.
>
>> Could it have something to do with tcprebindinterval, mentioned here?
>> www.rsyslog.com/load-balancing-for-rsyslog/
>> I've just discovered this parameter in the doc.
>> I did some archeology in the ML, and I think the V5 omrelp module does
>> not support tcprebindinterval (based on this thread:
>> http://lists.adiscon.net/pipermail/rsyslog/2013-October/034451.html )
>
> If you are sending logs through a load balancer using TCP, you have the
> problem that the load balancer is only able to shift connections when the
> TCP connection is established (which is before it has any idea how much
> traffic is sent).
>
> The rebindinterval feature was implemented so that the senders will
> disconnect and reconnect every X messages, giving the load balancer a
> chance to adjust the load again.
>
> David Lang
>
>> Thank you all for doing this community work!
>>
>> On 8 July 2015 at 19:17, Rainer Gerhards <[email protected]> wrote:
>>
>>> 2015-07-08 19:12 GMT+02:00 David Lang <[email protected]>:
>>>
>>>> If you cannot lose any logs, then when you run out of disk space and
>>>> memory queue space your systems will stop working and you won't even
>>>> be able to log in to them (because doing so attempts to create a log
>>>> entry).
>>>>
>>>> Also, using Disk Assisted Queues means that you have some log entries
>>>> on disk and others in memory. When you restart rsyslog, the ones in
>>>> memory are going to be lost (because they can't be written to disk).
>>>>
>>>> So the biggest thing you need to do is to look at where the logs are
>>>> going and try to make that fast enough to keep up. If you are
>>>> delivering logs to logstash, what is logstash doing with them (sending
>>>> them to ElasticSearch would be my guess, but are they manipulated
>>>> first or sent elsewhere as well?) Rsyslog may be able to deliver
>>>> directly to those destinations, bypassing the bottleneck of logstash.
>>>>
>>>> Yes, there are ways to get rsyslog to read the queue files. IIRC there
>>>> is a utility that will create the qi files so that rsyslog will notice
>>>> them the next time it starts. I'd have to hunt the list archives to
>>>> find references to how to do it.
>>>
>>> Off the top of my head: it's a perl script, I think in ./tools ...
>>> qi_recover.pl? If not, search for *.pl
>>>
>>> Rainer
>>>
>>>> David Lang
>>>>
>>>> On Wed, 8 Jul 2015, Nicolas Guyomar wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> Unfortunately I cannot lose any logs, because they are all
>>>>> access.log entries with the same severity, so I can't just say
>>>>> "discard the lower level".
>>>>>
>>>>> Let's say my problem could be solved by upgrading to the latest
>>>>> Rsyslog stable version (planned for this summer): can I "replay" the
>>>>> logs flushed into queue files in my work directory?
>>>>>
>>>>> When my rsyslog V5 instance is stuck with its
>>>>> ActionQueueMaxDiskSpace reached, restarting has no effect. I'd like
>>>>> to maybe save the queue files to some other directory, and later copy
>>>>> the old queue files back into the work directory so that Rsyslog
>>>>> sends them to logstash.
>>>>>
>>>>> On 30 June 2015 at 11:41, Radu Gheorghe <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Nicolas,
>>>>>>
>>>>>> Unfortunately, I'm not aware of any specific issue here. But there
>>>>>> are some options regarding discarding messages when the queue
>>>>>> exceeds a certain size (look for DiscardMark and DiscardSeverity):
>>>>>>
>>>>>> http://www.rsyslog.com/doc/v5-stable/configuration/action/index.html#action-queue-specific-configuration-statements
>>>>>>
>>>>>> Maybe you can find a workaround that way (I assume you can discard
>>>>>> everything after you hit a certain limit).
>>>>>>
>>>>>> Best regards,
>>>>>> Radu
>>>>>>
>>>>>> --
>>>>>> Performance Monitoring * Log Analytics * Search Analytics
>>>>>> Solr & Elasticsearch Support * http://sematext.com/
>>>>>>
>>>>>> On Tue, Jun 30, 2015 at 12:32 PM, Nicolas Guyomar
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> The upgrade to V8 is planned for this year, but in the meantime I
>>>>>>> thought I could find a way to maybe discard messages when rsyslog
>>>>>>> has no more space left in its work directory (which see.
>>>>>>> Losing some messages during burst periods is acceptable for us,
>>>>>>> but being forced to manually delete and restart is more
>>>>>>> complicated.
>>>>>>>
>>>>>>> I hoped the problem was some sort of misconfiguration on my side,
>>>>>>> or maybe a known issue using omrelp with the logstash relp input.
>>>>>>>
>>>>>>> On 30 June 2015 at 09:55, Radu Gheorghe
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Nicolas,
>>>>>>>>
>>>>>>>> I have some vague memories about nasty bugs in disk-assisted
>>>>>>>> queues that were fixed in the last few years. RELP modules surely
>>>>>>>> have changed as well.
>>>>>>>>
>>>>>>>> Can you try with the latest stable (8.10 I think) and see if it
>>>>>>>> helps? Even if it doesn't, I'm pretty sure the fix will come in
>>>>>>>> the 8.x branch because it sounds pretty serious.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Radu
>>>>>>>>
>>>>>>>> On Tue, Jun 30, 2015 at 10:42 AM, Nicolas Guyomar
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> I've got a simple question on disk assisted queue behaviour. It
>>>>>>>>> could be trivial, but I can't find an answer on the internet.
>>>>>>>>>
>>>>>>>>> I'm using rsyslog V5 to forward nginx access logs to some
>>>>>>>>> logstash instances using omrelp.
>>>>>>>>>
>>>>>>>>> Sometimes, because of an activity burst, rsyslog flushes 200
>>>>>>>>> files of 1 MB each onto disk (which is the expected behaviour),
>>>>>>>>> but then stays stuck: no more messages are sent to logstash.
>>>>>>>>> I have to delete the state files as well as the queue files so
>>>>>>>>> that rsyslog starts sending messages to logstash again.
>>>>>>>>> Restarting rsyslog without deleting those files has no effect.
>>>>>>>>>
>>>>>>>>> Here is my rsyslog config in case I missed something:
>>>>>>>>>
>>>>>>>>> $RuleSet nginxRuleSet
>>>>>>>>> $RulesetCreateMainQueue on
>>>>>>>>>
>>>>>>>>> $WorkDirectory /tmp
>>>>>>>>> $ActionQueueFileName queue-nginx
>>>>>>>>> $ActionQueueMaxDiskSpace 200m
>>>>>>>>> $ActionQueueSaveOnShutdown on
>>>>>>>>> $ActionQueueType LinkedList
>>>>>>>>> $ActionQueueSize 540000
>>>>>>>>> $ActionResumeRetryCount -1
>>>>>>>>> *.* :omrelp:<%= @lblog %>:5001;erableTmpl
>>>>>>>>> & ~
>>>>>>>>>
>>>>>>>>> $InputUDPServerBindRuleset nginxRuleSet
>>>>>>>>> $UDPServerRun 515
>>>>>>>>>
>>>>>>>>> Is it a known behaviour?
>>>>>>>>>
>>>>>>>>> Thank you for any help one could provide
>>>>>>>>>
>>>>>>>>> Nicolas
>>>>>>>>> _______________________________________________
>>>>>>>>> rsyslog mailing list
>>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>>>>> http://www.rsyslog.com/professional-services/
>>>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by
>>>>>>>>> a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO
>>>>>>>>> NOT POST if you DON'T LIKE THAT.
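A note on the impstats suggestion made in the thread: the fragment below is only a minimal sketch of how periodic queue statistics could be switched on, written in the same legacy directive syntax as the configuration quoted above. The 60-second interval and the output file path are illustrative assumptions, not values taken from the thread.

```
# Sketch only: interval and file path are assumptions for illustration.
# Load the statistics module and emit counters periodically.
$ModLoad impstats
# Emit statistics every 60 seconds (assumed value).
$PStatInterval 60
# Log the statistics messages at debug severity (7).
$PStatSeverity 7

# impstats uses the "syslog" facility by default, so this selector
# collects the statistics lines in a dedicated file (assumed path).
syslog.=debug /var/log/rsyslog-stats
```

With something like this in place, the per-queue size counter can be watched over time; as stated in the thread, it should normally stay near zero.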

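A note on the rebindinterval fix mentioned at the top of this message: the sketch below shows roughly what such an action might look like in the action() syntax available in rsyslog 7.x. It is an assumption-laden illustration, not the poster's actual configuration: the target host name is hypothetical, the value 10000 is arbitrary, and the rebindinterval parameter name should be checked against the omrelp documentation for the installed version. The queue settings simply mirror the legacy directives quoted in the thread.

```
# Sketch only: target host and rebind value are hypothetical.
module(load="omrelp")

action(type="omrelp"
       target="logstash-lb.example.com"
       port="5001"
       template="erableTmpl"
       # Drop and re-establish the connection every 10000 messages so
       # that the load balancer can redistribute senders (assumed value).
       rebindinterval="10000"
       # Disk-assisted queue, mirroring the legacy settings in the thread.
       queue.type="LinkedList"
       queue.filename="queue-nginx"
       queue.maxdiskspace="200m"
       queue.saveonshutdown="on"
       action.resumeRetryCount="-1")
```

The reasoning given in the thread is that a TCP load balancer can only place a connection when it is established, so forcing a periodic reconnect gives it a chance to rebalance.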
