Thanks Rainer, I love your response times! Not sure if luck has much to do with it; we have a unit test that runs probably 30X per day and has done so for the past year with v7.4.5, and it's never failed on losing the first message. :) RELP isn't an option as the remote end doesn't understand it.

I added the setting, and it sort of worked, but not really. From what I can tell, this is what happens (debug output is below):

- reconnect happens
- last message is retried, dropped to main Q at 5261.342535097 (see below)
- main Q does what it is configured to, but the customerstreamserver ruleset is never invoked because the source is different.

Here's my problem; forgive me if I have asked about this before. (I have, in http://lists.adiscon.net/pipermail/rsyslog/2014-September/038469.html, which suffers from the same root cause.)

First a bit of background: I have rsyslog configured with two separate "streams" (infra and customer). The two should never be mixed, and they have two completely different destinations (custom log analyzers) even though they come in through the same rsyslog instance. In configuration:

ruleset(name="customerstreamserver") {
    action(type="omfwd" target="host1" etc)
    stop
}
input(type="imudp" port="514" ruleset="customerstreamserver")
input(type="imtcp" port="514" ruleset="customerstreamserver")

ruleset(name="infrastream") {
    action(type="omfwd" target="host2" etc)
}
*.* call infrastream

The problem: when the message gets resent, it drops back to the main Q, and the customerstreamserver ruleset is never invoked, so from the customer stream's perspective the message is lost. On some nodes I could set up a filter that assumes a message belongs to the customer stream if its source is not localhost, but I can't do that everywhere. Short of setting up a second rsyslog instance, I don't know what else to do. Any ideas?

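For reference, the localhost filter mentioned above could look roughly like the sketch below. It is untested and rests on assumptions: that it sits in the default ruleset ahead of the infrastream call, and that locally generated messages report $fromhost-ip as 127.0.0.1 on the node in question. It only covers nodes where every non-local message really is customer stream.

```
# Sketch only: hypothetical fallback for the default ruleset.
# Assumes locally generated messages carry $fromhost-ip 127.0.0.1,
# so anything else is treated as customer stream.
if $fromhost-ip != "127.0.0.1" then {
    call customerstreamserver
    stop
}
```

The stop mirrors the one inside the customerstreamserver ruleset, so that a rerouted message would not also continue on to infrastream.
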
Thanks
mike

5261.341759885:customerstreamserver queue:Reg/w0: wti 0x1663880: worker awoke from idle processing
5261.341764634:customerstreamserver queue:Reg/w0: DeleteProcessedBatch: we deleted 0 objects and enqueued 0 objects
5261.341768824:customerstreamserver queue:Reg/w0: doDeleteBatch: delete batch from store, new sizes: log 1, phys 1
5261.341775528:customerstreamserver queue:Reg/w0: action 16 is transactional - executing in commit phase
5261.341779439:customerstreamserver queue:Reg/w0: omfwd: beginTransaction
5261.341783071:customerstreamserver queue:Reg/w0: localhost
5261.341786982:customerstreamserver queue:Reg/w0: Action 16 transitioned to state: itx
5261.341790893:customerstreamserver queue:Reg/w0: doTransaction: have commitTransaction IF, using that, pWrkrInfo 0x1664060
5261.341794804:customerstreamserver queue:Reg/w0: entering actionCallCommitTransaction(), state: itx, actionNbr 16, nMsgs 1
5261.341798436:customerstreamserver queue:Reg/w0: localhost
5261.341802347:customerstreamserver queue:Reg/w0: localhost:5544/tcp
5261.341806816:customerstreamserver queue:Reg/w0: omfwd: add 224 bytes to send buffer (curr offs 0)
5261.341811007:customerstreamserver queue:Reg/w0: omfwd: endTransaction, offsSndBuf 224, iRet -2121
5261.341817152:customerstreamserver queue:Reg/w0: CheckConnection detected broken connection - closing it
5261.341864643:customerstreamserver queue:Reg/w0: TCPSendBuf error -2027, destruct TCP Connection!
5261.341872744:customerstreamserver queue:Reg/w0: file netstrms.c released module 'lmnsd_ptcp', reference count now 2
5261.341877214:customerstreamserver queue:Reg/w0: Action 16 transitioned to state: rtry
5261.341883639:customerstreamserver queue:Reg/w0: actionDoRetry: customerstreamserver enter loop, iRetries=0
5261.341887829:customerstreamserver queue:Reg/w0: DDDD: tryResume: pWrkrData 0x7f75e80022b0
5261.341895093:customerstreamserver queue:Reg/w0: localhost
5261.341899004:customerstreamserver queue:Reg/w0: TCPSendInit CREATE
5261.341904870:customerstreamserver queue:Reg/w0: source file netstrms.c requested reference for module 'lmnsd_ptcp', reference count now 3
5261.342513028:customerstreamserver queue:Reg/w0: actionDoRetry: customerstreamserver action->tryResume returned 0
5261.342530069:customerstreamserver queue:Reg/w0: actionDoRetry: customerstreamserver had success RDY again (iRet=0)
5261.342535097:customerstreamserver queue:Reg/w0: Called LogMsg, msg: action 'customerstreamserver' resumed (module 'builtin:omfwd')
5261.342560518:main Q:Reg/w0 : customerstreamserver queue: EnqueueMsg advised worker start
5261.342594600:main Q:Reg/w0 : STOP
5261.342601304:main Q:Reg/w0 : END batch execution phase, entering to commit phase
5261.342723104:customerstreamserver queue:Reg/w0: main Q: qqueueAdd: entry added, size now log 1, phys 2 entries
5261.342732881:customerstreamserver queue:Reg/w0: main Q: EnqueueMsg advised worker start
5261.342737351:customerstreamserver queue:Reg/w0: Action 16 transitioned to state: rdy
5261.342741262:customerstreamserver queue:Reg/w0: omfwd: beginTransaction
5261.342744893:customerstreamserver queue:Reg/w0: localhost
5261.342748246:customerstreamserver queue:Reg/w0: Action 16 transitioned to state: itx
5261.342757185:customerstreamserver queue:Reg/w0: Action 16 transitioned to state: rdy
5261.343034585:customerstreamserver queue:Reg/w0: actionCommit, in retry loop, iRet 0
5261.343054140:customerstreamserver queue:Reg/w0: regular consumer finished, iret=0, szlog 0 sz phys 1
5261.343067549:customerstreamserver queue:Reg/w0: DeleteProcessedBatch: we deleted 1 objects and enqueued 0 objects
5261.343080120:customerstreamserver queue:Reg/w0: doDeleteBatch: delete batch from store, new sizes: log 0, phys 0
5261.343092412:customerstreamserver queue:Reg/w0: regular consumer finished, iret=4, szlog 0 sz phys 0
5261.343104704:customerstreamserver queue:Reg/w0: customerstreamserver queue:Reg/w0: worker IDLE, waiting for work.

--
Michael Hart
Arctic Wolf Networks
M: 226-388-4773

On 2014-10-17, 12:27, "Rainer Gerhards" <[email protected]> wrote:

>2014-10-17 18:22 GMT+02:00 Michael Hart <[email protected]>:
>
>> I'm running rsyslog 8.4.2 on ubuntu 12.04 (freshly compiled). I'm losing
>> the first log line after a tcp connection is reset, and can replicate it
>> with these steps. The same config with version 7.4.5 worked flawlessly:
>>
>
>You were simply lucky in 7.4.5. See here:
>
>www.rsyslog.com/doc/v8-stable/configuration/modules/omfwd.html
>
>under "ResendLastMSGOnReconnect" and be sure to follow the links quoted there.
>
>A short answer is that you may want to use RELP to avoid this problem.
>
>Rainer
>
>>
>> Test case:
>>
>> * rsyslogd 8.4.2 running with omfwd configured to send to
>>   localhost:5544 over tcp.
>> * netcat listening to port 5544 (nc -l 5544)
>> * start rsyslogd
>> * send message. Successfully sent, netcat displays it.
>> * stop netcat.
>> * netcat sends a FIN, rsyslog ACK's it. Connection goes into
>>   CLOSE_WAIT state forever (well until I send another message and force
>>   rsyslog to reconnect, but > 10 minutes if I wait)
>> * start netcat.
>> * send another message.
>> * rsyslog sends a FIN packet, netcat acks. Then rsyslog builds the
>>   TCP connection but sends nothing. Message is lost forever.
>> * send another message, goes through successfully (as does every
>>   subsequent message)
>>
>> The output from rsyslogd in debug mode shows this sequence of events
>> (customerstreamserver is the name of my action)
>> 1858.496126807:customerstreamserver queue:Reg/w0: omfwd: add 224 bytes to send buffer (curr offs 0)
>> 1858.496130718:customerstreamserver queue:Reg/w0: omfwd: endTransaction, offsSndBuf 224, iRet -2121
>> 1858.496137143:customerstreamserver queue:Reg/w0: CheckConnection detected broken connection - closing it
>> 1858.496287716:customerstreamserver queue:Reg/w0: TCPSendBuf error -2027, destruct TCP Connection!
>> 1858.496300567:customerstreamserver queue:Reg/w0: file netstrms.c released module 'lmnsd_ptcp', reference count now 2
>> 1858.496305316:customerstreamserver queue:Reg/w0: Action 16 transitioned to state: rtry
>> 1858.496312020:customerstreamserver queue:Reg/w0: actionDoRetry: customerstreamserver enter loop, iRetries=0
>> 1858.496315931:customerstreamserver queue:Reg/w0: DDDD: tryResume: pWrkrData 0x7f83fc0022b0
>> 1858.496319563:customerstreamserver queue:Reg/w0: localhost
>> 1858.496323753:customerstreamserver queue:Reg/w0: TCPSendInit CREATE
>> 1858.496329620:customerstreamserver queue:Reg/w0: source file netstrms.c requested reference for module 'lmnsd_ptcp', reference count now 3
>> 1858.496668479:customerstreamserver queue:Reg/w0: actionDoRetry: customerstreamserver action->tryResume returned 0
>> 1858.496673786:customerstreamserver queue:Reg/w0: actionDoRetry: customerstreamserver had success RDY again (iRet=0)
>> 1858.496684123:customerstreamserver queue:Reg/w0: Called LogMsg, msg: action 'customerstreamserver' resumed (module 'builtin:omfwd')
>> 1858.496695017:customerstreamserver queue:Reg/w0: main Q: qqueueAdd: entry added, size now log 1, phys 1 entries
>>
>> Config for the action is as follows:
>>
>> ruleset(name="customerstreamserver") {
>>     #ignore template stuff
>>     action(type="omfwd"
>>         name="customerstreamserver"
>>         target="localhost"
>>         port="5544"
>>         protocol="tcp"
>>         template="LongTagForwardFormat"
>>         queue.filename="customerstreamserver"
>>         queue.maxdiskspace="2147483648"
>>         queue.saveonshutdown="on"
>>         queue.type="LinkedList"
>>         queue.size="4000000"
>>         queue.highwatermark="1000000"
>>         queue.lowwatermark="900000"
>>         queue.discardmark="3600000"
>>         queue.discardseverity="4"
>>         queue.timeoutenqueue="0"
>>         action.resumeretrycount="-1"
>>         action.resumeinterval="1"
>>         action.reportSuspension="on"
>>         action.reportSuspensionContinuation="on"
>>     )
>>     stop
>> }
>> input(type="imudp" port="1514" ruleset="customerstreamserver")
>> input(type="imtcp" port="1514" ruleset="customerstreamserver")
>>
>> Any ideas?
>>
>> Thanks
>> mike
>> --
>> Michael Hart
>> Arctic Wolf Networks
>> M: 226-388-4773
>>
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com/professional-services/
>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>> DON'T LIKE THAT.

