Btw, I think the main part that Ciprian wants to point out is:
*2631.119110261:main_json_token queue[DA]:Reg/w0: main_json_token queue[DA]: DA queue is in emergency mode, disabling DA in parent
2631.119146744:main_json_token queue[DA]:Reg/w0: Called LogMsg, msg: fatal error on disk queue 'main_json_token queue[DA]', emergency switch to direct mode*

But I think the problem Ciprian is trying to point out here is this:

* there is a problem with disk queue
* but rsyslog doesn't report it
* Ciprian saw it *only* because he restarted rsyslog in *debug* mode.

I would think that if there is a problem with the on disk queue rsyslog would complain about it immediately and loudly, but I guess it doesn't? Small bug?

Thanks,
Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

On Fri, Aug 21, 2015 at 7:27 AM, Ciprian Hacman <[email protected] > wrote:

> I just started Rsyslog on debug to see if there is any issue, and found we have one, maybe this can help:
>
> 2631.117432280:main_json_token queue[DA]:Reg/w0: wti 0x1348150: worker starting
> 2631.117464099:main_json_token queue[DA]:Reg/w0: DeleteProcessedBatch: we deleted 0 objects and enqueued 0 objects
> 2631.117492596:main_json_token queue[DA]:Reg/w0: doDeleteBatch: delete batch from store, new sizes: log 4530, phys 4530
> 2631.117522634:main_json_token queue[DA]:Reg/w0: strm 0x134b970: file 19 read 0 bytes
> 2631.117551194:main_json_token queue[DA]:Reg/w0: strm 0x134b970: file 19 EOF
> 2631.117580484:main_json_token queue[DA]:Reg/w0: strm 0x134b970: file 19(json_and_token_action) closing
> 2631.117626778:main_json_token queue[DA]:Reg/w0: file '/mnt/rsyslog/queues/json_and_token_action.00000070' opened as #-1 with mode 384
> 2631.117668522:main_json_token queue[DA]:Reg/w0: strm 0x134b970: open error 2, file '/mnt/rsyslog/queues/json_and_token_action.00000070': No such file or directory
> 2631.117700793:main_json_token queue[DA]:Reg/w0: objDeserialize error -2040 during header processing - trying to recover
> 2631.117732844:main_json_token queue[DA]:Reg/w0: file '/mnt/rsyslog/queues/json_and_token_action.00000070' opened as #-1 with mode 384
> 2631.117764597:main_json_token queue[DA]:Reg/w0: strm 0x134b970: open error 2, file '/mnt/rsyslog/queues/json_and_token_action.00000070': No such file or directory
> 2631.117794460:main_json_token queue[DA]:Reg/w0: deserializer has possibly been able to re-sync and recover, state -2040
> 2631.117823674:main_json_token queue[DA]:Reg/w0: main_json_token queue[DA]: error -2040 dequeueing element - ignoring, but strange things may happen
> 2631.117852905:main_json_token queue[DA]:Reg/w0: main_json_token queue[DA]: got 'file not found' error -2040, queue defunct
> 2631.117882095:main_json_token queue[DA]:Reg/w0: strm 0x1349450: file 17(json_and_token_action) closing
> 2631.117911082:main_json_token queue[DA]:Reg/w0: strm 0x1349450: file 17(json_and_token_action) flush, buflen 0 (no need to flush)
> 2631.117942739:main_json_token queue[DA]:Reg/w0: strm 0x134b970: file -1(json_and_token_action) closing
> 2631.117974536:main_json_token queue[DA]:Reg/w0: strm 0x134a6e0: file 18(json_and_token_action) closing
> 2631.118004770:main_json_token queue[DA]:Reg/w0: strmCloseFile: deleting '/mnt/rsyslog/queues/json_and_token_action.00000069'
> 2631.119110261:main_json_token queue[DA]:Reg/w0: main_json_token queue[DA]: DA queue is in emergency mode, disabling DA in parent
> 2631.119146744:main_json_token queue[DA]:Reg/w0: Called LogMsg, msg: fatal error on disk queue 'main_json_token queue[DA]', emergency switch to direct mode
> 2631.119183273:main_json_token queue[DA]:Reg/w0: main Q: qqueueAdd: entry added, size now log 1, phys 9 entries
> 2631.119212733:main_json_token queue[DA]:Reg/w0: main Q: EnqueueMsg advised worker start
> rsyslogd: fatal error on disk queue 'main_json_token queue[DA]', emergency switch to direct mode [v8.12.0 try http://www.rsyslog.com/e/2040 ]
> 2631.119266265:main_json_token queue[DA]:Reg/w0: regular consumer finished, iret=-2183, szlog 0 sz phys 0
> 2631.119293213:main_json_token queue[DA]:Reg/w0: DDDD: wti 0x1348150: worker cleanup action instances
>
> The queue dir only contained a meta file, no data files:
>
> <OPB:1:qqueue:1:
> +iQueueSize:2:4:4530:
> +tVars.disk.sizeOnDisk:2:7:9517544:
> >End
> .
> <Obj:1:strm:1:
> +iCurrFNum:2:2:69:
> +pszFName:1:21:json_and_token_action:
> +iMaxFiles:2:8:10000000:
> +bDeleteOnClose:2:1:0:
> +sType:2:1:1:
> +tOperationsMode:2:1:2:
> +tOpenMode:2:3:384:
> +iCurrOffs:2:7:9517544:
> +inode:2:1:0:
> +bPrevWasNL:2:1:0:
> >End
> .
> <Obj:1:strm:1:
> +iCurrFNum:2:2:69:
> +pszFName:1:21:json_and_token_action:
> +iMaxFiles:2:8:10000000:
> +bDeleteOnClose:2:1:1:
> +sType:2:1:1:
> +tOperationsMode:2:1:1:
> +tOpenMode:2:3:384:
> +iCurrOffs:2:7:9517544:
> +inode:2:1:0:
> +bPrevWasNL:2:1:0:
> >End
> .
>
> Ciprian
>
> On Fri, Aug 21, 2015 at 12:58 PM, Ciprian Hacman <[email protected]> wrote:
>
> > Hi David,
> >
> > We see Rsyslog starting to use a lot of memory when it cannot send data to Elasticsearch.
> > We expected to see logs written to disk, but instead found a message on startup that looked like this:
> >
> >> fatal error on disk queue 'main_nojson queue[DA]', emergency switch to direct mode [v8.11.0 try http://www.rsyslog.com/e/2040 ]
> >
> > Permissions were correct on the queue files.
> >
> > Thanks,
> > Ciprian
> >
> > On Fri, Aug 21, 2015 at 12:34 PM, David Lang <[email protected]> wrote:
> >
> >> On Fri, 21 Aug 2015, Otis Gospodnetić wrote:
> >>
> >> Hi,
> >>>
> >>> Are there known situations where rsyslog disk queues can get corrupt?
> >>>
> >>> Sorry for such a high-level and open question, but we sometimes see disk queue corruption and I wanted to see if anyone else sees that in certain situation (e.g. when rsyslog is under pressure, when it is stopped in a certain way, when it runs out of memory, or something along these lines)?
> >>>
> >>
> >> Well, if it crashes as it's writing data, or if it's trying to flush the queue to disk on shutdown and gets killed by -9 while it's doing so I would expect problems (some distros send a kill -15, wait a bit and then do kill -9, if there's too much work to do in writing the memory queue to disk, rsyslog will be caught by this)
> >>
> >> Other than that, nothing specific to disk queues.
> >>
> >> what sort of corruption are you seeing?
> >>
> >> David Lang
> >> _______________________________________________
> >> rsyslog mailing list
> >> http://lists.adiscon.net/mailman/listinfo/rsyslog
> >> http://www.rsyslog.com/professional-services/
> >> What's up with rsyslog? Follow https://twitter.com/rgerhards
> >> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
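
PS: in case it helps anyone hitting the same thing, the mismatch in Ciprian's output (a meta file that still references data files which are no longer on disk) could probably be checked from outside rsyslog. Below is a minimal Python sketch of that idea. It assumes each property line in the meta file has the form `+name:type:length:value:` and that data files are named `<pszFName>.<8-digit number>`; both are inferred only from the dump and the paths in the debug log above, not from rsyslog documentation. The function names and the `SAMPLE` excerpt are made up for illustration.

```python
import os
import re

# Assumed property line shape, inferred from the dump: +name:type:length:value:
PROP = re.compile(r"^\+([^:]+):(\d+):(\d+):(.*)$")

# Shortened excerpt of the meta file quoted in the thread.
SAMPLE = """<OPB:1:qqueue:1:
+iQueueSize:2:4:4530:
>End
.
<Obj:1:strm:1:
+iCurrFNum:2:2:69:
+pszFName:1:21:json_and_token_action:
>End
.
"""


def parse_qi(text):
    """Return one dict per object block found in the meta file text."""
    objects = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("<"):
            # Header such as "<Obj:1:strm:1:"; the third field is taken
            # to be the object type (assumption based on the dump).
            fields = line[1:].split(":")
            current = {"_type": fields[2] if len(fields) > 2 else ""}
            objects.append(current)
        elif line.startswith("+") and current is not None:
            match = PROP.match(line)
            if match:
                name, _vtype, length, rest = match.groups()
                # Cut the value by its declared length, so a value that
                # contains ':' would still be read correctly.
                current[name] = rest[: int(length)]
    return objects


def referenced_files(objects):
    """Data file names implied by the stream objects (assumed naming)."""
    names = set()
    for obj in objects:
        if obj["_type"] == "strm" and "pszFName" in obj and "iCurrFNum" in obj:
            names.add("%s.%08d" % (obj["pszFName"], int(obj["iCurrFNum"])))
    return sorted(names)


def missing_files(queue_dir, objects):
    """Referenced data files that do not exist in queue_dir."""
    return [
        name
        for name in referenced_files(objects)
        if not os.path.exists(os.path.join(queue_dir, name))
    ]
```

Calling `missing_files("/mnt/rsyslog/queues", parse_qi(meta_text))`, with `meta_text` read from the meta file, would list the data files the meta file points at that are absent. A non-empty result would only be a hint that the queue state is inconsistent; it does not reproduce rsyslog's own recovery logic.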

