[rsyslog] Does imfile-to-relp need DA queues?
Hi,

We're trying to confirm whether *imfile-to-relp* is already 100% reliable and needs no queue/DA, or whether, *in the event of a failure*, messages being processed (in memory) will be lost or skipped.

Would a configuration like this (without a DA queue in the ruleset) ensure that all file contents are sent to the RELP server? (That is, after an error, would it continue reading from the last truly processed line?)

    ruleset(
        name="relp"
    ) {
        action(
            name="omrelp"
            action.reportSuspension="on"
            action.reportSuspensionContinuation="on"
            action.resumeInterval="60"
            action.resumeRetryCount="-1"
            port="20514"
            target="server"
            type="omrelp"
        )
    }
    input(
        type="imfile"
        file="file1"
        tag="file1"
        ruleset="relp"
    )

Thanks

___
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Re: [rsyslog] Does imfile-to-relp need DA queues?
RELP just guards the transmission. Imfile reads from the file and puts data into the queue. So you need to guard the queue, which means that yes, DA is needed. Also be sure to set the saveonshutdown parameter.

HTH
Rainer

2017-10-03 10:43 GMT+02:00 mostolog--- via rsyslog:
> Would a configuration like this (without DA queue in ruleset) ensure all
> file contents are sent to relp server? (In case of error, it will continue
> reading from the last truly-processed line)
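For illustration, the disk-assisted (DA) queue Rainer describes could be attached to the ruleset from the original post roughly as follows. This is a minimal sketch, not a tested configuration: the work directory path, the queue file name `relp_q`, and the disk-space limit are assumptions made up for the example, and the imfile and omrelp modules are assumed to be loaded elsewhere.

```conf
# Assumption: directory where queue files are spooled; any writable path works.
global(workDirectory="/var/spool/rsyslog")

# An in-memory queue (LinkedList here) becomes disk-assisted once
# queue.filename is set. queue.saveOnShutdown asks rsyslog to write
# messages still in memory to disk on a clean shutdown.
# "relp_q" and "1g" are hypothetical values.
ruleset(
    name="relp"
    queue.type="LinkedList"
    queue.filename="relp_q"
    queue.maxDiskSpace="1g"
    queue.saveOnShutdown="on"
) {
    action(
        name="omrelp"
        type="omrelp"
        target="server"
        port="20514"
        action.resumeInterval="60"
        action.resumeRetryCount="-1"
    )
}
```

Note that this only protects messages across a clean shutdown or while the RELP target is unreachable; it does not by itself cover a crash or kill -9 while messages are still in memory.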
Re: [rsyslog] Does imfile-to-relp need DA queues?
And if the system crashes, or something issues a kill -9 to rsyslog, there will be no chance to save the memory queue, and some messages will be lost.

You would have to use a disk queue, or disable queues entirely (direct mode), to avoid any chance of log loss. This will cripple your performance (in a test several years ago, with a high-end PCI SSD, this resulted in a 1000x slowdown).

David Lang
Re: [rsyslog] Does imfile-to-relp need DA queues?
Thanks, Rainer and David.

About this: is there any option or planned feature to ensure a message has been delivered before marking it as processed? For example, telling the source that a message has been acknowledged only once the target has stored the event:

    source--->relp--->rsyslog--->kafka

For imfile, this would be "as easy as" not advancing the read cursor recorded in the file's state (devil in the details).

On 03/10/17 19:38, David Lang wrote:
> you would have to use a disk queue, or disable queues entirely (direct
> mode) to avoid any chance of log loss.
Re: [rsyslog] Does imfile-to-relp need DA queues?
2017-10-04 9:29 GMT+02:00 mostolog--- via rsyslog:
> About this: is there any option/planned feature to ensure message has been
> delivered before marking it as processed?
> [...]
> This would be "as easy as" not moving the reading cursor from files for
> imfile status. (devil in details)

As David said, you can achieve this by using the "direct" queue mode. Of course, performance is bad, but that's always the case if you go to half-duplex operation.

HTH
Rainer
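A rough sketch of what "direct" queue mode might look like for the configuration in the original post. It assumes the ruleset has no queue of its own and that the action queue is left at its default type (direct), so the main queue is the only asynchronous stage left to switch off. Treat it as illustrative rather than tested.

```conf
# Assumption: no other queue.* parameters are set on rulesets or actions,
# so switching the main queue to Direct removes the only buffering stage.
main_queue(queue.type="Direct")

ruleset(name="relp") {
    action(
        type="omrelp"
        target="server"
        port="20514"
        action.resumeRetryCount="-1"
    )
}

input(type="imfile" file="file1" tag="file1" ruleset="relp")
```

With no queue in between, the input has to wait for the action on every message, which is the half-duplex behaviour and the performance cost described above.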
Re: [rsyslog] Does imfile-to-relp need DA queues?
I was trying to make use of Kafka's redundancy.

As Kafka implements a safe mechanism (copying the message to multiple nodes), I was wondering if there's a relayed-delivery mode, i.e. rsyslog acks the message once Kafka has acked it.

On 04/10/17 09:31, Rainer Gerhards wrote:
> as David said, you can achieve this by using the "direct" queue mode.
> Of course, performance is bad, but that's always the case if you go to
> half-duplex operations.
Re: [rsyslog] Does imfile-to-relp need DA queues?
2017-10-04 9:34 GMT+02:00 mosto...@gmail.com:
> As Kafka implements a safe mechanisms (copying the message into multiple
> nodes), I was wondering if theres a relayed-delivery mode
> ie: rsyslog acks the message once kafka has ack'ed the message.

omkafka does this, but to do so it maintains a separate file of not-yet-acked messages.

Rainer
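For reference, omkafka has parameters for keeping messages that Kafka did not accept in a file and resubmitting them, which may correspond to the file Rainer mentions. The sketch below is an example under assumptions: the broker address, topic name, and file path are placeholders, and the rsyslog version that provides these parameters should be checked against the omkafka documentation.

```conf
module(load="omkafka")

# Placeholders (not from the thread): broker address, topic, file path.
action(
    type="omkafka"
    broker=["kafka1.example.com:9092"]
    topic="logs"
    resubmitOnFailure="on"
    keepFailedMessages="on"
    failedMsgFile="/var/spool/rsyslog/omkafka-failed.data"
)
```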
Re: [rsyslog] Does imfile-to-relp need DA queues?
As Kafka is a specific platform, which we wouldn't like "clients" to use, we were looking for something like this:

1. source reads file from current position
2. source sends to rsyslog *(RELP is standardized)*
3. rsyslog doesn't ack yet and forwards to Kafka
4. Kafka acks the message once properly saved
5. rsyslog acks the message to the source
6. source current position++
7. repeat from #1

Of course, Kafka could be replaced with any other reliable target.

I don't know if this makes sense to you.

On 04/10/17 09:36, Rainer Gerhards wrote:
> omkafka does this, but to do so it maintains a separate file of
> yet-unacked messages.
Re: [rsyslog] Does imfile-to-relp need DA queues?
You either need to use all direct queues or make a very considerable architecture change to the rsyslog core (a couple of months' worth of work, I think). But in the end, the architecture change would do exactly what queue direct mode does, so I do not even see any argument in favor of it.

Rainer

2017-10-04 9:43 GMT+02:00 mosto...@gmail.com:
> rsyslogs doesn't ack and forwards to kafka
> kafka acks the message once properly saved
> rsyslogs ack's to source the message
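On the relay side, "all direct queues" might be sketched as below. The ruleset name `tokafka`, the broker address, and the topic are hypothetical. The sketch only removes rsyslog's own queues between the RELP input and the Kafka action; whether the output module itself confirms delivery before returning is a separate question (Rainer notes earlier in the thread that omkafka keeps its own file of unacknowledged messages).

```conf
module(load="imrelp")
module(load="omkafka")

# Assumption: no ruleset or action queues are configured anywhere else.
main_queue(queue.type="Direct")

# "tokafka", the broker address, and the topic are placeholders.
ruleset(name="tokafka") {
    action(
        type="omkafka"
        broker=["kafka1.example.com:9092"]
        topic="logs"
    )
}

input(type="imrelp" port="20514" ruleset="tokafka")
```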
Re: [rsyslog] Does imfile-to-relp need DA queues?
On Wed, 4 Oct 2017, mostolog--- via rsyslog wrote:
> About this: is there any option/planned feature to ensure message has been
> delivered before marking it as processed?

That is what is done; the problem is that it's marked as processed on the queue. If the memory is lost (crash, kill -9), the status and even the existence of messages on the queues are lost.

rsyslog does:

    source --> queue --> action

(where --> can carry a single message or a batch of messages).

Unless the queue mode is direct, the source and action are asynchronous from each other. If you want each message processed individually, everything slows to a crawl.

With a disk-based queue (and some very paranoid settings) you can force each message read from the source to be committed to disk, with fsync called (on the file and the directory it's in), before it's marked as read on the input. The message can then be read from the disk and processed to the output (such as RELP) before it's marked as delivered on disk.

This is safe, but as I said, it resulted in a ~1000x slowdown several years ago (and disks are not much faster than that $5000 PCIe SSD I was using then). There is just a huge number of system calls and disk I/O to do (and don't use ext3: it resulted in a further slowdown, as it doesn't do fsync well).

By default, rsyslog is not that paranoid; keeping logs in RAM and flushing them to disk at shutdown is considered pretty severe protection of the logs. Normally we don't do fsync calls when writing things to disk; we assume that the system isn't going to crash before the OS writes things.

Rsyslog isn't intended to handle things like financial transactions that must not be lost at any cost. It's designed to handle system and application logs in a best-effort fashion (that best effort is pretty darn good, but we don't insist on perfection).

Does the application that writes the logs you intend to read do an fsync on the log file and the directory it's in every time it writes a log to the file? If not, you are at risk of losing logs in a system crash just from that. When it creates a new file, does it check whether it needs to sync the parent directory of the directory it's in (in case adding the file caused the directory to grow larger)?

And are you writing this to a RAID array? Does your controller card buffer any writes? Does it have a battery backup?

It's very easy to say that the requirement is zero logs lost, but when you really dive into the details of what that requires, it gets very messy (and expensive).

David Lang
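David does not list the "very paranoid settings" he used, so the following is only a guess at the kind of configuration meant: a pure disk queue on the ruleset, with queue files synced and queue state checkpointed after every message. The path and the queue file name are hypothetical, and the omrelp module is assumed to be loaded elsewhere.

```conf
# Assumption: spool directory for the on-disk queue.
global(workDirectory="/var/spool/rsyslog")

# queue.syncQueueFiles requests an fsync on queue file writes;
# queue.checkpointInterval="1" persists queue state after each message.
# Expect a severe throughput penalty with settings like these.
ruleset(
    name="relp"
    queue.type="Disk"
    queue.filename="relp_dq"
    queue.syncQueueFiles="on"
    queue.checkpointInterval="1"
) {
    action(
        type="omrelp"
        target="server"
        port="20514"
        action.resumeRetryCount="-1"
    )
}
```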
Re: [rsyslog] Does imfile-to-relp need DA queues?
Crystal clear.

The idea we had in mind because of the rsyslog issues with RELP (mentioned in another thread) is, after your and Rainer's explanations, completely off the table from now on.

Thanks a lot

On 04/10/17 10:23, David Lang wrote:
> It's very easy to say that the requirement is zero logs lost, but when you
> really dive into the details of what that requires, it gets very messy (and
> expensive)
Re: [rsyslog] Does imfile-to-relp need DA queues?
By the way, there was a RELP-related queue fix in the last week or so; check the git logs. It was marked as fixed just before you started posting your issue, so I don't know if they are at all related.

David Lang

On Wed, 4 Oct 2017 10:33:36 +0200, mosto...@gmail.com wrote:
> The idea which was in our minds due to rsyslog issues with RELP (mention on
> another thread) is, by these and Rainer words completely banned since now.
Re: [rsyslog] Does imfile-to-relp need DA queues?
One last datapoint on this topic.

If you are using spinning-rust drives, they are limited to 1 fsync per revolution, so a 5400 rpm drive has a theoretical limit of 90 fsyncs/sec (5400 / 60).

If you have multiple fsyncs per log (file, directory, mark completed, etc.), it's very easy to get down to the low tens of logs/sec.

The original syslog daemon did one fsync per log (it didn't do the directory, so there was still risk of loss), which is why it had a reputation of being way too slow to deal with anything important.

SSDs are faster, but some SSDs lie to the system about the write being safe.

Battery-backed RAID controllers or battery-backed RAM cards are the fastest way to get something into storage that will survive a system crash.

That PCI SSD I mentioned was able to support rsyslog processing a few thousand logs/sec (depending on the filesystem in use, it ranged from ~1k to ~10k logs/sec).

Reliable storage is _hard_. This is why database writes are so incredibly slow compared to writing to a file: it's not (primarily) SQL overhead or complexities in the database, it's all the steps that the database takes to give you the ACID data protection guarantees. All the NoSQL databases that are so much faster than traditional databases relax one or more of the ACID guarantees.

David Lang

On Wed, 4 Oct 2017 01:23:35 -0700 (PDT), David Lang wrote:
> This is safe, but as I said, it resulted in a ~1000x slowdown several years
> ago (and disks are not much faster than that $5000 PCIe SSD I was using
> then)