Re: [rsyslog] rsyslog queue subsystem - refactor or redesign?

Rainer Gerhards Fri, 02 Nov 2012 10:14:57 -0700

> -----Original Message-----
> From: [email protected] [mailto:rsyslog-
> [email protected]] On Behalf Of [email protected]
> Sent: Wednesday, October 31, 2012 10:20 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] rsyslog queue subsystem - refactor or redesign?
> 
> On Wed, 31 Oct 2012, Rainer Gerhards wrote:
> 
> looks good so far, the open/close things I was remembering could easily
> have been for the queue state file.
> 
> A couple of thoughts.
> 
> disk space is really cheap compared to memory, so optimizing for disk
> space is probably not the best option. Moving to something of a fixed
> size (or even a multiple of a fixed size) and wasting some space could
> result in big wins. The fact that block aligned 4K writes tend to be
> atomic could be an added win.
> 
> the rest of the main file layout seems reasonable, but I would add a
> binary (or encoded binary to make the resulting file human readable)
> header on each message with pointers to where all the standard
> properties start to make reading it in more efficient. For nonstandard
> properties, a table of pointers <pointer to property name> <pointer to
> property contents> would work.


Let's save the above for a bit - will come back to it ;)
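
For reference, the record layout suggested in the quoted text could be sketched roughly as follows. This is only an illustration, not rsyslog's on-disk format: the 4096-byte block size, the set of "standard" properties and all function names are assumptions made up for the example.

```python
import struct

BLOCK = 4096  # assumed block size; each record is padded to a multiple of it
STD_PROPS = ("timestamp", "hostname", "tag", "msg")  # hypothetical property set

# Header: total record length, number of extra properties,
# then one (offset, length) pair per standard property.
HEAD = struct.Struct("<II" + "II" * len(STD_PROPS))
# Table entry: name offset, name length, value offset, value length.
PAIR = struct.Struct("<IIII")


def encode(std, extra):
    """Serialize one message into a block-aligned record."""
    pos = HEAD.size + PAIR.size * len(extra)  # first byte of property data
    data = bytearray()
    std_ptrs = []
    for name in STD_PROPS:
        raw = std[name].encode("utf-8")
        std_ptrs += [pos + len(data), len(raw)]
        data += raw
    table = bytearray()
    for name, value in extra.items():
        n, v = name.encode("utf-8"), value.encode("utf-8")
        start = pos + len(data)
        table += PAIR.pack(start, len(n), start + len(n), len(v))
        data += n + v
    used = pos + len(data)
    total = -(-used // BLOCK) * BLOCK  # round up to whole blocks
    header = HEAD.pack(total, len(extra), *std_ptrs)
    return bytes(header + table + data).ljust(total, b"\0")


def read_std(record, name):
    """Fetch one standard property by jumping straight to its offset."""
    fields = HEAD.unpack_from(record, 0)
    i = 2 + 2 * STD_PROPS.index(name)
    off, length = fields[i], fields[i + 1]
    return record[off:off + length].decode("utf-8")


def read_extra(record):
    """Walk the pointer table for the nonstandard properties."""
    count = HEAD.unpack_from(record, 0)[1]
    out = {}
    for k in range(count):
        n_off, n_len, v_off, v_len = PAIR.unpack_from(
            record, HEAD.size + k * PAIR.size)
        key = record[n_off:n_off + n_len].decode("utf-8")
        out[key] = record[v_off:v_off + v_len].decode("utf-8")
    return out
```

The point of the header is that a reader can pick out a single property without parsing the whole message; the padding trades disk space for block-aligned writes.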

> Given what you are describing, I think the biggest win is going to be
> in changing the handling of the state file.
> 
> open, write, close, fsync is overkill. you should be able to do write,
> fsync (or a memory mapped file and fdatasync to sync just a range of
> the file since the state is small).

I did some tests today, just to prove that things work the way I thought. And 
indeed, with default settings the state file is not written at all. HOWEVER, if 

$MainMsgQueueCheckpointInterval

is used, this changes. If you set it to 1 (as actually needed for the 
ultra-reliable use case), we have lots of open/close/sync calls, actually up to 
three times per message processed (message writer, reader, deleter). This is 
definitely very expensive.

But again, this is only the case if config params are very specifically set. By 
default, I don't see much open/close activity.

In any case, a smart first step would probably be to see that the state file is 
opened/closed less often. I'll look into that and hope (but do not yet know for 
sure) that this is something that can be changed quickly. However, there is a 
nagging feeling that this does not really help the common case...
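
A minimal sketch of what "opened/closed less often" could look like: the state file is opened once, and each checkpoint only overwrites it in place and syncs it. This is not rsyslog code; the StateFile class and the three-counter state layout are hypothetical, chosen only to mirror the writer/reader/deleter positions mentioned above.

```python
import os
import struct

# Hypothetical state layout: write, read and delete positions.
STATE = struct.Struct("<QQQ")

# fdatasync is not available on every platform; fall back to fsync.
_sync = getattr(os, "fdatasync", os.fsync)


class StateFile:
    """Keeps the state file open and overwrites it in place per checkpoint."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        # Fix the size once and sync it with a full fsync; later
        # checkpoints never change the file size.
        os.ftruncate(self.fd, STATE.size)
        os.fsync(self.fd)

    def checkpoint(self, write_pos, read_pos, delete_pos):
        # One positional write plus one data sync, no open/close.
        os.pwrite(self.fd, STATE.pack(write_pos, read_pos, delete_pos), 0)
        _sync(self.fd)

    def load(self):
        return STATE.unpack(os.pread(self.fd, STATE.size, 0))

    def close(self):
        os.close(self.fd)
```

Each checkpoint still pays for one sync, so this removes the open/close overhead but not the cost of forcing the data to disk.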

> Also, I would make the max file size be 1G or so rather than 10M

Well, 10M is just the (quite conservative) default size. It stems back to the 
idea that rsyslog runs on many systems, including low-end ones. With the system 
journal, we have new priorities and can probably claim that this use case is no 
longer important - and thus use a different -higher- default. But again, 10M is 
just a default value and can easily be changed...

> At the possible cost of spawning lots of threads, the syncs for the
> state file may be worth spawning a new thread for so that you can
> continue processing while the sync is pending (as per the linux-kernel
> discussion), or, if you are running on an ext3/4 filesystem, specify
> that the state file should be journaled.

That is definitely interesting, but I think it is something to be done after all 
other optimizations.
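
To illustrate the idea of continuing to process while a sync is pending, here is a rough sketch. It uses a single long-lived worker thread instead of spawning a new thread per sync, which is a simplification of the suggestion in the quoted text; the AsyncSyncer class and its methods are hypothetical names, not part of rsyslog.

```python
import os
import queue
import threading


class AsyncSyncer:
    """Runs fsync() calls on a worker thread so the caller is not blocked."""

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            item = self._requests.get()
            if item is None:  # stop marker
                break
            fd, done = item
            os.fsync(fd)      # blocks only the worker thread
            done.set()

    def request_sync(self, fd):
        """Queue a sync and return an Event that is set once it finished."""
        done = threading.Event()
        self._requests.put((fd, done))
        return done

    def stop(self):
        self._requests.put(None)
        self._thread.join()
```

The caller can keep processing after request_sync(), but a message should only be treated as safely stored once the returned event has been set.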

> 
> 
> I keep having the nagging question in the back of my mind asking if the
> two use cases (exceptions and reliability) are really the same. They
> are very similar, but there are also differences.
> 
They are definitely not the same. Actually, the current design was mostly 
focused on the exception case, and we added the reliability settings as an 
extra bonus and extended that in v5. In v5, it would have been much better to 
write a new subsystem, but at that time we could not afford that... :-(

> In the Exception case, you don't have memory to deal with these log
> messages, and you 'know' that you aren't going to read them back or
> process them for a while.
> 
> In the Reliability case, the normal situation is that you aren't going
> to crash and the data on disk is going to be 'write-only' and then
> deleted.
> 
> In this second case, I'm wondering if something more along the lines of
> creating a memory mapped file for the existing memory array structure
> and then doing fdatasync() calls to force syncing of the data to disk
> could be the right answer. This would be doing all the real processing
> from memory (unless there is a crash) with just a stream of writes to
> the disk.
> 
> 
> Adding encryption to the existing file I/O model probably isn't that
> hard, adding it to the memmapped version would be a lot harder.

This again is very interesting. I have been tempted for quite a while to offer an 
"extra reliability" queue mode in addition to the others, but simply had no 
time to do that. Maybe it would be a good idea to do this as part of the 
redesign. I would actually tackle the exception case first, and see how far we 
can get there.
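
The memory-mapped variant from the quoted text might look roughly like the sketch below. It is an assumption-laden illustration, not a design for rsyslog: the slot size, the slot count and the MappedQueue class are made up, there is no full-queue check, and Python's mmap.flush() on Unix is implemented with msync() on the given range rather than the fdatasync() mentioned above.

```python
import mmap
import os

SLOT = mmap.PAGESIZE  # assumed: one fixed-size, page-aligned slot per message
SLOTS = 64            # hypothetical queue capacity


class MappedQueue:
    """In-memory ring of slots backed by a memory-mapped file."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        os.ftruncate(self.fd, SLOT * SLOTS)
        self.map = mmap.mmap(self.fd, SLOT * SLOTS)
        self.head = 0

    def enqueue(self, msg):
        if len(msg) > SLOT - 4:
            raise ValueError("message does not fit into one slot")
        # No full-queue check: old slots are simply overwritten.
        off = (self.head % SLOTS) * SLOT
        self.map[off:off + 4] = len(msg).to_bytes(4, "little")
        self.map[off + 4:off + 4 + len(msg)] = msg
        # Force only the touched slot to disk, not the whole mapping.
        self.map.flush(off, SLOT)
        self.head += 1
        return off

    def read(self, off):
        n = int.from_bytes(self.map[off:off + 4], "little")
        return bytes(self.map[off + 4:off + 4 + n])

    def close(self):
        self.map.close()
        os.close(self.fd)
```

All normal processing reads from the mapping in memory; the file is only read back after a restart.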

Rainer
> 
> David Lang
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST
> if you DON'T LIKE THAT.
