On Wed, 31 Oct 2012, Rainer Gerhards wrote:

Looks good so far. The open/close things I was remembering could easily have been for the queue state file.

A couple of thoughts.

Disk space is really cheap compared to memory, so optimizing for disk space is probably not the best option. Moving to something of a fixed size (or even a multiple of a fixed size) and wasting some space could result in big wins. The fact that block-aligned 4K writes tend to be atomic could be an added win.
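
As a rough illustration of the fixed-size idea, here is a minimal Python sketch that pads every record to a multiple of 4K. The names (`BLOCK`, `encode_record`, `decode_record`) and the 4-byte length prefix are assumptions made for the example, not anything rsyslog does today.

```python
import struct

BLOCK = 4096                  # assumed block size; the real value depends on the filesystem
_LEN = struct.Struct("<I")    # hypothetical 4-byte little-endian length prefix


def encode_record(msg: bytes, block: int = BLOCK) -> bytes:
    """Prefix msg with its length and pad with NUL bytes to a multiple of block."""
    body = _LEN.pack(len(msg)) + msg
    padded_len = -(-len(body) // block) * block   # ceiling division
    return body.ljust(padded_len, b"\0")


def decode_record(buf: bytes) -> bytes:
    """Recover the original message from a padded record."""
    (n,) = _LEN.unpack_from(buf, 0)
    return buf[_LEN.size:_LEN.size + n]
```

The padding wastes space on short messages, which is the trade being proposed: every record then starts and ends on a block boundary.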

The rest of the main file layout seems reasonable, but I would add a binary header (or an encoded binary one, to keep the resulting file human readable) on each message with pointers to where all the standard properties start, to make reading it in more efficient. For nonstandard properties, a table of pointer pairs <pointer to property name> <pointer to property contents> would work.
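
A minimal sketch of such a header, in Python for brevity. The property list in `STANDARD`, the 32-bit offsets, and the function names are all hypothetical; the point is only that a reader can jump straight to one property without parsing the others.

```python
import struct

# Hypothetical set of standard properties; the real list would come from rsyslog.
STANDARD = ("timestamp", "hostname", "tag", "msg")
# One 32-bit offset per property, plus one final offset marking the end of the last one.
_HDR = struct.Struct("<%dI" % (len(STANDARD) + 1))


def pack_message(props: dict) -> bytes:
    """Serialize the standard properties behind a header of byte offsets."""
    values = [props[name].encode("utf-8") for name in STANDARD]
    offsets, pos = [], _HDR.size
    for v in values:
        offsets.append(pos)
        pos += len(v)
    offsets.append(pos)  # end of the last property
    return _HDR.pack(*offsets) + b"".join(values)


def read_property(buf: bytes, name: str) -> str:
    """Read one property using only its start offset and the next offset."""
    i = STANDARD.index(name)
    start, end = struct.unpack_from("<2I", buf, i * 4)
    return buf[start:end].decode("utf-8")
```

The table for nonstandard properties could follow the same pattern, with each entry holding a pair of offsets (name, contents) instead of one.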

Given what you are describing, I think the biggest win is going to be in changing the handling of the state file.

Open, write, close, fsync is overkill. You should be able to keep the file open and just do write, fsync (or use a memory-mapped file and fdatasync to sync just a range of the file, since the state is small).
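
A minimal sketch of the write-then-fsync variant, assuming a POSIX-like system. The `StateFile` class is made up for illustration; on Linux `os.fdatasync` could be substituted for `os.fsync`.

```python
import os


class StateFile:
    """Keeps one descriptor open and rewrites the state in place."""

    def __init__(self, path):
        # Opened once; no open/close per update.
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)

    def save(self, state: bytes):
        os.lseek(self.fd, 0, os.SEEK_SET)   # rewind to the start of the file
        os.write(self.fd, state)            # overwrite the previous state
        os.ftruncate(self.fd, len(state))   # drop leftovers from a longer old state
        os.fsync(self.fd)                   # force the data to stable storage

    def close(self):
        os.close(self.fd)
```

Each update then costs one write and one sync, with no open or close in the hot path.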

Also, I would make the max file size 1G or so rather than 10M.

At the possible cost of spawning lots of threads, it may be worth doing the syncs for the state file in a new thread so that you can continue processing while the sync is pending (as per the linux-kernel discussion). Or, if you are running on an ext3/4 filesystem, specify that the state file should be journaled.
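
One way to sketch the background-sync idea in Python. Note that this uses a single long-lived worker thread fed by a queue rather than spawning a thread per sync, which is a swapped-in simplification; `AsyncSyncer` and its methods are hypothetical names.

```python
import os
import queue
import threading


class AsyncSyncer:
    """Single worker thread that fsyncs file descriptors handed to it."""

    def __init__(self):
        self._q = queue.Queue()
        self._t = threading.Thread(target=self._run, daemon=True)
        self._t.start()

    def _run(self):
        while True:
            item = self._q.get()
            if item is None:      # sentinel: shut down
                break
            fd, done = item
            os.fsync(fd)          # blocks only this worker thread
            done.set()            # signal that the data is on disk

    def request_sync(self, fd):
        """Queue a sync and return an Event that is set once it completes."""
        done = threading.Event()
        self._q.put((fd, done))
        return done

    def stop(self):
        self._q.put(None)
        self._t.join()
```

The caller keeps processing after `request_sync()` and only waits on the returned event at the point where it actually needs the durability guarantee.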


I keep having the nagging question in the back of my mind asking if the two use cases (exceptions and reliability) are really the same. They are very similar, but there are also differences.

In the Exception case, you don't have memory to deal with these log messages, and you 'know' that you aren't going to read them back or process them for a while.

In the Reliability case, the normal situation is that you aren't going to crash and the data on disk is going to be 'write-only' and then deleted.

In this second case, I'm wondering if something more along the lines of creating a memory mapped file for the existing memory array structure and then doing fdatasync() calls to force syncing of the data to disk could be the right answer. This would be doing all the real processing from memory (unless there is a crash) with just a stream of writes to the disk.
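
A minimal sketch of the memory-mapped approach, again in Python. It uses `mmap.flush(offset, size)` in place of the fdatasync() call mentioned above, since that is the range-based sync Python exposes for a mapping; the function names and the flat byte-array layout are assumptions for the example.

```python
import mmap
import os


def open_mapped_queue(path, size):
    """Create (or open) a file of the given size and map it into memory."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)  # the mmap object keeps its own duplicate of the descriptor


def store(mm, offset, data):
    """Write data into the mapping and sync only the pages it touches."""
    mm[offset:offset + len(data)] = data
    gran = mmap.ALLOCATIONGRANULARITY
    start = (offset // gran) * gran          # flush offset must be aligned
    mm.flush(start, offset + len(data) - start)
```

All normal processing reads and writes `mm` as ordinary memory; the file is only read back after a crash.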


Adding encryption to the existing file I/O model probably isn't that hard; adding it to the memory-mapped version would be a lot harder.

David Lang