[ https://issues.apache.org/jira/browse/KAFKA-77?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481086#comment-13481086 ]
Dave Revell commented on KAFKA-77:
----------------------------------

> This is not really a good idea post 0.8 as we no longer have much dependence
> on the disk flush.

Jay, would you mind explaining a bit more? Is there a new feature in Kafka >0.8 that improves durability without the need for disk flushes? Or is there perhaps a new feature that decreases the performance penalty of flushing after every message?

> Implement "group commit" for kafka logs
> ---------------------------------------
>
>                 Key: KAFKA-77
>                 URL: https://issues.apache.org/jira/browse/KAFKA-77
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.7
>            Reporter: Jay Kreps
>            Assignee: Jay Kreps
>             Fix For: 0.8
>
>         Attachments: kafka-group-commit.patch
>
>
> The most expensive operation for the server is usually going to be the
> fsync() call to sync data in a log to disk; if you don't flush, your data is
> at greater risk of being lost in a crash. Currently we give two knobs to tune
> this trade-off: log.flush.interval and log.default.flush.interval.ms (no idea
> why one has default and the other doesn't since they are both defaults).
> However, if you flush frequently, say on every write, then performance is not
> that great.
> One trick that can be used to improve this worst case of continual flushes is
> to allow a single fsync() to be used for multiple writes that occur at the
> same time. This is a lot like "group commit" in databases. It is unclear
> which cases this would improve and by how much, but it might be worth a try.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
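
For readers unfamiliar with the "group commit" idea in the issue description, here is a minimal Java sketch of it: several writes are appended to the file, and one fsync() then makes all of them durable together. This is not the attached kafka-group-commit.patch and not Kafka's actual log code; the class GroupCommitLog and its methods append(), sync() and syncCount() are hypothetical names made up for illustration. The only real API used is java.nio's FileChannel, whose force() plays the role of fsync().

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration of group commit; not Kafka's implementation.
public class GroupCommitLog implements AutoCloseable {
    private final FileChannel channel;
    // Futures for writes that have been appended but not yet fsynced.
    private final List<CompletableFuture<Void>> waiting = new ArrayList<>();
    private long syncCount = 0;

    public GroupCommitLog(Path path) throws IOException {
        channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    // Append a message without forcing it to disk. The returned future
    // completes only after a later sync() has covered this write.
    public synchronized CompletableFuture<Void> append(byte[] message) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(message);
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        CompletableFuture<Void> durable = new CompletableFuture<>();
        waiting.add(durable);
        return durable;
    }

    // Issue a single fsync for every write appended since the previous sync
    // and return how many writes that one fsync covered.
    public synchronized int sync() throws IOException {
        if (waiting.isEmpty()) {
            return 0; // nothing new to flush, so skip the expensive call
        }
        channel.force(true); // one fsync shared by the whole group
        syncCount++;
        int covered = waiting.size();
        for (CompletableFuture<Void> f : waiting) {
            f.complete(null);
        }
        waiting.clear();
        return covered;
    }

    public synchronized long syncCount() {
        return syncCount;
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

In a real server, sync() would presumably be driven by a background thread or by the first writer that needs durability, so that writes arriving while an fsync is in flight are picked up by the next one. Whether this helps, and by how much, is the open question the issue itself raises.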