[ https://issues.apache.org/jira/browse/FLUME-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840452#comment-13840452 ]
Hari Shreedharan commented on FLUME-2181:
-----------------------------------------

1. Yes, that is correct. I am adding a flush at the end of each commit (sorry if I was not clear).

2. We actually have exactly one sequential writer to each file, so all writes before a sync call get fsynced to disk (we can't make the first half of a file dirty after we fsync the second half, since all writes are sequential). Yes, it is possible that the OS flushes the pages corresponding to the second half before flushing the ones corresponding to the first half. So we will actually need to seek to each offset, read the buffer, try to parse it, and see if it makes sense. If it does not, then we assume that the event was not fully synced. I forgot that the files are pre-allocated, so yes, seeking to and parsing an event to see if it is corrupt seems to be the only way around it.

> Optionally disable File Channel fsyncs
> ---------------------------------------
>
>                 Key: FLUME-2181
>                 URL: https://issues.apache.org/jira/browse/FLUME-2181
>             Project: Flume
>          Issue Type: Improvement
>            Reporter: Hari Shreedharan
>            Assignee: Hari Shreedharan
>         Attachments: FLUME-2181.patch
>
>
> This will give File Channel performance a big boost, at the cost of possible
> data loss if a crash happens between checkpoints.
> Also we should make it configurable, with the default set to false. If the user
> does not mind slight inconsistencies, this feature can be explicitly enabled
> through configuration. So if it is not configured, then the behavior will be
> exactly as it is now.
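
For illustration, the seek-and-parse check described in the comment could look roughly like the sketch below. It is not taken from the FLUME-2181 patch. The record layout (a 4-byte length, the payload, then an 8-byte CRC32 of the payload), the class name RecordCheckSketch, and the helpers writeRecord and isRecordIntact are all assumptions made for this example, and Flume's real on-disk format differs. The idea is only that a record in a pre-allocated file is trusted if it parses and its checksum matches.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class RecordCheckSketch {
    // Hypothetical layout, NOT Flume's format: [int length][payload][long CRC32].
    static final int MAX_RECORD_BYTES = 1 << 20;

    static long checksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    /** Writes one record at the given offset and returns the offset just past it. */
    static long writeRecord(RandomAccessFile file, long offset, byte[] payload)
            throws IOException {
        file.seek(offset);
        file.writeInt(payload.length);
        file.write(payload);
        file.writeLong(checksum(payload));
        return file.getFilePointer();
    }

    /** Seeks to the offset, tries to parse a record, and reports whether it looks complete. */
    static boolean isRecordIntact(RandomAccessFile file, long offset) throws IOException {
        long fileLength = file.length();
        if (offset < 0 || offset + 4 > fileLength) {
            return false;
        }
        file.seek(offset);
        int length = file.readInt();
        // A zero length is what an untouched, zero-filled region would contain.
        if (length <= 0 || length > MAX_RECORD_BYTES) {
            return false;
        }
        if (offset + 4 + (long) length + 8 > fileLength) {
            return false;
        }
        byte[] payload = new byte[length];
        file.readFully(payload);
        long storedChecksum = file.readLong();
        return storedChecksum == checksum(payload);
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("record-check", ".log");
        tmp.deleteOnExit();
        try (RandomAccessFile file = new RandomAccessFile(tmp, "rw")) {
            file.write(new byte[4096]); // mimic a pre-allocated, zero-filled file
            long second = writeRecord(file, 0, "event-1".getBytes(StandardCharsets.UTF_8));
            long end = writeRecord(file, second, "event-2".getBytes(StandardCharsets.UTF_8));
            // Simulate pages of the second record never reaching disk.
            file.seek(second + 4);
            file.write(new byte[3]);
            System.out.println(isRecordIntact(file, 0));      // true
            System.out.println(isRecordIntact(file, second)); // false
            System.out.println(isRecordIntact(file, end));    // false
        }
    }
}
```

A checksum is used here because a length field alone cannot tell a fully written record from one whose later pages were never flushed. Whether the real format carries enough information for such a check is a separate question that this sketch does not answer.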