[ 
https://issues.apache.org/jira/browse/FLUME-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840452#comment-13840452
 ] 

Hari Shreedharan commented on FLUME-2181:
-----------------------------------------

1. Yes, that is correct. I am adding a flush at the end of each commit (sorry 
if I was not clear).
2. We have exactly one sequential writer per file, so every write issued 
before a sync call gets fsynced to disk (we can't dirty the first half of a 
file after we fsync the second half, since all writes are sequential). Yes, it 
is possible that the OS flushes the pages for the second half before the pages 
for the first half. So we will need to seek to each offset, read the buffer, 
try to parse it, and see if it makes sense. If it does not, we assume that the 
event was not fully synced. I forgot that the files are pre-allocated, so yes, 
seeking and parsing an event to see if it is corrupt seems to be the only way 
around it.
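
To illustrate the seek-and-parse check described in point 2, here is a minimal 
sketch in Java. It is not the File Channel's actual on-disk format or code: 
the record layout (4-byte length, payload, 4-byte CRC32), the class name 
RecordCheck, and both method names are assumptions made up for this example, 
and a byte array stands in for the pre-allocated log file.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical record layout, for illustration only:
//   [int length][payload bytes][int crc32(payload)]
final class RecordCheck {

    // Builds one record in the hypothetical layout above.
    static byte[] encode(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        ByteBuffer buf = ByteBuffer.allocate(4 + payload.length + 4);
        buf.putInt(payload.length);
        buf.put(payload);
        buf.putInt((int) crc.getValue());
        return buf.array();
    }

    // Returns true only if a complete, checksum-valid record starts at offset.
    // "file" stands in for the contents of a pre-allocated log file.
    static boolean isIntact(byte[] file, int offset) {
        // Need room for at least the length field and the checksum field.
        if (offset < 0 || offset > file.length - 8) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(file);
        buf.position(offset);
        int len = buf.getInt();
        // A pre-allocated but unwritten region is all zeros, so a zero length
        // must be rejected; otherwise empty space would parse as a record.
        if (len <= 0 || len > buf.remaining() - 4) {
            return false;
        }
        byte[] payload = new byte[len];
        buf.get(payload);
        int stored = buf.getInt();
        CRC32 crc = new CRC32();
        crc.update(payload, 0, len);
        return stored == (int) crc.getValue();
    }
}
```

Under these assumptions, replay would call isIntact at each recorded offset 
and treat any record that fails the check as not fully synced. A checksum is 
only one way to decide that a parsed event "makes sense"; a format-level 
parse failure would serve the same purpose.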

> Optionally disable File Channel fsyncs 
> ---------------------------------------
>
>                 Key: FLUME-2181
>                 URL: https://issues.apache.org/jira/browse/FLUME-2181
>             Project: Flume
>          Issue Type: Improvement
>            Reporter: Hari Shreedharan
>            Assignee: Hari Shreedharan
>         Attachments: FLUME-2181.patch
>
>
> This will give File Channel performance a big boost, at the cost of possible 
> data loss if a crash happens between checkpoints. 
> We should also make it configurable, defaulting to false. If the user does 
> not mind slight inconsistencies, this feature can be explicitly enabled 
> through configuration. If it is not configured, the behavior will be 
> exactly as it is now.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
