[jira] Commented: (HADOOP-4379) In HDFS, sync() not yet guarantees data available to the new readers

Jim Kellerman (JIRA) Mon, 19 Jan 2009 14:33:22 -0800

    [ https://issues.apache.org/jira/browse/HADOOP-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665257#action_12665257 ]


Jim Kellerman commented on HADOOP-4379:
---------------------------------------

HBase really needs visibility into partial blocks, and restarting HDFS is not 
an option for a 100+ node cluster.

We don't sync after every record. Instead, we sync after a configurable 
number of writes, or once a configurable amount of time has passed while 
there is unsynced data.
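
For illustration, the policy described above could be sketched roughly as 
follows. This is a minimal sketch, not HBase's actual code: the class name 
SyncPolicy, its methods, and both thresholds are hypothetical, the sync 
action is passed in as a plain Runnable instead of a real HDFS stream, and 
the timer is assumed to measure time since the last sync.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper: triggers a sync after a configurable number of
 * writes, or after a configurable time has elapsed with unsynced data.
 */
class SyncPolicy {
    private final int maxUnsyncedWrites;   // assumed config: write-count threshold
    private final long maxUnsyncedMillis;  // assumed config: time threshold
    private final LongSupplier clock;      // injected clock, e.g. System::currentTimeMillis
    private final Runnable syncAction;     // stand-in for the real sync() call

    private int unsyncedWrites = 0;
    private long lastSyncMillis;

    SyncPolicy(int maxUnsyncedWrites, long maxUnsyncedMillis,
               LongSupplier clock, Runnable syncAction) {
        this.maxUnsyncedWrites = maxUnsyncedWrites;
        this.maxUnsyncedMillis = maxUnsyncedMillis;
        this.clock = clock;
        this.syncAction = syncAction;
        this.lastSyncMillis = clock.getAsLong();
    }

    /** Call after each record is written; syncs once the count threshold is hit. */
    synchronized void recordWrite() {
        unsyncedWrites++;
        if (unsyncedWrites >= maxUnsyncedWrites) {
            doSync();
        }
    }

    /** Call periodically; syncs only if time has elapsed AND data is unsynced. */
    synchronized void tick() {
        long elapsed = clock.getAsLong() - lastSyncMillis;
        if (unsyncedWrites > 0 && elapsed >= maxUnsyncedMillis) {
            doSync();
        }
    }

    synchronized int unsyncedWrites() {
        return unsyncedWrites;
    }

    private void doSync() {
        syncAction.run();
        unsyncedWrites = 0;
        lastSyncMillis = clock.getAsLong();
    }
}
```

A writer would call recordWrite() after each append and have a background 
thread call tick() at some interval. Everything written since the last sync 
is exactly the data that new readers need to be able to see.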

This really needs to be addressed in 0.19.1, 0.20.1 and trunk.

> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4379
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4379
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: dhruba borthakur
>             Fix For: 0.19.1
>
>         Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt
>
>
> In the append design doc 
> (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it 
> says
> * A reader is guaranteed to be able to read data that was 'flushed' before 
> the reader opened the file
> However, this feature is not yet implemented.  Note that the operation 
> 'flushed' is now called "sync".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
