[ 
https://issues.apache.org/jira/browse/HADOOP-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667760#action_12667760
 ] 

dhruba borthakur commented on HADOOP-4379:
------------------------------------------

Hi Doug,

The exception is expected. Please look at the test file Reader.java that I 
attached to this JIRA. It shows how the reader waits for lease recovery to 
finish (to ensure that the correct file size is recorded on the namenode). 
Please let me know if this approach is suitable for your application.
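
As a rough illustration of that kind of wait (this is not the attached 
Reader.java, whose contents are not reproduced here), the sketch below retries 
an operation until it stops throwing IOException. The class name 
LeaseRecoveryWait, the method retryUntilReady, and the attempt/sleep 
parameters are all hypothetical; in a real reader the operation would be the 
open or length check that fails while lease recovery is still in progress.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Hadoop and not the attached Reader.java.
public class LeaseRecoveryWait {

    /**
     * Calls op repeatedly until it returns without an IOException,
     * sleeping sleepMillis between attempts. If every attempt fails,
     * the last IOException is rethrown.
     */
    public static <T> T retryUntilReady(Callable<T> op, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                // Assumed to mean "recovery still running"; remember it and retry.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        throw last;
    }
}
```

A caller would wrap whatever read or open call raises the expected exception 
and pick a retry budget suited to how long recovery may take.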

We could enhance FileSystem.getFileStatus() to contact the datanode and 
retrieve and return the most current file length (only for files that have a 
concurrent writer). This will not have any performance impact for most 
map-reduce applications.
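
The proposal can be pictured with a small sketch. Everything here is an 
assumption made for illustration: the class CurrentLength, the method 
resolveLength, and the LongSupplier standing in for a datanode query do not 
exist in Hadoop. The point is only that the extra call is made when a file 
has a concurrent writer and is skipped otherwise.

```java
import java.util.function.LongSupplier;

// Hypothetical illustration of the proposed behavior; not a Hadoop API.
public class CurrentLength {

    /**
     * Returns the length to report for a file.
     *
     * namenodeLength  length currently recorded on the namenode
     * hasWriter       true if the file has a concurrent writer
     * datanodeLength  stand-in for asking a datanode for the latest length
     */
    public static long resolveLength(long namenodeLength,
                                     boolean hasWriter,
                                     LongSupplier datanodeLength) {
        if (!hasWriter) {
            // Closed file: the namenode value is final, no datanode call is made.
            return namenodeLength;
        }
        // File still being written: use the datanode's more current value.
        return datanodeLength.getAsLong();
    }
}
```

Because the supplier is invoked only on the hasWriter path, readers of closed 
files pay no additional cost in this sketch.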

> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4379
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4379
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: dhruba borthakur
>             Fix For: 0.19.1
>
>         Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt, 
> fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch, Reader.java, 
> Reader.java, Writer.java, Writer.java
>
>
> In the append design doc 
> (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it 
> says
> * A reader is guaranteed to be able to read data that was 'flushed' before 
> the reader opened the file
> However, this feature is not yet implemented.  Note that the operation 
> 'flushed' is now called "sync".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
