[jira] Commented: (HADOOP-4379) In HDFS, sync() not yet guarantees data available to the new readers

dhruba borthakur (JIRA) Mon, 26 Jan 2009 17:11:21 -0800

    [ https://issues.apache.org/jira/browse/HADOOP-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667535#action_12667535 ]


dhruba borthakur commented on HADOOP-4379:
------------------------------------------

Hi Doug,

The file length (as returned by getFileStatus) does not change on every write 
from the client to the datanode. Similarly, not every fsync call from the 
client reaches the namenode (only the first one per block does). That means 
the namenode has no good way to know the size of a block while a writer is 
still writing to it.

In your case, the writer has died. The namenode waits for a timeout of 1 hour 
before it starts lease recovery for this file. The lease recovery process sets 
the correct file size in the namenode metadata. If you do not want to wait for 
one hour, you can trigger lease recovery manually from your application by 
trying to reopen the file for append (please use FileSystem.append(pathname)). 
Lease recovery will then update the namenode metadata with the true length of 
the file.
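
A rough sketch of what "reopen the file for append" could look like as a retry 
loop. So that it compiles without Hadoop on the classpath, it uses a 
hypothetical AppendableFs interface as a stand-in for the 
FileSystem.append(pathname) call mentioned above. The names AppendableFs, 
LeaseRecoverySketch, triggerLeaseRecovery, maxAttempts and sleepMillis are all 
made up for illustration. It also assumes that append fails with an 
IOException while the dead writer's lease is still being recovered. Please 
verify that against your Hadoop version.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical stand-in for the single FileSystem call this sketch needs. */
interface AppendableFs {
    // With Hadoop, an implementation would delegate to FileSystem.append(...)
    // and return the stream it hands back.
    Closeable append(String pathname) throws IOException;
}

final class LeaseRecoverySketch {
    private LeaseRecoverySketch() {}

    /**
     * Repeatedly tries to reopen pathname for append.
     * Returns the number of attempts used, or -1 if no attempt succeeded
     * (or the thread was interrupted while waiting).
     */
    static int triggerLeaseRecovery(AppendableFs fs, String pathname,
                                    int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // We do not want to write anything; a successful open is
                // enough, so close the stream right away.
                fs.append(pathname).close();
                return attempt;
            } catch (IOException e) {
                // Assumption: failure means the old lease is not released yet.
                if (attempt == maxAttempts) {
                    break;
                }
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // keep the interrupt flag
                    return -1;
                }
            }
        }
        return -1;
    }
}
```

Once the call returns a positive attempt count, a reader should be able to 
call getFileStatus and see the length that lease recovery recorded.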



> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4379
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4379
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: dhruba borthakur
>             Fix For: 0.19.1
>
>         Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt, 
> fsyncConcurrentReaders3.patch, Reader.java, Writer.java
>
>
> In the append design doc 
> (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it 
> says
> * A reader is guaranteed to be able to read data that was 'flushed' before 
> the reader opened the file
> However, this feature is not yet implemented.  Note that the operation 
> 'flushed' is now called "sync".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
