[ https://issues.apache.org/jira/browse/HADOOP-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764758#action_12764758 ]

Tsz Wo (Nicholas), SZE commented on HADOOP-6307:
------------------------------------------------

> Not sure why this issue only hits SequenceFile. The problem applies equally 
> to TFile (although this was pushed to the caller).

This problem applies to any implementation that gets the length of an un-closed 
file by calling fs.getFileStatus(file).getLen().  (By "problem", I mean that the 
reader may not see all hflushed bytes; it sees only part of the file.  This is 
the same behavior as before append.)  I had not checked TFile before.  TFile does 
not have this problem if the caller manages to get the correct length and passes 
it to the TFile.Reader constructor.
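The stale-length behavior can be simulated with plain Java on a local 
filesystem (no HDFS involved; the class and method names below are illustrative 
only, not Hadoop APIs): a reader that caches the file length at open time never 
sees bytes flushed afterward.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class StaleLengthDemo {

    // Simulates a reader that caches the file length at open time, the way
    // SequenceFile.Reader caches fs.getFileStatus(file).getLen().  Bytes
    // flushed after the reader opens fall beyond the cached length and are
    // never read.  (Local-filesystem simulation; hypothetical names.)
    static long[] demo() throws IOException {
        Path p = Files.createTempFile("seqfile", ".dat");
        try (OutputStream out = Files.newOutputStream(p)) {
            out.write("record-1\n".getBytes());  // 9 bytes, then "hflush"
            out.flush();

            long cachedLen = Files.size(p);      // reader opens here: sees 9

            out.write("record-2\n".getBytes());  // 9 more flushed bytes
            out.flush();

            long actualLen = Files.size(p);      // file really has 18 bytes
            return new long[] { cachedLen, actualLen };
        } finally {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] r = demo();
        System.out.println("cached length = " + r[0]
                + ", actual length = " + r[1]);
    }
}
```

On HDFS the gap is worse: the local filesystem at least reports the true 
length on demand, while the Namenode does not know the hflushed length of the 
last block at all, which is why the correct length must be obtained from a 
datanode and passed in explicitly.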

> Support reading on un-closed SequenceFile
> -----------------------------------------
>
>                 Key: HADOOP-6307
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6307
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>            Reporter: Tsz Wo (Nicholas), SZE
>
> When a SequenceFile.Reader is constructed, it calls 
> fs.getFileStatus(file).getLen().  However, fs.getFileStatus(file).getLen() 
> does not return the hflushed length for an un-closed file, since the Namenode 
> does not know the hflushed length.  DFSClient has to ask a datanode for the 
> length of the last block, which is being written; see also HDFS-570.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.