[ https://issues.apache.org/jira/browse/HADOOP-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764758#action_12764758 ]
Tsz Wo (Nicholas), SZE commented on HADOOP-6307:
------------------------------------------------

> Not sure why this issue only hits SequenceFile. The problem applies equally
> to TFile (although this was pushed to the caller).

This problem applies to any implementation which gets the un-closed file length by calling fs.getFileStatus(file).getLen(). (By "problem", I mean that the reader may not see all hflushed bytes. It sees some part of the file. This is the same behavior before append.)

I did not check TFile before. TFile does not have this problem if the caller manages to get the correct length and passes it to the TFile.Reader constructor.

> Support reading on un-closed SequenceFile
> -----------------------------------------
>
>                 Key: HADOOP-6307
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6307
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>            Reporter: Tsz Wo (Nicholas), SZE
>
> When a SequenceFile.Reader is constructed, it calls fs.getFileStatus(file).getLen(). However, fs.getFileStatus(file).getLen() does not return the hflushed length for an un-closed file, since the Namenode does not know the hflushed length. The DFSClient has to ask a datanode for the length of the last block which is being written; see also HDFS-570.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
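The stale-length behavior described above can be sketched with a toy model in plain Java (no Hadoop dependencies; the class and method names here are invented for illustration and are not Hadoop APIs). The point is that the "metadata" length a reader would obtain from the Namenode is only updated on close(), so a reader sized from metadata misses bytes that were hflushed but not yet closed:

```java
import java.io.ByteArrayOutputStream;

public class StaleLengthDemo {
    // Hypothetical stand-in for an HDFS file that is still being written.
    static class UnclosedFile {
        private final ByteArrayOutputStream data = new ByteArrayOutputStream();
        private long metadataLength = 0;   // what fs.getFileStatus(file).getLen() would report
        private boolean closed = false;

        // Bytes become readable on the datanodes after hflush...
        void hflush(byte[] bytes) {
            data.write(bytes, 0, bytes.length);
            // ...but the Namenode-side length is NOT updated here.
        }

        // Only on close does the "Namenode" learn the final length.
        void close() {
            metadataLength = data.size();
            closed = true;
        }

        long metadataLength() { return metadataLength; }       // stale while open
        long actualLength()   { return data.size(); }          // what a datanode would report
    }

    public static void main(String[] args) {
        UnclosedFile f = new UnclosedFile();
        f.hflush("hello world".getBytes());

        // A reader constructed from metadata would see 0 of the 11 hflushed bytes.
        System.out.println("metadata length = " + f.metadataLength()); // 0
        System.out.println("actual length   = " + f.actualLength());   // 11

        f.close();
        System.out.println("after close     = " + f.metadataLength()); // 11
    }
}
```

This mirrors why TFile escapes the problem when the caller supplies the length explicitly: its Reader constructor takes a fileLength argument instead of consulting the stale metadata itself, so a caller who obtains the hflushed length from a datanode (as in HDFS-570) can pass it in directly.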