Reading consistency for all readers 
------------------------------------

                 Key: HDFS-3152
                 URL: https://issues.apache.org/jira/browse/HDFS-3152
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs client
    Affects Versions: 0.21.0, 0.20.2
            Reporter: Denny Ye


I hit an exception when trying to seek to the latest size of a file that 
another client was still writing. The message was "Cannot seek after EOF". I 
had taken the seek target from a previous input stream and was trying to read 
only the newly appended data. In other words, the target offset was beyond the 
file length that the new stream could see. 

In my opinion, the confirmed visible file length should be the total size of 
the completed blocks (as recorded by the NameNode) plus the size reported for 
the last block by the last DataNode in the pipeline. 
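
The definition above can be sketched as a small calculation. This is only an 
illustration and not HDFS code: the class name, the method name, and the idea 
of passing the sizes as plain arrays are assumptions made for the example. 

```java
/** Hypothetical sketch of the proposed length calculation; not part of HDFS. */
final class VisibleLengthSketch {
    private VisibleLengthSketch() {}

    /**
     * @param completedBlockSizes sizes of the completed blocks, as the NameNode records them
     * @param lastBlockBytesInPipelineOrder bytes reported for the last block by each DataNode,
     *        ordered from the first to the last DataNode of the write pipeline
     *        (empty if there is no block under construction)
     * @return completed bytes plus the bytes reported by the last DataNode of the pipeline
     */
    static long confirmedVisibleLength(long[] completedBlockSizes,
                                       long[] lastBlockBytesInPipelineOrder) {
        long total = 0;
        for (long size : completedBlockSizes) {
            total += size;
        }
        int n = lastBlockBytesInPipelineOrder.length;
        if (n > 0) {
            // Only the tail of the pipeline counts toward the confirmed length.
            total += lastBlockBytesInPipelineOrder[n - 1];
        }
        return total;
    }
}
```

With two completed blocks of 64 bytes each and a pipeline whose DataNodes 
report 30, 20 and 10 bytes for the last block, this returns 138, not 158. 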

There are two separate questions here: 1. How do we obtain a confirmed visible 
file length that is the same for all readers? 2. For each reader, how do we 
pick the best DN for a given block? 

The existing code mixes up these two parts. The NameNode sorts the block 
locations to favor local reads (HBase or local MapReduce; an outside reader 
gets a random DataNode first). DFSClient then takes the first DataNode listed 
for the last block. Pay attention to this point! The client may obtain a 
'dirty' file length from the first DN of the last block that the NameNode 
returned. And the client always uses the first DN of each block to read the 
file content.

Should we split these two cases?
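
A minimal sketch of what such a split could look like follows. It is not the 
DFSClient implementation: ReplicaInfo, ReaderSplitSketch, and all field and 
method names are hypothetical, and the list of replicas is assumed to be 
non-empty. The point is only that the length question has one answer for every 
reader, while the choice of DN may differ per reader. 

```java
import java.util.List;

/** Hypothetical description of one replica of the last block; not an HDFS type. */
final class ReplicaInfo {
    final String dataNode;        // assumed DataNode identifier
    final int pipelineIndex;      // 0 = first DataNode of the write pipeline
    final long lastBlockBytes;    // bytes this replica reports for the last block
    final boolean localToReader;  // true if the replica is on the reader's host

    ReplicaInfo(String dataNode, int pipelineIndex, long lastBlockBytes,
                boolean localToReader) {
        this.dataNode = dataNode;
        this.pipelineIndex = pipelineIndex;
        this.lastBlockBytes = lastBlockBytes;
        this.localToReader = localToReader;
    }
}

final class ReaderSplitSketch {
    private ReaderSplitSketch() {}

    /** Case 1: length of the last block, independent of how the list is sorted. */
    static long confirmedLastBlockBytes(List<ReplicaInfo> replicas) {
        ReplicaInfo tail = replicas.get(0);
        for (ReplicaInfo r : replicas) {
            if (r.pipelineIndex > tail.pipelineIndex) {
                tail = r;  // keep the replica furthest down the pipeline
            }
        }
        return tail.lastBlockBytes;
    }

    /** Case 2: DN to read from; prefers a local replica, else the first listed. */
    static ReplicaInfo pickReadTarget(List<ReplicaInfo> replicas) {
        for (ReplicaInfo r : replicas) {
            if (r.localToReader) {
                return r;
            }
        }
        return replicas.get(0);
    }
}
```

In this sketch, reordering the list for a particular reader changes which DN 
is read from but never changes the length that reader is told. 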

