[ https://issues.apache.org/jira/browse/HDFS-8033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504428#comment-14504428 ]
Yi Liu (hitliuyi) edited comment on HDFS-8033 at 4/21/15 6:25 AM:
-------------------------------------------------------

Thanks [~zhz] for working on this. The patch looks good; my comments:

*1.* In {{DFSInputStream}}, a stateful read is not required to fill the output *buf*: {{readWithStrategy}} calls {{readBuffer}} and returns as soon as it succeeds. In {{DFSStripedInputStream}} we override {{readBuffer}}, but we only read from one striped block, so the returned result would be something like (cell_0, cell_3, ...). This is incorrect. The test does cover stateful read, but it reads fully and the data size is *BLOCK_GROUP_SIZE*, so the result is correct only by coincidence. I suggest that {{readBuffer}} in {{DFSStripedInputStream}} try to read fully unless it reaches the end of the file; of course, the final read length can be less than the input buf length if we hit EOF.

*2.* In {{blockSeekTo}}, we need to handle refetchToken and refetchEncryptionKey. Any other IOException can simply be thrown.

*3.* For the test, exercise stateful read both ways: a single read and a full read (please make the data size larger than groupSize * cellSize), as I said in #1.

*4.* {{connectFailedOnce}} in {{blockSeekTo}} is not necessary.

*5.* Why did you modify {{SimulatedFSDataset}}?

> Erasure coding: stateful (non-positional) read from files in striped layout
> ---------------------------------------------------------------------------
>
> Key: HDFS-8033
> URL: https://issues.apache.org/jira/browse/HDFS-8033
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang
> Attachments: HDFS-8033.000.patch, HDFS-8033.001.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
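
To illustrate comments 1 and 3, here is a minimal, self-contained sketch of the "keep reading until the buffer is full or EOF" loop. It does not use the HDFS classes: {{CellLimitedStream}}, {{ReadFullySketch}} and {{readUntilFullOrEof}} are hypothetical names invented for this example, and the stream only imitates a source that returns at most one cell per call. The sizes are made up as well.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for a stream whose single read returns at most one cell. */
class CellLimitedStream extends InputStream {
    private final InputStream in;
    private final int cellSize;

    CellLimitedStream(byte[] data, int cellSize) {
        this.in = new ByteArrayInputStream(data);
        this.cellSize = cellSize;
    }

    @Override
    public int read() throws IOException {
        return in.read();
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        // Short read: never hand back more than one cell per call.
        return in.read(buf, off, Math.min(len, cellSize));
    }
}

class ReadFullySketch {
    /**
     * Reads until len bytes are filled or EOF is reached.
     * Returns the number of bytes read, or -1 if EOF was hit before any byte.
     */
    static int readUntilFullOrEof(InputStream in, byte[] buf, int off, int len)
            throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n < 0) {
                break; // EOF: return what we have so far
            }
            total += n;
        }
        return (total == 0 && len > 0) ? -1 : total;
    }

    public static void main(String[] args) throws IOException {
        int cellSize = 4;
        int dataBlocks = 3; // assumed group width, for illustration only
        // Data size deliberately larger than dataBlocks * cellSize.
        byte[] data = new byte[2 * dataBlocks * cellSize + 6];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        InputStream in = new CellLimitedStream(data, cellSize);
        byte[] buf = new byte[20];
        System.out.println("single read: " + in.read(buf, 0, buf.length));
        System.out.println("looped read: " + readUntilFullOrEof(in, buf, 0, buf.length));
    }
}
```

A test along the lines of comment 3 would check both paths: a single read may legitimately return fewer bytes than requested, while the looped read should return the full length unless the file ends first.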