[ https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528086#comment-14528086 ]

Jing Zhao commented on HDFS-7678:
---------------------------------

Thanks for the explanation and updating the patch, Zhe! Yes, you're right about 
the idx calculation. Maybe we can first add some more javadoc to explain?

Some other comments on the latest patch:
# {{futures}} and {{stripedReadsService}} can be converted into local variables 
inside {{DFSStripedInputStream}}.
# {{fetchBlockByteRange}} needs to be split into multiple functions.
# I have some concerns about the current timeout mechanism:
#* The current timeout mechanism is unfair to parity-block fetches. E.g., if 
one DN is very slow and eventually fails, there may be only a little time 
remaining to fetch a parity block for recovery.
#* In general, it is very difficult to set an appropriate overall timeout 
based only on configuration, since the pread length and the number of 
DataNodes depend on the user's input and the file's EC schema. 5 seconds may 
be a long enough timeout for reading one stripe of data, but it may not be 
enough for reading a full block group (in case a very large buffer is given). 
Also note that, compared with the current hedged read for a contiguous block, 
the timeout handling mechanism here is more aggressive since all the pending 
tasks are cancelled.
#* Thus I suggest that in this jira we only depend on the BlockReader's 
timeout and handle it as a read failure. Handling a slow DN (without a read 
timeout) should work more like the current hedged read in DFSInputStream 
(i.e., reading parity as an option to speed up) and can be done in a separate 
jira.
# Looks like the following code will not add {{r.index}} into 
{{fetchedBlkIndices}} for {{blk_0}} in the Java comment example?
{code}
if (actualReadPortions[r.index].containsReadPortion(maxPortion)) {
  fetchedBlkIndices.add(r.index);
  if (DFSClient.LOG.isDebugEnabled()) {
    DFSClient.LOG.debug("Fully fetched block " + r.index);
  }
}
{code}
# When the read length is smaller than a full stripe, this "if" will never be 
hit.
{code}
if (fetchedBlkIndices.size() == dataBlkNum) {
  // dataBlkNum blocks been fully fetched, no more fetching is needed
  clearFutures(futures.keySet());
}
{code}
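One possible fix (the identifier {{requiredDataBlocks}} below is illustrative, not from the patch) is to compare against the number of data blocks the pread actually spans rather than {{dataBlkNum}}:

```java
public class RequiredBlocksSketch {
  // Number of data blocks a pread of `length` bytes touches, given its
  // offset within the block group and the EC cell size. Reads spanning a
  // full stripe or more need all dataBlkNum blocks.
  static int requiredDataBlocks(long offsetInGroup, long length,
                                int cellSize, int dataBlkNum) {
    if (length <= 0) {
      return 0;
    }
    long firstCell = offsetInGroup / cellSize;
    long lastCell = (offsetInGroup + length - 1) / cellSize;
    return (int) Math.min(lastCell - firstCell + 1, dataBlkNum);
  }

  public static void main(String[] args) {
    int cellSize = 64 * 1024;
    // A 100-byte pread touches one data block, so the early exit should
    // fire after one successful fetch, not after dataBlkNum fetches.
    System.out.println(requiredDataBlocks(0, 100, cellSize, 3));           // 1
    System.out.println(requiredDataBlocks(0, 3L * cellSize, cellSize, 3)); // 3
  }
}
```

The early-exit condition would then compare {{fetchedBlkIndices.size()}} against this computed count instead of {{dataBlkNum}}.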
# We also need to check whether the number of missing blocks is already 
greater than the number of parity blocks. In that case we should fail the 
read.
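Since an RS(k, m) block group can tolerate at most m missing blocks in total, this could be a simple guard before attempting to decode (a sketch; the method names are illustrative):

```java
import java.io.IOException;

public class RecoverabilityCheck {
  // An RS(k, m) block group can tolerate at most m missing blocks.
  static boolean isRecoverable(int missingBlocks, int parityBlkNum) {
    return missingBlocks <= parityBlkNum;
  }

  // Fail the read early instead of attempting an impossible decode.
  static void checkRecoverable(int missingBlocks, int parityBlkNum)
      throws IOException {
    if (!isRecoverable(missingBlocks, parityBlkNum)) {
      throw new IOException(missingBlocks + " blocks are missing but only "
          + parityBlkNum + " parity blocks exist; failing the read");
    }
  }
}
```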
# We still need to add a lot more tests to cover different types of failure 
scenarios. But this can also be done in separate jiras.

> Erasure coding: DFSInputStream with decode functionality (pread)
> ----------------------------------------------------------------
>
>                 Key: HDFS-7678
>                 URL: https://issues.apache.org/jira/browse/HDFS-7678
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Li Bo
>            Assignee: Zhe Zhang
>         Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
> HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
> HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
> HDFS-7678-HDFS-7285.007.patch, HDFS-7678-HDFS-7285.008.patch, 
> HDFS-7678-HDFS-7285.009.patch, HDFS-7678.000.patch, HDFS-7678.001.patch
>
>
> A block group reader will read data from BlockGroup no matter in striping 
> layout or contiguous layout. The corrupt blocks can be known before 
> reading(told by namenode), or just be found during reading. The block group 
> reader needs to do decoding work when some blocks are found corrupt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
