Rakesh R created HDFS-11032:
-------------------------------

             Summary: [SPS]: Handling of block movement failure at the 
coordinator datanode
                 Key: HDFS-11032
                 URL: https://issues.apache.org/jira/browse/HDFS-11032
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Rakesh R
            Assignee: Rakesh R


The idea of this jira is to discuss and implement efficient handling of block 
movement failures at the coordinator datanode.  [Code 
reference|https://github.com/apache/hadoop/blob/HDFS-10285/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/StoragePolicySatisfyWorker.java#L243].

The following errors are possible during block movement:
# Network errors (IOException) - retry the block storage movement (perhaps a 
hard-coded 2 retries) if it fails due to a network error. If it still fails 
after 2 retries, mark it as failure/retry to NN.
# No disk space (IOException) - no retries; mark as failure/retry to NN.
# Block pinned - no retries; mark as success/no-retry to NN. It is not 
possible to relocate this block to another datanode.
# Gen_Stamp mismatch - no retries; mark as failure/retry to NN. The file 
might have been re-opened.
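
The classification above could be sketched roughly as follows. This is only an 
illustration of the proposed policy, not code from the HDFS-10285 branch: the 
class, the exception types, the BlockMove interface and the Result enum are all 
hypothetical names, and treating every other IOException as a network error is 
an assumption made to keep the sketch short.

```java
import java.io.IOException;

/** Hypothetical sketch of the proposed policy; not taken from HDFS-10285. */
final class BlockMoveFailureSketch {

  /** The two outcomes the list above reports to the NN. */
  enum Result { SUCCESS_NO_RETRY, FAILURE_RETRY }

  // Hypothetical exception types standing in for the non-network error classes.
  static class NoDiskSpaceException extends IOException {
    NoDiskSpaceException(String msg) { super(msg); }
  }

  static class BlockPinnedException extends IOException {
    BlockPinnedException(String msg) { super(msg); }
  }

  static class GenStampMismatchException extends IOException {
    GenStampMismatchException(String msg) { super(msg); }
  }

  /** Hard-coded retry budget for network errors, as suggested in the list. */
  static final int MAX_NETWORK_RETRIES = 2;

  /** Hypothetical stand-in for one attempt at moving a block. */
  interface BlockMove {
    void run() throws IOException;
  }

  /** Runs the move and maps each error class to the status sent to the NN. */
  static Result moveWithRetries(BlockMove move) {
    int networkFailures = 0;
    while (true) {
      try {
        move.run();
        return Result.SUCCESS_NO_RETRY;
      } catch (BlockPinnedException e) {
        // A pinned block cannot be relocated, so the NN should not retry it.
        return Result.SUCCESS_NO_RETRY;
      } catch (NoDiskSpaceException | GenStampMismatchException e) {
        // No local retries; the NN decides whether to schedule the move again.
        return Result.FAILURE_RETRY;
      } catch (IOException e) {
        // Assumption: any other IOException is treated as a network error.
        networkFailures++;
        if (networkFailures > MAX_NETWORK_RETRIES) {
          return Result.FAILURE_RETRY;
        }
        // Otherwise loop and attempt the move again.
      }
    }
  }
}
```

With this shape, a move that keeps failing with a network error is attempted 
three times in total (the first attempt plus 2 retries) before failure/retry is 
reported, while the other three error classes return after a single attempt.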



