[ https://issues.apache.org/jira/browse/HDFS-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Walter Su updated HDFS-8461:
----------------------------
    Description: 
{code:title=UnderReplicatedBlocks.java}
  private int getPriority(int curReplicas,
  ...
    } else if (curReplicas == 1) {
      //only on replica -risk of loss
      // highest priority
      return QUEUE_HIGHEST_PRIORITY;
  ...
{code}
For striped blocks, we should return QUEUE_HIGHEST_PRIORITY when
curReplicas == 6 (assuming a 6+3 schema), because losing one more internal
block would make the block group unrecoverable.
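
A minimal sketch of that rule, not the attached patch. The class name, the
method getStripedPriority, and its dataBlkNum/parityBlkNum parameters are
hypothetical, and the queue constants are only assumed to mirror the ones in
UnderReplicatedBlocks:

```java
// Illustrative only: a simplified priority rule for striped block groups.
class StripedPrioritySketch {
    // Assumed values, chosen to resemble UnderReplicatedBlocks.
    static final int QUEUE_HIGHEST_PRIORITY = 0;
    static final int QUEUE_UNDER_REPLICATED = 2;
    static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;

    /**
     * curReplicas is the number of live internal blocks in the group.
     * A group with dataBlkNum data blocks can be reconstructed from any
     * dataBlkNum surviving internal blocks, so exactly dataBlkNum survivors
     * is the last recoverable state.
     */
    static int getStripedPriority(int curReplicas, int dataBlkNum,
                                  int parityBlkNum) {
        if (curReplicas < dataBlkNum) {
            // Too few internal blocks left to reconstruct the data.
            return QUEUE_WITH_CORRUPT_BLOCKS;
        }
        if (curReplicas == dataBlkNum) {
            // One more failure loses data: treat like "only one replica".
            return QUEUE_HIGHEST_PRIORITY;
        }
        // Some redundancy remains (callers are assumed to pass only
        // groups with fewer than dataBlkNum + parityBlkNum live blocks).
        return QUEUE_UNDER_REPLICATED;
    }

    public static void main(String[] args) {
        // 6+3 schema: compare a group with 6 live blocks to one with 8.
        System.out.println(getStripedPriority(6, 6, 3));
        System.out.println(getStripedPriority(8, 6, 3));
    }
}
```

With a 6+3 schema, a group with 6 live internal blocks maps to the highest
priority, while one with 7 or 8 stays in the ordinary under-replicated queue.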

This matters because:
{code:title=BlockManager.java}
DatanodeDescriptor[] chooseSourceDatanodes(BlockInfo block,
  ...
     if(priority != UnderReplicatedBlocks.QUEUE_HIGHEST_PRIORITY 
          && !node.isDecommissionInProgress() 
          && node.getNumberOfBlocksToBeReplicated() >= maxReplicationStreams)
      {
        continue; // already reached replication limit
      }
  ...
{code}
It may return too few source DNs (perhaps 5 instead of the 6 needed), and
recovery then fails. A busy node should not be skipped if a block has the
highest risk/priority. The issue is that a striped block never gets the
highest priority.
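
The effect can be reproduced with a small stand-alone sketch. Everything here
is hypothetical: Node stands in for DatanodeDescriptor, chooseSources keeps
only the busy-node check from chooseSourceDatanodes (the decommissioning
check is left out), and the constant value is assumed:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: shows how the busy-node check shrinks the source list.
class SourceSelectionSketch {
    static final int QUEUE_HIGHEST_PRIORITY = 0; // assumed value

    /** Hypothetical stand-in for a DataNode and its pending work. */
    static final class Node {
        final String name;
        final int blocksToBeReplicated;

        Node(String name, int blocksToBeReplicated) {
            this.name = name;
            this.blocksToBeReplicated = blocksToBeReplicated;
        }
    }

    /** Skips busy nodes unless the block has the highest priority. */
    static List<Node> chooseSources(List<Node> candidates, int priority,
                                    int maxReplicationStreams) {
        List<Node> sources = new ArrayList<>();
        for (Node node : candidates) {
            if (priority != QUEUE_HIGHEST_PRIORITY
                    && node.blocksToBeReplicated >= maxReplicationStreams) {
                continue; // already reached replication limit
            }
            sources.add(node);
        }
        return sources;
    }

    /** Six surviving nodes of a 6+3 group; the first one is at the limit. */
    static List<Node> demoNodes(int maxReplicationStreams) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node("dn0", maxReplicationStreams));
        for (int i = 1; i < 6; i++) {
            nodes.add(new Node("dn" + i, 0));
        }
        return nodes;
    }

    public static void main(String[] args) {
        List<Node> nodes = demoNodes(2);
        // Ordinary priority: the busy node is dropped from the sources.
        System.out.println(chooseSources(nodes, 2, 2).size());
        // Highest priority: the busy node is kept.
        System.out.println(chooseSources(nodes, QUEUE_HIGHEST_PRIORITY, 2).size());
    }
}
```

With one of the six surviving nodes at its stream limit, an ordinary priority
yields only 5 sources, which is too few to reconstruct a 6+3 block group; the
highest priority keeps all 6.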

  was:
{code:title=UnderReplicatedBlocks.java}
  private int getPriority(int curReplicas,
  ...
    } else if (curReplicas == 1) {
      //only on replica -risk of loss
      // highest priority
      return QUEUE_HIGHEST_PRIORITY;
  ...
{code}
For striped blocks, we should return QUEUE_HIGHEST_PRIORITY when
curReplicas == 6 (assuming a 6+3 schema).

This matters because:
{code:title=BlockManager.java}
DatanodeDescriptor[] chooseSourceDatanodes(BlockInfo block,
  ...
     if(priority != UnderReplicatedBlocks.QUEUE_HIGHEST_PRIORITY 
          && !node.isDecommissionInProgress() 
          && node.getNumberOfBlocksToBeReplicated() >= maxReplicationStreams)
      {
        continue; // already reached replication limit
      }
  ...
{code}
It may return too few source DNs (perhaps 5), and recovery then fails.
A busy node should not be skipped if a block has the highest risk/priority.


> Erasure coding: fix priority level of UnderReplicatedBlocks for striped block
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-8461
>                 URL: https://issues.apache.org/jira/browse/HDFS-8461
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Walter Su
>            Assignee: Walter Su
>         Attachments: HDFS-8461-HDFS-7285.001.patch
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
