[ 
https://issues.apache.org/jira/browse/HDFS-8786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167762#comment-15167762
 ] 

Jing Zhao commented on HDFS-8786:
---------------------------------

bq. I will add an extra check against the BlockManager#corruptReplicas map. If 
true will trigger EC reconstruction task otherwise replication task. 

corruptReplicas does not track whether the whole blockInfo (i.e., the whole 
striped block group) is corrupted or requires internal block reconstruction. 
It only tracks corrupted internal blocks/replicas that were reported by a 
DataNode/client or detected through block reports. Thus the key here is not to 
check corruptReplicas, but to check whether we have enough healthy internal 
blocks covering the complete ID range (i.e., {{hasAllInternalBlocks}}). In 
other words, we should augment the current {{chooseSource4SimpleReplication}} 
method by adding logic for choosing decommissioning nodes.
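
To illustrate the "complete ID range" check, here is a minimal standalone 
sketch. It is not the HDFS implementation: the class name, the method 
signature, and the idea of passing the indices as an int array are all 
assumptions made for illustration; only the name {{hasAllInternalBlocks}} is 
taken from the discussion above.

```java
import java.util.BitSet;

// Hypothetical helper, not part of HDFS: checks whether the internal block
// indices found on healthy storages cover every position of a block group.
final class InternalBlockCoverage {

    // liveIndices: internal block index reported by each healthy storage
    //              (duplicates are allowed).
    // groupSize:   total number of internal blocks in the group (assumed to
    //              be supplied by the caller, e.g. data + parity blocks).
    static boolean hasAllInternalBlocks(int[] liveIndices, int groupSize) {
        BitSet seen = new BitSet(groupSize);
        for (int idx : liveIndices) {
            // Ignore indices outside the valid range instead of failing.
            if (idx >= 0 && idx < groupSize) {
                seen.set(idx);
            }
        }
        // Every index 0..groupSize-1 must have been seen at least once.
        return seen.cardinality() == groupSize;
    }
}
```

If every index is covered, no reconstruction is needed and the decommissioning 
node's internal block can simply be copied; otherwise reconstruction is the 
only way to restore the missing index.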

For sorting the storages, we do not need to do real "sorting". Instead, we 
only need to check whether the block group has duplicated internal blocks, and 
if so, make sure a decommissioned storage holding a duplicated block is placed 
at the end. For example, suppose we have storages
d0, d1, d2, d3, d4, d5, d6, d7, d8, d9
mapping to indices
0, 1, 2, 3, 4, 5, 6, 7, 8, 2

Thus we have duplicated internal blocks for b2, located on d2 and d9. If we 
find that d2 is a decommissioned node and d9 is not, we should swap d2 and d9 
in the storage list.
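
The d2/d9 case can be sketched as follows. This is only an illustration under 
stated assumptions: the {{Storage}} class, its fields, and the method 
{{preferLiveDuplicates}} are hypothetical names, not HDFS types or APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not HDFS code: moves a decommissioned storage behind a
// live storage that holds the same internal block index.
final class DuplicateBlockOrdering {

    // Minimal stand-in for a storage entry; all fields are assumptions.
    static final class Storage {
        final String name;
        final int blockIndex;
        final boolean decommissioned;

        Storage(String name, int blockIndex, boolean decommissioned) {
            this.name = name;
            this.blockIndex = blockIndex;
            this.decommissioned = decommissioned;
        }
    }

    // Returns a copy of the list in which each decommissioned storage is
    // swapped with the first later live storage holding the same index.
    // No general sort is performed.
    static List<Storage> preferLiveDuplicates(List<Storage> storages) {
        List<Storage> result = new ArrayList<>(storages);
        for (int i = 0; i < result.size(); i++) {
            Storage current = result.get(i);
            if (!current.decommissioned) {
                continue;
            }
            for (int j = i + 1; j < result.size(); j++) {
                Storage later = result.get(j);
                if (later.blockIndex == current.blockIndex && !later.decommissioned) {
                    Collections.swap(result, i, j);
                    break;
                }
            }
        }
        return result;
    }

    // Joins storage names with commas, for inspection.
    static String names(List<Storage> storages) {
        StringBuilder sb = new StringBuilder();
        for (Storage s : storages) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(s.name);
        }
        return sb.toString();
    }
}
```

With the storages from the example and d2 marked as decommissioned, the 
resulting order is d0, d1, d9, d3, d4, d5, d6, d7, d8, d2, so the first 
occurrence of index 2 is the live copy on d9.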

> Erasure coding: DataNode should transfer striped blocks before being 
> decommissioned
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-8786
>                 URL: https://issues.apache.org/jira/browse/HDFS-8786
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Rakesh R
>         Attachments: HDFS-8786-001.patch, HDFS-8786-draft.patch
>
>
> Per [discussion | 
> https://issues.apache.org/jira/browse/HDFS-8697?focusedCommentId=14609004&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14609004]
>  under HDFS-8697, it's too expensive to reconstruct block groups for decomm 
> purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
