[ 
https://issues.apache.org/jira/browse/HDFS-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379785#comment-14379785
 ] 

Kai Zheng commented on HDFS-7344:
---------------------------------

Hello [~szetszwo],

Thanks for taking care of this. Let me address your comments together. Please 
let me know if this works. Thanks.
bq.For 1 missing block, we may not need to recover it at all since 
(6,3)-Reed-Solomon can tolerate 3 missing blocks. Also recovery is more 
efficient for 2- or 3- missing blocks.
Good thoughts. I remember we had a related discussion with [~zhz]. The idea is 
to give recovery tasks different priorities according to how urgently the 
erased blocks need to be recovered. As you said, a block group with 2 or 3 
erased blocks is more urgent than one with a single erased block, so it would 
get higher priority when the NN schedules recovery. Note that a single erased 
block still needs to be recovered when possible because, as existing customer 
deployments show, in most cases only one block is erased and needs to be 
recovered. Recovering a single erased block can also be efficient, because in 
that case a simple XOR calculation can be used and no RS overhead is incurred.
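
To illustrate the two points above, here is a minimal Python sketch. It is not 
the HDFS implementation. It assumes, purely for illustration, that one parity 
block of the group is the plain byte-wise XOR of the six data blocks; whether 
the actual codec exposes such a parity block is not confirmed in this thread. 
The names xor_blocks, recover_single_erasure and schedule are hypothetical.

```python
import heapq


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized byte blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def recover_single_erasure(data_blocks, xor_parity):
    """Rebuild the one missing data block (marked as None).

    Assumption: xor_parity is the XOR of all data blocks, so the missing
    block equals the XOR of the surviving data blocks and the parity.
    Returns (index_of_missing_block, recovered_bytes).
    """
    missing = [i for i, b in enumerate(data_blocks) if b is None]
    if len(missing) != 1:
        raise ValueError("XOR recovery handles exactly one erased block")
    survivors = [b for b in data_blocks if b is not None]
    return missing[0], xor_blocks(survivors + [xor_parity])


def schedule(tasks):
    """Order recovery tasks so groups with more erased blocks come first.

    tasks is a list of (group_id, erased_count) pairs. Ties keep their
    submission order. This mirrors the priority idea only; it is not the
    NameNode's scheduling code.
    """
    heap = [(-erased, order, gid) for order, (gid, erased) in enumerate(tasks)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]
```

With six data blocks and one of them erased, recover_single_erasure needs only 
XOR passes over the five survivors and the parity block, while schedule would 
place a group with 3 erased blocks ahead of groups with 2 or 1.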

bq.Since you are working on the client, how about letting someone else working 
on the datanode changes?
Good suggestion. I discussed this with [~libo-intel]: I will help here until 
he can come back to this after he is done with the client side. The client-side 
work, where [~libo-intel] collaborates with [~jingzhao] and [~zhz], already has 
the hard part done, and I believe we need the same good community collaboration 
here as well. How about this: I update the design doc first, early next week, 
after discussing with [~umamaheswararao], [~vinayrpet] and others, and 
incorporating the discussions here by [~zhz] and you. The doc will be subject 
to your review and further discussion here. Meanwhile, I will also update and 
refine Bo's code based on the latest design and the branch within another week, 
so that we have concrete, doable ideas for breaking this whole task down into 
smaller tasks. Then people other than Bo and me can also help in parallel, as 
you suggested. Hope this works.

> Erasure Coding worker and support in DataNode
> ---------------------------------------------
>
>                 Key: HDFS-7344
>                 URL: https://issues.apache.org/jira/browse/HDFS-7344
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: Kai Zheng
>            Assignee: Li Bo
>         Attachments: HDFS ECWorker Design.pdf, hdfs-ec-datanode.0108.zip, 
> hdfs-ec-datanode.0108.zip
>
>
> According to HDFS-7285 and the design, this handles DataNode side extension 
> and related support for Erasure Coding, and implements ECWorker. It mainly 
> covers the following aspects, and separate tasks may be opened to handle each 
> of them.
> * Process encoding work, calculating parity blocks as specified in block 
> groups and codec schema;
> * Process decoding work, recovering data blocks according to block groups and 
> codec schema;
> * Handle client requests for passive recovery blocks data and serving data on 
> demand while reconstructing;
> * Write parity blocks according to storage policy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
