[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery

Zhe Zhang (JIRA) Wed, 29 Apr 2015 17:26:51 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520589#comment-14520589
 ]


Zhe Zhang commented on HDFS-7348:
---------------------------------

Thanks for the discussion, Yi and Bo.

On the write path:
# I wonder if we should have a "fast track" for the most common case, where the 
DN receiving the EC command is also the final destination. In this case, the DN 
should just create a local block and write to it.
# If we decide to have such a "fast track", then it seems natural to use that 
code to store a copy of all reconstructed blocks first. Then we can use the 
existing {{DataNode#DataTransfer}} to push them out. Yi mentioned several 
drawbacks of storing a reconstructed block on disk before sending it out: i) 
performance; ii) disk space; iii) management; iv) CRC calculation. The 
performance and disk usage overheads are still valid concerns even with the 
"fast track" code mentioned above. So how about splitting out the current logic 
for transferring to remote targets (e.g., {{transferCells2Targets}}) into a 
separate JIRA ("recovering multiple missing blocks")? Of course, that assumes we 
do want a "fast track" for recovering a single block locally.
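
To make the proposed split concrete, here is a minimal sketch of how a DN could 
decide between the "fast track" and the remote-transfer path. It is not taken 
from the patch: {{use_fast_track}}, {{plan_targets}}, and the DN names are 
hypothetical and only illustrate the decision described above.

```python
def use_fast_track(local_dn, targets):
    """Return True when the only recovery target is the DN running the task.

    Hypothetical helper: this is the "most common case" described above,
    where the DN receiving the EC command is also the final destination.
    """
    return len(targets) == 1 and targets[0] == local_dn


def plan_targets(local_dn, targets):
    """Split targets into those written locally and those pushed remotely.

    Hypothetical helper: the remote list is the part that would move to
    the separate "recovering multiple missing blocks" JIRA.
    """
    local = [t for t in targets if t == local_dn]
    remote = [t for t in targets if t != local_dn]
    return local, remote


if __name__ == "__main__":
    # Single missing block whose target is this DN: fast track applies.
    print(use_fast_track("dn1", ["dn1"]))
    # Two missing blocks, one local and one remote: fast track does not apply.
    print(use_fast_track("dn1", ["dn1", "dn4"]))
    print(plan_targets("dn1", ["dn1", "dn4"]))
```

With this split, the first JIRA only needs the branch where 
{{use_fast_track}} is true, and the remote list can be handled later.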

On the read path:
# bq. (read entire blocks and then decode) It's big issue for memory, 
especially there may be multiple stripe block recovery at the same time.
Yes, I agree. The block size is too large as the sync-and-decode unit, and I 
think the cell size is too small for that purpose. I think it's reasonable to 
use a few hundred MB of memory for recovery, so how about setting the default to 
32MB or 64MB? Assuming a 6+3 schema, that will be 300~600MB of memory usage, and 
we only need to create a block reader 2~4 times for each source.
# Sequential vs. parallel reading is a hard decision. Since the current code is 
in parallel mode, we should probably keep it that way at this stage and add the 
other mode (like the Fast and Slow modes Bo suggested) later if needed.
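
The memory estimate above can be checked with a short calculation. This is only 
a sketch under two assumptions that the comment does not spell out: one buffer 
of the sync-and-decode unit size is held per internal block of the 6+3 group, 
and the block size is 128MB. The function names are hypothetical.

```python
MB = 1024 * 1024


def recovery_memory_bytes(unit_bytes, data_blocks=6, parity_blocks=3):
    """Memory needed if one buffer of unit_bytes is held per internal block.

    Assumption: buffers exist for all data and parity blocks in the group.
    """
    return unit_bytes * (data_blocks + parity_blocks)


def readers_per_source(block_bytes, unit_bytes):
    """Number of times a block reader is created for each source block."""
    # Ceiling division, so a partial final unit still counts as one read.
    return -(-block_bytes // unit_bytes)


if __name__ == "__main__":
    block_bytes = 128 * MB  # assumed block size
    for unit_mb in (32, 64):
        mem_mb = recovery_memory_bytes(unit_mb * MB) // MB
        readers = readers_per_source(block_bytes, unit_mb * MB)
        print(f"{unit_mb}MB unit: {mem_mb}MB buffered, "
              f"{readers} block readers per source")
```

Under these assumptions, a 32MB unit gives 288MB and 4 readers per source, and a 
64MB unit gives 576MB and 2 readers per source, consistent with the rough 
300~600MB and 2~4 figures.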

> Erasure Coding: striped block recovery
> --------------------------------------
>
>                 Key: HDFS-7348
>                 URL: https://issues.apache.org/jira/browse/HDFS-7348
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: Kai Zheng
>            Assignee: Yi Liu
>         Attachments: ECWorker.java, HDFS-7348.001.patch
>
>
> This JIRA is to recover one or more missed striped block in the striped block 
> group.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
