[ https://issues.apache.org/jira/browse/HADOOP-15558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616641#comment-16616641 ]
Chaitanya Mukka commented on HADOOP-15558:
------------------------------------------

Hi [~Sammi], sorry for the delay. Please find the answers inline.

{quote}1. The encoding and decoding of Clay Codes involve PFT, PRT and RS computation. So basically the idea is to reduce the network throughput and disk read during the data repair phase by adding additional computation. In the single-data-node-failure case, Clay Codes can save 2/3 of the network bandwidth compared with RS. In the worst case, Clay Codes behave the same as RS network-bandwidth-wise. Given that most failures in a storage cluster are single-node failures, the cluster can benefit from Clay Codes without doubt. I assume all the benchmark data in the slides were collected in the single-data-node-failure case. Correct me if that's not correct.{quote}
Yes, that's right: most of the data in the slides is for single-data-node-failure cases. Clay Codes can handle some multiple-erasure cases efficiently as well; however, the current implementation is specific to the single-node-erasure case.

{quote}2. On P22 of the slides, it says "total encoding time remains the same while the Clay codec has 70% higher encode computation time". Confused, could you explain it further?{quote}
The encoding time includes the time taken to transfer and write all the blocks to disk as well, whereas the computation time is only the time taken to compute the coded blocks. The computation time of RS is small in comparison to the time taken to send the coded blocks across the network and to write them to the disks.

{quote}3. On P21 of the slides, Fragmented Read, it says there is no impact on SSD when the sub-chunk size reaches 4KB. Do you have any data for HDD? In Hadoop/HDFS, HDD is still the majority.{quote}
We do not currently have data for HDD. We believe that even for HDDs, if the sub-chunk size is large enough, the contiguous data read will still be of sub-chunk size in the worst case.

{quote}4.
P23, what does the "Degraded I/O" scenario mean in the slides?{quote}
It means the read/write I/O speed while node recovery is happening in the background.

{quote}5. From the slides, we can see that to configure a Clay codec, k, m, d, and the sub-chunk size all matter. But in the implementation, only k and m are configurable. What about d and the sub-chunk size?{quote}
We have currently implemented the code for d=k+m-1 (the case where the network bandwidth is minimum) in the Hadoop patch. The Ceph implementation has it for any k, m, d. We plan to extend the implementation to d<n-1. The sub-chunk size is fixed but it is configurable. The number of sub-chunks is decided by the block size, the cell size, and the sub-packetization level of the Clay code (given by q^t where q=d-k+1, t=(k+m)/q).

{quote}6. I googled a lot but found very few links about the PFT and PRT matrix. Do you have any documents for them?{quote}
Please look at [https://www.usenix.org/system/files/conference/fast18/fast18-vajha.pdf] for details on the matrix. The example matrix given there is just an example. PFT and PRT can be realized using any (4,2) MDS code, where the symbols (C, C*, U, U*) are related by the (4,2) (i.e. k=2, m=2) MDS code. In the case of the PFT, C and C* are assumed to be erased and are recovered from U and U*. For the PRT, U and U* are assumed to be erased and are recovered from C and C*.

{quote}7. For the implementation part, is cloning the input blocks a must in prepareEncodingStep? Also, could you add more comments, such as which part is the PFT computation and which is the PRT computation? I will go through the code again later. Also, ClayCodeUtil is better placed in a new file.{quote}
The sequential decoding of Clay Codes requires us to store the previously decoded layers while still maintaining the original encoded copies of the same. This was the main reason we had to maintain a clone of the blocks.
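The sub-packetization formula from answer 5 can be sketched directly; only q^t with q=d-k+1, t=(k+m)/q comes from the answer above, while the cell size used to derive a sub-chunk size is a made-up value for illustration.

```python
# Sub-packetization of a Clay code: alpha = q^t, with q = d - k + 1
# and t = (k + m) / q, as stated in answer 5.

def sub_packetization(k, m, d):
    q = d - k + 1
    assert (k + m) % q == 0, "q must divide k + m"
    t = (k + m) // q
    return q ** t

k, m = 4, 2
d = k + m - 1                        # d = 5, the case the Hadoop patch implements
alpha = sub_packetization(k, m, d)   # q = 2, t = 3 -> alpha = 8
print(alpha)                         # 8

cell_size = 64 * 1024                # hypothetical cell size (64 KiB)
print(cell_size // alpha)            # resulting sub-chunk size: 8 KiB
```

For a wider code such as (k=6, m=3) with d=8, the same formula gives q=3, t=3, alpha=27, which is why the sub-chunk count falls out of the block and cell sizes rather than being set directly.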
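Answer 6's description of the PFT/PRT as a (4,2) MDS relation over (C, C*, U, U*) can be illustrated with a toy sketch. This is NOT the matrix from the patch or the paper; it uses a simple invertible 2x2 coupling over the small prime field GF(257), chosen purely to show that either symbol pair determines the other.

```python
# Toy pairwise coupling, not the production matrix. We relate the pairs by
#   [U ]   [1      gamma] [C ]
#   [U*] = [gamma  1    ] [C*]   over GF(P), gamma chosen so 1 - gamma^2 != 0.

P = 257
GAMMA = 2

def prt(c, c_star):
    # PRT direction: U, U* treated as erased, recovered from C, C*.
    u = (c + GAMMA * c_star) % P
    u_star = (GAMMA * c + c_star) % P
    return u, u_star

def pft(u, u_star):
    # PFT direction: C, C* treated as erased, recovered from U, U*.
    det_inv = pow((1 - GAMMA * GAMMA) % P, P - 2, P)  # Fermat inverse of the determinant
    c = ((u - GAMMA * u_star) * det_inv) % P
    c_star = ((u_star - GAMMA * u) * det_inv) % P
    return c, c_star

u, u_star = prt(10, 200)
assert pft(u, u_star) == (10, 200)   # the coupling is invertible: round trip holds
```

Any (4,2) MDS code gives the same property: two erased symbols out of (C, C*, U, U*) are always recoverable from the remaining two.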
Also, ClayCodeUtil was placed inside the ClayCodeDecodingStep class due to its tight coupling with that class; this was a design decision we made because of that relationship. We will reconsider it and see if we can refactor it better. We will also make the necessary changes to take care of the code style and update the patch.

> Implementation of Clay Codes plugin (Coupled Layer MSR codes)
> --------------------------------------------------------------
>
>                 Key: HADOOP-15558
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15558
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: Chaitanya Mukka
>            Assignee: Chaitanya Mukka
>            Priority: Major
>         Attachments: ClayCodeCodecDesign-20180630.pdf, HADOOP-15558.001.patch, HADOOP-15558.002.patch
>
> [Clay Codes|https://www.usenix.org/conference/fast18/presentation/vajha] are new erasure codes developed as a research project at the Codes and Signal Design Lab, IISc Bangalore. A particular Clay code, with storage overhead 1.25x, has been shown to reduce repair network traffic, disk read and repair times by factors of 2.9, 3.4 and 3 respectively compared to the RS codes with the same parameters.
> This Jira aims to introduce Clay Codes to HDFS-EC as one of the pluggable erasure codecs.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)