[ https://issues.apache.org/jira/browse/HADOOP-15558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616641#comment-16616641 ]
Chaitanya Mukka commented on HADOOP-15558:
------------------------------------------

Hi [~Sammi], sorry for the delay. Please find the answers inline.

{quote}1. The encoding and decoding of Clay Codes involve PFT, PRT and RS computation. So basically the idea is to reduce the network throughput and disk read during the data repair phase by adding additional computation. In the single-data-node-failure case, Clay Codes can save 2/3 of the network bandwidth compared with RS. In the worst case, Clay Codes behave the same as RS network-bandwidth-wise. Given that most failures in a storage cluster are single-node failures, the cluster can benefit from Clay Codes without doubt. I assume all the benchmark data in the slides were collected in the single-data-node-failure case. Correct me if that's not correct.{quote}
Yes, that's right: most of the data in the slides is for single-data-node-failure cases. Clay Codes can handle some multiple-erasure cases efficiently as well; however, the current implementation is specific to the single-node-erasure case.

{quote}2. On P22 of the slides, it says "total encoding time remains the same while the Clay codec has 70% higher encode computation time". Confused, could you explain it further?{quote}
The encoding time includes the time taken to transfer and write all the blocks to disk as well, whereas the computation time is only the time taken to compute the coded blocks. The computation time of RS is small in comparison to the time taken to send the coded blocks across the network and to write them to the disks.

{quote}3. On P21 of the slides, Fragmented Read, it says there is no impact on SSD when the sub-chunk size reaches 4KB. Do you have any data for HDD? In Hadoop/HDFS, HDD is still the majority.{quote}
We do not currently have data for HDD. We believe that even for HDDs, if the sub-chunk size is large enough, the contiguous data read will still be of sub-chunk size in the worst case.

{quote}4.
P23, what does the "Degraded I/O" scenario mean in the slides?{quote}
It means the read/write I/O speed while node recovery is happening in the background.

{quote}5. From the slides, we can see that to configure a Clay codec, k, m, d, and the sub-chunk size all matter. But in the implementation, only k and m are configurable. What about d and the sub-chunk size?{quote}
We have currently implemented the code for d=k+m-1 (the case where the network bandwidth is minimum) in the Hadoop patch. The Ceph implementation has it for any k, m, d. We plan to extend the implementation to d<n-1. The sub-chunk size is fixed but it is configurable. The number of sub-chunks is decided by the block size, the cell size, and the sub-packetization level of the Clay code (given by q^t where q=d-k+1, t=(k+m)/q).

{quote}6. I googled a lot but found very few links about the PFT and PRT matrix. Do you have any documents for them?{quote}
Please look at [https://www.usenix.org/system/files/conference/fast18/fast18-vajha.pdf] for details on the matrix. The example matrix given there is just an example. PFT and PRT can be realized using any (4,2) MDS code, where the symbols (C, C*, U, U*) are related by the (4,2) (i.e. k=2, m=2) MDS code. In the case of the PFT, C and C* are assumed to be erased and are recovered from U and U*. For the PRT, U and U* are assumed to be erased and are recovered from C and C*.

{quote}7. For the implementation part, is cloning the input blocks a must in prepareEncodingStep? Also, could you add more comments, such as which part is the PFT computation and which is the PRT computation? I will go through the code again later. Also, ClayCodeUtil is better placed in a new file.{quote}
The sequential decoding of Clay Codes requires us to store the previously decoded layers while still maintaining the original encoded copies of the same. This was the main reason we had to maintain a clone of the blocks.
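The sub-packetization formula from answer 5 can be sketched directly; only q^t with q=d-k+1, t=(k+m)/q comes from the answer above, while the cell size used to derive a sub-chunk size is a made-up value for illustration.

```python
# Sub-packetization of a Clay code: alpha = q^t, with q = d - k + 1
# and t = (k + m) / q, as stated in answer 5.

def sub_packetization(k, m, d):
    q = d - k + 1
    assert (k + m) % q == 0, "q must divide k + m"
    t = (k + m) // q
    return q ** t

k, m = 4, 2
d = k + m - 1                        # d = 5, the case the Hadoop patch implements
alpha = sub_packetization(k, m, d)   # q = 2, t = 3 -> alpha = 8
print(alpha)                         # 8

cell_size = 64 * 1024                # hypothetical cell size (64 KiB)
print(cell_size // alpha)            # resulting sub-chunk size: 8 KiB
```

For a wider code such as (k=6, m=3) with d=8, the same formula gives q=3, t=3, alpha=27, which is why the sub-chunk count falls out of the block and cell sizes rather than being set directly.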
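Answer 6's description of the PFT/PRT as a (4,2) MDS relation over (C, C*, U, U*) can be illustrated with a toy sketch. This is NOT the matrix from the patch or the paper; it uses a simple invertible 2x2 coupling over the small prime field GF(257), chosen purely to show that either symbol pair determines the other.

```python
# Toy pairwise coupling, not the production matrix. We relate the pairs by
#   [U ]   [1      gamma] [C ]
#   [U*] = [gamma  1    ] [C*]   over GF(P), gamma chosen so 1 - gamma^2 != 0.

P = 257
GAMMA = 2

def prt(c, c_star):
    # PRT direction: U, U* treated as erased, recovered from C, C*.
    u = (c + GAMMA * c_star) % P
    u_star = (GAMMA * c + c_star) % P
    return u, u_star

def pft(u, u_star):
    # PFT direction: C, C* treated as erased, recovered from U, U*.
    det_inv = pow((1 - GAMMA * GAMMA) % P, P - 2, P)  # Fermat inverse of the determinant
    c = ((u - GAMMA * u_star) * det_inv) % P
    c_star = ((u_star - GAMMA * u) * det_inv) % P
    return c, c_star

u, u_star = prt(10, 200)
assert pft(u, u_star) == (10, 200)   # the coupling is invertible: round trip holds
```

Any (4,2) MDS code gives the same property: two erased symbols out of (C, C*, U, U*) are always recoverable from the remaining two.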
Also, ClayCodeUtil was placed inside the ClayCodeDecodingStep class due to its tight coupling with that class; this was a design decision we made because of that relationship. We will reconsider it and see if we can refactor it better. We will also make the necessary changes to take care of the code style and update the patch.

> Implementation of Clay Codes plugin (Coupled Layer MSR codes)
> --------------------------------------------------------------
>
>                 Key: HADOOP-15558
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15558
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: Chaitanya Mukka
>            Assignee: Chaitanya Mukka
>            Priority: Major
>         Attachments: ClayCodeCodecDesign-20180630.pdf, HADOOP-15558.001.patch, HADOOP-15558.002.patch
>
> [Clay Codes|https://www.usenix.org/conference/fast18/presentation/vajha] are new erasure codes developed as a research project at the Codes and Signal Design Lab, IISc Bangalore. A particular Clay code, with storage overhead 1.25x, has been shown to reduce repair network traffic, disk read and repair times by factors of 2.9, 3.4 and 3 respectively compared to the RS codes with the same parameters.
> This Jira aims to introduce Clay Codes to HDFS-EC as one of the pluggable erasure codecs.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)