Implement erasure coding as a layer on HDFS
-------------------------------------------

                 Key: HDFS-503
                 URL: https://issues.apache.org/jira/browse/HDFS-503
             Project: Hadoop HDFS
          Issue Type: New Feature
            Reporter: dhruba borthakur
            Assignee: dhruba borthakur


The goal of this JIRA is to discuss how the cost of raw storage for an HDFS 
file system can be reduced. Keeping three copies of the same data is very 
costly, especially when the size of storage is huge. One idea is to reduce the 
replication factor and do erasure coding of a set of blocks so that the overall 
probability of failure of a block remains the same as before.
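
To make the trade-off concrete, here is a rough sketch that compares plain 
replication with an erasure-coded stripe. It is only an illustration: it 
assumes block copies fail independently with the same probability p, and that 
the code is an ideal one in which a stripe of k data blocks plus m parity 
blocks survives any m losses. The function names and the model are 
assumptions made for this example; they are not part of HDFS.

```python
from math import comb


def replication_loss_probability(p, replicas):
    """Probability that every replica of a block is lost (independent failures)."""
    return p ** replicas


def stripe_loss_probability(p, data_blocks, parity_blocks):
    """Probability that more than `parity_blocks` blocks of a stripe are lost.

    Assumes an ideal code that can rebuild the stripe from any `data_blocks`
    surviving blocks, so data is unrecoverable only when the number of lost
    blocks exceeds `parity_blocks`.
    """
    n = data_blocks + parity_blocks
    return sum(
        comb(n, lost) * p ** lost * (1 - p) ** (n - lost)
        for lost in range(parity_blocks + 1, n + 1)
    )


def storage_overhead(data_blocks, parity_blocks):
    """Raw bytes stored per byte of user data."""
    return (data_blocks + parity_blocks) / data_blocks
```

For example, one could compare replication_loss_probability(p, 3), which 
costs 3x raw storage, against stripe_loss_probability(p, 10, 4), which costs 
storage_overhead(10, 4) = 1.4x, for whatever value of p matches the cluster. 
Real failures are correlated (racks, power, software bugs), so these numbers 
are a starting point for discussion rather than a prediction.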

Many forms of error-correcting codes are available; see 
http://en.wikipedia.org/wiki/Erasure_code. Also, recent research from CMU has 
described DiskReduce: 
https://opencirrus.org/system/files/Gibson-OpenCirrus-June9-09.ppt.
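
As an illustration of the simplest such code, the sketch below uses a single 
XOR parity block (RAID-5 style), which can rebuild any one missing block in a 
group. It is not taken from DiskReduce or from any HDFS code; the function 
names are hypothetical and all blocks are assumed to have the same length.

```python
def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    result = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Usage: encode three data blocks, then rebuild the second one after losing it.
data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(data)
rebuilt = recover_missing([data[0], data[2]], parity)
```

A single parity block tolerates only one loss per group; codes such as 
Reed-Solomon tolerate more losses at the cost of more parity blocks and more 
computation.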

I suggest we discuss implementation strategies that are not part of base 
HDFS but are a layer on top of HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
