[ https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753349#action_12753349 ]
Raghu Angadi commented on HDFS-503:
-----------------------------------

This seems pretty useful. Since this is done outside HDFS, it is simpler for users to start experimenting.

Say a file has 5 blocks with replication of 3: 15 block replicas in total. With this tool, replication could be reduced to 2, with one parity block: 10 + 2 replicas in total. That is a savings of 20% of raw space. Is this math correct?

Detecting when to 'unRaid':

* The patch does this using a wrapper filesystem over HDFS.
** This requires the file to be read by a client.
** More often than not, HDFS knows about irrecoverable blocks well before a client reads them.
** This is only semi-transparent to users, since they have to use the new filesystem.
* A completely transparent alternative would be to make the RaidNode ping the NameNode for missing blocks.
** The NameNode already knows about blocks that have no known good replica, and fetching that list is cheap.
** The RaidNode could check whether a corrupt/missing block belongs to any of its files.
** The rest of the RaidNode remains pretty much the same as in this patch.

> Implement erasure coding as a layer on HDFS
> -------------------------------------------
>
>                 Key: HDFS-503
>                 URL: https://issues.apache.org/jira/browse/HDFS-503
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: raid1.txt
>
>
> The goal of this JIRA is to discuss how the cost of raw storage for an HDFS
> file system can be reduced. Keeping three copies of the same data is very
> costly, especially when the size of the storage is huge. One idea is to reduce
> the replication factor and do erasure coding of a set of blocks so that the
> overall probability of failure of a block remains the same as before.
> Many forms of error-correcting codes are available; see
> http://en.wikipedia.org/wiki/Erasure_code. Also, recent research from CMU has
> described DiskReduce:
> https://opencirrus.org/system/files/Gibson-OpenCirrus-June9-09.ppt.
> My opinion is that we should discuss implementation strategies that are not
> part of base HDFS, but are a layer on top of HDFS.
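To answer the "is this math correct?" question above, the arithmetic can be checked directly, together with a toy demonstration of how one parity block recovers a single lost data block. This is a sketch, not code from the attached patch: the 5-block/replication-2 numbers come from the comment, and simple XOR parity stands in for whatever erasure code raid1.txt actually implements.

```python
def raid_savings(num_blocks, base_replication, raid_replication, parity_replicas):
    """Fraction of raw storage saved by lowering replication and adding parity."""
    before = num_blocks * base_replication
    after = num_blocks * raid_replication + parity_replicas
    return 1 - after / before

# 5 data blocks at 3x replication = 15 replicas.
# With the tool: 2x replication (10 replicas) plus one parity block,
# itself stored at 2x (2 replicas), for 12 replicas total.
savings = raid_savings(num_blocks=5, base_replication=3,
                       raid_replication=2, parity_replicas=2)
print(f"{savings:.0%}")  # -> 20%, so the math in the comment checks out

# Toy XOR parity: one parity block recovers any single lost data block.
data = [b"blk1", b"blk2", b"blk3", b"blk4", b"blk5"]
parity = bytes(a ^ b ^ c ^ d ^ e for a, b, c, d, e in zip(*data))

# Lose block index 2; XOR the survivors into the parity block to rebuild it.
survivors = data[:2] + data[3:]
rebuilt = parity
for blk in survivors:
    rebuilt = bytes(x ^ y for x, y in zip(rebuilt, blk))
print(rebuilt == data[2])  # -> True
```

Note that XOR parity only tolerates one lost block per stripe; the comparable failure probability mentioned in the description depends on the code chosen and the stripe width.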