[ https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753349#action_12753349 ]

Raghu Angadi commented on HDFS-503:
-----------------------------------

This seems pretty useful. Since this is done outside HDFS, it is simpler for 
users to start experimenting.

Say a file has 5 blocks with a replication of 3: 15 blocks in total.
With this tool, replication could be reduced to 2, with one parity block (also 
at replication 2): 10 + 2 = 12 blocks in total.
That is a savings of 20% in space. Is this math correct?
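For what it's worth, a small throwaway check of the arithmetic (the class and 
variable names below are just for illustration):

    public class RaidSavings {
      public static void main(String[] args) {
        int dataBlocks = 5, parityBlocks = 1;              // 5 data blocks + 1 parity block
        int oldRepl = 3, newRepl = 2;                      // replication before / after raiding
        int before = dataBlocks * oldRepl;                 // 15 block replicas
        int after = (dataBlocks + parityBlocks) * newRepl; // 10 + 2 = 12 block replicas
        double savings = 1.0 - (double) after / before;    // 0.20, i.e. 20%
        System.out.printf("%d -> %d replicas, savings = %.0f%%%n", before, after, savings * 100);
      }
    }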

Detecting when to 'unRaid':
  * The patch does this using a wrapper filesystem over HDFS.
      ** This requires the file to be read by a client.
      ** More often than not, HDFS knows about irrecoverable blocks well before 
a client reads them.
      ** This is only semi-transparent to users, since they have to use the new 
filesystem.

  * A completely transparent alternative could be to make the 'RaidNode' poll 
the NameNode for missing blocks (a rough sketch follows this list).
      ** The NameNode already knows about blocks that don't have any known good 
replica, and fetching that list is cheap.
      ** The RaidNode could check whether a corrupt/missing block belongs to any 
of its files.
      ** The rest of the RaidNode remains pretty much the same as in this patch.
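To make the second option more concrete, here is a rough sketch of the polling 
loop, assuming the NameNode's corrupt-file list can be fetched through something 
like DistributedFileSystem#listCorruptFileBlocks; isRaidedFile/fixFile are 
placeholders for the parity lookup and reconstruction the RaidNode would already 
have to do:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.RemoteIterator;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    /** Sketch: periodically ask the NameNode for files with corrupt/missing
     *  blocks and reconstruct them from parity, instead of waiting for a
     *  client read to hit the bad block. */
    public class RaidNodeMonitor implements Runnable {
      private final DistributedFileSystem dfs;
      private final long pollIntervalMs;

      public RaidNodeMonitor(Configuration conf, long pollIntervalMs) throws IOException {
        this.dfs = (DistributedFileSystem) FileSystem.get(conf);
        this.pollIntervalMs = pollIntervalMs;
      }

      @Override
      public void run() {
        while (!Thread.currentThread().isInterrupted()) {
          try {
            // Files that currently have a block with no known good replica.
            RemoteIterator<Path> corrupt = dfs.listCorruptFileBlocks(new Path("/"));
            while (corrupt.hasNext()) {
              Path file = corrupt.next();
              if (isRaidedFile(file)) {   // do we hold a parity block for this file?
                fixFile(file);            // reconstruct the missing block from parity
              }
            }
            Thread.sleep(pollIntervalMs);
          } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
          } catch (IOException e) {
            // log and retry on the next cycle
          }
        }
      }

      // Placeholders for logic the RaidNode would already contain.
      private boolean isRaidedFile(Path file) { return true; }
      private void fixFile(Path file) { }
    }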

> Implement erasure coding as a layer on HDFS
> -------------------------------------------
>
>                 Key: HDFS-503
>                 URL: https://issues.apache.org/jira/browse/HDFS-503
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: raid1.txt
>
>
> The goal of this JIRA is to discuss how the cost of raw storage for an HDFS 
> file system can be reduced. Keeping three copies of the same data is very 
> costly, especially when the size of storage is huge. One idea is to reduce 
> the replication factor and do erasure coding of a set of blocks so that the 
> overall probability of failure of a block remains the same as before.
> Many forms of error-correcting codes are available, see 
> http://en.wikipedia.org/wiki/Erasure_code. Also, recent research from CMU has 
> described DiskReduce 
> https://opencirrus.org/system/files/Gibson-OpenCirrus-June9-09.ppt.
> My opinion is to discuss implementation strategies that are not part of base 
> HDFS, but are a layer on top of HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
