[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS

2015-06-23 Thread Vincent.Wei (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597792#comment-14597792
]

Vincent.Wei commented on HDFS-7285:
---

I will be out of the office on a business trip from 6.23 to 6.26. I may reply
to e-mail slowly; please call me at 13764370648 if there is an urgent matter.

Erasure Coding Support inside HDFS
--

Key: HDFS-7285
URL: https://issues.apache.org/jira/browse/HDFS-7285
Project: Hadoop HDFS
Issue Type: New Feature
Reporter: Weihua Jiang
Assignee: Zhe Zhang
Attachments: ECAnalyzer.py, ECParser.py, HDFS-7285-initial-PoC.patch,
HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf,
HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf,
HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf,
fsimage-analysis-20150105.pdf

Erasure Coding (EC) can greatly reduce storage overhead without sacrificing
data reliability, compared to the existing HDFS 3-replica approach. For
example, with a 10+4 Reed-Solomon code we can tolerate the loss of 4 blocks
with a storage overhead of only 40%. This makes EC an attractive
alternative for big data storage, particularly for cold data.
Facebook had a related open source project called HDFS-RAID. It used to be
one of the contrib packages in HDFS but has been removed since Hadoop 2.0
for maintenance reasons. Its drawbacks are: 1) it sits on top of HDFS and
depends on MapReduce for encoding and decoding tasks; 2) it can only be used
for cold files that will no longer be appended to; 3) the pure Java EC
coding implementation is extremely slow in practical use. For these reasons,
it might not be a good idea to simply bring HDFS-RAID back.
We (Intel and Cloudera) are working on a design to build EC into HDFS that
removes any external dependencies, making it self-contained and
independently maintained. This design layers the EC feature on the storage
type support and aims to be compatible with existing HDFS features such as
caching, snapshots, encryption, and high availability. The design will also
support different EC coding schemes, implementations, and policies for
different deployment scenarios. By using advanced libraries (e.g. the Intel
ISA-L library), an implementation can greatly improve the performance of EC
encoding/decoding and make the EC solution even more attractive. We will
post the design document soon.
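
The 40% figure for the 10+4 Reed-Solomon example in the description can be
checked with a little arithmetic. The following is only a minimal sketch; the
function names are made up for illustration and are not part of HDFS or of any
attachment on this issue. It assumes "overhead" means extra storage as a
fraction of the original data size.

```python
def ec_storage_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Extra storage as a fraction of the original data (hypothetical helper)."""
    if data_blocks <= 0 or parity_blocks < 0:
        raise ValueError("data_blocks must be > 0 and parity_blocks >= 0")
    return parity_blocks / data_blocks


def replication_storage_overhead(replicas: int) -> float:
    """Extra storage for n-way replication: every copy beyond the first."""
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    return float(replicas - 1)


if __name__ == "__main__":
    # 10 data blocks + 4 parity blocks: any 4 of the 14 blocks may be lost.
    rs = ec_storage_overhead(10, 4)
    # 3 replicas: 2 extra copies of every block.
    rep = replication_storage_overhead(3)
    print(f"RS 10+4 overhead:   {rs:.0%}")
    print(f"3-replica overhead: {rep:.0%}")
```

Under that definition, 3-way replication costs 200% extra storage and
tolerates the loss of 2 copies of a block, while the 10+4 code costs 40% and
tolerates the loss of any 4 blocks in a group of 14.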

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS

2015-06-08 Thread Vincent.Wei (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14576695#comment-14576695
]

Vincent.Wei commented on HDFS-7285:
---

Hi all,
I am a newcomer. Can I apply HDFS-7285-initial-PoC.patch to Hadoop v2.2.0?
Thanks.

Erasure Coding Support inside HDFS
--

Key: HDFS-7285
URL: https://issues.apache.org/jira/browse/HDFS-7285
Project: Hadoop HDFS
Issue Type: New Feature
Reporter: Weihua Jiang
Assignee: Zhe Zhang
Attachments: ECAnalyzer.py, ECParser.py, HDFS-7285-initial-PoC.patch,
HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf,
HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf,
fsimage-analysis-20150105.pdf


[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS

2015-02-16 Thread Vincent.Wei (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14323141#comment-14323141
]

Vincent.Wei commented on HDFS-7285:
---

I will be out of the office for Chinese New Year from 2.15 to 2.26. I may
reply to e-mail slowly; please call me at 13764370648 if there is an urgent
matter.

Erasure Coding Support inside HDFS
--

Key: HDFS-7285
URL: https://issues.apache.org/jira/browse/HDFS-7285
Project: Hadoop HDFS
Issue Type: New Feature
Reporter: Weihua Jiang
Assignee: Zhe Zhang
Attachments: ECAnalyzer.py, ECParser.py,
HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf,
HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf,
fsimage-analysis-20150105.pdf


[jira] [Commented] (HDFS-503) Implement erasure coding as a layer on HDFS

2014-11-24 Thread Vincent.Wei (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14224107#comment-14224107
 ] 

Vincent.Wei commented on HDFS-503:
--

Does anybody know how to build this patch on Hadoop v2.2.0?

 Implement erasure coding as a layer on HDFS
 ---

 Key: HDFS-503
 URL: https://issues.apache.org/jira/browse/HDFS-503
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: contrib/raid
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.21.0

 Attachments: raid1.txt, raid2.txt


 The goal of this JIRA is to discuss how the cost of raw storage for an HDFS
 file system can be reduced. Keeping three copies of the same data is very
 costly, especially when the size of storage is huge. One idea is to reduce
 the replication factor and do erasure coding of a set of blocks so that the
 overall probability of failure of a block remains the same as before.
 Many forms of error-correcting codes are available; see
 http://en.wikipedia.org/wiki/Erasure_code. Also, recent research from CMU has
 described DiskReduce:
 https://opencirrus.org/system/files/Gibson-OpenCirrus-June9-09.ppt.
 My opinion is to discuss implementation strategies that are not part of base
 HDFS, but are a layer on top of HDFS.
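
To illustrate what "erasure coding of a set of blocks" can mean, the simplest
erasure code is a single XOR parity block over a group of equal-length blocks.
The sketch below is an assumption for illustration only: the description does
not say which code is intended, and this is not the contrib/raid
implementation. The function names are hypothetical.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover_missing(surviving: List[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing block from the survivors and the parity."""
    return xor_blocks(surviving + [parity])


if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]   # three data blocks
    parity = xor_blocks(data)            # one parity block
    rebuilt = recover_missing([data[0], data[2]], parity)
    print(rebuilt == data[1])
```

With three data blocks and one parity block, the group tolerates the loss of
any one block at about 33% extra storage. Surviving more failures needs a
stronger code such as Reed-Solomon.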



