[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones

Andrew Wang (JIRA) Wed, 05 Aug 2015 15:55:38 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659128#comment-14659128
 ]


Andrew Wang commented on HDFS-8833:
-----------------------------------

Good question. I think it basically comes down to our deployment scenarios 
being broader than Quantcast's or Facebook's. For rack-level fault tolerance 
you want a number of racks equal to the stripe width (data blocks plus parity 
blocks), so that each block of a stripe can sit on a different rack. FB and 
Quantcast are big enough that they run 14-rack or 9-rack clusters, but not all 
of our customers are at that scale. So there isn't a one-size-fits-all schema 
that works for all HDFS users; the big ones will use (10,4) or (6,3) like FB 
and Quantcast, but the smaller ones will want (3,2), which needs only 5 racks.
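
To make the rack arithmetic concrete, here is a rough sketch in Python. It 
reads each (k,m) pair as k data blocks plus m parity blocks, so the stripe 
width is k + m. The names ECSchema and widest_schema_for are made up for 
illustration and are not HDFS APIs, and "pick the widest schema that fits" is 
only an assumed selection rule, not anything HDFS does.

```python
from typing import NamedTuple, Optional, Sequence


class ECSchema(NamedTuple):
    """Hypothetical (data, parity) pair; not an HDFS class."""
    data_units: int
    parity_units: int

    @property
    def stripe_width(self) -> int:
        # One block per rack means the stripe needs data + parity racks.
        return self.data_units + self.parity_units


# The three schemas mentioned in this comment.
CANDIDATES = (ECSchema(10, 4), ECSchema(6, 3), ECSchema(3, 2))


def widest_schema_for(
    racks: int, candidates: Sequence[ECSchema] = CANDIDATES
) -> Optional[ECSchema]:
    """Return the widest candidate whose stripe fits in `racks` racks.

    Assumed rule for illustration only; returns None if nothing fits.
    """
    fitting = [s for s in candidates if s.stripe_width <= racks]
    return max(fitting, key=lambda s: s.stripe_width, default=None)


if __name__ == "__main__":
    for racks in (4, 5, 9, 14):
        print(racks, widest_schema_for(racks))
```

Under that rule a cluster with fewer than 5 racks cannot place any of the 
three schemas with one block per rack, while clusters with 5, 9, and 14 racks 
can use (3,2), (6,3), and (10,4) respectively.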

I've also seen customers start with small clusters and grow them by adding 
racks over time. This is also somewhat unique to HDFS compared to QFS and f4, 
and it is a reason why it'd be nice to support a few different policies even 
within the same cluster.

> Erasure coding: store EC schema and cell size in INodeFile and eliminate 
> notion of EC zones
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8833
>                 URL: https://issues.apache.org/jira/browse/HDFS-8833
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: HDFS-7285
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>
> We have [discussed | 
> https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754]
>  storing EC schema with files instead of EC zones and recently revisited the 
> discussion under HDFS-8059.
> As a recap, the _zone_ concept has severe limitations including renaming and 
> nested configuration. Those limitations are valid in encryption for security 
> reasons and it doesn't make sense to carry them over in EC.
> This JIRA aims to store EC schema and cell size on {{INodeFile}} level. For 
> simplicity, we should first implement it as an xattr and consider memory 
> optimizations (such as moving it to file header) as a follow-on. We should 
> also disable changing EC policy on a non-empty file / dir in the first phase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
