[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

Zhe Zhang (JIRA) Tue, 14 Apr 2015 17:36:18 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495442#comment-14495442
 ]


Zhe Zhang commented on HDFS-7859:
---------------------------------

[~szetszwo] / [~drankye]: The [phasing plan | 
https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14391207&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14391207]
 I posted might be a little confusing with regard to schemas. My apologies.

In the offline meetup on 03/31, we didn't reach a clear conclusion on how much 
of the schema work to include before merging. Therefore I left it in phase I, but 
marked it as optional. My thought was that we could make a better decision 
after observing how fast the work could proceed. Up to this point I think this 
thread is going pretty well and it seems we can have a multi-schema 
implementation when other HDFS-7285 tasks are done (see details below).

Good [questions | 
https://issues.apache.org/jira/browse/HDFS-7859?focusedCommentId=14494933&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14494933]
 on schema design. I think we eventually need to answer them in the broader 
scope of HDFS-7337. IIUC HDFS-7859 / HDFS-7866 are not touching most of the 
tricky scenarios. Based on Kai's latest [comment | 
https://issues.apache.org/jira/browse/HDFS-7866?focusedCommentId=14494050&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14494050],
 HDFS-7866 will mostly handle _default_ schemas embedded in the 
{{ECSchemaManager}} code. 

The patch under this JIRA handles saving / loading these default schemas in 
fsimage. I think this is necessary even without loading custom schemas from 
XML. Otherwise we cannot guarantee the NameNode which loads the fsimage has the 
same default schemas as the NameNode which saved it. It is obviously even more 
necessary when we add custom schemas. The logic in the patch is quite 
straightforward; it mostly serializes / deserializes schemas.
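
To illustrate the save / load round trip, here is a minimal sketch. It is not the code in the attached patch and not the real fsimage format: {{SchemaSketch}}, {{SchemaImageSketch}}, the field layout, and the schema name used in the usage note are all hypothetical, and plain {{java.io}} data streams stand in for whatever serialization the image actually uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an EC schema; NOT the real ECSchema class.
final class SchemaSketch {
    final String name;
    final String codec;
    final int dataUnits;
    final int parityUnits;

    SchemaSketch(String name, String codec, int dataUnits, int parityUnits) {
        this.name = name;
        this.codec = codec;
        this.dataUnits = dataUnits;
        this.parityUnits = parityUnits;
    }
}

// Hypothetical helper: writes schemas to a byte image and reads them back.
final class SchemaImageSketch {

    // Serialize: a count, followed by the fields of each schema in a fixed order.
    static byte[] save(List<SchemaSketch> schemas) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(schemas.size());
            for (SchemaSketch s : schemas) {
                out.writeUTF(s.name);
                out.writeUTF(s.codec);
                out.writeInt(s.dataUnits);
                out.writeInt(s.parityUnits);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Deserialize: read the fields back in exactly the order save() wrote them.
    static List<SchemaSketch> load(byte[] image) {
        List<SchemaSketch> schemas = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(image))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                String codec = in.readUTF();
                int dataUnits = in.readInt();
                int parityUnits = in.readInt();
                schemas.add(new SchemaSketch(name, codec, dataUnits, parityUnits));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return schemas;
    }
}
```

In this sketch, a NameNode that calls {{SchemaImageSketch.load(image)}} gets back exactly the schemas that were passed to {{save}}, for example a made-up schema named "RS-6-3" with 6 data units and 3 parity units, regardless of which defaults are compiled into its own code.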

So here's my proposal:
# Shrink this patch to remove the logic for modifying and removing schemas 
({{ECSchemaManager#modifyECSchema}} and {{OP_MODIFY_EC_SCHEMA}}). 
# Repurpose HDFS-7866 to focus on loading custom schemas from site xml files.

[~szetszwo], [~drankye], [~vinayrpet]: let me know if you agree with the above. 
If we are all synced on this, how about moving this JIRA back to HDFS-7285 and 
keeping HDFS-7866 under HDFS-8031?

> Erasure Coding: Persist EC schemas in NameNode
> ----------------------------------------------
>
>                 Key: HDFS-7859
>                 URL: https://issues.apache.org/jira/browse/HDFS-7859
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Xinwei Qin 
>         Attachments: HDFS-7859.001.patch
>
>
> In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
> persist EC schemas in NameNode centrally and reliably, so that EC zones can 
> reference them by name efficiently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
