[ 
https://issues.apache.org/jira/browse/HDFS-11082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113552#comment-16113552
 ] 

Andrew Wang commented on HDFS-11082:
------------------------------------

Hi Sammi, this looks good overall, thanks for working on this! A few review 
comments:

* We should add documentation and javadocs describing this new special policy 
so users and admins are aware of it
* We also need to think about the behavior of {{getErasureCodingPolicy}}. Right 
now it returns "null" to mean replication. With this patch, a user would have 
to check for both "null" and "replication-1-2-64K" to know whether a file is 
replicated. It'd be good to choose one or the other to make it simpler for 
downstreams. Returning "null" would be more compatible, and it would hide the 
special replicated EC policy from non-admin users, which I like.
* Please add messages to the asserts in the tests to help with later debugging
* Is this policy enabled by default? If not, I think it should be.
* Would be nice to rename the paths in the test cases to be more descriptive. 
As an example, right now we have:

{code}
final Path rootPath = new Path("/striped");
final Path childPath = new Path(rootPath, "replica");
final Path subChildPath = new Path(childPath, "replica");
final Path filePath = new Path(childPath, "file");
final Path filePath2 = new Path(subChildPath, "file");
{code}

Instead, perhaps something more like:

{code}
final Path rootPath = new Path("/striped");
final Path replicaPath = new Path(rootPath, "replica");
final Path subReplicaPath = new Path(replicaPath, "subreplica");
final Path replicaFilePath = new Path(replicaPath, "file");
final Path subReplicaFilePath = new Path(subReplicaPath, "file");
{code}
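
On the {{getErasureCodingPolicy}} point, here is a rough sketch of the two options as a downstream caller would see them. It is only an illustration: {{PolicySketch}} and {{ReplicationCheckSketch}} are hypothetical stand-ins, not the real HDFS classes, and the only name taken from the patch is "replication-1-2-64K".

```java
import java.util.Objects;

/** Hypothetical stand-in for an EC policy; NOT the real HDFS ErasureCodingPolicy. */
final class PolicySketch {
    // Name of the special replication policy as quoted in this review.
    static final String REPLICATION_POLICY_NAME = "replication-1-2-64K";

    private final String name;

    PolicySketch(String name) {
        this.name = Objects.requireNonNull(name);
    }

    String getName() {
        return name;
    }
}

/** Hypothetical helpers showing what a downstream caller would have to write. */
final class ReplicationCheckSketch {
    /** Option A: the special policy is exposed, so callers must check two cases. */
    static boolean isReplicatedIfExposed(PolicySketch policy) {
        return policy == null
            || PolicySketch.REPLICATION_POLICY_NAME.equals(policy.getName());
    }

    /** Option B: the API keeps returning null for replication, so one check suffices. */
    static boolean isReplicatedIfHidden(PolicySketch policy) {
        return policy == null;
    }

    /** Assumed mapping for option B: translate the special policy back to null. */
    static PolicySketch hideReplicationPolicy(PolicySketch policy) {
        return isReplicatedIfExposed(policy) ? null : policy;
    }
}
```

With option B, existing callers that only test for "null" keep working unchanged, which is the compatibility argument made above.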

This is not directly related (and I think we discussed this a bit on another 
JIRA), but I'm not happy with our getECPolicy API. Right now it 
returns the effective EC policy. Without being able to query the actual EC 
policy, the behavior when setting/unsetting is kind of tricky. Should we add a 
"getActualECPolicy" API? This can be a follow-on JIRA.

If you don't mind, one immediate improvement we could make is documenting in 
the {{getECPolicy}} javadoc that it returns the effective EC policy.
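
To make the effective vs. actual distinction concrete, here is a toy model, not the NameNode implementation. {{EcNamespaceSketch}} and its method names are hypothetical, policies are plain strings, and paths are assumed to be absolute and normalized. The javadoc wording is only a suggestion for the kind of note {{getECPolicy}} could carry.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy namespace (hypothetical): maps a path to the policy set directly on it. */
final class EcNamespaceSketch {
    private final Map<String, String> explicitPolicies = new HashMap<>();

    void setPolicy(String dir, String policyName) {
        explicitPolicies.put(dir, policyName);
    }

    void unsetPolicy(String dir) {
        explicitPolicies.remove(dir);
    }

    /** Returns the policy set on this exact path, or null if none was set here. */
    String getActualPolicy(String path) {
        return explicitPolicies.get(path);
    }

    /**
     * Returns the effective policy: the one set on this path or, failing that,
     * on its nearest ancestor. Returns null if no policy applies.
     */
    String getEffectivePolicy(String path) {
        String current = path;
        while (current != null) {
            String policy = explicitPolicies.get(current);
            if (policy != null) {
                return policy;
            }
            current = parentOf(current);
        }
        return null;
    }

    /** Parent of an absolute, normalized path; null once the root is passed. */
    private static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        if (path.equals("/") || slash < 0) {
            return null;
        }
        return slash == 0 ? "/" : path.substring(0, slash);
    }
}
```

In this model, after unsetting the policy on a child directory the effective query still returns the parent's policy, so a caller cannot tell from it alone whether the unset took effect. That is the case a separate "actual" query would cover.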

> Erasure Coding : Provide replicated EC policy to just replicating the files
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-11082
>                 URL: https://issues.apache.org/jira/browse/HDFS-11082
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>            Reporter: Rakesh R
>            Assignee: SammiChen
>            Priority: Critical
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-11082.001.patch
>
>
> The idea of this jira is to provide a new {{replicated EC policy}} so that we 
> can override the EC policy on a parent directory and go back to just 
> replicating the files based on replication factors.
> Thanks [~andrew.wang] for the 
> [discussions|https://issues.apache.org/jira/browse/HDFS-11072?focusedCommentId=15620743&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15620743].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
