[ https://issues.apache.org/jira/browse/HDFS-11082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113552#comment-16113552 ]
Andrew Wang commented on HDFS-11082:
------------------------------------

Hi Sammi, this looks good overall, thanks for working on this! A few review comments:

* We should add documentation and javadocs describing this new special policy so users and admins are aware of it.
* We also need to think about the behavior of {{getErasureCodingPolicy}}. Right now it returns "null" to mean replication. With this patch, a user would have to check for both "null" and "replication-1-2-64K" to know whether a file is replicated. It'd be good to choose one or the other to make it simpler for downstreams. "null" would be more compatible, and it'd hide the special replicated EC policy from non-admin users, which I like.
* Please add messages to the asserts in the tests to help with later debugging.
* Is this policy enabled by default? If not, I think it should be.
* It would be nice to rename the paths in the test cases to be more descriptive. As an example, right now we have:

{code}
final Path rootPath = new Path("/striped");
final Path childPath = new Path(rootPath, "replica");
final Path subChildPath = new Path(childPath, "replica");
final Path filePath = new Path(childPath, "file");
final Path filePath2 = new Path(subChildPath, "file");
{code}

Instead, perhaps something more like:

{code}
final Path rootPath = new Path("/striped");
final Path replicaPath = new Path(rootPath, "replica");
final Path subReplicaPath = new Path(replicaPath, "subreplica");
final Path replicaFilePath = new Path(replicaPath, "file");
final Path subReplicaFilePath = new Path(subReplicaPath, "file");
{code}

This is not directly related (and I think we discussed this a bit on another JIRA), but I'm not happy with our getECPolicy API right now. It currently returns the effective EC policy. Without being able to query the actual EC policy, the behavior when setting/unsetting is kind of tricky. Should we add a "getActualECPolicy" API? That can be a follow-on JIRA.
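
To make the effective-versus-actual distinction concrete, here is a minimal standalone sketch. It is not HDFS code: the class {{EcPolicySketch}}, its methods, the map-based namespace and the policy name "example-striped-policy" are all hypothetical stand-ins. It only assumes that the effective policy is the nearest one set on the path or an ancestor, and that the replicated policy is reported as "null", as suggested above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a namespace; none of these names exist in HDFS. */
final class EcPolicySketch {
    // Name of the special replicated policy as quoted in the review.
    static final String REPLICATION_POLICY = "replication-1-2-64K";

    // Policy set explicitly on a path; an absent key means nothing was set.
    private final Map<String, String> explicit = new HashMap<>();

    void setPolicy(String path, String policy) {
        explicit.put(path, policy);
    }

    /** "Actual" policy: only what was set directly on this path, else null. */
    String getActualPolicy(String path) {
        return explicit.get(path);
    }

    /**
     * "Effective" policy: the nearest policy found walking up the ancestors.
     * The replicated policy is reported as null, so callers need one check.
     */
    String getEffectivePolicy(String path) {
        for (String p = path; p != null; p = parent(p)) {
            String policy = explicit.get(p);
            if (policy != null) {
                return REPLICATION_POLICY.equals(policy) ? null : policy;
            }
        }
        return null; // nothing set anywhere on the way up: plain replication
    }

    // Parent of an absolute path without a trailing slash; null above "/".
    private static String parent(String path) {
        if (path.equals("/")) {
            return null;
        }
        int slash = path.lastIndexOf('/');
        return slash == 0 ? "/" : path.substring(0, slash);
    }

    public static void main(String[] args) {
        EcPolicySketch ns = new EcPolicySketch();
        ns.setPolicy("/striped", "example-striped-policy"); // placeholder name
        ns.setPolicy("/striped/replica", REPLICATION_POLICY);

        // Inherits the striped policy from /striped.
        System.out.println(ns.getEffectivePolicy("/striped/file"));
        // Overridden by the replicated policy on the parent, reported as null.
        System.out.println(ns.getEffectivePolicy("/striped/replica/file"));
        // The actual query still shows where the override was set.
        System.out.println(ns.getActualPolicy("/striped/replica"));
    }
}
```

With only the effective query, a caller cannot tell whether "/striped/replica" is replicated because nothing was ever set or because the override is in place; the separate actual query is what distinguishes the two cases.
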
If you don't mind, one immediate improvement we could make is documenting in the {{getECPolicy}} javadoc that it returns the effective EC policy.

> Erasure Coding : Provide replicated EC policy to just replicating the files
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-11082
>                 URL: https://issues.apache.org/jira/browse/HDFS-11082
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>            Reporter: Rakesh R
>            Assignee: SammiChen
>            Priority: Critical
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-11082.001.patch
>
>
> The idea of this jira is to provide a new {{replicated EC policy}} so that we
> can override the EC policy on a parent directory and go back to just
> replicating the files based on replication factors.
> Thanks [~andrew.wang] for the
> [discussions|https://issues.apache.org/jira/browse/HDFS-11072?focusedCommentId=15620743&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15620743].

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)