[ https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210790#comment-16210790 ]
SammiChen commented on HDFS-12682: ---------------------------------- Hi [~xiaochen], thanks for reporting this issue. Inspired by your discovery, I found the same issue exists in system EC persist into and load from fsImage (HFDS-12686). The current convertErasureCodingPolicy function is perfect in most cases. For special cases, like get all erasure coding policy and persist policy into fsImage, I think we need a new edition for full convert. {quote} The problem I see from HDFS-12258's implementation though, is the mutable ECPS is saved on the immutable ECP, breaking assumptions such as shared single instance policy. At the same time the policy is still not persisted independently. I think ECPS is highly dependent on the missing piece from HDFS-7337: policies are not persisted to NN metadata. The state of whether a policy is enabled could be persisted together with the policy, without impacting HDFSFileStatus. {quote} Persist ec policies is implemented in HDFS-7337. {quote} I think this bug (HDFS-12682) and HDFS-12258 would make more sense if we could first persist policies to NN metadata. Would also be helpful to separate out something like ErasureCodingPolicyAndState for the policy-specific APIs, so the state isn't deserialized onto HDFSFileStatus. {quote} For HDFS-12258, [~zhouwei], [~drankye] and I, we discussed and do have two different approaches when we first think about how to implement it. One is the current implemented approach, which add one extra "state" field in the existing ECP definition. Another is define a new class, something like {{ErasureCodingPolicyWithState}} to hold the EPC and new policy state field. They are almost equally good. The only concern is if we introduce the new {{ErasureCodingPolicyWithState}}, it may introduce complexity to API interfaces, and to end users. There are multiple EC related APIs. If we return {{ErasureCodingPolicyWithState}} for {{getAllErasureCodingPolicies}} , should we return {{ErasureCodingPolicyWithState}} or {{ErasureCodingPolicy}} for {{getErasureCodingPolicy}}? something like that. Also is it worth to introduce a new class definition in Hadoop which only has 1 extra new field? After all the considerations, the current approach is chosen to leverage the existing ECP. Please let me know if you have other concerns. Thanks! > ECAdmin -listPolicies will always show policy state as DISABLED > --------------------------------------------------------------- > > Key: HDFS-12682 > URL: https://issues.apache.org/jira/browse/HDFS-12682 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding > Reporter: Xiao Chen > Assignee: Xiao Chen > Labels: hdfs-ec-3.0-must-do > > On a real cluster, {{hdfs ec -listPolicies}} will always show policy state as > DISABLED. > {noformat} > [hdfs@nightly6x-1 root]$ hdfs ec -listPolicies > Erasure Coding Policies: > ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, > numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5, State=DISABLED] > ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, > numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2, State=DISABLED] > ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, > numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1, State=DISABLED] > ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, > Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], > CellSize=1048576, Id=3, State=DISABLED] > ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, > numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4, State=DISABLED] > [hdfs@nightly6x-1 root]$ hdfs ec -getPolicy -path /ecec > XOR-2-1-1024k > {noformat} > This is because when [deserializing > protobuf|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java#L2942], > the static instance of [SystemErasureCodingPolicies > class|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/SystemErasureCodingPolicies.java#L101] > is first checked, and always returns the cached policy objects, which are > created by default with state=DISABLED. > All the existing unit tests pass, because that static instance that the > client (e.g. ECAdmin) reads in unit test is updated by NN. :) -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org