[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868948#comment-15868948 ]

Ming Ma commented on HDFS-7877:
-------------------------------

OK. Will follow up on the discussion in HDFS-11412.

> Support maintenance state for datanodes
> ---------------------------------------
>
>                 Key: HDFS-7877
>                 URL: https://issues.apache.org/jira/browse/HDFS-7877
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>        Attachments: HDFS-7877-2.patch, HDFS-7877.patch, Supportmaintenancestatefordatanodes-2.pdf, Supportmaintenancestatefordatanodes.pdf
>
> This requirement came up during the design for HDFS-7541. Given this feature is mostly independent of the upgrade domain feature, it is better to track it under a separate jira. The design and draft patch will be available soon.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864638#comment-15864638 ]

Manoj Govindassamy commented on HDFS-7877:
------------------------------------------

Thanks [~mingma]. Got it; when you combine this with Upgrade Domain, the impact is not that severe. I will make the following change to the Maintenance Min Replication range validation check.

{noformat}
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
@@ -484,12 +484,12 @@ public BlockManager(final Namesystem namesystem, boolean haEnabled,
             + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
             + " = " + minMaintenanceR + " < 0");
       }
-      if (minMaintenanceR > minR) {
+      if (minMaintenanceR > defaultReplication) {
         throw new IOException("Unexpected configuration parameters: "
             + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
             + " = " + minMaintenanceR + " > "
-            + DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY
-            + " = " + minR);
+            + DFSConfigKeys.DFS_REPLICATION_DEFAULT
+            + " = " + defaultReplication);
       }
{noformat}

bq. the transition policy from ENTERING_MAINTENANCE to IN_MAINTENANCE will become the # of live replicas >= min(dfs.namenode.maintenance.replication.min, replication factor).

But the transition from ENTERING_MM to IN_MM, which happens in {{DecommissionManager#Monitor#check}} and in turn calls {{DecommissionManager#isSufficient}}, looks OK to me. Because we allow files to be created with a custom block replication count, say 1, which can be less than the default dfs.replication=3.
And, since we should not be counting the Maintenance Replicas, the formula, as it exists currently, is:

{noformat}
expectedRedundancy = file_block_replication_count (e.g. 1)
                     or the default_replication_count (e.g. 3)

Math.max(expectedRedundancy - numberReplicas.maintenanceReplicas(),
    getMinMaintenanceStorageNum(block));
{noformat}

Let me know if I am missing something. Thanks.

---
Related code snippets:

{noformat}
  /**
   * Checks whether a block is sufficiently replicated/stored for
   * decommissioning. For replicated blocks or striped blocks, full-strength
   * replication or storage is not always necessary, hence "sufficient".
   * @return true if sufficient, else false.
   */
  private boolean isSufficient(BlockInfo block, BlockCollection bc,
      NumberReplicas numberReplicas, boolean isDecommission) {
    if (blockManager.hasEnoughEffectiveReplicas(block, numberReplicas, 0)) {
      // Block has enough replica, skip
      LOG.trace("Block {} does not need replication.", block);
      return true;
    }
    ..
    ..
    ..

  // Check if the number of live + pending replicas satisfies
  // the expected redundancy.
  boolean hasEnoughEffectiveReplicas(BlockInfo block,
      NumberReplicas numReplicas, int pendingReplicaNum) {
    int required = getExpectedLiveRedundancyNum(block, numReplicas);
    int numEffectiveReplicas = numReplicas.liveReplicas() + pendingReplicaNum;
    return (numEffectiveReplicas >= required) &&
        (pendingReplicaNum > 0 || isPlacementPolicySatisfied(block));
  }

  // Exclude maintenance, but make sure it has minimal live replicas
  // to satisfy the maintenance requirement.
  public short getExpectedLiveRedundancyNum(BlockInfo block,
      NumberReplicas numberReplicas) {
    final short expectedRedundancy = getExpectedRedundancyNum(block);
    return (short) Math.max(expectedRedundancy -
        numberReplicas.maintenanceReplicas(),
        getMinMaintenanceStorageNum(block));
  }
{noformat}
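The redundancy formula quoted above can be sketched as a standalone computation. This is a simplified model, not the actual BlockManager code: the class, method, and parameter names are hypothetical, with {{NumberReplicas}} and the min-maintenance lookup reduced to plain ints.

```java
// Simplified model of getExpectedLiveRedundancyNum: maintenance replicas are
// excluded from the expected redundancy, but the result never drops below the
// minimum number of replicas that must stay live during maintenance.
public final class ExpectedLiveRedundancy {
    static int expectedLiveRedundancy(int expectedRedundancy,
                                      int maintenanceReplicas,
                                      int minMaintenanceStorage) {
        // e.g. replication factor 3 with 1 replica in maintenance and a
        // maintenance minimum of 1 still expects 2 live replicas.
        return Math.max(expectedRedundancy - maintenanceReplicas,
                minMaintenanceStorage);
    }
}
```

This matches the comment's point: files created with replication=1 never require more live replicas than they have.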
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15862549#comment-15862549 ]

Ming Ma commented on HDFS-7877:
-------------------------------

Thanks [~manojg]. Good point. What you suggested makes sense. The reason we don't have this requirement in our production is probably that we only put nodes in one upgrade domain into maintenance at a time; after one batch is done, we move to the next upgrade domain. Thus no two replicas will be put into maintenance at the same time.

To confirm: given we will still allow applications to create blocks with a smaller replication factor than {{dfs.namenode.maintenance.replication.min}}, the transition policy from {{ENTERING_MAINTENANCE}} to {{IN_MAINTENANCE}} becomes: # of live replicas >= min({{dfs.namenode.maintenance.replication.min}}, replication factor).
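The transition policy stated above can be expressed as a small predicate. This is an illustrative sketch with hypothetical names, not the actual DecommissionManager implementation.

```java
// Sketch of the ENTERING_MAINTENANCE -> IN_MAINTENANCE transition policy:
// a block is satisfied once its live replica count reaches
// min(dfs.namenode.maintenance.replication.min, replication factor), so a
// file with replication factor below the maintenance minimum (e.g. 1) does
// not block its node from entering maintenance.
public final class MaintenanceTransitionPolicy {
    static boolean satisfiesMaintenanceMinimum(int liveReplicas,
                                               int maintenanceMinReplication,
                                               int replicationFactor) {
        int required = Math.min(maintenanceMinReplication, replicationFactor);
        return liveReplicas >= required;
    }
}
```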
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15861802#comment-15861802 ]

Manoj Govindassamy commented on HDFS-7877:
------------------------------------------

[~mingma], [~dilaver] brought up a good point regarding the restrictions on the allowed range for the configuration {{dfs.namenode.maintenance.replication.min}}.

Currently the allowed range for Maintenance Min Replication is {{0 to dfs.namenode.replication.min (default=1)}}. Users who do not want to affect the performance of the cluster might wish to set the Maintenance Min Replication number greater than 1, say 2. In the current design it is possible to have such a Maintenance Min Replication configuration, but only after raising the NameNode-level Block Min Replication to 2, which could slow down the overall latency of client writes.

Technically speaking, we should allow Maintenance Min Replication to be in the range {{0 to dfs.replication.max}}. There is always the config value of 0 for users not wanting any availability/performance guarantee during maintenance. And performance-centric workloads can still get maintenance done without major disruption by using a bigger Maintenance Min Replication.

So, are there any reasons why you wanted the Maintenance Min Replication range to be restrictive and less than or equal to {{dfs.namenode.replication.min}}? Maybe I am overlooking something here. Please clarify.
{noformat}
    if (minMaintenanceR < 0) {
      throw new IOException("Unexpected configuration parameters: "
          + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
          + " = " + minMaintenanceR + " < 0");
    }
    if (minMaintenanceR > minR) {
      throw new IOException("Unexpected configuration parameters: "
          + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
          + " = " + minMaintenanceR + " > "
          + DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY
          + " = " + minR);
{noformat}
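The range restriction under discussion can be restated as a boolean check. This is a hypothetical, simplified sketch (not BlockManager code): it expresses the relaxed range proposed in this thread, where the upper bound is the default replication factor (dfs.replication, typically 3) rather than dfs.namenode.replication.min.

```java
// Hypothetical restatement of the maintenance-min-replication range check,
// relaxed so that values up to the default replication factor are accepted.
public final class MaintenanceMinRange {
    static boolean inAllowedRange(int minMaintenanceR, int defaultReplication) {
        // 0 disables the availability guarantee during maintenance; values
        // above the default replication factor can never be satisfied.
        return minMaintenanceR >= 0 && minMaintenanceR <= defaultReplication;
    }
}
```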
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940676#comment-14940676 ]

Ming Ma commented on HDFS-7877:
-------------------------------

Maybe we should try to support persistence for the timeout. We can persist the maintenance expiration UTC time via the new mechanism discussed in HDFS-9005. The clocks can be out of sync among NNs, but we can accept that given the maintenance timeout precision is on the order of minutes. [~ctrezzo] [~eddyxu], thoughts?
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739108#comment-14739108 ]

Ming Ma commented on HDFS-7877:
-------------------------------

For the open issues around timeout and persistence, [~ctrezzo], [~eddyxu], and I had some offline discussion. We also discussed with our admins. Input from others is appreciated.

* Timeout support. We should support it.
* Persistence vs. soft state. Persistence is desirable for some cases, but soft state is acceptable. From the application's point of view, if it asks HDFS to time out the maintenance state, it would ideally like HDFS to honor the request (applications don't care about failover and restart as long as HDFS is up). Soft state means HDFS won't honor the timeout value if there is an NN failover/restart. For some scenarios admins would prefer HDFS to honor the request across NN failover/restart, but they can also accept the soft-state approach.
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636255#comment-14636255 ]

Joep Rottinghuis commented on HDFS-7877:
----------------------------------------

What do we need to do to get this going (again) in OSS? Just FYI, we're moving forward with this at Twitter on production clusters.
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594075#comment-14594075 ]

Ming Ma commented on HDFS-7877:
-------------------------------

Thanks [~rajive] for your input! I also discussed with [~rawk].

* Support for timeout. It sounds like folks prefer to have HDFS support that, which makes sense. A value of -1 could mean no timeout. In addition, based on the current scenarios it seems we don't need to support a per-host timeout; instead we can use a global timeout value.
* Support for persistence. If we don't persist the maintenance state into some file, it will be lost after NN restart. In other words, the node will be transitioned out of maintenance state upon NN restart. So from the admin's point of view, the node could be transitioned out of maintenance state prior to the timeout. Are we OK with such possible inconsistency?
* Whether the node should be taken out of DECOMMISSIONING when it becomes dead. The admin state is separate from the liveness state. The reason the node is kept in DECOMMISSIONING state is to address a data reliability issue; HDFS-6791 has more details.
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588775#comment-14588775 ]

Rajiv Chittajallu commented on HDFS-7877:
-----------------------------------------

* It would be preferable to have a timeout on the maintenance state, which would be higher than {{dfs.namenode.heartbeat.recheck-interval}}.
* Instead of specifying hosts in a file, {{dfs.hosts.maintenance}}, can this be done via {{dfsadmin}}? Maintenance mode is a temporary, transient state and it would be simpler not to track it via files.

bq. That is why we have the case where if a node becomes dead when it is being decommissioned, it will remain in DECOMMISSION_IN_PROGRESS state until all the blocks are properly replicated.

If a datanode goes offline while decommissioning, it should be treated as dead and not be in {{DECOMMISSION_IN_PROGRESS}} state. Re-replicating blocks for nodes in the dead state should be treated with higher priority.
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588661#comment-14588661 ]

Kihwal Lee commented on HDFS-7877:
----------------------------------

[~rajive] It would be nice if we could get your perspective on this.
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356375#comment-14356375 ]

Ming Ma commented on HDFS-7877:
-------------------------------

Thanks Eddy for the review and suggestions. Please find my responses below; Chris might have more to add.

bq. Why is the node state the combination of live|dead and In service|Decommissioned|In maintenance..?

There are two state machines for a datanode. One is the liveness state; the other is the admin state. HDFS-7521 has some discussion around that. So a datanode can be in any combination of these two states. That is why we have the case where, if a node becomes dead while it is being decommissioned, it will remain in {{DECOMMISSION_IN_PROGRESS}} state until all the blocks are properly replicated.

bq. After NN re-starts, I think NN could not find out whether DN is in enter_maintenance or in_maintenance mode?

The design handles the datanode state management for {{ENTERING_MAINTENANCE}} and {{IN_MAINTENANCE}} somewhat similarly to {{DECOMMISSION_IN_PROGRESS}} and {{DECOMMISSIONED}}, in the following ways:

1. When a node registers with the NN (could be a datanode restart or an NN restart), it will first transition to DECOMMISSION_IN_PROGRESS if it is in the exclude file, or ENTERING_MAINTENANCE if it is in the maintenance file.
2. Only after the target replication has been reached will it be transitioned to the final state, DECOMMISSIONED or IN_MAINTENANCE.

bq. Moreover, after NN restarts, if a DN is actually in the maintenance mode (DN is shutting down for maintenance), NN could not receive block reports from this DN.

After the NN restarts, if a DN in the maintenance file doesn't register with the NN, it won't be in {{DatanodeManager}}'s {{datanodeMap}} and thus its state won't be tracked. So it should be similar to how decommission is handled.
If the DN does register with the NN, there is a bug in the patch: it should check whether the NN has received a block report from the DN, so that it doesn't prematurely transition the DN to {{in_maintenance}} state.

bq. Is put the dead node into maintenance mode necessary?

Good question: is it OK to keep the node in {{dead, normal}} state when admins add it to the maintenance file? The intention is to stay consistent with the actual content of the maintenance file. It is similar to how decommission is handled: if you add a dead node to the exclude file, the node will go directly into {{DECOMMISSIONED}} state. For replica processing, the {{dead, in_maintenance}} -> {{live, in_maintenance}} transition won't trigger excess block removal; {{live, in_maintenance}} -> {{live, normal}} will.

bq. Timeout support

Good suggestion. We discussed this topic during the design discussion. We felt the admin script could handle that outside HDFS: upon timeout, the admin script can remove the node from the maintenance file and thus trigger replication. If we support timeout in HDFS, nodes in the maintenance file won't necessarily be in maintenance states. Alternatively, we can add another state called maintenance_timeout, but that might be too complicated. I can understand the benefit of having a timeout here, so we would like to hear others' suggestions.

There are two new topics we want to bring up:

* The original design doc uses the cluster default minimal replication factor to decide if the node can exit the {{ENTERING_MAINTENANCE}} state. We might want to use a new config value so that we can set the value to two. For a scenario like Hadoop software upgrade, if used together with upgrade domains, the two-replica requirement will be met right away for most blocks. For a scenario like rack repair, two replicas give us better data availability. At the very least, we can test out different values independent of the cluster's minimal replication factor.
* Whether reads are allowed on a node in {{ENTERING_MAINTENANCE}} state. Perhaps we should support that.
That will handle the case where that is the only replica available. We can put such a replica at the end of LocatedBlock.
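The two orthogonal state machines discussed in this comment (liveness and admin state) can be modeled as a pair of enums. This is a toy illustration, not the actual DatanodeInfo code; the excess-removal predicate encodes only the two transitions called out above.

```java
// Toy model of the two independent datanode state dimensions: a node is
// always in one liveness state AND one admin state, so e.g. a dead node can
// still be DECOMMISSION_IN_PROGRESS or IN_MAINTENANCE.
public final class DatanodeStateModel {
    enum Liveness { LIVE, DEAD }
    enum AdminState {
        NORMAL, DECOMMISSION_IN_PROGRESS, DECOMMISSIONED,
        ENTERING_MAINTENANCE, IN_MAINTENANCE
    }

    // Per the discussion: (dead, in_maintenance) -> (live, in_maintenance)
    // does not trigger excess-block removal, while
    // (live, in_maintenance) -> (live, normal) does.
    static boolean triggersExcessRemoval(Liveness fromLiveness,
                                         AdminState fromAdmin,
                                         Liveness toLiveness,
                                         AdminState toAdmin) {
        return fromLiveness == Liveness.LIVE
                && fromAdmin == AdminState.IN_MAINTENANCE
                && toLiveness == Liveness.LIVE
                && toAdmin == AdminState.NORMAL;
    }
}
```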
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355589#comment-14355589 ]

Lei (Eddy) Xu commented on HDFS-7877:
-------------------------------------

Hi, [~mingma]. This work looks great and more comprehensive than HDFS-6729. I especially like the design in which the NN checks the single replica of blocks before setting a DN to maintenance mode: it is safer than HDFS-6729. I have a few questions regarding the rest of your design.

* Why is the node state the combination of {{live|dead}} and {{In service|Decommissioned|In maintenance..}}? Do we need to keep a DN in {{maintenance}} mode if it is dead? It makes the state machine very complex.
* Is the DN state (e.g., enter_maintenance or in_maintenance) kept only in the NN's memory? After the NN restarts, I think the NN could not find out whether a DN is in {{enter_maintenance}} or {{in_maintenance}} mode. Is there a default mode you will assume for a DN, or a way for the NN to decide which state the DN is in?
* Moreover, after the NN restarts, if a DN is actually in maintenance mode (the DN is shut down for maintenance), the NN could not receive block reports from this DN. If this is the case, would the NN miscalculate the blockMap?
* bq. put the dead node into maintenance mode
Would it be necessary? As you mentioned, when a DN is dead, its blocks are already replicated to other nodes. In my understanding, maintenance mode is a way to tell the NN not to move data when the DN is actually offline. The logic which brings back a {{dead IN_MAINTENANCE}} DN and removes replicas from the block maps looks very similar to restarting a (dead) DN. Could it simply reuse that logic?
* In HDFS-6729, I considered maintenance mode a temporary soft state, because my understanding is that putting a DN into maintenance mode risks the availability of data. It essentially asks the NN to ignore one dead (in maintenance) replica.
As a result, I did not put DNs into a persistent configuration file, and instead let the user specify a timeout for the DN to be in maintenance mode. When the timeout expires (e.g., a 1-hour maintenance window), the NN considers this DN dead and re-replicates the blocks on this DN somewhere else. Does it make sense to you? Could you address this concern in your design? Looking forward to hearing from you, [~mingma]. Thanks again for this great work!
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353672#comment-14353672 ]

Lei (Eddy) Xu commented on HDFS-7877:
-------------------------------------

Hey [~mingma]. Thanks a lot for working on this. I am glad that this issue is being picked up! Please allow me some time to go through your docs and patch. I will post comments shortly.
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352357#comment-14352357 ]

Allen Wittenauer commented on HDFS-7877:
----------------------------------------

Isn't this effectively a dupe of HDFS-6729?
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352471#comment-14352471 ]

Ming Ma commented on HDFS-7877:
-------------------------------

Thanks Allen for pointing that out. We didn't know about HDFS-6729 at all. Let me check out the approach in that jira and we can combine the efforts.