[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2017-02-15 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868948#comment-15868948
 ] 

Ming Ma commented on HDFS-7877:
---

OK. Will follow up on the discussion in HDFS-11412.

> Support maintenance state for datanodes
> ---
>
> Key: HDFS-7877
> URL: https://issues.apache.org/jira/browse/HDFS-7877
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-7877-2.patch, HDFS-7877.patch, 
> Supportmaintenancestatefordatanodes-2.pdf, 
> Supportmaintenancestatefordatanodes.pdf
>
>
> This requirement came up during the design for HDFS-7541. Given this feature 
> is mostly independent of upgrade domain feature, it is better to track it 
> under a separate jira. The design and draft patch will be available soon.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2017-02-13 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864638#comment-15864638
 ] 

Manoj Govindassamy commented on HDFS-7877:
--

Thanks [~mingma]. Got it; when this is combined with Upgrade Domain, the 
impact is not that severe.

I will make the following change to the Maintenance Min Replication range 
validation check.

{noformat}
--- 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
+++ 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
@@ -484,12 +484,12 @@ public BlockManager(final Namesystem namesystem, boolean 
haEnabled,
   + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
   + " = " + minMaintenanceR + " < 0");
 }
-if (minMaintenanceR > minR) {
+if (minMaintenanceR > defaultReplication) {
   throw new IOException("Unexpected configuration parameters: "
   + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
   + " = " + minMaintenanceR + " > "
-  + DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY
-  + " = " + minR);
+  + DFSConfigKeys.DFS_REPLICATION_DEFAULT
+  + " = " + defaultReplication);
 }

{noformat}


bq. the transition policy from ENTERING_MAINTENANCE to IN_MAINTENANCE will 
become the # of live replicas >= min(dfs.namenode.maintenance.replication.min, 
replication factor).

But the transition from ENTERING_MAINTENANCE to IN_MAINTENANCE, which happens 
in {{DecommissionManager#Monitor#check}} and in turn calls 
{{DecommissionManager#isSufficient}}, looks OK to me. We allow files to be 
created with a custom block replication count, say 1, which can be less than 
the default dfs.replication=3. And since we should not count the maintenance 
replicas, the formula, as it exists currently, is:

{noformat}

// expectedRedundancy is the file's block replication count (e.g., 1)
// or the default replication count (e.g., 3)
Math.max(
    expectedRedundancy - numberReplicas.maintenanceReplicas(),
    getMinMaintenanceStorageNum(block));
{noformat}

Let me know if I am missing something. Thanks.
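To make the arithmetic concrete, here is a minimal self-contained sketch of 
the formula above (the method is an illustrative stand-in for 
{{getExpectedLiveRedundancyNum}}, with made-up numbers, not the actual 
BlockManager code):

```java
public class ExpectedRedundancySketch {
    // Stand-in for BlockManager#getExpectedLiveRedundancyNum: exclude
    // maintenance replicas from the expected count, but never require
    // fewer live replicas than the maintenance minimum.
    static int expectedLiveRedundancy(int expectedRedundancy,
                                      int maintenanceReplicas,
                                      int minMaintenanceReplicas) {
        return Math.max(expectedRedundancy - maintenanceReplicas,
                        minMaintenanceReplicas);
    }

    public static void main(String[] args) {
        // File with replication factor 1, its single replica entering
        // maintenance, dfs.namenode.maintenance.replication.min = 1:
        System.out.println(expectedLiveRedundancy(1, 1, 1)); // 1
        // Default replication 3, one replica in maintenance:
        System.out.println(expectedLiveRedundancy(3, 1, 1)); // 2
    }
}
```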


--- related code snippets 

{noformat}

  /**
   * Checks whether a block is sufficiently replicated/stored for
   * decommissioning. For replicated blocks or striped blocks, full-strength
   * replication or storage is not always necessary, hence "sufficient".
   * @return true if sufficient, else false.
   */
  private boolean isSufficient(BlockInfo block, BlockCollection bc,
  NumberReplicas numberReplicas, boolean isDecommission) {
if (blockManager.hasEnoughEffectiveReplicas(block, numberReplicas, 0)) {
  // Block has enough replica, skip
  LOG.trace("Block {} does not need replication.", block);
  return true;
}
..
..
..



  // Check if the number of live + pending replicas satisfies
  // the expected redundancy.
  boolean hasEnoughEffectiveReplicas(BlockInfo block,
  NumberReplicas numReplicas, int pendingReplicaNum) {
int required = getExpectedLiveRedundancyNum(block, numReplicas);
int numEffectiveReplicas = numReplicas.liveReplicas() + pendingReplicaNum;
return (numEffectiveReplicas >= required) &&
(pendingReplicaNum > 0 || isPlacementPolicySatisfied(block));
  }


  // Exclude maintenance, but make sure it has minimal live replicas
  // to satisfy the maintenance requirement.
  public short getExpectedLiveRedundancyNum(BlockInfo block,
  NumberReplicas numberReplicas) {
final short expectedRedundancy = getExpectedRedundancyNum(block);
return (short) Math.max(expectedRedundancy -
numberReplicas.maintenanceReplicas(),
getMinMaintenanceStorageNum(block));
  }
{noformat}




[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2017-02-11 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15862549#comment-15862549
 ] 

Ming Ma commented on HDFS-7877:
---

Thanks [~manojg]. Good point; what you suggested makes sense. The reason we 
don't have this requirement in our production is probably that we only put 
nodes in one upgrade domain into maintenance at a time; after one batch is 
done, we move to the next upgrade domain. Thus no two replicas are put into 
maintenance at the same time.

To confirm: given that we will still allow applications to create blocks with 
a smaller replication factor than {{dfs.namenode.maintenance.replication.min}}, 
the transition policy from {{ENTERING_MAINTENANCE}} to {{IN_MAINTENANCE}} 
becomes: # of live replicas >= 
min({{dfs.namenode.maintenance.replication.min}}, replication factor).
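A minimal sketch of that policy (the names are illustrative, not the actual 
DecommissionManager code):

```java
public class MaintenanceTransitionSketch {
    // Illustrative transition check: an ENTERING_MAINTENANCE node may
    // move to IN_MAINTENANCE once each of its blocks has at least
    // min(dfs.namenode.maintenance.replication.min, replication factor)
    // live replicas.
    static boolean canEnterMaintenance(int liveReplicas,
                                       int minMaintenanceReplication,
                                       int replicationFactor) {
        return liveReplicas >= Math.min(minMaintenanceReplication,
                                        replicationFactor);
    }

    public static void main(String[] args) {
        // dfs.namenode.maintenance.replication.min = 2, but the file was
        // created with replication factor 1 and one live replica remains:
        System.out.println(canEnterMaintenance(1, 2, 1)); // true
    }
}
```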




[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2017-02-10 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861802#comment-15861802
 ] 

Manoj Govindassamy commented on HDFS-7877:
--

[~mingma],

[~dilaver] brought up a good point about the restriction on the allowed range 
for the configuration {{dfs.namenode.maintenance.replication.min}}. Currently 
the allowed range for Maintenance Min Replication is {{0 to 
dfs.namenode.replication.min (default=1)}}. Users who do not want to affect 
cluster performance may wish to set the Maintenance Min Replication number 
greater than 1, say 2. In the current design this is possible, but only after 
raising the NameNode-level block min replication to 2, which could slow down 
the overall latency of client writes.

Technically speaking, we should allow Maintenance Min Replication to be in the 
range {{0 to dfs.replication.max}}. A value of 0 is still available for users 
who do not need any availability/performance during maintenance, and 
performance-centric workloads can get maintenance done without major 
disruption by using a larger Maintenance Min Replication. So, is there any 
reason you wanted the Maintenance Min Replication range to be restricted to 
less than or equal to {{dfs.namenode.replication.min}}? Maybe I am overlooking 
something here. Please clarify.

{noformat}
if (minMaintenanceR < 0) {
  throw new IOException("Unexpected configuration parameters: "
  + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
  + " = " + minMaintenanceR + " < 0");
}
if (minMaintenanceR > minR) {
  throw new IOException("Unexpected configuration parameters: "
  + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
  + " = " + minMaintenanceR + " > "
  + DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY
  + " = " + minR);
{noformat}
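For illustration, here is a hedged sketch of the relaxed range check being 
proposed (the class, method, and the use of dfs.replication.max are 
assumptions for this sketch, not the actual BlockManager code, which throws an 
IOException rather than returning a boolean):

```java
public class MaintenanceRangeCheck {
    // Hypothetical relaxed validation: allow 0 <= minMaintenanceR
    // <= dfs.replication.max instead of capping the value at
    // dfs.namenode.replication.min.
    static boolean isValidMinMaintenanceReplication(int minMaintenanceR,
                                                    int maxReplication) {
        return minMaintenanceR >= 0 && minMaintenanceR <= maxReplication;
    }

    public static void main(String[] args) {
        int maxReplication = 512;  // default of dfs.replication.max
        // 2 would be rejected by the current check (minR defaults to 1)
        // but accepted by the relaxed one:
        System.out.println(isValidMinMaintenanceReplication(2, maxReplication));
        System.out.println(isValidMinMaintenanceReplication(-1, maxReplication));
    }
}
```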




[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2015-10-01 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940676#comment-14940676
 ] 

Ming Ma commented on HDFS-7877:
---

Maybe we should try to support persistence for the timeout. We can persist the 
maintenance expiration UTC time via the new mechanism discussed in HDFS-9005. 
Clocks can be out of sync among NNs, but we can accept that given that the 
maintenance timeout precision is on the order of minutes. [~ctrezzo] [~eddyxu], 
thoughts?



[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2015-09-10 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14739108#comment-14739108
 ] 

Ming Ma commented on HDFS-7877:
---

For the open issues around timeout and persistence, [~ctrezzo] [~eddyxu] and I 
had some offline discussion. We also discussed with our admins. Appreciate 
input from others.

* Timeout support. We should support it.
* Persistence vs. soft state. Persistence is desirable in some cases, but soft 
state is acceptable. From an application's point of view, if it asks HDFS to 
time out the maintenance state, it would ideally like HDFS to honor the 
request (applications don't care about failover and restart as long as HDFS is 
up). Soft state means HDFS won't honor the timeout value if there is an NN 
failover/restart. In some scenarios admins would prefer that HDFS honor the 
request across NN failover/restart, but they can also accept the soft-state 
approach.



[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2015-07-21 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636255#comment-14636255
 ] 

Joep Rottinghuis commented on HDFS-7877:


What do we need to do to get this going (again) in OSS? Just FYI, we're moving 
forward with this at Twitter on production clusters.



[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2015-06-19 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594075#comment-14594075
 ] 

Ming Ma commented on HDFS-7877:
---

Thanks [~rajive] for your input! I also discussed with [~rawk].

* Support for timeout. It sounds like folks prefer to have HDFS support that, 
which makes sense. A value of -1 could mean no timeout. In addition, based on 
current scenarios it seems we don't need per-host timeouts; we can use a 
global timeout value instead.
* Support for persistence. If we don't persist the maintenance state to some 
file, it will be lost after an NN restart. In other words, the node will be 
transitioned out of maintenance state upon NN restart, so from an admin's 
point of view the node could leave maintenance prior to the timeout. Are we OK 
with such possible inconsistency?
* Whether the node should be taken out of DECOMMISSIONING when it becomes 
dead. The admin state is separate from the liveness state. The reason the node 
is kept in DECOMMISSIONING state is to address a data reliability issue; 
HDFS-6791 has more details.



[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2015-06-16 Thread Rajiv Chittajallu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14588775#comment-14588775
 ] 

Rajiv Chittajallu commented on HDFS-7877:
-

* It would be preferable to have a timeout for the maintenance state, which 
would be higher than {{dfs.namenode.heartbeat.recheck-interval}}.
* Instead of specifying hosts in a file, {{dfs.hosts.maintenance}}, can this 
be done via {{dfsadmin}}? Maintenance mode is a temporary, transient state, 
and it would be simpler not to track it via files.

bq. That is why we have the case where if a node becomes dead when it is being 
decommissioned, it will remain in DECOMMISSION_IN_PROGRESS state until all the 
blocks are properly replicated.

If a datanode goes offline while decommissioning, it should be treated as dead 
and should not stay in {{DECOMMISSION_IN_PROGRESS}} state. Re-replicating 
blocks for nodes in the dead state should be treated with higher priority.



[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2015-06-16 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14588661#comment-14588661
 ] 

Kihwal Lee commented on HDFS-7877:
--

[~rajive] It would be nice if we could get your perspective on this.



[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2015-03-11 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356375#comment-14356375
 ] 

Ming Ma commented on HDFS-7877:
---

Thanks Eddy for the review and suggestions. Please find my responses below. 
Chris might have more to add.

bq. Why is the node state the combination of live|dead and In 
service|Decommissioned|In maintenance..?
There are two state machines for a datanode. One is the liveness state; the 
other is the admin state. HDFS-7521 has some discussion around that. A 
datanode can be in any combination of these two states. That is why, if a node 
becomes dead while it is being decommissioned, it remains in 
{{DECOMMISSION_IN_PROGRESS}} state until all the blocks are properly 
replicated.
 

bq. After NN re-starts, I think NN could not find out whether DN is in 
enter_maintenance or in_maintenance mode?
The design handles datanode state management for {{ENTERING_MAINTENANCE}} and 
{{IN_MAINTENANCE}} somewhat similarly to {{DECOMMISSION_IN_PROGRESS}} and 
{{DECOMMISSIONED}}, in the following ways.

1. When a node registers with the NN (on either a datanode restart or an NN 
restart), it first transitions to DECOMMISSION_IN_PROGRESS if it is in the 
exclude file, or to ENTERING_MAINTENANCE if it is in the maintenance file.
2. Only after the target replication has been reached does it transition to 
the final state, DECOMMISSIONED or IN_MAINTENANCE.

bq. Moreover, after NN restarts, if a DN is actually in the maintenance mode 
(DN is shutting down for maintenance), NN could not receive block reports from 
this DN.
After the NN restarts, if a DN in the maintenance file doesn't register with 
the NN, it won't be in {{DatanodeManager}}'s {{datanodeMap}} and thus its 
state won't be tracked. So it should be similar to how decommission is 
handled.

If the DN does register with the NN, the patch has a bug: it doesn't check 
whether the NN has received a block report from the DN, so it may prematurely 
transition the DN to {{in_maintenance}} state.

bq. Is put the dead node into maintenance mode necessary?
Good question; is it OK to keep the node in {{dead, normal}} state when admins 
add the node to the maintenance file?

The intention is to stay consistent with the actual content of the maintenance 
file. It is similar to how decommission is handled: if you add a dead node to 
the exclude file, the node goes directly into {{DECOMMISSIONED}} state. For 
replica processing, the {{dead, in_maintenance}} -> {{live, in_maintenance}} 
transition won't trigger excess block removal; {{live, in_maintenance}} -> 
{{live, normal}} will.

bq. Timeout support
Good suggestion. We discussed this topic during the design discussion. We feel 
the admin script can handle that outside HDFS: upon timeout, the admin script 
can remove the node from the maintenance file and thus trigger replication. If 
we support timeout inside HDFS, nodes in the maintenance file won't 
necessarily be in maintenance states. Alternatively, we could add another 
state called maintenance_timeout, but that might be too complicated. I can 
understand the benefit of having a timeout here, so we would like to hear 
others' suggestions.


There are two new topics we want to bring up.

* The original design doc uses the cluster's default minimal replication 
factor to decide whether a node can exit the {{ENTERING_MAINTENANCE}} state. 
We might want to use a new config value so that we can set it to two. For a 
scenario like a Hadoop software upgrade, if used together with upgrade 
domains, two replicas will be met right away for most blocks. For a scenario 
like a rack repair, two replicas give us better data availability. At the 
least, we can test out different values independent of the cluster's minimal 
replication factor.

* Whether reads are allowed on a node in the {{ENTERING_MAINTENANCE}} state. 
Perhaps we should support that; it handles the case where that is the only 
replica available. We can put such replicas at the end of the LocatedBlock.
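The read-ordering idea in the last bullet could be sketched as follows (the 
enum and method are illustrative, not the actual NameNode code):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ReplicaOrderingSketch {
    // Keep replicas on ENTERING_MAINTENANCE nodes readable, but sort
    // them to the end of the located-block list so clients prefer
    // replicas on normal nodes when any exist.
    enum AdminState { NORMAL, ENTERING_MAINTENANCE }

    static List<AdminState> orderForRead(List<AdminState> replicas) {
        List<AdminState> ordered = new ArrayList<>(replicas);
        // false (NORMAL) sorts before true (ENTERING_MAINTENANCE).
        ordered.sort(Comparator.comparing(
            (AdminState s) -> s == AdminState.ENTERING_MAINTENANCE));
        return ordered;
    }

    public static void main(String[] args) {
        System.out.println(orderForRead(List.of(
            AdminState.ENTERING_MAINTENANCE, AdminState.NORMAL)));
    }
}
```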





[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2015-03-10 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355589#comment-14355589
 ] 

Lei (Eddy) Xu commented on HDFS-7877:
-

Hi, [~mingma]. This work looks great and is more comprehensive than HDFS-6729. 
I especially like the design where the NN checks blocks with a single replica 
before setting a DN to maintenance mode: it is safer than HDFS-6729.

I have a few questions regarding the rest of your design.

* Why is the node state the combination of {{live|dead}} and {{In 
service|Decommissioned|In maintenance..}}? Do we need to keep a DN in 
{{maintenance}} mode if it is dead? It makes the state machine very complex.
* Is DN state (e.g., enter_maintenance or in_maintenance) kept only in the 
NN's memory? After the NN restarts, I think the NN could not find out whether 
a DN is in {{enter_maintenance}} or {{in_maintenance}} mode. Is there any 
default mode you will assume for a DN? Or is there a way for the NN to decide 
which state the DN is in?
* Moreover, after the NN restarts, if a DN is actually in maintenance mode 
(the DN is shut down for maintenance), the NN could not receive block reports 
from this DN. If this is the case, would the NN miscalculate the blockMap?
* bq. put the dead node into maintenance mode
Is this necessary? As you mentioned, when a DN is dead, its blocks are already 
replicated to other nodes. In my understanding, maintenance mode is a way to 
tell the NN not to move data when the DN is actually offline. The logic that 
brings back a {{dead IN_MAINTENANCE}} DN and removes replicas from the block 
maps looks very similar to restarting a (dead) DN. Could it simply reuse that 
logic?
* In HDFS-6729, I considered maintenance mode a temporary soft state, because 
putting a DN into maintenance mode risks the availability of data: it 
essentially asks the NN to ignore one dead (in-maintenance) replica. As a 
result, I did not put DNs into a persistent configuration file, and instead 
let the user specify a timeout for the DN to be in maintenance mode. When the 
timeout expires (e.g., a 1-hour maintenance window), the NN considers this DN 
dead and re-replicates its blocks elsewhere. Does that make sense to you? 
Could you address this concern in your design?

Looking forward to hearing from you, [~mingma]. Thanks again for this great 
work!





[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2015-03-09 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14353672#comment-14353672
 ] 

Lei (Eddy) Xu commented on HDFS-7877:
-

Hey [~mingma], thanks a lot for working on this. I am glad this issue is being 
picked up!

Please allow me some time to go through your docs and patch. I will post 
comments shortly.



[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2015-03-08 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14352357#comment-14352357
 ] 

Allen Wittenauer commented on HDFS-7877:


Isn't this effectively a dupe of HDFS-6729?



[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2015-03-08 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14352471#comment-14352471
 ] 

Ming Ma commented on HDFS-7877:
---

Thanks, Allen, for pointing that out. We didn't know about HDFS-6729 at all. 
Let me check out the approach in that jira so we can combine the efforts.
