[
https://issues.apache.org/jira/browse/HDFS-6626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064408#comment-14064408
]
Andrew Wang commented on HDFS-6626:
---
Hi Ming,
I think the main goal of decommissioning is to shift blocks off of a DN, which
is done at low priority to avoid disrupting the cluster. However, if a DN dies
while decommissioning, HDFS is forced to immediately re-replicate all of its
blocks at high priority. Thus, the end result of a successful decommission and
of an "aborted" decommission, as you term it, is the same: no blocks on that DN.
What additional actions would the admin be able to take if we also had a
"decommission aborted" state? If you're interested in process / host health,
that's typically handled by dedicated monitoring tools like CM or Ganglia.
Node is marked decommissioned if it becomes dead when it is being
decommissioned
Key: HDFS-6626
URL: https://issues.apache.org/jira/browse/HDFS-6626
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Ming Ma
Not sure if this is by design, but it isn't intuitive. The scenario is this:
you start decommissioning a node; while it is being decommissioned, the node
becomes dead from the NN's point of view; right after that, the NN marks the
node as decommissioned. On the webUI, administrators will assume the
decommission completed successfully. This happens because decommission is
considered done once no blocks are left for the DN.
{noformat}
BlockManager.java
boolean isReplicationInProgress(DatanodeDescriptor srcNode) {
  boolean status = false;
  ...
  final Iterator<? extends Block> it = srcNode.getBlockIterator();
  while (it.hasNext()) {
    ...
    // set status if there is block under replication
  }
  ...
  return status;
}
{noformat}
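To make the effect concrete, here is a minimal, self-contained sketch of the same loop shape. It is not the real HDFS code: the Block and Node classes are hypothetical stand-ins for Block and DatanodeDescriptor, and the assumption that a dead node has an empty block list in the NN's view is taken from the description above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified model; not the real BlockManager or DatanodeDescriptor.
class DecommissionSketch {

    /** Stand-in for a block; only tracks whether it still needs replication. */
    static final class Block {
        final boolean underReplicated;
        Block(boolean underReplicated) { this.underReplicated = underReplicated; }
    }

    /** Stand-in for a datanode and the blocks the NN associates with it. */
    static final class Node {
        final List<Block> blocks = new ArrayList<>();
        Iterator<? extends Block> getBlockIterator() { return blocks.iterator(); }
    }

    /** Same shape as the loop above: true only if some block is still under replication. */
    static boolean isReplicationInProgress(Node srcNode) {
        boolean status = false;
        final Iterator<? extends Block> it = srcNode.getBlockIterator();
        while (it.hasNext()) {
            if (it.next().underReplicated) {
                status = true;
            }
        }
        return status;
    }

    public static void main(String[] args) {
        Node live = new Node();
        live.blocks.add(new Block(true));
        // Assumption: once the node is dead, the NN no longer lists any blocks for it.
        Node dead = new Node();
        System.out.println("live node considered done: " + !isReplicationInProgress(live));
        System.out.println("dead node considered done: " + !isReplicationInProgress(dead));
    }
}
```

With an empty block list the loop body never runs, so status stays false and the dead node looks exactly like a node whose decommission finished normally.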
The question is whether we should mark the dead node as decommission
completed (the current behavior) or as decommission aborted. From an
administrator's point of view, during a decommission they want to know both
its status and the health of the decomm-in-progress nodes. If they can detect
a decommission failure earlier, they may be able to act earlier; for example,
if the TOR switch has issues during decomm, administrators can quickly spot a
group of decommission-aborted nodes in the same rack. People can still get
this information by joining the decomm node list with the recent dead node
list on the webUI; it is just not as convenient.
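The manual join mentioned above could be sketched roughly as follows. This is only an illustration: the NodeInfo class, the abortedByRack helper, and the sample host and rack names are all hypothetical, and the two input lists are assumed to have been copied from the webUI by hand rather than read from any HDFS API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper; inputs are assumed to be collected manually from the webUI.
class AbortedDecommissionJoin {

    /** Hypothetical record of a node under decommission: hostname plus rack path. */
    static final class NodeInfo {
        final String host;
        final String rack;
        NodeInfo(String host, String rack) { this.host = host; this.rack = rack; }
    }

    /** Joins the decomm list with the dead-host set and groups the matches by rack. */
    static Map<String, List<String>> abortedByRack(List<NodeInfo> decommissioning,
                                                   Set<String> deadHosts) {
        Map<String, List<String>> result = new TreeMap<>();
        for (NodeInfo n : decommissioning) {
            if (deadHosts.contains(n.host)) {
                result.computeIfAbsent(n.rack, r -> new ArrayList<>()).add(n.host);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<NodeInfo> decomm = new ArrayList<>();
        decomm.add(new NodeInfo("dn1", "/rack1"));
        decomm.add(new NodeInfo("dn2", "/rack1"));
        decomm.add(new NodeInfo("dn3", "/rack2"));

        Set<String> dead = new HashSet<>();
        dead.add("dn1");
        dead.add("dn2");
        dead.add("dn9"); // dead, but was never being decommissioned

        // Racks with several entries hint at a shared cause such as a TOR switch.
        System.out.println(abortedByRack(decomm, dead));
    }
}
```

A dedicated decommission aborted state would give the same grouping directly on the webUI, without this extra step.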
Suggestions?