[ https://issues.apache.org/jira/browse/HDFS-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867975#comment-16867975 ]
Kihwal Lee edited comment on HDFS-14563 at 6/19/19 8:27 PM:
------------------------------------------------------------

{quote}
It may make sense to have an API like {{setNodeStatus(nodeId, state, param)}}.
{quote}
+1

{quote}Kihwal Lee, I think this is a good chance to revisit this and make this per cluster and not per namenode.
{quote}
It sounds good in concept, but I am worried about an additional moving piece to set up. However, we can add it as an option in addition to the existing host file-based approach. Today, we (Yahoo!) use a network-shared directory to store the include/exclude files.

was (Author: kihwal):
bq. It may make sense to have an API like \{{setNodeStatus(nodeId, state, param)}}.
+1
bq. Kihwal Lee, I think this is a good chance to revisit this and make this per cluster and not per namenode.
It sounds good in concept, but I am worried about additional moving piece to setup. However, we can add it as an option in addition to the existing host file-based approach.

> Enhance interface about recommissioning/decommissioning
> -------------------------------------------------------
>
>                 Key: HDFS-14563
>                 URL: https://issues.apache.org/jira/browse/HDFS-14563
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client, namenode
>            Reporter: He Xiaoqiao
>            Assignee: He Xiaoqiao
>            Priority: Major
>         Attachments: HDFS-14563.001.patch, HDFS-14563.002.patch, mt_mode-2.txt
>
>
> In the current implementation, if we need to decommission or recommission
> a datanode, the only way is to add the datanode to the include or exclude
> file under the namenode configuration path, then execute the command
> `bin/hadoop dfsadmin -refreshNodes` to trigger the namenode to reload
> include/exclude and start recommissioning or decommissioning the datanode.
> The shortcomings of this approach are:
> a. The namenode reloads the include/exclude configuration files from
> storage devices; if I/O load is high, the handler may be blocked.
> b. The namenode has to process every datanode in the include and exclude
> configurations; if there are many datanodes (very common for a large
> cluster) pending processing, the namenode can hang for hundreds of seconds
> in the worst case while waiting for recommission/decommission to finish,
> since it holds the write lock.
> I think we should expose a lightweight interface to support recommissioning
> or decommissioning a single datanode, so that we can operate on datanodes
> using dfsadmin more smoothly.
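
For readers following the thread, the per-node call floated above, {{setNodeStatus(nodeId, state, param)}}, could look roughly like the sketch below. This is only an illustration of the idea and not code from the attached patches or from HDFS: the class name, the {{AdminState}} enum, the {{register}} helper and the type chosen for {{param}} are all assumptions made for the example. The point it shows is that changing one node's state touches only that node's entry, with no host-file reload and no pass over every datanode.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only: none of these names exist in HDFS.
class NodeStatusRegistry {

    /** Simplified admin states, assumed for this example. */
    enum AdminState { IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED }

    // Per-node state; a concurrent map avoids one global lock for single-node updates.
    private final Map<String, AdminState> states = new ConcurrentHashMap<>();

    /** Adds a node in the IN_SERVICE state if it is not already known. */
    void register(String nodeId) {
        states.putIfAbsent(nodeId, AdminState.IN_SERVICE);
    }

    /**
     * Changes the state of exactly one node and returns its previous state.
     * The param argument mirrors the proposed signature and is unused here.
     */
    AdminState setNodeStatus(String nodeId, AdminState state, Map<String, String> param) {
        // replace() only succeeds for keys that are already present.
        AdminState previous = states.replace(nodeId, state);
        if (previous == null) {
            throw new IllegalArgumentException("Unknown node: " + nodeId);
        }
        return previous;
    }

    /** Returns the current state, or null if the node is unknown. */
    AdminState getNodeStatus(String nodeId) {
        return states.get(nodeId);
    }

    public static void main(String[] args) {
        NodeStatusRegistry registry = new NodeStatusRegistry();
        registry.register("dn-1");
        registry.register("dn-2");

        // Start decommissioning dn-1 only; dn-2 is never touched.
        registry.setNodeStatus("dn-1", AdminState.DECOMMISSIONING,
                Collections.emptyMap());

        System.out.println("dn-1: " + registry.getNodeStatus("dn-1"));
        System.out.println("dn-2: " + registry.getNodeStatus("dn-2"));
    }
}
```

A real implementation would still need to start replication work for the node's blocks and to decide how such a call coexists with the include/exclude files, which is the open question in the comment above.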