[ https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jianfei Jiang updated HDFS-12935: --------------------------------- Attachment: HDFS-12935.008.patch > Get ambiguous result for DFSAdmin command in HA mode when only one namenode > is up > --------------------------------------------------------------------------------- > > Key: HDFS-12935 > URL: https://issues.apache.org/jira/browse/HDFS-12935 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools > Affects Versions: 2.9.0, 3.0.0-beta1, 3.0.0 > Reporter: Jianfei Jiang > Assignee: Jianfei Jiang > Priority: Major > Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch, > HDFS-12935.004.patch, HDFS-12935.005.patch, HDFS-12935.006-branch.2.patch, > HDFS-12935.006.patch, HDFS-12935.007-branch.2.patch, HDFS-12935.007.patch, > HDFS-12935.008.patch, HDFS_12935.001.patch > > > In HA mode, if one namenode is down, most of functions can still work. When > considering the following two occasions: > (1)nn1 up and nn2 down > (2)nn1 down and nn2 up > These two occasions should be equivalent. However, some of the DFSAdmin > commands will have ambiguous results. The commands can be send successfully > to the up namenode and are always functionally useful only when nn1 is up > regardless of exception (IOException when connecting to the down namenode > nn2). If only nn2 is up, the commands have no use at all and only exception > to connect nn1 can be found. > See the following command "hdfs dfsadmin setBalancerBandwidth" which aim to > set balancer bandwidth value for datanodes as an example. It works and all > the datanodes can get the setting values only when nn1 is up. If only nn2 is > up, the command throws exception directly and no datanode get the bandwidth > setting. Approximately ten DFSAdmin commands use the similar logical process > and may be ambiguous. > [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn1 > active > [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 12345 > *Balancer bandwidth is set to 12345 for jiangjianfei01/172.17.0.14:9820* > setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to > jiangjianfei02:9820 failed on connection exception: > java.net.ConnectException: Connection refused; For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused > [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn2 > active > [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 1234 > setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to > jiangjianfei01:9820 failed on connection exception: > java.net.ConnectException: Connection refused; For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused > [root@jiangjianfei01 ~]# -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org