[ https://issues.apache.org/jira/browse/AMBARI-7753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194338#comment-14194338 ]
jaehoon ko commented on AMBARI-7753:
------------------------------------

Can anyone help with testing? I cannot figure out how to test this change without actually decommissioning a DataNode from a secured cluster. It did work well on my cluster.

> DataNode decommision error in secured cluster
> ---------------------------------------------
>
>                 Key: AMBARI-7753
>                 URL: https://issues.apache.org/jira/browse/AMBARI-7753
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server, stacks
>    Affects Versions: 1.6.1
>         Environment: Ambari-1.6.1 with HDP-2.1.5
>            Reporter: jaehoon ko
>              Labels: patch
>         Attachments: AMBARI-7753.patch
>
>
> Decommissioning a DataNode from a secured cluster fails with the following messages:
> {code}
> STDERR:
> 2014-10-13 10:37:31,896 - Error while executing command 'decommission':
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 111, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/namenode.py", line 66, in decommission
>     namenode(action="decommission")
>   File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_namenode.py", line 70, in namenode
>     decommission()
>   File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_namenode.py", line 145, in decommission
>     user=hdfs_user
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 239, in action_run
>     raise ex
> Fail: Execution of '/usr/bin/kinit -kt /etc/security/keytabs/dn.service.keytab dn/master-6.amber.gbcl....@amber.gbcluster.net;' returned 1. kinit: Client not found in Kerberos database while getting initial credentials
> {code}
> {code}
> STDOUT:
> 2014-10-13 10:37:31,793 - File['/etc/hadoop/conf/dfs.exclude'] {'owner': 'hdfs', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
> 2014-10-13 10:37:31,796 - Writing File['/etc/hadoop/conf/dfs.exclude'] because contents don't match
> 2014-10-13 10:37:31,797 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/dn.service.keytab dn/master-6.amber.gbcl....@amber.gbcluster.net;'] {'user': 'hdfs'}
> 2014-10-13 10:37:31,896 - Error while executing command 'decommission':
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 111, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/namenode.py", line 66, in decommission
>     namenode(action="decommission")
>   File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_namenode.py", line 70, in namenode
>     decommission()
>   File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_namenode.py", line 145, in decommission
>     user=hdfs_user
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 239, in action_run
>     raise ex
> Fail: Execution of '/usr/bin/kinit -kt /etc/security/keytabs/dn.service.keytab dn/master-6.amber.gbcl....@amber.gbcluster.net;' returned 1. kinit: Client not found in Kerberos database while getting initial credentials
> {code}
> The reason is that Ambari-agent uses the DataNode principal to perform the HDFS refresh, which should be done as the NameNode. The decommission command runs on the NameNode host, where no matching DataNode principal exists in the Kerberos database, so kinit fails. This error can be solved by letting Ambari-agent use the NameNode Kerberos principal and keytab. Note that [AMBARI-5729|https://issues.apache.org/jira/browse/AMBARI-5729] solves a similar issue for NodeManager.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
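
On the testing question, one possible approach is to separate the construction of the kinit command from its execution, so that the choice of principal and keytab can be checked without a secured cluster. The following is only a minimal sketch of that idea and is not the code in AMBARI-7753.patch: the function names, the parameter names, and the `nn.service.keytab` path and `nn/host.example.com@EXAMPLE.COM` principal used in the example are all hypothetical.

```python
def build_kinit_command(kinit_path, keytab, principal):
    """Return the kinit command string for the given keytab and principal.

    The trailing ';' mirrors the command shown in the log output of the issue.
    """
    return "{0} -kt {1} {2};".format(kinit_path, keytab, principal)


def decommission_kinit_command(security_enabled, kinit_path, nn_keytab, nn_principal):
    """Return the kinit command to run before the HDFS refresh.

    Hypothetical helper: it always takes the NameNode keytab and principal,
    since the refresh is performed as the NameNode. It returns None when
    security is disabled, because no kinit is needed in that case.
    """
    if not security_enabled:
        return None
    return build_kinit_command(kinit_path, nn_keytab, nn_principal)


if __name__ == "__main__":
    # Example values are assumptions for illustration only.
    print(decommission_kinit_command(
        True,
        "/usr/bin/kinit",
        "/etc/security/keytabs/nn.service.keytab",
        "nn/host.example.com@EXAMPLE.COM",
    ))
```

With the command built by a plain function, a unit test can assert that the NameNode keytab and principal appear in the command string, and the actual decommission only needs to be exercised once on a real secured cluster.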