[ https://issues.apache.org/jira/browse/AMBARI-11941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Hurley updated AMBARI-11941:
-------------------------------------
    Attachment: AMBARI-11941.patch

> RU: Pre Upgrade HDFS Fails Due To Kerberos Security Exception
> -------------------------------------------------------------
>
>                 Key: AMBARI-11941
>                 URL: https://issues.apache.org/jira/browse/AMBARI-11941
>             Project: Ambari
>          Issue Type: Bug
>    Affects Versions: 2.1.0
>            Reporter: Jonathan Hurley
>            Assignee: Jonathan Hurley
>            Priority: Blocker
>             Fix For: 2.1.0
>
>         Attachments: AMBARI-11941.patch
>
>
> During an upgrade from HDP 2.2 to HDP 2.3, the pre-upgrade of the NameNode fails:
> {code:title=cat ip-172-31-41-15.ec2.internal/var/lib/ambari-agent/data/output-3113.txt}
> 2015-06-13 14:50:17,703 - call['hdfs dfsadmin -safemode get'] {'user': 'hdfs'}
> 2015-06-13 14:50:22,142 - call returned (255, '15/06/13 14:50:21 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\nsafemode: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "ip-172-31-41-15.ec2.internal/172.31.41.15"; destination host is: "ip-172-31-41-15.ec2.internal":8020; ')
> 2015-06-13 14:50:22,143 - Command: hdfs dfsadmin -safemode get
> Code: 255.
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 311, in <module>
>     NameNode().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 216, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 110, in prepare_rolling_upgrade
>     namenode_upgrade.prepare_rolling_upgrade()
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode_upgrade.py", line 100, in prepare_rolling_upgrade
>     raise Fail("Could not transition to safemode state %s. Please check logs to make sure namenode is up." % str(SafeMode.OFF))
> resource_management.core.exceptions.Fail: Could not transition to safemode state OFF. Please check logs to make sure namenode is up.
> {code}
> The heart of the issue is a Kerberos failure:
> {code}
> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\nsafemode: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)];
> {code}
> It looks like kinit was called right before this:
> {code}
> 2015-06-13 14:50:17,473 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hdfs.headless.keytab h...@example.com'] {}
> 2015-06-13 14:50:17,702 - Prepare to transition into safemode state OFF
> 2015-06-13 14:50:17,703 - call['hdfs dfsadmin -safemode get'] {'user': 'hdfs'}
> 2015-06-13 14:50:22,142 - call returned (255, '15/06/13 14:50:21 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException:...
> {code}
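For context, the following is a minimal, hypothetical sketch of the sequence the log shows (a kinit against the hdfs headless keytab followed by the safemode query), written with the Python standard library rather than Ambari's resource_management API. The principal hdfs@EXAMPLE.COM is a placeholder (the log truncates it to h...@example.com), and running both commands under the same hdfs user is an assumption made for illustration only; the actual fix is whatever is contained in the attached AMBARI-11941.patch.

{code}
import subprocess

KEYTAB = "/etc/security/keytabs/hdfs.headless.keytab"
PRINCIPAL = "hdfs@EXAMPLE.COM"  # placeholder -- the real principal is truncated in the log
RUN_AS = "hdfs"                 # the dfsadmin call in the log runs with {'user': 'hdfs'}


def run_as_user(cmd, user):
    """Run a shell command as the given user; return (exit_code, combined output)."""
    proc = subprocess.Popen(
        ["sudo", "-u", user, "-H", "sh", "-c", cmd],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    out, _ = proc.communicate()
    return proc.returncode, out


if __name__ == "__main__":
    # Obtain the TGT in the same user context that will run dfsadmin, so the
    # ticket cache consulted by 'hdfs dfsadmin' actually holds the credentials.
    code, out = run_as_user("/usr/bin/kinit -kt %s %s" % (KEYTAB, PRINCIPAL), RUN_AS)
    print("kinit exit code: %d\n%s" % (code, out))

    code, out = run_as_user("hdfs dfsadmin -safemode get", RUN_AS)
    print("safemode get exit code: %d\n%s" % (code, out))
{code}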