Robert Levas created AMBARI-20349:
-------------------------------------

             Summary: When SPNEGO authentication is enabled for Hadoop in a cluster with NN HA, PXF Process alert fails
                 Key: AMBARI-20349
                 URL: https://issues.apache.org/jira/browse/AMBARI-20349
             Project: Ambari
          Issue Type: Bug
          Components: ambari-server
    Affects Versions: 2.2.2
            Reporter: Robert Levas
            Assignee: Robert Levas
             Fix For: 2.5.0

When SPNEGO authentication is enabled for Hadoop in a cluster where NN HA is enabled, PXF Process alert fails with the following errors in the ambari-agent.log file
{noformat}
ERROR 2017-03-07 18:03:58,417 jmx.py:44 - Getting jmx metrics from NN failed. URL: http://c6401.ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 41, in get_value_from_jmx
    data_dict = json.loads(data)
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/decoder.py", line 335, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/decoder.py", line 353, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
INFO 2017-03-07 18:04:02,769 logger.py:71 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://c6402.ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmphTXg76 2>/tmp/tmp5bm2nM''] {'quiet': False}
INFO 2017-03-07 18:04:02,797 logger.py:71 - call returned (0, '')
ERROR 2017-03-07 18:04:02,798 jmx.py:44 - Getting jmx metrics from NN failed.
URL: http://c6402.ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 41, in get_value_from_jmx
    data_dict = json.loads(data)
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/decoder.py", line 335, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.6/site-packages/ambari_simplejson/decoder.py", line 353, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
{noformat}

*Cause*
During the test for the {{PXF Process}} alert, the Active NN is found using a JMX call. This call requires SPNEGO authentication since SPNEGO authentication is turned on for the Hadoop web interfaces. However, a valid Kerberos ticket is not found in the configured user's Kerberos ticket cache, so the JMX request does not return JSON and parsing fails with the {{ValueError}} shown above. In this case, the configured user is the HDFS user, which technically is not necessary.

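The missing-ticket condition described in the cause can be checked directly. The following is only a minimal sketch, not the code used by Ambari: the function names {{has_valid_ticket}} and {{ensure_ticket}} are hypothetical, and it assumes the MIT Kerberos {{klist}} and {{kinit}} commands are on the path. It also runs as the current OS user rather than switching users the way the agent does in the log above.

```python
import subprocess


def has_valid_ticket(run=subprocess.call):
    """Return True if the current user's ticket cache holds a usable ticket.

    `klist -s` prints nothing and exits 0 only when the credentials cache
    can be read and has not expired.
    """
    try:
        return run(["klist", "-s"]) == 0
    except OSError:
        # klist is not installed or could not be executed.
        return False


def ensure_ticket(keytab, principal, run=subprocess.call):
    """Obtain a ticket from a keytab only when the cache has no valid one.

    `run` is injectable so the logic can be exercised without Kerberos.
    """
    if has_valid_ticket(run):
        return True
    # `kinit -kt <keytab> <principal>` requests a TGT using the keytab.
    return run(["kinit", "-kt", keytab, principal]) == 0
```

A caller would invoke something like {{ensure_ticket(keytab_path, principal_name)}} before issuing the SPNEGO-protected JMX request, and treat a {{False}} result as a reason to skip or fail the check early instead of attempting to parse an empty response.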
The unauthenticated JMX lookup occurs in
{code:title=common-services/PXF/3.0.0/package/alerts/api_status.py:137}
if CLUSTER_ENV_SECURITY in configurations and configurations[CLUSTER_ENV_SECURITY].lower() == "true":
  if 'dfs.nameservices' in configurations[HDFS_SITE]:
    namenode_address = get_active_namenode(ConfigDictionary(configurations[HDFS_SITE]), configurations[CLUSTER_ENV_SECURITY], configurations[HADOOP_ENV_HDFS_USER])[1]
  else:
    namenode_address = configurations[HDFS_SITE]['dfs.namenode.http-address']

  token = _get_delegation_token(namenode_address,
                                configurations[HADOOP_ENV_HDFS_USER],
                                configurations[HADOOP_ENV_HDFS_USER_KEYTAB],
                                configurations[HADOOP_ENV_HDFS_PRINCIPAL_NAME],
                                None)
  commonPXFHeaders.update({"X-GP-TOKEN": token})
{code}

Specifically, the failure happens inside the call at
{code}
namenode_address = get_active_namenode(ConfigDictionary(configurations[HDFS_SITE]), configurations[CLUSTER_ENV_SECURITY], configurations[HADOOP_ENV_HDFS_USER])[1]
{code}

*Solution*
Ensure the configured user's Kerberos ticket cache contains a valid ticket before querying for the active NN. Possibly change the acting user to the one executing the PXF component.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)