I'm diagnosing an issue, and I think I found a bug with the ambari-agent code:

https://github.com/apache/ambari/blob/trunk/ambari-agent/src/main/python/ambari_agent/Controller.py#L390

If 'cluster_name' contains spaces, this request fails because the code does 
not URL-encode the value.  This puts all of the agents into the HEARTBEAT_LOST 
state and everything fails, but the error written to the agent log is hugely 
misleading:

ERROR 2015-04-08 18:30:20,312 Controller.py:140 - Unable to connect to: https://ambari.local:8441/agent/v1/register/ambari.local
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 128, in registerWithServer
    self.addToStatusQueue(ret['statusCommands'])
  File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 172, in addToStatusQueue
    self.updateComponents(commands[0]['clusterName'])
  File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 360, in updateComponents
    response = self.sendRequest(self.componentsUrl + cluster_name, None)
  File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 353, in sendRequest
    + '; Response: ' + str(response))
IOError: Response parsing failed! Request data: None; Response:

The agent connected fine and parsed the registration response fine, but then 
died while processing that response.  The code probably shouldn't be trapping 
every Exception here:

https://github.com/apache/ambari/blob/trunk/ambari-agent/src/main/python/ambari_agent/Controller.py#L170
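
To illustrate what I mean about the broad exception handler, here is a rough 
sketch (not the actual Controller.py code). The function name register_once 
and its two callable parameters are made up for the example. The idea is to 
catch only connection-type errors around the request itself and let failures 
in response processing surface with their own traceback. It is written for 
Python 3; the agent in the log above runs on Python 2.6.

```python
import logging

logger = logging.getLogger(__name__)


def register_once(send_registration, process_response, url):
    """Hypothetical helper: separate connection errors from processing errors.

    send_registration and process_response are stand-ins for whatever the
    agent really calls; they are assumptions made for this sketch.
    """
    try:
        response = send_registration(url)
    except OSError:
        # Only genuine I/O failures are reported as connection problems.
        # (In Python 3, IOError is an alias of OSError.)
        logger.error("Unable to connect to: %s", url, exc_info=True)
        return None

    # Processing happens outside the try block, so a bug here is not
    # logged as "Unable to connect" and keeps its own traceback.
    process_response(response)
    return response
```

With this shape, a failure while handling the status commands would no longer 
be reported as a connection problem.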

I assume that this is a bug and we want to allow cluster names to be whatever 
the customer would like.
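
For reference, a minimal sketch of the kind of fix I have in mind, assuming 
the cluster name is appended as a single path segment. The helper name 
build_components_url and the base URL in the usage line are hypothetical, not 
taken from the agent code. This uses Python 3's urllib.parse.quote; on the 
Python 2.6 the agent runs under, the equivalent function is urllib.quote.

```python
from urllib.parse import quote


def build_components_url(components_url, cluster_name):
    """Hypothetical helper: append a percent-encoded cluster name to a base URL."""
    # safe='' also escapes '/', so the name always stays one path segment.
    return components_url + quote(cluster_name, safe='')


# Hypothetical base URL, for illustration only.
url = build_components_url("https://ambari.local:8441/agent/v1/components/",
                           "my cluster")
```

A name with spaces would then be sent as "my%20cluster" instead of breaking 
the request.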

I'll open a JIRA unless someone can disconfirm that this is a bug.

Greg
