Zack Marsh created AMBARI-12148:
-----------------------------------
Summary: Falcon server intermittently fails to start
Key: AMBARI-12148
URL: https://issues.apache.org/jira/browse/AMBARI-12148
Project: Ambari
Issue Type: Bug
Environment: ambari-2.1.0-1249, hdp-2.3.0.0-2469, sles11sp3
Reporter: Zack Marsh
Priority: Critical
The Falcon server is intermittently failing to start when starting all Hadoop
services.
The Ambari ops log shows Falcon failing to start with the following output:
{code}
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/FALCON/0.5.0.2.1/package/scripts/falcon_server.py", line 164, in <module>
    FalconServer().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 216, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/FALCON/0.5.0.2.1/package/scripts/falcon_server.py", line 46, in start
    self.configure(env)
  File "/var/lib/ambari-agent/cache/common-services/FALCON/0.5.0.2.1/package/scripts/falcon_server.py", line 41, in configure
    falcon('server', action='config')
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/FALCON/0.5.0.2.1/package/scripts/falcon.py", line 141, in falcon
    source = params.local_data_mirroring_dir)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 390, in action_create_on_execute
    self.action_delayed("create")
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 387, in action_delayed
    self.get_hdfs_resource_executor().action_delayed(action_name, self)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 246, in action_delayed
    self._create_resource()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 263, in _create_resource
    self._copy_from_local_directory(self.main_resource.resource.target, self.main_resource.resource.source)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 271, in _copy_from_local_directory
    self._create_directory(new_target)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 280, in _create_directory
    self.util.run_command(target, 'MKDIRS', method='PUT')
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 201, in run_command
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'curl -sS -L -w '%{http_code}' -X PUT 'http://zeus1.labs.teradata.com:50070/webhdfs/v1/apps/data-mirroring/workflows?op=MKDIRS&user.name=hdfs'' returned status_code=403.
{
  "RemoteException": {
    "exception": "RetriableException",
    "javaClassName": "org.apache.hadoop.ipc.RetriableException",
    "message": "org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /apps/data-mirroring/workflows. Name node is in safe mode.\nThe reported blocks 0 needs additional 392 blocks to reach the threshold 0.9990 of total blocks 392.\nThe number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached."
  }
}
{code}
This seems to be a race condition in which the Falcon Server attempts to
start before the HDFS services have finished starting, that is, while the
NameNode is still in safe mode.
The same error is also intermittently occurring when the HDFS Service Check is
being executed in the "Start and Test All Services" step of the Ambari Enable
Kerberos Wizard.
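
One possible way to avoid this race is to poll the NameNode until it has left
safe mode before issuing any HDFS writes. The following is a minimal standalone
sketch of that idea, not the Ambari code and not a proposed patch. The function
names namenode_in_safe_mode and wait_for_safe_mode_exit, and the timeout and
interval defaults, are hypothetical. It assumes the hdfs CLI is on the PATH and
that "hdfs dfsadmin -safemode get" prints "Safe mode is ON" while safe mode is
active.

```python
import subprocess
import time


def namenode_in_safe_mode():
    """Return True if 'hdfs dfsadmin -safemode get' reports safe mode as ON.

    Assumption: the hdfs CLI is installed and the calling user may run it.
    """
    proc = subprocess.Popen(
        ["hdfs", "dfsadmin", "-safemode", "get"],
        stdout=subprocess.PIPE,
    )
    out, _ = proc.communicate()
    return b"Safe mode is ON" in out


def wait_for_safe_mode_exit(check=namenode_in_safe_mode, timeout=300,
                            interval=10, sleep=time.sleep):
    """Poll until the NameNode leaves safe mode or the timeout elapses.

    check    -- callable returning True while safe mode is still on
    timeout  -- maximum number of seconds to wait (hypothetical default)
    interval -- seconds between polls (hypothetical default)
    sleep    -- injectable so the loop can be exercised without real delays

    Returns True once safe mode is off, False if the timeout was reached.
    """
    waited = 0
    while check():
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
    return True
```

A caller would invoke wait_for_safe_mode_exit() before the MKDIRS request and
fail with a clear message if it returns False. The hdfs CLI also provides
"hdfs dfsadmin -safemode wait", which blocks until safe mode is off, but it
has no timeout of its own.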