vitthal (Suhas) Gogate created AMBARI-517:
---------------------------------------------
Summary: Dashboard shows HDFS is down though it's still running
Key: AMBARI-517
URL: https://issues.apache.org/jira/browse/AMBARI-517
Project: Ambari
Issue Type: Bug
Affects Versions: ambari-186
Reporter: vitthal (Suhas) Gogate
Attachments: AMBARI-517.patch
This defect occurs occasionally when the JMX-over-HTTP call to the respective
service master times out because of load on the service master or a network
problem. For example, the httpd/error_log file contains this error message:
error_log:Mon Jun 04 12:56:07 2012 error client 24.7.53.89
hdp_mon_jmx_helpers.inc:522hdp_mon_jmx_get_jmx_data: Error when accessing jmx
info: , url=http://ec2-72-44-58-186.compute-1.amazonaws.com:60010/jmx,
errno=28, error=Operation timed out after 1 seconds with 0 bytes received,
referer:
http://ec2-72-44-58-186.compute-1.amazonaws.com/hdp/dashboard/ui/home.html
The solution is to increase the timeout (currently 1 second) to 2 or 3
seconds. The drawback is that when one of the services (HDFS, MR, HBASE) is
down, the backend call for that service will always time out, so the page will
take that long to load. Because it is uncommon for an installed service to stay
down for a long time, I recommend increasing the timeout to 3 seconds to reduce
the chance of this problem occurring on slower nodes and networks.
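
The trade-off described above can be illustrated with a minimal sketch. This is
not the code in hdp_mon_jmx_helpers.inc or in the attached patch; it is a
Python stand-in written for illustration only. The names fetch_jmx,
worst_case_extra_delay and JMX_TIMEOUT_SECS are hypothetical, the /jmx response
is assumed to be JSON, and the delay estimate assumes the services are polled
one after another, which the issue does not state.

```python
import json
import socket
import urllib.request

# Hypothetical setting: the issue proposes raising the timeout from 1 s to 3 s.
JMX_TIMEOUT_SECS = 3.0


def fetch_jmx(url, timeout_secs=JMX_TIMEOUT_SECS, opener=urllib.request.urlopen):
    """Return the parsed JMX response, or None if the call fails or times out.

    The opener parameter exists only so the call can be replaced in tests.
    """
    try:
        with opener(url, timeout=timeout_secs) as resp:
            # Assumption: the /jmx endpoint returns a JSON document.
            return json.loads(resp.read().decode("utf-8"))
    except (socket.timeout, OSError, ValueError):
        # A timeout, a connection failure, or an unparsable body all mean the
        # dashboard has no data for this service. A caller that treats None as
        # "service down" reproduces the false alarm when the timeout is too short.
        return None


def worst_case_extra_delay(down_services, timeout_secs=JMX_TIMEOUT_SECS):
    """Upper bound on added page-load time, assuming sequential polling."""
    return down_services * timeout_secs
```

Under these assumptions, one service that is really down adds up to 3 seconds
to the page load with the proposed timeout, compared with 1 second with the
current one. That is the cost accepted in exchange for fewer false "down"
reports on slow nodes and networks.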