vitthal (Suhas) Gogate created AMBARI-517:
---------------------------------------------

             Summary: Dashboard shows HDFS is down though it's still running
                 Key: AMBARI-517
                 URL: https://issues.apache.org/jira/browse/AMBARI-517
             Project: Ambari
          Issue Type: Bug
    Affects Versions: ambari-186
            Reporter: vitthal (Suhas) Gogate
         Attachments: AMBARI-517.patch

This defect occurs occasionally when the JMX-over-HTTP call to the respective 
service master times out because of load on the service master or a network 
problem. For example, the httpd/error_log file contains this error message:

error_log:Mon Jun 04 12:56:07 2012 error client 24.7.53.89 
hdp_mon_jmx_helpers.inc:522hdp_mon_jmx_get_jmx_data: Error when accessing jmx 
info: , url=http://ec2-72-44-58-186.compute-1.amazonaws.com:60010/jmx, 
errno=28, error=Operation timed out after 1 seconds with 0 bytes received, 
referer: 
http://ec2-72-44-58-186.compute-1.amazonaws.com/hdp/dashboard/ui/home.html

The solution is to increase the timeout (currently 1 second) to 2 or 3 
seconds. The drawback of a longer timeout is that when one of the services 
(HDFS, MR, HBASE) is down, the backend call for that service will always time 
out, so the page will take that much longer to load. Because it is uncommon 
for an installed service to be down for a long time, I recommend increasing 
the timeout to 3 seconds to reduce the chance of this problem occurring on 
slower nodes and networks. 
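
The trade-off described above can be sketched roughly as follows. This is an 
illustration in Python, not the code in hdp_mon_jmx_helpers.inc or the 
attached patch. The names fetch_jmx, worst_case_delay and JMX_TIMEOUT_SECONDS 
are hypothetical, and the sketch assumes that the dashboard polls the service 
masters one after another and that the /jmx endpoint returns JSON.

```python
import json
import urllib.request

# Value proposed in this issue; the current value is 1 second.
JMX_TIMEOUT_SECONDS = 3


def fetch_jmx(url, timeout=JMX_TIMEOUT_SECONDS):
    """Return the parsed JMX JSON from url, or None on failure or timeout.

    Hypothetical helper: a slow but healthy master gets up to `timeout`
    seconds to answer before the call gives up.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response)
    except (OSError, ValueError):
        # OSError covers URLError, connection failures and socket timeouts;
        # ValueError covers malformed URLs and invalid JSON.
        return None


def worst_case_delay(timeout, services_down):
    """Extra page-load seconds, assuming down services are polled in turn."""
    return timeout * services_down


if __name__ == "__main__":
    # One service down: the page waits for one full timeout.
    print(worst_case_delay(JMX_TIMEOUT_SECONDS, 1))
    # All three services (HDFS, MR, HBASE) down.
    print(worst_case_delay(JMX_TIMEOUT_SECONDS, 3))
```

Under these assumptions, raising the timeout from 1 to 3 seconds adds 2 
seconds of page-load time for each service that is down, in exchange for 
fewer false "down" reports from masters that are only slow to respond.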



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
