Ming Ma created HADOOP-11000:
--------------------------------

             Summary: HAServiceProtocol's health state is incorrectly transitioned to SERVICE_NOT_RESPONDING
                 Key: HADOOP-11000
                 URL: https://issues.apache.org/jira/browse/HADOOP-11000
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Ming Ma


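When HAServiceProtocol.monitorHealth throws a HealthCheckFailedException on the 
server side, the exception the client actually receives from the protocol buffer 
RPC layer is a RemoteException that wraps the real exception. The 
catch (HealthCheckFailedException e) block in HealthMonitor therefore never 
matches, the RemoteException falls through to catch (Throwable t), and the state 
is incorrectly transitioned to SERVICE_NOT_RESPONDING instead of 
SERVICE_UNHEALTHY.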

{noformat}
HealthMonitor.java
doHealthChecks

      try {
        status = proxy.getServiceStatus();
        proxy.monitorHealth();
        healthy = true;
      } catch (HealthCheckFailedException e) {
        .....
        enterState(State.SERVICE_UNHEALTHY);
      } catch (Throwable t) {
        .....
        enterState(State.SERVICE_NOT_RESPONDING);
        .....
      }

{noformat}
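
One possible way to handle this, shown only as a minimal self-contained sketch 
and not as the actual patch: inspect the wrapper before falling back to 
SERVICE_NOT_RESPONDING. Every class below is a simplified stand-in written for 
this illustration (HealthCheckSketch, HealthCheck, classify and the stand-in 
RemoteException with its getClassName() accessor are assumptions, not Hadoop 
code), so it compiles without Hadoop on the classpath.

```java
import java.io.IOException;

// Illustration only: all types here are hypothetical stand-ins, not Hadoop classes.
class HealthCheckSketch {

    enum State { SERVICE_HEALTHY, SERVICE_UNHEALTHY, SERVICE_NOT_RESPONDING }

    // Stand-in for the exception the monitored service throws.
    static class HealthCheckFailedException extends IOException {
        HealthCheckFailedException(String msg) { super(msg); }
    }

    // Stand-in for the RPC-layer wrapper: it carries the class name of the
    // server-side exception as a string instead of being an instance of it.
    static class RemoteException extends IOException {
        private final String className;
        RemoteException(String className, String msg) {
            super(msg);
            this.className = className;
        }
        String getClassName() { return className; }
    }

    // Stand-in for the health-check part of the RPC proxy.
    interface HealthCheck {
        void monitorHealth() throws IOException;
    }

    // Maps the outcome of one health check to a state, looking inside the
    // wrapper so that a wrapped health-check failure counts as "unhealthy".
    static State classify(HealthCheck proxy) {
        try {
            proxy.monitorHealth();
            return State.SERVICE_HEALTHY;
        } catch (HealthCheckFailedException e) {
            // Direct (unwrapped) failure.
            return State.SERVICE_UNHEALTHY;
        } catch (RemoteException e) {
            // Wrapped failure: compare the recorded class name.
            if (HealthCheckFailedException.class.getName().equals(e.getClassName())) {
                return State.SERVICE_UNHEALTHY;
            }
            return State.SERVICE_NOT_RESPONDING;
        } catch (Throwable t) {
            // Anything else still means the service did not respond properly.
            return State.SERVICE_NOT_RESPONDING;
        }
    }
}
```

With this shape, a RemoteException that wraps a HealthCheckFailedException maps 
to SERVICE_UNHEALTHY, while other remote or local failures still map to 
SERVICE_NOT_RESPONDING.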



--
This message was sent by Atlassian JIRA
(v6.2#6252)
