[ https://issues.apache.org/jira/browse/AMBARI-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitry Sen updated AMBARI-2928:
-------------------------------
Attachment: AMBARI-2928.patch
> Add a Nagios alert to check state of NN HA
> ------------------------------------------
>
> Key: AMBARI-2928
> URL: https://issues.apache.org/jira/browse/AMBARI-2928
> Project: Ambari
> Issue Type: Improvement
> Components: agent
> Affects Versions: 1.4.0
> Reporter: Dmitry Sen
> Assignee: Dmitry Sen
> Fix For: 1.4.0
>
> Attachments: AMBARI-2928.patch
>
>
> Add Nagios alert
> Title: "NameNode HA Healthy"
> Check that one NN has tag.HAState = active and the other NN has
> tag.HAState = standby.
> Scenarios:
> 1. Active + Standby NN are up
> OK: NameNode HA healthy true; Active<dev01.hortonworks.com>,
> Standby<dev02.hortonworks.com>, Unavailable<>
> 2. Two Standby NNs are up
> CRITICAL: No Active NN available; Active<>,
> Standby<dev01.hortonworks.com:dev02.hortonworks.com>, Unavailable<>
> 3. Two Active NNs are up
> CRITICAL: No Active NN available; No failover NN available;
> Active<dev01.hortonworks.com:dev02.hortonworks.com>, Standby<>, Unavailable<>
> 4. Both NNs unavailable
> CRITICAL: No Active NN available; No failover NN available: Active<>,
> Standby<>, Unavailable<dev01.hortonworks.com:dev02.hortonworks.com>
> 5. Only one NameNode in cluster (no additional/standby NameNode configured)
> CRITICAL: No failover NN available: Active<dev01.hortonworks.com>, Standby<>,
> Unavailable<>
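
The five scenarios above can be sketched as a small classification function. This is only an illustration, not the code in AMBARI-2928.patch: the function name `namenode_ha_alert` and the input shape (a dict of hostname to HA state, with None for an unreachable NN) are hypothetical, and how tag.HAState is actually fetched from each NN is left out. The sketch assumes "No Active NN available" is reported whenever the number of active NNs is not exactly one, which matches scenarios 2, 3 and 4, and it uses "; " before the host summary in every message, whereas scenarios 4 and 5 above show ":".

```python
# Standard Nagios plugin exit codes.
OK, CRITICAL = 0, 2


def namenode_ha_alert(ha_states):
    """Classify NameNode HA health.

    ha_states is a hypothetical input: a dict mapping each NN hostname to
    'active', 'standby', or None when the NN could not be reached.
    Returns (nagios_exit_code, message).
    """
    active = sorted(h for h, s in ha_states.items() if s == "active")
    standby = sorted(h for h, s in ha_states.items() if s == "standby")
    unavailable = sorted(
        h for h, s in ha_states.items() if s not in ("active", "standby")
    )

    problems = []
    # Assumption: anything other than exactly one active NN is reported as
    # "No Active NN available" (covers the two-active scenario as well).
    if len(active) != 1:
        problems.append("No Active NN available")
    if not standby:
        problems.append("No failover NN available")

    hosts = "Active<%s>, Standby<%s>, Unavailable<%s>" % (
        ":".join(active),
        ":".join(standby),
        ":".join(unavailable),
    )
    if problems:
        return CRITICAL, "CRITICAL: %s; %s" % ("; ".join(problems), hosts)
    return OK, "OK: NameNode HA healthy true; " + hosts


if __name__ == "__main__":
    code, message = namenode_ha_alert(
        {"dev01.hortonworks.com": "active", "dev02.hortonworks.com": "standby"}
    )
    print(message)
    raise SystemExit(code)
```

A real Nagios check would print the message and exit with the returned code, as the `__main__` block does for scenario 1.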