Hurley might be able to shed more light on the finer details.
>
> BR,
> Sid
>
>
> --
> *From:* Jonathan Hurley
> *Sent:* Friday, October 28, 2016 1:44 PM
> *To:* Ganesh Viswanathan
> *Cc:* user@ambari.apache.org
> *Subject:* Re: Ambari Metrics Collector Process alert - CRITICAL threshold rule
In your version of Ambari, the alert will trigger right away. In Ambari 2.4, we
have the notion of "soft" and "hard" alerts. You can configure it so that it
doesn't trigger alert notifications until n number of CRITICAL alerts have been
received in a row.
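
As a rough illustration of the soft/hard idea, here is a minimal sketch. It is not Ambari's actual code: the function name `should_notify`, the state strings, and the choice to notify only once, at the moment the streak first reaches n, are all assumptions made for the example.

```python
def should_notify(states, n):
    """For each check result, decide whether a notification would go out.

    Hypothetical model: a CRITICAL result stays "soft" (no notification)
    until n CRITICAL results have been seen in a row, at which point it
    becomes "hard" and a single notification is sent. Any non-CRITICAL
    result resets the streak.
    """
    decisions = []
    streak = 0
    for state in states:
        # Extend the streak on CRITICAL, reset it on anything else.
        streak = streak + 1 if state == "CRITICAL" else 0
        # Notify only on the result that makes the alert "hard".
        decisions.append(streak == n)
    return decisions


results = ["CRITICAL", "OK", "CRITICAL", "CRITICAL", "CRITICAL", "CRITICAL"]
print(should_notify(results, 3))
```

With n = 1 the sketch notifies on the first CRITICAL result, which corresponds to the "trigger right away" behavior described above.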
On Oct 28, 2016, at 4:07 PM, Ganesh Vis
> *From:* Ganesh Viswanathan
> *Sent:* Friday, October 28, 2016 1:07 PM
> *To:* Jonathan Hurley
> *Cc:* user@ambari.apache.org
> *Subject:* Re: Ambari Metrics Collector Process alert - CRITICAL
> threshold rule
>
Thanks Jonathan, that explains some of the behavior I'm seeing.
Two additional questions:
1) How do I make sure the Ambari "Metrics Collector Process" does not
alert immediately when the process is down? I am using Ambari 2.2.1.0 and
it has a bug [1] which can trigger restarts of the process. The
on AMS. Other alerts
either are port/pid or jmx based.
- Sid
It sounds like you're asking two different questions here. Let me see if I can
address them:
Most "CRITICAL" thresholds do contain different text than their OK/WARNING
counterparts. This is because there is different information which needs to be
conveyed when an alert has gone CRITICAL. In the
Hello,
The Ambari "Metrics Collector Process" Alert has a different definition for
CRITICAL threshold vs. OK and WARNING thresholds. What is the reason for
this?
In my tests, CRITICAL seems like a "point-in-time" alert and the value of
that field is not being used. When the metrics collector proce