wu-sheng commented on issue #13492:
URL: https://github.com/apache/skywalking/issues/13492#issuecomment-3290687773

   > In our use case, the recovery detection mechanism relies on the existing 
silence-period (configured in minutes) and SkyWalking's inherent metric 
aggregation window (which typically operates on a minute-based cycle). After 
the silencePeriod ends, we wait for an additional metric collection cycle 
(typically one minute) to confirm no new alarms are triggered, before 
considering the alarm recovered.
   
   This is a possible approach, but it should not be the official one. A 
silence period can last quite a while, and there is no point in waiting for 
it to end.
   
   > If this approach is acceptable, I'll implement the corresponding code to 
enhance the alarm kernel with recovered status notification capability for 
alarm rules.
   
   There are two things to consider here. 
   1. The alarm kernel needs to stay aware of the triggered status. When the 
alarm recovers, this status should be reset to normal and a new message 
should be sent out. A new recovery API should be created for this.
   2. AlarmRecord is an immutable row. How would you persist the recovery 
status and associate it with the alarm that was triggered? We were hesitant 
about how to implement this.
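
   To make point 1 concrete, here is a minimal sketch of per-alarm status 
tracking. It is only an illustration under assumptions: `RecoveryTracker`, 
`Notification`, and `onCheck` are hypothetical names, not existing SkyWalking 
classes or APIs, and the sketch ignores the silence period and the storage 
question from point 2.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names exist in the SkyWalking code base. */
class RecoveryTracker {
    enum State { NORMAL, FIRING }

    /** Hypothetical message payload: recovered == false means "triggered". */
    static final class Notification {
        final String alarmKey;
        final boolean recovered;

        Notification(String alarmKey, boolean recovered) {
            this.alarmKey = alarmKey;
            this.recovered = recovered;
        }
    }

    // Last known status per alarm key (for example, rule name plus entity id).
    private final Map<String, State> states = new HashMap<>();

    /**
     * Called once per evaluation cycle with the result of the rule check.
     * Returns a message only when the status changes.
     */
    List<Notification> onCheck(String alarmKey, boolean conditionMatched) {
        State previous = states.getOrDefault(alarmKey, State.NORMAL);
        List<Notification> out = new ArrayList<>();
        if (conditionMatched && previous == State.NORMAL) {
            // NORMAL -> FIRING: remember the triggered status and notify.
            states.put(alarmKey, State.FIRING);
            out.add(new Notification(alarmKey, false));
        } else if (!conditionMatched && previous == State.FIRING) {
            // FIRING -> NORMAL: reset the status and send the recovery message.
            states.put(alarmKey, State.NORMAL);
            out.add(new Notification(alarmKey, true));
        }
        return out;
    }
}
```

   In this sketch a message is produced only on a status change, so an alarm 
that keeps firing, or a rule that stays healthy, yields nothing on later 
cycles.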


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
