youjie23 commented on code in PR #13581:
URL: https://github.com/apache/skywalking/pull/13581#discussion_r2539003357


##########
docs/en/setup/backend/backend-alarm.md:
##########
@@ -40,7 +40,8 @@ The metrics names in the expression could be found in the 
[list of all potential
 - **Silence period**. After the alarm is triggered at Time-N (TN), there will 
be silence during the **TN -> TN + period**.
 By default, it works in the same manner as **period**. The same Alarm (having 
the same ID in the same metrics name) may only be triggered once within a 
period. 
 - **Recovery observation period**. Defines the number of consecutive periods 
that the alarm condition must remain false before the alarm is considered 
recovered. When the alarm condition becomes false, the system enters an 
observation period. If the condition remains false for the specified number of 
periods, a recovery notification is sent. If the condition becomes true again 
during the observation period, the alarm returns to the FIRING state. 
-The default value is 0, which means immediate recovery notification when the 
condition becomes false.
+The default value is 0, which means immediate recovery notification when the 
condition becomes false. 
+**Notice:** because the alarm will not be triggered again during the silence 
period, recovery won't be triggered during the silence period after an alarm is 
fired. It will be in the OBSERVING_RECOVERY state, the recovery will be 
triggered only after the silence period is over and the condition remains false 
for the specified observation periods.

Review Comment:
   Sorry for the delay. @wu-sheng 
   
   > Because during the silence period, the alarm will not trigger again.
   
   Thank you for your review and patience. @wankai123 
   
   Our point is that the alarm should have fired and been sent to the webhooks 
before it transitions to the silenced-firing state.
   In the previous code, the transition from silenced-firing to recovered would 
only occur when `recovery-observation-period` is set to `0`. This means an 
immediate recovery notification is sent when the condition becomes false, as 
shown in the following code snippet:
   ```
   public void onMismatch() {
       // other code
       recoveryObservationCountdown--;
       silenceCountdown--;
       switch (currentState) {
           case FIRING:
           case SILENCED:
               // This condition would only be met if the
               // recovery-observation-period is set to 0.
               if (this.recoveryObservationCountdown < 0) {
                   transitionTo(State.RECOVERED);
               } else {
                   transitionTo(State.OBSERVING_RECOVERY);
               }
               break;
           // other code
       }
   }
   ```
   If we remove this condition, the recovery notification will arrive one 
minute later than expected.
   **I want to confirm if we are aligned on this point**: Should the silence 
period indeed have an effect on the timing of the recovery notification?
   I am not entirely sure we have the same understanding here. Do we?
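   
   To make the one-minute delay concrete, here is a minimal sketch, assuming 
one `onMismatch()` call per period and a simplified three-state model. The 
names `RecoverySketch`, `periodsUntilRecovered` and `immediateCheck` are 
hypothetical and exist only for illustration, and the handling of 
`OBSERVING_RECOVERY` is an assumption; this is not the actual SkyWalking code.

   ```java
   // Hypothetical, simplified model of the recovery countdown (illustration only).
   class RecoverySketch {
       enum State { FIRING, OBSERVING_RECOVERY, RECOVERED }

       // Returns how many mismatching periods pass before RECOVERED is reached.
       // immediateCheck = true  keeps the "countdown < 0" check in the FIRING branch.
       // immediateCheck = false models removing that check.
       static int periodsUntilRecovered(int recoveryObservationPeriod, boolean immediateCheck) {
           State state = State.FIRING;
           int countdown = recoveryObservationPeriod;
           int periods = 0;
           while (state != State.RECOVERED) {
               periods++;      // one onMismatch() call per period (assumption)
               countdown--;
               switch (state) {
                   case FIRING:
                       if (immediateCheck && countdown < 0) {
                           state = State.RECOVERED;
                       } else {
                           state = State.OBSERVING_RECOVERY;
                       }
                       break;
                   case OBSERVING_RECOVERY:
                       // Assumed behaviour: recover once the countdown is exhausted.
                       if (countdown < 0) {
                           state = State.RECOVERED;
                       }
                       break;
                   default:
                       break;
               }
           }
           return periods;
       }
   }
   ```

   In this model, with the observation period set to `0`, keeping the check 
reaches `RECOVERED` on the first mismatching period, while removing it takes 
two periods. With a non-zero observation period both variants behave the same.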
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
