Thanks for your response.

Another question: is there a way to delete expired silences via the API?
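
For context, this is roughly how I expire an active silence today (a
minimal sketch; the URL and silence ID are placeholders). As far as I can
tell, DELETE /api/v2/silence/{id} only sets endsAt to now, and
already-expired silences just stay until they age out of the
--data.retention window:

package main

import (
	"fmt"
	"net/http"
)

func main() {
	// DELETE /api/v2/silence/{id} expires the silence (sets endsAt to now).
	// Placeholder URL and silence ID for illustration only.
	url := "http://localhost:9093/api/v2/silence/<silence-id>"
	req, err := http.NewRequest(http.MethodDelete, url, nil)
	if err != nil {
		panic(err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}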

On Monday, 13 January 2025 at 18:16:36 UTC+2, Brian Candler wrote:

> > Looking at the Alertmanager UI, no silence was created, and I got a 
> resolved notification five minutes after the firing notification.
> ...
> > I wonder why the silence couldn't be created (it's not the first time 
> this has happened).
> > Maybe it's some kind of race condition? We can't silence alerts which 
> are not in a firing state, right?
>
> That's not true - you can certainly create silences which don't match any 
> active alerts.  This allows you, for example, to create silences before 
> maintenance starts, to suppress the alerts you expect.
>
> If the silences aren't being created (i.e. not visible in the GUI), then 
> you need to look deeper into the code which creates them, and perhaps 
> tcpdump the API to alertmanager to see if you're passing valid parameters.
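>
> For example, something like this rough sketch (the URL, matcher value, 
> and times are placeholders) posts a silence to the v2 API with a raw 
> JSON body, so you can compare it byte-for-byte with what your client 
> sends:
>
> package main
>
> import (
> 	"bytes"
> 	"fmt"
> 	"net/http"
> )
>
> func main() {
> 	// Placeholder values: adjust to your Alertmanager and alert name.
> 	body := []byte(`{
> 	  "matchers": [
> 	    {"name": "alertname", "value": "MyAlert", "isRegex": false}
> 	  ],
> 	  "startsAt": "2025-01-13T16:00:00Z",
> 	  "endsAt":   "2025-01-13T18:00:00Z",
> 	  "createdBy": "test",
> 	  "comment":   "test silence"
> 	}`)
> 	resp, err := http.Post("http://localhost:9093/api/v2/silences",
> 		"application/json", bytes.NewReader(body))
> 	if err != nil {
> 		panic(err)
> 	}
> 	defer resp.Body.Close()
> 	// A successful request returns 200 with {"silenceID": "..."}.
> 	fmt.Println(resp.Status)
> }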
>
> On Monday, 13 January 2025 at 15:22:54 UTC Saar Zur wrote:
>
>> Hi,
>>
>> I am using the amtool client in a Job inside my cluster.
>>
>> An alert fired and we got a notification in our Slack channel. I used 
>> the CLI (in code that runs inside a Docker image from the Job) to 
>> create a silence with an `alertname` matcher, and there was no failure.
>>
>> Looking at the Alertmanager UI, no silence was created, and I got a 
>> resolved notification five minutes after the firing notification.
>>
>> After ~10 minutes the alert fired and resolved again (five minutes 
>> apart).
>>
>> I wonder why the silence couldn't be created (it's not the first time 
>> this has happened).
>> Maybe it's some kind of race condition? We can't silence alerts which 
>> are not in a firing state, right? (Although the alert was in a firing 
>> state while I tried to create the silence.)
>>
>> The alert rule:
>> name: Orchestrator GRPC Failures for ExternalProcessor Service
>> expr: sum(increase(grpc_server_handled_total{grpc_code!~"OK|Canceled",grpc_service="envoy.service.ext_proc.v3.ExternalProcessor"}[5m])) > 0
>> for: 5m
>> labels:
>>   severity: WARNING
>> annotations:
>>   dashboard_url: p-R7Hw1Iz
>>   runbook_url: extension-orchestrator-dashboard
>>   summary: Failed gRPC calls detected in the Envoy External Processor 
>> within the last 5 minutes. <!subteam^S06E0CPPC5S>
>>
>> The code for creating the silence:
>>
>> func postSilence(amCli amclient.Client, matchers []*models.Matcher) error {
>> 	// Silence window: from silenceStart for silenceDuration.
>> 	startsAt := strfmt.DateTime(silenceStart)
>> 	endsAt := strfmt.DateTime(silenceStart.Add(silenceDuration))
>> 	createdBy := creatorType
>> 	comment := silenceComment
>>
>> 	silenceParams := silence.NewPostSilencesParams().WithSilence(
>> 		&models.PostableSilence{
>> 			Silence: models.Silence{
>> 				Matchers:  matchers,
>> 				StartsAt:  &startsAt,
>> 				EndsAt:    &endsAt,
>> 				CreatedBy: &createdBy,
>> 				Comment:   &comment,
>> 			},
>> 		},
>> 	)
>>
>> 	if err := amCli.PostSilence(silenceParams); err != nil {
>> 		return fmt.Errorf("failed on post silence: %w", err)
>> 	}
>> 	log.Print("Silence posted successfully")
>>
>> 	return nil
>> }
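>>
>> And roughly how the matchers are built (a simplified sketch; 
>> buildMatchers is not the exact function name in our code, but the 
>> fields and the alert name match what we pass):
>>
>> func buildMatchers() []*models.Matcher {
>> 	name := "alertname"
>> 	value := "Orchestrator GRPC Failures for ExternalProcessor Service"
>> 	isRegex := false
>> 	// models.Matcher takes pointers for Name, Value, and IsRegex.
>> 	return []*models.Matcher{{
>> 		Name:    &name,
>> 		Value:   &value,
>> 		IsRegex: &isRegex,
>> 	}}
>> }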
>>
>> Thanks in advance,
>> Saar Zur, SAP Labs
>>
>
