Hi, I am using the amtool client in a Job inside my cluster.
An alert fired and we got a notification in our Slack channel. I used the CLI (in code that runs inside a Docker image from the Job) to create a silence with an `alertname` matcher, and the call returned no error. However, the Alertmanager UI showed that no silence had been created, and I got a resolved notification 5 minutes after the firing notification. About 10 minutes later the alert fired and resolved again (again 5 minutes apart).

Why wasn't the silence created? (This is not the first time it has happened.) Could it be some kind of race condition? We can't silence alerts that are not in the firing state, right? (Although the alert was firing when I tried to create the silence.)

The alert rule:

    name: Orchestrator GRPC Failures for ExternalProcessor Service
    expr: sum(increase(grpc_server_handled_total{grpc_code!~"OK|Canceled",grpc_service="envoy.service.ext_proc.v3.ExternalProcessor"}[5m])) > 0
    for: 5m
    labels:
      severity: WARNING
    annotations:
      dashboard_url: p-R7Hw1Iz
      runbook_url: extension-orchestrator-dashboard
      summary: Failed gRPC calls detected in the Envoy External Processor within the last 5 minutes.
The code for creating the silence:

    func postSilence(amCli amclient.Client, matchers []*models.Matcher) error {
        startsAt := strfmt.DateTime(silenceStart)
        endsAt := strfmt.DateTime(silenceStart.Add(silenceDuration))
        createdBy := creatorType
        comment := silenceComment

        silenceParams := silence.NewPostSilencesParams().WithSilence(
            &models.PostableSilence{
                Silence: models.Silence{
                    Matchers:  matchers,
                    StartsAt:  &startsAt,
                    EndsAt:    &endsAt,
                    CreatedBy: &createdBy,
                    Comment:   &comment,
                },
            },
        )

        err := amCli.PostSilence(silenceParams)
        if err != nil {
            return fmt.Errorf("failed on post silence: %w", err)
        }

        log.Print("Silence posted successfully")
        return nil
    }

Thanks in advance,
Saar Zur
SAP Labs

