Re: [prometheus-users] Messages are dropping because too many are queued in AlertManager

2022-01-13 Thread shivakumar sajjan
Thanks, Matthias. Sure, I will troubleshoot each component and get back to you if there are any issues. Thanks, Shiva
On Fri, Jan 14, 2022 at 2:31 AM Matthias Rampke wrote:
> From these logs, it's not clear. Try increasing the log level
> (--log.level=debug) on Alertmanager and Prometheus.
>

Re: [prometheus-users] Messages are dropping because too many are queued in AlertManager

2022-01-13 Thread Matthias Rampke
From these logs, it's not clear. Try increasing the log level (--log.level=debug) on Alertmanager and Prometheus. We do not know enough about your setup and the receiving service to solve this for you. You will have to systematically troubleshoot every part of the chain. It seems that there are

Re: [prometheus-users] Messages are dropping because too many are queued in AlertManager

2022-01-06 Thread shivakumar sajjan
Hi Matthias, Thanks for responding to my questions. It is a service to which I added an API that Alertmanager posts alert information (firing/resolved) to whenever alerts are triggered. *The AlertManager pod logs contain the warnings below:* level=warn ts=2022-01-06T20:27:41.726Z caller=delegate.go:272

Re: [prometheus-users] Messages are dropping because too many are queued in AlertManager

2022-01-06 Thread Matthias Rampke
What is your webhook receiver? Are any of the resolve messages getting through? Are the requests succeeding? I think Alertmanager will retry failed webhooks, though I'm not sure for how long. This would keep them in the queue, leading to what you observe in Alertmanager. /MR On Thu, Jan 6, 2022, 07:14
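To check whether resolve messages are getting through and whether requests succeed, a small stand-in receiver can help. The following is only a sketch, not the poster's service: it assumes the JSON body carries an "alerts" list whose entries have a "status" field ("firing" or "resolved"), as in Alertmanager's webhook payload. The port 5001 and the names count_alert_statuses and WebhookHandler are hypothetical.

```python
import json
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer


def count_alert_statuses(body: bytes) -> Counter:
    """Count alerts per status in a webhook request body."""
    payload = json.loads(body)
    return Counter(
        alert.get("status", "unknown") for alert in payload.get("alerts", [])
    )


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            counts = count_alert_statuses(body)
        except (ValueError, AttributeError):
            # Body was not valid JSON or not shaped as assumed above.
            self.send_response(400)
            self.end_headers()
            return
        # Log what arrived, then answer 200 so the sender sees success.
        self.log_message("received alerts: %s", dict(counts))
        self.send_response(200)
        self.end_headers()


if __name__ == "__main__":
    # Hypothetical port; point the webhook URL at this address to test.
    HTTPServer(("0.0.0.0", 5001), WebhookHandler).serve_forever()
```

If this receiver logs both firing and resolved alerts while the real service does not, the problem is more likely in the receiving service than in Alertmanager.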

[prometheus-users] Messages are dropping because too many are queued in AlertManager

2022-01-05 Thread shivakumar sajjan
Hi, I have a single-instance cluster for AlertManager and I see the warning below in the AlertManager container: *level=warn ts=2021-11-03T08:50:44.528Z caller=delegate.go:272 component=cluster msg="dropping messages because too many are queued" current=4125 limit=4096* *Alert Manager Version
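
The warning is a key=value (logfmt-style) line, so the reporting component and the queue figures can be pulled out mechanically when scanning many log lines. A minimal sketch, assuming values with spaces are always double-quoted as in the line above; parse_logfmt is a hypothetical helper, not part of Alertmanager:

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Split a key=value log line into a dict, honoring double quotes."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that have no '='
            fields[key] = value
    return fields


# The warning line quoted in the message above.
line = (
    "level=warn ts=2021-11-03T08:50:44.528Z caller=delegate.go:272 "
    'component=cluster msg="dropping messages because too many are queued" '
    "current=4125 limit=4096"
)

fields = parse_logfmt(line)
overflow = int(fields["current"]) - int(fields["limit"])
print(fields["component"], overflow)  # prints "cluster 29"
```

Here the queue is 29 messages over its limit, and the component field shows the warning is logged by the cluster component.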