I must admit that I am a beginner with the Prometheus Operator.
I have deployed the "*kube-prometheus-stack-14.5.0*" Helm chart on a
Kubernetes cluster.
The chart also deploys Alertmanager ("*alertmanager:v0.21.0*").
https://github.com/prometheus-community/helm-charts/releases/tag/kube-prometheus-stack-14.5.0
It all works fine, except that Alertmanager sends far too many notifications,
seemingly one for every activity on the cluster.
Feel free to correct me if I am wrong:
After debugging, I realized that a large number of rules are configured in the
Prometheus rule files, so Alertmanager sends a notification for each rule
that fires.
*The rules are configured in the Prometheus deployment under:*
/etc/prometheus/rules/prometheus-prometheus-kube-prometheus-prometheus-rulefiles-0
$ ls -la
monitoring-prometheus-kube-prometheus-kubelet.rules.yaml -> ..data
monitoring-prometheus-kube-prometheus-node.rules.yaml -> ..data
etc.......
*Is it possible to override these rules?*
*Is it possible to delete these default rules?*
E-mail is configured in Alertmanager, and I now get more than 1000 emails
every day.
This is definitely over-alerting. I want to send notifications only for
specific events on the cluster,
such as Pods/applications in a Pending or CrashLoopBackOff state, or
HostHighCpuLoad, HostOutOfDiskSpace, etc.
Put simply, I want to send notifications to 3 different
people (Developer, Tester, and Team Lead).
Also,
*Is it possible to write my own custom rule file?*
If yes, where can I configure the rules in the values.yaml file?
*Or how can I deploy my own rule files together with the Helm chart?*
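To illustrate the kind of thing I am hoping for, here is a rough sketch of a values.yaml fragment. It assumes the chart exposes `defaultRules` and `additionalPrometheusRulesMap` the way I understand them from its values.yaml; I have not verified this against 14.5.0. The map key `custom-rules`, the group name, the alert name, the `for` duration, and the severity label are placeholders of my own, and the two disabled groups are only examples.

```yaml
# Sketch only: key names are assumed from the chart's values.yaml.
defaultRules:
  create: true            # false would drop all bundled rules
  rules:
    etcd: false           # example: switch off one bundled rule group
    kubeScheduler: false  # example: switch off another

additionalPrometheusRulesMap:
  custom-rules:           # hypothetical name
    groups:
    - name: custom.rules  # hypothetical group name
      rules:
      - alert: PodCrashLooping   # hypothetical alert name
        # kube-state-metrics series; 1 while the container waits in CrashLoopBackOff
        expr: kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"} == 1
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: Pod is in CrashLoopBackOff
```

If this is roughly right, I would expect the chart to render the map into a PrometheusRule object that the operator then loads, but please correct me if the mechanism is different.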
*Alertmanager config:*
alertmanager:
  enabled: true
  config:
    global:
      resolve_timeout: 5m
      smtp_smarthost: 'smtp.devops.logix.net:25'
      smtp_from: '[email protected]'
      smtp_require_tls: false
    route:
      group_by: ['job']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: 'null'
      routes:
      - match:
          namespace: fbk-r4-dev
        receiver: kkot
      - match:
          namespace: fbk-dev
        receiver: kkot
    receivers:
    - name: 'kkot'
      email_configs:
      - to: '[email protected]'
        require_tls: false
    templates:
    - '/etc/alertmanager/config/*.tmpl'
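
What I have in mind for the routing is roughly the following, which would sit under `alertmanager.config`. This is only a sketch: the receiver name `team-email` and the three addresses are placeholders, and it assumes that rules with these alert names actually exist in Prometheus. Everything that does not match falls through to an empty 'null' receiver, so no mail is sent for it.

```yaml
# Sketch only: receiver name, alert names, and addresses are placeholders.
route:
  group_by: ['alertname', 'namespace']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  receiver: 'null'          # default: alerts not matched below produce no mail
  routes:
  - match_re:
      # Alertmanager anchors the regex, so only these exact names match
      alertname: PodCrashLooping|HostHighCpuLoad|HostOutOfDiskSpace
    receiver: team-email
receivers:
- name: 'null'              # receiver with no integrations
- name: team-email
  email_configs:
  - to: 'developer@example.com'
    require_tls: false
  - to: 'tester@example.com'
    require_tls: false
  - to: 'teamlead@example.com'
    require_tls: false
```

I used `match_re` rather than the newer `matchers` syntax because, as far as I know, `matchers` is not available in v0.21.0.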
I have tried to describe the issue as fully as I can.
I am looking for help, since I have been struggling with this for a couple of
days.
Thanks in advance.