[prometheus-users] Re: Prometheus same host multiple endpoints
Can you give a specific example of "same metrics are published on two different endpoints"? You might mean:

- two different metric names
- the same metric name, but different labels

And it might be that you're scraping the same target twice, or you're scraping one target but that target is (for some reason) returning duplicates in the scrape results. Or you might have a more complex scenario, e.g. multiple Prometheus servers scraping for redundancy, and then you're combining the results together somehow.

> Is it possible to pick one endpoint and discard the other while writing a PromQL query?

Sure. Just filter in the PromQL query. For example, if you have

    foo{aaa="bbb",ccc="ddd"} 123.0
    foo{aaa="bbb",ccc="fff"} 123.0

and you consider the one with ccc="fff" to be a "duplicate" metric, then foo{ccc!="fff"} might be what you want.

Otherwise, you can avoid ingesting the duplicate metrics:

- by not scraping the second set in the first place
- if they all come from the same scrape, by using metric_relabel_configs to drop the metrics that you don't want to keep

On Monday, 23 January 2023 at 14:40:40 UTC kishore...@gmail.com wrote:
> Hi,
> We have a situation where same metrics are published on two different endpoints. Is it possible to pick one endpoint and discard the other while writing a PromQL query?
> Is it possible to configure Prometheus to collect metrics from only one endpoint?
>
> / Kishore

-- You received this message because you are subscribed to the Google Groups "Prometheus Users" group. To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/c777e2ab-0089-4fa8-8fcd-efe49b95e2een%40googlegroups.com.
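For the metric_relabel_configs route mentioned above, a minimal sketch might look like this (the job name and target are hypothetical; the metric name foo and the ccc="fff" label come from the example in the reply and would need adapting to your real series):

```yaml
scrape_configs:
  - job_name: 'example'              # hypothetical job name
    static_configs:
      - targets: ['myhost:9100']     # hypothetical target
    metric_relabel_configs:
      # Drop the series you consider duplicates before they are ingested.
      - source_labels: [__name__, ccc]
        regex: 'foo;fff'
        action: drop
```

metric_relabel_configs runs after the scrape but before ingestion, so the dropped series never consume storage or count towards active series.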
[prometheus-users] Re: Can 2 prometheus agent scraping same targets have difference in prometheus_agent_active_series metric
That looks fine to me. The "head block" contains all timeseries which were active in the last two hours; if you have some series churn (i.e. new series being created, old ones stopping) then the head block will grow as new series are added, but you'll only see the old ones drop out when it makes a new block. You can clearly see those dips in the green graph; I'd expect the same in the yellow one at 12:45 and 16:45, but they're just out of frame.

Since the two Prometheus instances don't change head block at the same time, the number of timeseries in each is different:

- green swapped at about 14:08, then starts to grow
- yellow swapped at about 14:45, dropping to the level where green was at 14:08, and then starts to grow from there

On Monday, 23 January 2023 at 14:41:24 UTC vikas@gmail.com wrote:
> We have 2 replicas of prometheus agent set up in a kubernetes cluster.
>
> We noticed that these prometheus agents are not scraping the same number of metrics. We verified this using the prometheus_agent_active_series metric.
>
> However, when checked, the prometheus agent targets are exactly the same for both agents. The difference in prometheus_agent_active_series is consistent (attached is the prometheus_agent_active_series graph).
>
> [image: Screenshot 2023-01-18 at 7.05.59 PM.png]
>
> We were expecting prometheus_agent_active_series to match for both agents; not sure if prometheus_agent_active_series is unreliable or something else.
>
> Please let me know what else I can check to drill this down.
Re: [prometheus-users] Prometheus same host multiple endpoints
On 2023-01-23 05:52, Kishore Chopne wrote:
> Hi, We have a situation where same metrics are published on two different endpoints. Is it possible to pick one endpoint and discard the other while writing a PromQL query? Is it possible to configure Prometheus to collect metrics from only one endpoint?

If they are literally the same metrics available on multiple endpoints then I'd suggest only scraping one. That's controlled via your scrape configuration. Depending on which mechanism you are using to manage that, it could mean changes to prometheus.yaml, removing an endpoint from a YAML/JSON file, or changing AWS/Kubernetes tags.

-- Stuart Clark
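As a minimal sketch of "only scraping one" via static configuration in prometheus.yaml (the job name, host, and path are hypothetical placeholders, not from the thread):

```yaml
scrape_configs:
  - job_name: 'myapp'              # hypothetical job name
    metrics_path: /metrics         # the one endpoint you decide to keep
    static_configs:
      - targets: ['myhost:8080']   # hypothetical host:port; the duplicate
                                   # endpoint is simply never listed
```

With file-based or cloud service discovery the same principle applies: remove the unwanted endpoint from the discovery source, or filter it out with a relabel_configs keep/drop rule.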
[prometheus-users] Prometheus agent, hashmod sharding issue
We have 4 prom-agents with the scrape config below; the regex is different for each agent (0, 1, 2, and 3):

  - job_name: 'kube-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__address__]
        modulus: 4
        target_label: __tmp_hash
        action: hashmod
      - source_labels: [__tmp_hash]
        regex: ^0$
        action: keep
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name

Our expectation was that the same metrics should not be scraped by two prometheus agents, but that is not working and we are seeing the same metrics being scraped by more than one agent. I verified this using the query:

  count(count({job="kube-pods"}) by (prometheus_agent, kubernetes_pod_name, *name*, instance)) by (kubernetes_pod_name, *name*, instance) > 1

There are some metrics with exactly 1 count as well, which means the issue is not consistent across all metrics. Not sure if it's something to do with the pods, their config, or something else.
Interestingly, we don't see such duplication for the scrape job below:

  - job_name: 'hlo-pods'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - hlo
    relabel_configs:
      - source_labels: [__address__]
        modulus: 4
        target_label: __tmp_hash
        action: hashmod
      - source_labels: [__tmp_hash]
        regex: ^2$
        action: keep
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
      - source_labels: [__meta_kubernetes_pod_container_name]
        action: replace
        target_label: kubernetes_container_name

Any clue what I should check? What could be the possible cause?
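One thing worth checking (my suggestion, not from the thread): in the 'kube-pods' job, hashmod runs on __address__ *before* the later rule rewrites __address__ from the prometheus.io/port annotation. A pod exposing several container ports is discovered as several targets with different original addresses; those can hash into different shards yet be rewritten to the same final address, so more than one agent keeps a copy. A sketch of sharding on the final address instead, under that assumption:

```yaml
relabel_configs:
  # Rewrite __address__ from the port annotation first...
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  # ...then shard on the rewritten address, so identical final
  # targets hash identically on every agent.
  - source_labels: [__address__]
    modulus: 4
    target_label: __tmp_hash
    action: hashmod
  - source_labels: [__tmp_hash]
    regex: ^0$     # ^1$, ^2$, ^3$ on the other agents
    action: keep
```

This would also be consistent with the 'hlo-pods' job not showing duplicates, since that job never rewrites __address__ after hashing.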
[prometheus-users] Prometheus same host multiple endpoints
Hi,

We have a situation where the same metrics are published on two different endpoints. Is it possible to pick one endpoint and discard the other while writing a PromQL query? Is it possible to configure Prometheus to collect metrics from only one endpoint?

/ Kishore
[prometheus-users] Re: postgres_db_exporter not able find metrics|| pg_postmaster_start_time_seconds
Also, if you *are* talking about prometheus-community/postgres_exporter, then note that pg_postmaster_start_time_seconds comes from the sample queries.yaml, which won't be used unless you're using the --extend.query-path flag or the PG_EXPORTER_EXTEND_QUERY_PATH environment variable to point to a copy of that file. See:

https://github.com/prometheus-community/postgres_exporter#adding-new-metrics-via-a-config-file
https://github.com/prometheus-community/postgres_exporter/blob/v0.11.1/queries.yaml#L9-L15

On Monday, 23 January 2023 at 08:17:13 UTC Brian Candler wrote:
> Do you really mean "postgres_db_exporter", or do you mean "postgres_exporter", i.e. this one? https://github.com/prometheus-community/postgres_exporter
>
> Maybe you have the wrong values for the 'release' and 'instance' labels. What happens if you query:
>
>   pg_postmaster_start_time_seconds
>
> without any label filters? (You can do this in Prometheus's own web interface.)
>
> If that doesn't return anything, then are you getting *any* metrics from the exporter at all? For example, if your scrape job has
>
>   - job_name: 'postgres'
>
> then you can try querying
>
>   {job="postgres"}
>
> If that query shows nothing, then probably your scrapes aren't working at all. Look in the Prometheus web interface under Status > Targets.
>
> On Monday, 23 January 2023 at 06:22:25 UTC prashan...@gmail.com wrote:
>> hello,
>>
>> I am not able to find metrics having pg_postmaster_start_time_seconds.
>>
>> postgres_exporter version: 0.11.1
>> DB version: 12.12.0
>>
>> pg_postmaster_start_time_seconds{release="$release", instance="$instance"} * 1000
>>
>> thanks
>> prashant
[prometheus-users] Re: An alert fires twice even if an event occurs only once
> But if you've totally lost connectivity from this region, then even if you try to send a message to PagerDuty or OpsGenie or whatever, won't that fail too?

That is true. Nevertheless, what I did so far is reduce the number of nodes in the cluster from 8 to 4, so we now have one alertmanager node in every region. After one week, no duplication observed. I will keep this config for the next few weeks.

On Monday, January 16, 2023 at 7:19:37 PM UTC+1 Brian Candler wrote:
>> (1) We would like to avoid such an architecture. In this scenario we keep one region without a local alertmanager. It means that we could lose alerts in case of a lost connection between that region and the regions where the alertmanager cluster is configured.
>
> But if you've totally lost connectivity from this region, then even if you try to send a message to PagerDuty or OpsGenie or whatever, won't that fail too?
>
> On Monday, 16 January 2023 at 14:32:12 UTC LukaszSz wrote:
>> Hi,
>>
>> (1) We would like to avoid such an architecture. In this scenario we keep one region without a local alertmanager. It means that we could lose alerts in case of a lost connection between that region and the regions where the alertmanager cluster is configured.
>>
>> (2) It looks very promising. Currently one blocking point is the lack of a frontend where we can set a silence. I saw your previous posts about Karma. We are going to test this direction.
>>
>> Our other ideas are:
>>
>> (3) Reduce the current AM cluster from 8 to 4 nodes (1 AM per region)
>> (4) If (3) does not help, we want to tweak/play with gossip to improve AM node communication. Does anyone have experience with gossip and some best practice in AM HA?
>>
>> Thanks
>>
>> On Sunday, January 15, 2023 at 11:58:47 AM UTC+1 Brian Candler wrote:
>>> I wouldn't have thought that a few hundred ms of latency would make any difference.
>>>
>>> I am however worried about the gossiping.
>>> If this is one monster-sized cluster, then each of the 8 nodes should be communicating with the other 7.
>>>
>>> I'd say this is a bad design. Either:
>>>
>>> 1. Have a single global alertmanager cluster, with 2 nodes - that will give you excellent high availability for your alerting. (How often do you expect two regions to go offline simultaneously?) Or 3 nodes if your management absolutely insists on it. (But this isn't the sort of cluster that needs to maintain a quorum.)
>>>
>>> Or:
>>>
>>> 2. Completely separate the regions. Have one alertmanager cluster in region A, one cluster in region B, one cluster in region C. Have the prometheus instances in region A only talking to the alertmanager instances in region A, and so on. In this case, each region sends its alerts completely independently.
>>>
>>> There is little benefit in option (2) unless there are tight restrictions on inter-region communication; it gives you a lot more stuff to manage. If you need to go this route, then having a frontend like Karma or alerta.io may be helpful.
>>>
>>> On Friday, 13 January 2023 at 14:53:14 UTC LukaszSz wrote:
>>>> Interesting. It seems that the alertmanagers are spread over 3 different regions (2xAsia, 2xUSA, 4xEurope). Maybe there is some latency problem between them, like latency in gossip messages?
>>>>
>>>> On Friday, January 13, 2023 at 3:28:57 PM UTC+1 Brian Candler wrote:
>>>>> That's a lot of alertmanagers. Are they all fully meshed? (But I'd say 2 or 3 would be better - spread over different regions)
>>>>>
>>>>> On Friday, 13 January 2023 at 14:16:27 UTC LukaszSz wrote:
>>>>>> Yes.
>>>>>> The prometheus server is configured to communicate with all alertmanagers (sorry, there are 8 alertmanagers):
>>>>>>
>>>>>>   alerting:
>>>>>>     alert_relabel_configs:
>>>>>>       - action: labeldrop
>>>>>>         regex: "^prometheus_server$"
>>>>>>     alertmanagers:
>>>>>>       - static_configs:
>>>>>>           - targets:
>>>>>>               - alertmanager1:9093
>>>>>>               - alertmanager2:9093
>>>>>>               - alertmanager3:9093
>>>>>>               - alertmanager4:9093
>>>>>>               - alertmanager5:9093
>>>>>>               - alertmanager6:9093
>>>>>>               - alertmanager7:9093
>>>>>>               - alertmanager8:9093
>>>>>>
>>>>>> On Friday, January 13, 2023 at 2:02:14 PM UTC+1 Brian Candler wrote:
>>>>>>> Yes, but have you configured the prometheus (the one which has alerting rules) to have all four alertmanagers as its destination?
>>>>>>>
>>>>>>> On Friday, 13 January 2023 at 12:55:49 UTC LukaszSz wrote:
>>>>>>>> Yes Brian. As I mentioned in my post, the Alertmanagers are in a cluster and this event is visible on my 4 alertmanagers. The problem which I described is that alerts are firing twice and it generates duplication.
>>>>>>>>
>>>>>>>> On Friday, January 13, 2023 at 1:34:52 PM UTC+1 Brian Candler wrote:
>>>>>>>>> Are the alertmanagers clustered? Then you should configure prometheus to
[prometheus-users] Re: Unable to Scrape prometheus from Postgres DB - Error "Conn busy"
On Sunday, 22 January 2023 at 19:25:27 UTC irtizahs...@gmail.com wrote:
> Hi Team, I have integrated prometheus with an external RDS postgres DB to store the data permanently. But when I try to read 3 weeks of data inside prometheus using the read handler, the data is not reflected in the prometheus dashboard, showing a "conn busy" error or sometimes "timeline exceeded".

Prometheus doesn't talk to Postgres directly, so presumably you are using some sort of third-party remote-write/remote-read adapter, and most likely you're having a problem with that adapter. Therefore, your starting point should be to look at the logs of that adapter. You haven't said anything about what adapter you're using, nor even what version of Prometheus you're using. But the best place to get help with the adapter will be wherever it came from.

There are many other long-term storage solutions available, which may work better for you if you don't necessarily require Postgres.

> Could someone please look into this on a priority basis.

http://www.catb.org/~esr/faqs/smart-questions.html#urgent (the whole document is well worth reading).
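For context, the usual wiring between Prometheus and such an adapter looks roughly like this in prometheus.yml (a sketch only; the adapter host, port, and paths are hypothetical and depend entirely on which adapter is in use):

```yaml
remote_write:
  - url: "http://my-adapter:9201/write"   # hypothetical adapter write endpoint

remote_read:
  - url: "http://my-adapter:9201/read"    # hypothetical adapter read endpoint
    read_recent: true   # also query the remote store for recent data
```

Both sides go through the adapter, so "conn busy"-style errors from the Postgres driver will surface in the adapter's logs, not Prometheus's.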
[prometheus-users] Re: postgres_db_exporter not able find metrics|| pg_postmaster_start_time_seconds
Do you really mean "postgres_db_exporter", or do you mean "postgres_exporter", i.e. this one? https://github.com/prometheus-community/postgres_exporter

Maybe you have the wrong values for the 'release' and 'instance' labels. What happens if you query:

  pg_postmaster_start_time_seconds

without any label filters? (You can do this in Prometheus's own web interface.)

If that doesn't return anything, then are you getting *any* metrics from the exporter at all? For example, if your scrape job has

  - job_name: 'postgres'

then you can try querying

  {job="postgres"}

If that query shows nothing, then probably your scrapes aren't working at all. Look in the Prometheus web interface under Status > Targets.

On Monday, 23 January 2023 at 06:22:25 UTC prashan...@gmail.com wrote:
> hello,
>
> I am not able to find metrics having pg_postmaster_start_time_seconds.
>
> postgres_exporter version: 0.11.1
> DB version: 12.12.0
>
> pg_postmaster_start_time_seconds{release="$release", instance="$instance"} * 1000
>
> thanks
> prashant
Re: [prometheus-users] prometheus alermanager not sending alert CC user list
You've mangled the YAML, so it definitely won't work as you've shown it. You'll need something like this:

  - name: 'MoniDashboard'
    email_configs:
      - send_resolved: true
        to: 'vinoth.sundaram@domain1,PrashantKumar.Singh1@domain2'
        headers:
          Subject: "{{ .CommonAnnotations.summary }}"
          To: 'vinoth.sundaram@domain1'
          Cc: 'PrashantKumar.Singh1@domain2'
        require_tls: no

The important thing is that the 'to' value needs to be a comma-separated list of recipient addresses. See:

https://github.com/prometheus/alertmanager/blob/v0.25.0/notify/email/email.go#L230-L238
https://pkg.go.dev/net/mail#ParseAddressList

If it still doesn't work how you want, then you need to be more specific about your problem than "this is not working". For example:

- If the message isn't delivered to both recipients, you'll need to look at the log output of alertmanager and the logs from your E-mail relay (SMTP host).
- If the message is delivered to both recipients, but the "To" and "Cc" headers don't appear in the way that you want, then you'll need to show the actual message headers that you receive, and explain how they differ from what you expected.

On Monday, 23 January 2023 at 06:17:05 UTC prashan...@gmail.com wrote:
> Hi,
>
> This is not working; I am not able to receive alerts for the CC user:
>
>   - name: 'MoniDashboard'
>     email_configs:
>       - send_resolved: true
>         to: 'vinoth.sundaram@
>         headers:
>           subject: "{{ .CommonAnnotations.summary }}"
>           to: 'vinoth.sundaram@
>           CC: 'PrashantKumar.Singh1@
>         require_tls: no
>
> Thanks
> prashant
>
> On Friday, January 20, 2023 at 7:33:08 PM UTC+5:30 juliu...@promlabs.com wrote:
> Ah, thanks for that addition.
> Indeed, "to" is handled in a special way here:
> https://github.com/prometheus/alertmanager/blob/f59460bfd4bf883ca66f4391e7094c0c1794d158/notify/email/email.go#LL230C28-L230C28
>
> Which makes sense given that in SMTP, there are always separate "RCPT TO: <...>" lines (which I guess this translates into) before the body of the email that contains the headers.
>
> On Fri, Jan 20, 2023 at 2:34 PM Julien Pivotto wrote:
>> To send an email with a CC: in alertmanager, it is not sufficient to add a CC: header:
>>
>>   receivers:
>>     - name: my-receiver
>>       email_configs:
>>         - to: 'o...@foo.com'
>>           headers:
>>             subject: 'my subject'
>>             CC: 't...@bar.com'
>>
>> You also need to add the CC: address to the to: field and explicitly add a To: header:
>>
>>   receivers:
>>     - name: my-receiver
>>       email_configs:
>>         - to: 'o...@foo.com,t...@bar.com'
>>           headers:
>>             subject: 'my subject'
>>             To: 'o...@foo.com'
>>             CC: 't...@bar.com'
>>
>> To send a mail in BCC:, overwrite the To: header and add the BCC addresses to the to: field:
>>
>>   receivers:
>>     - name: my-receiver
>>       email_configs:
>>         - to: 'o...@foo.com,th...@foo.com'
>>           headers:
>>             subject: 'my subject'
>>             To: 'o...@foo.com'
>>
>> On 20 Jan 14:28, Julius Volz wrote:
>>> Hi Prashant,
>>>
>>> Looking at the email implementation in Alertmanager, "to" should be treated exactly as "cc" internally (just optionally supplied through a dedicated YAML field). It's just another header:
>>> https://github.com/prometheus/alertmanager/blob/f59460bfd4bf883ca66f4391e7094c0c1794d158/notify/email/email.go#L53-L70
>>>
>>> So I would be surprised if the problem is in Alertmanager itself, rather than something in the email pipeline after it.
>>>
>>> Kind regards,
>>> Julius
>>>
>>> On Fri, Jan 20, 2023 at 1:38 PM Prashant Singh wrote:
>>>> Dear team,
>>>>
>>>> prometheus alertmanager CC is not working, even when I add To: and CC:.
>>>>
>>>> To: is working.
>>>> CC: is not working.
>>>>   - name: 'MoniDashboard'
>>>>     email_configs:
>>>>       - send_resolved: true
>>>>         to: 'vinoth.sundaram
>>>>         headers:
>>>>           cc: 'PrashantKumar.Singh1'
>>>>         require_tls: no
>>>>
>>>> thanks
>>>> Prashant
>>>
>>> --
>>> Julius Volz
>>> PromLabs - promlabs.com