[prometheus-users] Re: Prometheus same host multiple endpoints

2023-01-23 Thread Brian Candler
Can you give a specific example of "same metrics are published on two 
different endpoints"?

You might mean:
- two different metric names
- the same metric name, but different labels

And it might be that you're scraping the same target twice, or you're 
scraping one target but that target is (for some reason) returning 
duplicates in the scrape results.  Or you might have a more complex 
scenario, e.g. multiple prometheus servers scraping for redundancy, and 
then you're combining the results together somehow.

> Is it possible to pick one endpoint and discard the other while writing a 
> PromQL query?

Sure.  Just filter in the PromQL query.  For example, if you have

foo{aaa="bbb",ccc="ddd"} 123.0
foo{aaa="bbb",ccc="fff"} 123.0

and you consider the one with ccc="fff" to be a "duplicate" metric, then

foo{ccc!="fff"}

might be what you want.

Otherwise, you can avoid ingesting the duplicate metrics:
- by not scraping the second set in the first place
- if they all come from the same scrape, by using metric_relabel_configs 
to drop the metrics that you don't want to keep (see the sketch below)
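A minimal sketch of that second approach, with a placeholder job name and target, 
using the example metric and labels from above (untested):

scrape_configs:
  - job_name: 'example'
    static_configs:
      - targets: ['host.example:9100']   # placeholder target
    metric_relabel_configs:
      # drop any series of metric "foo" that carries the label ccc="fff"
      - source_labels: [__name__, ccc]
        regex: 'foo;fff'
        action: drop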

On Monday, 23 January 2023 at 14:40:40 UTC kishore...@gmail.com wrote:

> Hi,
> We have a situation where the same metrics are published on two different 
> endpoints. Is it possible to pick one endpoint and discard the other while 
> writing a PromQL query?
> Is it possible to configure Prometheus to collect metrics from only one 
> endpoint?
>
> / Kishore
>
>
>



[prometheus-users] Re: Can 2 prometheus agents scraping the same targets have a difference in the prometheus_agent_active_series metric

2023-01-23 Thread Brian Candler
That looks fine to me. The "head block" contains all timeseries which were 
active in the last two hours; if you have some series churn (i.e. new 
series being created and old ones stopping) then the head block will grow as 
new series are added, but you'll only see the old ones drop out when it 
makes a new block.  You can clearly see those dips in the green graph; I'd 
expect the same in the yellow one at 12:45 and 16:45, but they're just out 
of frame.

Since the two prometheus instances don't change head block at the same 
time, the number of timeseries in each is different:
- green swapped at about 14:08, then starts to grow
- yellow swapped at about 14:45, dropping to the level where green was at 
14:08, and then starts to grow from there
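
If you want to watch this from PromQL, a rough sketch using the metric from your 
screenshot (assuming both replicas are visible from the same query endpoint):

# active series per agent replica
prometheus_agent_active_series

# the gap between the two replicas; it should close each time the replica
# that is currently ahead truncates its head block
max(prometheus_agent_active_series) - min(prometheus_agent_active_series)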

On Monday, 23 January 2023 at 14:41:24 UTC vikas@gmail.com wrote:

> We have 2 replicas of prometheus agent setup in kubernetes cluster.
>
> We noticed that these two prometheus agents are not scraping the same number 
> of metrics; we verified this using the prometheus_agent_active_series metric.
>
> However, when we checked the prometheus agent targets, they are exactly the same 
> for both agents. The difference in prometheus_agent_active_series is consistent 
> (attached is the prometheus_agent_active_series graph)
>
> [image: Screenshot 2023-01-18 at 7.05.59 PM.png]
>
> We were expecting prometheus_agent_active_series to match for both agents; 
> we're not sure whether prometheus_agent_active_series is unreliable or 
> something else is going on.
>
> Please let me know what else I can check to drill into this.
>



Re: [prometheus-users] Prometheus same host multiple endpoints

2023-01-23 Thread Stuart Clark

On 2023-01-23 05:52, Kishore Chopne wrote:

Hi,
 We have a situation where the same metrics are published on two
different endpoints. Is it possible to pick one endpoint and discard
the other while writing a PromQL query?
Is it possible to configure Prometheus to collect metrics from only
one endpoint?



If they are literally the same metrics available on multiple endpoints, 
then I'd suggest only scraping one. That's controlled via your scrape 
configuration. Depending on which mechanism you are using to manage that, 
it could mean changes to prometheus.yaml, removing an endpoint from a 
YAML/JSON file used for file-based service discovery, or changing 
AWS/Kubernetes tags.
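
For the simplest case (static targets), that just means listing only the endpoint 
you want to keep - a minimal sketch with placeholder job, host and port values:

scrape_configs:
  - job_name: 'myapp'
    metrics_path: /metrics              # the endpoint you want to keep
    static_configs:
      - targets: ['myhost.example:9100']
    # the duplicate endpoint (e.g. myhost.example:9200) is simply not listed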


--
Stuart Clark



[prometheus-users] Prometheus agent, hashmod sharding issue

2023-01-23 Thread Vikas Budhvant


We have 4 prom-agents with the scrape config below.
The regex is different for each agent (0, 1, 2, and 3).

- job_name: 'kube-pods'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__address__]
    modulus: 4
    target_label: __tmp_hash
    action: hashmod
  - source_labels: [__tmp_hash]
    regex: ^0$
    action: keep
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name

Our expectation was that the same metrics should not be scraped by two prometheus 
agents, but that is not what we see: the same metrics are being scraped by more 
than one agent.
I verified this using the query

count(count({job="kube-pods"}) by (prometheus_agent, kubernetes_pod_name, __name__, instance)) by (kubernetes_pod_name, __name__, instance) > 1

There are some metrics with a count of exactly 1 as well, which means the issue 
is not consistent across all metrics. I'm not sure if it's something to do with 
the pods, their config, or something else.
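
(As a rough cross-check - assuming the prometheus_agent label from the query above 
identifies the scraping replica, and querying wherever both replicas write to - each 
target should be claimed by exactly one shard:

# values > 1 mean more than one agent is scraping the same target
count by (instance) (
  count by (prometheus_agent, instance) (up{job="kube-pods"})
) > 1
)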

Interestingly, we don't see such duplication for the scrape job below:
- job_name: 'hlo-pods'
  kubernetes_sd_configs:
  - role: pod
    namespaces:
      names:
      - hlo
  relabel_configs:
  - source_labels: [__address__]
    modulus: 4
    target_label: __tmp_hash
    action: hashmod
  - source_labels: [__tmp_hash]
    regex: ^2$
    action: keep
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
  - source_labels: [__meta_kubernetes_pod_container_name]
    action: replace
    target_label: kubernetes_container_name

Any clue what I should check? What could be the possible cause?



[prometheus-users] Prometheus same host multiple endpoints

2023-01-23 Thread Kishore Chopne
Hi,
 We have a situation where the same metrics are published on two different 
endpoints. Is it possible to pick one endpoint and discard the other while 
writing a PromQL query?
Is it possible to configure Prometheus to collect metrics from only one 
endpoint?

/ Kishore




[prometheus-users] Re: postgres_db_exporter not able to find metrics || pg_postmaster_start_time_seconds

2023-01-23 Thread Brian Candler
Also, if you *are* talking about prometheus-community/postgres_exporter, 
then note that pg_postmaster_start_time_seconds comes from the sample 
queries.yaml, which won't be used unless you're using the 
--extend.query-path flag or the PG_EXPORTER_EXTEND_QUERY_PATH environment 
variable to point to a copy of that file. See:

https://github.com/prometheus-community/postgres_exporter#adding-new-metrics-via-a-config-file
https://github.com/prometheus-community/postgres_exporter/blob/v0.11.1/queries.yaml#L9-L15
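
For reference, the pg_postmaster entry in that sample file looks roughly like 
this (paraphrased - the second link above is the authoritative version):

pg_postmaster:
  query: "SELECT pg_postmaster_start_time as start_time_seconds from pg_postmaster_start_time()"
  master: true
  metrics:
    - start_time_seconds:
        usage: "GAUGE"
        description: "Time at which postmaster started"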

On Monday, 23 January 2023 at 08:17:13 UTC Brian Candler wrote:

> Do you really mean "postgres_db_exporter", or do you mean 
> "postgres_exporter", i.e. this one?
> https://github.com/prometheus-community/postgres_exporter
>
> Maybe you have the wrong values for 'release' and 'instance' labels.  What 
> happens if you query:
>
> pg_postmaster_start_time_seconds
>
> without any label filters? (You can do this in prometheus's own web 
> interface).
>
> If that doesn't return anything, then are you getting *any* metrics from 
> the exporter at all?  For example, if your scrape job has
>
>   - job_name: 'postgres'
>
> then you can try querying
>
> {job="postgres"}
>
> If that query shows nothing, then probably your scrapes aren't working at 
> all.  Look in the Prometheus web interface under Status > Targets.
>
> On Monday, 23 January 2023 at 06:22:25 UTC prashan...@gmail.com wrote:
>
>> hello ,
>>
>> I am not able to find the metric pg_postmaster_start_time_seconds.
>>
>> postgres_exporter version: 0.11.1
>> DB version: 12.12.0
>>
>>
>> pg_postmaster_start_time_seconds{release="$release", 
>> instance="$instance"} * 1000
>>
>> thanks 
>>
>> prashant 
>>
>



[prometheus-users] Re: An alert fires twice even if an event occurs only once

2023-01-23 Thread LukaszSz
> But if you've totally lost connectivity from this region, then even if you 
> try to send a message to PagerDuty or OpsGenie or whatever, won't that fail 
> too?

That is true.

Nevertheless, what I have done so far is reduce the number of nodes in the cluster 
from 8 to 4 - every region now has one alertmanager node. After one week, no 
duplication has been observed. I will keep this config for the next few weeks.

On Monday, January 16, 2023 at 7:19:37 PM UTC+1 Brian Candler wrote:

> > (1) We would like to avoid such an architecture. In this scenario we keep one 
> > region without a local alertmanager. It means that we could lose alerts if we 
> > lose the connection between that region and the regions where the alertmanager 
> > cluster is configured.
>
> But if you've totally lost connectivity from this region, then even if you 
> try to send a message to PagerDuty or OpsGenie or whatever, won't that fail 
> too?
>
> On Monday, 16 January 2023 at 14:32:12 UTC LukaszSz wrote:
>
>> Hi ,
>>
>> (1) We would like to avoid such an architecture. In this scenario we keep one 
>> region without a local alertmanager. It means that we could lose alerts if we 
>> lose the connection between that region and the regions where the alertmanager 
>> cluster is configured.
>>
>> (2) It looks very promising. Currently one blocking point is the lack of a 
>> frontend where we can set a silence. I saw your previous posts about Karma. 
>> We are going to test this direction.
>>
>> Our other ideas are:
>>
>> (3) Reduce the current AM cluster from 8 to 4 nodes (1 AM per region).
>> (4) If (3) does not help, we want to tweak/play with gossip to improve AM node 
>> communication. Do you or does anyone have experience with gossip and some best 
>> practices for AM HA?
>>
>> Thanks 
>>
>>
>> On Sunday, January 15, 2023 at 11:58:47 AM UTC+1 Brian Candler wrote:
>>
>>> I wouldn't have thought that a few hundred ms of latency would make any 
>>> difference.
>>>
>>> I am however worried about the gossiping.  If this is one monster-sized 
>>> cluster, then all 8 nodes should be communicating with the other 7 nodes.
>>>
>>> I'd say this is a bad design.  Either:
>>>
>>> 1. Have a single global alertmanager cluster, with 2 nodes - that will 
>>> give you excellent high availability for your alerting.  (How often do you 
>>> expect two regions to go offline simultaneously?)  Or 3 nodes if your 
>>> management absolutely insists on it.  (But this isn't the sort of cluster 
>>> that needs to maintain a quorum).
>>>
>>> Or:
>>>
>>> 2. Completely separate the regions.  Have one alertmanager cluster in 
>>> region A, one cluster in region B, one cluster in region C.  Have the 
>>> prometheus instances in region A only talking to the alertmanager instances 
>>> in region A, and so on.  In this case, each region sends its alerts 
>>> completely independently.
>>>
>>> There is little benefit in option (2) unless there are tight 
>>> restrictions on inter-region communication; it gives you a lot more stuff 
>>> to manage.  If you need to go this route, then having a frontend like Karma 
>>> or alerta.io may be helpful.
>>>
>>> On Friday, 13 January 2023 at 14:53:14 UTC LukaszSz wrote:
>>>
 Interesting. It seems that the alertmanagers are spread over 3 different 
 regions (2x Asia, 2x USA, 4x Europe).
 Maybe there is some latency problem between them, like latency in gossip 
 messages?

 On Friday, January 13, 2023 at 3:28:57 PM UTC+1 Brian Candler wrote:

> That's a lot of alertmanagers.  Are they all fully meshed?  (But I'd 
> say 2 or 3 would be better - spread over different regions)
>
> On Friday, 13 January 2023 at 14:16:27 UTC LukaszSz wrote:
>
>> Yes. The prometheus server is configured to communicate with all 
>> alertmanagers (sorry, there are 8 alertmanagers):
>>
>> alerting:
>>   alert_relabel_configs:
>>   - action: labeldrop
>> regex: "^prometheus_server$"
>>   alertmanagers:
>>   - static_configs:
>> - targets:
>>   - alertmanager1:9093
>>   - alertmanager2:9093
>>   - alertmanager3:9093
>>   - alertmanager4:9093
>>   - alertmanager5:9093
>>   - alertmanager6:9093
>>   - alertmanager7:9093
>>   - alertmanager8:9093 
>>
>> On Friday, January 13, 2023 at 2:02:14 PM UTC+1 Brian Candler wrote:
>>
>>> Yes, but have you configured the prometheus (the one which has 
>>> alerting rules) to have all four alertmanagers as its destination?
>>>
>>> On Friday, 13 January 2023 at 12:55:49 UTC LukaszSz wrote:
>>>
 Yes Brian. As I mentioned in my post, the Alertmanagers are in a 
 cluster and this event is visible on my 4 alertmanagers.
 The problem which I described is that alerts are firing twice, which 
 generates duplication.

 On Friday, January 13, 2023 at 1:34:52 PM UTC+1 Brian Candler wrote:

> Are the alertmanagers clustered?  Then you should configure 
> prometheus to 

[prometheus-users] Re: Unable to Scrape prometheus from Postgres DB - Error "Conn busy"

2023-01-23 Thread Brian Candler
On Sunday, 22 January 2023 at 19:25:27 UTC irtizahs...@gmail.com wrote:
Hi Team,
I have integrated prometheus with an external RDS postgres DB to store the 
data permanently. 
But when I try to read 3 weeks of data in prometheus using the read 
handler, the data is not reflected in the prometheus dashboard; it shows a "conn 
busy" error or sometimes a "timeline exceeded" error.

Prometheus doesn't talk to Postgres directly, so presumably you are using 
some sort of third-party remote-write/remote-read adapter, and most likely 
you're having a problem with that adapter.  Therefore, your starting point 
should be to look at the logs of that adapter.
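
For context, such an adapter is normally wired in via remote_write/remote_read in 
prometheus.yml - a minimal sketch with a placeholder adapter URL (not taken from 
your setup):

remote_write:
  - url: "http://adapter.example:9201/write"   # placeholder adapter endpoint
remote_read:
  - url: "http://adapter.example:9201/read"    # placeholder adapter endpoint
    read_recent: false   # skip remote reads for ranges the local TSDB fully covers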

You haven't said anything about what adapter you're using, nor even what 
version of Prometheus you're using.  But the best place to get help with 
the adapter will be wherever it came from.

There are many other long-term storage solutions available, which may work 
better for you if you don't necessarily require Postgres.
 

Could someone please look into this on a priority basis.

http://www.catb.org/~esr/faqs/smart-questions.html#urgent 

(the whole document is well worth reading).



[prometheus-users] Re: postgres_db_exporter not able to find metrics || pg_postmaster_start_time_seconds

2023-01-23 Thread Brian Candler
Do you really mean "postgres_db_exporter", or do you mean 
"postgres_exporter", i.e. this one?
https://github.com/prometheus-community/postgres_exporter

Maybe you have the wrong values for 'release' and 'instance' labels.  What 
happens if you query:

pg_postmaster_start_time_seconds

without any label filters? (You can do this in prometheus's own web 
interface).

If that doesn't return anything, then are you getting *any* metrics from 
the exporter at all?  For example, if your scrape job has

  - job_name: 'postgres'

then you can try querying

{job="postgres"}

If that query shows nothing, then probably your scrapes aren't working at 
all.  Look in the Prometheus web interface under Status > Targets.
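
(A quick sanity check, assuming that job name: the built-in up metric is 1 if the 
last scrape of a target succeeded and 0 if it failed:

up{job="postgres"}
)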

On Monday, 23 January 2023 at 06:22:25 UTC prashan...@gmail.com wrote:

> hello ,
>
> I am not able to find the metric pg_postmaster_start_time_seconds.
>
> postgres_exporter version: 0.11.1
> DB version: 12.12.0
>
>
> pg_postmaster_start_time_seconds{release="$release", instance="$instance"} 
> * 1000
>
> thanks 
>
> prashant 
>



Re: [prometheus-users] prometheus alertmanager not sending alert to CC user list

2023-01-23 Thread Brian Candler
You've mangled the YAML so it definitely won't work with what you've shown.

You'll need something like this:

- name: 'MoniDashboard'
  email_configs:
  - send_resolved: true
to: 'vinoth.sundaram@domain1,PrashantKumar.Singh1@domain2'
headers:
  Subject: "{{ .CommonAnnotations.summary }}"
  To: 'vinoth.sundaram@domain1'
  Cc: 'PrashantKumar.Singh1@domain2'
require_tls: no

The important thing is that you need the 'to' value to be a comma-separated 
list of recipient addresses.  See:
https://github.com/prometheus/alertmanager/blob/v0.25.0/notify/email/email.go#L230-L238
https://pkg.go.dev/net/mail#ParseAddressList

If it still doesn't work how you want, then you need to be more specific 
about your problem than "this is not working". For example:
- If the message isn't delivered to both recipients, you'll need to look 
at the log output of alertmanager and the logs from your E-mail relay (SMTP 
host)
- If the message is delivered to both recipients, but the "To" and "Cc" 
headers don't appear in the way that you want, then you'll need to show the 
actual message headers that you receive, and explain how they differ from 
what you expected.

On Monday, 23 January 2023 at 06:17:05 UTC prashan...@gmail.com wrote:

> Hi ,
>
> this is not working , i am not able to receive alerts  in having user  CC 
>
> - name: 'MoniDashboard'
>   email_configs:
>   - send_resolved: true
> to: 'vinoth.sundaram@
> headers:
>   subject: "{{ .CommonAnnotations.summary }}"
>   to: 'vinoth.sundaram@
>   CC: 'PrashantKumar.Singh1@
> require_tls: no
>
>
> Thanks 
> prashant 
>
>
>
>
> On Friday, January 20, 2023 at 7:33:08 PM UTC+5:30 juliu...@promlabs.com 
> wrote:
> Ah, thanks for that addition. Indeed, "to" is handled in a special way 
> here: 
> https://github.com/prometheus/alertmanager/blob/f59460bfd4bf883ca66f4391e7094c0c1794d158/notify/email/email.go#LL230C28-L230C28
>
> Which makes sense given that in SMTP, there's always separate "RCPT TO: 
> <...>" lines (which I guess this translates into) before the body of the 
> email that contains the headers.
>
> On Fri, Jan 20, 2023 at 2:34 PM Julien Pivotto  
> wrote:
> To send an email with a CC: in alertmanager, it is not sufficient to add a 
> CC: header.
>
> receivers:
> - name: my-receiver
>   email_configs:
>   - to: 'o...@foo.com'
> headers:
>   subject: 'my subject'
>   CC: 't...@bar.com'
>
> You also need to add the CC: address to the to: field and explicitly add 
> a To: header.
>
> receivers:
> - name: my-receiver
>   email_configs:
>   - to: 'o...@foo.com,t...@bar.com'
> headers:
>   subject: 'my subject'
>   To: 'o...@foo.com'
>   CC: 't...@bar.com'
>
> To send a mail in BCC:, overwrite the To: header and add the BCC addresses 
> to the to: field:
>
> receivers:
> - name: my-receiver
>   email_configs:
>   - to: 'o...@foo.com,th...@foo.com'
> headers:
>   subject: 'my subject'
>   To: 'o...@foo.com'
>
> On 20 Jan 14:28, Julius Volz wrote:
> > Hi Prashant,
> > 
> > Looking at the email implementation in Alertmanager, "to" should be treated 
> > exactly as "cc" internally (just optionally supplied through a dedicated 
> > YAML field). It's just another header:
> > 
> https://github.com/prometheus/alertmanager/blob/f59460bfd4bf883ca66f4391e7094c0c1794d158/notify/email/email.go#L53-L70
> > 
> > So I would be surprised if the problem is in Alertmanager itself, rather
> > than something in the email pipeline after it.
> > 
> > Kind regards,
> > Julius
> > 
> > On Fri, Jan 20, 2023 at 1:38 PM Prashant Singh 
> > wrote:
> > 
> > > Dear team,
> > >
> > > prometheus alertmanager CC is not working even when I add TO and CC.
> > >
> > > TO:  it is working
> > > CC: it is not working.
> > >
> > >
> > > - name: 'MoniDashboard'
> > >   email_configs:
> > >   - send_resolved: true
> > > to: 'vinoth.sundaram
> > > headers:
> > >   cc: 'PrashantKumar.Singh1'
> > > require_tls: no
> > >
> > > thanks
> > > Prashant
> > >
> > 
> > -- 
> > Julius Volz
> > PromLabs - promlabs.com
> > 