[prometheus-users] Re: Promteheus HA different metrics

2023-09-04 Thread Brian Candler
Note that setting the scrape timeout longer than the scrape interval won't 
achieve anything.

I'd suggest you investigate by looking at the history of the "up" metric: 
this will go to zero on scrape failures.  Can you discern a pattern?  Is it 
only on a certain type of target, or targets running on a particular k8s 
node?  Is it intermittent across all targets, or some targets which fail 
100% of the time?

If you compare the Targets page on both servers, are they scraping exactly 
the same URLs?  (That is, check whether service discovery is giving 
different results)
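
As a rough sketch of how to look at the "up" history, a couple of recording rules like the following could go into a file matched by the rule_files glob from the posted config (/etc/prometheus/rules/*.rules). The group name, rule names and 6h window are made up for illustration; the job name and the kubernetes_pod_node_name label are the ones from the posted config.

```yaml
groups:
  - name: scrape_health   # hypothetical group name
    rules:
      # Fraction of successful scrapes per target over the last 6 hours:
      # 1 = always up, 0 = always failing, in between = intermittent.
      - record: instance:up:avg_over_time_6h
        expr: avg_over_time(up{job="k8s_pods"}[6h])
      # The same fraction averaged per Kubernetes node, to see whether
      # failures cluster on particular nodes.
      - record: node:up:avg_over_time_6h
        expr: avg by (kubernetes_pod_node_name) (avg_over_time(up{job="k8s_pods"}[6h]))
```

Graphing these on both servers should show whether the failures are concentrated on particular pods or nodes, and whether the two servers disagree about the same targets.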

On Tuesday, 5 September 2023 at 06:09:55 UTC+1 Анастасия Зель wrote:

> Yes, I see the errors on the Targets page in the web interface.
> I tried increasing the timeout to 5 minutes and it changed nothing.
> It's strange, because Prometheus 2 always gets this error on similar pods,
> and Prometheus 1 never gets these errors on these pods.

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/ff7ed768-c75b-462d-be60-7c2d47773751n%40googlegroups.com.


[prometheus-users] Re: Promteheus HA different metrics

2023-09-04 Thread Анастасия Зель
Yes, I see the errors on the Targets page in the web interface.
I tried increasing the timeout to 5 minutes and it changed nothing.
It's strange, because Prometheus 2 always gets this error on similar pods,
and Prometheus 1 never gets these errors on these pods.



Re: [prometheus-users] Re: Promteheus HA different metrics

2023-09-04 Thread Ben Kochie
On Mon, Sep 4, 2023 at 5:00 PM Brian Candler  wrote:

> Where *do* you see the "context deadline exceeded" errors then?
>

Usually on the `/targets` page.

Prometheus does not log scrape errors by default. I would love this to be a
configuration option, or even better, a per-job `scrape_configs` option.




[prometheus-users] Re: Promteheus HA different metrics

2023-09-04 Thread Brian Candler
On Monday, 4 September 2023 at 15:49:25 UTC+1 Анастасия Зель wrote:

> Hello, we use HA Prometheus with two servers.

You mean, two Prometheus servers with the same config, both scraping the 
same targets?

> The problem is that we get different metrics in dashboards from these two
> servers.

Small differences are to be expected.  That's because the two servers won't 
be scraping the targets at the same points in time.  If you see more 
significant differences, then please provide some examples.

> We also scrape metrics from k8s, and some pods are not being scraped
> because of the error "context deadline exceeded".

That basically means "scrape timed out".  The scrape hadn't completed 
within the "scrape_timeout:" value that you've set.  You'll need to look at 
your individual exporters and the failing scrape URLs: either the target is 
not reachable at all (e.g. firewalling or network configuration issue), or 
the target is taking too long to respond.

> It's different pods on each server. In the Prometheus logs we don't see
> any errors.

Where *do* you see the "context deadline exceeded" errors then?
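
To see which targets are running close to the timeout, it may help to look at scrape_duration_seconds, which Prometheus records for every target alongside "up". A minimal sketch of an alerting rule, assuming the 1m scrape_timeout of the k8s_pods job from the posted config; the group name, alert name, severity label and 45s threshold are invented for illustration:

```yaml
groups:
  - name: scrape_timing   # hypothetical group name
    rules:
      - alert: ScrapeNearTimeout   # hypothetical alert name
        # 45s is an assumed threshold: 75% of the job's 1m scrape_timeout.
        # max_over_time is used because this job only scrapes every 5m.
        expr: max_over_time(scrape_duration_seconds{job="k8s_pods"}[15m]) > 45
        labels:
          severity: warning
        annotations:
          summary: "Scrape of {{ $labels.instance }} took {{ $value }}s"
```

A target whose scrape duration sits at roughly the timeout value is being cut off rather than refused, which fits both the "not reachable" and the "too slow" cases above.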



[prometheus-users] Promteheus HA different metrics

2023-09-04 Thread Анастасия Зель


Hello, we use HA Prometheus with two servers.
The problem is that we get different metrics in dashboards from these two
servers.
We also scrape metrics from k8s, and some pods are not being scraped because
of the error "context deadline exceeded".
It's different pods on each server. In the Prometheus logs we don't see any
errors. How is that possible? What can we do to debug this?
prometheus, version 2.40.7 (branch: HEAD, revision: 
ab239ac5d43f6c1068f0d05283a0544576aaecf8) build user: root@afba4a8bd7cc 
build date: 20221214-08:49:43 go version: go1.19.4 platform: linux/amd64

prometheus config file
# This file is managed by ansible. Please don't edit it by hand or your
# changes would be overwritten.
#
# http://prometheus.io/docs/operating/configuration/

global:
  evaluation_interval: 30s
  scrape_interval: 30s
  scrape_timeout: 15s

  external_labels:
    null

rule_files:
  - /etc/prometheus/rules/*.rules

scrape_configs:
  - job_name: 'k8s_pods'
    scrape_interval: 5m
    scrape_timeout: 1m
    kubernetes_sd_configs:
      - role: pod
        api_server: https://x.x.x.x:6443
        tls_config:
          insecure_skip_verify: true
        bearer_token_file: "/etc/prometheus/kubernetes_bearer_token"
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: (.+):(?:\d+);(\d+)
        replacement: ${1}:${2}
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
      - source_labels: [__meta_kubernetes_pod_node_name]
        action: replace
        target_label: kubernetes_pod_node_name



[prometheus-users] Re: usage of multiple mysql db instances with openshift

2023-09-04 Thread Brian Candler
On Monday, 4 September 2023 at 10:15:06 UTC+1 sneha wrote:

> I tried adding multiple DB instances in the .my.cnf file with mysql exporter
> version 0.15.0, but it only scrapes one DB instance.
> Is there any sample for using multiple DB instances with a single exporter?


Yes, there is an example Prometheus config at
https://github.com/prometheus/mysqld_exporter/#multi-target-support
(see under "On the prometheus side ... ")

It sets the `target` parameter for each scrape to the DB of interest, 
whilst setting __address__ to point to the exporter itself.

Before doing this, make sure you're able to scrape the exporter directly 
using curl, i.e.
curl 'http://localhost:9104/probe?target=server1:3306'
curl 'http://localhost:9104/probe?target=server2:3306'
should give you metrics for the two servers.

> And how do I distinguish which instance the scraped data is from?


In a label.  The example config that I linked above copies "__param_target" 
to "instance", so the "instance" label tells you which DB target was being 
scraped.
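
For reference, a sketch of what such a scrape job looks like, adapted from the README section linked above. It assumes the exporter is listening on localhost:9104 and that the databases are server1:3306 and server2:3306 as in the curl test; the job name is arbitrary.

```yaml
scrape_configs:
  - job_name: mysql
    metrics_path: /probe
    static_configs:
      - targets:
          # The MySQL servers to monitor (same as in the curl test)
          - server1:3306
          - server2:3306
    relabel_configs:
      # Pass each listed target to the exporter as the ?target= parameter
      - source_labels: [__address__]
        target_label: __param_target
      # Record which DB was scraped in the "instance" label
      - source_labels: [__param_target]
        target_label: instance
      # Send the actual HTTP request to the exporter itself
      - target_label: __address__
        replacement: localhost:9104
```

With this in place, a query such as mysql_up should return one series per database, distinguished by the "instance" label.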



[prometheus-users] usage of multiple mysql db instances with openshift

2023-09-04 Thread sneha
hi,

I tried adding multiple DB instances in the .my.cnf file with mysql exporter
version 0.15.0, but it only scrapes one DB instance.
Is there any sample for using multiple DB instances with a single exporter,
and how do I distinguish which instance the scraped data is from?

thanks,
sneha



[prometheus-users] Re: Sharing selected data between 2 Prometheis

2023-09-04 Thread Brian Candler
On Sunday, 3 September 2023 at 16:59:22 UTC+1 Brian Candler wrote:

> I don't know if Prometheus itself implements the remote read protocol as a
> storage endpoint (I've never come across it).


To answer my own question, it does: 
see https://prometheus.io/docs/prometheus/latest/querying/remote_read_api/

However "*This is not currently considered part of the stable API and is 
subject to change even between non-major version releases of Prometheus.*"  
(although it has said this since May 2020)
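
A minimal sketch of what the reading side might look like, assuming the other Prometheus is reachable at the made-up hostname below; /api/v1/read is the path on which Prometheus serves remote read.

```yaml
remote_read:
  # Hypothetical hostname for the Prometheus server that holds the data
  - url: http://prometheus-a.example.com:9090/api/v1/read
    # Also ask the remote for time ranges that the local TSDB would
    # otherwise assume it already covers completely (default is false)
    read_recent: true
```

Given the stability caveat quoted above, this is probably best tried on a test instance first.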
