Re: [prometheus-users] Extracting long queries from multiple histograms

2022-04-21 Thread Victor Sudakov
Julius Volz wrote:

[dd]
> >
> > The query `app1_response_duration_bucket{le="0.75"}` will return a
> > list of endpoints which have responded faster than 0.75s.
> >
> 
> This is not quite correct - this query gives you the le="0.75" bucket
> counter for *all* endpoints, 

OK, I stand corrected.

> and the value of each bucket counter tells you
> how many requests that endpoint has handled that completed within 0.75s
> since the exposing process started tracking things.

What if I want to see how many requests each endpoint has handled that
DID NOT complete within 0.75s since the exposing process started
tracking things?
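
(My naive attempt at expressing this in PromQL, assuming the matching _count
series is exported alongside the buckets, would be something like

app1_response_duration_count
  - ignoring(le) app1_response_duration_bucket{le="0.75"}

i.e. the total number of requests minus those that completed within 0.75s.
Does that look right?)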
> 
> 
> > How do I invert the "le" and find the endpoints slower than "le"?
> >
> 
> Hmm, histograms are usually used to tell you about the *distribution* of
> request latencies to a given endpoint (or other label combination). So it's
> unclear what you mean by an endpoint being slower than some "le" value.

Please see above.

> Do you want to find out whether some endpoint has handled any requests *at
> all* that took longer than some duration? Or only if that happened in the
> last X amount of time? 

Yes, I think I can put it like this: I would like to be informed if any
endpoint has become "slow"; the exact details may vary.


> Or only if a certain percentage of requests were too
> slow?
> 
> One thing people frequently do is to calculate percentiles / quantiles from
> a histogram, for example:
> 
> histogram_quantile(0.9, rate(app1_response_duration_bucket[5m]))
> 
> ...would tell you the approximated 90th percentile latency in seconds as
> averaged over a moving 5-minute window for a given label combination, which
> you can then combine with a filter operator to find slow endpoints (e.g.
> "... > 10" would give you those endpoints that have a 90th percentile
> latency above 10s).

I've tried to graph "histogram_quantile(0.9, rate(app1_response_duration_bucket[5m])) > 3",
but the result is very hard to interpret visually; it makes almost no sense.

It's slightly more understandable as a table/list.

-- 
Victor Sudakov VAS4-RIPE
http://vas.tomsk.ru/
2:5005/49@fidonet



[prometheus-users] Re: need create custom metrics

2022-04-21 Thread Brian Candler
On Thursday, 21 April 2022 at 19:40:07 UTC+1 deeshu...@gmail.com wrote:

> Thanks a lot for your valuable inputs and suggestions.
> *query 1:*
> I created an exporter in Python with the prometheus-client library. I want to
> run it as a Pod, so I need to build a Docker image (python + prometheus-client).
> The image size is going up to 130+ MB, and the CPU utilization is also high:
> for only 4 to 6 custom metrics it consumes approximately 1 core.
>
> Do you have any idea how to minimize the CPU utilization?
>

In principle it should use zero cores while it's idle.  Normally an 
exporter only does work when it's being scraped (i.e. handling an incoming 
http request).  If that's not the case, then it's doing whatever you told 
it to do.  Maybe you have an infinite loop or something in your code?

If you copied their sample code:

if __name__ == '__main__':
    # Start up the server to expose the metrics.
    start_http_server(8000)
    # Generate some requests.
    while True:
        process_request(random.random())

then clearly you'll be using a whole CPU core as this loop spins as fast as 
it can.  This is not meant to be how a real exporter works.  Rather, when 
your application does some other work (e.g. processing an incoming HTTP 
request) it can also increment counters or whatever.
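
As a rough sketch of that pattern (the metric name and the "work" function
here are made up, purely for illustration): the prometheus_client scrape
handler runs in a background thread, and the main loop only touches the
counter when it actually does some work:

import random
import time

from prometheus_client import Counter, start_http_server

# Hypothetical metric, just for illustration.
JOBS_PROCESSED = Counter('myapp_jobs_processed_total',
                         'Jobs processed by this application')

def do_real_work():
    # Stand-in for whatever the application actually does; in a real
    # application this would block until there is something to process.
    time.sleep(random.random())

if __name__ == '__main__':
    # start_http_server() serves /metrics from a background thread, so
    # scrapes are handled without the main loop doing anything extra.
    start_http_server(8000)
    while True:
        do_real_work()
        JOBS_PROCESSED.inc()  # instrument the work as it happens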

 

> *query 2:*
> When I use this lib, I can still see some unwanted metrics being exposed
> other than my custom metrics. Do you have any idea how I can remove them
> from my metrics list?
>
>
Can you show the metrics?



[prometheus-users] Re: need create custom metrics

2022-04-21 Thread Deekshith V
Thanks a lot for your valuable inputs and suggestions.
*query 1:*
I created an exporter in Python with the prometheus-client library. I want to
run it as a Pod, so I need to build a Docker image (python + prometheus-client).
The image size is going up to 130+ MB, and the CPU utilization is also high:
for only 4 to 6 custom metrics it consumes approximately 1 core.

Do you have any idea how to minimize the CPU utilization?

*query 2:*
When I use this lib, I can still see some unwanted metrics being exposed
other than my custom metrics. Do you have any idea how I can remove them
from my metrics list?

Once again, thanks a lot, @Brian Candler


On Thursday, April 21, 2022 at 10:48:03 PM UTC+5:30 Brian Candler wrote:

> See: https://prometheus.io/docs/instrumenting/clientlibs/
>
> Choose your programming language, write your exporter using the prometheus 
> client library for that language. There are various tutorials, e.g.
> https://prometheus.io/docs/guides/go-application/
>
> At the end of the day, an exporter is just an HTTP server that returns a
> response body in the prometheus metrics exposition format. The
> client libraries just make things a little easier, e.g. maintaining
> counters for you.
>
> Some alternative approaches you could also consider:
> - use node_exporter's textfile collector. Then you just need to write 
> metrics to a file, and node_exporter will pick them up automatically.
> - use exporter_exporter, which is able to exec a script
> - use one of the other generic exporters like statsd_exporter or 
> pushgateway, and write your metrics to that (where they will persist, 
> waiting for prometheus to scrape them)
>
> On Thursday, 21 April 2022 at 16:36:30 UTC+1 deeshu...@gmail.com wrote:
>
>> How do I write a custom exporter for a specific application? We need to
>> create custom metrics to fulfil our application-level metrics.
>>
>>
>>



Re: [prometheus-users] Extracting long queries from multiple histograms

2022-04-21 Thread Julius Volz
On Wed, Apr 20, 2022 at 10:25 PM Victor Sudakov  wrote:

> Victor Sudakov wrote:
> >
> > There is a web app which exports its metrics as multiple histograms,
> > one histogram per Web endpoint. So each set of histogram data is also
> > labelled by the {endpoint} label. There are about 50 endpoints so
> > about 50 histograms.
> >
> > I would like to detect and graph slow endpoints, that is I would like
> > to know the value of {endpoint} when its {le} is over 1s or something
> > like that.
> >
> > Can you please help with a relevant PromQL query and an idea how to
> > represent it in Grafana?
> >
> > I don't actually want 50 heatmaps, there must be a clever way to make
> > an overview of all the slow endpoints, or all the endpoints with a
> > particular status code etc.
>
> An example. The PromQL query
> `app1_response_duration_bucket{external_endpoint="http://YY/XX",status_code="200",method="GET"}`
> produces a histogram.
>
> The PromQL query
> `app1_response_duration_bucket{external_endpoint="http://YY/XX",status_code="200",method="POST"}`
> produces another histogram.
>
> The query `app1_response_duration_bucket{le="0.75"}` will return a
> list of endpoints which have responded faster than 0.75s.
>

This is not quite correct - this query gives you the le="0.75" bucket
counter for *all* endpoints, and the value of each bucket counter tells you
how many requests that endpoint has handled that completed within 0.75s
since the exposing process started tracking things.


> How do I invert the "le" and find the endpoints slower than "le"?
>

Hmm, histograms are usually used to tell you about the *distribution* of
request latencies to a given endpoint (or other label combination). So it's
unclear what you mean by an endpoint being slower than some "le" value.
Do you want to find out whether some endpoint has handled any requests *at
all* that took longer than some duration? Or only if that happened in the
last X amount of time? Or only if a certain percentage of requests were too
slow?

One thing people frequently do is to calculate percentiles / quantiles from
a histogram, for example:

histogram_quantile(0.9, rate(app1_response_duration_bucket[5m]))

...would tell you the approximated 90th percentile latency in seconds as
averaged over a moving 5-minute window for a given label combination, which
you can then combine with a filter operator to find slow endpoints (e.g.
"... > 10" would give you those endpoints that have a 90th percentile
latency above 10s).
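
For example, a sketch of that filter using the label names from earlier in
this thread (the sum just aggregates away status_code and method so that you
get one series per endpoint):

histogram_quantile(0.9,
  sum by (le, external_endpoint) (rate(app1_response_duration_bucket[5m]))
) > 10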

See also https://prometheus.io/docs/practices/histograms/ for more details
on using histograms.

Regards,
Julius

-- 
Julius Volz
PromLabs - promlabs.com



[prometheus-users] Re: need create custom metrics

2022-04-21 Thread Brian Candler
See: https://prometheus.io/docs/instrumenting/clientlibs/

Choose your programming language, write your exporter using the prometheus 
client library for that language. There are various tutorials, e.g.
https://prometheus.io/docs/guides/go-application/

At the end of the day, an exporter is just an HTTP server that returns a
response body in the prometheus metrics exposition format. The client
libraries just make things a little easier, e.g. maintaining counters
for you.

Some alternative approaches you could also consider:
- use node_exporter's textfile collector. Then you just need to write
metrics to a file, and node_exporter will pick them up automatically
(see the small example after this list).
- use exporter_exporter, which is able to exec a script
- use one of the other generic exporters like statsd_exporter or
pushgateway, and write your metrics to that (where they will persist,
waiting for prometheus to scrape them)
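
As a small illustration of the textfile collector option (the metric name
here is made up): the file you drop into the collector's directory just
contains exposition-format lines, e.g.

# HELP myapp_jobs_processed_total Jobs processed by my application
# TYPE myapp_jobs_processed_total counter
myapp_jobs_processed_total 42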

On Thursday, 21 April 2022 at 16:36:30 UTC+1 deeshu...@gmail.com wrote:

> How do I write a custom exporter for a specific application? We need to
> create custom metrics to fulfil our application-level metrics.
>
>
>



[prometheus-users] Re: Run two node exporters on same server

2022-04-21 Thread Brian Candler
Could mean many things:
- node_exporter isn't running on the target host
- node_exporter is listening on a different port than the one you're trying 
to connect to
- it is listening on the wrong IP address or interface (e.g. if bound to 
127.0.0.1 then it won't accept connections from outside)

There are also some types of firewall which can block traffic in this way, 
making it look like connection refused.

On Thursday, 21 April 2022 at 16:03:23 UTC+1 chembakay...@gmail.com wrote:

> thanks for your reply. I think we fixed some firewall issues and now 
> working fine for most servers. But still we are facing new error like 
>
> Get "http://some_ip:port_number/metrics": dial tcp some_ip:port_number: 
> connect: connection refused
>
> what could be the reason for this error?
>
> thanks 
> Bharath
>
> On Thursday, 21 April 2022 at 15:19:49 UTC+5:30 Brian Candler wrote:
>
>> "Context deadline exceeded" simply means "timeout waiting to connect or 
>> receive data"
>>
>> It sounds to me like you have a network connectivity problem between the 
>> client (i.e. prometheus) and wherever the binary node exporter was 
>> installed.  Talk to a local network administrator or system administrator 
>> to help you find where the problem is.
>>
>> The best way to reproduce this would be to run the same curl command on 
>> the prometheus server itself.
>>
>> If prometheus is running inside a container, then run the curl command 
>> inside that container.  If prometheus is running inside a kubernetes pod, 
>> then see here - you may need to add an ephemeral debugging container to
>> your pod to be able to use 'curl', if your container image doesn't
>> already have it.
>>
>> On Thursday, 21 April 2022 at 10:00:25 UTC+1 chembakay...@gmail.com 
>> wrote:
>>
>>>
>>> If I do curl on the particular server where I installed the binary node
>>> exporter I am able to see metrics, but in the browser, Grafana, and the
>>> prometheus UI I am not able to see metrics. It is showing CONTEXT DEADLINE
>>> EXCEEDED.
>>>
>>> thanks 
>>> Bharath
>>> On Thursday, 21 April 2022 at 12:18:23 UTC+5:30 Brian Candler wrote:
>>>
 > we are getting error like context deadline exceeded

 You haven't shown the actual error, nor where you saw it.

 The most likely explanation I can see is simply that prometheus cannot 
 communicate with node_exporter - for example, you've misconfigured the 
 target or there is some sort of firewalling in between.

 To prove this, login to the prometheus server (or container where 
 prometheus is running), and do:

 curl -h 'http://x.x.x.x:/metrics'

 where x.x.x.x: is the IP address and port that you've configured as 
 the target to scrape.

 On Thursday, 21 April 2022 at 07:00:24 UTC+1 chembakay...@gmail.com 
 wrote:

> Hiii
>
> Thanks for your reply. we were using two different ports for two node 
> exporters. Actually one node exporter is running in container/pod and 
> another node exporter is running as a binary file. we are not getting any 
> issue with container/pod one. But when we are running with binary one we 
> are getting error like context deadline exceeded.
>
> our prometheus config file is as follows:
>
> scrape_interval: 2m
> evaluation_interval: 15s
> scrape_timeout: 2m
>
> could anyone please help me out of this?
>
> regards,
> Bharath.
> On Wednesday, 20 April 2022 at 17:50:01 UTC+5:30 Brian Candler wrote:
>
>> You need to be clearer about what you're doing and what errors you 
>> see.
>>
>> Yes, you could run two node_exporters on the same server, bound to 
>> different ports.  However this is normally completely pointless, since 
>> they're both monitoring the same server, and exporting the same data.
>>
>> It is not recommended that you run node_exporter inside a container 
>> at all.  But if you do, see the instructions here:
>> https://github.com/prometheus/node_exporter#docker
>>
>> On Wednesday, 20 April 2022 at 11:59:27 UTC+1 chembakay...@gmail.com 
>> wrote:
>>
>>> Hii all,
>>>
>>> I want to run two node exporters on the same server: one with
>>> kubernetes and another one as a binary file. Each node exporter has a
>>> different port number. I am getting an error like "time out" or "context
>>> deadline exceeded". Will it be possible? If so, could you please tell me
>>> the solution?
>>>
>>> thanks 
>>> Bharath 
>>>
>>


[prometheus-users] need create custom metrics

2022-04-21 Thread Deekshith V
How do I write a custom exporter for a specific application? We need to
create custom metrics to fulfil our application-level metrics.




[prometheus-users] Wants to pull metrics from running node(eclair)

2022-04-21 Thread 17_goutam verma
Hello Everyone,

I want to export metrics from an eclair node (a Lightning Network
implementation), as I am currently trying to build a monitoring tool for it.
I am not quite sure how to pull or export metrics from the eclair node
running on a Linux system.

Please share your thoughts.
Thank you



Re: [prometheus-users] Re: Run two node exporters on same server

2022-04-21 Thread Stuart Clark

On 2022-04-21 16:03, BHARATH KUMAR wrote:

thanks for your reply. I think we fixed some firewall issues and now
working fine for most servers. But still we are facing new error like

Get "http://some_ip:port_number/metrics": dial tcp
some_ip:port_number: connect: connection refused

what could be the reason for this error?



That generally means that the connection is passing through the 
firewalls ok but the end server is then rejecting it. Usually because 
the port number is wrong or the service attached to that port isn't 
running. For containers it could mean the port hasn't been exposed to 
the outside host.


--
Stuart Clark



[prometheus-users] Re: Run two node exporters on same server

2022-04-21 Thread BHARATH KUMAR
thanks for your reply. I think we fixed some firewall issues and now 
working fine for most servers. But still we are facing new error like 

Get "http://some_ip:port_number/metrics": dial tcp some_ip:port_number: 
connect: connection refused

what could be the reason for this error?

thanks 
Bharath

On Thursday, 21 April 2022 at 15:19:49 UTC+5:30 Brian Candler wrote:

> "Context deadline exceeded" simply means "timeout waiting to connect or 
> receive data"
>
> It sounds to me like you have a network connectivity problem between the 
> client (i.e. prometheus) and wherever the binary node exporter was 
> installed.  Talk to a local network administrator or system administrator 
> to help you find where the problem is.
>
> The best way to reproduce this would be to run the same curl command on 
> the prometheus server itself.
>
> If prometheus is running inside a container, then run the curl command 
> inside that container.  If prometheus is running inside a kubernetes pod, 
> then see here - you may need to add an ephemeral debugging container to
> your pod to be able to use 'curl', if your container image doesn't
> already have it.
>
> On Thursday, 21 April 2022 at 10:00:25 UTC+1 chembakay...@gmail.com wrote:
>
>>
>> If I do curl on the particular server where I installed the binary node
>> exporter I am able to see metrics, but in the browser, Grafana, and the
>> prometheus UI I am not able to see metrics. It is showing CONTEXT DEADLINE
>> EXCEEDED.
>>
>> thanks 
>> Bharath
>> On Thursday, 21 April 2022 at 12:18:23 UTC+5:30 Brian Candler wrote:
>>
>>> > we are getting error like context deadline exceeded
>>>
>>> You haven't shown the actual error, nor where you saw it.
>>>
>>> The most likely explanation I can see is simply that prometheus cannot 
>>> communicate with node_exporter - for example, you've misconfigured the 
>>> target or there is some sort of firewalling in between.
>>>
>>> To prove this, login to the prometheus server (or container where 
>>> prometheus is running), and do:
>>>
>>> curl -h 'http://x.x.x.x:/metrics'
>>>
>>> where x.x.x.x: is the IP address and port that you've configured as 
>>> the target to scrape.
>>>
>>> On Thursday, 21 April 2022 at 07:00:24 UTC+1 chembakay...@gmail.com 
>>> wrote:
>>>
 Hiii

 Thanks for your reply. we were using two different ports for two node 
 exporters. Actually one node exporter is running in container/pod and 
 another node exporter is running as a binary file. we are not getting any 
 issue with container/pod one. But when we are running with binary one we 
 are getting error like context deadline exceeded.

 our prometheus config file is as follows:

 scrape_interval: 2m
 evaluation_interval: 15s
 scrape_timeout: 2m

 could anyone please help me out of this?

 regards,
 Bharath.
 On Wednesday, 20 April 2022 at 17:50:01 UTC+5:30 Brian Candler wrote:

> You need to be clearer about what you're doing and what errors you see.
>
> Yes, you could run two node_exporters on the same server, bound to 
> different ports.  However this is normally completely pointless, since 
> they're both monitoring the same server, and exporting the same data.
>
> It is not recommended that you run node_exporter inside a container at 
> all.  But if you do, see the instructions here:
> https://github.com/prometheus/node_exporter#docker
>
> On Wednesday, 20 April 2022 at 11:59:27 UTC+1 chembakay...@gmail.com 
> wrote:
>
>> Hii all,
>>
>> I want to run two node exporters on the same server: one with
>> kubernetes and another one as a binary file. Each node exporter has a
>> different port number. I am getting an error like "time out" or "context
>> deadline exceeded". Will it be possible? If so, could you please tell me
>> the solution?
>>
>> thanks 
>> Bharath 
>>
>



[prometheus-users] Re: Run two node exporters on same server

2022-04-21 Thread Brian Candler
"Context deadline exceeded" simply means "timeout waiting to connect or 
receive data"

It sounds to me like you have a network connectivity problem between the 
client (i.e. prometheus) and wherever the binary node exporter was 
installed.  Talk to a local network administrator or system administrator 
to help you find where the problem is.

The best way to reproduce this would be to run the same curl command on the 
prometheus server itself.

If prometheus is running inside a container, then run the curl command 
inside that container.  If prometheus is running inside a kubernetes pod, 
then see here - you may need to add an ephemeral debugging container to
your pod to be able to use 'curl', if your container image doesn't
already have it.

On Thursday, 21 April 2022 at 10:00:25 UTC+1 chembakay...@gmail.com wrote:

>
> If I do curl on the particular server where I installed the binary node
> exporter I am able to see metrics, but in the browser, Grafana, and the
> prometheus UI I am not able to see metrics. It is showing CONTEXT DEADLINE
> EXCEEDED.
>
> thanks 
> Bharath
> On Thursday, 21 April 2022 at 12:18:23 UTC+5:30 Brian Candler wrote:
>
>> > we are getting error like context deadline exceeded
>>
>> You haven't shown the actual error, nor where you saw it.
>>
>> The most likely explanation I can see is simply that prometheus cannot 
>> communicate with node_exporter - for example, you've misconfigured the 
>> target or there is some sort of firewalling in between.
>>
>> To prove this, login to the prometheus server (or container where 
>> prometheus is running), and do:
>>
>> curl -h 'http://x.x.x.x:/metrics'
>>
>> where x.x.x.x: is the IP address and port that you've configured as 
>> the target to scrape.
>>
>> On Thursday, 21 April 2022 at 07:00:24 UTC+1 chembakay...@gmail.com 
>> wrote:
>>
>>> Hiii
>>>
>>> Thanks for your reply. we were using two different ports for two node 
>>> exporters. Actually one node exporter is running in container/pod and 
>>> another node exporter is running as a binary file. we are not getting any 
>>> issue with container/pod one. But when we are running with binary one we 
>>> are getting error like context deadline exceeded.
>>>
>>> our prometheus config file is as follows:
>>>
>>> scrape_interval: 2m
>>> evaluation_interval: 15s
>>> scrape_timeout: 2m
>>>
>>> could anyone please help me out of this?
>>>
>>> regards,
>>> Bharath.
>>> On Wednesday, 20 April 2022 at 17:50:01 UTC+5:30 Brian Candler wrote:
>>>
 You need to be clearer about what you're doing and what errors you see.

 Yes, you could run two node_exporters on the same server, bound to 
 different ports.  However this is normally completely pointless, since 
 they're both monitoring the same server, and exporting the same data.

 It is not recommended that you run node_exporter inside a container at 
 all.  But if you do, see the instructions here:
 https://github.com/prometheus/node_exporter#docker

 On Wednesday, 20 April 2022 at 11:59:27 UTC+1 chembakay...@gmail.com 
 wrote:

> Hii all,
>
> I want to run two node exporters on the same server: one with
> kubernetes and another one as a binary file. Each node exporter has a
> different port number. I am getting an error like "time out" or "context
> deadline exceeded". Will it be possible? If so, could you please tell me
> the solution?
>
> thanks 
> Bharath 
>




[prometheus-users] Re: black box exporter monitoring SSH and PING

2022-04-21 Thread Brian Candler
On Thursday, 21 April 2022 at 09:51:18 UTC+1 Brian Candler wrote:

> If I understand correctly: Prometheus doesn't explicitly "resolve" an 
> alert, rather it just stops sending that alert.
>

Sorry, I was wrong. To resolve the alert, prometheus posts an alert with 
endsAt equal to the time when the alert went away.  (Tested with tcpdump)



[prometheus-users] Re: Run two node exporters on same server

2022-04-21 Thread BHARATH KUMAR

If I do curl on the particular server where I installed the binary node
exporter I am able to see metrics, but in the browser, Grafana, and the
prometheus UI I am not able to see metrics. It is showing CONTEXT DEADLINE
EXCEEDED.

thanks 
Bharath
On Thursday, 21 April 2022 at 12:18:23 UTC+5:30 Brian Candler wrote:

> > we are getting error like context deadline exceeded
>
> You haven't shown the actual error, nor where you saw it.
>
> The most likely explanation I can see is simply that prometheus cannot 
> communicate with node_exporter - for example, you've misconfigured the 
> target or there is some sort of firewalling in between.
>
> To prove this, login to the prometheus server (or container where 
> prometheus is running), and do:
>
> curl -h 'http://x.x.x.x:/metrics'
>
> where x.x.x.x: is the IP address and port that you've configured as 
> the target to scrape.
>
> On Thursday, 21 April 2022 at 07:00:24 UTC+1 chembakay...@gmail.com wrote:
>
>> Hiii
>>
>> Thanks for your reply. we were using two different ports for two node 
>> exporters. Actually one node exporter is running in container/pod and 
>> another node exporter is running as a binary file. we are not getting any 
>> issue with container/pod one. But when we are running with binary one we 
>> are getting error like context deadline exceeded.
>>
>> our prometheus config file is as follows:
>>
>> scrape_interval: 2m
>> evaluation_interval: 15s
>> scrape_timeout: 2m
>>
>> could anyone please help me out of this?
>>
>> regards,
>> Bharath.
>> On Wednesday, 20 April 2022 at 17:50:01 UTC+5:30 Brian Candler wrote:
>>
>>> You need to be clearer about what you're doing and what errors you see.
>>>
>>> Yes, you could run two node_exporters on the same server, bound to 
>>> different ports.  However this is normally completely pointless, since 
>>> they're both monitoring the same server, and exporting the same data.
>>>
>>> It is not recommended that you run node_exporter inside a container at 
>>> all.  But if you do, see the instructions here:
>>> https://github.com/prometheus/node_exporter#docker
>>>
>>> On Wednesday, 20 April 2022 at 11:59:27 UTC+1 chembakay...@gmail.com 
>>> wrote:
>>>
 Hii all,

 I want to run two node exporters on the same server: one with kubernetes
 and another one as a binary file. Each node exporter has a different port
 number. I am getting an error like "time out" or "context deadline exceeded".
 Will it be possible? If so, could you please tell me the solution?

 thanks 
 Bharath 

>>>



[prometheus-users] Re: black box exporter monitoring SSH and PING

2022-04-21 Thread Brian Candler
On Thursday, 21 April 2022 at 09:22:32 UTC+1 ninag...@gmail.com wrote:

> *blackbox exporter config:*
> icmp:
>   prober: icmp
>   icmp:
>     preferred_ip_protocol: "ip4"
> tcp:
>   prober: tcp
>   timeout: 5s
>   tcp:
>     preferred_ip_protocol: "ip4"
>
> *Prometheus scrape config:*
>
... 

>   - job_name: SSH
>     metrics_path: /probe
>     params:
>       module: [ssh_banner]
>     file_sd_configs:
>     - files:
>       - '/etc/prometheus/targets/'
>     relabel_configs:
>       - source_labels: [__address__]
>         target_label: __param_target
>         regex: '([^:]+)(:[0-9]+)?'
>         replacement: '${1}:22'
>       - source_labels: [__param_target]
>         target_label: instance
>       - target_label: __address__
>         replacement: prometheus-blackbox-exporter:9115
>

In your scrape job you are setting parameter module=ssh_banner, but you 
have not defined a module called "ssh_banner" in your blackbox exporter 
config.

Therefore it will always result in a failure.  Test like this:
curl -g 'http://prometheus-blackbox-exporter:9115/probe?module=ssh_banner&target=blah.example.com&debug=true'

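For reference, an "ssh_banner" module would look something like this (a
sketch based on the example configuration shipped with blackbox_exporter):

ssh_banner:
  prober: tcp
  timeout: 5s
  tcp:
    preferred_ip_protocol: "ip4"
    query_response:
    - expect: "^SSH-2.0-"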
 

> *Alert rules:*
> - alert: TargetDown
>   expr: probe_success == 0
>   for: 5s
>   labels:
>     severity: critical
>   annotations:
>     description: Service {{ $labels.instance }} is unreachable.
>     value: DOWN ({{ $value }})
>     summary: "Target {{ $labels.instance }} is down."
>
>
You can leave out "for: 5s" since you're only scraping and evaluating rules 
every 60s.

If you don't want an immediate alert in the case of a single probe failure 
(like a single dropped packet), then set "for: 1m" or "for: 2m" as 
required.  This will then only alert if the alert is continuously present 
for that duration.
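
i.e. something along these lines (the same rule as above, just with a longer
hold period):

- alert: TargetDown
  expr: probe_success == 0
  for: 2m
  labels:
    severity: critical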

 

> *Alert manager config:*
> ...
> - name: email-me
>   email_configs:
>   - to: alert
>     send_resolved: true
>
>
In your original post you said "but black box exporter detect the recover 
behavior after about 5mins". Are you talking about when you receive the 
"send_resolved" message from alertmanager?

There are various delays which can occur between prometheus making an alert 
and alertmanager sending it, and also with prometheus withdrawing an alert 
and alertmanager sending a resolved message.

If I understand correctly: Prometheus doesn't explicitly "resolve" an 
alert, rather it just stops sending that alert.  The alert comes with an 
"endsAt" time, which is explained here:
https://github.com/prometheus/prometheus/issues/5277
"3x 

 the 
greater of the evaluation_interval or resend-delay values"
Since you have an evaluation_interval of 60s, I believe this means there 
will be at least a 3 minute delay between an alert ceasing to fire, and the 
resolved message being sent.

See also:
https://pracucci.com/prometheus-understanding-the-delays-on-alerting.html
https://prometheus.io/docs/alerting/latest/clients/
https://prometheus.io/docs/alerting/latest/configuration/#configuration-file

# ResolveTimeout is the default value used by alertmanager if the alert does
# not include EndsAt, after this time passes it can declare the alert as
# resolved if it has not been updated.
# This has no impact on alerts from Prometheus, as they always include EndsAt.
[ resolve_timeout: <duration> | default = 5m ]

Really I think you need to separate your problem into two parts:
1. Making sure that blackbox_exporter is probing ICMP and SSH 
successfully.  Check "probe_success" is going to 0 or 1 at the correct 
times.  View the PromQL history of the probe_success metric to confirm 
this.  Ignore alerts.
2. Then look at your alerting configuration, as to exactly when it sends 
messages.



[prometheus-users] Re: black box exporter monitoring SSH and PING

2022-04-21 Thread nina guo
*blackbox exporter config:*
icmp:
  prober: icmp
  icmp:
    preferred_ip_protocol: "ip4"
tcp:
  prober: tcp
  timeout: 5s
  tcp:
    preferred_ip_protocol: "ip4"

*Prometheus scrape config:*
global:
  scrape_interval: 60s
  evaluation_interval: 60s
- job_name: PING
  metrics_path: /probe
  params:
    module: [icmp]
  file_sd_configs:
  - files:
    - '/etc/prometheus/targets/'
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
      regex: '([^:]+)(:[0-9]+)?'
      replacement: '${1}'
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: prometheus-blackbox-exporter:9115
- job_name: SSH
  metrics_path: /probe
  params:
    module: [ssh_banner]
  file_sd_configs:
  - files:
    - '/etc/prometheus/targets/'
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
      regex: '([^:]+)(:[0-9]+)?'
      replacement: '${1}:22'
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: prometheus-blackbox-exporter:9115

*Alert rules:*
- alert: TargetDown
  expr: probe_success == 0
  for: 5s
  labels:
    severity: critical
  annotations:
    description: Service {{ $labels.instance }} is unreachable.
    value: DOWN ({{ $value }})
    summary: "Target {{ $labels.instance }} is down."

*Alert manager config:*
config.yml: |-
  global:
    resolve_timeout: 5m
    smtp_smarthost: mail
    smtp_from: alertmanager
    smtp_require_tls: false
  route:
    receiver: email-me
    group_by: [instance, alertname, job]
    group_wait: 45s
    group_interval: 5m
    repeat_interval: 24h
  receivers:
  - name: email-me
    email_configs:
    - to: alert
      send_resolved: true

On Wednesday, April 20, 2022 at 8:29:10 PM UTC+8 Brian Candler wrote:

> blackbox_exporter monitoring TCP ports (e.g. for SSH) and ICMP (ping) 
> works fine.
>
> "but black box exporter detect the recover behavior after about 5mins"
>
> Black box exporter only performs a single test when you scrape it.  It 
> does not by itself do any recovery detection.  The problem is therefore 
> most likely with your prometheus scrape config or your alertmanager config.
>
> If you're having a problem, you'll need to be more specific:
> * show your blackbox_exporter config, your prometheus scrape config which 
> scrapes it, your alerting rules, and your alertmanager config (if using 
> alertmanager)
> * describe more clearly the behaviour you're seeing, and what you expected 
> to see.  (For example, are you waiting for a "recovery" E-mail from 
> alertmanager?)
>
> "And after the IP table is recovered, the alert for Ping can be cleared 
> after about 20mins, but SSH is still there."
>
> Either SSH is working and reachable, or it is not.  You can check the 
> results of blackbox_exporter tests by hand using curl, and also get 
> additional debugging information, like this:
>
> curl -g 'http://127.0.0.1:9115/probe?module=xxx&target=yyy&debug=true'
>
> Here is an example:
>
> # curl -g 'http://localhost:9115/probe?module=icmp&target=1.2.3.4&debug=true'
> Logs for the probe:
> ts=2022-04-20T12:25:11.587855449Z caller=main.go:320 module=icmp 
> target=1.2.3.4 level=info msg="Beginning probe" probe=icmp timeout_seconds=3
> ts=2022-04-20T12:25:11.588014456Z caller=icmp.go:91 module=icmp 
> target=1.2.3.4 level=info msg="Resolving target address" ip_protocol=ip6
> ts=2022-04-20T12:25:11.588065658Z caller=icmp.go:91 module=icmp 
> target=1.2.3.4 level=info msg="Resolving target address" ip_protocol=ip4
> ts=2022-04-20T12:25:11.588098688Z caller=icmp.go:91 module=icmp 
> target=1.2.3.4 level=info msg="Resolved target address" ip=1.2.3.4
> ts=2022-04-20T12:25:11.588133368Z caller=main.go:130 module=icmp 
> target=1.2.3.4 level=info msg="Creating socket"
> ts=2022-04-20T12:25:11.588188673Z caller=main.go:130 module=icmp 
> target=1.2.3.4 level=debug msg="Unable to do unprivileged listen on socket, 
> will attempt privileged" err="socket: permission denied"
> ts=2022-04-20T12:25:11.58829848Z caller=main.go:130 module=icmp 
> target=1.2.3.4 level=info msg="Creating ICMP packet" seq=24581 id=190
> ts=2022-04-20T12:25:11.588348917Z caller=main.go:130 module=icmp 
> target=1.2.3.4 level=info msg="Writing out packet"
> ts=2022-04-20T12:25:11.588470176Z caller=main.go:130 module=icmp 
> target=1.2.3.4 level=info msg="Waiting for reply packets"
> ts=2022-04-20T12:25:14.588761946Z caller=main.go:130 module=icmp 
> target=1.2.3.4 level=debug msg="Cannot get TTL from the received packet. 
> 'probe_icmp_reply_hop_limit' will be 

[prometheus-users] Re: Run two node exporters on same server

2022-04-21 Thread Brian Candler
> we are getting error like context deadline exceeded

You haven't shown the actual error, nor where you saw it.

The most likely explanation I can see is simply that prometheus cannot 
communicate with node_exporter - for example, you've misconfigured the 
target or there is some sort of firewalling in between.

To prove this, login to the prometheus server (or container where 
prometheus is running), and do:

curl -h 'http://x.x.x.x:/metrics'

where x.x.x.x: is the IP address and port that you've configured as the 
target to scrape.

On Thursday, 21 April 2022 at 07:00:24 UTC+1 chembakay...@gmail.com wrote:

> Hiii
>
> Thanks for your reply. we were using two different ports for two node 
> exporters. Actually one node exporter is running in container/pod and 
> another node exporter is running as a binary file. we are not getting any 
> issue with container/pod one. But when we are running with binary one we 
> are getting error like context deadline exceeded.
>
> our prometheus config file is as follows:
>
> scrape_interval: 2m
> evaluation_interval: 15s
> scrape_timeout: 2m
>
> could anyone please help me out of this?
>
> regards,
> Bharath.
> On Wednesday, 20 April 2022 at 17:50:01 UTC+5:30 Brian Candler wrote:
>
>> You need to be clearer about what you're doing and what errors you see.
>>
>> Yes, you could run two node_exporters on the same server, bound to 
>> different ports.  However this is normally completely pointless, since 
>> they're both monitoring the same server, and exporting the same data.
>>
>> It is not recommended that you run node_exporter inside a container at 
>> all.  But if you do, see the instructions here:
>> https://github.com/prometheus/node_exporter#docker
>>
>> On Wednesday, 20 April 2022 at 11:59:27 UTC+1 chembakay...@gmail.com 
>> wrote:
>>
>>> Hii all,
>>>
>>> I want to run two node exporters on the same server: one with kubernetes
>>> and another one as a binary file. Each node exporter has a different port
>>> number. I am getting an error like "time out" or "context deadline exceeded".
>>> Will it be possible? If so, could you please tell me the solution?
>>>
>>> thanks 
>>> Bharath 
>>>
>>



[prometheus-users] Re: Run two node exporters on same server

2022-04-21 Thread BHARATH KUMAR
Hiii

Thanks for your reply. we were using two different ports for two node 
exporters. Actually one node exporter is running in container/pod and 
another node exporter is running as a binary file. we are not getting any 
issue with container/pod one. But when we are running with binary one we 
are getting error like context deadline exceeded.

our prometheus config file is as follows:

scrape_interval: 2m
evaluation_interval: 15s
scrape_timeout: 2m

could anyone please help me out of this?

regards,
Bharath.
On Wednesday, 20 April 2022 at 17:50:01 UTC+5:30 Brian Candler wrote:

> You need to be clearer about what you're doing and what errors you see.
>
> Yes, you could run two node_exporters on the same server, bound to 
> different ports.  However this is normally completely pointless, since 
> they're both monitoring the same server, and exporting the same data.
>
> It is not recommended that you run node_exporter inside a container at 
> all.  But if you do, see the instructions here:
> https://github.com/prometheus/node_exporter#docker
>
> On Wednesday, 20 April 2022 at 11:59:27 UTC+1 chembakay...@gmail.com 
> wrote:
>
>> Hii all,
>>
>> I want to run two node exporters on the same server: one with kubernetes
>> and another one as a binary file. Each node exporter has a different port
>> number. I am getting an error like "time out" or "context deadline exceeded".
>> Will it be possible? If so, could you please tell me the solution?
>>
>> thanks 
>> Bharath 
>>
>
