[prometheus-users] Re: up query

2022-08-09 Thread Brian Candler
Use the PromQL query browser (in the Prometheus web interface) to debug 
it.  I suggest you first look at the inner query:

up{instance=~"instance"}

and graph it, setting the "instance" regexp to match one or more instances 
of interest. What does it look like? Is it a mixture of 0's and 1's, or all 
0's, or all 1's, or is it absent entirely?  If it's absent entirely, then 
that's a different problem you need to investigate - your scrape job is 
completely broken.

If it's a mixture of 0's and 1's, then try this query:

max_over_time(up{instance=~"instance"}[2d])

It should show 1 for any instant where the server was up at any time over 
the previous 48 hours.  Does it not?

If it's all 0's for at least 48 hours, then

max_over_time(up{instance=~"instance"}[2d])

should show 0.

Once you've understood why your query wasn't working as you were expecting, 
then for partially reachable you can try a query like this:

avg_over_time(up{instance=~"instance"}[2d]) > 0 < 0.9

(setting thresholds as appropriate)
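
As a rough illustration of what the two range functions are deciding, the sketch below applies the same max/average logic to hand-written lists of 0/1 "up" samples in plain Python. The classify helper and the sample lists are hypothetical, and 0.9 is just the example threshold from the query above; this only mirrors the arithmetic, it is not how Prometheus itself evaluates a query.

```python
from statistics import mean

def classify(samples, partial_threshold=0.9):
    """Classify one instance from its 0/1 `up` samples over the window."""
    if not samples:
        # No series at all: the scrape job itself needs investigating.
        return "absent"
    if max(samples) == 0:
        # Same condition as max_over_time(up[2d]) == 0: never up in the window.
        return "down"
    if mean(samples) < partial_threshold:
        # Same condition as avg_over_time(up[2d]) > 0 < 0.9.
        return "partial"
    return "up"

# Hypothetical sample windows, one list per instance.
windows = {
    "server-a": [0, 0, 0, 0],
    "server-b": [1, 0, 1, 0],
    "server-c": [1, 1, 1, 1],
}

for name, samples in windows.items():
    print(name, classify(samples))
```

An instance that was down for the whole window never reaches the "partial" branch, which is why the avg_over_time query does not need a separate exclusion for totally unreachable servers.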

On Tuesday, 9 August 2022 at 13:23:33 UTC+1 chembakay...@gmail.com wrote:

> Hi all,
>
> *First Query :*
> I want to find the servers which have not been reachable for the last X 
> days. It should not be in a reachable state for the last X days. I tried 
> the following query, but it didn't work out.
>
> Query :  max_over_time(up{instance=~"instance"}[Xd]) == 0
>
> The above query returns servers that were unreachable for at least 1 
> minute, but I want only the servers that were unreachable for the whole of 
> the last X days.
>
> *Second Query :*
> I want to find the servers which were only partially reachable over the 
> last X days, excluding those that were totally unreachable for the whole X 
> days.
>
> Any leads?
>
> Thanks & regards,
> Bharath Kumar.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/223d9dfb-5ac2-4629-b36a-a305d029fc21n%40googlegroups.com.


[prometheus-users] up query

2022-08-09 Thread BHARATH KUMAR
Hi all,

*First Query :*
I want to find the servers which have not been reachable for the last X 
days. It should not be in a reachable state for the last X days. I tried 
the following query, but it didn't work out.

Query :  max_over_time(up{instance=~"instance"}[Xd]) == 0

The above query returns servers that were unreachable for at least 1 
minute, but I want only the servers that were unreachable for the whole of 
the last X days.

*Second Query :*
I want to find the servers which were only partially reachable over the 
last X days, excluding those that were totally unreachable for the whole X 
days.

Any leads?

Thanks & regards,
Bharath Kumar.

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/8b5580f7-1377-4601-aa7f-a02c3fe36a76n%40googlegroups.com.


[prometheus-users] Re: ssl cert monitoring with blackbox exporter

2022-08-09 Thread Brian Candler
Yes.

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/4ce27257-1783-4948-b25d-aad51c73eca7n%40googlegroups.com.


[prometheus-users] Re: ssl cert monitoring with blackbox exporter

2022-08-09 Thread nina guo
  smtp_banner:
    prober: tcp
    timeout: 20s
    tcp:
      preferred_ip_protocol: ip4
      query_response:
        - expect: "^220 ([^ ]+) ESMTP( .+)?$"
          send: "EHLO prober"
        - expect: "^250 "
          send: "QUIT\r"

  smtp_starttls:
    prober: tcp
    timeout: 20s
    tcp:
      tls_config:
        insecure_skip_verify: true
      query_response:
        - expect: "^220 ([^ ]+) ESMTP( .+)?$"
          send: "EHLO prober\r"
        - expect: "^250-STARTTLS"
        - expect: "^250 .*$"
          send: "STARTTLS\r"
        - expect: "^220"
          starttls: true
        - send: "EHLO prober\r"
        - expect: "^250 .*$"
          send: "QUIT\r"

Can I define the modules as above and then create 2 jobs, like this?

- job_name: Mail Server
  metrics_path: /probe
  params:
    module: [smtp_banner]
  file_sd_configs:
    - files:
        - '/etc/prometheus/mail'
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: prometheus-blackbox-exporter:9115

- job_name: Mail Server TLS
  metrics_path: /probe
  params:
    module: [smtp_starttls]
  file_sd_configs:
    - files:
        - '/etc/prometheus/mail'
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: prometheus-blackbox-exporter:9115

On Tuesday, August 9, 2022 at 5:13:17 PM UTC+8 Brian Candler wrote:

> Do you mean, you want one probe that tests TCP connection and the "SMTP" 
> banner only; and another job that tests further including STARTTLS and the 
> certificate?
>
> Then just make two blackbox tests, one for each of those cases.
>
> You want different scrape_interval for them?  Then just make two 
> prometheus scrape jobs, and put one test under the first job, and the other 
> test under the second job.
>
> You want to use different evaluation_interval? Then make two different 
> alerting rule groups, with different evaluation intervals, and put one 
> alerting rule under each.
>
> On Tuesday, 9 August 2022 at 09:25:18 UTC+1 ninag...@gmail.com wrote:
>
>> Thank you Brian. I know.
>>
>> I already use smtp_starttls to check the connection with mail server. I 
>> found smtp_starttls also exposed ssl cert related metrics.
>> But now I want to move the SSL cert check out of the job that checks 
>> SMTP, because I want to set a different scrape_interval and evaluation 
>> interval for the cert check.
>>
>> Any good suggestion?
>>
>> On Tuesday, August 9, 2022 at 4:17:34 PM UTC+8 Brian Candler wrote:
>>
>>> Sigh.  We've been through all this with you before in great detail.
>>> https://groups.google.com/g/prometheus-users/c/LZbihDncIig/m/cqAA-UtdAQAJ
>>>
>>> Port 587 is SMTP submission, and it does not perform TLS on connection. 
>>> Try "telnet dns 587" and you'll see it responds in plain text, which is 
>>> just what the error message told you: "tls: first record does not look like 
>>> a TLS handshake"
>>>
>>> To get it to do TLS, you need to send the "starttls" command before 
>>> starting the TLS negotiation, like this example 
>>> 
>>>  and 
>>> in the thread linked above.
>>>
>>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/7da8063f-70a9-47ac-a55c-d18044382bb3n%40googlegroups.com.


[prometheus-users] Re: ssl cert monitoring with blackbox exporter

2022-08-09 Thread Brian Candler
Do you mean, you want one probe that tests TCP connection and the "SMTP" 
banner only; and another job that tests further including STARTTLS and the 
certificate?

Then just make two blackbox tests, one for each of those cases.

You want different scrape_interval for them?  Then just make two prometheus 
scrape jobs, and put one test under the first job, and the other test under 
the second job.

You want to use different evaluation_interval? Then make two different 
alerting rule groups, with different evaluation intervals, and put one 
alerting rule under each.

On Tuesday, 9 August 2022 at 09:25:18 UTC+1 ninag...@gmail.com wrote:

> Thank you Brian. I know.
>
> I already use smtp_starttls to check the connection with mail server. I 
> found smtp_starttls also exposed ssl cert related metrics.
> But now I want to move the SSL cert check out of the job that checks 
> SMTP, because I want to set a different scrape_interval and evaluation 
> interval for the cert check.
>
> Any good suggestion?
>
> On Tuesday, August 9, 2022 at 4:17:34 PM UTC+8 Brian Candler wrote:
>
>> Sigh.  We've been through all this with you before in great detail.
>> https://groups.google.com/g/prometheus-users/c/LZbihDncIig/m/cqAA-UtdAQAJ
>>
>> Port 587 is SMTP submission, and it does not perform TLS on connection. 
>> Try "telnet dns 587" and you'll see it responds in plain text, which is 
>> just what the error message told you: "tls: first record does not look like 
>> a TLS handshake"
>>
>> To get it to do TLS, you need to send the "starttls" command before 
>> starting the TLS negotiation, like this example 
>> 
>>  and 
>> in the thread linked above.
>>
>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/8a03bf7a-3ea4-4a62-9e2e-068d740728d3n%40googlegroups.com.


[prometheus-users] Re: ssl cert monitoring with blackbox exporter

2022-08-09 Thread nina guo
Thank you Brian. I know.

I already use smtp_starttls to check the connection with mail server. I 
found smtp_starttls also exposed ssl cert related metrics.
But now I want to move the SSL cert check out of the job that checks 
SMTP, because I want to set a different scrape_interval and evaluation 
interval for the cert check.

Any good suggestion?

On Tuesday, August 9, 2022 at 4:17:34 PM UTC+8 Brian Candler wrote:

> Sigh.  We've been through all this with you before in great detail.
> https://groups.google.com/g/prometheus-users/c/LZbihDncIig/m/cqAA-UtdAQAJ
>
> Port 587 is SMTP submission, and it does not perform TLS on connection. 
> Try "telnet dns 587" and you'll see it responds in plain text, which is 
> just what the error message told you: "tls: first record does not look like 
> a TLS handshake"
>
> To get it to do TLS, you need to send the "starttls" command before 
> starting the TLS negotiation, like this example 
> 
>  and 
> in the thread linked above.
>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/35965fe3-8e8a-44b1-9f8f-9ef311fc0986n%40googlegroups.com.


[prometheus-users] Re: ssl cert monitoring with blackbox exporter

2022-08-09 Thread Brian Candler
Sigh.  We've been through all this with you before in great detail.
https://groups.google.com/g/prometheus-users/c/LZbihDncIig/m/cqAA-UtdAQAJ

Port 587 is SMTP submission, and it does not perform TLS on connection. Try 
"telnet dns 587" and you'll see it responds in plain text, which is just 
what the error message told you: "tls: first record does not look like a 
TLS handshake"

To get it to do TLS, you need to send the "starttls" command before 
starting the TLS negotiation, like this example 

 and 
in the thread linked above.
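
To make the order of operations concrete, here is a small Python sketch that replays a made-up server transcript through expect/send steps shaped like the smtp_starttls query_response list quoted elsewhere in this thread. The transcript lines, the replay helper, and the "plain"/"tls" labels are all hypothetical; it is a simplified model of a line-oriented probe, not the blackbox exporter's actual implementation, and it opens no network connection.

```python
import re

# Made-up transcript of what a submission server on port 587 might send.
server_lines = iter([
    "220 mail.example.org ESMTP ready",
    "250-mail.example.org",
    "250-STARTTLS",
    "250 HELP",
    "220 2.0.0 Ready to start TLS",
    # Everything after this point would travel inside the TLS session.
    "250-mail.example.org",
    "250 HELP",
])

# Steps copied from the smtp_starttls module shown in this thread.
steps = [
    {"expect": r"^220 ([^ ]+) ESMTP( .+)?$", "send": "EHLO prober\r"},
    {"expect": r"^250-STARTTLS"},
    {"expect": r"^250 .*$", "send": "STARTTLS\r"},
    {"expect": r"^220", "starttls": True},
    {"send": "EHLO prober\r"},
    {"expect": r"^250 .*$", "send": "QUIT\r"},
]

def replay(steps, lines):
    """Walk the steps in order, recording whether each send is plain or TLS."""
    log, tls = [], False
    for step in steps:
        pattern = step.get("expect")
        if pattern is not None:
            # Consume lines until one matches the expected pattern.
            if not any(re.search(pattern, line) for line in lines):
                raise RuntimeError(f"no line matched {pattern!r}")
        if "send" in step:
            log.append(("tls" if tls else "plain", step["send"].strip()))
        if step.get("starttls"):
            tls = True  # only from here on would a TLS handshake take place
    return log

log = replay(steps, server_lines)
for channel, command in log:
    print(channel, command)
```

The first EHLO and the STARTTLS command go out in plain text, which is the part a probe that starts TLS immediately on connect never gets to.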

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/351bd2be-8e6c-451f-bea6-8c9e7b31639an%40googlegroups.com.


Re: [prometheus-users] node exporter text file collector

2022-08-09 Thread Brian Candler
I am stating the obvious here, but unless you update your variable 
"query_check" inside the loop, then all your metrics will be set to the 
same value.

Also, your write_to_textfile call should move to after the end of the loop.  
You only need to write the file once.
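
A minimal sketch of that loop shape, with the value computed per service inside the loop and the file written once afterwards. To stay self-contained it uses only the Python standard library rather than prometheus_client, and run_query, the service URIs, the search base, and the output path are hypothetical stand-ins; the same structure should carry over to a Gauge created once before the loop.

```python
import os
import tempfile

def render_metrics(results):
    """Render one gauge sample per (ldap_uri, ldap_search_base) pair."""
    lines = [
        "# HELP ldap_query_success LDAP query command",
        "# TYPE ldap_query_success gauge",
    ]
    for (uri, base), value in results.items():
        lines.append(
            f'ldap_query_success{{ldap_uri="{uri}",ldap_search_base="{base}"}} {value}'
        )
    return "\n".join(lines) + "\n"

def run_query(service):
    # Hypothetical stand-in for the real LDAP check: 1 on success, 0 on failure.
    return 1 if service == "ldap://service3" else 0

services = ["ldap://service1", "ldap://service2", "ldap://service3"]
ldap_search_base = "dc=example,dc=org"  # assumed value for illustration

results = {}
for service in services:
    query_check = run_query(service)  # recomputed on every iteration
    results[(service, ldap_search_base)] = query_check

# Written once, after the loop; mode "w" replaces any previous contents.
out_path = os.path.join(tempfile.gettempdir(), "ldap_query.prom")
with open(out_path, "w") as fh:
    fh.write(render_metrics(results))
```

Because query_check is refreshed inside the loop, each label set keeps its own value instead of inheriting whatever was computed last.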

On Tuesday, 9 August 2022 at 01:05:40 UTC+1 ninag...@gmail.com wrote:

> Thank you. I moved this line before the loop, but still received the same 
> issue.
>
> On Tuesday, August 9, 2022 at 2:58:16 AM UTC+8 b...@ritcey.com wrote:
>
>> Move 
>>
>>   g1 = Gauge('ldap_query_success', 'LDAP query command', ['ldap_uri', 'ldap_search_base'], registry=registry)
>>
>> before the loop - you don't want to initialize it each time.
>> On Monday, August 8, 2022 at 7:00:42 AM UTC-4 ninag...@gmail.com wrote:
>>
>>> Thank you I have resolved the issue.
>>>
>>> I also tried to use the interfaces to create and record the metrics. I 
>>> tested with the following code and found that every metric's value is 
>>> overridden by the last value set.
>>>
>>> For example:
>>> real situation is:
>>> service1 -> ldap_query_success{...}  0
>>> service2 -> ldap_query_success{...}  0
>>> service3 -> ldap_query_success{...}  1
>>>
>>> but with the following code:
>>> service1 -> ldap_query_success{...}  1
>>> service2 -> ldap_query_success{...}  1
>>> service3 -> ldap_query_success{...}  1
>>>
>>>
>>>
>>> from prometheus_client import Gauge, write_to_textfile, CollectorRegistry
>>>
>>> for service in services:
>>>   g1 = Gauge('ldap_query_success', 'LDAP query command', ['ldap_uri', 'ldap_search_base'], registry=registry)
>>>   g1.labels(service,ldap_search_base,ldap_default_bind_dn).set(query_check)
>>>   write_to_textfile("/var/log/node_exporter/filecollector/ldap_query.prom", registry)
>>>
>>>
>>>
>>> On Monday, August 8, 2022 at 5:31:29 PM UTC+8 Stuart Clark wrote:
>>>
>>>> On 08/08/2022 09:58, nina guo wrote:
>>>> > But the following 3 lines should be appended to a file first, then
>>>> > next time override the old content. But how to make the old content be
>>>> > overridden by previous ones?
>>>> >
>>>> > print (" # HELP ldap_query_success LDAP query command",
>>>> > file=open("/var/log/node_exporter/filecollector/ldap_query.prom", "a+"))
>>>> > print (" # TYPE ldap_query_success gauge",
>>>> > file=open("/var/log/node_exporter/filecollector/ldap_query.prom", "a+"))
>>>> > print ('ldap_query_success'+'{'+'ldap_uri'+'='+service+','+'ldap_search_base'+'='+ldap_search_base+','+'} '+str(query_check),
>>>> > file=open("/var/log/node_exporter/filecollector/ldap_query.prom", "a+"))
>>>> >
>>>> File mode "a" will open for appending (so preserve anything already in
>>>> the file). Instead to fully replace the file you'd need to use file mode
>>>> "w".
>>>>
>>>> --
>>>> Stuart Clark



-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/620028ac-673f-44b4-982a-ef9d7bed0e5cn%40googlegroups.com.