[prometheus-users] Re: Graph Tab in Prometheus

2022-08-18 Thread kekr...@gmail.com
Thank you Brian.  This helps.

Kevin

On Thursday, August 18, 2022 at 4:27:01 AM UTC-5 Brian Candler wrote:

> BTW, I just did a quick test.  When setting my graph display range to 2w 
> in the Prometheus web interface, I found that adjacent data points were 
> just under 81 minutes apart.  So the query
>
> max_over_time(ALERTS[81m])
>
> was able to show lots of short-lived alerts, which the plain query
>
> ALERTS
>
> did not.  Setting it longer, e.g. to [3h], smears those alerts over 
> multiple graph points, as expected.
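
The "just under 81 minutes" figure can be reproduced with a little arithmetic. This is only a sketch: it assumes the web interface aims for roughly 250 points per graph, which is a guess that fits the observation above rather than a documented guarantee. The function name and the 250 default are hypothetical.

```python
def graph_step_seconds(range_seconds: int, points: int = 250) -> int:
    """Step between adjacent graph points, assuming about `points` points per graph."""
    # Never go below a 1-second step, even for very short ranges.
    return max(range_seconds // points, 1)


TWO_WEEKS = 14 * 24 * 3600  # the 2w display range, in seconds

step = graph_step_seconds(TWO_WEEKS)
print(step, "seconds =", round(step / 60, 1), "minutes")
```

Under that assumption the window in max_over_time(ALERTS[81m]) is slightly wider than the spacing between adjacent points, so no gap is left between windows.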
>
> On Thursday, 18 August 2022 at 09:46:40 UTC+1 Brian Candler wrote:
>
>> Presumably you are using the PromQL query browser built into prometheus? 
>> (Not some third party tool like Grafana etc?)
>>
>> When you draw a graph from time T1 to T2, you send the Prometheus API a 
>> range query 
>> <https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries> 
>> to repeatedly evaluate an instant vector query over a time range from T1 to 
>> T2 with some step S.  The step S is chosen by the client so that a 
>> suitable number of points fit in the display, e.g. if it wants 200 data 
>> points then it could choose step = (T2 - T1) / 200.  In the Prometheus 
>> graph view you can see this by moving your mouse left and right over the 
>> graph; a pop-up shows you each data point, and you can see it switch from 
>> point to point as you move left to right.
>>
>> Therefore, it's showing the values of the timeseries at the instants T1, 
>> T1+S, T1+2S, ... T2-S, T2.
>>
>> When evaluating a timeseries at a given instant in time, it finds the 
>> closest value *at or before* that time (up to a maximum lookback interval, 
>> which by default is 5 minutes).
>>
>> Therefore, your graph is showing *samples* of the data in the TSDB.  If 
>> you zoom out too far, you may be missing "interesting" values.  For example:
>>
>> TSDB :  0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0  ...
>> Graph:   0 0 1 0 0 ...
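
The sampling effect in the TSDB/Graph illustration above can be simulated in a few lines. This is a simplified sketch, not how Prometheus is implemented: the series, the function names, and the exact handling of the lookback boundary are assumptions made for illustration.

```python
from bisect import bisect_right

LOOKBACK_SECONDS = 300  # the default 5-minute lookback described above


def instant_value(samples, t, lookback=LOOKBACK_SECONDS):
    """Value of the newest sample at or before t, or None if it is older than lookback."""
    times = [ts for ts, _ in samples]
    i = bisect_right(times, t)  # index of the first sample strictly after t
    if i == 0:
        return None
    ts, value = samples[i - 1]
    return value if t - ts <= lookback else None


def range_query(samples, start, end, step):
    """Evaluate instant_value at start, start+step, ... up to end."""
    return [(t, instant_value(samples, t)) for t in range(start, end + 1, step)]


# Hypothetical series: scraped every 60 s for an hour, 0 except one spike at t=600.
series = [(t, 1 if t == 600 else 0) for t in range(0, 3601, 60)]

coarse = range_query(series, 0, 3600, 900)  # zoomed out: 5 graph points
fine = range_query(series, 0, 3600, 60)     # zoomed in: 61 graph points

print([v for _, v in coarse])   # the spike is not among these values
print(max(v for _, v in fine))  # the spike is visible at the finer step
```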
>>
>> Counters make this less of a problem: you can get your graph to show how 
>> the counter has *increased* between two adjacent points (usually then 
>> divided by the step time, to get a rate).
>>
>> However, the problem for a metric like ALERTS is that it's not a counter, 
>> and it doesn't even switch between 0 and 1; instead the whole timeseries 
>> appears and disappears.  (In fact, it creates separate timeseries for when 
>> the alert is in state "pending" and "firing").  If your graph step is more 
>> than 5 minutes, you may not catch the alert's presence at all.
>>
>> What you could try is a query like this:
>>
>> max_over_time(ALERTS{alertname="CPUUtilization"}[1h])
>>
>> The inner query is a range vector: it returns all data points within a 1 
>> hour window, from 1 hour before the evaluation time up to the evaluation 
>> time.  Then if *any* data points exist in that window, the highest one is 
>> returned, forming an instant vector again.  When your graph sweeps this 
>> expression over a time period from T1 to T2, each data point will 
>> cover one hour. That should catch the "missing" samples.
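
To make the difference concrete, here is a small simulation of a plain lookup versus a max_over_time-style window. It is a sketch under stated assumptions: the alert samples are invented, the helper names are hypothetical, and window boundary handling is simplified compared with real PromQL.

```python
def max_over_time(samples, t, window):
    """Highest value among samples in the window ending at t, or None if it is empty."""
    values = [v for ts, v in samples if t - window < ts <= t]
    return max(values) if values else None


def present_at(samples, t, lookback=300):
    """Plain query: is there any sample within the lookback period ending at t?"""
    return any(t - lookback < ts <= t for ts, _ in samples)


# Hypothetical alert that fired for about two minutes: the series only exists then.
alert = [(600, 1), (660, 1), (720, 1)]

step = 3600  # graph step of one hour, matching the [1h] window
instants = range(0, 4 * step + 1, step)

plain = [present_at(alert, t) for t in instants]
windowed = [max_over_time(alert, t, step) for t in instants]

print(plain)     # the plain query never sees the alert at this step
print(windowed)  # the windowed query reports it at the first point after it fired
```

The window is set equal to the step here; a shorter window would leave gaps between graph points, and a longer one would show the same alert on several points.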
>>
>> Of course, the time window is fixed to 1h in that query, and you may need 
>> to adjust it depending on your graph zoom level, to match the time period 
>> between adjacent points on the graph.  If you're using Grafana, there's a 
>> magic variable $__interval 
>> <https://grafana.com/docs/grafana/latest/variables/variable-types/global-variables/#__interval> 
>> you can use.  I vaguely remember seeing a proposal for PromQL 
>> to have a way of referring to "the current step interval" in a range vector 
>> expression, but I don't know what happened to that.
>>
>> HTH,
>>
>> Brian.
>>
>> On Wednesday, 17 August 2022 at 23:21:03 UTC+1 kekr...@gmail.com wrote:
>>
>>> I am currently looking for all CPU alerts using a query of 
>>> ALERTS{alertname="CPUUtilization"}
>>>
>>> I am stepping through the graph time frame one click at a time.  
>>>
>>> At the 12h time, I get one entry.  At 1d I get zero entries.  At 2d, I 
>>> get 4 entries but not the one I found at 12h.  I would expect to get 
>>> everything from 2d to now.
>>>
>>> At 1w, I get 8 entries but at 2w, I only get 5 entries.  I would expect 
>>> to get everything from 2w to now.
>>>
>>> Last week I ran this same query and found the alert I was l

[prometheus-users] Graph Tab in Prometheus

2022-08-17 Thread kekr...@gmail.com
I am currently looking for all CPU alerts using a query of 
ALERTS{alertname="CPUUtilization"}

I am stepping through the graph time frame one click at a time.  

At the 12h time, I get one entry.  At 1d I get zero entries.  At 2d, I get 
4 entries but not the one I found at 12h.  I would expect to get everything 
from 2d to now.

At 1w, I get 8 entries but at 2w, I only get 5 entries.  I would expect to 
get everything from 2w to now.

Last week I ran this same query and found the alert I was looking for back 
in April.  Today I ran the same query and I cannot find that alert from 
April.

I see this behavior in multiple Prometheus environments.

Is this a problem or the way the graphing works in Prometheus?

Prometheus version is 2.29.1
Prometheus retention period is 1y
DB is currently 1.2TB.  There are DBs as large as 5TB in other Prometheus 
environments.


-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/aa306fae-380e-4b9f-aaf5-a3ab14525b19n%40googlegroups.com.


Re: [prometheus-users] Target Server - Which Prometheus Server Is Scraping

2022-02-16 Thread kekr...@gmail.com
Thanks Ben.  That is a much easier way to do it than my roundabout way.  
Will keep that in mind next time.

Stuart, I see what you are getting at now.  We do not parse through the 
application log files.  We wrote our exporters to take in a list of 
scripts that generate metrics, which are translated into Prometheus format 
and written to the DB. 

On Wednesday, February 16, 2022 at 3:35:09 AM UTC-6 Stuart Clark wrote:

> On 16/02/2022 01:11, kekr...@gmail.com wrote:
> > Stuart, I am not sure I understand the log files question.  I am not 
> > aware of any log files related to the scrape itself.  We do have log 
> > files related to the exporters running on the server, but they do not 
> > capture the scrapes.  I am trying to get details of what is going on 
> > on the target server itself; I am not so concerned about what the 
> > Prometheus server has log-wise.
> >
> I was talking about logs from the application being scraped.
>
> -- 
> Stuart Clark
>
>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/c541e63e-ab92-43c6-b11c-4df83ca2318en%40googlegroups.com.


Re: [prometheus-users] Target Server - Which Prometheus Server Is Scraping

2022-02-15 Thread kekr...@gmail.com
Stuart, I am not sure I understand the log files question.  I am not aware 
of any log files related to the scrape itself.  We do have log files 
related to the exporters running on the server, but they do not capture the 
scrapes.  I am trying to get details of what is going on on the target 
server itself; I am not so concerned about what the Prometheus server has 
log-wise.

If you can add some more detail around the question, I will be glad to 
answer.  

On Tuesday, February 15, 2022 at 4:41:41 PM UTC-6 kekr...@gmail.com wrote:

> I got the Grafana date problem solved and have the scrape time history for 
> the metric created - the info I needed.
>
> Thank you Stuart and Brian for the assistance.
>
> Kevin
>
> On Tuesday, February 15, 2022 at 4:32:02 PM UTC-6 Stuart Clark wrote:
>
>> On 15/02/2022 22:29, kekr...@gmail.com wrote: 
>> > I am not seeing scrapes more often than I expect.  I am being 
>> > told a log file is being created by the scrapes in a temp directory 
>> > every minute.  I am saying it is not Prometheus. So now I have to 
>> > prove it is not Prometheus. 
>> > 
>> > As an alternate solution, I am trying to use the Prometheus timestamp 
>> > function on the metric being created by the scrape in Grafana to get 
>> > the time history of the metric as proof.  The thought being that the 
>> > time difference between entries in the metric history is 3 minutes.  
>> > But I am having trouble getting the value of the timestamp function to 
>> > act as an epoch date.  If I use the value returned in a web epoch 
>> > translator, it translates to the correct date.  If I multiply the 
>> > value by 1000, as you do with every epoch date in Grafana, it actually 
>> > multiplies the value rather than putting it in human readable date 
>> > format. 
>> I'm not clear if you are getting logs from these requests or not? I'd 
>> expect any request logs to include the path being requested, time & 
>> source IP. What do you see? 
>>
>> -- 
>> Stuart Clark 
>>
>>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/4fdc7c08-bbbf-40bc-a26c-c791337155c2n%40googlegroups.com.


Re: [prometheus-users] Target Server - Which Prometheus Server Is Scraping

2022-02-15 Thread kekr...@gmail.com
I got the Grafana date problem solved and have the scrape time history for 
the metric created - the info I needed.

Thank you Stuart and Brian for the assistance.

Kevin

On Tuesday, February 15, 2022 at 4:32:02 PM UTC-6 Stuart Clark wrote:

> On 15/02/2022 22:29, kekr...@gmail.com wrote:
> > I am not seeing scrapes more often than I expect.  I am being 
> > told a log file is being created by the scrapes in a temp directory 
> > every minute.  I am saying it is not Prometheus. So now I have to 
> > prove it is not Prometheus.
> >
> > As an alternate solution, I am trying to use the Prometheus timestamp 
> > function on the metric being created by the scrape in Grafana to get 
> > the time history of the metric as proof.  The thought being that the 
> > time difference between entries in the metric history is 3 minutes.  
> > But I am having trouble getting the value of the timestamp function to 
> > act as an epoch date.  If I use the value returned in a web epoch 
> > translator, it translates to the correct date.  If I multiply the 
> > value by 1000, as you do with every epoch date in Grafana, it actually 
> > multiplies the value rather than putting it in human readable date 
> > format.
> I'm not clear if you are getting logs from these requests or not? I'd 
> expect any request logs to include the path being requested, time & 
> source IP. What do you see?
>
> -- 
> Stuart Clark
>
>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/6017e329-45a9-4b0a-bfa5-02de4a74713fn%40googlegroups.com.


Re: [prometheus-users] Target Server - Which Prometheus Server Is Scraping

2022-02-15 Thread kekr...@gmail.com
I am not seeing scrapes more often than I expect.  I am being told 
a log file is being created by the scrapes in a temp directory every 
minute.  I am saying it is not Prometheus.  So now I have to prove it is 
not Prometheus.

As an alternate solution, I am trying to use the Prometheus timestamp 
function on the metric being created by the scrape in Grafana to get the 
time history of the metric as proof.  The thought being that the time 
difference between entries in the metric history is 3 minutes.  But I am 
having trouble getting the value of the timestamp function to act as an 
epoch date.  If I use the value returned in a web epoch translator, it 
translates to the correct date.  If I multiply the value by 1000, as you do 
with every epoch date in Grafana, it actually multiplies the value rather 
than putting it in human readable date format.
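
For reference, the PromQL timestamp() function returns seconds since the Unix epoch, while date displays that want an epoch in milliseconds need the value multiplied by 1000. The sketch below shows both conversions and the interval check described above, outside of Grafana. The sample timestamps are invented and the helper names are hypothetical.

```python
from datetime import datetime, timezone


def to_millis(epoch_seconds: float) -> int:
    """Epoch seconds (as returned by timestamp()) converted to epoch milliseconds."""
    return int(epoch_seconds * 1000)


def to_readable(epoch_seconds: float) -> str:
    """Epoch seconds rendered as an ISO 8601 UTC date string."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


def scrape_intervals(timestamps):
    """Seconds between consecutive sample timestamps."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


# Hypothetical timestamp() values for four consecutive samples of one metric.
samples = [1644964200, 1644964380, 1644964560, 1644964740]

print(to_millis(samples[0]))
print(to_readable(samples[0]))
print(scrape_intervals(samples))  # each gap is 180 s, i.e. a 3m scrape interval
```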

Kevin
On Tuesday, February 15, 2022 at 3:47:10 PM UTC-6 Stuart Clark wrote:

> On 15/02/2022 18:41, kekr...@gmail.com wrote:
> > My end goal is to prove monitoring is not running every minute on the 
> > server.  My word that it is not (there is no way; the job is not 
> > configured to run every minute) is not good enough.
> >
> > There is a possibility that the two Prometheus servers are scraping at 
> > the same time but there is no way the scrapes are happening every 
> > minute.  The scrape interval is 3m with a scrape timeout of 2m45s.
>
> So are you seeing more frequent requests than you expect? How are you 
> telling this? Do you have request logs & what do they say/record?
>
> -- 
> Stuart Clark
>
>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/9188e918-6750-43c3-80f7-7570803cad5fn%40googlegroups.com.


Re: [prometheus-users] Target Server - Which Prometheus Server Is Scraping

2022-02-15 Thread kekr...@gmail.com
My end goal is to prove monitoring is not running every minute on the 
server.  My word that it is not (there is no way; the job is not 
configured to run every minute) is not good enough.

There is a possibility that the two Prometheus servers are scraping at the 
same time but there is no way the scrapes are happening every minute.  The 
scrape interval is 3m with a scrape timeout of 2m45s.

On Tuesday, February 15, 2022 at 1:47:37 AM UTC-6 Brian Candler wrote:

> You can add whatever query params you like in the scrape job, e.g. you 
> could add ?from=foo or ?from=bar as part of the URL being scraped.
>
>   - job_name: blah
>     metrics_path: /metrics
>     params:
>       from: [ foo ]
>     ...
>
> However, I wonder how you intend to use this information.  The data which 
> is scraped *should* be independent of who is scraping it, and it should be 
> possible to do additional scrapes without affecting the data (e.g. hitting 
> the exporter with "curl" to test it shouldn't alter the data for anyone 
> else).
>
> Therefore, I wonder if there's a better way to achieve what you're trying 
> to achieve.  For example, if you are keeping a counter of "how many widgets 
> processed in the last minute", and resetting it to zero on each scrape, 
> then you should not be doing this; you should be keeping a counter which 
> just keeps incrementing. It's up to the consumer of the data to look at 
> the data and work out the number of widgets processed per minute, or per 
> hour or whatever.  Having the data in this format is much more useful anyway.
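
As a rough illustration of the consumer-side calculation described above, the sketch below derives a per-minute rate from two samples of an ever-increasing counter. The sample values and function name are hypothetical, and counter resets (for example after an exporter restart) are deliberately not handled.

```python
def per_minute_rate(earlier, later):
    """Average increase per minute between two (timestamp_seconds, counter) samples."""
    (t0, c0), (t1, c1) = earlier, later
    return (c1 - c0) * 60 / (t1 - t0)


# Hypothetical "widgets processed" counter, scraped every 3 minutes (180 s).
scrapes = [(0, 100), (180, 130), (360, 190)]

rates = [per_minute_rate(a, b) for a, b in zip(scrapes, scrapes[1:])]
print(rates)
```

Because the exporter never resets the counter on a scrape, a second Prometheus server or a manual curl can read the same values without changing what anyone else computes.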
>
> If you can describe what it is you're doing, and why it matters where the 
> scrape is coming from, we may be able to give some alternative suggestions.
>
> On Tuesday, 15 February 2022 at 07:15:11 UTC Stuart Clark wrote:
>
>> On 15/02/2022 04:32, kekr...@gmail.com wrote: 
>> > 
>> > If you have multiple Prometheus servers using an identical target 
>> > list, is there a way on the target server to tell which Prometheus 
>> > server is scraping at the time the scrape occurs? 
>> > 
>> > For example,  Prom_server_a and Prom_server_b scrape 
>> > target_server_123.  Is there something on target_server_123 that 
>> > says Prom_server_a is scraping right now, or Prom_server_b is 
>> > scraping right now? 
>> > 
>> The source IP address in any logs? 
>>
>> -- 
>> Stuart Clark 
>>
>>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/4136dffc-56f4-480c-ab54-0358b86864f2n%40googlegroups.com.


[prometheus-users] Target Server - Which Prometheus Server Is Scraping

2022-02-14 Thread kekr...@gmail.com

If you have multiple Prometheus servers using an identical target list, is 
there a way on the target server to tell which Prometheus server is 
scraping at the time the scrape occurs?

For example,  Prom_server_a and Prom_server_b scrape target_server_123.  Is 
there something on target_server_123 that says Prom_server_a is scraping 
right now, or Prom_server_b is scraping right now?

Thanks,
Kevin

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/878a532a-283b-46ba-933e-0d4e8ea0c927n%40googlegroups.com.