Re: [prometheus-users] Facing 5m staleness issue even with 2.x

2022-04-19 Thread Brian Candler
It depends on:
1. How often Gatling sends its graphite metrics
2. How often Prometheus scrapes graphite-exporter

If Prometheus is scraping graphite-exporter every 15 seconds, then you'll 
need to keep --graphite.sample-expiry at 15 seconds or more; otherwise you 
may lose the last metric value written by Gatling.
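
For reference, a minimal sketch of the Prometheus side (the job name and 
target are assumptions; graphite_exporter serves /metrics on port 9108 by 
default, as far as I recall):

    scrape_configs:
      - job_name: 'graphite-exporter'     # assumed name
        scrape_interval: 15s              # keep this <= --graphite.sample-expiry
        static_configs:
          - targets: ['localhost:9108']   # the exporter's /metrics endpoint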

On Tuesday, 19 April 2022 at 15:55:12 UTC+1 anik...@gmail.com wrote:

> Thanks a lot, Brian.
> Setting the --graphite.sample-expiry flag solved the issue.
> For now, I have kept it at 15 seconds; any guidance on how to decide
> the correct value would be appreciated.
>
> On Tue, Apr 19, 2022, 4:56 PM Aniket Kulkarni  wrote:
>
>> Thanks for the response, Stuart.
>>
>> To explain a bit more:
>> I am load testing an application through Gatling scripts (similar to
>> JMeter).
>>
>> Now I want to have real-time monitoring of this load test.
>>
>> For this, Gatling supports the Graphite writer protocol (it can't talk to
>> Prometheus directly, hence I have used graphite-exporter in between).
>>
>> Prometheus will then collect these metrics sent by Gatling and provide
>> them to Grafana to plot the graphs.
>>
>> Now the problem is that I am getting graphs, but even after my load test
>> is finished, I see the last value repeating on the graph for 5 minutes.
>>
>> This is the known staleness issue of Prometheus, hence I am confused
>> about how to resolve it. Does any configuration need to be added to the
>> prometheus.yml file?
>>
>> Please let me know if you need any further details.
>>
>>
>> On Tue, Apr 19, 2022, 4:44 PM Stuart Clark  wrote:
>>
>>> On 2022-04-19 08:58, Aniket Kulkarni wrote:
>>> > Hi,
>>> >
>>> > I have referred to the links below.
>>> >
>>> > I understand this was a problem with 1.x:
>>> > https://github.com/prometheus/prometheus/issues/398
>>> >
>>> > I also found this link offered as a solution:
>>> > https://promcon.io/2017-munich/talks/staleness-in-prometheus-2-0/
>>> >
>>> > No doubt it's a great session, but I am still not clear as to what
>>> > change I have to make, and where.
>>> >
>>> > I also couldn't find the Prometheus docs useful for this.
>>> >
>>> > I am using the following tech stack:
>>> > Gatling -> graphite-exporter -> Prometheus -> Grafana.
>>> >
>>> > I am still facing the staleness issue. Please guide me on the solution
>>> > or any extra configuration needed.
>>> >
>>> > I am using the default storage system of Prometheus and not any
>>> > external one.
>>> > 
>>>
>>> Could you describe a bit more of the problem you are seeing and what you
>>> are wanting to do?
>>>
>>> All time series will be marked as stale if they have not been scraped
>>> for a while, which causes data to stop being returned by queries. This
>>> matters because labels will change over time (especially for things like
>>> Kubernetes, which include pod names). It is expected that targets will
>>> be regularly scraped, so things shouldn't otherwise disappear (unless
>>> there is an error, which should be visible via something like the "up"
>>> metric).
>>>
>>> As the standard staleness interval is 5 minutes, it is recommended that
>>> the maximum scrape period should be no more than 2 minutes (to allow for
>>> a failed scrape without the time series being marked as stale).
>>>
>>> -- 
>>> Stuart Clark
>>>
>>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/96a0dd06-6ba8-400c-9838-2fd48eb54459n%40googlegroups.com.


Re: [prometheus-users] Facing 5m staleness issue even with 2.x

2022-04-19 Thread Aniket Kulkarni
Thanks a lot, Brian.
Setting the --graphite.sample-expiry flag solved the issue.
For now, I have kept it at 15 seconds; any guidance on how to decide the
correct value would be appreciated.
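
As a data point, the value mostly comes down to how often Gatling flushes to 
Graphite, which is set in gatling.conf. The key names below are from memory 
and may differ between Gatling versions, so treat this as a sketch only:

    gatling {
      data {
        writers = [console, graphite]
        graphite {
          host = "localhost"   # wherever graphite_exporter listens
          port = 9109          # graphite_exporter's default Graphite line port
          writePeriod = 1      # seconds between flushes; sample-expiry should
                               # comfortably exceed this and the scrape interval
        }
      }
    }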

On Tue, Apr 19, 2022, 4:56 PM Aniket Kulkarni  wrote:

> Thanks for the response, Stuart.
>
> To explain a bit more:
> I am load testing an application through Gatling scripts (similar to
> JMeter).
>
> Now I want to have real-time monitoring of this load test.
>
> For this, Gatling supports the Graphite writer protocol (it can't talk to
> Prometheus directly, hence I have used graphite-exporter in between).
>
> Prometheus will then collect these metrics sent by Gatling and provide
> them to Grafana to plot the graphs.
>
> Now the problem is that I am getting graphs, but even after my load test
> is finished, I see the last value repeating on the graph for 5 minutes.
>
> This is the known staleness issue of Prometheus, hence I am confused
> about how to resolve it. Does any configuration need to be added to the
> prometheus.yml file?
>
> Please let me know if you need any further details.
>
>
> On Tue, Apr 19, 2022, 4:44 PM Stuart Clark 
> wrote:
>
>> On 2022-04-19 08:58, Aniket Kulkarni wrote:
>> > Hi,
>> >
>> > I have referred to the links below.
>> >
>> > I understand this was a problem with 1.x:
>> > https://github.com/prometheus/prometheus/issues/398
>> >
>> > I also found this link offered as a solution:
>> > https://promcon.io/2017-munich/talks/staleness-in-prometheus-2-0/
>> >
>> > No doubt it's a great session, but I am still not clear as to what
>> > change I have to make, and where.
>> >
>> > I also couldn't find the Prometheus docs useful for this.
>> >
>> > I am using the following tech stack:
>> > Gatling -> graphite-exporter -> Prometheus -> Grafana.
>> >
>> > I am still facing the staleness issue. Please guide me on the solution
>> > or any extra configuration needed.
>> >
>> > I am using the default storage system of Prometheus and not any
>> > external one.
>> >
>>
>> Could you describe a bit more of the problem you are seeing and what you
>> are wanting to do?
>>
>> All time series will be marked as stale if they have not been scraped
>> for a while, which causes data to stop being returned by queries. This
>> matters because labels will change over time (especially for things like
>> Kubernetes, which include pod names). It is expected that targets will
>> be regularly scraped, so things shouldn't otherwise disappear (unless
>> there is an error, which should be visible via something like the "up"
>> metric).
>>
>> As the standard staleness interval is 5 minutes, it is recommended that
>> the maximum scrape period should be no more than 2 minutes (to allow for
>> a failed scrape without the time series being marked as stale).
>>
>> --
>> Stuart Clark
>>
>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPYU55e%2BPTnnQeObYYBBzQhXuUsvnJXf-Mpmg978eGJJVi_Chw%40mail.gmail.com.


Re: [prometheus-users] Facing 5m staleness issue even with 2.x

2022-04-19 Thread Brian Candler
This is an issue with graphite-exporter, not Prometheus or staleness.

The problem is this: if your application simply stops sending data to 
graphite-exporter, then graphite-exporter has no idea whether the time 
series has finished or not, so it keeps exporting it for a while.
See https://github.com/prometheus/graphite_exporter#usage
"To avoid using unbounded memory, metrics will be garbage collected five 
minutes after they are last pushed to. This is configurable with the 
--graphite.sample-expiry flag."

Once graphite-exporter stops exporting the metric, then on the next scrape 
Prometheus will see that the time series has gone, and it will immediately 
mark it as stale (i.e. it has no more values), and everything is fine.

Therefore, reducing --graphite.sample-expiry may help, although you need to 
know how often your application sends graphite data; if you set this too 
short, then you'll get gaps in your graphs.
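
For example, something along these lines (the listen ports shown are the 
exporter's defaults as far as I recall, and 30s is just an illustration):

    # sample-expiry should exceed both Gatling's write period and the
    # Prometheus scrape interval
    graphite_exporter \
      --graphite.listen-address=":9109" \
      --web.listen-address=":9108" \
      --graphite.sample-expiry=30s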

Another option you could try is to get your application to send a "NaN" 
value at the end of the run.  But technically this is a real NaN value, 
not a staleness marker (staleness markers are internally represented as a 
special kind of NaN, but that's an implementation detail that you can't 
rely on).  Still, a NaN may be enough to stop Grafana showing any values 
from that point onwards.
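
A rough sketch of that idea over the Graphite plaintext protocol, in Python. 
The port is graphite_exporter's default line port as far as I remember, the 
metric name is purely illustrative, and whether your exporter version accepts 
a literal "nan" value is an assumption worth testing first:

    import socket
    import time

    def push_graphite(path, value, host="localhost", port=9109):
        # Graphite plaintext format: "<metric.path> <value> <unix timestamp>\n"
        line = f"{path} {value} {int(time.time())}\n"
        with socket.create_connection((host, port), timeout=5) as sock:
            sock.sendall(line.encode())

    # At the end of the load test, push one final NaN sample.
    push_graphite("gatling.users.active", "nan")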

On Tuesday, 19 April 2022 at 12:26:31 UTC+1 anik...@gmail.com wrote:

> Thanks for the response, Stuart.
>
> To explain a bit more:
> I am load testing an application through Gatling scripts (similar to
> JMeter).
>
> Now I want to have real-time monitoring of this load test.
>
> For this, Gatling supports the Graphite writer protocol (it can't talk to
> Prometheus directly, hence I have used graphite-exporter in between).
>
> Prometheus will then collect these metrics sent by Gatling and provide
> them to Grafana to plot the graphs.
>
> Now the problem is that I am getting graphs, but even after my load test
> is finished, I see the last value repeating on the graph for 5 minutes.
>
> This is the known staleness issue of Prometheus, hence I am confused
> about how to resolve it. Does any configuration need to be added to the
> prometheus.yml file?
>
> Please let me know if you need any further details.
>
>
> On Tue, Apr 19, 2022, 4:44 PM Stuart Clark  wrote:
>
>> On 2022-04-19 08:58, Aniket Kulkarni wrote:
>> > Hi,
>> >
>> > I have referred to the links below.
>> >
>> > I understand this was a problem with 1.x:
>> > https://github.com/prometheus/prometheus/issues/398
>> >
>> > I also found this link offered as a solution:
>> > https://promcon.io/2017-munich/talks/staleness-in-prometheus-2-0/
>> >
>> > No doubt it's a great session, but I am still not clear as to what
>> > change I have to make, and where.
>> >
>> > I also couldn't find the Prometheus docs useful for this.
>> >
>> > I am using the following tech stack:
>> > Gatling -> graphite-exporter -> Prometheus -> Grafana.
>> >
>> > I am still facing the staleness issue. Please guide me on the solution
>> > or any extra configuration needed.
>> >
>> > I am using the default storage system of Prometheus and not any
>> > external one.
>> > 
>>
>> Could you describe a bit more of the problem you are seeing and what you
>> are wanting to do?
>>
>> All time series will be marked as stale if they have not been scraped
>> for a while, which causes data to stop being returned by queries. This
>> matters because labels will change over time (especially for things like
>> Kubernetes, which include pod names). It is expected that targets will
>> be regularly scraped, so things shouldn't otherwise disappear (unless
>> there is an error, which should be visible via something like the "up"
>> metric).
>>
>> As the standard staleness interval is 5 minutes, it is recommended that
>> the maximum scrape period should be no more than 2 minutes (to allow for
>> a failed scrape without the time series being marked as stale).
>>
>> -- 
>> Stuart Clark
>>
>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/96842b37-bbd3-49f9-adb3-a24bc28cec79n%40googlegroups.com.


Re: [prometheus-users] Facing 5m staleness issue even with 2.x

2022-04-19 Thread Aniket Kulkarni
Thanks for the response, Stuart.

To explain a bit more:
I am load testing an application through Gatling scripts (similar to
JMeter).

Now I want to have real-time monitoring of this load test.

For this, Gatling supports the Graphite writer protocol (it can't talk to
Prometheus directly, hence I have used graphite-exporter in between).

Prometheus will then collect these metrics sent by Gatling and provide them
to Grafana to plot the graphs.

Now the problem is that I am getting graphs, but even after my load test is
finished, I see the last value repeating on the graph for 5 minutes.

This is the known staleness issue of Prometheus, hence I am confused about
how to resolve it. Does any configuration need to be added to the
prometheus.yml file?

Please let me know if you need any further details.


On Tue, Apr 19, 2022, 4:44 PM Stuart Clark  wrote:

> On 2022-04-19 08:58, Aniket Kulkarni wrote:
> > Hi,
> >
> > I have referred to the links below.
> >
> > I understand this was a problem with 1.x:
> > https://github.com/prometheus/prometheus/issues/398
> >
> > I also found this link offered as a solution:
> > https://promcon.io/2017-munich/talks/staleness-in-prometheus-2-0/
> >
> > No doubt it's a great session, but I am still not clear as to what
> > change I have to make, and where.
> >
> > I also couldn't find the Prometheus docs useful for this.
> >
> > I am using the following tech stack:
> > Gatling -> graphite-exporter -> Prometheus -> Grafana.
> >
> > I am still facing the staleness issue. Please guide me on the solution
> > or any extra configuration needed.
> >
> > I am using the default storage system of Prometheus and not any
> > external one.
> >
>
> Could you describe a bit more of the problem you are seeing and what you
> are wanting to do?
>
> All time series will be marked as stale if they have not been scraped
> for a while, which causes data to stop being returned by queries. This
> matters because labels will change over time (especially for things like
> Kubernetes, which include pod names). It is expected that targets will
> be regularly scraped, so things shouldn't otherwise disappear (unless
> there is an error, which should be visible via something like the "up"
> metric).
>
> As the standard staleness interval is 5 minutes, it is recommended that
> the maximum scrape period should be no more than 2 minutes (to allow for
> a failed scrape without the time series being marked as stale).
>
> --
> Stuart Clark
>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPYU55fLc7rnq41B6aFdjc-GyNFq6em6rpw-xeYuE7LbZnhHbA%40mail.gmail.com.


Re: [prometheus-users] Facing 5m staleness issue even with 2.x

2022-04-19 Thread Stuart Clark

On 2022-04-19 08:58, Aniket Kulkarni wrote:

Hi,

I have referred to the links below.

I understand this was a problem with 1.x:
https://github.com/prometheus/prometheus/issues/398

I also found this link offered as a solution:
https://promcon.io/2017-munich/talks/staleness-in-prometheus-2-0/

No doubt it's a great session, but I am still not clear as to what change I
have to make, and where.

I also couldn't find the Prometheus docs useful for this.

I am using the following tech stack:
Gatling -> graphite-exporter -> Prometheus -> Grafana.

I am still facing the staleness issue. Please guide me on the solution or
any extra configuration needed.

I am using the default storage system of Prometheus and not any external
one.



Could you describe a bit more of the problem you are seeing and what you 
are wanting to do?

All time series will be marked as stale if they have not been scraped for a 
while, which causes data to stop being returned by queries. This matters 
because labels will change over time (especially for things like 
Kubernetes, which include pod names). It is expected that targets will be 
regularly scraped, so things shouldn't otherwise disappear (unless there is 
an error, which should be visible via something like the "up" metric).

As the standard staleness interval is 5 minutes, it is recommended that the 
maximum scrape period should be no more than 2 minutes (to allow for a 
failed scrape without the time series being marked as stale).
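
As an aside, a quick way to spot failed scrapes is the "up" metric in the 
expression browser; the job label here is just an example, so substitute 
whatever your scrape config uses:

    up{job="graphite-exporter"}                            # 1 = last scrape OK, 0 = failed
    min_over_time(up{job="graphite-exporter"}[1h]) == 0    # any failures in the past hour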


--
Stuart Clark

--
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/a97e8140ebdc538c0553192e3dacaf71%40Jahingo.com.


[prometheus-users] Facing 5m staleness issue even with 2.x

2022-04-19 Thread Aniket Kulkarni
Hi,

I have referred to the links below.

I understand this was a problem with 1.x:
https://github.com/prometheus/prometheus/issues/398

I also found this link offered as a solution:
https://promcon.io/2017-munich/talks/staleness-in-prometheus-2-0/

No doubt it's a great session, but I am still not clear as to what change I
have to make, and where.

I also couldn't find the Prometheus docs useful for this.

I am using the following tech stack:
Gatling -> graphite-exporter -> Prometheus -> Grafana.

I am still facing the staleness issue. Please guide me on the solution or
any extra configuration needed.

I am using the default storage system of Prometheus and not any external
one.

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPYU55eVY7OTrVKZ1ijJWt1WFURrkN1jR34G6e9q5Z9QxEs6bA%40mail.gmail.com.