Re: [prometheus-users] Facing 5m staleness issue even with 2.x
It depends on:

1. How often Gatling sends its Graphite metrics.
2. How often Prometheus scrapes graphite-exporter.

If Prometheus is scraping graphite-exporter every 15 seconds, then you'll need to keep --graphite.sample-expiry at 15 seconds or more; otherwise you may lose the last metric value written by Gatling.

On Tuesday, 19 April 2022 at 15:55:12 UTC+1 anik...@gmail.com wrote:

> Thanks a lot Brian.. Setting the --graphite.sample-expiry flag solved the
> issue. For now, I have kept it at 15 seconds... any guidance on how to
> decide the correct value would be appreciated.

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group. To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/96a0dd06-6ba8-400c-9838-2fd48eb54459n%40googlegroups.com.
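A sketch of how the two settings might line up. The 30s expiry, job name, and target address are illustrative values, not from this thread; 9108 is graphite_exporter's default metrics port under default flags:

```
# graphite_exporter: keep samples at least one scrape interval,
# with some headroom for a failed scrape
./graphite_exporter --graphite.sample-expiry=30s

# prometheus.yml fragment
scrape_configs:
  - job_name: 'graphite-exporter'
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:9108']
```

The trade-off: a longer expiry means stale values linger after the test ends; a shorter one risks gaps if it is less than the interval at which Gatling pushes data.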
Re: [prometheus-users] Facing 5m staleness issue even with 2.x
Thanks a lot Brian.. Setting the --graphite.sample-expiry flag solved the issue. For now, I have kept it at 15 seconds... any guidance on how to decide the correct value would be appreciated.

On Tue, Apr 19, 2022, 4:56 PM Aniket Kulkarni wrote:

> Thanks for the response Stuart..
>
> To explain a bit more: I am load testing an application with Gatling
> scripts (similar to JMeter), and I want real-time monitoring of this load
> test.
>
> Gatling supports the Graphite writer protocol (it can't talk to Prometheus
> directly, hence I have put graphite-exporter in between). Prometheus then
> collects the metrics sent by Gatling and provides them to Grafana to plot
> the graphs.
>
> The problem is that the graphs work, but even after my load test has
> finished, I see the last value repeating for 5 minutes. This seems to be
> the known staleness issue of Prometheus, so I am confused about how to
> resolve it. Does any configuration need to be added to the prometheus.yml
> file?
>
> Please let me know if you need any further details..
Re: [prometheus-users] Facing 5m staleness issue even with 2.x
This is an issue with graphite-exporter, not Prometheus or staleness.

The problem is this: if your application simply stops sending data to graphite-exporter, then graphite-exporter has no idea whether the time series has finished or not, so it keeps exporting it for a while. See https://github.com/prometheus/graphite_exporter#usage

"To avoid using unbounded memory, metrics will be garbage collected five minutes after they are last pushed to. This is configurable with the --graphite.sample-expiry flag."

Once graphite-exporter stops exporting the metric, then on the next scrape Prometheus will see that the time series has gone, will immediately mark it as stale (i.e. it has no more values), and everything is fine. Therefore, reducing --graphite.sample-expiry may help, although you need to know how often your application sends Graphite data; if you set this too short, you'll get gaps in your graphs.

Another option you could try is to get your application to send a "NaN" value at the end of the run. Technically this is a real NaN value, not a staleness marker (staleness markers are internally represented as a special kind of NaN, but that's an implementation detail you can't rely on). Still, a NaN may be enough to stop Grafana showing any values from that point onwards.

On Tuesday, 19 April 2022 at 12:26:31 UTC+1 anik...@gmail.com wrote:

> Thanks for the response Stuart..
>
> To explain a bit more: I am load testing an application with Gatling
> scripts (similar to JMeter), and I want real-time monitoring of this load
> test.
>
> Gatling supports the Graphite writer protocol (it can't talk to Prometheus
> directly, hence I have put graphite-exporter in between). Prometheus then
> collects the metrics sent by Gatling and provides them to Grafana to plot
> the graphs.
>
> The problem is that the graphs work, but even after my load test has
> finished, I see the last value repeating for 5 minutes. This seems to be
> the known staleness issue of Prometheus, so I am confused about how to
> resolve it. Does any configuration need to be added to the prometheus.yml
> file?
>
> Please let me know if you need any further details..
Re: [prometheus-users] Facing 5m staleness issue even with 2.x
Thanks for the response Stuart..

To explain a bit more: I am load testing an application with Gatling scripts (similar to JMeter), and I want real-time monitoring of this load test.

Gatling supports the Graphite writer protocol (it can't talk to Prometheus directly, hence I have put graphite-exporter in between). Prometheus then collects the metrics sent by Gatling and provides them to Grafana to plot the graphs.

The problem is that the graphs work, but even after my load test has finished, I see the last value repeating for 5 minutes. This seems to be the known staleness issue of Prometheus, so I am confused about how to resolve it. Does any configuration need to be added to the prometheus.yml file?

Please let me know if you need any further details..

On Tue, Apr 19, 2022, 4:44 PM Stuart Clark wrote:

> Could you describe a bit more of the problem you are seeing and what you
> are wanting to do?
>
> All time series will be marked as stale if they have not been scraped for
> a while, which causes data to stop being returned by queries. This is
> important as things like labels will change over time (especially for
> things like Kubernetes, which include pod names). It is expected that
> targets will be regularly scraped, so things shouldn't otherwise disappear
> (unless there is an error, which should be visible via something like the
> "up" metric).
>
> As the standard staleness interval is 5 minutes, it is recommended that
> the maximum scrape period be no more than 2 minutes (to allow for a failed
> scrape without the time series being marked as stale).
>
> --
> Stuart Clark
Re: [prometheus-users] Facing 5m staleness issue even with 2.x
On 2022-04-19 08:58, Aniket Kulkarni wrote:

> Hi,
>
> I have referred to the links below.
>
> I understand this was a problem with 1.x:
> https://github.com/prometheus/prometheus/issues/398
>
> I also found this link as a solution:
> https://promcon.io/2017-munich/talks/staleness-in-prometheus-2-0/
>
> No doubt it's a great session, but I am still not clear on what change I
> have to make, and where. I also couldn't find the Prometheus docs useful
> for this.
>
> I am using the following tech stack:
> Gatling -> graphite-exporter -> Prometheus -> Grafana
>
> I am still facing the staleness issue. Please guide me on the solution or
> any extra configuration needed. I am using the default storage system of
> Prometheus, not an external one.

Could you describe a bit more of the problem you are seeing and what you are wanting to do?

All time series will be marked as stale if they have not been scraped for a while, which causes data to stop being returned by queries. This is important as things like labels will change over time (especially for things like Kubernetes, which include pod names). It is expected that targets will be regularly scraped, so things shouldn't otherwise disappear (unless there is an error, which should be visible via something like the "up" metric).

As the standard staleness interval is 5 minutes, it is recommended that the maximum scrape period be no more than 2 minutes (to allow for a failed scrape without the time series being marked as stale).

--
Stuart Clark
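On the "up" metric mentioned above: you can check whether a target's last scrape succeeded via Prometheus's instant-query HTTP API. A minimal sketch, assuming Prometheus at localhost:9090 and a job named graphite-exporter (both assumptions, not from this thread):

```python
from urllib.parse import urlencode


def instant_query_url(promql: str, base: str = "http://localhost:9090") -> str:
    """Build a URL for Prometheus's instant-query endpoint."""
    return f"{base}/api/v1/query?" + urlencode({"query": promql})


# `up` is 1 if the last scrape of the target succeeded, 0 if it failed.
url = instant_query_url('up{job="graphite-exporter"}')
# import urllib.request; print(urllib.request.urlopen(url).read())
```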
[prometheus-users] Facing 5m staleness issue even with 2.x
Hi,

I have referred to the links below.

I understand this was a problem with 1.x:
https://github.com/prometheus/prometheus/issues/398

I also found this link as a solution:
https://promcon.io/2017-munich/talks/staleness-in-prometheus-2-0/

No doubt it's a great session, but I am still not clear on what change I have to make, and where. I also couldn't find the Prometheus docs useful for this.

I am using the following tech stack:
Gatling -> graphite-exporter -> Prometheus -> Grafana

I am still facing the staleness issue. Please guide me on the solution or any extra configuration needed. I am using the default storage system of Prometheus, not an external one.