I appreciate your time. I’ve logged off for the day but will get back to you tomorrow with more data.
To answer the question I can: we aren’t using any proxy software to my knowledge. We use the https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus Helm chart (version 13.8.0) hooked up to store data in AWS’s managed Prometheus product. That said, we do run it in StatefulSet mode with 2 replicas. I wonder if that’s causing problems. On Wed, 6 Apr 2022, at 19:49, Brian Candler wrote: > Are you going through any middleware or proxy, like promxy? > > rate(foo[1m]) should definitely give no answer at all, when the timeseries > data is sampled at 1 minute intervals. > > Here is a working query_range for rate[1m] where the scrape interval is 15s: > > # curl -Ssg > 'http://localhost:9090/api/v1/query_range?query=rate(ifHCInOctets{instance="gw1",ifName="ether1"}[60s])&start=1649264340&end=1649264640&step=60' > | python3 -m json.tool > { > "status": "success", > "data": { > "resultType": "matrix", > "result": [ > { > "metric": { > "ifIndex": "16", > "ifName": "ether1", > "instance": "gw1", > "job": "snmp", > "module": "mikrotik_secret", > "netbox_type": "device" > }, > "values": [ > [ > 1649264340, > "578.6444444444444" > ], > [ > 1649264400, > "651.4222222222221" > ], > [ > 1649264460, > "135.17777777777778" > ], > [ > 1649264520, > "1699.4888888888888" > ], > [ > 1649264580, > "441.5777777777777" > ], > [ > 1649264640, > "39768.08888888888" > ] > ] > } > ] > } > } > > But if I make exactly the same query but with rate[15s] then there are no > answers: > > # curl -Ssg > 'http://localhost:9090/api/v1/query_range?query=rate(ifHCInOctets{instance="gw1",ifName="ether1"}[15s])&start=1649264340&end=1649264640&step=60' > | python3 -m json.tool > { > "status": "success", > "data": { > "resultType": "matrix", > "result": [] > } > } > > I think the real reason for your problem is hidden; you're obfuscating the > query and metric names, and I suspect it's hidden behind that. 
Sorry, I > can't help you any further given what I can see, but hopefully you have an > idea where you can look further. > > On Wednesday, 6 April 2022 at 18:45:10 UTC+1 [email protected] wrote: >> Hey Brian, >> >> In the original post I put the output of the raw time series as gathered the >> way you suggest. I'll copy it again below: >> >> { >> "data": { >> "result": [ >> { >> "metric": {/* redacted */}, >> "values": [ >> [ >> 1649239253.4, >> "225201" >> ], >> [ >> 1649239313.4, >> "225226" >> ], >> [ >> 1649239373.4, >> "225249" >> ], >> [ >> 1649239433.4, >> "225262" >> ], >> [ >> 1649239493.4, >> "225278" >> ], >> [ >> 1649239553.4, >> "225310" >> ], >> [ >> 1649239613.4, >> "225329" >> ], >> [ >> 1649239673.4, >> "225363" >> ], >> [ >> 1649239733.4, >> "225402" >> ], >> [ >> 1649239793.4, >> "225437" >> ], >> [ >> 1649239853.4, >> "225466" >> ], >> [ >> 1649239913.4, >> "225492" >> ], >> [ >> 1649239973.4, >> "225529" >> ], >> [ >> 1649240033.4, >> "225555" >> ], >> [ >> 1649240093.4, >> "225595" >> ] >> ] >> } >> ], >> "resultType": "matrix" >> }, >> "status": "success" >> } >> >> The query was of the form `counter[15m]` at a given time. I don't see >> duplicate scrape data in there. >> >> The version of prometheus is 2.26.0, revision >> 3cafc58827d1ebd1a67749f88be4218f0bab3d8d, go version go1.16.2. >> On Wednesday, April 6, 2022 at 6:13:10 PM UTC+1 Brian Candler wrote: >>> What version of prometheus are you running? >>> >>> With prometheus, rate(counter[1m]) should give you no results at all when >>> you are scraping at 1 minute intervals - unless something has changed very >>> recently (I'm running 2.33.4). So this is a big red flag. 
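[Editor's note: Brian's "big red flag" above — that rate(counter[1m]) over 1-minute scrapes should return nothing — follows from rate() needing at least two samples inside the lookback window. A minimal sketch of that rule; this is a simplified model (real Prometheus also extrapolates toward the window boundaries, handles counter resets, and boundary inclusivity differs across versions), and the sample values are made up:]

```python
# rate() needs at least two samples inside the lookback window; with a 60s
# scrape interval a [1m] window usually holds only one sample, so the query
# returns an empty result. Simplified model; sample data is hypothetical.

def naive_rate(samples, eval_time, window):
    """samples: list of (timestamp, value) pairs; per-second rate or None."""
    in_window = [(t, v) for t, v in samples if eval_time - window < t <= eval_time]
    if len(in_window) < 2:
        return None  # mirrors the empty result set PromQL returns
    (t0, v0), (t1, v1) = in_window[0], in_window[-1]
    return (v1 - v0) / (t1 - t0)

scrapes_15s = [(0, 100), (15, 130), (30, 160), (45, 190), (60, 220)]
print(naive_rate(scrapes_15s, 60, 60))  # 2.0 -- several samples fall in the window
print(naive_rate(scrapes_15s, 60, 15))  # None -- one sample fits, like rate[15s] above
```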
>>> Now, for driving the query API, you should be able to do it like this: >>> >>> # curl -Ssg >>> 'http://localhost:9090/api/v1/query?query=ifHCInOctets{instance="gw1",ifName="ether1"}[60s]' >>> | python3 -m json.tool >>> >>> { >>> "status": "success", >>> "data": { >>> "resultType": "matrix", >>> "result": [ >>> { >>> "metric": { >>> "__name__": "ifHCInOctets", >>> "ifIndex": "16", >>> "ifName": "ether1", >>> "instance": "gw1", >>> "job": "snmp", >>> "module": "mikrotik_secret", >>> "netbox_type": "device" >>> }, >>> "values": [ >>> [ >>> 1649264595.241, >>> "117857843410" >>> ], >>> [ >>> 1649264610.241, >>> "117858063821" >>> ], >>> [ >>> 1649264625.241, >>> "117858075769" >>> ] >>> ] >>> } >>> ] >>> } >>> } >>> >>> There I gave a range vector of 60 seconds, and I got 3 data points because >>> I'm scraping at 15 second intervals, so only 3 points fell within the time >>> window of (current time) and (current time - 60s) >>> >>> Sending a query_range will sample the data at intervals. Only an actual >>> range vector query (as shown above) will show you *all* the data points in >>> the time series, wherever they lie. >>> >>> I think you should do this. My guess - and it's only a guess at the moment >>> - is that there are multiple points being received for the same timeseries, >>> and this is giving your spike. This could be due to overlapping scrape >>> jobs for the same timeseries, or relabelling removing some distinguishing >>> label, or some HA setup which is scraping the same timeseries multiple >>> times but not adding external labels to distinguish them. >>> >>> I do have some evidence for my guess. 
If you are storing the same data >>> points twice, this will give you the rate of zero most of the time, when >>> doing rate[1m], because there are two adjacent identical points most of the >>> time (whereas if there were only a single data point, you'd get no rate at >>> all). And you'll get a counter spike if two data points get transposed. >>> >>> On Wednesday, 6 April 2022 at 14:37:57 UTC+1 [email protected] wrote: >>>> Here's the query inspector output from Grafana for rate(counter[2m]). It >>>> makes the answer to question 1 in my original post more clear. You're >>>> right, the graph for 1m is just plain wrong. We do still see the reset, >>>> though. >>>> >>>> { >>>> "request": { >>>> "url": >>>> "api/datasources/proxy/1/api/v1/query_range?query=rate(counter[2m])&start=1649239200&end=1649240100&step=60", >>>> >>>> "method": "GET", >>>> "hideFromInspector": false >>>> }, >>>> "response": { >>>> "status": "success", >>>> "data": { >>>> "resultType": "matrix", >>>> "result": [ >>>> { >>>> "metric": {/* redacted */}, >>>> "values": [ >>>> [ >>>> 1649239200, >>>> "0.2871886897537781" >>>> ], >>>> [ >>>> 1649239260, >>>> "0.3084619260318357" >>>> ], >>>> [ >>>> 1649239320, >>>> "0.26591545347572043" >>>> ], >>>> [ >>>> 1649239380, >>>> "0.2446422171976628" >>>> ], >>>> [ >>>> 1649239440, >>>> "0.13827603580737463" >>>> ], >>>> [ >>>> 1649239500, >>>> "0.1701858902244611" >>>> ], >>>> [ >>>> 1649239560, >>>> "0.3403717804489222" >>>> ], >>>> [ >>>> 1649239620, >>>> "0.20209574464154753" >>>> ], >>>> [ >>>> 1649239680, >>>> "0.3616450167269798" >>>> ], >>>> [ >>>> 1649239740, >>>> "2397.9404664989347" >>>> ], >>>> [ >>>> 1649239800, >>>> "2397.88728340824" >>>> ], >>>> [ >>>> 1649239860, >>>> "0.3084619260318357" >>>> ], >>>> [ >>>> 1649239920, >>>> "0.27655207161474926" >>>> ], >>>> [ >>>> 1649239980, >>>> "0.39355487114406623" >>>> ], >>>> [ >>>> 1649240040, >>>> "0.27655207161474926" >>>> ], >>>> [ >>>> 1649240100, >>>> "0.43610134370018155" >>>> ] >>>> ] >>>> } 
>>>> ] >>>> } >>>> } >>>> } >>>> On Wednesday, April 6, 2022 at 2:34:59 PM UTC+1 Sam Rose wrote: >>>>> We do see a graph with rate(counter[1m]). It even looks pretty close to >>>>> what we see with rate(counter[2m]). We definitely scrape every 60 >>>>> seconds, double checked our config to make sure. >>>>> >>>>> The exact query was `counter[15m]`. Counter is >>>>> `django_http_responses_total_by_status_total` in reality, with a long >>>>> list of labels attached to ensure I'm selecting a single time series. >>>>> >>>>> I didn't realise Grafana did that, thank you for the advice. >>>>> >>>>> I feel like we're drifting away from the original problem a little bit. >>>>> Can I get you any additional data to make the original problem easier to >>>>> debug? >>>>> >>>>> On Wednesday, April 6, 2022 at 2:31:27 PM UTC+1 Brian Candler wrote: >>>>>> If you are scraping at 1m intervals, then you definitely need >>>>>> rate(counter[2m]). That's because rate() needs at least two data points >>>>>> to fall within the range window. I would be surprised if you see any >>>>>> graph at all with rate(counter[1m]). >>>>>> >>>>>> > This is the raw data, as obtained through a request to /api/v1/query >>>>>> >>>>>> What is the *exact* query you gave? Hopefully it is a range vector >>>>>> query, like counter[15m]. A range vector expression sent to the simple >>>>>> query endpoint gives you the raw data points with their raw timestamps >>>>>> from the database. >>>>>> >>>>>> > and then we configure the minimum value of it to 1m per-graph >>>>>> >>>>>> Just in case you haven't realised: to set a minimum value of 1m, you >>>>>> must set the data source scrape interval (in Grafana) to 15s - since >>>>>> Grafana clamps the minimum value to 4 x Grafana-configured data source >>>>>> scrape interval. >>>>>> >>>>>> Therefore if you are actually scraping at 1m intervals, and you want the >>>>>> minimum of $__rate_interval to be 2m, then you must set the Grafana data >>>>>> source interval to 30s. 
This is weird, but it is what it is. >>>>>> https://github.com/grafana/grafana/issues/32169 >>>>>> >>>>>> On Wednesday, 6 April 2022 at 14:07:13 UTC+1 [email protected] wrote: >>>>>>> We do make use of that variable, and then we configure the minimum >>>>>>> value of it to 1m per-graph. I didn't realise you could configure this >>>>>>> per-datasource, thanks for pointing that out! >>>>>>> >>>>>>> We did used to scrape at 15s intervals but we're using AWS's managed >>>>>>> prometheus workspaces, and each data point costs money, so we brought >>>>>>> it down to 1m intervals. >>>>>>> >>>>>>> I'm not sure I understand the relationship between scrape interval and >>>>>>> counter resets, especially considering there doesn't appear to be a >>>>>>> counter reset in the raw data of the time series in question. >>>>>>> >>>>>>> You mentioned "true counter reset", does prometheus have some internal >>>>>>> distinction between types of counter reset? >>>>>>> On Wednesday, April 6, 2022 at 2:03:40 PM UTC+1 [email protected] wrote: >>>>>>>> I would recommend using the `$__rate_interval` magic variable in >>>>>>>> Grafana. Note that Grafana assumes a default interval of 15s in the >>>>>>>> datasource settings. >>>>>>>> >>>>>>>> If your data is mostly 60s scrape intervals, you can configure this >>>>>>>> setting in the Grafana datasource settings. >>>>>>>> >>>>>>>> If you want to be able to view 1m resolution rates, I recommend >>>>>>>> increasing your scrape interval to 15s. This makes sure you have >>>>>>>> several samples in the rate window. This helps Prometheus better >>>>>>>> handle true counter resets and lost scrapes. >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Apr 6, 2022 at 2:56 PM Sam Rose <[email protected]> wrote: >>>>>>>>> Thanks for the heads up! We've flip flopped a bit between using 1m or >>>>>>>>> 2m. 1m seems to work reliably enough to be useful in most situations, >>>>>>>>> but I'll probably end up going back to 2m after this discussion. 
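[Editor's note: the clamping Brian describes can be made concrete. As I understand it from the Grafana documentation and grafana/grafana#32169, `$__rate_interval` is computed as max(panel interval + scrape interval, 4 × scrape interval), using the scrape interval configured in the *datasource settings*, not the real one. A hedged sketch of that formula:]

```python
# Hedged sketch of Grafana's $__rate_interval calculation, assuming the
# documented formula max(interval + scrape_interval, 4 * scrape_interval),
# where scrape_interval is what the datasource settings claim (seconds).

def grafana_rate_interval(panel_interval_s: int, datasource_scrape_interval_s: int) -> int:
    return max(panel_interval_s + datasource_scrape_interval_s,
               4 * datasource_scrape_interval_s)

# Default datasource setting of 15s clamps $__rate_interval to at least 1m:
print(grafana_rate_interval(15, 15))  # 60
# Setting the datasource to 30s raises the minimum to 2m, as suggested above:
print(grafana_rate_interval(30, 30))  # 120
```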
>>>>>>>>> >>>>>>>>> I don't believe that helps with the reset problem though, right? I >>>>>>>>> retried the queries using 2m instead of 1m and they still exhibit the >>>>>>>>> same problem. >>>>>>>>> >>>>>>>>> Is there any more data I can get you to help debug the problem? We >>>>>>>>> see this happen multiple times per day, and it's making it difficult >>>>>>>>> to monitor our systems in production. >>>>>>>>> On Wednesday, April 6, 2022 at 1:53:26 PM UTC+1 [email protected] >>>>>>>>> wrote: >>>>>>>>>> Yup, PromQL thinks there's a small dip in the data. I'm not sure why >>>>>>>>>> tho. I took your raw values: >>>>>>>>>> >>>>>>>>>> 225201 >>>>>>>>>> 225226 >>>>>>>>>> 225249 >>>>>>>>>> 225262 >>>>>>>>>> 225278 >>>>>>>>>> 225310 >>>>>>>>>> 225329 >>>>>>>>>> 225363 >>>>>>>>>> 225402 >>>>>>>>>> 225437 >>>>>>>>>> 225466 >>>>>>>>>> 225492 >>>>>>>>>> 225529 >>>>>>>>>> 225555 >>>>>>>>>> 225595 >>>>>>>>>> >>>>>>>>>> $ awk '{print $1-225201}' values >>>>>>>>>> 0 >>>>>>>>>> 25 >>>>>>>>>> 48 >>>>>>>>>> 61 >>>>>>>>>> 77 >>>>>>>>>> 109 >>>>>>>>>> 128 >>>>>>>>>> 162 >>>>>>>>>> 201 >>>>>>>>>> 236 >>>>>>>>>> 265 >>>>>>>>>> 291 >>>>>>>>>> 328 >>>>>>>>>> 354 >>>>>>>>>> 394 >>>>>>>>>> >>>>>>>>>> I'm not seeing the reset there. >>>>>>>>>> >>>>>>>>>> One thing I noticed, your data interval is 60 seconds and you are >>>>>>>>>> doing a rate(counter[1m]). This is not going to work reliably, >>>>>>>>>> because you are likely to not have two samples in the same step >>>>>>>>>> window. This is because Prometheus uses millisecond timestamps, so >>>>>>>>>> if you have timestamps at these times: >>>>>>>>>> >>>>>>>>>> 5.335 >>>>>>>>>> 65.335 >>>>>>>>>> 125.335 >>>>>>>>>> >>>>>>>>>> Then you do a rate(counter[1m]) at time 120 (Grafana attempts to >>>>>>>>>> align queries to even minutes for consistency), the only sample >>>>>>>>>> you'll get back is 65.335. >>>>>>>>>> >>>>>>>>>> You need to do rate(counter[2m]) in order to avoid problems. 
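[Editor's note: the millisecond-offset example above can be checked directly. With 60s scrapes landing at 5.335, 65.335, 125.335 and an evaluation step aligned to t=120, a 1m window catches only one sample; a sketch of which samples fall inside a lookback window (boundary handling simplified):]

```python
# Which raw samples land inside a PromQL lookback window (eval_time - window,
# eval_time]? With 60s scrapes offset by 5.335s, a rate[1m] evaluated at the
# aligned step t=120 sees a single sample, so rate() has nothing to work with.

def samples_in_window(timestamps, eval_time, window):
    return [t for t in timestamps if eval_time - window < t <= eval_time]

scrape_times = [5.335, 65.335, 125.335]
print(samples_in_window(scrape_times, 120, 60))   # [65.335] -> rate[1m] is empty
print(samples_in_window(scrape_times, 120, 120))  # [5.335, 65.335] -> rate[2m] works
```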
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, Apr 6, 2022 at 2:45 PM Sam Rose <[email protected]> wrote: >>>>>>>>>>> I just learned about the resets() function and applying it does >>>>>>>>>>> seem to show that a reset occurred: >>>>>>>>>>> >>>>>>>>>>> { >>>>>>>>>>> "request": { >>>>>>>>>>> "url": >>>>>>>>>>> "api/datasources/proxy/1/api/v1/query_range?query=resets(counter[1m])&start=1649239200&end=1649240100&step=60", >>>>>>>>>>> "method": "GET", >>>>>>>>>>> "hideFromInspector": false >>>>>>>>>>> }, >>>>>>>>>>> "response": { >>>>>>>>>>> "status": "success", >>>>>>>>>>> "data": { >>>>>>>>>>> "resultType": "matrix", >>>>>>>>>>> "result": [ >>>>>>>>>>> { >>>>>>>>>>> "metric": {/* redacted */}, >>>>>>>>>>> "values": [ >>>>>>>>>>> [ >>>>>>>>>>> 1649239200, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649239260, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649239320, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649239380, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649239440, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649239500, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649239560, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649239620, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649239680, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649239740, >>>>>>>>>>> "1" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649239800, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649239860, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649239920, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649239980, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649240040, >>>>>>>>>>> "0" >>>>>>>>>>> ], >>>>>>>>>>> [ >>>>>>>>>>> 1649240100, >>>>>>>>>>> "0" >>>>>>>>>>> ] >>>>>>>>>>> ] >>>>>>>>>>> } >>>>>>>>>>> ] >>>>>>>>>>> } >>>>>>>>>>> } >>>>>>>>>>> } >>>>>>>>>>> I don't quite understand how, though. 
>>>>>>>>>>> On Wednesday, April 6, 2022 at 1:40:12 PM UTC+1 Sam Rose wrote: >>>>>>>>>>>> Hi there, >>>>>>>>>>>> >>>>>>>>>>>> We're seeing really large spikes when using the `rate()` function >>>>>>>>>>>> on some of our metrics. I've been able to isolate a single time >>>>>>>>>>>> series that displays this problem, which I'm going to call >>>>>>>>>>>> `counter`. I haven't attached the actual metric labels here, but >>>>>>>>>>>> all of the data you see here is from `counter` over the same time >>>>>>>>>>>> period. >>>>>>>>>>>> >>>>>>>>>>>> This is the raw data, as obtained through a request to >>>>>>>>>>>> /api/v1/query: >>>>>>>>>>>> >>>>>>>>>>>> { >>>>>>>>>>>> "data": { >>>>>>>>>>>> "result": [ >>>>>>>>>>>> { >>>>>>>>>>>> "metric": {/* redacted */}, >>>>>>>>>>>> "values": [ >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239253.4, >>>>>>>>>>>> "225201" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239313.4, >>>>>>>>>>>> "225226" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239373.4, >>>>>>>>>>>> "225249" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239433.4, >>>>>>>>>>>> "225262" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239493.4, >>>>>>>>>>>> "225278" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239553.4, >>>>>>>>>>>> "225310" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239613.4, >>>>>>>>>>>> "225329" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239673.4, >>>>>>>>>>>> "225363" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239733.4, >>>>>>>>>>>> "225402" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239793.4, >>>>>>>>>>>> "225437" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239853.4, >>>>>>>>>>>> "225466" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239913.4, >>>>>>>>>>>> "225492" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239973.4, >>>>>>>>>>>> "225529" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649240033.4, >>>>>>>>>>>> "225555" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649240093.4, >>>>>>>>>>>> "225595" >>>>>>>>>>>> ] >>>>>>>>>>>> ] 
>>>>>>>>>>>> } >>>>>>>>>>>> ], >>>>>>>>>>>> "resultType": "matrix" >>>>>>>>>>>> }, >>>>>>>>>>>> "status": "success" >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> The next query is taken from the Grafana query inspector, because >>>>>>>>>>>> for reasons I don't understand I can't get Prometheus to give me >>>>>>>>>>>> any data when I issue the same query to /api/v1/query_range. The >>>>>>>>>>>> query is the same as the above query, but wrapped in a rate([1m]): >>>>>>>>>>>> >>>>>>>>>>>> "request": { >>>>>>>>>>>> "url": >>>>>>>>>>>> "api/datasources/proxy/1/api/v1/query_range?query=rate(counter[1m])&start=1649239200&end=1649240100&step=60", >>>>>>>>>>>> "method": "GET", >>>>>>>>>>>> "hideFromInspector": false >>>>>>>>>>>> }, >>>>>>>>>>>> "response": { >>>>>>>>>>>> "status": "success", >>>>>>>>>>>> "data": { >>>>>>>>>>>> "resultType": "matrix", >>>>>>>>>>>> "result": [ >>>>>>>>>>>> { >>>>>>>>>>>> "metric": {/* redacted */}, >>>>>>>>>>>> "values": [ >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239200, >>>>>>>>>>>> "0" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239260, >>>>>>>>>>>> "0" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239320, >>>>>>>>>>>> "0" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239380, >>>>>>>>>>>> "0" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239440, >>>>>>>>>>>> "0" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239500, >>>>>>>>>>>> "0" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239560, >>>>>>>>>>>> "0" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239620, >>>>>>>>>>>> "0" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239680, >>>>>>>>>>>> "0" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239740, >>>>>>>>>>>> "9391.766666666665" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239800, >>>>>>>>>>>> "0" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239860, >>>>>>>>>>>> "0" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239920, >>>>>>>>>>>> "0" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649239980, >>>>>>>>>>>> "0" >>>>>>>>>>>> ], 
>>>>>>>>>>>> [ >>>>>>>>>>>> 1649240040, >>>>>>>>>>>> "0.03333333333333333" >>>>>>>>>>>> ], >>>>>>>>>>>> [ >>>>>>>>>>>> 1649240100, >>>>>>>>>>>> "0" >>>>>>>>>>>> ] >>>>>>>>>>>> ] >>>>>>>>>>>> } >>>>>>>>>>>> ] >>>>>>>>>>>> } >>>>>>>>>>>> } >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> Given the gradual increase in the underlying counter, I have two >>>>>>>>>>>> questions: >>>>>>>>>>>> >>>>>>>>>>>> 1. How come the rate is 0 for all except 2 datapoints? >>>>>>>>>>>> 2. How come there is one enormous datapoint in the rate query, >>>>>>>>>>>> that is seemingly unexplained in the raw data? >>>>>>>>>>>> >>>>>>>>>>>> For 2 I've seen in other threads that the explanation is an >>>>>>>>>>>> unintentional counter reset, caused by scrapes a millisecond apart >>>>>>>>>>>> that make the counter appear to go down for a single scrape >>>>>>>>>>>> interval. I don't think I see this in our raw data, though. >>>>>>>>>>>> >>>>>>>>>>>> We're using Prometheus version 2.26.0, revision >>>>>>>>>>>> 3cafc58827d1ebd1a67749f88be4218f0bab3d8d, go version go1.16.2. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> You received this message because you are subscribed to the Google >>>>>>>>>>> Groups "Prometheus Users" group. >>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>>>>>>> send an email to [email protected]. >>>>>>>>>>> To view this discussion on the web visit >>>>>>>>>>> https://groups.google.com/d/msgid/prometheus-users/c1b7568b-f7f9-4edc-943a-22412658975fn%40googlegroups.com.

