Re: [prometheus-users] Prometheus collect the metric fail
Would it be so hard to make the error message include the position where the problem was?

Thanks,
Mike

prometheus-users@googlegroups.com wrote on 11/12/2021 01:06:11 AM:

> From: "易Richard"
> To: "Prometheus Users"
> Date: 11/12/2021 01:06 AM
> Subject: [EXTERNAL] [prometheus-users] Prometheus collect the metric fail
> Sent by: prometheus-users@googlegroups.com
>
> Prometheus fails to collect from an endpoint. The error message is
> "expected equal, got INVALID".
> I checked the stderr output but didn't find any clue about the error.
>
> [image removed]
>
> The metric content from the failing endpoint looks normal.
> Should I upload the whole metric content? It is quite long.
> --
> You received this message because you are subscribed to the Google
> Groups "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/a9b67778-4657-4dd9-9a01-044094f7bc3an%40googlegroups.com.

--
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/OF7B5A36C9.6EBD5867-ON8525878B.0058760F-8525878B.005880EE%40ibm.com.
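Until the error message carries a position, one way to narrow down the offending line is to check the scraped text yourself. The following is a minimal sketch, not the Prometheus parser: it uses a simplified regular expression for sample lines in the text exposition format, skips comment and blank lines, and reports the line numbers that do not fit. The function name `find_bad_lines` and the sample metric names are hypothetical, and the pattern is deliberately looser than the real grammar (for example, it accepts any non-space token as a value).

```python
import re

# One label pair: name="value", allowing backslash escapes inside the value.
LABEL = r'[a-zA-Z_][a-zA-Z0-9_]*\s*=\s*"(?:[^"\\]|\\.)*"'

# Simplified shape of one sample line (an assumption, not the official grammar).
SAMPLE = re.compile(
    r'^[a-zA-Z_:][a-zA-Z0-9_:]*'                       # metric name
    r'(?:\s*\{\s*(?:' + LABEL +                        # optional label set
    r'(?:\s*,\s*' + LABEL + r')*\s*,?)?\s*\})?'
    r'\s+\S+'                                          # sample value
    r'(?:\s+-?\d+)?\s*$'                               # optional timestamp
)


def find_bad_lines(text):
    """Return (line_number, line) pairs that do not look like valid samples."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Blank lines and comment lines (# HELP, # TYPE, ...) are skipped.
        if not stripped or stripped.startswith("#"):
            continue
        if not SAMPLE.match(stripped):
            bad.append((number, line))
    return bad


if __name__ == "__main__":
    # Hypothetical scrape body; the second line has a hyphen in a label name.
    body = 'foo{a="1"} 3\nbar{a-b="1"} 4\n# HELP x y\nbaz 5 1594086829277\n'
    for number, line in find_bad_lines(body):
        print(f"line {number}: {line}")
```

Saving the endpoint's response to a file and passing its contents to `find_bad_lines` gives a short list of candidate lines to inspect by eye, which may be enough when the full output is too long to post.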
Re: [prometheus-users] Re: my counters start at zero
I have a specific scenario. I have counters that start at zero when the scraped process starts; they are counting something that happens in the scraped process. If a counter first appears with a non-zero value, I know all those counts happened since the previous scrape. I am not asserting that `rate()` should be changed for everybody. Is there a PromQL query I can write that will behave similarly to `rate()` but will recognize that an initial non-zero count is due to increments since the previous scrape of the same process (yes, restricted to the situations where the process has been scraped before)?

Thanks,
Mike

prometheus-users@googlegroups.com wrote on 07/29/2020 03:27:03 AM:

> From: Brian Candler
> To: Prometheus Users
> Date: 07/29/2020 03:27 AM
> Subject: [EXTERNAL] [prometheus-users] Re: my counters start at zero
> Sent by: prometheus-users@googlegroups.com
>
> rate() calculates the rate between the first and last available
> samples in the given time window, as long as there are at least two samples.
>
> irate() calculates the rate between the last two samples in the
> given time window.
>
> On Wednesday, 29 July 2020 05:25:04 UTC+1, Mike Spreitzer wrote:
> > Now suppose instead that foo first shows up in a scrape at time t0
> > with a value of 10, and in every scrape after that the value of foo
> > is also 10. What will `rate(foo[60s])` give me? If I understand
> > correctly, it will give me nothing until time t0+60s, and from then
> > on it will give me zero. Have I got this right?
>
> It will show a rate of 0 as soon as two values are available, that
> is, from t0+10s onwards.
>
> If a new counter appears with value 10, it tells you nothing about
> the rate just before the counter appeared. It may be that scraping was
> broken, and the counter had value 10 for the last year. It could be
> that the counter had been going 1-2-3-4-5-6-7-8-9-10 at intervals
> of 10 seconds. Or at intervals of 1 week.
>
> As a real-world example, it is very common to start polling an SNMP
> device and find its interface byte counters already at huge values,
> reflecting how much traffic has been carried in total by that
> interface since the device was powered on. It would be completely
> wrong to have an enormous blip which effectively compresses months
> or years of traffic into one sample interval.
> --
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/b9dfe865-3be6-414f-b6f9-7e55caa52196o%40googlegroups.com.

--
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/OF2A889B1D.27215406-ON852585B6.00240C50-852585B6.00247512%40notes.na.collabserv.com.
[prometheus-users] my counters start at zero
Suppose I have a counter metric; let's name it `foo`. Suppose foo first shows up with a value of 0 in a scrape at time t0, shows up with a value of 10 in a scrape at time t0+10s, and has value 10 in all subsequent scrapes. What will the PromQL expression `rate(foo[60s])` get me? I suppose nothing until time t0+60s; some non-zero value from t0+60s to t0+70s; and zero from t0+70s onward. Is that right? If not, what will I get?

Now suppose instead that foo first shows up in a scrape at time t0 with a value of 10, and in every scrape after that the value of foo is also 10. What will `rate(foo[60s])` give me? If I understand correctly, it will give me nothing until time t0+60s, and from then on it will give me zero. Have I got this right?

That is a rather disappointing answer. This counter really did start at zero, and got 10 increments before the first scrape. It would be gratifying to have a PromQL query that shows this blip of activity. Can I write a different PromQL query that will get this result? While retaining all the other smarts of `rate`?

Thanks,
Mike

--
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/954b128d-fdf9-4829-86c7-8645fde91cb3o%40googlegroups.com.
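The two scenarios in this message can be replayed with a small simulation. This is only a sketch of the behaviour as described in the replies on this thread (first and last samples in the window, at least two samples required); it is not Prometheus's implementation. In particular, the real `rate()` also extrapolates toward the window edges, so the non-zero numbers here will not match what a server returns; only the pattern of "no result", "non-zero", and "zero" is the point. The function name `simple_rate`, the 10s scrape interval, and the closed window `[now - window, now]` are assumptions made for illustration.

```python
def simple_rate(samples, now, window):
    """Per-second increase between the first and last samples in the window.

    samples: list of (timestamp_seconds, counter_value), sorted by time.
    Returns None when fewer than two samples fall inside the window.
    Simplification: no extrapolation to the window edges.
    """
    in_window = [(t, v) for t, v in samples if now - window <= t <= now]
    if len(in_window) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(in_window, in_window[1:]):
        # A drop in value is treated as a counter reset: count up from zero.
        increase += cur - prev if cur >= prev else cur
    t_first, t_last = in_window[0][0], in_window[-1][0]
    return increase / (t_last - t_first)


# Scenario 1: foo is 0 at t0, 10 at t0+10s, and 10 afterwards (t0 = 0).
starts_at_zero = [(0, 0)] + [(t, 10) for t in range(10, 101, 10)]
# Scenario 2: foo is 10 from the very first scrape.
starts_at_ten = [(t, 10) for t in range(0, 101, 10)]

if __name__ == "__main__":
    for now in (0, 10, 60, 70):
        print(now,
              simple_rate(starts_at_zero, now, 60),
              simple_rate(starts_at_ten, now, 60))
```

In this model the first scenario yields a result as soon as the second sample arrives at t0+10s, stays non-zero while the sample from t0 is still inside the window, and drops to zero at t0+70s. The second scenario yields zero from t0+10s onward, because the first observed value of 10 is never compared against anything earlier.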
[prometheus-users] Re: No joy from Prometheus snapshot
Interestingly, the Prometheus server running against the new snapshot logged some compactions after about a minute.

...
level=info ts=2020-07-07T01:57:08.694Z caller=main.go:646 msg="Server is ready to receive web requests."
level=info ts=2020-07-07T01:58:18.185Z caller=compact.go:441 component=tsdb msg="compact blocks" count=3 mint=159401520 maxt=159403680 ulid=01ECKFX8HKYEP8Y2YPX2WB0PC3 sources="[01ECHNMH8MRPZYHM8KXKBE60ZA 01ECHWG8GCAZWMW1JDZB9ERQ0N 01ECJ3BZRM3SP7XQAZRJ867C30]" duration=9.494006165s
level=info ts=2020-07-07T01:58:27.860Z caller=compact.go:441 component=tsdb msg="compact blocks" count=3 mint=159403680 maxt=159405840 ulid=01ECKFXHV47D51X0GREQM6QZE3 sources="[01ECJA7Q18RT0VW9S5HKSVDP0J 01ECJH3E8ENEWVK8E41G5XWM71 01ECJQZ5GB6TGV21QMHCWZ3JFY]" duration=9.64782793s
level=info ts=2020-07-07T01:58:37.619Z caller=compact.go:441 component=tsdb msg="compact blocks" count=3 mint=159405840 maxt=159408000 ulid=01ECKFXV9PSG91VCTAMAVKZ1YR sources="[01ECJYTWRJZG6YSGH3XBTJK6CX 01ECK5PQETDE8VDRSR6NZ6FGXQ 01ECKCJEPAJ19AJ7DXJY6K6P6B]" duration=9.725054346s
level=info ts=2020-07-07T01:58:53.953Z caller=compact.go:441 component=tsdb msg="compact blocks" count=3 mint=159401520 maxt=159408000 ulid=01ECKFY4TFCY2J3KQ30VSZFRKS sources="[01ECKFX8HKYEP8Y2YPX2WB0PC3 01ECKFXHV47D51X0GREQM6QZE3 01ECKFXV9PSG91VCTAMAVKZ1YR]" duration=16.305728942s

This server logged finding 10 blocks when it started (see my previous email); those compactions compacted the first 9 blocks into one.

--
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/67ceeb43-6e7c-43b4-96a1-519d7405c6dao%40googlegroups.com.
[prometheus-users] Re: No joy from Prometheus snapshot
I suspected the relative directory and then noticed the double equal in the prometheus command line. So I erased the old snapshot and tried again.

sysop@r26data0:/var/lib/prometheus/snapshots$ ls
sysop@r26data0:/var/lib/prometheus/snapshots$ curl -X POST http://localhost:30909/api/v1/admin/tsdb/snapshot
{"status":"success","data":{"name":"20200707T015331Z-b7fbfbcafd915bb"}}
sysop@r26data0:/var/lib/prometheus/snapshots$
sysop@r26data0:/var/lib/prometheus/snapshots$ ls -la
total 12
drwxr-xr-x 3 nobody nogroup 4096 Jul 7 01:53 .
drwxr-xr-x 14 nobody 65533 4096 Jul 7 00:59 ..
drwxr-xr-x 12 nobody nogroup 4096 Jul 7 01:53 20200707T015331Z-b7fbfbcafd915bb

Next I run the server again. This time it logs messages about finding data blocks.

sysop@r26data0:/var/lib/prometheus/snapshots/20200707T015331Z-b7fbfbcafd915bb$ sudo -u nobody ~/prometheus --storage.tsdb.path=$PWD --web.enable-admin-api --config.file=$HOME/prom-config/config.yaml
level=info ts=2020-07-07T01:57:08.662Z caller=main.go:302 msg="No time or size retention was set so using the default time retention" duration=15d
level=info ts=2020-07-07T01:57:08.662Z caller=main.go:337 msg="Starting Prometheus" version="(version=2.19.1, branch=HEAD, revision=eba3fdcbf0d378b66600281903e3aab515732b39)"
level=info ts=2020-07-07T01:57:08.662Z caller=main.go:338 build_context="(go=go1.14.4, user=root@62700b3d0ef9, date=20200618-16:35:26)"
level=info ts=2020-07-07T01:57:08.662Z caller=main.go:339 host_details="(Linux 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 r26data0 (none))"
level=info ts=2020-07-07T01:57:08.662Z caller=main.go:340 fd_limits="(soft=65535, hard=65535)"
level=info ts=2020-07-07T01:57:08.662Z caller=main.go:341 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2020-07-07T01:57:08.664Z caller=main.go:678 msg="Starting TSDB ..."
level=info ts=2020-07-07T01:57:08.664Z caller=web.go:524 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2020-07-07T01:57:08.664Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=159401520 maxt=159402240 ulid=01ECHNMH8MRPZYHM8KXKBE60ZA
level=info ts=2020-07-07T01:57:08.664Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=159402240 maxt=159402960 ulid=01ECHWG8GCAZWMW1JDZB9ERQ0N
level=info ts=2020-07-07T01:57:08.664Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=159402960 maxt=159403680 ulid=01ECJ3BZRM3SP7XQAZRJ867C30
level=info ts=2020-07-07T01:57:08.664Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=159403680 maxt=159404400 ulid=01ECJA7Q18RT0VW9S5HKSVDP0J
level=info ts=2020-07-07T01:57:08.664Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=159404400 maxt=159405120 ulid=01ECJH3E8ENEWVK8E41G5XWM71
level=info ts=2020-07-07T01:57:08.664Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=159405120 maxt=159405840 ulid=01ECJQZ5GB6TGV21QMHCWZ3JFY
level=info ts=2020-07-07T01:57:08.665Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=159405840 maxt=159406560 ulid=01ECJYTWRJZG6YSGH3XBTJK6CX
level=info ts=2020-07-07T01:57:08.665Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=159406560 maxt=159407280 ulid=01ECK5PQETDE8VDRSR6NZ6FGXQ
level=info ts=2020-07-07T01:57:08.665Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=159407280 maxt=159408000 ulid=01ECKCJEPAJ19AJ7DXJY6K6P6B
level=info ts=2020-07-07T01:57:08.665Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=159408000 maxt=1594086829277 ulid=01ECKFMSZ5XDWD5296R3C8ZXZ6
level=info ts=2020-07-07T01:57:08.688Z caller=head.go:645 component=tsdb msg="Replaying WAL and on-disk memory mappable chunks if any, this may take a while"
level=info ts=2020-07-07T01:57:08.688Z caller=head.go:706 component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
level=info ts=2020-07-07T01:57:08.688Z caller=head.go:709 component=tsdb msg="WAL replay completed" duration=365.54µs
level=info ts=2020-07-07T01:57:08.690Z caller=main.go:694 fs_type=EXT4_SUPER_MAGIC
level=info ts=2020-07-07T01:57:08.690Z caller=main.go:695 msg="TSDB started"
level=info ts=2020-07-07T01:57:08.690Z caller=main.go:799 msg="Loading configuration file" filename=/home/sysop/prom-config/config.yaml
level=info ts=2020-07-07T01:57:08.694Z caller=main.go:827 msg="Completed loading of configuration file" filename=/home/sysop/prom-config/config.yaml
level=info ts=2020-07-07T01:57:08.694Z caller=main.go:646 msg="Server is ready to receive web requests."

The last data block ends about 5 minutes ago.

sysop@r26data0:~$ date --date @1594086829
Tue Jul 7 01:53:49 UTC 2020

But still no data.

sysop@r26data0:~$ curl http://localhost:9090/api/v1/metadata
{"status":"success","data":{}}sysop@r26data0:~$

The web UI shows the following for "/status".

Runtime