[+CCing back prometheus-users, which I had accidentally removed] How similar are the others? The timestamps in your example are from this afternoon (2:57pm UTC); I guess that's when you first downloaded the file for grepping?
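To illustrate why those timestamps matter: the last column of the grepped Netdata output below is a client-side timestamp in milliseconds since the Unix epoch, and Prometheus instant queries only look back 5 minutes from the evaluation time. A minimal Python sketch of that staleness check (the helper function and the chosen evaluation time are illustrative, not part of Prometheus):

```python
from datetime import datetime, timedelta, timezone

# Netdata exposes each sample with an explicit timestamp in milliseconds
# since the Unix epoch -- the last column of the grepped metric lines.
sample_ts_ms = 1589122673736
sample_time = datetime.fromtimestamp(sample_ts_ms / 1000, tz=timezone.utc)
print(sample_time)  # 2020-05-10 14:57:53 UTC, i.e. ~2:57pm

# Prometheus instant queries only look back 5 minutes by default
# (--query.lookback-delta), so a plain selector like `foo` returns no
# sample for a series whose latest sample is older than that.
def visible_in_instant_query(sample_time, eval_time,
                             lookback=timedelta(minutes=5)):
    return eval_time - sample_time <= lookback

# At ~9:37pm UTC (when the question was sent) the 2:57pm sample is
# already over six hours stale:
eval_time = datetime(2020, 5, 10, 21, 37, tzinfo=timezone.utc)
print(visible_in_instant_query(sample_time, eval_time))  # False
```

This is why the series still shows up in a range selector like `foo[24h]` but not in an instant query.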
A regular instant vector selector in PromQL (like just "foo") will only select data points up to 5 minutes into the past from the current evaluation timestamp. So the table view would not show samples for any series whose last sample is more than 5m in the past. You could try a range selector like "foo[24h]" on these to see whether any historical data is returned (I would expect so).

On Sun, May 10, 2020 at 9:37 PM Yashar Nesabian <[email protected]> wrote:

> Sure, here it is:
> If the second parameter is the timestamp, then yes, that's the problem, but
> I wonder how come the other metrics are stored by the Prometheus server,
> because they also have a similar timestamp.
>
> grep -i "netdata_web_log_detailed_response_codes_total" allmetrics\?format=prometheus_all_hosts\&source=as-collected.2 | grep -i "abs"
> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="200",instance="abs-02.x.y.zabs"} 245453 1589122673736
> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="400",instance="abs-02.x.y.zabs"} 82 1589122673736
> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="401",instance="abs-02.x.y.zabs"} 6 1589122673736
> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="200",instance="abs-04.x.y.zabs"} 238105 1589122673017
> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="400",instance="abs-04.x.y.zabs"} 59 1589122673017
> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="401",instance="abs-04.x.y.zabs"} 3 1589122673017
> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="200",instance="abs-03.x.y.zabs"} 241708 1589122673090
> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="400",instance="abs-03.x.y.zabs"} 68 1589122673090
> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="401",instance="abs-03.x.y.zabs"} 5 1589122673090
> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="200",instance="abs-01.x.y.zabs"} 250296 1589122674872
> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="400",instance="abs-01.x.y.zabs"} 81 1589122674872
> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="401",instance="abs-01.x.y.zabs"} 7 1589122674872
>
> On Sun, May 10, 2020 at 10:36 PM Julius Volz <[email protected]> wrote:
>
>> Hmm, odd. Could you share some of the lines that your grep finds in the
>> metrics output of the correctly scraped target?
>>
>> The example at the top of https://github.com/netdata/netdata/issues/3891
>> suggests that Netdata sets client-side timestamps for samples (which is
>> uncommon for Prometheus otherwise). Maybe those timestamps are too far in
>> the past (more than 5 minutes), so they would not be shown anymore?
>>
>> On Sun, May 10, 2020 at 6:51 PM Yashar Nesabian <[email protected]> wrote:
>>
>>> I have a job on Prometheus which gathers metrics from 4 netdata master
>>> servers.
>>> Here is the scenario I had:
>>>
>>> - On netdata master number 1, I gather metrics from about 200 slaves.
>>> - For some reason, I decided to move 12 slaves
>>>   (a1,a2,a3,a4,b1,b2,b3,b4,c1,c2,c3,c4) from the first netdata master to
>>>   the second netdata master.
>>> - Now I only see metrics from 8 servers (a1,a2,a3,a4,b1,b2,b3,b4) on the
>>>   Prometheus server, coming from the second master.
>>> - I check the job status on the targets page and I see all 4 masters are
>>>   up and metrics are gathered successfully.
>>> - Here is the URL which Prometheus uses to read the metrics from netdata
>>>   master number 2:
>>>   http://172.16.76.152:19999/api/v1/allmetrics?format=prometheus_all_hosts
>>> - I grep the downloaded file of host metrics for the c1,c2,c3,c4 hosts
>>>   and I see netdata is sending all the metrics relevant to these slaves.
>>> - But when I search for the metric on the Graph page, I don't see any
>>>   results:
>>>
>>> [image: Screenshot from 2020-05-10 20-58-27.png]
>>>
>>> All the servers' times are synced and correct.
>>> Here is the output of systemctl status prometheus:
>>>
>>> May 10 19:35:07 devops-mon-01 systemd[1]: Reloading Prometheus.
>>> May 10 19:35:07 devops-mon-01 prometheus[6076]: level=info ts=2020-05-10T15:05:07.407Z caller=main.go:734 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
>>> May 10 19:35:07 devops-mon-01 prometheus[6076]: level=info ts=2020-05-10T15:05:07.416Z caller=main.go:762 msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml
>>> May 10 19:35:07 devops-mon-01 systemd[1]: Reloaded Prometheus.
>>> May 10 19:53:22 devops-mon-01 prometheus[6076]: level=error ts=2020-05-10T15:23:22.621Z caller=api.go:1347 component=web msg="error writing response" bytesWritten=0 err="write tcp 172.16.77.50:9090->172.16.76.168:56778: write: broken pipe"
>>> May 10 20:25:53 devops-mon-01 prometheus[6076]: level=error ts=2020-05-10T15:55:53.058Z caller=api.go:1347 component=web msg="error writing response" bytesWritten=0 err="write tcp 172.16.77.50:9090->172.16.76.168:41728: write: broken pipe"
>>>
>>> 172.16.77.50 is our Prometheus server and 172.16.76.168 is our grafana
>>> server, so I think the last error is not related to my problem.
>>
>> --
>> Julius Volz
>> PromLabs - promlabs.com
>
> --
> *Best Regards*
> *Yashar Nesabian*
> *Senior Site Reliability Engineer*

--
Julius Volz
PromLabs - promlabs.com

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/CAObpH5yC5-W2cSAqy3%3DC48JqLZWO2U4RNDGXAT9ty5Jr8odtzw%40mail.gmail.com.
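[Aside on the scrape setup discussed above: since the root cause is Netdata's client-side timestamps falling outside Prometheus's 5-minute lookback window, one workaround is the `honor_timestamps` scrape option (a real `scrape_config` field, default `true`). A minimal sketch, assuming the job name; the target and URL parameters are taken from the thread:]

```yaml
scrape_configs:
  - job_name: 'netdata'                # job name is an assumption
    metrics_path: /api/v1/allmetrics
    params:
      format: [prometheus_all_hosts]
    # Netdata attaches its own (client-side) timestamps to samples.
    # With honor_timestamps: false, Prometheus stamps samples with the
    # scrape time instead, so they always fall within the 5m lookback.
    honor_timestamps: false
    static_configs:
      - targets: ['172.16.76.152:19999']
```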

