Hi,
Could you try with honor_timestamps: false?

Thanks

On 10 May 13:13, Yashar Nesabian wrote:
> The other slaves have 2-3 seconds difference with the timestamp of these
> metrics, and yes the 2:57pm UTC is almost correct (I don't know the exact
> time) and using foo[24h] is not very informative right now because we still
> have the previous metrics when the slaves were on netdata master number 1.
> I did another experiment, I downloaded the metric files again and ran the
> command (date +%s) on the Prometheus server almost at the same time,
> The metrics' timestamp was 1589141392868 and the server's timestamp
> was 1589141393 So I think this is not the problem
>
> On Monday, May 11, 2020 at 12:19:23 AM UTC+4:30, Julius Volz wrote:
> >
> > [+CCing back prometheus-users, which I had accidentally removed]
> >
> > How similar are the others? The ones in your example are from this
> > afternoon (2:57pm UTC), I guess that's when you downloaded the file for
> > grepping first?
> >
> > A regular instant vector selector in PromQL (like just "foo") will only
> > select data points up to 5 minutes into the past from the current
> > evaluation timestamp. So the table view would not show samples for any
> > series whose last sample is more than 5m into the past. You could try a
> > range selector like "foo[24h]" on these to see if any historical data is
> > returned (I would expect so).
> >
> > On Sun, May 10, 2020 at 9:37 PM Yashar Nesabian <[email protected]> wrote:
> >
> >> Sure, here it is:
> >> if the second parameter is the timestamp, then yes that's the problem,
> >> but I wonder how come other metrics are stored by the Prometheus server?
> >> because they also have a similar timestamp
> >>
> >> grep -i "netdata_web_log_detailed_response_codes_total" allmetrics\?format=prometheus_all_hosts\&source=as-collected.2 | grep -i "abs"
> >> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="200",instance="abs-02.x.y.zabs"} 245453 1589122673736
> >> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="400",instance="abs-02.x.y.zabs"} 82 1589122673736
> >> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="401",instance="abs-02.x.y.zabs"} 6 1589122673736
> >> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="200",instance="abs-04.x.y.zabs"} 238105 1589122673017
> >> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="400",instance="abs-04.x.y.zabs"} 59 1589122673017
> >> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="401",instance="abs-04.x.y.zabs"} 3 1589122673017
> >> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="200",instance="abs-03.x.y.zabs"} 241708 1589122673090
> >> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="400",instance="abs-03.x.y.zabs"} 68 1589122673090
> >> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="401",instance="abs-03.x.y.zabs"} 5 1589122673090
> >> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="200",instance="abs-01.x.y.zabs"} 250296 1589122674872
> >> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="400",instance="abs-01.x.y.zabs"} 81 1589122674872
> >> netdata_web_log_detailed_response_codes_total{chart="web_log_passenger_event.detailed_response_codes",family="responses",dimension="401",instance="abs-01.x.y.zabs"} 7 1589122674872
> >>
> >> On Sun, May 10, 2020 at 10:36 PM Julius Volz <[email protected]> wrote:
> >>
> >>> Hmm, odd. Could you share some of the lines that your grep finds in the
> >>> metrics output of the correctly scraped target?
> >>>
> >>> The example at the top of https://github.com/netdata/netdata/issues/3891
> >>> suggests that Netdata sets client-side timestamps for samples (which is
> >>> uncommon for Prometheus otherwise). Maybe those timestamps are too far in
> >>> the past (more than 5 minutes), so they would not be shown anymore?
> >>>
> >>> On Sun, May 10, 2020 at 6:51 PM Yashar Nesabian <[email protected]> wrote:
> >>>
> >>>> I have a job on the Prometheus which gathers metrics from 4 netdata
> >>>> master servers. Here is the scenario I had:
> >>>> - on netdata master number 1, I gather metrics of about 200 slaves
> >>>> - For some reason, I decided to move 12 slaves
> >>>> (a1,a2,a3,a4,b1,b2,b3,b4,c1,c2,c3,c4) from the first netdata master to
> >>>> the second netdata master
> >>>> - Now I only see metrics from 8 servers on the Prometheus server
> >>>> a1,a2,a3,a4,b1,b2,b3,b4) coming from the second master
> >>>> - I check the job status in the targets page and I see all 4 masters
> >>>> are up and metrics are gathered successfully
> >>>> - Here is the URL which Prometheus uses to read the metrics from the
> >>>> netdata master number 2:
> >>>> http://172.16.76.152:19999/api/v1/allmetrics?format=prometheus_all_hosts
> >>>> - I grep the downloaded file with hosts metrics for the c1,c2,c3,c4
> >>>> hosts and I see netdata is sending all the metrics relevant to these
> >>>> slaves
> >>>> - But when I search for the metric in the Graph page, I don't see any
> >>>> results:
> >>>>
> >>>> [image: Screenshot from 2020-05-10 20-58-27.png]
> >>>>
> >>>> all the servers' time is synced and are correct.
> >>>> here is the output of systemctl status prometheus:
> >>>>
> >>>> May 10 19:35:07 devops-mon-01 systemd[1]: Reloading Prometheus.
> >>>> May 10 19:35:07 devops-mon-01 prometheus[6076]: level=info ts=2020-05-10T15:05:07.407Z caller=main.go:734 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
> >>>> May 10 19:35:07 devops-mon-01 prometheus[6076]: level=info ts=2020-05-10T15:05:07.416Z caller=main.go:762 msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml
> >>>> May 10 19:35:07 devops-mon-01 systemd[1]: Reloaded Prometheus.
> >>>> May 10 19:53:22 devops-mon-01 prometheus[6076]: level=error ts=2020-05-10T15:23:22.621Z caller=api.go:1347 component=web msg="error writing response" bytesWritten=0 err="write tcp 172.16.77.50:9090->172.16.76.168:56778: write: broken pipe"
> >>>> May 10 20:25:53 devops-mon-01 prometheus[6076]: level=error ts=2020-05-10T15:55:53.058Z caller=api.go:1347 component=web msg="error writing response" bytesWritten=0 err="write tcp 172.16.77.50:9090->172.16.76.168:41728: write: broken pipe"
> >>>>
> >>>> 172.16.77.50 is our Prometheus server and 172.16.76.168 is our grafana
> >>>> server so I think the last error is not related to my problem
> >>>>
> >>>> --
> >>>> You received this message because you are subscribed to the Google
> >>>> Groups "Prometheus Users" group.
> >>>> To unsubscribe from this group and stop receiving emails from it, send
> >>>> an email to [email protected].
> >>>> To view this discussion on the web visit
> >>>> https://groups.google.com/d/msgid/prometheus-users/156d8c36-c1de-4ca3-8b2a-2cfbcb5895fc%40googlegroups.com
> >>>> <https://groups.google.com/d/msgid/prometheus-users/156d8c36-c1de-4ca3-8b2a-2cfbcb5895fc%40googlegroups.com?utm_medium=email&utm_source=footer>
> >>>> .
> >>>>
> >>>
> >>>
> >>> --
> >>> Julius Volz
> >>> PromLabs - promlabs.com
> >>>
> >>
> >>
> >> --
> >>
> >> *Best Regards*
> >>
> >> *Yashar Nesabian*
> >>
> >> *Senior Site Reliability Engineer*
> >>
> >
> >
> > --
> > Julius Volz
> > PromLabs - promlabs.com
> >
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/763aa44f-814a-40c9-b291-da1251329134%40googlegroups.com.
--
Julien Pivotto
@roidelapluie
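
For reference, honor_timestamps is a per-job setting in the scrape configuration in prometheus.yml. Below is a minimal sketch of where it would go, assuming a job named "netdata" (a hypothetical name, the real job name is not shown in this thread) and the target URL for netdata master number 2 quoted earlier. Only the honor_timestamps line is the actual suggestion; the rest is illustrative and should be adapted to the existing job.

```yaml
scrape_configs:
  - job_name: netdata                  # hypothetical job name, not taken from the thread
    metrics_path: /api/v1/allmetrics   # path from the URL quoted in the thread
    params:
      format: [prometheus_all_hosts]   # query parameter from the same URL
    honor_timestamps: false            # ignore timestamps exposed by Netdata
    static_configs:
      - targets: ['172.16.76.152:19999']   # netdata master number 2 (from the thread)
```

With honor_timestamps set to false, Prometheus ignores the timestamps present in the scraped data and stamps each sample with the time of the scrape. The default is true, which is why the client-side timestamps set by Netdata are currently being used.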

