Re: [prometheus-users] Re: Efficient way to query non-active time series's last value

Brian Candler Tue, 01 Sep 2020 02:01:52 -0700

On Tuesday, 1 September 2020 01:55:15 UTC+1, Peter S wrote:
>
> Thanks. Unfortunately, exporting and scraping the same values has become 
> costly for us. We have metrics endpoints of 50MB+, and scrapes have begun 
> to time out more and more often.
>
>
Sorry, can you explain what you mean by "metrics endpoints of 50MB+"?
Where are you measuring 50MB exactly?

If you have 50 million timeseries, that's huge. But I don't think that's
what you mean.

If you are returning 50MB of prometheus line-format data in a single
scrape, that's quite a lot, but it will compress to very little in the TSDB
if the values are not changing.

What's important to prometheus is not the volume of the scrape, but the
number of active timeseries. Timeseries are active if they're in the head,
which means a sample has been seen in the last ~2 hours. Leaving gaps in
the timeseries, when the gaps are less than 2 hours, is not going to save
you any TSDB resources at all, but will cause you problems with staleness
at query time.
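
To see which of the two numbers you are actually dealing with, a rough
sketch like the following compares the size of a scrape with the number of
series in it. It assumes the endpoint returns the plain text exposition
format with one sample per non-comment line; count_series and summarize
are names made up for this illustration.

```python
def count_series(exposition_text):
    """Count sample lines in Prometheus text-format output.

    Assumption: every non-blank line that is not a '#' comment
    (HELP/TYPE) is one sample, i.e. one timeseries in this scrape.
    """
    count = 0
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        count += 1
    return count


def summarize(exposition_text):
    """Return (size in bytes, number of series) for one scrape body."""
    size_bytes = len(exposition_text.encode("utf-8"))
    return size_bytes, count_series(exposition_text)
```

Feeding it the saved body of one scrape gives the byte count and the series
count side by side; the second number is the one that matters for the TSDB.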

What are you trying to optimise: the volume of TSDB storage, or the volume
of network traffic? If it's network traffic then you might be better off
having a local prometheus server right next to where the data is
collected. You can query it directly, via promxy, or with something like
Thanos. In each case, the only traffic will be the query
request/response.

You could also use remote_write to forward data to a central server such as
VictoriaMetrics, although I have not measured how the volume of
remote_write traffic compares with the volume of prometheus line protocol
traffic.

Another option to consider would be to use statsd_exporter or possibly
pushgateway, and have those local to your prometheus server. The remote
metrics updates would be done via statsd or pushgateway updates, and when
they don't change, prometheus just scrapes the same value.
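
As a rough sketch of the statsd variant: the remote side sends a gauge
update only when a value changes, and statsd_exporter keeps serving the last
value to prometheus. The address below is a placeholder (9125 is, as far as
I know, statsd_exporter's default UDP port), and ChangeOnlySender and
udp_sender are hypothetical helpers, not part of any library.

```python
import socket

# Assumption: statsd_exporter is listening on UDP here; adjust to your setup.
STATSD_ADDR = ("127.0.0.1", 9125)


def statsd_gauge(name, value):
    """Format a statsd gauge update, e.g. b'queue_depth:42|g'."""
    return f"{name}:{value}|g".encode("ascii")


def udp_sender(addr=STATSD_ADDR):
    """Return a function that sends one datagram to addr."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    return lambda payload: sock.sendto(payload, addr)


class ChangeOnlySender:
    """Send a gauge update only when the value differs from the last one sent."""

    def __init__(self, send):
        self.send = send   # callable taking the datagram bytes
        self.last = {}     # metric name -> last value sent

    def update(self, name, value):
        if self.last.get(name) == value:
            return False   # unchanged: nothing goes over the network
        self.send(statsd_gauge(name, value))
        self.last[name] = value
        return True
```

Usage would be something like sender = ChangeOnlySender(udp_sender()) followed
by sender.update("queue_depth", 42). Note that UDP is lossy, so a dropped
packet means the old value is served until the next change; an occasional
unconditional resend would cover that.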

Finally, it would be pretty easy to write a proxy which is tailored to your
requirements: each incoming scrape triggers an outbound scrape, merges the
results into a cache, and then returns the whole cache contents.
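
A minimal sketch of such a proxy, using only the Python standard library.
The upstream URL and listen port are placeholders, and it assumes the
upstream serves the plain text format without explicit timestamps. For
brevity it drops HELP/TYPE comment lines (so metrics arrive untyped) and
never expires cache entries.

```python
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

UPSTREAM_URL = "http://127.0.0.1:9100/metrics"  # placeholder target
LISTEN_PORT = 9999                              # placeholder port

cache = {}  # series identity (name + labels) -> latest sample line


def series_key(line):
    """Return the metric name plus label set, without the value."""
    if "}" in line:
        return line[: line.rindex("}") + 1]
    return line.split(None, 1)[0]


def merge_into_cache(cache, exposition_text):
    """Overwrite cached samples with whatever the latest scrape returned."""
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        cache[series_key(line)] = line


def render_cache(cache):
    """Return every cached sample, including ones absent from the last scrape."""
    return "".join(line + "\n" for line in cache.values())


class CachingProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            with urllib.request.urlopen(UPSTREAM_URL, timeout=10) as resp:
                merge_into_cache(cache, resp.read().decode("utf-8"))
        except OSError:
            pass  # upstream unreachable: serve the cache as it stands
        body = render_cache(cache).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main():
    HTTPServer(("", LISTEN_PORT), CachingProxy).serve_forever()
```

Calling main() starts the proxy; prometheus is then pointed at the proxy
instead of the original endpoint, and the upstream only needs to return the
series whose values have changed.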

