Re: [prometheus-users] Re: Simple increase - sum of the metrics in time range with dynamic metric count

2023-09-26 Thread Aliaksandr Valialkin
On Sun, 24 Sept 2023 at 19:43, Michał Idzikowski wrote:

> raw_data - processed_bytes
> increase - increase(processed_bytes)
> sum increase - sum(increase(processed_bytes))
> sum increase range - sum(increase(processed_bytes[$__range]))
>
> What I want to see is ~530G at the end, starting from 0.
>

In VictoriaMetrics you can use the following MetricsQL query for building a
summary increase graph over multiple time series of counter type, which
starts from zero on the left side:

running_sum(sum(increase(metric_name)))

Note that you don't need to specify a lookbehind window in square brackets at
increase(...), since VictoriaMetrics automatically sets it to the interval
between points shown on the graph (aka the step query arg automatically passed
by Grafana to /api/v1/query_range - see
https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries ).
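A small Python sketch of the arithmetic this pipeline performs (the per-step increase values below are made up):

```python
# Per-step increases of two counter series, one value per graph step;
# this is what increase(metric_name) yields at each point.
inc_a = [10, 20, 0, 5]
inc_b = [1, 2, 3, 4]

# sum(increase(...)): add the per-step increases across all series.
summed = [a + b for a, b in zip(inc_a, inc_b)]

# running_sum(...): cumulative total over the graph's time range,
# so the line starts at the first step's increase and never drops.
running = []
total = 0
for v in summed:
    total += v
    running.append(total)

print(running)  # [11, 33, 36, 45]
```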

> I need to manually check, because maybe this graph/metric (sum increase
> range) has the correct final value, just a misleading shape that falls down a
> few times.
>
> On Friday, 22 September 2023 at 15:58:07 UTC+2, Brian Candler wrote:
>
>> You haven't shown any examples of the metrics, nor the queries relating
>> to each of those graphs.
>>
>> Speaking generally though, given that counters can reset, I think the
>> best you can do is to use increase(foo[time]) which will give you an
>> *estimate* of the amount the counter has increased over the given time
>> window (it calculates the rate, skipping over counter resets, and scales it
>> up to cover the whole window period. This may give a non-integer result).
>> You should then be able to sum() over that: sum(increase(foo[time])).
>>
>> Note that it will give you the amount of processing done *in that
>> specified time window*, not since you started monitoring.
>>
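A minimal Python sketch of the reset-skipping part of that estimate (Prometheus additionally extrapolates to the window edges, which is omitted here; the sample values are made up):

```python
def window_increase(samples):
    """Estimate a counter's increase over the samples in a window,
    treating any drop as a counter reset (restart from ~0)."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev
        else:
            # reset: count everything accumulated since the restart
            total += cur
    return total

# The counter resets from 100 back to 3 mid-window:
print(window_increase([90, 100, 3, 10]))  # 20.0
```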
>> You said you tried sum(increase), and maybe it's one of the graphs you
>> showed.  Suppose you made a graph of sum(increase(foo[24h])); then each
>> point on that graph represents the amount of work done *in the 24 hours up
>> to that point*. This value will of course go up and down, since the amount
>> of work done in any given 1 day period may go up and down.
>>
>> You can't possibly know the total amount of work done since the beginning
>> of time, if the counters arbitrarily reset.  You would have to create a
>> persistent, non-resetting counter.
>>
>> increase(sum) is wrong because it can't handle the counter resets
>> properly: see
>> https://www.robustperception.io/rate-then-sum-never-sum-then-rate (rate
>> and increase are essentially the same function, except increase scales its
>> output by the width of the window)
>>
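A small Python sketch of why the order matters (made-up samples; instance `b` resets mid-window):

```python
def window_increase(samples):
    """Reset-aware increase over a window, as rate()/increase() do."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        total += cur - prev if cur >= prev else cur
    return total

a = [100, 110, 120]   # steady counter
b = [50, 60, 5]       # resets between the 2nd and 3rd samples

# sum(increase(...)): resets handled per series -> correct estimate
right = window_increase(a) + window_increase(b)

# increase(sum(...)): summing first turns the reset into a huge
# apparent restart of the combined series -> wildly wrong
summed = [x + y for x, y in zip(a, b)]   # [150, 170, 125]
wrong = window_increase(summed)

print(right, wrong)  # 35.0 145.0
```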
>> sum_over_time is very wrong: it would add together all the values within
>> the time window.
>>
>> On Friday, 22 September 2023 at 14:32:19 UTC+1 Michał Idzikowski wrote:
>>
>>> Hello!
>>> I'm fighting hard to get a correct result. The problem is - I need to sum
>>> data processed by a service. Metrics are counters, and instances are sometimes
>>> replaced by newer ones, so they don't outlive the time range most of the time.
>>>
>>> I've tried multiple combinations of increase, sum_over_time,
>>> sum(increase), increase(sum) and even tried VictoriaMetrics. Each time I
>>> got a result where the final sum drops in multiple places - and as you
>>> may imagine - there is no un-processing of the data.
>>>
>>>
>>> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/99452267-8979-4ca9-9c3c-1b5d6b496401n%40googlegroups.com
> 
> .
>



Re: [prometheus-users] Re: Help me to understand Prometheus terms

2022-09-23 Thread Aliaksandr Valialkin
See also https://docs.victoriametrics.com/keyConcepts.html

On Fri, Sep 23, 2022 at 10:22 AM Brian Candler  wrote:

> Does this help?
>
> https://nsrc.org/workshops/2021/pacnog29/nmm/netmgmt/en/nmm-2.0/NMM-2.0-metrics-time-series.pdf
>
> On Friday, 23 September 2022 at 08:08:43 UTC+1 promus...@gmail.com wrote:
>
>> Hi,
>>
>> I am a new user of Prometheus. Can someone please help me to
>> understand the difference between a metric, a time series, and samples?
>>
>> Thanks,
>>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics



Re: [prometheus-users] Counting observations per Histogram bucket within a range

2021-09-16 Thread Aliaksandr Valialkin
On Thu, Sep 9, 2021 at 10:42 AM José San Gil  wrote:

> increase(operation_x_bucket{operationType="TypeA", le="+Inf"}[24h]), I'd
> be getting the total number of observations of the bucket for that specific
> operationType in the last 24h. Is that correct?
>
> The problem is that the values I get are significantly lower than
> reality for a period of, let's say, 24 hours. I thought that maybe the
> buckets might not behave as a regular counter.
>

The issue may be related to the fact that increase() in Prometheus may
perform extrapolation in some cases, so the end result might be slightly
different than expected. See
https://github.com/prometheus/prometheus/issues/3746 for details.

-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics



Re: [prometheus-users] Prometheus (metric) relabel config with inverse regex match / negative lookahead

2021-06-18 Thread Aliaksandr Valialkin
Try the following relabeling rules:

- source_labels: [mountpoint]
  regex: '(/home|/var/domains)/something.*'
  target_label: __keep
  replacement: 'yes'
- source_labels: [mountpoint]
  regex: ''
  target_label: __keep
  replacement: 'yes'
- source_labels: [__keep]
  regex: 'yes'
  action: keep

The first relabeling rule adds a {__keep="yes"} label to metrics whose
mountpoint matches the given regex. The second rule adds the
{__keep="yes"} label to metrics with an empty `mountpoint` label, i.e. metrics
without this label. The last rule drops all the metrics
without the {__keep="yes"} label. Note that `yes` must be quoted in YAML,
since an unquoted `yes` is parsed as a boolean. See
https://www.robustperception.io/or-in-relabelling .
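Alternatively (a sketch, not taken from the thread), the two keep conditions can be collapsed into a single rule by adding an empty alternative to the regex - the empty branch after the trailing `|` matches series that have no `mountpoint` label at all:

```yaml
metric_relabel_configs:
  - source_labels: [mountpoint]
    # matches the wanted mountpoints OR an empty value (label absent)
    regex: '(/home|/var/domains)/something.*|'
    action: keep
```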


On Fri, Jun 18, 2021 at 8:24 PM Julian v.d Berkmortel <
julianvdberkmortel1...@gmail.com> wrote:

> Right now I'm scraping metrics from a Node Exporter. Some of the metrics
> which the Node Exporters exports have a `mountpoint` label.
>
> I'd like to drop time series that have this label and **do not** match a
> regular expression. I tried using the `keep` action (as I'd like to keep
> time series that **do** match this regular expression) but this also drops
> all other metrics that do not have the `mountpoint` label.
>
> metric_relabel_configs:
>   - source_labels: ['mountpoint']
>     regex: '(\/home|\/var\/domains)\/something.*'
>     action: keep
>
> I tried using the `drop` action too but this requires me to inverse the
> regular expression using a negative-lookahead (which isn't supported
> because Prometheus is written in Go of course).
>
> What are my options in this?
>
> **Important,** I do not have control over the way the Node Exporter is
> configured, thus I can't configure the Node Exporter itself to not export
> metrics for some specific mountpoint (if this is even possible).
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics



Re: [prometheus-users] Difference between 2 metrics

2021-05-17 Thread Aliaksandr Valialkin
Just write metric_a - metric_b . See
https://prometheus.io/docs/prometheus/latest/querying/operators/#arithmetic-binary-operators


On Mon, May 17, 2021 at 8:07 PM Sachin Patil 
wrote:

> Hi Team,
>
> is there any Prometheus alert rule function to get the  difference between
> 2 metrics
>
>
> Thanks and regards,
> Sachin P.
>
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics



Re: [prometheus-users] Best way to Exporting Prometheus Metrics to BQ table

2021-05-08 Thread Aliaksandr Valialkin
I believe this can be done with the help of "promtool tsdb dump" - see
https://manpages.debian.org/unstable/prometheus/promtool.1.en.html#tsdb_dump_%5B%3Cflags%3E%5D_%5B%3Cdb_path*%3E%5D
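A minimal sketch of the first step (the data directory path is an assumption; run it against a stopped Prometheus or a snapshot copy):

```shell
# Dump all samples from a local Prometheus TSDB as line-oriented text;
# this output can then be converted to CSV/JSON and loaded into BigQuery.
promtool tsdb dump /var/lib/prometheus > samples.txt
```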


On Sat, May 8, 2021 at 1:42 AM chuanjia xing  wrote:

> Hi there,
>  I am wondering what is the best way to export Prometheus data to
> BigQuery? For my use case, Prometheus data is stored locally. Now due to
> some requirement, we want to export Prometheus data to some BigQuery table.
> I want to see if there're already some prometheus plugins existing to do
> this kind of job, since I think this is not a very rare case. Thanks!
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics



Re: [prometheus-users] Metrics naming convention

2021-05-03 Thread Aliaksandr Valialkin
HTTPS is just HTTP over TLS. That's why it is OK to have the http_ prefix
for this metric. The TLS property can be passed via a label such as
{is_secure="true"} or something like that.
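For example, a server could expose one counter per scheme (the label name below is an illustration, not a convention mandated by Prometheus):

```
http_requests_total{scheme="http"}  1027
http_requests_total{scheme="https"} 3514
```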

On Mon, May 3, 2021 at 5:28 PM vteja...@gmail.com 
wrote:

> Hi,
>
> Why is it *http_requests_total* if *https *is used? Any thoughts on
> naming convention?
>
> Thanks,
> Teja
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics



Re: [prometheus-users] Count over period of time

2021-04-19 Thread Aliaksandr Valialkin
Try the following query:

count(metrics) - count(metrics offset 24h)

It should return the delta in the number of time series with the name
`metrics` between now and 24 hours ago.

It would be great if you could share more details on the use case for this
query, since the requirements look slightly non-standard.

On Thu, Apr 15, 2021 at 4:01 PM sunils...@gmail.com 
wrote:

> Hi Folks ,
>
> I got stuck into issue related to count over period of time .
>
> Currently I am getting count of metrics with below promql . I want to know
> the count increased over 1 day period
>
> promql : count(metrics)
>
> How I can find increase in last 24 hours
>
> Thanks
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics



Re: [prometheus-users] Anyone used ClickHouse as remote_write?

2021-03-30 Thread Aliaksandr Valialkin
Probably you need VictoriaMetrics
<https://github.com/VictoriaMetrics/VictoriaMetrics/> instead of
ClickHouse. It is built on top of ClickHouse architecture ideas, which give
good performance. It provides PromQL-like query language instead of SQL -
MetricsQL <https://victoriametrics.github.io/MetricsQL.html> - which is
better suited for typical queries for monitoring. See
https://faun.pub/victoriametrics-creating-the-best-remote-storage-for-prometheus-5d92d66787ac
and
https://valyala.medium.com/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282
.

On Tue, Mar 30, 2021 at 11:13 PM an...@signoz.io  wrote:

> Has anyone sent Prometheus Metrics to ClickHouse?
>
> I need to understand how metric types Counter and Gauge can be ingested to
> ClickHouse? ClickHouse has a rollup feature to sum up events before
> ingestion but the Counters are already summed up. Also, Gauge metric can
> decrease over time. I can't clearly see how this can be handled in
> ClickHouse.
>
> There is a repo to remote_write to clickhouse
> https://github.com/mindis/prom2click#getting-started
>
> Will the table schema work for my above concerns?
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics



Re: [prometheus-users] Consul SD and adding labels via static_config

2021-03-25 Thread Aliaksandr Valialkin
The `labels` section is unavailable in `consul_sd_configs`. If you want
to add labels to all the metrics collected from the discovered Consul targets,
then add the following rules to the `relabel_configs` section of the scrape job:

- target_label: env
  replacement: foo
- target_label: region
  replacement: bar


See this article
<https://valyala.medium.com/how-to-use-relabeling-in-prometheus-and-victoriametrics-8b90fc22c4b2>
for details.
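Applied to the scrape job from the question, the combined config might look like this (a sketch reusing the poster's own values):

```yaml
- job_name: 'kubernetes-pods-federation'
  scrape_interval: 30s
  metrics_path: '/federate'
  consul_sd_configs:
    - server: 'localhost:8500'
  relabel_configs:
    - source_labels: [__meta_consul_tags]
      regex: '^.*k8s.*$'
      action: keep
    # static labels added via relabeling (replaces the static_configs block):
    - target_label: env
      replacement: foo
    - target_label: region
      replacement: bar
```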

On Thu, Mar 25, 2021 at 12:36 PM Rasmus Rüngenen 
wrote:

> Hello! I wanted to ask if it would be possible to add labels via
> static_config to a scrape job that uses the Consul Service Discovery
> mechanism?
>
> Right now I have tested two solutions and with the static target, the
> labels are being applied while the Consul SD does not apply.
>
> Consul SD:
>   - job_name: 'kubernetes-pods-federation'
> scrape_interval: 30s
> metrics_path: '/federate'
> consul_sd_configs:
>   - server: 'localhost:8500'
> relabel_configs:
>   - source_labels: [__meta_consul_tags]
> regex: '^.*k8s.*$'
> action: keep
>   - source_labels: ['__meta_consul_node']
> target_label: instance
>   - source_labels: ['__meta_consul_service']
> target_label: service
>   - source_labels: ['__meta_consul_tags']
> target_label: consul_tags
> static_configs:
>   - labels:
>   env: foo
>   region: bar
>
> and second that has static target
>   - job_name: 'consul-agent'
> scrape_interval: 30s
> metrics_path: '/v1/agent/metrics'
> params:
>   format: ['prometheus']
> scheme: http
> static_configs:
>  - targets: ['localhost:8500']
>labels:
>  env: foo
>  region: bar
>
> Thanks in advance for the help!
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics



Re: [prometheus-users] Setting up a Prometheus environment with a push mechanism

2021-03-25 Thread Aliaksandr Valialkin
 from both. It shows for example for
>>>>> server A
>>>>> > the cpu load but it shows it twice instead of only once because its
>>>>> being
>>>>> > pulled from both exporters. We want to have something like this but
>>>>> it
>>>>> > needs to have a form of deduplication so we dont have the same data
>>>>> twice.
>>>>> >
>>>>> >
>>>>> > Op donderdag 25 maart 2021 om 13:27:44 UTC+1 schreef
>>>>> sup...@gmail.com:
>>>>> >
>>>>> > > Can you describe more about what your network topology is exactly?
>>>>> There
>>>>> > > are a number of solutions for dealing with distributed monitoring.
>>>>> > >
>>>>> > > On Thu, Mar 25, 2021 at 12:45 PM robbe vaes 
>>>>> wrote:
>>>>> > >
>>>>> > >> Hi,
>>>>> > >>
>>>>> > >> I am trying to setup a monitoring environment with Prometheus,
>>>>> but it has
>>>>> > >> to be using a push mechanism instead of the standard pull
>>>>> mechanism
>>>>> > >> Prometheus uses. I was wondering what options there are to create
>>>>> an
>>>>> > >> environment like this. It would also have to perform data
>>>>> deduplication. The
>>>>> > >> main issue is that I don't want Prometheus to scrape the clients
>>>>> itself,
>>>>> > >> but rather that it scrapes a certain location for all the metrics
>>>>> and that
>>>>> > >> the clients push their metrics to that location automatically.
>>>>> > >>
>>>>> > >> Suggestions are very welcome!
>>>>> > >>
>>>>> > >
>>>>> >
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Julien Pivotto
>>>>> @roidelapluie
>>>>>
>>>
>>>
>>> --
>>> Best Regards,
>>>
>>> Aliaksandr Valialkin, CTO VictoriaMetrics
>>>
>
>
> --
> Julius Volz
> PromLabs - promlabs.com
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics



Re: [prometheus-users] Setting up a Prometheus environment with a push mechanism

2021-03-25 Thread Aliaksandr Valialkin
Probably VictoriaMetrics could help with this setup. It supports data push
via various protocols
<https://victoriametrics.github.io/#how-to-import-time-series-data>, it
supports data deduplication
<https://victoriametrics.github.io/#deduplication> and it is compatible
with Prometheus datasource in Grafana
<https://victoriametrics.github.io/#grafana-setup>.

On Thu, Mar 25, 2021 at 3:20 PM robbe vaes  wrote:

> The thing is, we can implement multiple Prometheus instances as well; that's
> no issue and will probably happen anyway. The thing is, we tried using
> Thanos to manage multiple Prometheus servers and do deduplication, but the
> deduplication does not work for the collectd exporters. The problem it's
> having, I think, is that we need to filter out duplicate data by exported
> instance, but this is not possible for Thanos since it needs predefined
> external labels within Prometheus.
>
> Regards,
>
> Op donderdag 25 maart 2021 om 14:15:24 UTC+1 schreef Julien Pivotto:
>
>> The way this is usually solved is by duplicating prometheus - it seems
>> that now you have moved the SPOF from the exporter to prometheus.
>>
>> Regards,
>>
>> On 25 Mar 06:01, robbe vaes wrote:
>> > Okay so, we want to have an environment using Prometheus, where we can
>> > monitor our servers etc with a push method rather then pull due to
>> network
>> > security aspects. As of now, we managed to set up collectd together
>> with
>> > collectd exporter for prometheus. This way we can have the clients or
>> > servers push their data to the exporter and have prometheus scrape the
>> data
>> > from the exporter to minimize the amount of pulling that has to be
>> done.
>> > The issue now is, we want to have multiple collectd exporters to
>> improve
>> > HA, but the problem now is that in prometheus, when scraping both
>> > exporters, it takes the data from both. It shows, for example, for server A
>> > the CPU load, but it shows it twice instead of only once because it's being
>> > pulled from both exporters. We want to have something like this, but it
>> > needs to have a form of deduplication so we don't have the same data twice.
>> >
>> >
>> > Op donderdag 25 maart 2021 om 13:27:44 UTC+1 schreef sup...@gmail.com:
>> >
>> > > Can you describe more about what your network topology is exactly?
>> There
>> > > are a number of solutions for dealing with distributed monitoring.
>> > >
>> > > On Thu, Mar 25, 2021 at 12:45 PM robbe vaes 
>> wrote:
>> > >
>> > >> Hi,
>> > >>
>> > >> I am trying to setup a monitoring environment with Prometheus, but
>> it has
>> > >> to be using a push mechanism instead of the standard pull mechanism
>> > >> Prometheus uses. I was wondering what options there are to create an
>> > >> environment like this. It would also have to perform data
>> > >> deduplication. The
>> > >> main issue is that I don't want Prometheus to scrape the clients
>> itself,
>> > >> but rather that it scrapes a certain location for all the metrics
>> and that
>> > >> the clients push their metrics to that location automatically.
>> > >>
>> > >> Suggestions are very welcome!
>> > >>
>> >

Re: [prometheus-users] Why does increase() return fractional results for integer-valued metrics?

2021-03-22 Thread Aliaksandr Valialkin
Prometheus extrapolates `increase()` results - see
https://github.com/prometheus/prometheus/issues/3746 for more details.
There is an implementation that returns exact results from increase()
without extrapolation - https://victoriametrics.github.io/MetricsQL.html .
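A simplified Python sketch of the extrapolation idea (the real algorithm is more careful near window boundaries; the timestamps and values are made up):

```python
def extrapolated_increase(samples, window):
    """Estimate the increase over `window` seconds from (t, v) samples:
    compute the slope between the first and last samples, then scale
    it to the full window - which easily yields fractional results."""
    (t0, v0), (tn, vn) = samples[0], samples[-1]
    rate = (vn - v0) / (tn - t0)   # per-second slope inside the window
    return rate * window

# Integer counter sampled at t=5s and t=55s inside a 60s window:
est = extrapolated_increase([(5, 0), (55, 10)], 60)
print(est)  # 12.0 -- not the exact integer 10
```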

On Fri, Mar 19, 2021 at 4:55 PM John Dexter  wrote:

> increase(messages_total{direction='send'}[1m])
>
> I am finding that queries like this frequently return fractional
> values, even though these metrics are only ever incremented.
> I am expecting increase() to show me the difference in the metric value
> between two time-stamps and both of those will be integral values so what
> is going on? Some interpolation perhaps?
>
> I want to graph how many messages were actually sent in each actual
> minute, e.g. I should be able to corroborate the value on the graph with a
> log of each minute's activity in testing.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/CAGJRanhc27CwtOiX-s-qn1SFk%3DigsA3V%2BH5VB6ttdgW%2BrO8JNQ%40mail.gmail.com
> <https://groups.google.com/d/msgid/prometheus-users/CAGJRanhc27CwtOiX-s-qn1SFk%3DigsA3V%2BH5VB6ttdgW%2BrO8JNQ%40mail.gmail.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmAV%3DSEjgAgETkWmuRsmvC97N4j7oFG1KNvj7ZTMcmsOOA%40mail.gmail.com.


Re: [prometheus-users] Re: How query data with PromQL from Influxdb with remote_read configuration in Prometheus

2021-03-18 Thread Aliaksandr Valialkin
If you need querying Influx data with PromQL, then take a look at
VictoriaMetrics - it accepts Influx line protocol data (
https://victoriametrics.github.io/#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf
) and it supports PromQL-like query language - MetricsQL (
https://victoriametrics.github.io/MetricsQL.html ). Additionally, it
requires a lower amount of resources compared to InfluxDB -
https://valyala.medium.com/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893
.

On Thu, Mar 18, 2021 at 3:50 AM Kai Xue  wrote:

>
> I think I solved this problem. I thought Prometheus could directly read the
> data in InfluxDB, but it only supports reading its own remote_write data.
>
> *--*
> *Regards*
> *Kai Xue*
>
> *slack link <https://tusimple.slack.com/archives/D01PNSZ0MEF>*
> *SRE of Infrastructure China*
>
>
> Kai Xue  于2021年3月17日周三 下午2:22写道:
>
>> I also encountered the same problem, have you solved it over there?
>>
>> On Tuesday, November 3, 2020 at 4:06:06 AM UTC+8 andrej...@gmail.com
>> wrote:
>>
>>> Hi to everyone,
>>>
>>> I would like to make PromQL queries over InfluxDB data. According
>>> to the documentation, I configured *remote_read* in the Prometheus
>>> configuration file. Then I created a database named *prometheus* in
>>> InfluxDB, created a measurement named *mymetric*, and inserted 10 values:
>>>
>>> > use prometheus
>>> Using database prometheus
>>> > select * from mymetric
>>> name: mymetric
>>> time                type value
>>> ----                ---- -----
>>> 1604262435942881813 B    1
>>> 1604262437459302370 B    2
>>> 1604262440173123935 B    3
>>> 1604262441637944365 B    4
>>> 1604262443156839758 B    5
>>> 1604262444871165072 B    6
>>> 1604262446436865145 B    7
>>> 1604262448095138304 B    8
>>> 1604262449753490166 B    9
>>> 1604262451728886523 B    10
>>>
>>> I can see influx logs, that prometheus is trying to reach endpoint and
>>> is successful:
>>>
>>> ```
>>> influxdb-test | [httpd] 192.168.16.2 - - [01/Nov/2020:20:32:20 +]
>>> "POST /api/v1/prom/read?db=prometheus HTTP/1.1" 200 4 "-"
>>> "Go-http-client/1.1" 5702f0b8-1c81-11eb-8158-0242c0a81005 271
>>> influxdb-test | [httpd] 192.168.16.2 - - [01/Nov/2020:20:32:20 +]
>>> "POST /api/v1/prom/read?db=prometheus HTTP/1.1" 200 4 "-"
>>> "Go-http-client/1.1" 57032f67-1c81-11eb-8159-0242c0a81005 222
>>> ```
>>> As I understand it, I should now be able to see *mymetric* in Prometheus.
>>> But I don't see any metric besides some basic metrics like *scrape_**,
>>> *up*, etc...
>>>
>>> What am I doing wrong? How should I reach metrics from the InfluxDB
>>> database in Prometheus?
>>>
>>> *Versions:*
>>> Influx: 1.8.3
>>> Prometheus: v2.1.0
>>>
>>> Thank you.
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Prometheus Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to prometheus-users+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/prometheus-users/d3fc13c3-d588-4ed2-a485-a20fcccab089n%40googlegroups.com
>> <https://groups.google.com/d/msgid/prometheus-users/d3fc13c3-d588-4ed2-a485-a20fcccab089n%40googlegroups.com?utm_medium=email_source=footer>
>> .
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/CAGRfKW5BjSb8dMOBve7Wia-XSQYK8oy2TyZR6_OdORxKg36E3w%40mail.gmail.com
> <https://groups.google.com/d/msgid/prometheus-users/CAGRfKW5BjSb8dMOBve7Wia-XSQYK8oy2TyZR6_OdORxKg36E3w%40mail.gmail.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmB-T_ejX3JZZLD1%3DvFLb_RB4yL-mw-7uMYnjpfFOwyCPQ%40mail.gmail.com.


Re: [prometheus-users] Prometheus not able to scale vertically due to lock contention

2021-03-09 Thread Aliaksandr Valialkin
On Fri, Mar 5, 2021 at 12:10 AM Dhruv Patel 
wrote:

> Hi Folks,
>   We are seeing an issue in our current Prometheus setup where we are not
> able to ingest beyond 22 million metrics/min. We have run several load tests
> at 25 million, 29 million and 35 million, but the ingestion rate remains
> constant at around 22 million metrics/min. Moreover, we are also
> seeing that our CPU usage is around 70% and more than 50% of memory is
> available. Looking at this, it feels like we are not hitting resource
> limitations but something to do with lock contention.
>
> *Prometheus Version:* 2.9.1
> *Host Shape:* x7-enclave-104 (It is a bare metal host with 104 processor
> units). More info can be obtained in below screenshots
> *Memory Info: *
>                total        used        free      shared  buff/cache   available
> Mem:            754G         88G        528G         67M        136G        719G
> Swap:           1.0G          0B        1.0G
> Total:          755G         88G        529G
>
> We also ran some profiling during our load test setup at 20 million, 22
> million and 25 million, and have seen an increase in the time taken
> running runtime.mallocgc, which leads to increased usage of
> runtime.futex. Somehow we are not able to figure out what the cause of
> the lock contention could be. I have attached our profiling results at
> different load test levels if that's of any use. Any ideas on what could be
> causing the high time spent in runtime.mallocgc?
>

Prometheus is written in Go. The runtime.mallocgc function is called every
time Prometheus allocates a new object during its operation. It looks like
Prometheus 2.9.1 allocates a lot during the load test. The runtime.futex is
used internally by Go runtime during objects' allocation and subsequent
objects' deallocation (aka garbage collection). It looks like the Go
runtime used in Prometheus 2.9.1 isn't optimized well for programs with
frequent object allocations that run on systems with many CPU cores. This
should be improved in Go 1.15 - Allocation of small objects now performs
much better at high core counts, and has lower worst-case latency
<https://tip.golang.org/doc/go1.15#runtime> . So it is recommended to repeat
the load test on the latest available version of Prometheus,
which is hopefully built with at least Go 1.15 - see
https://github.com/prometheus/prometheus/releases .

Additionally, you can run the load test on VictoriaMetrics and compare its
scalability with Prometheus. See
https://victoriametrics.github.io/#how-to-scrape-prometheus-exporters-such-as-node-exporter
.


>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/abccd4c0-c69d-4869-8598-899b3de693f7n%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/abccd4c0-c69d-4869-8598-899b3de693f7n%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmC5W-Q_Y5krMZNK-tnJsNUbjxcX2Cebqncrzq%3DQy%2BSa_Q%40mail.gmail.com.


Re: [prometheus-users] How to scrape K8s pod metrics from prometheus running in VM instance

2021-02-16 Thread Aliaksandr Valialkin
Try substituting Prometheus with vmagent inside the K8S and pushing metrics
to a remote storage outside the Kubernetes. See
https://victoriametrics.github.io/vmagent.html#drop-in-replacement-for-prometheus

On Thu, Feb 11, 2021 at 7:29 PM David Frank 
wrote:

> Hey community, I am currently running Prometheus inside a K8s cluster in
> our production. However, we have frequent updates to K8s version and this
> keeps restarting our pods, causing an outage of a couple of hours (WAL
> loading).
>
> Is it possible to run Prometheus on a VM instance outside the K8s cluster
> but have service discovery to scrape the metrics from the pods running
> inside K8s cluster?
> I have had no success so far to get this to work.
>
> Thanks in advance.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/0702263b-7219-480c-8346-ed0289345b8bn%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/0702263b-7219-480c-8346-ed0289345b8bn%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmD2zpEuiYy_nRrVmqdUZ4Aqnx9ZgGGFB1RboVcVMU8Sgw%40mail.gmail.com.


Re: [prometheus-users] prometheus for IOT devices

2021-02-16 Thread Aliaksandr Valialkin
Just install vmagent on every moving object, so it collects metrics from
local devices, and then sends the collected data to a Prometheus-compatible
centralized storage when it has network connection with the  storage. See
https://victoriametrics.github.io/vmagent.html#use-cases

On Fri, Feb 12, 2021 at 4:41 PM Amit Das  wrote:

> Hi,
> I was trying to find a solution to monitor IoT devices in the field
> (moving objects like buses etc. with an internet connection). If I ping the
> IP from the network it responds, but sometimes when the engine is turned off
> there is no communication for a few hours. Maybe 100 buses have IoT devices
> which collect some data from the bus. Data from the IoT hubs regarding bus
> speed etc. are transmitted every hour to the server. What is the best
> solution for monitoring and alerting on 100 IoT hubs in this case? There
> will be lots of noise - how to eliminate it?
> I tried the blackbox exporter but 50% of them keep alerting all the time.
> Any suggestions?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/3b4bf0cc-3922-4ff1-b9de-5806b736a192n%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/3b4bf0cc-3922-4ff1-b9de-5806b736a192n%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmAHjwtDSfGvw%2BNYUb9x6p5RTSP8L%2BBwTRtCr84aWv4ANw%40mail.gmail.com.


Re: [prometheus-users] Difference between relabel_config and metric_relabel_config

2021-02-16 Thread Aliaksandr Valialkin
The `relabel_configs` section is applied to labels on the discovered scrape
targets, while `metric_relabel_configs` is applied to metrics collected from
scrape targets.

On Tue, Feb 16, 2021 at 4:10 PM He Wu  wrote:

> Hi all,  I want to collect the kube-apiserver metrics by promethues but
> drop any label with 'le="+Inf"' , the other labels like 'le="1"'  and
> 'le="10"'  need to be kept , the metric is something like below:
>
> 'workqueue_work_duration_seconds_bucket{name="non_structural_schema_condition_controller",le="1"}
> 669
> workqueue_work_duration_seconds_bucket{name="non_structural_schema_condition_controller",le="10"}
> 669
> workqueue_work_duration_seconds_bucket{name="non_structural_schema_condition_controller",le="+Inf"}
> 669'
>
> I have tried to use "relabel_config" and "metric_relabel_config" to achieve
> that; it turns out "metric_relabel_config" works, but "relabel_config" did
> not work with the same configuration as below:
>
> - source_labels: [le]
>    regex: '\+Inf'
>    action: 'drop'
>
> In my special use case, I need to filter the "+Inf" label before ingesting
> to storage, so "relabel_config" is what I need to use. Could you please
> help to advise how I can filter the "+Inf" label with "relabel_config". And
> besides the stage where they take effect, what else differs between
> "relabel_config" and "metric_relabel_config", and why does the same config
> not work in "relabel_config"? Thanks!
>
> Regards,
> He
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/e4b661ea-1de4-4ff4-959f-ce159226cc4an%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/e4b661ea-1de4-4ff4-959f-ce159226cc4an%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmCyPt_2Lo5GGQ9xzXKGovv5ziBsSvP2sAfE-cKbitaJPA%40mail.gmail.com.


Re: [prometheus-users] Showing difference between 2 histograms

2021-02-16 Thread Aliaksandr Valialkin
The difference should work - just remove the cluster from by(...) list:

histogram_quantile(0.50, sum(rate(ampq_bucket{cluster="first"}[15s])) by
(le))
-
histogram_quantile(0.50, sum(rate(ampq_bucket{cluster="second"}[15s])) by
(le))


See more details about time series matching during binary operator
execution in PromQL -
https://prometheus.io/docs/prometheus/latest/querying/operators/

On Thu, Feb 11, 2021 at 12:56 PM Hozapero 
wrote:

> We have a histogram metric with lots of "le" bucket. I usually use
> histogram_quantile to see 50th percentile for this, etc. as such:
>
> histogram_quantile(0.50,
> sum(rate(amqp_bucket{env="prod"}[$__rate_interval])) by (cluster,le))
>
> We have a few clusters which run the same service with the same metrics. I
> would like to see a difference comparison (to see how much faster/slower
> the calls are in an another cluster for the same services).
>
> Is there any way to do difference with histogram_quantiles?
> Something like
>
> histogram_quantile(0.50, sum(rate(ampq_bucket{cluster="first"}[15s])) by
> (cluster,le))
> -
> histogram_quantile(0.50, sum(rate(ampq_bucket{cluster="second"}[15s])) by
> (cluster,le))
>
>
> Thanks.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/3cd8f5cd-8f71-41f2-9ddf-112df72df115n%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/3cd8f5cd-8f71-41f2-9ddf-112df72df115n%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmAEPF%3DWL%2BnhZ3inohTyR7mdPMw%3D%2BPjzLFRJ2xAwXhp6vQ%40mail.gmail.com.


Re: [prometheus-users] Prometheus with remote_write VM dies with OOM: remote_write settings problem?

2021-01-15 Thread Aliaksandr Valialkin
Try increasing `capacity` to 3x `max_samples_per_send` for your case,
according to https://prometheus.io/docs/practices/remote_write/ .

Prometheus may require up to 30% more memory after enabling remote_write
according to production measurements. Make sure that your Prometheus
instance runs on a host with at least 30% of free memory before enabling
remote_write on it.
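A sketch of the suggested shape of the queue settings (the URL and the numbers below are illustrative, not taken from the thread; the point is keeping `capacity` around 3x `max_samples_per_send`):

```yaml
remote_write:
  - url: http://victoria-metrics:8428/api/v1/write   # hypothetical endpoint
    queue_config:
      max_samples_per_send: 10000   # illustrative value
      capacity: 30000               # ~3x max_samples_per_send
      min_shards: 1
      max_shards: 10
```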

On Fri, Jan 15, 2021 at 11:18 AM Olga Chukanova 
wrote:

> Hello!
> I have a Prometheus-based monitoring system in Kubernetes, and I am trying
> to set up remote_write to VictoriaMetrics. But I have one tragic problem -
> my Prometheus dies by OOM.
> I’ve tested two versions of Prometheus (v2.11.0 and v2.23.0) and had the
> same problem on both.
> My average value of rate(prometheus_remote_storage_samples_in_total [5m])
> is ~75k, prometheus pod limits is cpu ‘4’ and memory 6144M and average
> metric prometheus_remote_storage_shards = 1.
> Settings in remote_write are:
> queue_config:
>   capacity: 100
>   max_samples_per_send: 1
>   max_shards: 10
>   min_shards: 1
> Global scrape setting:
> global:
>   scrape_interval: 10s
>   scrape_timeout: 10s
>   evaluation_interval: 10s
> In the logs (with debug mode) I didn't find anything that can explain the
> problem.
> I think I'm doing something wrong in the remote_write settings, but I don't
> understand what, or which metrics I should base the configuration on.
> Thank you for any help!
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/830a40c0-dac9-43ca-880d-5898a2d70f50n%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/830a40c0-dac9-43ca-880d-5898a2d70f50n%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmDJMGmBtbRZ%3DGXq3X7Du6MEtTvW_Ng07Gyfw8orU5GT6A%40mail.gmail.com.


Re: [prometheus-users] Is there any way to migrate data stored in InfluxDB to Prometheus TSDB?

2020-12-15 Thread Aliaksandr Valialkin
Prometheus doesn't support importing historical data yet - see
https://prometheus.io/docs/introduction/roadmap/#backfill-time-series . So
it is impossible to import data from InfluxDB to Prometheus at the moment.
But you can import InfluxDB data to other TSDB systems with tools like vmctl
<https://github.com/VictoriaMetrics/vmctl>.

On Mon, Dec 14, 2020 at 11:38 AM zhao wang  wrote:

> Just as the title. Is there any tool that can replay the InfluxDB data and
> write them to Prometheus TSDB? Or some other ways to migrate data from
> InfluxDB to Prometheus TSDB?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/8f7f3906-6b06-48f8-8dc7-944b59c05725n%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/8f7f3906-6b06-48f8-8dc7-944b59c05725n%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmByTjPB7Q-qnQ-zu4x-4WOREpse0HX4FC%3D0q2ROobof%3DQ%40mail.gmail.com.


Re: [prometheus-users] Prometheus increase function not considering first value after server reset

2020-12-08 Thread Aliaksandr Valialkin
On Tue, Dec 8, 2020 at 5:37 PM TEST TEST 
wrote:

> Hi,
>
> I am using increase function to track the counter increase to track the
> number of logins in the web application.  We are observing a strange
> scenario, where after an idle time or after the web application is
> redeployed, the counter is not considering the first login after the
> reset.
>
> Is there any way to fix this issue ?
>

increase() function in Prometheus skips the first data point in a time
series during calculations. It also may return fractional results for time
series with integer data points. See more details at
https://github.com/prometheus/prometheus/issues/3746 . I'm not aware of a
workaround for this in Prometheus. I'd suggest trying MetricsQL
 instead - it should
return the expected result from increase() function for your case.
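The skipped-first-point effect can be sketched with a toy model (invented sample data; real Prometheus additionally extrapolates, but the lost first increment is the same):

```python
def windowed_increase(samples):
    """Simplified increase(): sum the positive deltas between consecutive
    samples in the window; a drop is treated as a counter reset, in which
    case the new value itself is counted (counter restarted from 0).
    The first sample's own value is never counted, which is why the first
    login after a redeploy (0 -> 1 before the first scrape) goes missing."""
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        total += cur - prev if cur >= prev else cur
    return total

# After a redeploy the first scrape already sees logins_total == 1:
# the 0 -> 1 jump happened before any sample existed, so it is lost.
window = [(0, 1), (15, 2), (30, 3)]
print(windowed_increase(window))  # 2.0, although 3 logins actually happened
```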

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmBkYL3NA-zUrJ6pD9uR8v6g_jV%2BXdh2yf5Jz22yFAVb0w%40mail.gmail.com.


Re: [prometheus-users] Re: Losing metrics for the time when federation node is down

2020-11-30 Thread Aliaksandr Valialkin
On Sun, Nov 29, 2020 at 3:10 PM Ben Kochie  wrote:

> On Sun, Nov 29, 2020 at 11:51 AM Aliaksandr Valialkin 
> wrote:
>
>>
>>
>> On Fri, Nov 27, 2020 at 11:11 AM Ben Kochie  wrote:
>>
>>>
>>>>
>>>>> Or else is there any other ways by which we can solve this issue.
>>>>>
>>>>
>>>> Using something other than federation.  remote_write is able to buffer
>>>> up data locally if the endpoint is down.
>>>>
>>>> Prometheus itself can't accept remote_write requests, so you'd have to
>>>> write to some other system
>>>> <https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage>
>>>> which can.  I suggest VictoriaMetrics, as it's simple to run and has a very
>>>> prometheus-like API, which can be queried as if it were a prometheus
>>>> instance.
>>>>
>>>
>>> I recommend Thanos, as it scales better and with less effort than
>>> VictoriaMetrics. It also uses PromQL code directly, so you will get the
>>> same results as Prometheus, not an emulation of PromQL.
>>>
>>>
>> Could you share more details on why you think that VictoriaMetrics has
>> scalability issues and is harder to set up and operate than Thanos?
>> VictoriaMetrics users have quite the opposite opinion. See
>> https://victoriametrics.github.io/CaseStudies.html and
>> https://medium.com/faun/comparing-thanos-to-victoriametrics-cluster-b193bea1683
>> .
>>
>
> Thanos uses object storage, which avoids the need for manual sharding of
> TSDB storage. Today I have 100TiB of data stored in object storage buckets.
> I make no changes to scale up or down these buckets.
>
>
VictoriaMetrics stores data on persistent disks. Every replicated durable
persistent disk in GCP <https://cloud.google.com/persistent-disk> can scale
up to 64TB
<https://cloud.google.com/compute/docs/disks/add-persistent-disk#resize_pd>
without the need to stop VictoriaMetrics, i.e. without downtime. Given
that VictoriaMetrics
compresses real-world data much better than Prometheus
<https://valyala.medium.com/prometheus-vs-victoriametrics-benchmark-on-node-exporter-metrics-4ca29c75590f>,
a single-node VictoriaMetrics can substitute the whole Thanos cluster for
your workload (in theory of course - just give it a try in order to verify
this statement :) ). Cluster version of VictoriaMetrics
<https://victoriametrics.github.io/Cluster-VictoriaMetrics.html> can scale
to petabytes. For example, a cluster with one petabyte of capacity can be
built from 16 vmstorage nodes with a 64TB persistent disk per node.
That's why VictoriaMetrics in production usually has lower infrastructure
costs than Thanos.


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmCfjWu60ggYpXbPt%3DeZr9KNi1rTseEyiOd9-opL%3DiuNLQ%40mail.gmail.com.


Re: [prometheus-users] Re: Losing metrics for the time when federation node is down

2020-11-29 Thread Aliaksandr Valialkin
On Fri, Nov 27, 2020 at 11:11 AM Ben Kochie  wrote:

>
>>
>>> Or else is there any other ways by which we can solve this issue.
>>>
>>
>> Using something other than federation.  remote_write is able to buffer up
>> data locally if the endpoint is down.
>>
>> Prometheus itself can't accept remote_write requests, so you'd have to
>> write to some other system
>> <https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage>
>> which can.  I suggest VictoriaMetrics, as it's simple to run and has a very
>> prometheus-like API, which can be queried as if it were a prometheus
>> instance.
>>
>
> I recommend Thanos, as it scales better and with less effort than
> VictoriaMetrics. It also uses PromQL code directly, so you will get the
> same results as Prometheus, not an emulation of PromQL.
>
>
Could you share more details on why you think that VictoriaMetrics has
scalability issues and is harder to set up and operate than Thanos?
VictoriaMetrics users have quite the opposite opinion. See
https://victoriametrics.github.io/CaseStudies.html and
https://medium.com/faun/comparing-thanos-to-victoriametrics-cluster-b193bea1683
.

--

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmBdtVys5bfTLKkNXJocfWr4%3DDPw%3DBro3A2YE-hLMo6doQ%40mail.gmail.com.


Re: [prometheus-users] Re: The result of delta function is not same with raw data?

2020-11-21 Thread Aliaksandr Valialkin
On Sat, Nov 21, 2020 at 4:34 PM Ben Kochie  wrote:

> While that sounds like a good idea, it's going to produce less accurate
> results for most use cases.
>

Could you provide practical examples?


>
> On Sat, Nov 21, 2020 at 3:15 PM Aliaksandr Valialkin 
> wrote:
>
>> There is an alternative solution - to use the increase() function from
>> MetricsQL - it doesn't extrapolate results and it takes into account the
>> previous value before the window in square brackets. So it returns exact
>> expected values. See more details at
>> https://victoriametrics.github.io/MetricsQL.html
>>
>> On Fri, Nov 20, 2020 at 2:49 PM b.ca...@pobox.com 
>> wrote:
>>
>>> On Friday, 20 November 2020 at 02:29:12 UTC mono...@gmail.com wrote:
>>>
>>>> *Query*:
>>>>  1. normal query: error_counter_something{job=“monitor”, device=“dev0”,
>>>> serial=“”}
>>>>  2. delta query: delta(error_counter_something{job=“monitor”,
>>>> device=“dev0”, serial=“”}[$__interval]) > 0
>>>>
>>>> *Time Range*: 2020-11-19 16:16:00 ~ 2020-11-19 16:20:00 with 15sec
>>>> interval
>>>>
>>>> *result*
>>>>
>>>> 16:16:15~30 raise 2 errors on device and move that error counter value
>>>> from 7616 to 7618,
>>>> but the delta query shows result of 3
>>>>
>>>> time               , delta , normal
>>>> 2020-11-19 16:16:00,       , 7616
>>>> 2020-11-19 16:16:15,       , 7616
>>>> 2020-11-19 16:16:30, 3     , 7618
>>>> 2020-11-19 16:16:45,       , 7618
>>>> 2020-11-19 16:17:00,       , 7618
>>>> (these values are kept until the end of the query time range)
>>>>
>>>>
>>> See
>>> https://prometheus.io/docs/prometheus/latest/querying/functions/#delta
>>> *"delta(v range-vector) calculates the difference between the first and
>>> last value of each time series element in a range vector v, returning an
>>> instant vector with the given deltas and equivalent labels. The delta is
>>> extrapolated to cover the full time range as specified in the range vector
>>> selector, so that it is possible to get a non-integer result even if the
>>> sample values are all integers."*
>>>
>>> You haven't said what $__interval expands to in your query.  It must be
>>> at least 30 seconds, because otherwise you wouldn't have two values in your
>>> range vector.
>>>
>>> So let's see what happens with 30 seconds.  The window contains two
>>> values:
>>>
>>> [ ... X ..... X ... ]
>>>      7616  7618
>>>      <--15s-->
>>>
>>> The difference between these is 2, and the time interval between them is
>>> 15 seconds.  However this increase is then extrapolated to cover the whole
>>> window period of 30 seconds, so the value returned by delta() would be 4.
>>>
>>> What about if $__interval was 45 seconds?  Then you'd have three values,
>>> the difference between the first and last is 2, the time difference is 30
>>> seconds extrapolated to 45 seconds, so the result would be 2 x (45/30) = 3.
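The arithmetic above can be checked with a toy model of delta()'s extrapolation (simplified; real Prometheus clamps the extrapolation near window edges):

```python
def extrapolated_delta(samples, window):
    """Simplified delta(): raw first-to-last difference, scaled from the
    time span actually covered by samples up to the full window length."""
    (t0, v0), (tn, vn) = samples[0], samples[-1]
    return (vn - v0) * (window / (tn - t0))

# 30s window holding two samples 15s apart: 2 * (30/15) = 4
print(extrapolated_delta([(0, 7616), (15, 7618)], 30))              # 4.0
# 45s window holding three samples spanning 30s: 2 * (45/30) = 3
print(extrapolated_delta([(0, 7616), (15, 7616), (30, 7618)], 45))  # 3.0
```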
>>>
>>> If you want the actual difference between the metric now and the metric
>>> some time ago, you can do :
>>>
>>> something - something offset 15s
>>>
>>> However, both that expression and delta() will give you nonsense values
>>> if a counter resets, because it will jump back down towards zero and give
>>> you a large negative value.
>>>
>>> Better:
>>>
>>> (something - something offset 15s) >= 0
>>>
>>> but it won't handle counter resets as well as rate() or increase() can.
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to prometheus-users+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/prometheus-users/f75b2007-96ed-4c66-b719-602934827cd3n%40googlegroups.com
>>> <https://groups.google.com/d/msgid/prometheus-users/f75b2007-96ed-4c66-b719-602934827cd3n%40googlegroups.com?utm_medium=email_source=footer>

Re: [prometheus-users] Re: The result of delta function is not same with raw data?

2020-11-21 Thread Aliaksandr Valialkin
There is an alternative solution: the increase() function from MetricsQL.
It doesn't extrapolate results, and it takes into account the previous
value before the window in square brackets, so it returns the exact
expected values. See more details at
https://victoriametrics.github.io/MetricsQL.html
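For readers comparing the two behaviours, here is a minimal sketch of the extrapolation arithmetic that Prometheus's delta() applies, as discussed in the quoted message below. It ignores the boundary clamping the real implementation performs, so it only illustrates the basic scaling step:

```python
def naive_delta(samples, window_seconds):
    """Approximate delta()'s extrapolation: take the raw difference
    between the first and last samples inside the range-vector window
    and scale it up to cover the whole window.

    `samples` is a list of (timestamp_seconds, value) pairs that fall
    inside the window.  This is a simplified sketch, not Prometheus's
    exact algorithm (which also clamps at window boundaries)."""
    if len(samples) < 2:
        return None  # delta() needs at least two points
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    raw_diff = v_last - v_first
    covered = t_last - t_first
    return raw_diff * (window_seconds / covered)

# Two samples 15s apart inside a 30s window: 2 is scaled to 4.
print(naive_delta([(0, 7616), (15, 7618)], 30))              # 4.0
# Three samples spanning 30s inside a 45s window: 2 * (45/30) = 3.
print(naive_delta([(0, 7616), (15, 7616), (30, 7618)], 45))  # 3.0
```

This reproduces exactly the 4 and 3 values worked out in the quoted explanation.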

On Fri, Nov 20, 2020 at 2:49 PM b.ca...@pobox.com 
wrote:

> On Friday, 20 November 2020 at 02:29:12 UTC mono...@gmail.com wrote:
>
>> *Query*:
>>  1. normal query: error_counter_something{job=“monitor”, device=“dev0”,
>> serial=“”}
>>  2. delta query: delta(error_counter_something{job=“monitor”,
>> device=“dev0”, serial=“”}[$__interval] > 0)
>>
>> *Time Range*: 2020-11-19 16:16:00 ~ 2020-11-19 16:20:00 with 15sec
>> interval
>>
>> *result*
>>
>> Between 16:16:15 and 16:16:30 the device raised 2 errors, moving the
>> error counter value from 7616 to 7618,
>> but the delta query shows a result of 3
>>
>> time               , delta, normal
>> 2020-11-19 16:16:00,      , 7616
>> 2020-11-19 16:16:15,      , 7616
>> 2020-11-19 16:16:30, 3    , 7618
>> 2020-11-19 16:16:45,      , 7618
>> 2020-11-19 16:17:00,      , 7618
>> (keep these values until the end of the query time range)
>>
>>
> See https://prometheus.io/docs/prometheus/latest/querying/functions/#delta
> *"delta(v range-vector) calculates the difference between the first and
> last value of each time series element in a range vector v, returning an
> instant vector with the given deltas and equivalent labels. The delta is
> extrapolated to cover the full time range as specified in the range vector
> selector, so that it is possible to get a non-integer result even if the
> sample values are all integers."*
>
> You haven't said what $__interval expands to in your query.  It must be at
> least 30 seconds, because otherwise you wouldn't have two values in your
> range vector.
>
> So let's see what happens with 30 seconds.  The window contains two values:
>
> [...X.........X...]
>    7616     7618
>    <--15s-->
>
> The difference between these is 2, and the time interval between them is
> 15 seconds.  However this increase is then extrapolated to cover the whole
> window period of 30 seconds, so the value returned by delta() would be 4.
>
> What about if $__interval was 45 seconds?  Then you'd have three values,
> the difference between the first and last is 2, the time difference is 30
> seconds extrapolated to 45 seconds, so the result would be 2 x (45/30) = 3.
>
> If you want the actual difference between the metric now and the metric
> some time ago, you can do :
>
> something - something offset 15s
>
> However, both that expression and delta() will give you nonsense values if
> a counter resets, because it will jump back down towards zero and give you
> a large negative value.
>
> Better:
>
> (something - something offset 15s) >= 0
>
> but it won't handle counter resets as well as rate() or increase() can.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/f75b2007-96ed-4c66-b719-602934827cd3n%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/f75b2007-96ed-4c66-b719-602934827cd3n%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmAMMFA3gsTAbqVC5uz%3Dp-kymL3QT31vb9G3vOhemjWCfw%40mail.gmail.com.


Re: [prometheus-users] Best HA solution for Promethues

2020-10-17 Thread Aliaksandr Valialkin
Take a look also at VictoriaMetrics - see
https://victoriametrics.github.io/#high-availability and
https://victoriametrics.github.io/Cluster-VictoriaMetrics.html#high-availability


On Wed, Sep 30, 2020 at 7:52 PM sreehari M V 
wrote:

>
> Hi All ,
>
> Greetings,
>
> Can you please suggest a best High Availability solution for Prometheus.
>
> Server count: 750
> OS : RHEL 7
> Exporters: node_exporter, process_exporter and JMX exporter
>
>
> Thanks and Regards,
> Sreehari M V
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/a647071f-a8ca-4afe-ba8d-5162589fe207n%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/a647071f-a8ca-4afe-ba8d-5162589fe207n%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmAY6xruZ8yGh-%2BBKQHFUpzDgpZpS9Y8GbU81PhYiGT5-A%40mail.gmail.com.


Re: [prometheus-users] Scaling Prometheus

2020-10-17 Thread Aliaksandr Valialkin
the next step for us is to shard and use namespace level
>>>>> Prometheis. But I expect a similar level of usage in about a year again
>>>>> at the namespace level, with multiple apps in a single namespace scaling to
>>>>> 1000s of pods exporting 5K metrics each. And I will not be able to shard
>>>>> again because I don't want to go below the NS granularity.
>>>>>
>>>>> How have others dealt with this situation where the bottleneck is
>>>>> going to be ingestion and not queries?
>>>>>
>>>>> Thanks for your time,
>>>>> KVR
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Prometheus Users" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to prometheus-use...@googlegroups.com.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/prometheus-users/cf15cc42-fe3e-4f4d-8489-3750fac7f81en%40googlegroups.com
>>>>> <https://groups.google.com/d/msgid/prometheus-users/cf15cc42-fe3e-4f4d-8489-3750fac7f81en%40googlegroups.com?utm_medium=email_source=footer>
>>>>> .
>>>>>
>>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to prometheus-users+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/prometheus-users/58c5326d-58c7-42b5-9ec4-1fc8c9eb27b3n%40googlegroups.com
>>> <https://groups.google.com/d/msgid/prometheus-users/58c5326d-58c7-42b5-9ec4-1fc8c9eb27b3n%40googlegroups.com?utm_medium=email_source=footer>
>>> .
>>>
>> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/CABbevGkm9MFTxhX_HTF5kwcdjmUVmyhqO_-ebj-yBM_FKpFk8A%40mail.gmail.com
> <https://groups.google.com/d/msgid/prometheus-users/CABbevGkm9MFTxhX_HTF5kwcdjmUVmyhqO_-ebj-yBM_FKpFk8A%40mail.gmail.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmB%3DpNwhMPqKCMR28%2B4LEJw4002Ev7pXaHx%3DsavD5Fs9xw%40mail.gmail.com.


Re: [prometheus-users] Re: TSDB Import Command

2020-10-16 Thread Aliaksandr Valialkin
On Wed, Oct 14, 2020 at 6:12 PM Al  wrote:

> For this specific use-case, it's understood that we wouldn't have alerting
> rules, as the metrics would be ingested only after they've been created.  It
> would be used purely for inspecting historical trends and identifying
> outliers over weeks, months, etc.  I'm leaning more towards TimescaleDB as
> we already have many instances of Postgres internally, so it would end up
> being less work to get this up vs implementing something new such as
> VictoriaMetrics.


I'd recommend trying both solutions in parallel - TimescaleDB and
VictoriaMetrics - and then choosing the most suitable solution for your
case.

-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmAp_t_cGXGXU-V3skqzg4%3DV%2Bw7dA_eA9zfhfTA_TAXsBw%40mail.gmail.com.


Re: [prometheus-users] Re: How can I limit access to data

2020-10-16 Thread Aliaksandr Valialkin
Take a look at https://github.com/prometheus-community/prom-label-proxy and

On Thu, Oct 15, 2020 at 3:19 PM Brian Candler  wrote:

> You can't do this at the HTTP layer, e.g. with Nginx.  All PromQL queries
> will be passed straight through.
>
> What you need is a frontend like Grafana, and configure dashboards
> restricted to each client (which Grafana does allow).
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/7fb87243-6687-4dd9-ade2-d2ddd92cc88ao%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/7fb87243-6687-4dd9-ade2-d2ddd92cc88ao%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmBa4pXdaja2zb%2BBq26v7eL4SeY6NozvWfh5Sdk5Qhi%2BpQ%40mail.gmail.com.


Re: [prometheus-users] Prometheus disconnected data recovery

2020-09-21 Thread Aliaksandr Valialkin
Try vmagent - it can run in the same network as the metrics exporter,
scrape data from the exporter and push it to a Prometheus-compatible
remote storage such as Cortex, M3DB or VictoriaMetrics while the network
connection is in a working state. vmagent buffers scraped data on local
storage while the network connection to remote storage is unavailable.
See more details at https://victoriametrics.github.io/vmagent.html .

On Tue, Sep 15, 2020 at 6:16 AM tiecheng shen 
wrote:

> Hello, I am a newbie to Prometheus. I have a requirement: when the
> Prometheus server and the scraped client are disconnected from the
> network, Prometheus cannot collect the data, and the graph will show a
> gap after the network is reconnected. If the data from the disconnected
> period were stored locally, could I recover it? If not, is there any
> third-party way to support this?
>
> Thanks for any help!
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/0e541265-bfc9-4529-aa0e-0aa3dbf18dffn%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/0e541265-bfc9-4529-aa0e-0aa3dbf18dffn%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmAxsF-RLyKZvva5xfm1Uxj3jFEk89J1BJqQmqXuY2VvsA%40mail.gmail.com.


Re: [prometheus-users] any best practice on using limited le's for a given histogram

2020-09-15 Thread Aliaksandr Valialkin
FYI, the following article is quite interesting re histograms -
https://linuxczar.net/blog/2020/08/13/histogram-error/

On Tue, Sep 15, 2020 at 10:41 PM 'Rong Hu' via Prometheus Users <
prometheus-users@googlegroups.com> wrote:

> We would love to learn more about the roadmap for histogram improvements
> and rough timeline / estimates for earliest GA. We are trying to
> standardize our metrics on Prometheus internally and have lots of DDSketch
> histograms to migrate. In the short term we plan to roughly translate
> existing DDSketch buckets to default histogram buckets. It would greatly
> incentivize migration internally if the feature gap is filled.
> Thank you for doing this valuable work!
>
> Rong Hu
> Airbnb
>
> On Tuesday, September 8, 2020 at 1:56:49 PM UTC-7 bjo...@rabenste.in
> wrote:
>
>> On 02.09.20 00:38, rs vas wrote:
>> >
>> > • any good number we can cross when defining buckets for example not to
>> > define more than 10 le's.
>>
>> It all really depends on your total cardinality. It's fine to create a
>> histogram with loads of buckets if that's only exposed on three
>> targets and has no further labels at all.
>>
>> In your case, where you have many hosts _and_ partitioning by a bunch
>> of other labels with some significant cardinality, too, you really
>> have to be careful with the number of buckets.
>>
>> A common pattern for something like HTTP request metrics is to have a
>> counter with many labels (like method, path, status code, ...) and
>> then a histogram for the request duration with no further labels (or
>> at least only a few with low cardinality). In that way, you cannot
>> calculate latency per status code and such, but it might be a good
>> compromise.
>>
>> In different news, I'm working on ways to allow high-res histograms in
>> the future, see
>>
>> https://grafana.com/blog/2020/08/24/kubecon-cloudnativecon-eu-recap-better-histograms-for-prometheus/
>> for a bunch of links to talks etc.
>>
>> --
>> Björn Rabenstein
>> [PGP-ID] 0x851C3DA17D748D03
>> [email] bjo...@rabenste.in
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/d17ea97b-06ba-496d-bf47-87a31743e3ebn%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/d17ea97b-06ba-496d-bf47-87a31743e3ebn%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmA1ZAOviEekNTo6aM0AwooEOgFtO%2BwmKA4uNpVGaBVjPQ%40mail.gmail.com.


Re: [prometheus-users] Network traffic used per day

2020-08-29 Thread Aliaksandr Valialkin
Hi Mario!

Try the following query:

sum(max_over_time(increase(node_network_receive_bytes_total[1d])[1d:1d]))
by (instance,device)

It will return the total per-day incoming traffic for each device on every
instance. Note that the dates on the graph are shifted by one day, i.e. the
value for the current day on the graph corresponds to the value for the
previous day.
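To make the counter semantics behind increase() concrete, here is a small sketch (not Prometheus code) that computes a per-day total from raw counter samples, treating any drop in the counter value as a reset the way rate()/increase() conceptually do:

```python
from collections import defaultdict
from datetime import datetime, timezone

def daily_increase(samples):
    """samples: (epoch_seconds, counter_value) pairs, sorted by time.
    Returns {iso_day: total_increase}.  A drop in the counter value is
    treated as a counter reset, so the post-reset value counts from
    zero - the same idea rate()/increase() use to skip resets.
    This is an illustrative sketch, not Prometheus's implementation."""
    totals = defaultdict(float)
    prev = None
    for ts, value in samples:
        if prev is not None:
            inc = value - prev if value >= prev else value
            day = datetime.fromtimestamp(ts, timezone.utc).date().isoformat()
            totals[day] += inc
        prev = value
    return dict(totals)

# Counter goes 100 -> 150, then resets and climbs back to 20:
print(daily_increase([(0, 100), (3600, 150), (7200, 20)]))
# {'1970-01-01': 70.0}
```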


On Saturday, August 29, 2020, Mario Pranjic  wrote:

> Hi,
>
> I have panel showing network send/receive network traffic as follows:
> rate(node_network_receive_bytes_total{instance="proxy.yggdrasil.local:9100",
> device="ens3"}[1m])
> rate(node_network_transmit_bytes_total{instance="proxy.yggdrasil.local:9100",
> device="ens3"}[1m])
>
> I need one that shows network traffic summarized per day.
> Whatever I tried, I ended up with bits and pieces, not one proper
> number per day.
>
> I collect in Prometheus and visualize in Grafana.
>
> How to accomplish this?
>
> Thanks in advance!
>
> Best regards,
>
> Mario.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/prometheus-users/CAE%2BvKfAqZgd8YDGBojXmDXhvAdwO0CV
> MnxzO-98tdhy-h0gcVQ%40mail.gmail.com
> <https://groups.google.com/d/msgid/prometheus-users/CAE%2BvKfAqZgd8YDGBojXmDXhvAdwO0CVMnxzO-98tdhy-h0gcVQ%40mail.gmail.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmDc299%2BEwn6TtdiFfYDxg%2B2o4kpF7ZPOf88zRBvsXb_eg%40mail.gmail.com.


Re: [prometheus-users] Golang microservice promQL client example or package?

2020-08-08 Thread Aliaksandr Valialkin
Hi Luke!

See documentation and examples at
https://godoc.org/github.com/prometheus/client_golang/api/prometheus/v1

On Fri, Aug 7, 2020 at 4:05 AM 'Luke Hamilton' via Prometheus Users <
prometheus-users@googlegroups.com> wrote:

> Hey all,
>
> I am wondering if there is a Golang-based PromQL client package or
> examples anywhere on how I can call a Prometheus server we have collecting
> a bunch of data that we need to use for a billing service that's built in
> Go.
> Sounds like this repo has this feature in it, but there is no doco or
> examples I can find? https://github.com/prometheus/client_golang
>
> Any help is great appreciated
>
> p.s I'm a total newbie around Prometheus
>
> Thanks
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/20515ff3-81b4-412d-b53b-c9048446e4d7n%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/20515ff3-81b4-412d-b53b-c9048446e4d7n%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmCb8yYhTbux5N2X%2BsjxbbPGsengyXQp6cEn7cMm0pqKgQ%40mail.gmail.com.


Re: [prometheus-users] Re: Select metric only if another does not exist

2020-08-04 Thread Aliaksandr Valialkin
Try the following query:

metric_name unless on (hostname) another_metrics{type="value"}
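To show what `unless on (hostname)` does, here is a hedged sketch in Python (not Prometheus code) that mimics the set-difference matching on the `hostname` label; the series values below are made-up examples:

```python
def unless_on(left, right, on_labels):
    """Mimic PromQL's `left unless on(<labels>) right`: keep the series
    from `left` whose values for the `on` labels have no matching series
    in `right`.  Each series is a (labels_dict, value) pair.
    Illustrative sketch only, not Prometheus's implementation."""
    def key(labels):
        return tuple(labels.get(l) for l in on_labels)
    right_keys = {key(labels) for labels, _ in right}
    return [(labels, v) for labels, v in left if key(labels) not in right_keys]

metric_name = [({"hostname": "a"}, 1.0), ({"hostname": "b"}, 2.0)]
another_metrics = [({"hostname": "a", "type": "value"}, 1.0)]

# Only hostname "b" survives, because "a" has a match on the right side:
print(unless_on(metric_name, another_metrics, ["hostname"]))
# [({'hostname': 'b'}, 2.0)]
```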

On Tue, Aug 4, 2020 at 2:44 PM Sam Lee  wrote:

> try metric{type != "two"} ?
>
> On Tuesday, 4 August 2020 at 15:30:05 UTC+8, Seitan wrote:
>>
>> Hello,
>> I'm trying to write a PromQL query that selects only hosts that don't
>> have a specific label on them.
>> For example, selecting metrics from hosts who DO have label (type=value)
>> works:
>>
>> metrics_name * on (hostname) another_metrics{type="value"}
>>
>> The problem is when I try to do the same select against hosts without that label:
>>
>>
>> metrics_name * on (hostname) (absent(another_metrics{type="value"}))
>>
>> this query gives no results.
>> Is there any way to do such query?
>> Thank you
>>
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/bce85d3d-80c0-4ffb-ae40-9b0476a5b553o%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/bce85d3d-80c0-4ffb-ae40-9b0476a5b553o%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmB31V3jrgbcMkCQUrtKUY8KSs1JoiQTZ4-2S9OTMwYYOA%40mail.gmail.com.


Re: [prometheus-users] How Protected Prometheus (OpenID Auth enaled) can be use as target in federation Prometheus scrape config

2020-07-06 Thread Aliaksandr Valialkin
Prometheus supports basic auth
<https://en.wikipedia.org/wiki/Basic_access_authentication> and/or mutual
TLS <https://en.wikipedia.org/wiki/Mutual_authentication> for scraping
targets - see `basic_auth` and `tls_config` sections in
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
for
details.
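As a concrete illustration of those sections, here is a sketch of how the federation job from the original message could carry credentials; the username, password, and CA path are placeholders, not values from this thread:

```yaml
scrape_configs:
  - job_name: 'federate'
    scrape_interval: 15s
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="prometheus"}'
        - '{name=~"job:.*"}'
    scheme: https
    # Placeholder credentials - substitute whatever auth the slave accepts.
    basic_auth:
      username: 'scraper'
      password: 'changeme'
    tls_config:
      ca_file: /etc/prometheus/ca.crt  # placeholder path
    static_configs:
      - targets:
          - 'prometheus-slave.xyz.com:443'
```

This only works if the slave Prometheus (or a reverse proxy in front of it) actually accepts basic auth or client certificates, per the discussion below.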

On Mon, Jul 6, 2020 at 1:44 PM chandan kashayp 
wrote:

> What other auth methods would be suitable in my case? I didn't find
> any docs describing how a federating Prometheus can access slave targets
> if some auth is involved.
>
> On Monday, 6 July 2020 16:05:42 UTC+5:30, Stuart Clark wrote:
>>
>> On 2020-07-06 11:17, chandan kashayp wrote:
>> > Hello Guys,
>> >
>> > I am stuck at point by doing integration of openid auth enabled
>> > prometheus to federation. Let me explain in detail about the
>> > configuration and blocker.
>> >
>> > My slave Prometheus is OpenID auth enabled. Whenever we try to access
>> > it, Prometheus asks for login authentication and lets us in if
>> > authorization succeeds. After a successful login, the Prometheus
>> > dashboard and its graphs can be seen.
>> >
>> > Federation prometheus is running at different place. Federation
>> > prometheus scrape_configs looks like below
>> >
>> > scrape_configs:
>> >
>> > *
>> >
>> > job_name: 'federate'
>> > scrape_interval: 15s
>> >
>> > honor_labels: true
>> > metrics_path: '/federate'
>> >
>> > params:
>> > 'match[]':
>> > - '{job="prometheus"}'
>> > - '{name=~"job:.*"}'
>> >
>> > static_configs:
>> >
>> > * targets:
>> >
>> > * 'prometheus-slave.xyz.com:443' (my slave prometheus
>> endpoint)
>> >
>> > Issue: the target status is DOWN with the error "server returned HTTP
>> > status 403 Forbidden".
>> >
>> > I know the error is occurring because the federation Prometheus has no
>> > credentials to access the slave Prometheus. But I cannot find where, on
>> > the federation Prometheus, the credentials-related configuration should
>> > be done to allow it to access the auth-protected slave Prometheus.
>> >
>> > Suggestion & help need !!
>> > #FederationPrometheus
>> >
>>
>> I don't believe Prometheus supports OIDC authentication, so you would
>> need to allow other authentication or whitelisting methods for your
>> federation. OIDC is really best suited for people, with other forms
>> better for machines.
>>
>> --
>> Stuart Clark
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/cb2e02f3-c6af-47c6-9029-d1c5f0b66c9do%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/cb2e02f3-c6af-47c6-9029-d1c5f0b66c9do%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmDkaRFaue6EvZyEYDboQxExRXdFDdBuJpkA%3DH2-JzUPvw%40mail.gmail.com.


Re: [prometheus-users] Re: prometheus not scrapping targets when timestamp field is present

2020-07-06 Thread Aliaksandr Valialkin
Prometheus doesn't support storing historical data and samples with
out-of-order timestamps. If you need writing such data, then take a look at
other Prometheus-inspired solutions. See, for example,
https://victoriametrics.github.io/#backfilling .

On Mon, Jul 6, 2020 at 10:31 AM Venkata Bhagavatula 
wrote:

> Hi All,
> Can anyone respond to my queries? We also observed the following:
> 1. If, e.g., the timestamp (epoch) in the scrape is 12:00:00, then
> Prometheus does not scrape the targets.
> 2. If, e.g., the timestamp (epoch) in the scrape is 12:00:01, then
> Prometheus does scrape the targets.
>
> Thanks & regards,
> Chalapathi
>
> On Thu, Jul 2, 2020 at 3:03 PM Venkata Bhagavatula <
> venkat.cha...@gmail.com> wrote:
>
>> Hi,
>>
>> We are using Prometheus version 2.11.1. In our application, the scrape
>> target has a timestamp field. When the timestamp field is present,
>> Prometheus is not scraping any metrics.
>> Following is the output of the curl request for scrape target:
>>
>>- *cmd: curl  http://:24231/metrics*
>>
>> meas_gauge{id="Filtered",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
>> 0.0 159368040
>> meas_gauge{id="Rejected",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
>> 0.0 159368040
>> meas_gauge{id="ReprocessedIn",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
>> 0.0 159368040
>> meas_gauge{id="Created",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
>> 0.0 159368040
>> meas_gauge{id="Duplicated",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
>> 0.0 159368040
>> meas_gauge{id="Stored",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
>> 336.0 159368040
>> meas_gauge{id="Retrieved",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
>> 354.0 159368040
>> meas_gauge{id="ReducedInMerging",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
>> 0.0 159368040
>>
>>
>>
>>- I checked that time is in sync between the prometheus node and the
>>target node.
>>- Following is the epoch time on the prometheus node:
>>
>> *cmd: date +'%s%3N'*
>> *1593681793979*
>>
>>
>>- The epoch difference between the Prometheus node and the timestamp
>>present in the sample is about 23 minutes:
>>
>> difference = ( 1593681793979 -  159368040) / 1000 = 1393sec = 23min
>>
>> Scrape_interval is configured as 300s
>> honor_timestamps is set to true.
>>
>> Can you let us know why Prometheus is not able to scrape the targets? Is
>> it due to the timestamp difference between Prometheus and the target?
>> How much of a difference will Prometheus tolerate?
>>
>> Thanks n Regards,
>> Chalapathi
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/CABXnQPuB5iWDhDw06OLOepmz5_XgC2a%3DC9uuVaDKcczR9B-%2BAA%40mail.gmail.com
> <https://groups.google.com/d/msgid/prometheus-users/CABXnQPuB5iWDhDw06OLOepmz5_XgC2a%3DC9uuVaDKcczR9B-%2BAA%40mail.gmail.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmCdo2txis5J8vVRGbMGB4X1N%2BjQiUr9aujrf1sqdXmjPw%40mail.gmail.com.


Re: [prometheus-users] Query with diivision

2020-07-06 Thread Aliaksandr Valialkin
Try the following query:

(rules_job_count{cluster="loco-prod", status="failed"} + ignoring(status)
rules_job_count{cluster="loco-prod", status="cancelled"}) /
ignoring(status) rules_job_count{cluster="loco-prod", status="finished"}

It instructs Prometheus to ignore the `status` label when performing the
addition and division operations. See more details about this at
https://prometheus.io/docs/prometheus/latest/querying/operators/#vector-matching
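To make the `ignoring(status)` matching concrete, here is a sketch in Python (not Prometheus code) that mimics one-to-one vector matching for the query above; the sample values 3, 1, and 8 are invented for illustration:

```python
import operator

def binop_ignoring(left, right, op, ignore):
    """Mimic one-to-one `left <op> ignoring(<labels>) right`: two series
    match when all their labels except the ignored ones are equal.
    Vectors are dicts mapping frozenset({(label, value), ...}) -> sample.
    Illustrative sketch only, not Prometheus's implementation."""
    def key(labels):
        return frozenset((k, v) for k, v in labels if k not in ignore)
    rhs = {key(labels): v for labels, v in right.items()}
    out = {}
    for labels, v in left.items():
        k = key(labels)
        if k in rhs:
            out[k] = op(v, rhs[k])  # result keeps only the shared labels
    return out

def series(**labels):
    return frozenset(labels.items())

failed    = {series(cluster="loco-prod", status="failed"): 3.0}
cancelled = {series(cluster="loco-prod", status="cancelled"): 1.0}
finished  = {series(cluster="loco-prod", status="finished"): 8.0}

# (failed + ignoring(status) cancelled) / ignoring(status) finished
total = binop_ignoring(failed, cancelled, operator.add, {"status"})
ratio = binop_ignoring(total, finished, operator.truediv, {"status"})
print(list(ratio.values()))  # [0.5]
```

Without `ignoring(status)`, the label sets differ in `status`, nothing matches, and the result is empty - which is why the original query returned no data.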


On Mon, Jul 6, 2020 at 10:48 AM Альберт Александров 
wrote:

>
> Hi all!
>
>
> I have these metrics:
>
>
> [image: photo_2020-07-06_10-30-12.jpg]
>
> I would like to query:
>
> (rules_job_count{cluster="loco-prod", status="failed"} +
>> rules_job_count{cluster="loco-prod", status="cancelled"}) /
>> rules_job_count{cluster="loco-prod", status="finished"}
>
>
> But this didn't work. At the same time this query works:
>
> rules_job_count{cluster="loco-prod", status="failed"} +
>> rules_job_count{cluster="loco-prod", status="failed"}
>
>
> Could you please tell me how to make the first query work?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/ac417564-df12-4627-8c09-2538c759a7c7o%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/ac417564-df12-4627-8c09-2538c759a7c7o%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmBXZoVtuxBXJjJ3GDf7PvVEszx%3DF3kqo_rwxa7yNKNhtA%40mail.gmail.com.


Re: [prometheus-users] Prometheus crashes when number of unique metrics increase

2020-06-22 Thread Aliaksandr Valialkin
On Mon, Jun 22, 2020 at 9:12 PM Paul Dubuc  wrote:

> I have a question about this for some clarification. How does the
> '--metric-interval' flag affect the cardinality of the metric data? Does it
> have the same multiplier effect that '--series-interval' does, in that every
> 120 seconds (default) a whole new set of unique metrics is generated on
> top of those generated at the '--series-interval'?
>

Yes.


>
> Also, do you know if setting these intervals to 0 would keep the changes
> from taking effect?
>

No. Just set `--metric-interval` to a value exceeding your test duration
if you need to suppress the churn rate. Something like the following should
work: --metric-interval=10


>
> Thanks.
>
> On Thursday, December 5, 2019 at 2:09:55 PM UTC-5, Aliaksandr Valialkin
> wrote:
>>
>> Note that `avalanche` introduces high churn rate for time series, i.e.
>> old time series are constantly substituted by new time series every
>> `--series-interval` seconds. Default value for `--series-interval` is 60
>> seconds, i.e. every 60 seconds new time series are created. So for
>> `--metrics-count=1000 --series-count=1000` case `avalanche` introduces new
>> 1M time series every minute. In 30 minutes Prometheus scrapes 30M time
>> series from `avalanche`. See also `--metric-interval` command-line flag,
>> which has almost the same meaning as `--series-interval`.
>>
>> BTW, how much RAM is available for your Prometheus setup?
>>
>> On Thu, Dec 5, 2019 at 9:34 AM Rupesh Tripathi 
>> wrote:
>>
>>> Hello Folks,
>>>
>>> I performed some load/stress tests on Prometheus; please find the
>>> details and outcome below. I observed that the Prometheus docker container
>>> abruptly disappeared/crashed in a few instances. Can someone please help
>>> explain the limitations of Prometheus in terms of the number of
>>> unique metrics with high-cardinality data?
>>>
>>> Steps:
>>>
>>>
>>>
>>>1. Start Avalanche for producing unique metrics
>>>   1. docker run -d --net=host quay.io/freshtracks.io/avalanche
>>>   --metric-count=1000 --series-count=1000 --port=9001
>>>   2. This will create 1000 unique metric names, each with 1000
>>>   unique tag values, overall 1000*1000 unique series.
>>>   3. Avalanche runs on port 9001 by default, but we can change it by
>>>   providing a port value.
>>>
>>>
>>>2. *Start the Prometheus docker container pointing to 9001/metrics to
>>>fetch metrics from Avalanche*
>>>   1. docker run -d --net=host --rm -v
>>>   $(pwd)/prometheus0_eu2.yml:/etc/prometheus/prometheus.yml -v
>>>   $(pwd)/prometheus0_eu2_data:/prometheus -u root --name 
>>> prometheus-0-eu2
>>>   prom/prometheus --config.file=/etc/prometheus/prometheus.yml
>>>   --storage.tsdb.path=/prometheus --web.listen-address=:9092
>>>   --web.enable-lifecycle --web.enable-admin-api
>>>
>>>
>>>
>>> Results:
>>>
>>> 1. 1000 * 1000 = 1,000,000 series (1000 unique metric names, each with
>>> 1000 unique label/tag values): duration 10-15 minutes; average CPU
>>> 15.0-20.0%; memory 95-99%. The Prometheus container crashes and stops
>>> abruptly after 15-30 minutes, most likely due to out of memory.
>>>
>>> 2. 100 * 1000 = 100,000 series (100 unique metric names, each with 1000
>>> unique label/tag values): duration 1-1.5 hours; average CPU 15.0-20.0%;
>>> memory 90-99% (starts increasing slowly and after an hour grows to
>>> 90%+). The Prometheus container stops abruptly after running for 3-4
>>> hours.
>>>
>>> 3. 100 * 100 = 10,000 series (100 unique metric names, each with 100
>>> unique label/tag values): duration 2 days; average CPU 5.0-7.0%; memory
>>> 25-28%. The Prometheus service continues to run without any issues.
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to promethe...@googlegroups.com.
>>> To view this discussion o

Re: [prometheus-users] data folder is still use with influxdb

2020-06-22 Thread Aliaksandr Valialkin
Hi Mathieu!

Prometheus continues writing data to local storage after enabling remote
write to InfluxDB or any other remote storage. This is needed for alerting
<https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/>
and recording rules
<https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/>,
which rely on locally stored data. The amounts of locally stored data may
be controlled with --storage.tsdb.retention.time and
--storage.tsdb.retention.size command-line flags passed to Prometheus during
startup. See
https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects for
details. The minimum value for --storage.tsdb.retention.time is 2h due to
implementation details.
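As an illustrative sketch (the flag values here are hypothetical, not from the original message), limiting local storage could look like:

```
prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.retention.time=2h \
  --storage.tsdb.retention.size=512MB
```

Whichever limit is hit first (time or size) triggers removal of the oldest data blocks.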

BTW, InfluxDB isn't the best remote storage for Prometheus, since it
doesn't support PromQL
<https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085>
natively. I'd recommend taking a look at other remote storage systems such
as M3DB, Cortex or VictoriaMetrics, since they have better integration with
the Prometheus ecosystem compared to InfluxDB.

On Mon, Jun 22, 2020 at 11:42 AM Mathieu 
wrote:

> Hello,
>
> I use InfluxDB as storage, but after configuring everything, Prometheus
> (version 2.19.1) creates a /data/ directory and continues to push the metrics
> to it.
>
> The metrics are also pushed to InfluxDB.
>
> I guess the metrics should be in only one place (InfluxDB).
>
>
> This is my log output:
>
> Jun 21 13:27:52 ip-172-31-0-6 prometheus[1332]: level=info
> ts=2020-06-21T13:27:52.861Z caller=main.go:827 msg="Completed loading of
> configuration file"
> filename=/usr/local/prometheus-2.19.1.linux-amd64/prometheus.yml Jun 21
> 13:27:52 ip-172-31-0-6 prometheus[1332]: level=info
> ts=2020-06-21T13:27:52.861Z caller=main.go:646 msg="Server is ready to
> receive web requests." Jun 21 13:27:58 ip-172-31-0-6 prometheus[1332]:
> ts=2020-06-21T13:27:58.614Z caller=dedupe.go:112 component=remote
> level=info remote_name=d2d964 url="
> http://localhost:8086/api/v1/prom/write?db=prometheus=prometheus=x;
> msg="Done replaying WAL" duration=5.816472058s Jun 21 13:35:23
> ip-172-31-0-6 influxd[13959]: [httpd] 127.0.0.1 - prometheus
> [21/Jun/2020:13:35:23 +] "POST
> /api/v1/prom/write?db=prometheus=%5BREDACTED%5D=prometheus HTTP/1.1"
> 204 0 "-" "Prometheus/2.19.1" 0f20b833-b3c4-11ea-b27d-0e0df4cb2655 2384 Jun
> 21 13:35:23 ip-172-31-0-6 influxd[13959]: [httpd] 127.0.0.1 - prometheus
> [21/Jun/2020:13:35:23 +] "POST
> /api/v1/prom/write?db=prometheus=%5BREDACTED%5D=prometheus HTTP/1.1"
> 204 0 "-" "Prometheus/2.19.1" 0f212701-b3c4-11ea-b27e-0e0df4cb2655 2454 Jun
> 21 13:35:23 ip-172-31-0-6 influxd[13959]: [httpd] 127.0.0.1 - prometheus
> [21/Jun/2020:13:35:23 +] "POST
> /api/v1/prom/write?db=prometheus=%5BREDACTED%5D=prometheus HTTP/1.1"
> 204 0 "-" "Prometheus/2.19.1" 0f2196e6-b3c4-11ea-b27f-0e0df4cb2655 2596 Jun
> 21 13:35:23 ip-172-31-0-6 influxd[13959]: [httpd] 127.0.0.1 - prometheus
> [21/Jun/2020:13:35:23 +] "POST
> /api/v1/prom/write?db=prometheus=%5BREDACTED%5D=prometheus HTTP/1.1"
> 204 0 "-" "Prometheus/2.19.1" 0f220c17-b3c4-11ea-b280-0e0df4cb2655 7915
>
> Has anybody already had this issue?
>
> Warm regards,
>
> Mathieu
>
> *ARTERYS**: one of the World's *
>
> *50 Most Innovative Companies*
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/e7288eee-9b72-4e46-acab-b6e39ff53375n%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/e7288eee-9b72-4e46-acab-b6e39ff53375n%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmC9QrJy82MmcCK-fCvChORYK0Aq3zO%3DgFghg8tYX_nn8g%40mail.gmail.com.


Re: [prometheus-users] Preventing data loss from poor network communication

2020-06-19 Thread Aliaksandr Valialkin
Hi Mathieu!

What kind of resources are available on the metrics server? Probably,
vmagent
<https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md>
could be placed on each metrics server in order to reliably collect data
and then send it to a centralized storage when the connection is available.
This is one of the main use cases for vmagent - see
https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md#iot-and-edge-monitoring
for
details.

On Mon, Jun 15, 2020 at 3:30 PM Mathieu Tétreault <
mathieu.alexandre.tetrea...@gmail.com> wrote:

> Alright, I'll look into it.
>
> Just in case we don't have the resources required to run prometheus and
> thanos sidecar on the metrics server.
>
> Would there be any issues using the pushgateway to cache the metrics while
> the network is down? I understand that it would be more complicated to
> implement, but other than that? I'll do some testing this week, but I was
> wondering if there was anything I was missing.
>
> Thanks for your help, it is really appreciated.
>
> Cheers,
>
> Mathieu
>
> On Sun, Jun 14, 2020 at 7:32 AM Stuart Clark 
> wrote:
>
>> What you'd generally do is look at using federation or one of the global
>> storage systems like Victoria Metrics, Thanos or Cortex.
>>
>> You'd have a Prometheus server in each location, and then central systems
>> for global views and alerts.
>>
>> On 14 June 2020 12:19:43 BST, "Mathieu Tétreault" <
>> mathieu.alexandre.tetrea...@gmail.com> wrote:
>>>
>>> I will have to double check; at first glance, the metrics servers didn't
>>> have enough resources available to run Prometheus alongside their
>>> application.
>>> That's the main reason why I started investigating a
>>> watchdog setup and the pushgateway.
>>>
>>> My understanding is that it will also prevent Grafana from properly
>>> displaying the data from time to time, since sometimes it won't be
>>> able to query the metrics server; an issue that would be less visible if we
>>> had a global Prometheus instance that stores all the data.
>>>
>>> Cheers,
>>>
>>> Mathieu
>>>
>>> On Sat, Jun 13, 2020 at 8:25 AM Stuart Clark 
>>> wrote:
>>>
>>>> On 12/06/2020 19:45, Mathieu Tétreault wrote:
>>>> > We plan on using Prometheus to fetch data from multiple servers, and
>>>> > the link between the metrics servers and the Prometheus servers is
>>>> > known for not being that reliable. The instability can last a couple
>>>> > of minutes and there is nothing we can do about it.
>>>> >
>>>> > Most of the time prometheus will be able to fetch the metrics.
>>>> > However, when prometheus is unable to pull the data the metrics
>>>> server
>>>> > will need to be able to cache them until the connection is back.
>>>> >
>>>> > Since most of the time the connection will be up, I was thinking
>>>> about
>>>> > setting up a watchdog refreshed by the metric pull. When the watchdog
>>>> > triggers, then cache the data until the pushgateway is pulled.
>>>> >
>>>> > If anyone had any advise on that, that'd be appreciated.
>>>> >
>>>>
>>>> Is it possible to run the Prometheus server on the other end of the
>>>> link?
>>>>
>>>> In general it is advised to run Prometheus servers as close as possible
>>>> to the things being monitored. For example a server per datacenter
>>>> instead of a single global server, etc.
>>>>
>>>>
>> --
>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/CAO%2BPXKMdJCKuBJqZp0TOthyAr6okKrgJH3cNMSLGSqUjzYBgKg%40mail.gmail.com
> <https://groups.google.com/d/msgid/prometheus-users/CAO%2BPXKMdJCKuBJqZp0TOthyAr6okKrgJH3cNMSLGSqUjzYBgKg%40mail.gmail.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmD3mJvgC8t%3DsMO0mAdqwLG0HEepZU34rfhkU4DirEkvKQ%40mail.gmail.com.


Re: [prometheus-users] Ways to Mitigate on huge label value pair during high cardinalities

2020-06-05 Thread Aliaksandr Valialkin
On Fri, Jun 5, 2020 at 2:55 PM Dinesh N 
wrote:

> Hi Aliaksandr,
>
> Thanks for the valuable insights..
>
> I shall take a look at bomb-squad. In the meanwhile, do you foresee any
> generic optimizations using metric_relabel_configs, or could using
> sample_limit help to reduce the cardinality?
>

`sample_limit` won't help here, since it limits the number of samples that
can be scraped from a single target; it doesn't limit the number of unique
label=value pairs.
The generic solution is to identify the labels with the biggest number of
unique values via the `/api/v1/status/tsdb` page and then remove these
labels via `metric_relabel_configs` using `action: labeldrop`.
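A minimal sketch of such a scrape config (the job name, target and the `request_id` label are hypothetical examples):

```
scrape_configs:
  - job_name: 'app'
    static_configs:
      - targets: ['app:9090']
    metric_relabel_configs:
      # drop the high-cardinality label identified via /api/v1/status/tsdb
      - action: labeldrop
        regex: request_id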



>
> Time series -
>
> Currently we have close to 8 million time series in a single block, which
> is compacted every 2 hours.
>
> Prometheus config -
>
> RAM - 120 GB
> CPU - 32 core CPU
> Storage - 1 TB
>
> Problem statement -
>
> Once the RSS memory spikes above 110 GB, Prometheus crashes, which makes
> our system very unstable. We can't increase the resources any further, as
> we are already operating with the highest config.
>

>
> Any directions/approaches/mechanism are highly appreciated .
>

Try increasing scrape_interval for all the metrics. This should reduce RAM
usage for Prometheus.
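For example, a global scrape_interval may be raised in prometheus.yml (the values here are illustrative):

```
global:
  scrape_interval: 60s  # e.g. raised from 15s to reduce the ingestion rate
```

Fewer samples per series per hour means smaller in-memory buffers before each 2-hour block is flushed.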

Another option is to try VictoriaMetrics - it should use a lower amount of
RAM compared to Prometheus for this workload.

-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmATa894%3DbrO7UQrfAq%3Dzqw3q%2B-%2BKpdeBtF4pAKEknUkBA%40mail.gmail.com.


Re: [prometheus-users] Ways to Mitigate on huge label value pair during high cardinalities

2020-06-05 Thread Aliaksandr Valialkin
High cardinality labels may be monitored at /api/v1/status/tsdb page. See
https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-stats for
details. Note that this page became available starting from Prometheus
v2.14 .

Take a look also at https://github.com/open-fresh/bomb-squad project, which
detects high cardinality labels and automatically adds relabeling rules in
order to reduce the cardinality.

On Thu, Jun 4, 2020 at 11:16 PM Dinesh N 
wrote:

> Thanks Murali for the quick response
>
> But how do we analyse it and can we below options
>
> 1) metric_relabel_configs
> 2) recording rules
>
>
> On Fri, 5 Jun, 2020, 12:38 am Murali Krishna Kanagala, <
> kanagalamur...@gmail.com> wrote:
>
>> This should be taken care of by the exporter you are collecting metrics
>> from. If you are writing your own exporter, then validate which labels
>> stay unique all the time. For example, if you are collecting nginx request
>> metrics, then passing a request UUID as a label makes the metrics highly
>> cardinal.
>>
>> On Thu, Jun 4, 2020, 2:01 PM Dinesh N 
>> wrote:
>>
>>> Hi Team,
>>>
>>> Figuring out ways to optimise high cardinality labels, any suggestions
>>> are welcomed here.
>>>
>>> Regards
>>> Dinesh
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to prometheus-users+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/prometheus-users/CAA6KEskjY1TpjVddTSAAx8AL3aH7Zc7yyNcg0McpOBOfrrow0g%40mail.gmail.com
>>> <https://groups.google.com/d/msgid/prometheus-users/CAA6KEskjY1TpjVddTSAAx8AL3aH7Zc7yyNcg0McpOBOfrrow0g%40mail.gmail.com?utm_medium=email_source=footer>
>>> .
>>>
>> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/CAA6KEs%3D4Zf29aOdMTXG_0JExSkXKaow3JG6aZSh-X2xb1PSHgA%40mail.gmail.com
> <https://groups.google.com/d/msgid/prometheus-users/CAA6KEs%3D4Zf29aOdMTXG_0JExSkXKaow3JG6aZSh-X2xb1PSHgA%40mail.gmail.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmAH0eXLgi0%2BqtBOSmVSXB7OOYe0pdhp%2BpPf_uZ%2BvSMCnw%40mail.gmail.com.


Re: [prometheus-users] Re: Kubernetes pod memory limit and Prometheus mmap data

2020-06-04 Thread Aliaksandr Valialkin
FYI, the following program may be useful for this case -
https://github.com/linchpiner/cgroup-memory-manager

On Thu, Jun 4, 2020 at 1:13 PM Shadi Abdelfatah 
wrote:

> Related blog post:
> https://medium.com/faun/how-much-is-too-much-the-linux-oomkiller-and-used-memory-d32186f29c9d
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/2507b002-5e17-4a74-bcea-a774f6ac7d55%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-users/2507b002-5e17-4a74-bcea-a774f6ac7d55%40googlegroups.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmDhC299BwiEbYxxsg-46KRvDdzT4zorVA-EpOr%2B7E3RtQ%40mail.gmail.com.


Fwd: [prometheus-users] Is remote read right approach for for cross cluster or distributed environment alert rules ?

2020-05-27 Thread Aliaksandr Valialkin
Take a look also at the following projects:

* Promxy <https://github.com/jacksontj/promxy> - it allows executing alerts
over multiple Prometheus instances. See these docs
<https://github.com/jacksontj/promxy/blob/master/README.md#how-do-i-use-alertingrecording-rules-in-promxy>
for details.
* VictoriaMetrics <https://github.com/VictoriaMetrics/VictoriaMetrics>+
vmalert
<https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmalert>.
Multiple Prometheus instances may write data into a centralized
VictoriaMetrics via the remote_write API, then vmalert may be used for alerting
on top of all the collected metrics in VictoriaMetrics.

On Wed, May 27, 2020 at 7:46 PM Rajesh Reddy Nachireddi <
rajeshredd...@gmail.com> wrote:

> Hi Ben,
>
> Does the latest version of Cortex/Thanos support alerting with multiple
> shards of Prometheus?
> Thanos Ruler wasn't ready for production to evaluate expressions across
> the Prometheus instances. Do we have any document or blog about this?
>
> Thanks,
> Rajesh
>
> On Tue, May 26, 2020 at 11:37 AM Ben Kochie  wrote:
>
>> This is probably a case where you would want to look into Thanos or
>> Cortex to provide a larger aggregation layer on top of multiple Prometheus
>> servers.
>>
>> On Sun, May 17, 2020 at 11:53 AM Rajesh Reddy Nachireddi <
>> rajeshredd...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Basically, we have a large networking setup with 10k devices. We are
>>> hitting 1M metrics every second from just 20% of the devices, so we have 5
>>> prom instances and one global Prometheus which uses remote read to handle
>>> alert rule evaluations, and Thanos Querier for visualisation on Grafana.
>>>
>>> We have segregated devices with specific device ip ranges to each
>>> Prometheus instances.
>>>
>>> So, we have one aggregator which reads from all the
>>> individual prom instances through remote read.
>>>
>>> 1. Will remote read cause an issue w.r.t. loading the large time
>>> series over the wire every 1 min?
>>> 2. Is it CPU or memory intensive?
>>>
>>> What is best design strategy to handle these scale and alerting across
>>> the devices or metrics ?
>>>
>>> Regards,
>>>
>>> Rajesh
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to prometheus-users+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/prometheus-users/CAEyhnp%2BfG8YvciR4-30D%2BzsDzg_kF%2BKkJUavdbyGCxoz-97q_A%40mail.gmail.com
>>> <https://groups.google.com/d/msgid/prometheus-users/CAEyhnp%2BfG8YvciR4-30D%2BzsDzg_kF%2BKkJUavdbyGCxoz-97q_A%40mail.gmail.com?utm_medium=email_source=footer>
>>> .
>>>
>> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/CAEyhnpJt4QoMxzcMPvMa8qyDra8LLR9Je4nJqPZek8jSGYPbwA%40mail.gmail.com
> <https://groups.google.com/d/msgid/prometheus-users/CAEyhnpJt4QoMxzcMPvMa8qyDra8LLR9Je4nJqPZek8jSGYPbwA%40mail.gmail.com?utm_medium=email_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmBzx7oHm_rg4dpq8aGJmbJN_ev5szRNa%2BN_pjp13HabXQ%40mail.gmail.com.


Re: [prometheus-users] Re: InfluxQL queries for Data collected by Prometheus

2020-05-21 Thread Aliaksandr Valialkin
Hi Yogesh,

On Sun, May 10, 2020 at 3:51 PM Yogesh Jadhav 
wrote:

> Thanks for the reply, Brian.
>
> I don't mind writing recording rules. My only worry is will I be able to
> use Grafana dashboards without any issues (especially when we access data
> over years) as InfluxDB stores it internally in a different format.
>
> Ours is a small organization with a small network (a few thousand
> devices to monitor) and a small team (I am the only one working on this).
> Keeping disk space usage (relatively) small and fixed even after storing
> data for years is important to us.
>
> Thanos has a lot of overhead, and it supports only object/cloud storage. I
> don't think it supports custom resolutions/retentions. The most important
> point is that its downsampling is intended for query performance optimization
> rather than reducing the storage footprint. In fact, it increases it by 3 times.
>
> VictoriaMetrics can't be considered for production systems (yet) as it is
> basically a one-man show. Also similar to other tools like Thanos, cortex,
> m3, etc, it has installation and maintenance overheads.
>

VictoriaMetrics is successfully used in production by many happy users -
see https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/CaseStudies .
One of the most important selling points for VictoriaMetrics is easy
installation and maintenance. Could you share more details on installation
and maintenance overhead related to VictoriaMetrics?

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmAgFaFJ_H86-StawA0o0OQQuEKwFACbRSO2-c386rqF1w%40mail.gmail.com.


Re: [prometheus-users] How to query to know if a metric has been available for an hour

2020-05-01 Thread Aliaksandr Valialkin
Try the following Prometheus subquery:

count_over_time((CMSummary)[1h:1m]) == 60

It should return only those time series that were available every minute
during the last hour.


On Fri, May 1, 2020 at 11:59 AM Julius Volz  wrote:

> Btw. you could even modify this expression to check manually every e.g. 15
> minutes within the last hour, whether an IP was present at that time
> increment within the last hour:
>
> CMSummary offset 1h
>   and
> CMSummary offset 45m
>   and
> CMSummary offset 30m
>   and
> CMSummary offset 15m
>   and
> CMSummary
>
> So you see, theoretically you could even check every minute or so for the
> presence, but that would become a long query...
>
> On Fri, May 1, 2020 at 10:55 AM Julius Volz  wrote:
>
>> Do you also need to exclude IPs that are present at the beginning of the
>> hour, go missing briefly in between (as in, the time series becomes fully
>> absent), but are present again at the end of the hour? We don't have a way
>> in PromQL to check whether a series has been absent just briefly within an
>> interval, but is there most of the time. We have absent_over_time(), but
>> that only checks whether a series has been fully absent over a given time
>> range.
>>
>> But if your IPs behave in such a way that they don't appear / disappear
>> that rapidly, you could check which ones were there both at the beginning
>> and the end of the interval:
>>
>>   CMSummary offset 1h and CMSummary
>>
>> On Thu, Apr 30, 2020 at 10:51 PM Arnav Bose 
>> wrote:
>>
>>> Hi,
>>>
>>> I know I can check in the graph how long the metric has been available.
>>> In my case I want to create a query which will list down the data for a
>>> metric which has been available for an hour, excluding the ones which at
>>> least went down/missing during the same period.
>>>
>>> Here is my metric - CMSummary{ipAddr="$$$"}. There are at least 2
>>> different IP sources with this metric. I want to know which ones have been
>>> available for the last 1 hr.
>>>
>>>
>>> Thanks,
>>> Arnav
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to prometheus-users+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/prometheus-users/d21d5adf-0307-4367-b050-ec81cf0ce8b2%40googlegroups.com
>>> 
>>> .
>>>
>> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/CA%2BT6Yow7Ka2eXx7a-VaJxDpXC8PAmTXTU3tYwnFVMzVvTe4NKA%40mail.gmail.com
> 
> .
>


-- 
Best Regards,

Aliaksandr

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmAfaL41D13xWu%2B%3DFf4Kgv-eV8maD3zhazEt6JNVwnx6Pg%40mail.gmail.com.


Re: [prometheus-users] Remote Write Server Side Traffic Mirroring + Obfuscator for Prometheus Ecosystem

2020-04-20 Thread Aliaksandr Valialkin
> just I am a bit concerned about one binary that enables everything. It
might be quite hard work to maintain it.

vmagent is just an easy-to-use metrics proxy, which performs the following
tasks:
- Accepts data via various popular ingestion protocols (Prometheus
remote_write, Influx line protocol, Graphite, OpenTSDB, CSV). Additionally,
it can scrape Prometheus targets.
- Augments and filters the accepted data with Prometheus-compatible
relabeling.
- Pushes the filtered data to the configured remote storage targets via
Prometheus remote_write protocol. Additional per-target relabeling can be
applied to data before sending it to each remote storage target.

vmagent uses independent file-based buffers for each configured remote
storage target, so it may buffer the data locally until a temporarily
unavailable remote storage target becomes available again.
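A hedged sketch of the mirroring setup, using only the flags mentioned in this thread (the URLs and relabel-config file names are hypothetical; each -remoteWrite.urlRelabelConfig entry is assumed to apply to the corresponding -remoteWrite.url):

```
vmagent \
  -remoteWrite.url=https://prod-storage/api/v1/write \
  -remoteWrite.urlRelabelConfig=filter-prod.yml \
  -remoteWrite.url=https://staging-storage/api/v1/write \
  -remoteWrite.urlRelabelConfig=filter-staging.yml \
  -remoteWrite.url=https://dev-storage/api/v1/write \
  -remoteWrite.urlRelabelConfig=obfuscate-dev.yml
```

Each relabel config can filter or rewrite labels independently before the data is pushed to that particular target.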


On Mon, Apr 20, 2020 at 9:44 PM Bartłomiej Płotka 
wrote:

> Thanks, Aliaksandr!
>
> So vmagent, on top of scraping, ALSO receives the remote write API? What
> CAN'T it do? =D
>
> It looks indeed that feature-wise it is what we meant, just I am a bit
> concerned about one binary that enables everything. It might be quite hard
> work to maintain it... You must be some kind of superhuman Aliaksandr! (:
> Definitely will take a look, thanks. (:
>
> Kind Regards,
> Bartek
>
> On Mon, 20 Apr 2020 at 19:38, Aliaksandr Valialkin 
> wrote:
>
>> Such a mirroring can be done with vmagent
>> <https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md>
>> - just configure multiple `-remoteWrite.url` targets with distinct
>> `-remoteWrite.urlRelabelConfig` configs for obfuscation and filtering. The
>> final system will look like the following:
>>
>>   ->remote target1 (prod)
>> Prometheus -> vmagent -> filtering ->   remote target2 (staging)
>>   -> obfuscation -> remote target3 (dev)
>>
>>
>> On Mon, Apr 20, 2020 at 9:24 PM Bartłomiej Płotka 
>> wrote:
>>
>>> Hi!
>>>
>>> This question is not strictly related to Prometheus, but rather to
>>> server-side Remote Write APIs.
>>> We are looking at how to have more realistic staging environments for
> servers like that. In order to achieve this, we want to "mirror" / "fork" a
> portion of the production remote write traffic to other clusters' APIs (e.g.
> staging or dev environments).
>>>
> As part of this mirroring, data potentially has to be obfuscated to
> avoid leaking sensitive data, but also without totally changing the
> characteristics of the data (e.g. the same number of labels, label
> values/names with the same number of characters, etc.).
>>>
>>> In the future, we could add some more advanced features if needed (e.g
>>> load balancing).
>>>
>>> Wonder if anyone in the community had been working on something like
>>> that already and has something to share/is already shared?
>>>
>>> ProxySQL <https://github.com/sysown/proxysql/wiki/Mirroring> is
>>> something like that but in the SQL world. Would be awesome to have the same
>>> for remote write (and Query API as well I guess, but let's think about it
>>> in a separate thread) (:
>>> <https://github.com/thanos-io/thanos/issues/2480>
>>> Some discussion on Thanos project:
>>> https://github.com/thanos-io/thanos/issues/2480
>>>
>>> Please help if you know or have worked on something like this (: Would
>>> be a nice community Project if nothing exists!
>>>
>>> Kind Regards,
>>> Bartek
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to prometheus-users+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/prometheus-users/CAMssQwYa3kW8UMPtJ2PuW8%3Dd8kWB-sz1E99D20ODn28KZTb%2BNQ%40mail.gmail.com
>>> <https://groups.google.com/d/msgid/prometheus-users/CAMssQwYa3kW8UMPtJ2PuW8%3Dd8kWB-sz1E99D20ODn28KZTb%2BNQ%40mail.gmail.com?utm_medium=email_source=footer>
>>> .
>>>
>>
>>
>> --
>> Best Regards,
>>
>> Aliaksandr
>>

Re: [prometheus-users] Remote Write Server Side Traffic Mirroring + Obfuscator for Prometheus Ecosystem

2020-04-20 Thread Aliaksandr Valialkin
Such a mirroring can be done with vmagent
<https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md>
- just configure multiple `-remoteWrite.url` targets with distinct
`-remoteWrite.urlRelabelConfig` configs for obfuscation and filtering. The
final system will look like the following:

                         -> remote target1 (prod)
Prometheus -> vmagent    -> filtering -> remote target2 (staging)
                         -> obfuscation -> remote target3 (dev)
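As a rough sketch of what those per-target relabel configs might contain
(the rules and label names here are hypothetical examples, not taken from
this thread):

```yaml
# filter.yml - keep only the jobs worth mirroring to the staging target
- action: keep
  source_labels: [job]
  regex: "api|db"

# obfuscate.yml - mask a sensitive label value for the dev target by
# replacing it with a fixed placeholder (a real setup might hash it instead)
- action: replace
  source_labels: [user]
  target_label: user
  replacement: "xxxxxxxx"
```

Each file would then be wired to its own target via repeated
`-remoteWrite.url` / `-remoteWrite.urlRelabelConfig` flags, as described in
the vmagent README.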


On Mon, Apr 20, 2020 at 9:24 PM Bartłomiej Płotka 
wrote:

> Hi!
>
> This question is not strictly related to Prometheus, but rather to
> server-side Remote Write APIs.
> We are looking at how to have more realistic staging environments for
> servers like that. In order to achieve so, we want to "mirror" / "fork"
> portion of production remote write traffic to other clusters APIs (e.g
> staging or dev environment).
>
> As part of this mirroring, data has to be potentially obfuscated to avoid
> leaking of sensitive data, but also without totally changing the
> characteristic of data (e.g same number labels, labels values/names with
> the same amount of characters, etc).
>
> In the future, we could add some more advanced features if needed (e.g
> load balancing).
>
> Wonder if anyone in the community had been working on something like that
> already and has something to share/is already shared?
>
> ProxySQL <https://github.com/sysown/proxysql/wiki/Mirroring> is something
> like that but in the SQL world. Would be awesome to have the same for
> remote write (and Query API as well I guess, but let's think about it in a
> separate thread) (:
> 
> Some discussion on Thanos project:
> https://github.com/thanos-io/thanos/issues/2480
>
> Please help if you know or have worked on something like this (: Would be
> a nice community Project if nothing exists!
>
> Kind Regards,
> Bartek
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/CAMssQwYa3kW8UMPtJ2PuW8%3Dd8kWB-sz1E99D20ODn28KZTb%2BNQ%40mail.gmail.com
> 
> .
>


-- 
Best Regards,

Aliaksandr

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmBEvDbThrr1cx9FgPhZRVrJqeS2%3DsgEOZguoMWH6FS7BA%40mail.gmail.com.


Re: [prometheus-users] Re: Prometheus exporter - reading CSV file having data from the past day

2020-04-15 Thread Aliaksandr Valialkin
On Tue, Apr 14, 2020 at 1:38 PM Brian Candler  wrote:

> Sorry, but I'm afraid you cannot backfill historical data into
> prometheus.  Prometheus will only scrape the current/latest value.
> Backfill is a feature being considered for the future.
>
> For now, you will need to look at a different storage engine.  I suggest
> VictoriaMetrics, which is prometheus-compatible (i.e. it supports
> prometheus API and promQL and can be configured as a remote-write endpoint
> for prometheus) and it *does* support the back-filling that you want to
> do.  However it's not a remote-read endpoint, so for example prometheus'
> alerting engine can't read data from it.  If you're only ingesting daily
> then this is unlikely to be a limitation.
>

An additional important detail is that VictoriaMetrics supports CSV data
ingestion in addition to backfilling.
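For instance, a single CSV row carrying its own timestamp could be imported
with a command along these lines (host, column layout and values are
illustrative; see the VictoriaMetrics CSV import docs for the exact
`format` syntax):

```shell
# columns: 1 -> label "ticker", 2 -> metric "ask",
# 3 -> sample timestamp in unix seconds (this is what enables backfilling)
curl -d 'GOOG,1.23,1586822400' \
  'http://victoriametrics:8428/api/v1/import/csv?format=1:label:ticker,2:metric:ask,3:time:unix_s'
```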


>
> Also check out other storage integrations
> 
> which support both remote write and remote read, as you could populate the
> backend directly but prometheus can still read from them.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/fc3f5c0b-c187-4698-9028-14144a7f38c4%40googlegroups.com
> 
> .
>


-- 
Best Regards,

Aliaksandr

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmCRPu7_VHe32Ou2BXGxD1C%2BzAGX1_J3kcfvjHeQ_PstiQ%40mail.gmail.com.


Re: [prometheus-users] Re: How to solve "Two hour in-memory prometheus data during upgrade/failover"

2020-03-31 Thread Aliaksandr Valialkin
Another option is to configure Prometheus instances to replicate data to
remote storage via remote_write.
Prometheus replicates data to the configured remote storage systems as soon
as the data is scraped, so it shouldn't lose big amounts of data on an
unclean shutdown. See the list of supported remote storage systems at
https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage
The most promising systems are Cortex, M3DB and VictoriaMetrics. You can
evaluate multiple systems at once - just add multiple `remote_write -> url`
entries in the Prometheus config.

It is worth reading the Prometheus docs on remote_write config tuning.
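For reference, a minimal sketch of such a config (URLs are placeholders;
the queue setting is just one of the tuning knobs from the remote_write
tuning docs):

```yaml
# prometheus.yml - scraped samples are replicated to every listed endpoint
remote_write:
  - url: "http://cortex.example.com/api/v1/push"
  - url: "http://m3db.example.com/api/v1/prom/remote/write"
  - url: "http://victoriametrics.example.com:8428/api/v1/write"
    queue_config:
      max_samples_per_send: 10000
```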

On Mon, Mar 30, 2020 at 10:21 PM Shaam Dinesh  wrote:

> Hi Brian
>
> Thanks for the response. Yes, I am still leveraging a persistent disk to
> mitigate the restarts, but it was not helping to save the 2 hours of data
> as configured.
>
> Is there any better way to address it?
>
> On Tuesday, March 31, 2020 at 12:35:40 AM UTC+5:30, Brian Candler wrote:
>>
>> When you stop prometheus it writes out its WAL to disk, and when you
>> start it it reads in WAL back from disk.  This is why a prometheus restart
>> can take several minutes (and you should ensure that your supervisor
>> process isn't configured to do a hard kill after a short timeout).
>>
>> Of course it won't be ingesting data during that time, but if you have a
>> second instance, that one still will be.
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/d1adfd24-a199-4f12-b85d-28394a7d52c0%40googlegroups.com
> 
> .
>


-- 
Best Regards,

Aliaksandr

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmDP_PFJ80fbhq48U7FMMB%3DaGNA5TF-Ao5OBjTWO2Ec%2BpA%40mail.gmail.com.


Re: [prometheus-users] Prometheus metrics to a url

2020-03-02 Thread Aliaksandr Valialkin
The following integration mechanisms exist for Prometheus metrics:

- Pushgateway <https://github.com/prometheus/pushgateway> for pushing
Prometheus metrics, so they can be scraped by Prometheus later.
- Prometheus federation
<https://prometheus.io/docs/prometheus/latest/federation/> for scraping
metrics from one Prometheus server into another system.
- remote_write API
<https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write>
for replicating metrics to external storage systems. See the list of
supported systems
<https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage>.
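As an illustration of the federation mechanism, a sketch of the config on
the scraping side (the target host and match expression are made up):

```yaml
scrape_configs:
  - job_name: "federate"
    honor_labels: true          # keep the original job/instance labels
    metrics_path: "/federate"
    params:
      "match[]":
        - '{job="node"}'        # which series to pull from the source
    static_configs:
      - targets: ["source-prometheus.example.com:9090"]
```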

On Mon, Mar 2, 2020 at 12:42 PM adi garg  wrote:

> Hello experts,
>
> Is there a way to send Prometheus metrics to a URL, where there can be an
> exporter to send metrics to another system?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/980590a4-45b0-41ec-a7c3-b26371091e7d%40googlegroups.com
> 
> .
>


-- 
Best Regards,

Aliaksandr

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmCJZFhow7Zs6gYeQGbb7UQKZDxX0TdbMdSrNJWE4cXFjw%40mail.gmail.com.


Re: [prometheus-users] Apply multiple functions on metric

2020-02-26 Thread Aliaksandr Valialkin
I believe something like the following PromQL query should work for your
case:

sum(rate(metric[5m]))

Note that you cannot swap sum() and rate() in the query in the general
case, since this can give unexpected results. See
https://www.robustperception.io/rate-then-sum-never-sum-then-rate for
details.

On Wed, Feb 26, 2020 at 4:37 PM 'Avner Adania' via Prometheus Users <
prometheus-users@googlegroups.com> wrote:

> i'm trying to apply two function on one metric.
> For example:
>
> i would like to get a perSecond(Graphite function) on a sumSeries function:
>
> perSecond(sumSeries(prefix.region.cluster.metric_name.*.offset),"Rate")
>
> Question is how to apply two or more functions on one metric?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/51a596fb-67d1-45d8-a069-19c94fc2a211%40googlegroups.com
> 
> .
>


-- 
Best Regards,

Aliaksandr

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmBg%2B-yyPf7PPxZvyyTfyiZf7WrHXLeCUo7sax3RdAPZig%40mail.gmail.com.


Re: [prometheus-users] Prometheus remote_write to VictoriaMetrics

2020-02-24 Thread Aliaksandr Valialkin
Hi Maria,

Prometheus should write the same metrics to all the configured remote
storage systems via `remote_write -> url`.

On Mon, Feb 10, 2020 at 1:03 PM Maria Stroe  wrote:

> Hello,
>
> I have smth to ask. I want to know how Prometheus acts if i have two
> remote_writes. For example i have one remote_write to a VictoriaMetrics
> node and another remote_write to another VictoriaMetrics.
> My question is: Does Prometheus write the same metrics on both Victorias
> or does it shard?
>
> Ty in advance.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/59026d02-fc04-4d13-b2b2-2802593e8b18%40googlegroups.com
> 
> .
>


-- 
Best Regards,

Aliaksandr

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmAWV2NHh43sJ5NSoSMOh6aAcDtgc38CCnwo9NkbOKsn8g%40mail.gmail.com.


Re: [prometheus-users] Compute aggregate percentile (or average) over multiple time series?

2020-02-24 Thread Aliaksandr Valialkin
On Mon, Feb 24, 2020 at 8:56 PM Yongjik Kim  wrote:

> Hi Aliaksandr,
>
> Thanks a lot for the reply, but I think quantile_over_time() will compute
> percentiles over each series?
>

Yes.


>
> So, for example, if I have three different time series A/B/C (representing
> three instances of the same task T), and I use quantile_over_time(), then I
> could get "95% CPU usage of A/B/C" separately, but it still won't tell me
> "95% CPU usage across all instances of T", as far as I can tell.
>

I'm afraid PromQL doesn't provide functionality for calculating
percentiles over data points from multiple time series on the given range
:( The closest approximation is max(quantile_over_time(0.95, ...)). I
don't recommend using avg() instead of max(), since it hides time series
spikes.
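Applied to the query from this thread, the approximation would read (a
sketch only - the max of per-series quantiles is an upper bound, not a
true cross-series quantile):

```
max(quantile_over_time(0.95, rate(cpu_usage{name="myjob"}[5m])[1d:5m]))
```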


>
>
>
> On Mon, Feb 24, 2020 at 10:41 AM Aliaksandr Valialkin 
> wrote:
>
>> Hi Yongjik!
>>
>> Try using `quantile_over_time` instead of `quantile`. See
>> https://prometheus.io/docs/prometheus/latest/querying/functions/#aggregation_over_time
>>
>>
>> On Fri, Feb 14, 2020 at 9:06 PM 'Yongjik Kim' via Prometheus Users <
>> prometheus-users@googlegroups.com> wrote:
>>
>>> Hi,
>>>
>>> I have a problem with aggregation. I want to get the CPU usage of a set
>>> of jobs (each with potentially different start/stop time), over the past
>>> week, and then get 95% percentile among these values.
>>>
>>> So, I can get the raw data points with this:
>>>
>>> > rate(cpu_usage{name="myjob"}[5m])[1d:5m]
>>>
>>> cpu_usage is an accumulative series (counter?) which records "the amount
>>> of CPU resource this job has used since it started."  So, as far as I
>>> understand, this gives me a nice list of "average CPU usage for each
>>> 5-minute interval, for every job and for every interval the job was alive."
>>>
>>> So far so good, but then how do I get the 95% percentile of *all these
>>> values*?
>>>
>>> If I try this:
>>>
>>> > quantile(0.95, rate(cpu_usage{name="myjob"}[5m])[1d:5m])
>>>
>>> I get: "Error executing query: invalid parameter 'query': parse error at
>>> char 147: expected type instant vector in aggregation expression, got range
>>> vector"
>>>
>>> I can make it output *some number* by removing [1d:5m], but that's not
>>> what I want. I don't need 95% percentile at the current instant, but over
>>> the past week.
>>>
>>> Any way to make it work without piping the result through a custom
>>> script?
>>>
>>> Thanks,
>>> - Yongjik Kim
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to prometheus-users+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/prometheus-users/637d706d-e39a-4b0a-8ec3-70bdcc9c3cbc%40googlegroups.com
>>> <https://groups.google.com/d/msgid/prometheus-users/637d706d-e39a-4b0a-8ec3-70bdcc9c3cbc%40googlegroups.com?utm_medium=email_source=footer>
>>> .
>>>
>>
>>
>> --
>> Best Regards,
>>
>> Aliaksandr
>>
>

-- 
Best Regards,

Aliaksandr

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmDdn-T27u_%3DT1e9hGzktYhGsud4u7v3zWrfUPDipZ9L8g%40mail.gmail.com.


Re: [prometheus-users] Compute aggregate percentile (or average) over multiple time series?

2020-02-24 Thread Aliaksandr Valialkin
Hi Yongjik!

Try using `quantile_over_time` instead of `quantile`. See
https://prometheus.io/docs/prometheus/latest/querying/functions/#aggregation_over_time


On Fri, Feb 14, 2020 at 9:06 PM 'Yongjik Kim' via Prometheus Users <
prometheus-users@googlegroups.com> wrote:

> Hi,
>
> I have a problem with aggregation. I want to get the CPU usage of a set of
> jobs (each with potentially different start/stop time), over the past week,
> and then get 95% percentile among these values.
>
> So, I can get the raw data points with this:
>
> > rate(cpu_usage{name="myjob"}[5m])[1d:5m]
>
> cpu_usage is an accumulative series (counter?) which records "the amount
> of CPU resource this job has used since it started."  So, as far as I
> understand, this gives me a nice list of "average CPU usage for each
> 5-minute interval, for every job and for every interval the job was alive."
>
> So far so good, but then how do I get the 95% percentile of *all these
> values*?
>
> If I try this:
>
> > quantile(0.95, rate(cpu_usage{name="myjob"}[5m])[1d:5m])
>
> I get: "Error executing query: invalid parameter 'query': parse error at
> char 147: expected type instant vector in aggregation expression, got range
> vector"
>
> I can make it output *some number* by removing [1d:5m], but that's not
> what I want. I don't need 95% percentile at the current instant, but over
> the past week.
>
> Any way to make it work without piping the result through a custom script?
>
> Thanks,
> - Yongjik Kim
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/637d706d-e39a-4b0a-8ec3-70bdcc9c3cbc%40googlegroups.com
> 
> .
>


-- 
Best Regards,

Aliaksandr

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmC7TAv2nLmc_sNWScYxy3SbaxWT4gq7B9QPv0b4a7DmzQ%40mail.gmail.com.


Re: [prometheus-users] Re: monitoring Openshift Cluster with External Promethus

2020-02-24 Thread Aliaksandr Valialkin
Another option is to set up remote_write from both Prometheus instances to
a centralized external remote storage system and then query that storage
directly, without needing Prometheus in the query path. Unlike the
approach with Prometheus federation, such a setup doesn't require allowing
incoming connections to the OpenShift cluster.
See the list of supported remote storage systems at
https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage
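When both Prometheus instances write to one central storage, it helps to
label their samples so they can be told apart later; a common sketch (the
label name and URL are illustrative conventions, not from this thread):

```yaml
# prometheus.yml on each instance inside the OpenShift cluster
global:
  external_labels:
    cluster: "openshift-prod"    # identifies which cluster this instance scrapes
remote_write:
  - url: "http://central-storage.example.com/api/v1/write"
```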


On Sat, Feb 15, 2020 at 12:29 AM Murali Krishna Kanagala <
kanagalamur...@gmail.com> wrote:

> You can scrape the Prometheus on the OpenShift cluster using a federation
> configuration on the external Prometheus. With this approach you don't
> have to persist the metrics on the OpenShift cluster (no need for
> persistent storage).
>
> Just expose the Prometheus on the OpenShift cluster using a NodePort or
> something like that and add it to the federation configuration on your
> external Prometheus.
>
> https://prometheus.io/docs/prometheus/latest/federation/
>
> On Fri, Feb 14, 2020, 12:28 PM Tarun Gupta 
> wrote:
>
>> Were you able to figure this out. I have exactly the same problem to
>> solve.
>>
>> Thanks
>>
>> On Friday, June 7, 2019 at 11:31:09 AM UTC-7, IndGirl6 wrote:
>>>
>>> Hi,
>>>
>>> would appreciate it if some one can detail step by step or point me to a
>>> good document for this.
>>> I have an Openshift Cluster setup. It already has its own prometheus and
>>> grafana  setup internally within the cluster.
>>>
>>> External to this cluster, i also have another standalone Prometheus
>>> setup, that monitors my other devices (physical servers / databases /
>>> application etc).
>>>
>>> I would like to be able to monitor my openshift cluster using the
>>> External standalone prometheus server.
>>>
>>> Is this possible, can some one guide me on the setup.
>>>
>>> Thanks
>>> IG6
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Prometheus Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to prometheus-users+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/prometheus-users/ae87e731-f57b-4a6b-8b4e-0e26160c8f37%40googlegroups.com
>> 
>> .
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/CAKimyZsm4GzkahGEX%3DgObF46UVssYN%3DaAfFAyG-pDLLuL1d1AQ%40mail.gmail.com
> 
> .
>


-- 
Best Regards,

Aliaksandr

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAPbKnmBdTEqmkBL4J5GyeuJtbLDOnZMHXCaYf%2BNXgJ9imtJYhQ%40mail.gmail.com.