
Re: [prometheus-developers] Most recent Timeseries in case of GAUGE metrics

2023-09-04 Thread Stuart Clark

On 25/08/2023 04:37, sunil sahu wrote:

Hello All,

I need help in my below use case.

We have a centralized Python application which continuously monitors
state for other applications by running tests. We create
*GAUGE* metrics with a value (e.g. 100) and set them to a different value
when the state changes (e.g. 200 or 300) using the gauge's set function
(Python prometheus_client instrumentation). This runs frequently (1m, 5m,
etc.) depending on the tests.

Our central application runs as multiple pods behind a load balancer. So
any pod can serve the request and set the metric value to a new or
existing one.

We are scraping individual pods every minute using k8s discovery
(role: pods), so any of the pods/instances can have the most recently
updated value. *Now my problem is how to get the metric with the latest
value out of all instances' metrics.*

Aggregations on the 'instance' label like sum/min or max won't be an ideal
choice here; maybe `last` (which I can't find. I am aware of
last_over_time but that is meant to fill data-point gaps with
previous values).
Currently I am living with `last_over_time(max without(instance)
(metric_name))` but as I explained it is giving wrong results for
flaky tests.

I tried to explain as best as possible; let me know if there are any
gaps in understanding the scenario.

So are you saying that the multiple pods which are all getting scraped
might give totally different values for this metric, but only one of
those is correct?

If that is the case then you'd need to ensure the correct value is
returned by every pod, or you store that value elsewhere and scrape that
other system.
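
As a rough illustration of the "store that value elsewhere" option: if each
pod also records when it last set the value, a single place can pick the
newest one. This is only a sketch in plain Python; the Reading type and its
field names are hypothetical and are not part of prometheus_client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One pod's view of the test state (hypothetical structure)."""
    instance: str      # pod that recorded the value
    value: float       # state code, e.g. 100, 200 or 300
    updated_at: float  # Unix time at which the pod last set the value


def latest_reading(readings):
    """Return the reading with the newest update time, or None if empty."""
    return max(readings, key=lambda r: r.updated_at, default=None)


readings = [
    Reading("pod-a", 100.0, 1_692_934_000.0),
    Reading("pod-b", 300.0, 1_692_934_120.0),
    Reading("pod-c", 200.0, 1_692_934_060.0),
]
print(latest_reading(readings))
```

Here pod-b wins because it has the largest updated_at, regardless of
whether its value is the min or max across pods.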

--
Stuart Clark

--
You received this message because you are subscribed to the Google Groups
"Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to prometheus-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-developers/37460f3a-1ef0-69b5-d3ad-02cf2f8e4154%40Jahingo.com.

Re: [prometheus-developers] Maximum length of a time series in Prometheus

2023-03-27 Thread Stuart Clark


On 2023-03-27 14:10, Abdelouahab Khelifati wrote:

I mean the maximum number of datapoints per time series, not the
length of a specific value.


Sorry that is what I was meaning.

There will be loads of resource or performance limitations - such as not 
being able to load the data for very long periods of time without huge 
amounts of memory. However in the normal usage of scrapes around every 
30 seconds there are people who have multi-year storage retentions that 
work successfully (so that would be around a million datapoints per 
year).
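
The "around a million datapoints per year" figure can be checked with a
quick calculation. A small sketch, assuming a fixed scrape interval and
ignoring leap years:

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000, ignoring leap years


def samples_per_year(scrape_interval_seconds):
    """Number of samples one series collects in a year at a fixed interval."""
    return SECONDS_PER_YEAR // scrape_interval_seconds


print(samples_per_year(30))  # 1051200
```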


--
Stuart Clark


Re: [prometheus-developers] Loading a large csv file to Prometheus

2023-03-27 Thread Stuart Clark

On 2023-03-27 13:22, Abdelouahab Khelifati wrote:

Hello Stuart, thanks for reaching out!

I see your point!

My data does contain only numerical data, has a constant rate, and
only has one string label. Would you say that it is possible but not
optimal or simply not possible to use Prometheus as a general-purpose
database?

Also, would it be possible to point out the main technical factors
rendering Prometheus unsuitable as a general-purpose database?

Well the TSDB in Prometheus could be used elsewhere if you created code
(as I think the TSDB code is fairly abstracted from other parts of
Prometheus) but otherwise all interactions are via the rest of
Prometheus. So for example the main way of populating the database is
via scraping targets. There is no way to change existing data (other
than deleting things). There is only a single schema (each timeseries
contains a single float64 per data point with a list of string based
labels [the metric name is actually a label]).
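
To make that single schema concrete, here is an illustrative model in
Python. It is not the real TSDB code; the Series class and the example
metric are made up, and the only detail taken from Prometheus itself is
that the metric name lives in the reserved "__name__" label.

```python
from dataclasses import dataclass, field


@dataclass
class Series:
    """Illustrative model of one time series; not the real TSDB types."""
    labels: dict                                 # label name -> label value, all strings
    samples: list = field(default_factory=list)  # (timestamp_ms, float value) pairs

    def append(self, timestamp_ms, value):
        # Every data point is a single float; there is no other value type.
        self.samples.append((int(timestamp_ms), float(value)))


# The metric name is stored as the reserved "__name__" label.
series = Series(labels={"__name__": "room_temperature_celsius", "room": "lab"})
series.append(1_679_922_000_000, 21.5)
series.append(1_679_922_030_000, 21.7)
print(series.labels["__name__"], series.samples)
```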

So I'd suggest that the TSDB is as it stands very much not a
general-purpose database - it is just an integral part of Prometheus,
designed for the use case of regular live numeric metric values, with
the ability to extract values via PromQL for graphing/alerts/etc. If
your use case aligns with that, then it could be a good fit, but if you
are wanting other things (being able to change values, doing a lot of
bulk importing, wanting to run/scale the database separately, wanting
database reliability/HA similar to other general purpose databases) it
may be a poor fit.

--
Stuart Clark

Re: [prometheus-developers] Maximum length of a time series in Prometheus

2023-03-27 Thread Stuart Clark


On 2023-03-27 14:07, Abdelouahab Khelifati wrote:

Hello,

Is there a hard maximum length limit of a time series in Prometheus?



There is probably something around 64 bit numbers, but as that is a very 
large number I don't think there is anything specific limiting things 
other than memory/disk.


--
Stuart Clark


Re: [prometheus-developers] Loading a large csv file to Prometheus

2023-03-27 Thread Stuart Clark


On 27/03/2023 13:01, Abdelouahab Khelifati wrote:

Thanks Ben for the answer!

Would you have an example of how such data should look?

> Also, that doesn't really look like time-series data. Prometheus is 
a monitoring system, not a generic database.
Would that mean that I cannot store such data in standalone 
Prometheus? My understanding is that Prometheus includes a full TSDB 
system.


Prometheus does contain a TSDB, but it isn't designed as a general 
purpose database. Instead it is designed to be used for metrics that are 
scraped at a constant rate, only storing numeric values per metric (with 
optional string labels). If you are wanting something more generic, you 
are better off looking at other databases (both timeseries based and 
otherwise).


--
Stuart Clark


Re: [prometheus-developers] Windows Exporter License

2022-12-22 Thread Stuart Clark

On 22/12/2022 09:39, Julien Pivotto wrote:

On 22 Dec 10:09, Ben Kochie wrote:

It was my understanding that license changes, can be done by the copyright
holder, without consent of all contributors. Because they do not hold any
copyright to the code. IIRC this is how Grafana was able to relicense from
Apache to AGPL. They did not need to get consent from all contributors.

Of course, old versions are subject to the old license, but moving from
prometheus-community to prometheus would effectively be a fork.

In this case we could do it with permission from the original author as
stated in the LICENSE file.

I did ask the question to CNCF via the service desk, if copyright owners
would be enough. They replied that we have to ask all contributors to
change the license.

Note that grafana was able to relicense because for many years they made
people sign a CLA.

Indeed. If you look at the CLA that everyone is required to sign (and
Grafana have to keep proof that everyone has done so) at
https://grafana.com/docs/grafana/latest/developers/cla/ it says:

"Grant of Copyright License. Subject to the terms and conditions of this
Agreement, You hereby grant to Grafana Labs and to recipients of
software distributed by Grafana Labs a perpetual, worldwide,
non-exclusive, no-charge, royalty-free, irrevocable copyright license to
reproduce, prepare derivative works of, publicly display, publicly
perform, sublicense, and distribute Your Contributions and such
derivative works."

Which allows Grafana (the company) to do anything it likes with your
code - it can be licensed in any way (commercial or open source), given
to others, etc. without needing any further permission.

--
Stuart Clark


Re: [prometheus-developers] Windows Exporter License

2022-12-22 Thread Stuart Clark

On 2022-12-22 09:09, Ben Kochie wrote:

It was my understanding that license changes, can be done by the
copyright holder, without consent of all contributors. Because they do
not hold any copyright to the code. IIRC this is how Grafana was able
to relicense from Apache to AGPL. They did not need to get consent
from all contributors.

Of course, old versions are subject to the old license, but moving
from prometheus-community to prometheus would effectively be a fork.

In this case we could do it with permission from the original author
as stated in the LICENSE file.

You are correct in saying that it is the copyright owner(s) who have to
agree to any license changes.

However by default if you contribute something to a project you are now
one of the copyright owners (only to your contributed code, not the
whole thing). The original owner is nothing special (other than possibly
being the largest owner, because there might be more of their code than
anyone else).

The only way around this (which I assume Grafana did, and other projects
require) is when contributing you sign a copyright transfer agreement -
that way legally the person/organisation the contributors transferred
ownership to is the only owner, and they have the right to do anything
they wanted (including using the code commercially or making everything
closed source).

So if this happened, and there is a record of signed copyright transfers
the license could be changed just by the agreement of the one owner.
Presumably however that isn't the case, and therefore it isn't possible.

Another option which has been used in other projects (such as the Linux
kernel for code that was found to not be correctly licensed [contributed
by someone who didn't have the rights to do so]) is to remove that code
& rewrite it (although you have to be careful that it is done 'cleanly'
to stop claims that you just copied that bad code). At that point the
contributor's code is no more, so no permission is then needed. If 95%
of existing contributors agreed to relicense and/or assign copyright but
there was 5% who didn't agree or couldn't be contacted that would
potentially be an option - of course it could be very
difficult/impossible if the remaining code was something really core.

--
Stuart Clark

Re: [prometheus-developers] rbac support for exporter-toolkit

2022-11-28 Thread Stuart Clark


On 2022-11-28 11:40, Ben Kochie wrote:

It depends on if the sidecar is with Prometheus or with the target.

If it's with Prometheus, that's probably just a docs update.

If it's with every exporter, that's probably something we would want
in the exporter-toolkit.

But, my understanding was that the typical thing here was to use mTLS
for securing and authorizing Prometheus.

If it's something we need to integrate into every exporter to do some
kind of token auth, we might want to consider this.



Do you mean building in the functionality directly into the exporter 
instead of using a sidecar?


--
Stuart Clark


Re: [prometheus-developers] rbac support for exporter-toolkit

2022-11-28 Thread Stuart Clark


On 2022-11-28 11:01, Jesús Samitier wrote:

Yeah, maybe add some documentation with example configurations.



If it is just some docs I don't see any issue?

--
Stuart Clark


Re: [prometheus-developers] rbac support for exporter-toolkit

2022-11-28 Thread Stuart Clark


On 2022-11-28 10:56, Jesús Samitier wrote:

Hi

The idea is to integrate kube-rbac-proxy to add an extra (and
optional) security feature in a new exporter, so the final user can
rely on RBAC to assure that only Prometheus can scrape its metrics.
This is something you get when you install Prometheus in K8s using the
official helm chart - only Prometheus can scrape the Prometheus
metrics exposed by the K8s internals. The idea is to have something
similar but for any exporter.

Any developer can integrate it in their exporter (as shown here
https://www.brancz.com/2018/02/27/using-kube-rbac-proxy-to-secure-kubernetes-workloads),
but someone pointed out on Mastodon that we could also integrate it in
the exporter toolkit so it's even easier.



What would actually be needed in the toolkit though? Is it just some 
docs explaining how to deploy the sidecar with the exporter, or actual 
code changes?


--
Stuart Clark


Re: [prometheus-developers] What if Prometheus to Scape Anything from Anywhere with embedded Zero Trust?

2022-06-14 Thread Stuart Clark


On 14/06/2022 11:58, Bjoern Rabenstein wrote:

On 10.06.22 17:48, Rudford Hamon wrote:

Yes :) What would be the best approach to see adoption and letting the
community collectively know/try?




Pitching a commercial product there is frowned upon, but as long as
you are sticking to an OSS project like OpenZiti, and your posts stay
relevant and to the point, I would assume it's OK to spread the news
via those channels.


The only thing I'd say is to ensure you have the right expectations. 
There might be some people on the list who are interested, but I'd 
expect the vast majority probably don't have the time/interest/need for 
such a solution. So you might have a few people asking for a bit more 
information, but I wouldn't expect much to happen after your posting.


--
Stuart Clark


Re: [prometheus-developers] [cloudwatch exporter] API usage optimization

2022-04-19 Thread Stuart Clark


On 2022-04-19 16:14, Or Shachar wrote:

Hi,

Our company recently hit this issue [1] - with `/metrics` endpoint
taking > 20s to load.
I started working on a solution and I saw that big contributions
should start with a short discussion here.
Pardon for starting the work in advance.

I'd really appreciate a short review of the suggested solution.
https://github.com/prometheus/cloudwatch_exporter/pull/414

Also - I wonder if we should migrate to
https://github.com/nerdswords/yet-another-cloudwatch-exporter even
though it's not under the official Prometheus organization.

Thanks in advance!

keywords:
- getMetricStatistics
- getMetricData
- cloudwatch



So are you intending on trying to move to using the bulk multi-request 
API call where possible to reduce the number of calls & make things 
quicker/cheaper?


--
Stuart Clark


Re: [prometheus-developers] Next release schedule

2022-04-18 Thread Stuart Clark


On 12/04/2022 20:50, Sergey Leminov wrote:


Hello
The latest release was made more than a year ago. Just wonder if there 
are plans on releasing next version soon?

The latest release of what?

--
Stuart Clark


Re: [prometheus-developers] Is this alerting architecture crazy?

2021-11-21 Thread Stuart Clark


On 20/11/2021 23:42, Tony Di Nucci wrote:

Yes, the diagram is a bit of a simplification but not hugely.

There may be multiple instances of AlertRouter however they will share 
a database.  Most likely things will be kept simple (at least 
initially) where each instance holds no state of its own.  Each active 
alert in the DB will be uniquely identified by the alert fingerprint 
(which the AlertManager API provides, i.e. a hash of the alert groups 
labels).  Each non-active alert will have a composite key (where one 
element is the alert group fingerprint).


In this architecture I see AlertManager having the responsibilities of 
capturing, grouping, inhibiting and silencing alerts.  The AlertRouter 
will have the responsibilities of; enriching alerts, routing based on 
business rules, monitoring/guaranteeing delivery and enabling analysis 
of alert history.


Due to my requirements, I think I need something like the 
AlertRouter.  The question is really, am I better to push from 
AlertManager to AlertRouter, or to have AlertRouter pull from 
AlertManager.  My current opinion is that pulling comes with more 
benefits but since I've not seen anyone else doing this I'm concerned 
there could be good reasons (I'm not aware of) for not doing this.


If you really must have another system connected to Alertmanager having 
it respond to webhook notifications would be the much simpler option. 
You'd still need to run multiple copies of your application behind a load 
balancer (and have a clustered database) for HA, but at least you'd not 
have the complexity of each instance having to discover all the 
Alertmanager instances, query them and then deduplicate amongst the 
different instances (again something that Alertmanager does itself already).


I'm still struggling to see why you need an extra system at all - it 
feels very much like you'd be increasing complexity significantly which 
naturally decreases reliability (more bits to break, have bugs or act in 
unexpected ways) and slows things down (as there is another "hop" for an 
alert to pass through). All of the things you mention can be done 
already through Alertmanager, or could be done pretty simply with a 
webhook receiver (without the need for any additional state storage, etc.)


* Adding data to an alert could be done with a simple webhook receiver, 
that accepts an alert and then forwards it on to another API with extra 
information added (no need for any state)
* Routing can be done within Alertmanager, or for more complex cases 
could again be handled by a stateless webhook receiver
* With regards to "guaranteeing" delivery I don't see your suggestion 
allowing that (I believe it would actually make that less likely overall 
due to the added complexity and likelihood of bugs/unhandled cases). 
Alertmanager already does a good job of retrying on errors (and updating 
metrics if that happens) but not much can be done if the final system is 
totally down for long periods of time (and for many systems if that 
happens old alerts aren't very useful once it is back, as they may have 
already resolved).
* Alertmanager and Prometheus already expose a number of useful metrics 
(make sure your Prometheus is scraping itself & all the connected 
Alertmanagers) which should give you lots of useful information about 
alert history (with the advantage of that data being with the monitoring 
system you already know [with whatever you have connected like 
dashboards, alerts, etc.])
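
As a sketch of the stateless enrichment idea from the first bullet, the
core of such a webhook receiver can be a pure function over the JSON body
that Alertmanager posts. The "alerts", "labels" and "annotations" fields
follow the Alertmanager webhook payload; the OWNERS table, the "service"
label and the "owner" annotation are hypothetical, and the HTTP handling
and forwarding to the next API are left out.

```python
import copy
import json

# Hypothetical lookup table: which team owns which service.
OWNERS = {"checkout": "payments-team", "search": "discovery-team"}


def enrich(payload, owners=OWNERS):
    """Return a copy of an Alertmanager webhook payload with an owner annotation."""
    enriched = copy.deepcopy(payload)  # leave the incoming payload untouched
    for alert in enriched.get("alerts", []):
        service = alert.get("labels", {}).get("service")
        annotations = alert.setdefault("annotations", {})
        annotations["owner"] = owners.get(service, "unknown")
    return enriched


incoming = {
    "status": "firing",
    "alerts": [
        {"status": "firing",
         "labels": {"alertname": "HighLatency", "service": "checkout"},
         "annotations": {"summary": "p99 latency is high"}},
    ],
}
print(json.dumps(enrich(incoming), indent=2))
```

Because the function keeps nothing between calls, any number of copies can
sit behind a load balancer without a shared database.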


--
Stuart Clark


Re: [prometheus-developers] Is this alerting architecture crazy?

2021-11-20 Thread Stuart Clark

It sounds like you are planning on creating a fairly complex system that 
duplicates a reasonable amount of what Alertmanager already does. I'm presuming 
your diagram is a simplification and that the application is itself a cluster, 
so each instance would be querying each instance of Alertmanager? Would your 
storage be part of the clustering system (similar to Alertmanager) or another 
cluster of something like a relational database? 

On 20 November 2021 11:28:30 GMT, Tony Di Nucci  wrote:
>There are other things I need to do as well, alert enrichment, complex 
>routing, etc.  which means that I think some additional system is needed 
>between AlertManager and the final destination in any case.
>
>The main question in my mind is really; are there reasons why I should 
>prefer to have AlertManager push to this new system over having this new 
>system pull?  
>
>My reasons for preferring a pull based architecture are:
>* Just by looking at the AlertRouter we can get a reasonable understanding 
>of overall health.  If alerts are pushed to the router then it alone can't 
>tell the difference between no alerts firing and it not receiving alerts 
>that have fired.
>* Backpressure is a natural property of the system.
>
>With this extra context, what do you think?
>
>On Saturday, November 20, 2021 at 11:08:58 AM UTC Tony Di Nucci wrote:
>
>> Thanks for the feedback.
>>
>> > What gives you the impression that the Alertmanager is "best effort"?
>> Sorry, best-effort probably wasn't the right term to use.  I am aware of 
>> there being retries however these could still all fail and I'm thinking I 
>> wouldn't be made aware of the issue for potentially quite a long time.
>>
>> My understanding is that an 
>> *alertmanager_notification_requests_failed_total* counter will be 
>> incremented each time there is a failed send attempt however from this 
>> alone I can't tell the difference between a single alert that's 
>> consistently failing and a small number of alerts which are all failing.  I 
>> think this means that I've got to wait until 
>> *alertmanager_notifications_failed_total* 
>> is incremented before considering an alert to have failed (and this can 
>> take many minutes) and then a bit of exploration is needed to figure out 
>> which alert(s) failed.  Depending on the criticality of the alert it may be 
>> fine for it to take some minutes before we're made aware of a delivery 
>> problem, in other cases though it won't be.
>>
>> A couple of things I didn't really touch on originally which will also 
>> help explain where my head is:
>> * I have a requirement to be able to measure accurate latency per alert 
>> through the alerting pipeline, i.e. for each alert I need to know the 
>> amount of time it was known to AlertManager before it was successfully 
>> written to the destination.
>> * I have a requirement to be able to analyse historic alerts.
>>
>>
>>
>> On Saturday, November 20, 2021 at 10:33:12 AM UTC sup...@gmail.com wrote:
>>
>>> Also, the alertmanager does have an "event store", it's a shared state 
>>> between all instances.
>>>
>>> If you're interested in changing some of the behavior of the retry 
>>> mechanisms or how this works, feel free to open specific issues. You don't 
>>> need to build an entirely new system, we can add new features to the 
>>> existing Alertmanager clustering framework.
>>>
>>> On Sat, Nov 20, 2021 at 11:29 AM Ben Kochie  wrote:
>>>
>>>> What gives you the impression that the Alertmanager is "best effort"?
>>>>
>>>> The alertmanager provides a reasonably robust HA solution (gossip
>>>> clustering). The only thing best-effort here is actually deduplication.
>>>> The Alertmanager design is "at least once" delivery, so it's robust against
>>>> network split-brain issues. So in the event of a failure, you may get
>>>> duplicate alerts, not none.
>>>>
>>>> When it comes to delivery, the Alertmanager does have retries. If a
>>>> connection to PagerDuty or other receivers has an issue, it will retry.
>>>> There are also metrics for this, so you can alert on failures to alternate
>>>> channels.
>>>>
>>>> What you likely need is a heartbeat setup. Because services like
>>>> PagerDuty and Slack do have outages, you can't guarantee delivery if
>>>> they're down.
>>>>
>>>> The method here is to have an end-to-end "always firing heartbeat"
>>>> alert, which goes to a system/service like healthchecks.io or
>>>> deadmanssnitch.com. These will trigger an alert in the absence of your
>>>> heartbeat, letting you know that some part of the pipeline has failed.
>>>>
>>>> On Sat, Nov 20, 2021 at 11:02 AM Tony Di Nucci wrote:
>>>>
>>>>> Cross-posted from
>>>>> https://discuss.prometheus.io/t/is-this-alerting-architecture-crazy/610
>>>>>
>>>>> In relation to alerting, I’m looking for a way to get strong alert
>>>>> delivery guarantees (and if delivery is not possible I want to know about
>>>>> it quickly).
>>>>>
>>>>> Unless I’m mistaken AlertManager only

Re: [prometheus-developers] Is it possible that timeseries can get replaced with a new scrapped one

2021-09-15 Thread Stuart Clark


On 2021-09-15 13:09, TECHAX wrote:

Thank you so much. Can you please tell me, is it possible to change
this 5 minutes to a shorter time?



It is possible but that isn't generally advised. Why are you wanting to 
adjust that value?


--
Stuart Clark


Re: [prometheus-developers] Is it possible that timeseries can get replaced with a new scrapped one

2021-09-15 Thread Stuart Clark


On 2021-09-15 12:38, Prince wrote:

Hi everyone,
I am new to Prometheus, can anyone let me know:
Is it possible to replace the existing time series with newly scraped
data?
For example:
at the metric endpoint:
metric_example{name="abc"} 1
metric_example{name="xyz"} 2

So for this the Prometheus server will have two timeseries:
metric_example{instance="192.168.47.53",job="example",name="abc"} 1
metric_example{instance="192.168.47.53",job="example",name="xyz"} 2

IS IT POSSIBLE THAT ON PROMETHEUS THE FIRST TIMESERIES GETS REPLACED
WITH THE SECOND ONE, WHEN THE SECOND ONE GETS SCRAPED FROM THE METRIC
ENDPOINT?



Each combination of labels (including the metric name) is stored as a 
separate time series. So in your example there are two time series, as 
the "name" label is different. You are free to stop (or start) sending a 
particular label combination (or even a whole metric). After 5 minutes a 
time series that is no longer being presented during the scrapes will be 
marked as "stale" and would stop appearing on graphs, alerts, etc. 
(unless you do a query which covers a period before it goes stale).
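
To illustrate the point about label combinations, here is a toy model in
Python where the identity of a series is its complete label set. It is
only a sketch; series_key, store and ingest are made-up names and this is
not how Prometheus is implemented internally.

```python
def series_key(labels):
    """Identity of a series: its complete label set, metric name included."""
    return frozenset(labels.items())


store = {}  # series key -> list of values seen so far


def ingest(labels, value):
    # A new label set creates a new series; a known one just gets another sample.
    store.setdefault(series_key(labels), []).append(value)


base = {"__name__": "metric_example", "instance": "192.168.47.53", "job": "example"}
ingest({**base, "name": "abc"}, 1.0)
ingest({**base, "name": "xyz"}, 2.0)  # different "name", so a second series
ingest({**base, "name": "abc"}, 5.0)  # same label set, so the first series grows

print(len(store))  # 2
```

The "xyz" sample never replaces the "abc" series; they simply live side by
side under different keys.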


--
Stuart Clark


Re: [prometheus-developers] Prometheus scraping at metric-endpoint

2021-09-09 Thread Stuart Clark


On 2021-09-09 11:06, Prince wrote:

Hi everyone,
Can anyone please let me know, is it possible to collect multiple
gauge values for a single time series at the metric endpoint?
For example, in the text exposition format:

my_metric{lname="abc"} 1.5 2.5
** here metric name is: my_metric
 label name is: lname
 label value is:  abc
First gauge value: 1.5
Second gauge value: 2.5



Each metric has exactly one value, so if you are wanting to store 
multiple values that would be multiple metrics, each with a unique name.


So for example my_metric_temperature_degrees & my_metric_height_meters
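
A small sketch of what that looks like on the metrics endpoint, using a
hand-written helper rather than a client library. exposition_line is a
made-up function that renders one sample per line in the text exposition
format; for brevity it does not escape special characters in label values.

```python
def exposition_line(name, labels, value):
    """Render one sample in the text exposition format: name{labels} value."""
    label_text = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
    return f"{name}{{{label_text}}} {value}"


lines = [
    exposition_line("my_metric_temperature_degrees", {"lname": "abc"}, 1.5),
    exposition_line("my_metric_height_meters", {"lname": "abc"}, 2.5),
]
print("\n".join(lines))
```

Each line carries exactly one value, so the two measurements become two
metrics that share the same label.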

--
Stuart Clark


Re: [prometheus-developers] Adding timestamps to Gauge Metric

2021-08-31 Thread Stuart Clark


On 2021-08-30 07:19, Prince wrote:

So that means in the Prometheus graph the data will be getting
displayed from the time of scraping and at a regular intervals (scrape
interval).
Example: my_metric  1669.574 1630299163151 (data and its timestamp).
So this data 1669.574 will be displayed at the starting scrape time, not at
this 1630299163151 time.

**  1630299163151 is an older time than the starting scrape time.



In general you shouldn't set the timestamp for a metric at all. There 
are very few use cases where it should be used, with the main one being 
when connecting another scrape based metric system to Prometheus (e.g. 
CloudWatch).


For everything else you set the metric to the latest value (for a gauge) 
and it will then update Prometheus during the next scrape. If you must 
know the exact time of the last event (for example to alert if events 
stop happening) you'd have a gauge whose value is that timestamp. But in 
none of those situations would you set the metric's timestamp.
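
As a sketch of the "gauge whose value is the timestamp" pattern: the gauge
holds the Unix time of the last event, and the check for events having
stopped just compares it with the current time. The function name and the
10 minute threshold below are assumptions for illustration; in PromQL the
same comparison would typically use time() minus the gauge.

```python
import time


def events_have_stopped(last_event_timestamp, max_silence_seconds, now=None):
    """True when the newest event is older than the allowed silence."""
    if now is None:
        now = time.time()
    return (now - last_event_timestamp) > max_silence_seconds


last_event = 1_630_299_163.151  # gauge value: Unix time of the last event
print(events_have_stopped(last_event, 600, now=last_event + 900))  # True
print(events_have_stopped(last_event, 600, now=last_event + 60))   # False
```

The sample itself still gets its timestamp from the scrape; only the value
carries the event time.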


--
Stuart Clark


Re: [prometheus-developers] Adding timestamps to Gauge Metric

2021-08-29 Thread Stuart Clark

The key things you just said were "event" and "logs", both of which are not the 
metrics that Prometheus is designed for. Now it is possible to convert 
events/logs into metrics, but this sounds different to what you are wanting. 
Metrics created from logs would have regularly scraped metrics (where no 
timestamp is included) which might contain a counter of the number of events or 
the value of the last event (or maybe even a gauge containing the last 
timestamp of the event). These are then perfect for alerting when events stop 
happening, happen too often or produce values outside of allowed parameters.

If instead, as it sounds, you want to store individual data points when 
events happen (which might be at any point, not aligned with a regular scrape 
interval) then you want something different to Prometheus. You can 
use a standard SQL or NoSQL database (such as MySQL or DynamoDB) or a time 
series database (such as InfluxDB or TimescaleDB). For many of these 
options you can visualise the data using Grafana, which allows you to show data 
from both Prometheus and your event store.
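
As a rough illustration of turning events into regularly scraped metrics (a counter of events, the last value, and the last event time), here is a minimal sketch. The class and the metric names my_events_total, my_event_last_value and my_event_last_timestamp_seconds are hypothetical, and the text rendering stands in for what a client library would normally do.

```python
class EventMetrics:
    """Aggregates individual events into scrape-friendly metrics."""

    def __init__(self):
        self.count = 0            # counter: total events seen
        self.last_value = None    # gauge: value of the most recent event
        self.last_time = None     # gauge: Unix time of the most recent event

    def observe(self, value, event_time):
        """Record one event read from the log."""
        self.count += 1
        self.last_value = value
        self.last_time = event_time

    def render(self):
        """Return text exposition lines; no per-sample timestamps are set."""
        lines = ["my_events_total %d" % self.count]
        if self.last_time is not None:
            lines.append("my_event_last_value %s" % self.last_value)
            lines.append("my_event_last_timestamp_seconds %s" % self.last_time)
        return "\n".join(lines) + "\n"
```

Individual events are lost between scrapes in this model, which is exactly the trade-off described above: it suits alerting on rates and staleness, not replaying every data point.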

On 29 August 2021 14:08:02 BST, Prince  wrote:
>So let's suppose we are monitoring an event in such a way that the logs of 
>this event are in a file, with a value for that event and the 
>corresponding time at which we got that value.
>So in the Prometheus graph we cannot show the value at that time? 
>I ask because I have a similar situation where I have a value and a timestamp for 
>that value. But when the Prometheus server starts running it shows the 
>value at the current time, not at its timestamp.
>On Sunday, August 29, 2021 at 5:42:32 PM UTC+5:30 sup...@gmail.com wrote:
>
>> Yup, Prometheus is a monitoring system, not a general use time-series 
>> database.
>>
>> Specifically, Prometheus has a "look back window" where Prometheus will 
>> take the timestamp of your query and look back in time for samples to match.
>>
>> This is fundamental to how PromQL works.
>>
>> So no, what you are asking for is not possible. PromQL graph queries base 
>> the X-axis on the input parameters of the range query. The start, end, and 
>> step.
>>
>> See the documentation on the API: 
>> https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries
>>
>> On Sun, Aug 29, 2021 at 1:09 PM Stuart Clark  
>> wrote:
>>
>>> That very much depends on whatever tool you are using to display graphs.
>>>
>>> However it is sounding like Prometheus may not be the right system 
>>> depending on what you are trying to do. Prometheus is a metric system, 
>>> which works by sampling the current state of a system at regular time 
>>> periods, meaning the exact timestamp doesn't generally matter.
>>>
>>> It sounds like you are instead wanting to record events - things that 
>>> happen at a specific period of time, not at a regular frequency. For that 
>>> use case you should look at an event store - something like Elasticsearch, 
>>> InfluxDB or a standard relational or no-SQL database. 
>>>
>>> On 29 August 2021 11:15:44 BST, Prince  wrote:
>>>>
>>>> Is it possible to give a custom timestamp in Prometheus for the X-axis?
>>>> <https://stackoverflow.com/questions/68095611/is-it-possible-to-give-custom-timestamp-in-prometheus-for-x-axis>
>>>> Example: I am getting my_metric_name 152.401 163013412 at the 
>>>> metric endpoint, but in the Prometheus graph I am getting the value 152.401 
>>>> when it is scraped. I want it to be displayed at 163013412 (Saturday, 
>>>> August 28, 2021, 7:00:00.012 AM) in the Prometheus graph.
>>>>
>>>> Is it possible? if yes, can you please let me know how?
>>>> Thank you.
>>>>
>>>> On Sunday, August 29, 2021 at 11:06:27 AM UTC+5:30 Prince wrote:
>>>>
>>>>> Thank you, understood. I have used the following:
>>>>>
>>>>> in Collect()
>>>>>     t := time.Date(2021, time.August, 28, 07, 0, 0, 12345678, time.UTC)
>>>>>     s := prometheus.MustNewConstMetric(c.metric, prometheus.GaugeValue, float64(s.value))
>>>>>     ch <- prometheus.NewMetricWithTimestamp(t, s)
>>>>>
>>>>> I am getting my_metric_name 152.401 163013412 (both things, 
>>>>> the value and timestamp), but I am not getting this timestamp in the 
>>>>> x-axis of prometheus graph. Can You please

Re: [prometheus-developers] Adding timestamps to Gauge Metric

2021-08-29 Thread Stuart Clark

That very much depends on whatever tool you are using to display graphs.

However it is sounding like Prometheus may not be the right system depending on 
what you are trying to do. Prometheus is a metric system, which works by 
sampling the current state of a system at regular time periods, meaning the 
exact timestamp doesn't generally matter.

It sounds like you are instead wanting to record events - things that happen at 
a specific period of time, not at a regular frequency. For that use case you 
should look at an event store - something like Elasticsearch, InfluxDB or a 
standard relational or no-SQL database. 

On 29 August 2021 11:15:44 BST, Prince  wrote:
>Is it possible to give a custom timestamp in Prometheus for the X-axis?
>Example: I am getting my_metric_name 152.401 163013412 at the metric 
>endpoint, but in the Prometheus graph I am getting the value 152.401 when it 
>is scraped. I want it to be displayed at 163013412 (Saturday, 
>August 28, 2021, 7:00:00.012 AM) in the Prometheus graph.
>
>Is it possible? if yes, can you please let me know how?
>Thank you.
>
>On Sunday, August 29, 2021 at 11:06:27 AM UTC+5:30 Prince wrote:
>
>> Thank you, understood. I have used the following:
>>
>> in Collect()
>>     t := time.Date(2021, time.August, 28, 07, 0, 0, 12345678, time.UTC)
>>     s := prometheus.MustNewConstMetric(c.metric, prometheus.GaugeValue, float64(s.value))
>>     ch <- prometheus.NewMetricWithTimestamp(t, s)
>>
>> I am getting my_metric_name 152.401 163013412 (both things, 
>> the value and timestamp), but I am not getting this timestamp on the x-axis 
>> of the prometheus graph. Can you please let me know how I can get that 
>> timestamp on the x-axis of the prometheus graph?
>>
>> On Wednesday, August 25, 2021 at 12:49:26 PM UTC+5:30 
>> juliu...@promlabs.com wrote:
>>
>>> So NewMetricWithTimestamp() returns a Metric interface object that you 
>>> can then emit from a Collector's Collect() method. See this example from 
>>> cadvisor: 
>>> https://github.com/google/cadvisor/blob/19df107fd64fa31efc90e186af91b97f38d205e9/metrics/prometheus.go#L1931-L1934
>>>
>>> You can see more usage example here: 
>>> https://sourcegraph.com/search?q=context:global+prometheus.NewMetricWithTimestamp=literal
>>>
>>> In general, it seems like you are building an exporter (a process 
>>> that proxies/translates existing values into the Prometheus format, in your 
>>> case those existing values are coming from a file), so you are not 
>>> instrumenting the exporting process itself, and thus you probably don't 
>>> want to use the "NewGauge()" / "mygauge.WithLabelValues().Set()" functions 
>>> that are for direct instrumentation of a process. Instead, you'll want to 
>>> implement a Collector interface that just returns a set of proxied metrics, 
>>> as outlined here:
>>>
>>> * 
>>> https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#hdr-Custom_Collectors_and_constant_Metrics
>>> * https://prometheus.io/docs/instrumenting/writing_exporters/#collectors
>>>
>>> On Tue, Aug 24, 2021 at 7:43 PM Prince  wrote:
>>>
>>>> Thank you. As I understand it, NewMetricWithTimestamp() takes two 
>>>> parameters: one is the time and the other is the metric. As my metric name 
>>>> is go_duration, I did this:
>>>>
>>>> 1st way:
>>>> go func(){
>>>>     go_duration.WithLabelValues("type").Set(12345.678)
>>>>     prometheus.NewMetricWithTimestamp(time_var, go_duration)
>>>> }()
>>>>
>>>> 2nd way:
>>>> go func(){
>>>>     prometheus.NewMetricWithTimestamp(time_var, go_duration.WithLabelValues("type").Set(12345.678))
>>>> }()
>>>>
>>>> Using the 1st way I am not getting the timestamp; only values are getting scraped.
>>>> Using the 2nd way I get the error: 
>>>> "go_duration.WithLabelValues("type").Set(12345.678) used as a value"
>>>>
>>>> On Tuesday, August 24, 2021 at 10:15:57 PM UTC+5:30 
>>>> juliu...@promlabs.com wrote:

>>>>> Hi,
>>>>>
>>>>> You should be able to use the NewMetricWithTimestamp() function for 
>>>>> this: 
>>>>> https://pkg.go.dev/github.com/prometheus/client_golang/prometheus?utm_source=godoc#NewMetricWithTimestamp
>>>>>
>>>>> Note that client-side timestamps should only be used in exceptional 
>>>>> circumstances, and if you still expect those timestamps to be regularly 
>>>>> updated (because otherwise Prometheus will just collect a dot here and 
>>>>> there and mostly show empty graphs). If that is not the case, consider 
>>>>> omitting the client-side timestamp and instead sending a metric that 
>>>>> includes the last-update timestamp in its sample value (like the 
>>>>> node_exporter does for the mtime metric in its "textfile" collector 
>>>>> module: 
>>>>> https://github.com/prometheus/node_exporter/blob/b6215e649cdfc0398ca98df8e63f3773f1725840/collector/textfile.go#L38
>>>>> )
>>>>>
>>>>> Regards,
>>>>> Julius
>>>>>
>>>>> On Mon, Aug

Re: [prometheus-developers] "dead" metrics

2021-06-06 Thread Stuart Clark


On 04/06/2021 13:09, 'Christian' via Prometheus Developers wrote:

Hi @all,

don't know if this is the right place to ask.

How does Prometheus in general deal with "dead" metrics?
By "dead" I mean the node exporter will deliver the same value over 
and over again, because the logic behind it is broken for some reason.

From a developer perspective this would be easy to deal with by 
putting a timestamp inside the metric to check if we are looking on 
old values and then raise an alert.


I guess that's a complete newbie question and there is something in 
place already?

I think it really depends on what you mean by "dead metrics" from the 
node exporter. Do you mean custom metrics you are adding via the 
textfile collector? If so, you also get a timestamp metric for when the 
file last changed, which you could alert against (if not changed within 
a certain time period). If you mean something different, you could 
alert using the changes() function: alert if the number of changes over 
the past time period was 0.
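
To make the changes-based idea concrete, here is a small sketch of the same logic applied to a plain list of sample values from one time window. It only imitates the semantics for illustration; in Prometheus itself this would be an alert expression of the form changes(some_metric[15m]) == 0, where the metric name and the 15 minute window are placeholders.

```python
def changes(samples):
    """Count how many times the value differs from the previous sample."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if cur != prev)


def looks_dead(samples):
    """True when there is data in the window but the value never moved."""
    return len(samples) > 0 and changes(samples) == 0
```

Note that a legitimately constant metric would also trip this check, so it only suits values that are expected to move within the chosen window.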


--
Stuart Clark

To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/5347da7e-6097-3242-506c-6998a9b653a7%40Jahingo.com.

Re: [prometheus-developers] Servers names

2021-03-22 Thread Stuart Clark


On 2021-03-22 14:58, Tuna Bozkır wrote:

Hello, I want to take all servers name and I tried:
up{job="Server_All"}
And output is :

up{Active="true",DN="CN=SvTSTD12,OU=Test,OU=Xxx,OU=Comp,DC=e..,DC=b..",Domain="Est",Site="",instance="svss",job="Server_All"}
:1

But I want to get just one of the labels, not the value (CN: SvTS...).
How can I do that ? Thank you for helping.



What tool are you using to extract this information? If Grafana then you 
can choose which labels to display.


Alternatively if you are just wanting a listing of all unique DN labels 
you could do something like "sum by (DN) (up)"
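
What an aggregation like "sum by (DN) (up)" does can be sketched as follows: every series is reduced to the one label named in the by clause, and series that share that label value are summed together. The input shape (a list of label-dict and value pairs) is an assumption made for this illustration, not a Prometheus API.

```python
from collections import defaultdict


def sum_by(series, label):
    """Group (labels, value) pairs by one label and sum the values."""
    totals = defaultdict(float)
    for labels, value in series:
        # Series without the label are grouped under the empty string.
        totals[labels.get(label, "")] += value
    return dict(totals)
```

The keys of the result are the unique values of the chosen label, which is the listing asked for; the summed values themselves can be ignored.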


--
Stuart Clark

To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/a82a6e64b52a5cc8571b920cf4912fd4%40Jahingo.com.

Re: [prometheus-developers] Remote-write drop samples | design doc

2021-03-22 Thread Stuart Clark

On 2021-03-22 11:47, Harkishen Singh wrote:

Thank you everyone for the suggestions!

I agree with the age-based solutions, but such a solution is useful to
particularly those systems that have a limitation on time. Many don't
have that. But seeing the scenario, can we have both, so if users have
a remote-storage system that respects time, then they can use the
time-based dropping logic. If the user has a remote-storage that can
accept a sample with any timestamp (past or future), he can use the
retries count method. This will avoid recurring errors, like the null
byte.

We can have something like LIMITRETRYPOLICY as TIME or RETRIES. If it's
TIME, we choose the max time (taken as input). If the policy is
RETRIES, then a count would be the input for the maximum retries. That
way, we solve both the problems and leave it up to the user to
consider it, based on the storage system he is using.

Does that look good to go, or we do just the age-based way?

The time based limit isn't just about handling remote write receivers that
can only ingest samples up to a certain age, but also about encapsulating
policy about what still matters.

Even if my receiver can ingest metrics from any time it is quite
possible that I don't care about data older than a certain period. For
example I might be doing something ML related that can be used for
autoremediation, so I want all the data but after 30 minutes it becomes
irrelevant. So even though it might accept older data I can set the
limit to 30 mins so Prometheus just drops it instead of trying to resend
(possibly unblocking more recent data in the process).
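
The age based policy described here can be sketched as a simple filter over queued samples. This is only an illustration of the idea under discussion, not how Prometheus implements remote write; the function name and the (timestamp, value) tuple shape are assumptions.

```python
def drop_too_old(samples, now, max_age_seconds):
    """Split queued (timestamp, value) samples into (keep, dropped) by age."""
    keep, dropped = [], []
    for ts, value in samples:
        # A sample exactly max_age_seconds old is still kept.
        (keep if now - ts <= max_age_seconds else dropped).append((ts, value))
    return keep, dropped
```

With a 30 minute limit (1800 seconds), anything older is discarded before a resend is attempted, so recent samples are not stuck behind data nobody wants any more.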

--
Stuart Clark

Re: [prometheus-developers] Creating a time series from data in a log file

2021-03-09 Thread Stuart Clark


On 2021-03-09 09:28, Andrew Fielden wrote:

Thanks for the suggestion, mtail looks useful. However the log files
are only produced when the system being monitored is sent a SIGINT
signal. I wonder if mtail can handle that?



Is it that the logs are only produced when you send SIGINT for a 
"current snapshot" of the status of the application, or is it that logs 
are emitted for a period of time?


If the second, Prometheus doesn't have the ability to ingest chunks of 
data over periods of time (just the "current" data). There is some 
experimental work ongoing for one-time backfilling of data, but that 
equally probably wouldn't be suitable for regular ingestion.


--
Stuart Clark

To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/b9defe3fb9c8d67be37ca6d6eced0fcd%40Jahingo.com.

Re: [prometheus-developers] Remote-write drop samples | design doc

2021-03-01 Thread Stuart Clark

On 01/03/2021 07:25, Harkishen Singh wrote:

Hi Tom,

I have tried to answer the comments. Please comment on their
satisfactoriness. I am happy for a call if required (or discussion
gets tough).

I think, the lossless nature can be controlled by the user based on
the config (limit_retries), and let the users have more control, as to
whether they are happy to compromise a bit, if the retry is too much,
since as such, if the retrying happens forever, then I don't think
that is helpful (it will never be accepted by the remote storage).
Also as Chris mentioned, some users might prefer to have few gaps and
give more priority to recent data, like for alerting. So, I think this
approach gives more flexibility to the user, at the same time, making
it optional (or by setting the retry count high enough).

Under what situations would retries happen forever?

If the receiver is available but cannot accept the data (for example due
to metric size limits or age of the samples) I would expect it to reject
with a 4XX code (permanent failure) which wouldn't trigger any retries.

Alternatively if the receiver is either unavailable or broken it could
result in "infinite" retries, but in that situation an age based limit
feels better than a retry limit - a short retry limit
will reject samples that have just been scraped just as quickly as
samples that are days old. Some systems have restrictions on what age can
be ingested (e.g. Timestream), or administrators could decide older data has
no usefulness (e.g. if the receiver is used for alerting or anomaly
detection). While the system should still reject such old samples once it
is working again, a time based limit would at least reduce the network
impact once the receiver is back online (no need to send tons of data
that we know will be rejected).
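
The distinction drawn above between a receiver that rejects data and one that is unavailable can be sketched as a small decision function. It follows the reasoning in this message only; real remote write implementations may treat individual status codes (429, for instance) differently, and the function name is made up for the example.

```python
def should_retry(status_code):
    """Decide whether a failed send is worth retrying.

    status_code is the HTTP status of the response, or None when the
    receiver could not be reached at all.
    """
    if status_code is None:
        return True                      # unreachable: may recover later
    if 400 <= status_code < 500:
        return False                     # rejected: resending will not help
    return status_code >= 500            # server-side failure: retry
```

Only the "retry" branch can loop indefinitely, which is where an age limit on the queued samples would come in.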

--
Stuart Clark

To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-developers/cd97f615-e479-e4be-e85d-672b15c337d8%40Jahingo.com.

Re: [prometheus-developers] Should we still announce the dev summit notes?

2021-02-25 Thread Stuart Clark


On 25/02/2021 19:10, Richard Hartmann wrote:

Dear all,

do you find it useful that we announce the dev summit notes now that
we have them on the public calendar and YouTube?

For the current one, please see
https://docs.google.com/document/d/11LC3wJcVk00l8w5P3oLQ-m3Y37iom6INAMEu2ZAGIIE

Yes I think it is useful, especially as I couldn't stay for the final 
part (so your email was a quick reminder to read over what I missed) :-)


--
Stuart Clark

To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/326e2d61-df19-3f6d-72c2-626b2dd4bb57%40Jahingo.com.

Re: [prometheus-developers] Re: auto discover of targets in Prometheus

2021-02-21 Thread Stuart Clark

On 20/02/2021 15:08, Gajendra D Ambi wrote:

Hi Team,
Yes, we do have an IPAM available. When we run the prometheus node exporter
agent it starts showing metrics at ip/metrics. I was hoping for a
way to give a range of IP addresses to prometheus, rather than adding
each of them as a static target. What would be the best way for us then,
with 100s of baremetal servers? Also, these baremetals are managed by
openstack. I do see openstack auto discovery, but will it also discover
bare metal servers managed by openstack or just the VM instances?
Prometheus wasn't explicit on that part in the documentation.
If you could clarify here or in the documentation then it would be awesome.

The OpenStack SD can do both -
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config
- it supports both the hypervisor and instance roles for Nova servers.

You can use the File SD
(https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config)
in conjunction with your IPAM to manage scrape configs too. A common
method is to run a regular scheduled job to interrogate the IPAM for
new, changed or removed devices and then produce JSON or YAML files for
your jobs - depending on the level of information you could produce
lists for the SNMP exporter (e.g. network equipment), node exporter
(Unix servers), MySQL exporter (and other databases), IPMI exporter, etc.
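
A scheduled job of the kind described could look roughly like this. The IPAM record shape (dicts with "ip" and "role" keys) is entirely hypothetical, as is the choice of a "role" label; the output follows the file SD JSON layout of a list of objects with "targets" and "labels". Port 9100 is used as the default because it is the node exporter's usual port.

```python
import json


def build_file_sd(devices, port):
    """Group IPAM records by role into file_sd target groups."""
    groups = {}
    for dev in devices:
        groups.setdefault(dev["role"], []).append("%s:%d" % (dev["ip"], port))
    return [
        {"targets": sorted(targets), "labels": {"role": role}}
        for role, targets in sorted(groups.items())
    ]


def write_file_sd(path, devices, port=9100):
    """Write the target groups as JSON for Prometheus to pick up."""
    with open(path, "w") as fh:
        json.dump(build_file_sd(devices, port), fh, indent=2)
```

In practice it is worth writing to a temporary file and renaming it into place, so Prometheus never reads a half-written list.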

--
Stuart Clark

To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-developers/91d867e1-00db-1fdc-bf08-984ef9970d2c%40Jahingo.com.

Re: [prometheus-developers] Re: auto discover of targets in Prometheus

2021-02-19 Thread Stuart Clark

On 19/02/2021 22:19, Gajendra Ambi wrote:

Hi,
I too just discovered that it does not do dynamic discovery. I have to
tell prometheus to see it by using file based discovery in prometheus.
This is very unfortunate where we have 100s of nodes and growing and
each group of devs who own them get to add them when they want.
Currently it seems We have to do it manually. Thanks for the quick info.

Prometheus needs some way of obtaining a list of the targets to scrape
for each job.

For cloud based instances that is fairly easy - Prometheus can ask them
to list all instances and using features like tags you can decide what
is scraped.

For virtual machines you could do something similar. For physical
machines it really depends on how you have things set up. Do you have an
IPAM/asset management system? Do you run a tool like Consul?

If you ignore Prometheus for a second: If you were asked to produce a
list of all the servers with details about what is running on them (so
you knew what metrics to fetch) how would you do that?

Whatever the answer, that's what you should aim to have Prometheus use.
If the answer is to produce a manual list, then unfortunately that's
what you need to do - Prometheus has to look somewhere for the list.

--
Stuart Clark

To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-developers/9bc46dfc-25a0-39b5-4d50-55b4c9a252bd%40Jahingo.com.

Re: [prometheus-developers] Difference in scope between smokeping_prober and blackbox_exporter?

2021-02-17 Thread Stuart Clark

On 17/02/2021 23:01, 'Marcelo Magallón' via Prometheus Developers wrote:

Hi,

I'm trying to understand the difference in scope between
blackbox_exporter and smokeping_prober.

I was thinking of extending blackbox_exporter with
functionality similar to smokeping_prober.

In the context of BBE, the existing ping prober sends exactly one ICMP
packet per probe and the interval is controlled by the prometheus
scrape interval.

With smokeping_prober it sends ICMP packets at regular intervals and
it builds a histogram to be collected by prometheus at the scrape
interval.

For BBE, what I was thinking is allowing the user to specify a number
of ICMP packets to be sent (and an interval) so that it can present
min / max / avg / dev / loss metrics. The number and the interval
would have to be very restricted to avoid very long scrape times.

The reason for this is that I don't need the continuous pinging
functionality provided by smokeping_prober (very short interval over a
long period of time) but I also can't do with just the current BBE
functionality (relatively long interval over a long period of time).
What I'm looking for is a small number of repetitions spaced at a
comparatively long interval so that I can derive a more representative
packet loss metric (1 packet lost out of 5 over a 10 minute interval
is not the same as 1 packet loss out of 5 over 5 seconds).

Thoughts?

Is there any reason you can't use the Blackbox Exporter as it currently
is, just decreasing the scrape interval? Prometheus can scrape as
infrequently as every 2 minutes or as frequently as several times a second.

--
Stuart Clark

To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-developers/0774c999-31f9-4b2c-b202-6554d561a29e%40Jahingo.com.

Re: [prometheus-developers] Encrypted gmail password in place of plain text password in alertmanager.yml

2021-02-11 Thread Stuart Clark

On 2021-02-11 06:06, Harsh Kumar Palsania wrote:

Hi all,

Is there a way we can supply an encrypted gmail password in
place of the plain text password in alertmanager.yml of alertmanager?

For pretty much all password authentication systems the password is
stored on the server as a hash. As a result it is required to send a
plain text version of the password from the client to allow it to be
matched (as you can convert plain text to hash but not the other way).
The only real exception is digest authentication, which is basically
never used (as it needs clear text passwords on the server which is a
huge security issue).

As a result Alertmanager needs to have the plain text password available
to send to the server.

You can use disk level encryption or store your password in an encrypted
secret store before it is deployed to the server/pod, but when it is
actually read by Alertmanager it has to be plain text.

As with all secrets in config files or environment variables you would
protect them using the permission system for wherever you are running
Alertmanager (e.g. run Alertmanager as a specific user and prevent other
users from reading the config file) or any other security features of
that system (e.g. using Secrets instead of ConfigMaps within
Kubernetes).
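
As a small illustration of the file permission point, the check below reports whether a config file can be read by anyone other than its owner. It is a sketch for POSIX systems only; the function name is made up, and the path passed in would be wherever alertmanager.yml lives in a given deployment.

```python
import os
import stat


def readable_by_others(path):
    """True if group or other users have read permission on the file."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))
```

A file owned by the Alertmanager user with mode 0600 passes this check, while the common default of 0644 does not.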

All standalone server systems have this requirement.

--
Stuart Clark

Re: [prometheus-developers] SNMP monitoring | PUSH Method feasibility

2021-02-03 Thread Stuart Clark


On 03/02/2021 08:59, Ben Kochie wrote:

I think the point is that commvault only implements traps, and has no 
metrics that can be polled via the snmp_exporter.


In this case, Prometheus is not really a possible monitoring solution, 
as it's not an event store.


Or you could write a custom exporter which accepts the trap events and 
then publishes some metrics based on them (number of times the trap 
happened, etc.)


I've no idea if that would be any use however!
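
For what it is worth, the counting part of such a custom exporter might be sketched like this. Receiving the traps themselves is left out entirely; the class, the trap names and the metric name traps_received_total are all hypothetical.

```python
from collections import Counter


class TrapCounter:
    """Counts received trap events per (hypothetical) trap name."""

    def __init__(self):
        self.counts = Counter()

    def on_trap(self, trap_name):
        """Called once for every trap the receiver hands over."""
        self.counts[trap_name] += 1

    def render(self):
        """Render one labelled counter line per trap name seen so far."""
        return "".join(
            'traps_received_total{trap="%s"} %d\n' % (name, n)
            for name, n in sorted(self.counts.items())
        )
```

Prometheus would then scrape the rendered counters and rate() could show how often each trap fires, but the individual trap payloads are not kept.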

--
Stuart Clark

To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/c3ea6c16-840b-d77d-7510-e0f8d47019c3%40Jahingo.com.

Re: [prometheus-developers] Prometheus integration with RESt API

2021-02-02 Thread Stuart Clark


On 02/02/2021 15:45, s.saurab...@gmail.com wrote:


Hi Everyone ...

Is it possible to integrate target device whose REST API is exposed 
with prometheus ??? Pls suggest options if possible.



Do you mean that you want to fetch metrics from this REST API?

If so, this is where an exporter would be used. Using whichever language 
you choose and the appropriate Prometheus client library, you can write 
an exporter that calls your API to fetch the values and returns them to 
Prometheus as metrics when it is scraped.
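
The translation step at the heart of such an exporter can be sketched as below, assuming (purely for illustration) that the device's REST API returns a flat JSON object of numeric fields. The function name and the device_ prefix are made up, and a real exporter would use a client library and serve the result over HTTP on a /metrics path.

```python
import json


def json_to_exposition(body, prefix="device_"):
    """Turn a flat JSON object of numeric fields into exposition text."""
    data = json.loads(body)
    lines = []
    for key in sorted(data):
        value = data[key]
        # Skip anything that is not a plain number (bool is an int subclass).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        lines.append("%s%s %s" % (prefix, key, float(value)))
    return "\n".join(lines) + "\n"
```

The API call would be made at scrape time, so each scrape reflects the current state of the device rather than a stale cached copy.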


--
Stuart Clark

To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/53edf054-ea50-9125-c6eb-6e96140f497f%40Jahingo.com.

Re: [prometheus-developers] Prometheus

2021-01-28 Thread Stuart Clark

On 28/01/2021 10:21, Brian Brazil wrote:

Hi all,

I first got involved in Prometheus back in 2014 when I was looking for a
monitoring system, and nothing that was out there seemed to quite cut it.
Since then I've worked across and helped expand the ecosystem. I've reviewed
and merged 2542 PRs, in addition to creating 670 PRs containing 1867 commits
myself. I've created exporters and client libraries, wrote extensive docs,
and made a multitude of improvements to Prometheus from performance to
features.

All of this takes a non-trivial amount of my time, and there's more to life
than maintaining open source projects. Accordingly I have decided to step
back and resign from prometheus-team, in order to focus my efforts more on
other things including Robust Perception.

I will of course still be part of the ecosystem, helping out fixing bugs,
answering questions, and so on. While I look forward to reducing my
workload, I know Prometheus will remain in good hands.

I'd like to echo the comments from all the people who have themselves
done so much for Prometheus. You have helped in so many ways and in my
mind are one of the faces I picture when thinking of Prometheus. You
will be missed, but I look forward to seeing you around once travel is
legal again.

--
Stuart Clark

To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-developers/7d02cf4a-85b7-0bda-0cd9-a8f87cb4c902%40Jahingo.com.

Re: [prometheus-developers] SNMP monitoring | PUSH Method feasibility

2021-01-28 Thread Stuart Clark


On 28/01/2021 13:02, s.saurab...@gmail.com wrote:


Hi Everyone ...

We have a commvault solution in our environment which we have to 
integrate with Prometheus for hardware monitoring. However, we found 
that commvault doesn't support the PULL mechanism on which prometheus works. 
Is there any alternative through which we can integrate prometheus 
with commvault? Please suggest.


The email subject mentions SNMP. That is fully supported using the SNMP 
Exporter: https://github.com/prometheus/snmp_exporter


--
Stuart Clark

To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/cca55c97-d7b6-a994-cbbf-1495a5693e0e%40Jahingo.com.

Re: [prometheus-developers] Enable auto merge

2020-12-22 Thread Stuart Clark

But it does rely on approval meaning "good to merge", rather than "only when 
someone else approves". 

On 22 December 2020 16:28:18 GMT, Julien Pivotto wrote:
>So, to clarify, this is equivalent to merge, it just waits for green CI.
>
>On Tue, 22 Dec 2020 at 17:17, Julien Pivotto wrote:
>
>> Approve and auto merge are different. Auto merge is another value of the
>> merge button, next to squash etc.
>>
>> On Tue, 22 Dec 2020 at 16:59, Bjoern Rabenstein wrote:
>>
>>> On 16.12.20 21:33, Julien Pivotto wrote:
>>> >
>>> > Can we enable the new github feature, auto-merge, in prometheus
>>> > repositories?
>>> >
>>> > It waits for everything to be green before merging.
>>>
>>> Auto-merge assumes that all tests green and one valid approval means
>>> "please merge". But I don't think that's true. I often approve a PR to
>>> express "looks good to me but others might still chime in". That could
>>> be the maintainer of the repo (or some other person specifically
>>> qualified to review the PR). Those people should have the final call.
>>>
>>> Or in other words: Having approval and merge as separate
>>> human-initiated steps models the semantics just right, IMHO.
>>>
>>> --
>>> Björn Rabenstein
>>> [PGP-ID] 0x851C3DA17D748D03
>>> [email] bjo...@rabenste.in
>>>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/CEB2EC96-27C3-4244-BB31-1D5F6EF40EDA%40Jahingo.com.

Re: [prometheus-developers] Enable auto merge

2020-12-22 Thread Stuart Clark

On 22/12/2020 15:59, Bjoern Rabenstein wrote:

On 16.12.20 21:33, Julien Pivotto wrote:

Can we enable the new github feature, auto-merge, in prometheus
repositories?

It waits for everything to be green before merging.

Auto-merge assumes that all tests green and one valid approval means
"please merge". But I don't think that's true. I often approve a PR to
express "looks good to me but others might still chime in". That could
be the maintainer of the repo (or some other person specifically
qualified to review the PR). Those people should have the final call.

Or in other words: Having approval and merge as separate
human-initiated steps models the semantics just right, IMHO.

From a technical perspective, approval of a PR by someone authorised does
mean it can be merged (from that point onwards anyone with write
permissions can click the merge button without any additional review). A
requirement for a specific set of people can be indicated via CODEOWNERS,
or for multiple people via the branch permissions settings.

However we might not be "correctly" modelling what we actually want in
the various settings, so GitHub could incorrectly merge something as
things currently stand.


Re: [prometheus-developers] Integrate Push Notifications with AlertManager

2020-06-28 Thread Stuart Clark


On 28/06/2020 00:08, Josh Wolff wrote:

Hello,

I want to integrate push notifications with Prometheus's AlertManager

I am the founder of Spontit (https://api.spontit.com), and we enable 
people to send limitless, free push notifications. I was reading about 
AlertManager and I think an integration of our API would definitely 
provide a great option for developers.


Does anyone have any thoughts on this, how I should go about this, or 
any ideas in general?


Take a look at the webhook API details: 
https://prometheus.io/docs/alerting/latest/configuration/#webhook_config


If you create a receiver which then forwards alerts to your service it 
can be added here: 
https://prometheus.io/docs/operating/integrations/#alertmanager-webhook-receiver
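
A receiver of the kind described above can be quite small. The sketch
below is only an illustration, in Python using the standard library: it
accepts the JSON that Alertmanager POSTs to a webhook_config URL and turns
each alert into a short text message. The field names (alerts, status,
labels, annotations) follow the documented webhook payload; the send_push
function, the port number and the message format are hypothetical
placeholders, not part of any real API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def alerts_to_messages(payload):
    """Turn an Alertmanager webhook payload (a dict) into text messages."""
    messages = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        name = labels.get("alertname", "unknown")
        status = alert.get("status", "firing").upper()
        summary = annotations.get("summary", "")
        text = f"[{status}] {name}"
        if summary:
            text += f": {summary}"
        messages.append(text)
    return messages


def send_push(message):
    """Hypothetical placeholder: call the push notification service here."""
    print(message)


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
        except json.JSONDecodeError:
            # Reject bodies that are not valid JSON.
            self.send_response(400)
            self.end_headers()
            return
        for message in alerts_to_messages(payload):
            send_push(message)
        # A 2xx response signals success to the sender.
        self.send_response(200)
        self.end_headers()


def main(port=9099):
    # The port is an arbitrary choice for this sketch.
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling main() starts the listener; on the Alertmanager side the receiver
would then list this service's URL under webhook_configs. Keeping the
payload conversion in its own function makes it easy to test without
running a server.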



Re: [prometheus-developers] Re: Helm charts home

2020-06-23 Thread Stuart Clark

On 2020-06-23 09:30, Augustin Husson wrote:

@Mingchin Hsieh Sorry I didn't get your point. To enable the CI/CD
with circle-ci, yes you need to have the admin right. Otherwise to see
how the CI/CD is working you don't need any special right.
@Cedric that's nice ! I didn't know about it. Thanks a lot :)

I think we need to have a vote here, because now it is just a matter
of deciding what to do.

To the @Prometheus Developers: can you please vote on the following
propositions?

1. ONE HELM-CHART REPOSITORY PER ORGANIZATION
* one repository PROMETHEUS-HELM-CHARTS will be created in the
organization PROMETHEUS that will contain the helm chart of:
* prometheus
* alertManager
* node-exporter
* other helm chart relative to the repository contained in the
current organization
* one repository PROMETHEUS-HELM-CHARTS will be created in the
organization PROMETHEUS-COMMUNITY that will contain the helm chart of:
* jira-alerting
* other helm chart relative to the repository contained in the
current organization

2. ONE HELM CHART FOR EVERYTHING
We will create a repository prometheus-helm-charts in
prometheus-community that will contain everything.

3. ONE REPOSITORY PER HELM-CHARTS IN THE ORG PROMETHEUS-COMMUNITY
* prometheus-community/prometheus-helm-chart
* prometheus-community/node-exporter-helm-chart
* prometheus-community/alert-manager-helm-chart
* prometheus-community/jira-alert-helm-chart

I hope I didn't forget any proposition and that it's well summarized.
Please reply if you think there is something missing.

On my side, I'm more in FAVOR OF PROPOSITION 1.

If you went with a repo per chart then, rather than option 3, it would
probably be nicer to have a new organisation specifically for Helm
charts (prometheus-community-helm or something).

--
Stuart Clark

Re: [prometheus-developers] Re: Helm charts home

2020-06-19 Thread Stuart Clark


On 2020-06-19 15:09, Mingchin Hsieh wrote:

Hi,

I sort of agree with Stuart's idea; only a little tweak: adding
helm-chart as prefix or suffix. For example,

Prefix approach -
helm-chart-prometheus-adapter
helm-chart-prometheus-blackbox-exporter
helm-chart-prometheus-cloudwatch-exporter
helm-chart-prometheus-consul-exporter
helm-chart-prometheus-couchdb-exporter
helm-chart-prometheus-mongodb-exporter
helm-chart-prometheus-mysql-exporter
helm-chart-prometheus-nats-exporter
helm-chart-prometheus-node-exporter
helm-chart-prometheus-operator
helm-chart-prometheus-postgres-exporter
helm-chart-prometheus-pushgateway
helm-chart-prometheus-rabbitmq-exporter
helm-chart-prometheus-redis-exporter
helm-chart-prometheus-snmp-exporter
helm-chart-prometheus-to-sd
helm-chart-prometheus

Suffix approach -
prometheus-adapter-helm-chart
prometheus-blackbox-exporter-helm-chart
prometheus-cloudwatch-exporter-helm-chart
prometheus-consul-exporter-helm-chart
prometheus-couchdb-exporter-helm-chart
prometheus-mongodb-exporter-helm-chart
prometheus-mysql-exporter-helm-chart
prometheus-nats-exporter-helm-chart
prometheus-node-exporter-helm-chart
prometheus-operator-helm-chart
prometheus-postgres-exporter-helm-chart
prometheus-pushgateway-helm-chart
prometheus-rabbitmq-exporter-helm-chart
prometheus-redis-exporter-helm-chart
prometheus-snmp-exporter-helm-chart
prometheus-to-sd-helm-chart
prometheus-helm-chart

This is because there are some existing repos in prometheus-community
that focus on the implementation of each component (e.g. docker image
or stand-alone service). Mixing them together might make it harder to
publish on hub.helm.sh [1]. But the owners of prometheus-community
retain the right to make the final decision.

BTW, could any prometheus-community owners / members explain the
current testing infrastructure? Currently the helm chart testing infra
is based on Google Bazel + CircleCI. There are some limitations there,
e.g. it is hard for chart owners / approvers to debug the testing
infra. I think all the current Prometheus-related helm chart owners
would like to know how hard migration / automation would be.

Best,
Mingchin

On Fri, Jun 19, 2020 at 8:55 PM Stuart Clark wrote:


On 2020-06-19 13:30, André Bauer wrote:

Hey guys,

great to see there is already some effort to move the chart out of the
stable repo :)

As I understand it, "prometheus" is not the perfect fit for the chart
name, as it also installs other components from the Prometheus
ecosystem. I'm also not the biggest fan of umbrella charts.
From our experience at kiwigrid this can lead to updating issues.
For example, you'd need to update the Prometheus server, but because of
the umbrella it could already fail and exit in the alertmanager update
step.
Therefore we switched to single chart installs, as you're able to
update single components without the need to run the update for all
charts under the umbrella, which is much more error resistant in our
experience.

Nevertheless an umbrella chart might be a good starting point for
testing Prometheus with all of its available components.

Where I see problems is in deprecating the chart in stable and changing
the way the chart works in the new repo.
Maybe such changes should be done in an earlier step in the stable
chart repo?
At least the documentation of the upgrade path should be clear, and the
upgrade should be possible without manual steps like PVC backup /
restore because the name of the PVC changed.



There are a number of existing charts in the stable repo, which are
mostly for installing individual pieces:

prometheus-adapter
prometheus-blackbox-exporter
prometheus-cloudwatch-exporter
prometheus-consul-exporter
prometheus-couchdb-exporter
prometheus-mongodb-exporter
prometheus-mysql-exporter
prometheus-nats-exporter
prometheus-node-exporter
prometheus-operator
prometheus-postgres-exporter
prometheus-pushgateway
prometheus-rabbitmq-exporter
prometheus-redis-exporter
prometheus-snmp-exporter
prometheus-to-sd
prometheus

I'd suggest as a first step to just move them all exactly as they are
into the prometheus/prometheus-community organisation, and then look at
making changes later...



Sorry I wasn't clear. You'd expect all those to live in the same repo as 
different directories, rather than different repos. You also need 
somewhere to publish the charts to (e.g. Chartmuseum)


--
Stuart Clark


Re: [prometheus-developers] Re: Helm charts home

2020-06-19 Thread Stuart Clark


On 2020-06-19 13:30, André Bauer wrote:

Hey guys,

great to see there is already some effort to move the chart out of the
stable repo :)

As I understand it, "prometheus" is not the perfect fit for the chart
name, as it also installs other components from the Prometheus
ecosystem. I'm also not the biggest fan of umbrella charts.
From our experience at kiwigrid this can lead to updating issues.
For example, you'd need to update the Prometheus server, but because of
the umbrella it could already fail and exit in the alertmanager update
step.
Therefore we switched to single chart installs, as you're able to
update single components without the need to run the update for all
charts under the umbrella, which is much more error resistant in our
experience.

Nevertheless an umbrella chart might be a good starting point for
testing Prometheus with all of its available components.

Where I see problems is in deprecating the chart in stable and changing
the way the chart works in the new repo.
Maybe such changes should be done in an earlier step in the stable
chart repo?
At least the documentation of the upgrade path should be clear, and the
upgrade should be possible without manual steps like PVC backup /
restore because the name of the PVC changed.



There are a number of existing charts in the stable repo, which are
mostly for installing individual pieces:


prometheus-adapter
prometheus-blackbox-exporter
prometheus-cloudwatch-exporter
prometheus-consul-exporter
prometheus-couchdb-exporter
prometheus-mongodb-exporter
prometheus-mysql-exporter
prometheus-nats-exporter
prometheus-node-exporter
prometheus-operator
prometheus-postgres-exporter
prometheus-pushgateway
prometheus-rabbitmq-exporter
prometheus-redis-exporter
prometheus-snmp-exporter
prometheus-to-sd
prometheus

I'd suggest as a first step to just move them all exactly as they are 
into the prometheus/prometheus-community organisation, and then look at 
making changes later...


--
Stuart Clark


Re: [prometheus-developers] Re: Helm charts home

2020-06-09 Thread Stuart Clark


On 09/06/2020 13:15, David Karlsen wrote:

+1
Single chart for single components, and then an umbrella-chart can 
bring all of them together - then people can select whatever is most 
appropriate.





Prometheus Operator & generic Prometheus are two different things, and 
the existing Helm repo reflects that.


You can use the various individual Helm charts to install Prometheus, 
Alertmanager, Grafana and various exporters and then manually plumb 
things together.


Alternatively Prometheus Operator mixes in some extra magic using 
slightly customised deployments (using the Prometheus Operator chart) to 
allow decentralised configuration using different CRDs (ServiceMonitors 
for what to scrape, alert rules, instances, etc.)


So both have their place.

--
Stuart Clark


Re: [prometheus-developers] [VOTE] Allow listing non-SNMP exporters for devices that can already be monitored via the SNMP Exporter

2020-05-29 Thread Stuart Clark

On 29/05/2020 16:00, Bjoern Rabenstein wrote:

On 28.05.20 21:30, Julius Volz wrote:

I therefore call a vote for the following proposal:

Allow adding exporters to https://prometheus.io/docs/instrumenting/exporters/
although the devices or applications that they export data for can already be
monitored via SNMP (and thus via the SNMP Exporter). This proposal does not
affect other criteria that we may use in deciding whether to list an exporter
or not.

YES

It would obviously be better if those exporter listing decisions would
"just work" with best judgement and we didn't need to vote about
individual guidelines. However, the discussion in
https://github.com/prometheus/docs/pull/1640 circled back to the SNMP
Exporter argument multiple times. The single person on the one side of
the argument explained their concerns, they were considered, but
failed to convince. With the room leaning so obviously to the other
side, one might ask why that circling back had to happen. The vote can
help here to prune at least one branch of the meandering
discussion. In particular with the often used reasoning that "that's
how we did it before", it's good to know if perhaps "that's not how we
want to do it in the future".

Having said that, I do believe that we should have a more fundamental
discussion about revising "our" criteria of accepting exporter
listings. My impression is that the way it is done right now doesn't
represent our collective intentions very well. Even worse, I am fairly
certain that the process is partially defeating its purpose. In
particular, instead of encouraging the community to join efforts, we
are causing even more fragmentation. Which is really tragic, given how
much time and effort Brian invests in the review work. Kickstarting
such a discussion has been on my agenda for a long time, but given how
my past attempts to move the needle went, it appeared to be a quite
involved effort, for which I'm lacking the capacity. (Others told me
similar things, which reminds me of the "capitulation" topic in
RFC7282, where people cease to express their point of view because
"they don't have the energy to argue against it". Votes, like this
particular one, might then just be an attempt to get out of the many
branches and loops created by persistently upholding objections that
most of the room considers addressed already.)

Do people make use of the "other exporters list" on the wiki?

Would it make sense to make that a bit more known so there is another
less structured place to put things?

https://github.com/prometheus/prometheus/wiki/Default-port-allocations

--
Stuart Clark


Re: [prometheus-developers] Re: Changing Prometheus from lazy to rough consensus?

2020-05-29 Thread Stuart Clark


On 2020-05-28 23:36, Richard Hartmann wrote:

The current situation is what I want to avoid with my suggestion.

Short-circuiting discussions directly into votes, potentially after
mere hours, is not healthy for the project long term and likely
reflects frustrations built up over time.

Voting results are also harder to adapt and evolve as they will need
new votes, not just consensus, to change. This might not be a problem
in the two current well-scoped votes at the moment, but it will become
one more and more over time.

While I can understand the underlying motivations for quick voting on
ALL the things, I would much prefer to fix the default mechanism,
consensus, over moving to a different one, voting, as the new default.



That sounds like a different issue to changing the definition of 
consensus.


Maybe the voting process should have a third option which isn't a yes or 
a no, but a "go back to discussion"?


Or you could have some criteria (time, etc.) which must be met before a
vote can be called (although that can also bring difficulties).


--
Stuart Clark


Re: [prometheus-developers] move https/ package to client_golang

2020-05-27 Thread Stuart Clark


On 27/05/2020 07:50, Brian Brazil wrote:
On Wed, 27 May 2020 at 07:05, Ben Kochie <sup...@gmail.com> wrote:


I was thinking about building an "exporter kit" repo that would
include some helpful functions to reduce the amount of boilerplate
needed to write exporters.


I've thought such a thing would be useful for a long time, though my 
presumption was always that it would end up in client_golang as it's 
not too far from instrumentation.


In general I'm not a big fan of widespread proliferation of repos, 
particularly if it's lots of tiny repos. Even in the previous cases 
where we managed to get the layering largely right, it still was quite 
a pain in terms of overhead and release management if the repos were 
being actively developed. A single toolkit-y repo I could live with, 
I'd be concerned if we were talking repos beyond that.



How does the release management/overhead differ between several single 
purpose repos and a single repo containing independent things in 
different directories?



--
Stuart Clark


Re: [prometheus-developers] Changing Prometheus from lazy to rough consensus?

2020-05-27 Thread Stuart Clark


On 27/05/2020 00:51, Bjoern Rabenstein wrote:



Yes that joins what I said just before. However it might be difficult
to give one person so much "power".

At least the "parliament" (i.e. prometheus-team) could just elect
another chair if the current one goes wild.


In a lot of organisations with chair based systems which work well the 
opposite is often true.


The chair is there to facilitate the debate and therefore doesn't 
express their own opinion. As a result, if a judgement call has to be 
made (such as calling rough consensus) there is trust from all sides 
that it is a reasonable decision, not just one following their personal belief.


(I've also seen and had similar advice for face-to-face meetings - it 
works best if the chair/facilitator is not one of the active participants)


--
Stuart Clark


Re: [prometheus-developers] cpu utilization

2020-02-21 Thread Stuart Clark

If you want to reduce CPU usage you can decrease the number of targets to 
scrape or make the scrape interval longer. Equally, if the CPU is being used for 
queries, look at recording rules and at what else is querying the data.

It might help if you explain the problems you are seeing and what you are 
hoping to achieve? Prometheus needs a certain level of cpu, memory and disk to 
work so it may just be that you aren't allocating enough resources for the size 
of your monitoring. 
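
As a rough way to see how those two knobs interact, the ingestion rate can
be approximated as targets x series per target / scrape interval. The
Python sketch below only illustrates that arithmetic: the target and series
counts are made-up numbers, and the 200,000 samples per second per CPU
figure is the rule of thumb Ben Kochie quotes elsewhere in this thread, not
a guarantee. Query and rule load come on top and are not modelled here.

```python
# Rule of thumb quoted by Ben Kochie in this thread; treat as approximate.
ROUGH_SAMPLES_PER_CPU = 200_000


def samples_per_second(targets, series_per_target, scrape_interval_seconds):
    """Approximate ingestion rate: each series yields one sample per scrape."""
    return targets * series_per_target / scrape_interval_seconds


# Illustrative numbers only, not measurements.
baseline = samples_per_second(
    targets=500, series_per_target=1000, scrape_interval_seconds=15
)
longer_interval = samples_per_second(
    targets=500, series_per_target=1000, scrape_interval_seconds=60
)
fewer_targets = samples_per_second(
    targets=250, series_per_target=1000, scrape_interval_seconds=15
)

for label, rate in [
    ("baseline", baseline),
    ("60s scrape interval", longer_interval),
    ("half the targets", fewer_targets),
]:
    cpus = rate / ROUGH_SAMPLES_PER_CPU
    print(f"{label}: {rate:,.0f} samples/s, roughly {cpus:.2f} CPUs for ingestion")
```

Going from a 15s to a 60s interval cuts the ingestion rate to a quarter,
while halving the number of targets only halves it.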

On 21 February 2020 08:57:43 GMT, adi garg  wrote:
>Hey Ben, thanks for replying. Is there any flag in Prometheus to control
>this CPU usage, or do we have to depend on node-exporter metrics only? Can you
>suggest some areas we should look into once Prometheus starts taking too much
>CPU time? For example, what could be the possible reasons for a sudden
>increase?
>On Friday, February 21, 2020 at 1:41:28 PM UTC+5:30, Ben Kochie wrote:
>>
>> Prometheus is meant to be run once to monitor many services on many nodes.
>> In a typical non-containerized environment, people generally dedicate a
>> node to it.
>>
>> As Stuart says, it depends on your rate of ingestion. It also depends on
>> the query and rule load. But typically you should be able to handle 200,000
>> samples per second per CPU.
>>
>> On Fri, Feb 21, 2020 at 6:26 AM adi garg wrote:
>>
>>> Thanks Stuart. Is it possible to control Prometheus once its CPU usage
>>> gets too high and starts affecting the other processes?
>>>
>>> On Thursday, February 20, 2020 at 4:01:07 PM UTC+5:30, Stuart Clark wrote:
>>>>
>>>> That will very much depend on what you are planning on doing.
>>>>
>>>> The more jobs you scrape, the shorter the scrape interval or the more
>>>> metrics you collect, the more CPU will be used.
>>>>
>>>> Equally more queries and more complex queries over longer time periods
>>>> would use more CPU.
>>>>
>>>> On 20 February 2020 08:51:55 GMT, adi garg wrote:
>>>>>
>>>>> Hello experts, can you please share some insights on what percentage
>>>>> of CPU is generally taken by Prometheus? And will it affect the other
>>>>> processes running on the system?
>>>>>
>>>>> Regards,
>>>>> Aditya Garg
>>>>>
>>>>>
>>>> -- 
>>>> Sent from my Android device with K-9 Mail. Please excuse my
>brevity.
>>>>
>>
>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [prometheus-developers] Is there any way to give memory specifications by ourselves in prometheus?

2020-02-20 Thread Stuart Clark

Probably not. Prometheus will write scraped data to the WAL and if that fails 
it won't be stored anywhere. Equally if a file is only partially written 
(either a full block or something in the WAL) you may end up losing all the 
data in that file (Prometheus will check for corruption on startup). 

On 20 February 2020 08:25:14 GMT, adi garg  wrote:
>Thanks Stuart. Is it possible to restart Prometheus without data loss
>from that point?
>
>On Thursday, February 20, 2020 at 1:39:03 PM UTC+5:30, Stuart Clark wrote:
>>
>> If you are meaning the disk storage, Prometheus would stop working and you
>> might also encounter some corruption.
>>
>> You can control disk usage by setting the retention period or by setting
>> the maximum space to remain available.
>>
>> I'd strongly suggest putting the storage on a separate mount and also
>> using the node exporter with alerts to ensure you don't run out of space.
>>
>> On 20 February 2020 05:18:03 GMT, adi garg wrote:
>>>
>>> Thanks Julius and Julien. Awesome answers. This is related to RAM, but
>>> what will happen if the secondary storage is not sufficient to take the
>>> metrics? What will happen in that case?
>>>
>>> On Wednesday, February 19, 2020 at 10:32:43 PM UTC+5:30, Julius Volz
>>> wrote:
>>>>
>>>> While an explicit memory limit is not configurable, there are a number
>>>> of knobs in Prometheus that one can configure that limit resource usage
>>>> along certain dimensions, for example
>>>> https://www.robustperception.io/limiting-promql-resource-usage.
>>>>
>>>> There's also a setting that limits the maximum number of samples
>>>> ingested per scrape.
>>>>
>>>> On Wed, Feb 19, 2020 at 5:34 PM Julien Pivotto wrote:
>>>>
>>>>> On 19 Feb 08:24, adi garg wrote:
>>>>> > So what will happen if prometheus crosses the RAM limit, will it die?
>>>>> > Or is it gonna affect the other processes running on the system?
>>>>>
>>>>>
>>>>> It is the operating system that will choose. Prometheus will probably be
>>>>> terminated by the operating system.
>>>>>
>>>>> > 
>>>>> > On Wednesday, February 19, 2020 at 9:41:36 PM UTC+5:30, Julien 
>>>>> Pivotto 
>>>>> > wrote:
>>>>> > >
>>>>> > > On 19 Feb 08:08, adi garg wrote: 
>>>>> > > > Hello experts, 
>>>>> > > > 
>>>>> > > > Is there any way to give memory specifications by ourselves
>in 
>>>>> > > prometheus? 
>>>>> > >
>>>>> > > Hello, 
>>>>> > >
>>>>> > > No, this is not possible. Prometheus will use the memory it
>needs. 
>>>>> > >
>>>>> > > Regards, 
>>>>> > >
>>>>> > > > 
>>>>>
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> > > -- 
>>>>> > >  (o-Julien Pivotto 
>>>>> > >  //\Open-Source Consultant 
>>>>> > >  V_/_   Inuits - https://www.inuits.eu 
>>>>> > >
>>>>> > 

Re: [prometheus-developers] Is there any way to give memory specifications by ourselves in prometheus?

2020-02-20 Thread Stuart Clark

If you mean disk storage, Prometheus would stop working and you 
might also encounter some corruption.

You can control disk usage by setting the retention period or by setting a 
maximum size for the stored data.

I'd strongly suggest putting the storage on a separate mount and also using the 
node exporter with alerts to ensure you don't run out of space. 
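
To get a feel for how much space a given retention period needs, the
Prometheus storage documentation gives a rough formula: retention time in
seconds x ingested samples per second x bytes per sample, with around 1-2
bytes per sample on average. The Python sketch below just applies that
formula; the ingestion rate and retention period are illustrative
assumptions, and the result is an estimate rather than a guarantee. The
retention itself is set with the --storage.tsdb.retention.time and
--storage.tsdb.retention.size flags.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def needed_disk_bytes(retention_days, samples_per_second, bytes_per_sample=2.0):
    """Rough capacity estimate: retention x ingestion rate x bytes per sample."""
    retention_seconds = retention_days * SECONDS_PER_DAY
    return retention_seconds * samples_per_second * bytes_per_sample


# Illustrative assumptions: 15 days of retention, 100,000 samples/s ingested,
# and the pessimistic end of the documented 1-2 bytes per sample.
estimate = needed_disk_bytes(retention_days=15, samples_per_second=100_000)
print(f"estimated block storage: {estimate / 1e9:.1f} GB")
```

The estimate is best treated as a lower bound when sizing the separate
mount, since the WAL and blocks being compacted also take up space.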

On 20 February 2020 05:18:03 GMT, adi garg  wrote:
>Thanks Julius and Julien. Awesome answers. This is related to RAM, but what
>will happen if the secondary storage is not sufficient to take the metrics?
>What will happen in that case?
>
>On Wednesday, February 19, 2020 at 10:32:43 PM UTC+5:30, Julius Volz wrote:
>>
>> While an explicit memory limit is not configurable, there are a number of
>> knobs in Prometheus that one can configure that limit resource usage along
>> certain dimensions, for example
>> https://www.robustperception.io/limiting-promql-resource-usage.
>>
>> There's also a setting that limits the maximum number of samples ingested
>> per scrape.
>>
>> On Wed, Feb 19, 2020 at 5:34 PM Julien Pivotto wrote:
>>
>>> On 19 Feb 08:24, adi garg wrote:
>>> > So what will happen if prometheus crosses the RAM limit, will it die?
>>> > Or is it gonna affect the other processes running on the system?
>>>
>>>
>>> It is the operating system that will choose. Prometheus will probably be
>>> terminated by the operating system.
>>>
>>> > 
>>> > On Wednesday, February 19, 2020 at 9:41:36 PM UTC+5:30, Julien
>Pivotto 
>>> > wrote:
>>> > >
>>> > > On 19 Feb 08:08, adi garg wrote: 
>>> > > > Hello experts, 
>>> > > > 
>>> > > > Is there any way to give memory specifications by ourselves in
>
>>> > > prometheus? 
>>> > >
>>> > > Hello, 
>>> > >
>>> > > No, this is not possible. Prometheus will use the memory it
>needs. 
>>> > >
>>> > > Regards, 
>>> > >
>>> > > > 
>>>
>>> > >
>>> > >
>>> > >
>>> > > -- 
>>> > >  (o-Julien Pivotto 
>>> > >  //\Open-Source Consultant 
>>> > >  V_/_   Inuits - https://www.inuits.eu 
>>> > >
>>> > 
>>>
>>>
>>> -- 
>>>  (o-Julien Pivotto
>>>  //\Open-Source Consultant
>>>  V_/_   Inuits - https://www.inuits.eu
>>>
>>>
>>
>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

