[prometheus-users] Re: Question on promtool and amtool

2020-09-13 Thread Brian Candler
On Saturday, 12 September 2020 21:20:43 UTC+1, kiran wrote:
>
> Hello all
>
> 1. Is promtool automatically installed with Prometheus?
>

It's supplied as part of the distribution: 

# tar -tzf prometheus-2.20.1.linux-amd64.tar.gz
prometheus-2.20.1.linux-amd64/
prometheus-2.20.1.linux-amd64/LICENSE
prometheus-2.20.1.linux-amd64/NOTICE
prometheus-2.20.1.linux-amd64/tsdb
prometheus-2.20.1.linux-amd64/prometheus
prometheus-2.20.1.linux-amd64/console_libraries/
prometheus-2.20.1.linux-amd64/console_libraries/prom.lib
prometheus-2.20.1.linux-amd64/console_libraries/menu.lib
*prometheus-2.20.1.linux-amd64/promtool*
prometheus-2.20.1.linux-amd64/prometheus.yml
prometheus-2.20.1.linux-amd64/consoles/
prometheus-2.20.1.linux-amd64/consoles/node.html
prometheus-2.20.1.linux-amd64/consoles/prometheus-overview.html
prometheus-2.20.1.linux-amd64/consoles/prometheus.html
prometheus-2.20.1.linux-amd64/consoles/node-overview.html
prometheus-2.20.1.linux-amd64/consoles/index.html.example
prometheus-2.20.1.linux-amd64/consoles/node-disk.html
prometheus-2.20.1.linux-amd64/consoles/node-cpu.html
 
Whether it's "installed" or not depends on who or what is doing the 
installing.

> 2. If Prometheus is installed in a docker container, how to use promtool to 
> validate the prometheus.yml file
>

Use "docker exec" to run promtool. (Or "kubectl exec" if it's running in 
kubernetes).
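A minimal sketch, assuming the container is named `prometheus` and the config sits at the image's default path (adjust both to your setup):

```shell
# Validate the config inside the running container
docker exec prometheus promtool check config /etc/prometheus/prometheus.yml
```

Under Kubernetes the equivalent would be `kubectl exec <pod> -- promtool check config /etc/prometheus/prometheus.yml`.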
 

> 3. Is amtool automatically installed with alertmanager?
>

Same as before, it's included in the distribution tarball.
 

> 4. If alertmanager is installed in a docker container, how to use amtool 
> to validate alertmanager.yml file
>

Same as before, "docker exec". 
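For example, assuming the container is named `alertmanager` and the config is at the image's default path (both assumptions; adjust to your setup):

```shell
# Validate the Alertmanager config inside the running container
docker exec alertmanager amtool check-config /etc/alertmanager/alertmanager.yml
```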

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/fa6e3c7c-a902-43b4-a22e-01b6202e717eo%40googlegroups.com.


Re: [prometheus-users] prometheus delete old data files

2020-09-13 Thread Ben Kochie
TSDB blocks are automatically cleaned up, but it does this on the 2 hour
block management schedule. Blocks also must be fully expired (maxTime)
before they are deleted.

You probably just need to wait for the maxTime on the oldest block to
expire. Look in the meta.json in the TSDB block directories.
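For instance, assuming the default data directory `./data` (adjust to your `--storage.tsdb.path`), each block's `meta.json` carries `minTime`/`maxTime` as Unix timestamps in milliseconds:

```shell
# Print the time range recorded in each TSDB block's meta.json
for meta in data/*/meta.json; do
  echo "$meta:"
  grep -E '"(minTime|maxTime)"' "$meta"
done
```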

On Sun, Sep 13, 2020 at 3:35 AM Johny  wrote:

> I am reducing data retention from 20 days to 10 days in my Prometheus
> nodes (v. 2.17). When I change *storage.tsdb.retention.time* to 10d and
> restart my instances, this does not delete data older than 10 days. Is
> there a command to force cleanup?
>
> In general, what is best practice to delete older data in Prometheus?
>
>



Re: [prometheus-users] Re: Global Labels in Alerts

2020-09-13 Thread Ben Kochie
This is the use case for external labels; they are attached to all alerts.
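A minimal sketch (the label names and values here are hypothetical; use whatever identifies this Prometheus instance):

```yaml
global:
  external_labels:
    environment: production
    region: eu-west-1
```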

On Wed, Sep 9, 2020 at 12:44 PM Brian Candler  wrote:

> To add labels to *every* alert sent from this prometheus instance, see
> alert_relabel_configs:
>
> https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alert_relabel_configs
>



[prometheus-users] Re: Monitor number of seconds since metric change as prometheus time series

2020-09-13 Thread Weston Greene
I feel like this answer directly gives what you need minus one step, so 
forgive me if I'm misunderstanding. The one step it doesn't explicitly state 
is a second rule for `time() - stat__change__timestamp`.
Here is an example directly from my working solution:

```rules.yaml
  - record: stat__change__timestamp
    # timestamp of when the metric last changed
    expr: |
      timestamp(changes({exported_job=~"visor_.*", alertname="", offset="",
          original_name!="", original_stat=""}[${SCRAPE_INTERVAL_AND_A_HALF}]) > 0)
        or ignoring(stat, monitor, original_stat)
      stat__change__timestamp
    labels:
      stat: true
      original_stat: stat__change__timestamp  # keeps the stat__offset of this metric unique from the original

  - record: stat__change__seconds_since
    # number of seconds since the metric value changed; this will highlight
    # whether a script is not recording correctly or if a metric is stagnant
    expr: time() - stat__change__timestamp
    labels:
      stat: true
      original_stat: stat__change__seconds_since  # keeps the stat__offset of this metric unique from the original
```

An alternative to `changes()` (pulled from a different prometheus server I 
manage, hence the different label criteria):
```rules.yaml
timestamp(
  (
      kafka_consumer_group_lag{topic!~".*verification_id|.*submission_id|.*__leader|.*-changelog|.*_Internal.*", group!="BifrostMonitor_Bifrost_MongoTopicDumper"}
    -
      kafka_consumer_group_lag{topic!~".*verification_id|.*submission_id|.*__leader|.*-changelog|.*_Internal.*", group!="BifrostMonitor_Bifrost_MongoTopicDumper"} offset ${SCRAPE_INTERVAL_DOUBLE}
  ) != 0
)
```

When I say `SCRAPE_INTERVAL`, I mean 
```prometheus.yaml
  global:
scrape_interval: ${SCRAPE_INTERVAL} # Default is every minute.
evaluation_interval: ${EVALUATION_INTERVAL} # default is every minute.
  alerting:
 ...
```

I can't remember why I chose `_AND_A_HALF` for `changes()` and yet 
`_DOUBLE` for subtracting the offset. Don't think it much matters.

On Wednesday, September 9, 2020 at 6:41:13 AM UTC-4 t1hom7as wrote:

> I am actually trying to do something very similar, but I can't really tell 
> if it is the same or not.
> Basically, I have a metric that gives me the status of up or down, being 1 
> or 0 respectively in the value field. 
>
> I would like to somehow find out from when the value went FROM 0 TO 1, so 
> how long it has been. 
> In this case, how long since it changed to 1 to the current timestamp, 
> therefore I should be able to measure the uptime value of that metric.  
>
> Open to ideas, as I can't seem to get this working, eventually I would 
> like to present this into grafana so I can show the uptime of that metric.  
>
> On Friday, 3 April 2020 at 10:01:52 UTC+1 weston...@gmail.com wrote:
>
>> ANSWERED! 
>> From Stackoverflow:
>>
>> Summing up our discussion: the evaluation interval is too big; after 5 
>> minutes, a metric becomes [stale][1]. This means that when the expression 
>> is evaluated, the right hand side of your `OR` expression is no longer 
>> considered by Prometheus and thus is always empty.
>>
>> Your second issue is that your record rule is adding some labels to the 
>> original metric and you get some complaint by Prometheus. This is not 
>> because the labels already exist: in [recording rules][3], labels 
>> overwrite the existing labels.
>>
>> The issue is your `OR` expression: it should specify an `ignoring()` 
>> [matching clause][2] for ignoring the added labels or you will get the 
>> labels from both sides of the `OR` expression:
>>
>> > `vector1 or vector2` results in a vector that contains all original 
>> elements (label sets + values) of vector1 and additionally all elements of 
>> vector2 ***which do not have matching label sets in vector1***.
>>
>> Since you get both sides of the `OR`, when Prometheus tries to add the 
>> labels to the left hand side, it conflicts with the right hand side which 
>> already exists.
>>
>> Your expression should be something like:
>> ```yaml
>> expr: |
>>   timestamp(changes(metric-name[450s]) > 0)
>> or ignoring(stat,monitor)
>>   last-update
>> ```
>> Or use an `ON(label1,label2,...)` clause on a discriminating label set 
>> which avoids changing the expression whenever you change the labels.
>>
>>
>>   [1]: 
>> https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness
>>   [2]: 
>> https://prometheus.io/docs/prometheus/latest/querying/operators/#one-to-one-vector-matches
>>   [3]: 
>> https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/#rule
>>
>>
>> On Wednesday, April 1, 202

[prometheus-users] prometheus-openstack-exporter vs go-openstack exporter

2020-09-13 Thread Ehsn sa
Hi everybody. Can someone help me with these two different implementations 
of the OpenStack exporter? Which one would you suggest, and why? Is there 
any metric to compare the two?
Thanks



[prometheus-users] Re: Question on promtool and amtool

2020-09-13 Thread kiran
Thank you, Brian




[prometheus-users] Prometheus/AlertManager to Discord

2020-09-13 Thread Adam
Hi all,

I am using alertmanager notifications with slack and email and everything 
is ok. Now I am trying to send the same notifications to a Discord server 
too via Webhooks.

Using the below configurations in "/etc/alertmanager/alertmanager.yml", I 
can receive alerts on Slack and Email but not Discord. Please note that I 
tested the Discord webhook URL in other tools and it works fine.

global:
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: 'AlertManager '
  smtp_require_tls: true
  smtp_hello: 'alertmanager'
  smtp_auth_username: 'xyz'
  smtp_auth_password: 'XXX'

  slack_api_url: 'https://hooks.slack.com/services/XYZ/ABC/EFG5'

route:
  group_by: ['instance', 'alert']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  receiver: 'R1'

receivers:
  - name: 'R1'
    #email_configs:
    #  - to: 'a...@gmail.com'
    slack_configs:
      # https://prometheus.io/docs/alerting/configuration/#slack_config
      - channel: 'system_events'
      - username: 'AlertManager'
    webhook_configs:
      - url: 'https://discordapp.com/api/webhooks/XYZ/ABC'
My questions are:
1. Why am I not receiving alerts on Discord?
2. Is there a better way to achieve the same thing (Discord integration)?

Thank you in advance.



[prometheus-users] Re: Prometheus/AlertManager to Discord

2020-09-13 Thread Brian Candler
On Sunday, 13 September 2020 14:59:39 UTC+1, Adam wrote:
>
> 1-Why I am not receiving alerts Discord?
>

You'll need to check logs to answer that.  Not knowing anything about 
discord, I'd say the most likely problems are:

1. Discord's API does not accept the JSON object structure which Prometheus 
sends in its webhook POST (if so, I'd expect Discord to send a 400 response). 
The format that Prometheus sends is fixed, so if Discord needs a different 
format, you'd need to write some sort of proxy which translates it.
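As a quick manual check (the webhook path below is the placeholder from the config above, not a real one): as far as I know, Discord's webhook API expects a JSON body of the form `{"content": "..."}`, which is not the payload shape Alertmanager sends.

```shell
# Post a hand-crafted message directly to the Discord webhook
curl -X POST \
     -H 'Content-Type: application/json' \
     -d '{"content": "test alert"}' \
     'https://discordapp.com/api/webhooks/XYZ/ABC'
```

If this succeeds while Alertmanager's POSTs get a 400, the payload format is the problem.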
 
2. Discord's API requires some sort of authentication, e.g. a bearer token, 
which you've not configured (if so, I'd expect Discord to send a 401 or 403 
response).

The response code should be visible in alertmanager logs.
 

> 2-Is there a better way to achieve the same thing(Discord Integration)?
>

Discord appears to be a commercial service, so it's probably best to ask 
their support, or their discussion group if there is one.



[prometheus-users] Re: Prometheus/AlertManager to Discord

2020-09-13 Thread Adam
Thank you for pointing me in the right direction. I checked the logs and 
found the following error message.

*cancelling notify retry for \"webhook\" due to unrecoverable error: 
unexpected status code 400: 
https://discordapp.com/api/webhooks/YYY/XXX"*

I googled this error and found this thread 
https://stackoverflow.com/questions/55792653/adding-custom-webhook-configuration-in-alertmanager
 
but I am not sure how I can change the HTTP method to POST.

Thank you




[prometheus-users] Redis Exporter | data not coming over grafana

2020-09-13 Thread Aman Gupta
Can anyone help me with Redis exporter metrics?
redis version: 3.0.3
redis-exporter version: 1.10.0
exporter running at : 9121
grafana dashboard: 763

prometheus config:
  - job_name: 'redis_exporter'
static_configs:
- targets: ['DNS:9121']

The target is healthy under Prometheus, but data is still not showing up in 
Grafana.

When I look at the metrics:

curl http://DNS:9121/metrics | grep redis


redis_exporter_build_info{build_date="2018-09-20-18:15:12",commit_sha="8bb0b841e9a70b0348f69483e58fea01d521c47a",golang_version="go1.10.4",version="v0.21.2"} 1
redis_exporter_last_scrape_duration_seconds 0.000292674
redis_exporter_last_scrape_error 1
redis_exporter_scrapes_total 125
redis_up{addr="redis://localhost:6379",alias=""} 0


Issue: only very few metrics are exposed, which is why no data is showing up 
in Grafana.

metrics  required:
1. redis_uptime_in_seconds
2. redis_connected_clients
3. redis_memory_used_bytes/ redis_memory_max_bytes

None of the data is coming through. Can anyone help? What am I missing?



Re: [prometheus-users] Disable remote write retry

2020-09-13 Thread Ruben Papovyan
@bwplotka,
Thanks for your response.
I see 400 and 500 errors in the Cortex distributor.
A 400 will NOT be sent again; however, a 500 will be resent, and that caused 
an outage.

These are the two types of errors I see in the distributor; there are no 
error logs in the ingesters (only 400 errors):

```
level=warn ts=2020-09-11T15:14:15.55091129Z caller=logging.go:62 
traceID=1e80d0d72c7dfb18 msg="POST /api/prom/push (500) 11.40001159s 
Response: \"context canceled\\n\" ws: false; Connection: close; 
Content-Encoding: snappy; Content-Length: 74202; Content-Type: 
application/x-protobuf; User-Agent: Prometheus/2.16.0; X-Forwarded-For: 
10.254.178.57; X-Forwarded-Host: cortex.devops.app.umusic.net; 
X-Forwarded-Port: 80; X-Forwarded-Proto: http; 
X-Prometheus-Remote-Write-Version: 0.1.0; X-Real-Ip: 10.254.178.57; 
X-Request-Id: aa786f8ba1483741acdcbb8503f9fb0d; X-Scheme: http; 
X-Scope-Orgid: eks-11; "
level=warn ts=2020-09-11T15:14:09.942532161Z caller=logging.go:62 
traceID=69a628f39a21de24 msg="POST /api/prom/push (500) 6.100572749s 
Response: \"rpc error: code = DeadlineExceeded desc = context deadline 
exceeded\\n\" ws: false; Connection: close; Content-Encoding: snappy; 
Content-Length: 5908; Content-Type: application/x-protobuf; User-Agent: 
Prometheus/2.13.1; X-Forwarded-For: 10.104.33.77; X-Forwarded-Host: 
cortex.devops.app.umusic.net; X-Forwarded-Port: 80; X-Forwarded-Proto: 
http; X-Prometheus-Remote-Write-Version: 0.1.0; X-Real-Ip: 10.104.33.77; 
X-Request-Id: 3859a4b2f0e3b3badc281b95c9d7b852; X-Scheme: http; 
X-Scope-Orgid: eks-13; "
```

In the Prometheus log I see 400s, so the Cortex gateway is not hiding the 
real status code. Prometheus logs:
ts=2020-09-11T15:32:05.667Z caller=dedupe.go:112 component=remote 
level=error remote_name=435af2 url=
http://cortex.devops.local.int/api/prom/push/aws10-eks msg="non-recoverable 
error" count=361 err="context canceled"
ts=2020-09-11T15:32:05.667Z caller=dedupe.go:112 component=remote 
level=error remote_name=435af2 url=
http://cortex.devops.local.int/api/prom/push/aws10-eks msg="non-recoverable 
error" count=60 err="context canceled"
ts=2020-09-11T15:32:05.635Z caller=dedupe.go:112 component=remote 
level=error remote_name=435af2 url=
http://cortex.devops.local.int/api/prom/push/aws10-eks msg="Failed to flush 
all samples on shutdown"
ts=2020-09-11T15:32:02.947Z caller=dedupe.go:112 component=remote 
level=error remote_name=435af2 url=
http://cortex.devops.local.int/api/prom/push/aws10-eks msg="non-recoverable 
error" count=1000 err="server returned HTTP status 400 Bad Request: 
user=aws10-eks: sample timestamp out of order; last timestamp: 
1599838222.874, incoming timestamp: 1599838162.874 for series 
{__name__=\"kube_pod_status_ready\", 
app_kubernetes_io_instance=\"kube-state-metrics\", 
app_kubernetes_io_managed_by=\"H"
ts=2020-09-11T15:32:02.665Z caller=dedupe.go:112 component=remote 
level=error remote_name=435af2 url=
http://cortex.devops.local.int/api/prom/push/aws10-eks msg="non-recoverable 
error" count=1000 err="server returned HTTP status 400 Bad Request: 
user=aws10-eks: sample timestamp out of order; last timestamp: 
1599838222.874, incoming timestamp: 1599838162.874 for series 
{__name__=\"kube_secret_info\", 
app_kubernetes_io_instance=\"kube-state-metrics\", 
app_kubernetes_io_managed_by=\"Helm\","

ts=2020-09-11T15:01:22.707Z caller=dedupe.go:112 component=remote 
level=error remote_name=435af2 url=
http://cortex.devops.local.int/api/prom/push/aws10-eks msg="Remote storage 
resharding" from=3 to=5
level=info ts=2020-09-11T15:00:08.014Z caller=head.go:731 component=tsdb 
msg="WAL checkpoint complete" first=232 last=234 duration=1.254153897s
level=info ts=2020-09-11T15:00:06.759Z caller=head.go:661 component=tsdb 
msg="head GC completed" duration=77.995686ms
level=info ts=2020-09-11T15:00:06.314Z caller=compact.go:496 component=tsdb 
msg="write block" mint=159982560 maxt=159983280 
ulid=01EHYTWDPEX8SSCGBQT4PVCP95 duration=2.908463458s
ts=2020-09-11T14:36:42.706Z caller=dedupe.go:112 component=remote 
level=info remote_name=435af2 url=
http://cortex.devops.local.int/api/prom/push/aws10-eks msg="Remote storage 
resharding" from=2 to=3

I will be troubleshooting the Cortex installation and configuration.

But I also want to increase the retry backoff time so I don't end up in the 
same situation.

Is (min_backoff: 30m) the right way to express 30 minutes in the Prometheus 
config?
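For reference, the backoff knobs live under `queue_config` in the remote_write section; a minimal sketch (the values here are illustrative, not recommendations, and both parameters take ordinary Prometheus duration strings):

```yaml
remote_write:
  - url: http://cortex.devops.local.int/api/prom/push/aws10-eks
    queue_config:
      min_backoff: 30s   # initial retry delay
      max_backoff: 30m   # upper bound on the retry delay
```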

I'm open to any recommendations for Cortex (what could be misconfigured such 
that I'm getting the messages above in the distributor).

Thank you,
Ruben

On Saturday, September 12, 2020 at 11:52:12 PM UTC-7 bwpl...@gmail.com 
wrote:

> Hey, 
>
> Unless there is some bug on the receiving side (maybe your front proxy 
> masking the actual status code) or Cortex - both Cortex and Thanos Receive 
> in cases of not accepting write for reasons like this (something that there 
> is no point retrying for) returns the status code that tells Prometheus to 
> drop those requests and not retry.
>
> Kind Regards,

[prometheus-users] Query to get list of containers and their instances

2020-09-13 Thread kiran
Hello all

I wanted to run a query to get a list of containers along with their
instances.
I am using this query, but I need to eliminate one specific
container (mnt-cadvisor) from the query result.

sum by (instance,name) ({job='cadvisor', name!~""})

Is this the right query? I am able to get the result, but is there a better
way than using grouping?

sum by (instance,name) ({job='cadvisor', name!~"mnt-cadvisor|"})

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAOnWYZXUnyLN3Gym5V10vH4n5TXmwncFRHdKVYoqLvRXHWNoRA%40mail.gmail.com.


Re: [prometheus-users] Query to get list of containers and their instances

2020-09-13 Thread Julien Pivotto
On 13 Sep 18:06, kiran wrote:
> Hello all
> 
> I wanted to run a query to get list of containers along with their
> instances.
> I am using this query, but need to eliminate one specific
> container(mnt-cadvisor) from the query result.
> 
> sum by (instance,name) ({job='cadvisor', name!~""})
> 
> Is this the right query? I am able to get the result, but is there a better
> way than using grouping?
> 
> sum by (instance,name) ({job='cadvisor', name!~"mnt-cadvisor|"})


You can use

up{job='cadvisor', name!~"mnt-cadvisor|"}

> 

-- 
Julien Pivotto
@roidelapluie



[prometheus-users] Alert Manager source URL to point to TLS URL of Prometheus

2020-09-13 Thread sunils...@gmail.com
Hi,

I have set up Prometheus with TLS, using Nginx in front of Prometheus as in 
the path below.

Request --> nginx --> Prometheus

Now the challenge is that AlertManager is directly associated with 
Prometheus, and when I access AlertManager, all the source links are 
pointing to Prometheus directly.
Can we configure the AlertManager source URL to the TLS URL of Prometheus?

Thanks



Re: [prometheus-users] Alert Manager source URL to point to TLS URL of Prometheus

2020-09-13 Thread Wesley Peng
You can set up Nginx to proxy both Alertmanager and Prometheus itself on 
different HTTP ports.


regards.


sunils...@gmail.com wrote:
Now the challenge is that AlertManager is directly associated with 
Prometheus, and when I access AlertManager, all the source links are 
pointing to Prometheus directly.

Can we configure alertmanager source URL to TLS URL of Prometheus  ?




Re: [prometheus-users] Alert Manager source URL to point to TLS URL of Prometheus

2020-09-13 Thread Sagar
Hi Wesley,
Thanks for your quick response.
My Alertmanager is running behind Nginx.

But on the Alertmanager page, as shown in the picture below, the *source* 
link is pointing to the direct URL of Prometheus (without Nginx).

[image: image.png]




[prometheus-users] dead process metric still update timestamp

2020-09-13 Thread 钟振华
Hi,when I use pushgateway to montior process in host machine,pushgateway's 
structure like
>job="monitor_job" instance="1"
 >host_process_memory  last updated *
Labels  
  value
pid="1",processName="a"
   22
pid="2",processName="b"
   23
pid="3",processName="a"
   26
And I use processName as query label ,when pid="1" is dead ,and *a* process 
pid number change to 3,the data of pid="1" still update timestamp with last 
value pushed pushed,like 
0.015115737915039062 @1600063693.994
0.015115737915039062 @1600063708.996
0.015115737915039062 @1600063723.994
0.015115737915039062 @1600063738.997
0.015115737915039062 @1600063753.998
0.015115737915039062 @1600063768.995
0.015115737915039062 @1600063783.994
0.015115737915039062 @1600063798.995
is there a way to solve the problem 
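Pushgateway keeps a pushed group until it is explicitly deleted, so one option is to delete the stale group via its DELETE endpoint. A sketch, assuming Pushgateway on localhost:9091 and the grouping labels from above (adjust to your setup):

```shell
# Remove all metrics of the group job="monitor_job", instance="1"
# so Prometheus stops scraping the dead process's last value
curl -X DELETE http://localhost:9091/metrics/job/monitor_job/instance/1
```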




Re: [prometheus-users] Alert Manager source URL to point to TLS URL of Prometheus

2020-09-13 Thread Julien Pivotto
Hello,

You can set the Nginx URL with the --web.external-url parameter of
Prometheus.
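For example (the URL here is hypothetical; use the address Nginx serves Prometheus on):

```shell
prometheus --config.file=/etc/prometheus/prometheus.yml \
  --web.external-url=https://prometheus.example.com/
```

The source links Alertmanager shows are generated from this URL.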




[prometheus-users] Error in generating config file using generator

2020-09-13 Thread sabarish narayanan

Hi,
I am trying to generate snmp.yml by using the generator. This is my 
generator file :-

modules:
  XPPC-MIB:
walk: 
  - .1.3.6.1.4.1.935.1.1.1.8.3.2.0
version: 3
max_repetitions: 25
retries: 3
timeout: 10s
auth:
  username: 
  security_level: authNoPriv
  password: *
  auth_protocol: MD5

I am getting the following error :-

level=info ts=2020-09-14T06:12:47.382Z caller=net_snmp.go:142 msg="Loading 
MIBs" from=$HOME/.snmp/mibs:/usr/share/snmp/mibs
level=info ts=2020-09-14T06:12:47.530Z caller=main.go:52 msg="Generating 
config for module" module=XPPC-MIB
level=error ts=2020-09-14T06:12:47.538Z caller=main.go:130 msg="Error 
generating config netsnmp" err="cannot find oid 
'.1.3.6.1.4.1.935.1.1.1.8.3.2.0' to walk"

If I try snmptranslate, it works :-

snmptranslate -mALL .1.3.6.1.4.1.935.1.1.1.8.3.2.0
XPPC-MIB::upsThreePhaseOutputVoltageR.0

snmpwalk to that oid also gives the value of the oid :-

snmpwalk -l authNoPriv -u  -a MD5 -A * XX.XX.XX.XX 
.1.3.6.1.4.1.935.1.1.1.8.3.2.0
SNMPv2-SMI::enterprises.935.1.1.1.8.3.2.0 = INTEGER: 2323

So I'm pretty sure that netsnmp is working fine.
The XPPC-MIB.txt file is in /usr/share/snmp/mibs.

I'm working on Centos 8.

What am I doing wrong? Any help is appreciated.
Thanks.
