Re: [prometheus-users] Re: node_exporter stopping when run as systemd service on Linux

2021-11-10 Thread Venkata Bhagavatula
Hi Brian,

Are you getting any errors when you start it manually as that user?
/opt/node_exporter/node_exporter --collector.textfile.directory=/var/lib/node_exporter --collector.systemd --collector.ntp
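
If the manual run looks fine, the systemd side can usually be checked with
something like this (assuming the unit is actually named node_exporter):

  systemctl status node_exporter
  journalctl -u node_exporter -e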


Thanks n Regards,
Chalapathi

On Wed, Nov 10, 2021 at 12:39 AM Brian Candler  wrote:

> Here is the node_exporter.service that I use:
>
> 
> [Unit]
> Description=Prometheus Node Exporter
> Documentation=https://github.com/prometheus/node_exporter
> After=network-online.target
>
> [Service]
> User=x
> EnvironmentFile=/etc/default/node_exporter
> ExecStart=/opt/node_exporter/node_exporter $OPTIONS
> Restart=on-failure
> RestartSec=5
>
> [Install]
> WantedBy=multi-user.target
> 
>
> and /etc/default/node_exporter:
>
> 
> OPTIONS='--collector.textfile.directory=/var/lib/node_exporter
> --collector.systemd --collector.ntp'
> 
>
> It doesn't have any problem with premature shutdown.
>



[prometheus-users] Query regarding Alertmanager

2021-11-09 Thread Venkata Bhagavatula
 Hi All,

Can you please confirm whether the understanding below is correct?

1. When an alert fires in Prometheus, startsAt is set to the time at which it
fired. Even after the alert is resolved, startsAt still points to the time at
which the alert originally fired?
2. Alertmanager sends the same startsAt to the receiver (in my case a webhook)
for both the active and resolved states?
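
For illustration, this is the shape I mean (a hand-written sketch of the
webhook body as I understand it, trimmed to the relevant fields; the values
are made up, not a captured payload):

  firing notification (trimmed):
  {"status": "firing", "alerts": [
    {"status": "firing", "labels": {"alertname": "HighCPU"},
     "startsAt": "2021-11-09T10:00:00Z"}]}

  resolved notification (trimmed):
  {"status": "resolved", "alerts": [
    {"status": "resolved", "labels": {"alertname": "HighCPU"},
     "startsAt": "2021-11-09T10:00:00Z", "endsAt": "2021-11-09T10:30:00Z"}]}

i.e. the same startsAt in both notifications, with only the status (and the
resolution time) changing.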



Thanks n Regards,
Chalapathi



[prometheus-users] Some timeseries getting dropped

2021-08-16 Thread Venkata Bhagavatula
Hi All,

I am seeing that some time series from the target are getting dropped and not
written to the TSDB. Initially I thought it might be related to incorrect
metric relabel configs, but I have removed all the relabel configs and the
time series are still not being written.
The target is exposing the time series.

Following is the curl output from the target:

istio_requests_total{response_code="200",reporter="destination",source_workload="ind-cpro",source_workload_namespace="csd-ns",source_principal="unknown",source_app="cpro",source_version="unknown",destination_workload="ind-nrfp",destination_workload_namespace="csd-ns",destination_principal="unknown",destination_app="nrfp",destination_version="unknown",destination_service="csd-nnrfp.csd-ns.svc.cluster.local",destination_service_name="csd-nnrfp",destination_service_namespace="csd-ns",request_protocol="http",response_flags="-",grpc_response_status="",connection_security_policy="none",source_canonical_service="cpro",destination_canonical_service="nrfp",source_canonical_revision="2.15.1",destination_canonical_revision="latest"} 3017
istio_requests_total{response_code="200",reporter="source",source_workload="ind-nrfp",source_workload_namespace="csd-ns",source_principal="unknown",source_app="nrfp",source_version="unknown",destination_workload="unknown",destination_workload_namespace="unknown",destination_principal="unknown",destination_app="unknown",destination_version="unknown",destination_service="abc.default.svc.cluster.local",destination_service_name="abc.default.svc.cluster.local",destination_service_namespace="unknown",request_protocol="grpc",response_flags="-",grpc_response_status="0",connection_security_policy="unknown",source_canonical_service="nrfp",destination_canonical_service="unknown",source_canonical_revision="latest",destination_canonical_revision="latest"} 276

Prometheus job configuration is as follows (the IP address is masked):

- honor_labels: true
  job_name: chal
  metrics_path: /stats/prometheus
  static_configs:
  - targets:
    - abc.xyz.def.pqr:15090

Of the time series shown above, the one with source_workload="ind-nrfp" is not
present in Prometheus. There are no warnings/errors in the debug logs.

Can you let me know how to debug this issue?
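
So far the only check I could think of (a sketch, assuming Prometheus listens
on localhost:9090) is asking the series API whether the series made it into
the TSDB at all:

  curl -g 'http://localhost:9090/api/v1/series?match[]=istio_requests_total{source_workload="ind-nrfp"}'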

Thanks n Regards,
Chalapathi.



Re: [prometheus-users] Re: regarding pprof

2021-07-27 Thread Venkata Bhagavatula
Hi Ben,
No, there is no federation between node0 and node1. This is an HA setup.

thanks n Regards,
chalapathi.

On Sat, Jul 24, 2021 at 8:38 PM Ben Kochie  wrote:

> Are you using federation to replicate the data from node0 to node1?
>
> That could be a major cause of excess memory use.
>
> On Sat, Jul 24, 2021, 08:22 Venkata Bhagavatula 
> wrote:
>
>> Queries are happening only on Node1. Node0 is only scraping targets.
>>
>> On Fri, Jul 23, 2021 at 4:54 PM Stuart Clark 
>> wrote:
>>
>>> On 23/07/2021 12:20, Venkata Bhagavatula wrote:
>>>
>>> Forgot to mention, we are using prometheus version 2.16.0 and cannot
>>> update to the latest version.
>>> attaching the heapdump for both nodes.
>>>
>>> On Fri, Jul 23, 2021 at 4:41 PM Venkata Bhagavatula <
>>> venkat.cha...@gmail.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> In one of our production setups, we have configured prometheus HA on
>>>> Virtual machines(node0, node1). I see that node0 prometheus takes around
>>>> 5gb of ram and node1 takes just 1gb of ram.
>>>>
>>>>
>>> Are you performing queries on both nodes or just node0?
>>>
>>> --
>>> Stuart Clark
>>>
>



Re: [prometheus-users] Re: regarding pprof

2021-07-24 Thread Venkata Bhagavatula
Queries are happening only on Node1. Node0 is only scraping targets.

On Fri, Jul 23, 2021 at 4:54 PM Stuart Clark 
wrote:

> On 23/07/2021 12:20, Venkata Bhagavatula wrote:
>
> Forgot to mention, we are using prometheus version 2.16.0 and cannot
> update to the latest version.
> attaching the heapdump for both nodes.
>
> On Fri, Jul 23, 2021 at 4:41 PM Venkata Bhagavatula <
> venkat.cha...@gmail.com> wrote:
>
>> Hi All,
>>
>> In one of our production setups, we have configured prometheus HA on
>> Virtual machines(node0, node1). I see that node0 prometheus takes around
>> 5gb of ram and node1 takes just 1gb of ram.
>>
>>
> Are you performing queries on both nodes or just node0?
>
> --
> Stuart Clark
>
>



[prometheus-users] regarding pprof

2021-07-23 Thread Venkata Bhagavatula
Hi All,

In one of our production setups, we have configured Prometheus HA on two
virtual machines (node0, node1). I see that Prometheus on node0 takes around
5 GB of RAM, while node1 takes just 1 GB.

The user has changed min-block-duration to 30m and max-block-duration to 2h. I
told them not to modify these, as they are intended only for developers. Could
this cause a difference in RAM usage? But then both nodes should be using more
RAM.

I checked that both Prometheus instances are using the same configuration
file. I see that the number of allocs on node0 is higher than on node1 on the
"/debug/pprof" page.

I collected the heap dump with "curl http://localhost:9090/debug/pprof/heap".

I tried to check the top heap allocations using "go tool pprof heap_node0".
Following is the output:

File: prometheus
Type: inuse_space
Time: Jul 22, 2021 at 4:36am (EDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 81.25MB, 68.48% of 118.65MB total
Dropped 68 nodes (cum <= 0.59MB)
Showing top 10 nodes out of 101
      flat  flat%   sum%        cum   cum%
      16MB 13.49% 13.49%       16MB 13.49%  github.com/prometheus/prometheus/tsdb/chunkenc.NewXORChunk
   12.34MB 10.40% 23.89%    13.94MB 11.75%  compress/flate.NewWriter
       8MB  6.75% 30.63%        8MB  6.75%  github.com/prometheus/prometheus/tsdb/chunkenc.(*bstream).writeByte
    7.50MB  6.32% 36.96%     7.50MB  6.32%  github.com/prometheus/prometheus/pkg/labels.(*Builder).Labels
    7.50MB  6.32% 43.28%     7.50MB  6.32%  github.com/prometheus/prometheus/tsdb.newMemSeries
    7.40MB  6.24% 49.51%     7.40MB  6.24%  github.com/prometheus/prometheus/scrape.(*scrapeCache).trackStaleness
    6.50MB  5.48% 55.00%     6.50MB  5.48%  github.com/prometheus/prometheus/tsdb/chunkenc.(*bstream).writeBit
       6MB  5.06% 60.05%        6MB  5.06%  github.com/prometheus/prometheus/pkg/textparse.(*PromParser).Metric
    5.50MB  4.64% 64.69%        9MB  7.59%  github.com/prometheus/prometheus/tsdb.(*stripeSeries).getOrSet
    4.50MB  3.79% 68.48%    22.50MB 18.97%  github.com/prometheus/prometheus/tsdb.(*memSeries).cut


The top output says that only ~120 MB is in use. What accounts for the rest of
the memory? I read in a blog post
(https://source.coveo.com/2021/03/03/prometheus-memory/) that it is cached
memory allocated via mmap. Is it OK to have that much memory in the cache?

Currently I am trying to plot graphs of the go_memstats_heap_.*_bytes metrics.
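
For reference, these are the kinds of expressions I am plotting (assuming each
Prometheus scrapes its own metrics endpoint under a job such as "prometheus";
the job label value is a guess, not our exact config):

  go_memstats_heap_inuse_bytes{job="prometheus"}
  go_memstats_heap_idle_bytes{job="prometheus"}
  process_resident_memory_bytes{job="prometheus"}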

Can you let me know how to debug further the difference in RAM usage between
the two nodes? In the production setup there is a management application that
monitors each node's total RAM usage, and if it reaches a threshold then some
action is taken on that node.

Thanks n Regards,
chalapathi



[prometheus-users] % cpu utilization

2021-07-13 Thread Venkata Bhagavatula
Hi All,

I am trying to add an alert rule that fires when the CPU utilization
percentage of a container goes above a certain threshold. Our scrape interval
is 1 minute.

Following is the expression I started with:
sum(rate(container_cpu_usage_seconds_total{container=~'.+'}[4m])*100) by
(namespace,container,pod) > 80

We also tried "sum(irate(container_cpu_usage_seconds_total{container=~'.+'}[2m])*100)
by (namespace,container,pod) > 80", and it too shows values greater than 100.
Below is the snapshot:
[image: image.png]

Later, after referring to the ticket
https://github.com/google/cadvisor/issues/2026, we changed the expression to
the following:
(sum(rate(container_cpu_usage_seconds_total{image!="",
container!="POD"}[4m])) by (pod, container, namespace) /
sum(container_spec_cpu_quota{image!="",
container!="POD"}/container_spec_cpu_period{image!="", container!="POD"})
by (pod, container, namespace)) * 100 > 80

Here too we see values greater than 100. Can you please let us know how to get
the CPU utilization percentage per container?
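
For completeness, this is roughly how we plan to wire whichever expression
turns out to be correct into an alerting rule (a sketch; the rule name,
threshold and "for" duration are placeholders, not our final values):

groups:
- name: container-cpu
  rules:
  - alert: ContainerCpuUtilizationHigh
    expr: |
      (sum(rate(container_cpu_usage_seconds_total{image!="",container!="POD"}[4m])) by (pod, container, namespace)
        / sum(container_spec_cpu_quota{image!="",container!="POD"} / container_spec_cpu_period{image!="",container!="POD"}) by (pod, container, namespace)
      ) * 100 > 80
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Container CPU utilization above 80%"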

Thanks n Regards,
Chalapathi.



[prometheus-users] Regarding retention.size

2020-11-06 Thread Venkata Bhagavatula
Hi All,

Can you please clarify the queries below?
1. Does retention.size take the chunks_head size into account when limiting
the TSDB size to the configured limit?
2. I am upgrading from Prometheus 2.16.0 to 2.20.1. In 2.20.1, WAL compression
is enabled by default, so the old WAL segments are not compressed while the
newer ones are. When limiting the size to the configured value, does
Prometheus also count the old, uncompressed WAL segments?
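
For reference, the flags in question look like this on the command line (a
sketch with placeholder values, not our real settings):

  prometheus --config.file=prometheus.yml \
    --storage.tsdb.path=/prometheus \
    --storage.tsdb.retention.size=50GB \
    --storage.tsdb.retention.time=15d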


Thanks n Regards,
Chalapathi.



[prometheus-users] Wal inclusion in retention.size

2020-08-21 Thread Venkata Bhagavatula
Hi,

In https://prometheus.io/docs/prometheus/2.20/storage/, the following is
mentioned regarding retention.size:

--storage.tsdb.retention.size: [EXPERIMENTAL] This determines the maximum
number of bytes that storage blocks can use (note that this does not
include the WAL size, which can be substantial). The oldest data will be
removed first. Defaults to 0 or disabled. This flag is experimental and can
be changed in future releases. Units supported: B, KB, MB, GB, TB, PB, EB.
Ex: "512MB"

But in the release notes of 2.15.0, I see it mentioned that retention.size
includes the WAL as well:
[ENHANCEMENT] TSDB: WAL size is now used for size based retention
calculation. #5886

Does the documentation need correction?

Thanks n Regards,
Chalapathi.



Re: [prometheus-users] Re: Alert manager GUI does not open using IPV6 address

2020-07-29 Thread Venkata Bhagavatula
Hi Brian,

Thanks for the response.
Can I raise an issue in Alertmanager linking to the above elm/url issue?
Once elm/url is fixed, the version can be upgraded in Alertmanager.

Thanks n Regards,
Chalapathi

On Tuesday, July 28, 2020 at 11:07:40 PM UTC+5:30, Brian Candler wrote:
>
> I note that a change to elm/url was approved 4 days ago (but not yet 
> merged):
> https://github.com/elm/url/pull/35
>
>



Re: [prometheus-users] Re: Alert manager GUI does not open using IPV6 address

2020-07-28 Thread Venkata Bhagavatula
Hi Brian,

I have installed Alertmanager on a VM. I am still seeing the same issue.
Following is the command line we are using to start Alertmanager:

/bin/alertmanager --config.file=/etc/alertmanager/alertmanager.yml
--storage.path=/alertmanager --log.level=debug --cluster.listen-address=""
--web.listen-address="[x:x:x::x]:9093"
--web.external-url="http://[x:x:x::x]:9093/"

While searching Google, I found the following issue on the elm/url module:
https://github.com/elm/url/issues/12. Can you let us know if it has any
relevance?

To rule out routing/proxy issues, we put up a sample JavaScript page. The
sample page loads fine.

Can you please let us know how to proceed further?

Thanks n Regards,
Chalapathi.

On Mon, Jul 13, 2020 at 7:33 PM Brian Candler  wrote:

> Alternatively, it could be that you haven't told alertmanager the URL to
> generate links with, using the command line option:
>
> --web.external-url=http://[x:x:x::x]:32109
>
> Some of the links it generates may use that URL.  You'll be able to tell
> by looking in your browser console logs, to see if it tries to access that
> address without using port 32109.
>



[prometheus-users] Re: prometheus not scraping targets when timestamp field is present

2020-07-06 Thread Venkata Bhagavatula
Hi All,
Can anyone respond to my queries? We also observed the following:
1. If, for example, the timestamp (epoch) in the scrape is 12:00:00, then
Prometheus does not scrape the target.
2. If, for example, the timestamp (epoch) in the scrape is 12:00:01, then
Prometheus does scrape the target.

Thanks & regards,
Chalapathi

On Thu, Jul 2, 2020 at 3:03 PM Venkata Bhagavatula 
wrote:

> Hi,
>
> We are using Prometheus version 2.11.1. In our application, the scrape
> target includes a timestamp field. When the timestamp field is present,
> Prometheus does not scrape any metrics.
> Following is the output of the curl request to the scrape target:
>
>- *cmd: curl  http://:24231/metrics*
>
> meas_gauge{id="Filtered",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
> 0.0 159368040
> meas_gauge{id="Rejected",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
> 0.0 159368040
> meas_gauge{id="ReprocessedIn",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
> 0.0 159368040
> meas_gauge{id="Created",HOST="test",STREAM="Smoke_stream",NODE="MFE2"} 0.0
> 159368040
> meas_gauge{id="Duplicated",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
> 0.0 159368040
> meas_gauge{id="Stored",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
> 336.0 159368040
> meas_gauge{id="Retrieved",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
> 354.0 159368040
> meas_gauge{id="ReducedInMerging",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
> 0.0 159368040
>
>
>
>- I checked that time is in sync between the prometheus node and the
>target node.
>- Following is the epoch time on the prometheus node:
>
> *cmd: date +'%s%3N'*
> *1593681793979*
>
>
>- Epoch difference between the prometheus node and the time stamp
>present in the sample is more than an hour.
>
> difference = ( 1593681793979 -  159368040) / 1000 = 1393sec = 23min
>
> Scrape_interval is configured as 300s
> honor_timestamps is set to true.
>
> Can you let us know why prometheus is not able to scrape the targets? Is
> it due to the timestamp difference between prometheus and target?
> How much difference will prometheus tolerate?
>
> Thanks n Regards,
> Chalapathi
>



[prometheus-users] prometheus not scraping targets when timestamp field is present

2020-07-02 Thread Venkata Bhagavatula
Hi,

We are using Prometheus version 2.11.1. In our application, the scrape target
includes a timestamp field. When the timestamp field is present, Prometheus
does not scrape any metrics.
Following is the output of the curl request to the scrape target:

   - cmd: curl http://:24231/metrics

meas_gauge{id="Filtered",HOST="test",STREAM="Smoke_stream",NODE="MFE2"} 0.0
159368040
meas_gauge{id="Rejected",HOST="test",STREAM="Smoke_stream",NODE="MFE2"} 0.0
159368040
meas_gauge{id="ReprocessedIn",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
0.0 159368040
meas_gauge{id="Created",HOST="test",STREAM="Smoke_stream",NODE="MFE2"} 0.0
159368040
meas_gauge{id="Duplicated",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
0.0 159368040
meas_gauge{id="Stored",HOST="test",STREAM="Smoke_stream",NODE="MFE2"} 336.0
159368040
meas_gauge{id="Retrieved",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
354.0 159368040
meas_gauge{id="ReducedInMerging",HOST="test",STREAM="Smoke_stream",NODE="MFE2"}
0.0 159368040



   - I checked that the time is in sync between the Prometheus node and the
   target node.
   - Following is the epoch time on the Prometheus node:

cmd: date +'%s%3N'
1593681793979


   - The epoch difference between the Prometheus node and the timestamp
   present in the sample is more than an hour.

difference = (1593681793979 - 159368040) / 1000 = 1393 sec = 23 min

scrape_interval is configured as 300s.
honor_timestamps is set to true.

Can you let us know why Prometheus is not able to scrape the targets? Is it
due to the timestamp difference between Prometheus and the target?
How much difference will Prometheus tolerate?
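
For reference, the per-target counters that might show whether samples are
being rejected (a sketch, assuming Prometheus scrapes its own metrics
endpoint) are:

  prometheus_target_scrapes_sample_out_of_bounds_total
  prometheus_target_scrapes_sample_out_of_order_total
  prometheus_target_scrapes_sample_duplicate_timestamp_total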

Thanks n Regards,
Chalapathi



[prometheus-users] Re: Restricting Prometheus to a particular Namespace

2020-05-29 Thread Venkata Bhagavatula
I was able to solve the issue. There was a configuration error in one config
file where the namespaces were not added. Also, if we add the node role, then
a ClusterRole and ClusterRoleBinding are needed, as the node resource is
cluster-scoped.
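
For anyone hitting the same thing, the namespace-scoped RBAC has roughly this
shape (a sketch; the namespace and service account names are placeholders, not
our exact manifests):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus
  namespace: admin            # namespace being scraped (placeholder)
rules:
- apiGroups: [""]
  resources: ["pods", "services", "endpoints"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus
  namespace: admin
subjects:
- kind: ServiceAccount
  name: prometheus            # service account Prometheus runs as (placeholder)
  namespace: monitoring
roleRef:
  kind: Role
  name: prometheus
  apiGroup: rbac.authorization.k8s.io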

Thanks n Regards,
Chalapathi

On Tue, May 26, 2020 at 10:31 PM Venkata Bhagavatula <
venkat.cha...@gmail.com> wrote:

> Hi All,
>
> Currently Prometheus needs a ClusterRole and ClusterRoleBinding for
> scraping metrics on Kubernetes. We want to restrict Prometheus to
> a particular namespace.
> So we changed RBAC to using Role and RoleBinding and in the
> Prometheus configuration we added namespaces to kubernetes_sd_configs
> section. we see that we are able to scrape metrics
> from the configured namespace, but continuously seeing the errors saying
> access forbidden to *v1.Pod etc. Currently my cluster is down. will share
> the exact error once it is available.
>
> Following is the Prometheus configuration:
>   - job_name: 'kubernetes-apiservers'
>
> kubernetes_sd_configs:
>   - role: endpoints
> namespaces:
>  names: ['admin']
>
> Please let me know whether we can do with Role and RoleBinding?
>
> Thanks n Regards,
> Chalapathi.
>



[prometheus-users] Re: Deleting timeseries

2020-03-10 Thread Venkata Bhagavatula
Hi All,

I tried the following today: if I delete a few metrics individually, those
metrics are deleted. But if I give an expression that matches all metrics,
then none of them get deleted.
Can you let me know if I am missing something?

Thanks n Regards,
Chalapathi.

On Tue, Mar 3, 2020 at 12:53 PM Venkata Bhagavatula 
wrote:

> Hi,
>
> I was using the  admin rest api for deleting the timeseries. Following is
> what i executed:
>
> curl -XPOST -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={__name__=~".+"}'
> followed by
> curl -XPOST 'http://localhost:9090/api/v1/admin/tsdb/clean_tombstones'
>
> As mentioned in the document, 204 is returned for both timeseries.
>
> I was expecting that all the timeseries would be deleted from the disk.
> But this is not the case.
> i could see that still timeseries data is present.
>
> I am using prometheus version 2.11.1.
> Can we delete all the timeseries from the disk? using the above Rest API.
>
> Thanks n Regards,
> Chalapathi.
>



Re: [prometheus-users] usage of rate function on recording metric

2020-03-10 Thread Venkata Bhagavatula
Hi Stuart, Julien,

Following is what the application does on its side for the metrics shown in
the mails above:

Some of the metrics used in these charts have multiple labels. Due to the use
of multiple labels and the many possible values of these labels, the
cardinality of the metrics can be very high. So, to avoid an exponential
growth in the number of metric combinations that Prometheus ends up scraping,
the application cleans up counters that have not been incremented for some
period. So at some point some of the metrics (which have some current value)
are removed.

Can this be the reason why we see a drop of the counter value in the above
charts?
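
To make the setup concrete, the recording rule has roughly this shape (a
sketch with placeholder rule and metric names, not our actual rule):

groups:
- name: derived
  rules:
  - record: job:app_requests:sum
    expr: sum by (job) (app_requests_total)

and in Grafana we then plot something like increase(job:app_requests:sum[5m])
alongside increase(app_requests_total[5m]).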


Thanks n Regards,

Chalapathi.

On Mon, Mar 9, 2020 at 5:09 PM Julien Pivotto 
wrote:

> On 09 Mar 11:35, Stuart Clark wrote:
> > On 2020-03-09 11:21, Venkata Bhagavatula wrote:
> > > Hi Stuart,
> > >
> > > sorry for the late reply, i was on vacation.
> > > I will check the recording rule.
> > > The reduction was same for both recorded metric vs original metric.
> > > can you correct my understanding?
> > > After the rule is evaluated, will the type of metric be treated as
> > > Gauge?
> > >
> >
>
> Hi there,
>
> Please note that metric types in Prometheus are informative. You can in
> theory run rate() on both gauges and counters, even if you should only
> do it on counters.
>
> Regards
>
> --
>  (o-Julien Pivotto
>  //\Open-Source Consultant
>  V_/_   Inuits - https://www.inuits.eu
>



Re: [prometheus-users] usage of rate function on recording metric

2020-03-09 Thread Venkata Bhagavatula
Hi Stuart,

Sorry for the late reply, I was on vacation.
I will check the recording rule.
The reduction was the same for both the recorded metric and the original
metric. Can you correct my understanding?
After the rule is evaluated, will the metric be treated as a Gauge?

Thanks n Regards,
Chalapathi

On Thu, Feb 27, 2020 at 12:59 PM Stuart Clark 
wrote:

> The graph showed a reduction at various points. Is there a bug in the
> recording rule calculation that can cause reduction which needs fixing?
>
> On 27 February 2020 06:24:11 GMT, Venkata Bhagavatula <
> venkat.cha...@gmail.com> wrote:
>>
>> Hi ,
>> Thanks for the response. When we a recording rule, what will be the
>> metric type of this derived counter?. In the increase function
>> documentation it was mentioned that
>> increase(v range-vector) calculates the increase in the time series in
>> the range vector. Breaks in monotonicity (such as counter resets due to
>> target restarts) are automatically adjusted for.
>>
>> Also why it worked on the original metric as on both dervied and original
>> metric has reduction?
>>
>> Thanks n Regards,
>> Chalapathi
>>
>>
>> On Wed, Feb 26, 2020 at 10:49 PM Stuart Clark 
>> wrote:
>>
>>> Counters must only increase. Any reduction is seen as a counter reset.
>>>
>>> If this isn't a counter use the derivative function rather than rate
>>>
>>> On 26 February 2020 14:05:31 GMT, Venkata Bhagavatula <
>>> venkat.cha...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> In our application, there is one metric that we are deriving from
>>>> another metric using recording rules.
>>>> When we plot the graphs of recording metric and the original metric in
>>>> grafana, we see the graph to follow the same trend. But when we applied
>>>> increase , then we have seen recording metric is having huge spikes,
>>>> whereas original metric is not having these spikes.
>>>>
>>>> following is the plotted graph:
>>>> [image: image.png]
>>>> The bottom panels show the increase of both these  metrics over time.
>>>> As you can see, there are points where the metric values goes down.
>>>> Prometheus handles these as resets for metric type “Counter”, and the
>>>> increase function handles it gracefully.
>>>>
>>>> Can you let us know how these recording metrics are treated in
>>>> prometheus? and any pointers on how to debug this issue.
>>>>
>>>> Thanks n Regards,
>>>> Chalapathi.
>>>>
>>>>
>>> --
>>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>>>
>>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>



[prometheus-users] Deleting timeseries

2020-03-02 Thread Venkata Bhagavatula
Hi,

I was using the admin REST API to delete time series. Following is what I
executed:

curl -XPOST -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={__name__=~".+"}'
followed by
curl -XPOST 'http://localhost:9090/api/v1/admin/tsdb/clean_tombstones'

As mentioned in the documentation, 204 is returned for both requests.

I was expecting that all the time series would be deleted from disk, but this
is not the case.
I can see that time series data is still present.

I am using Prometheus version 2.11.1.
Can we delete all the time series from disk using the above REST API?

Thanks n Regards,
Chalapathi.



Re: [prometheus-users] usage of rate function on recording metric

2020-02-26 Thread Venkata Bhagavatula
Hi,
Thanks for the response. When we use a recording rule, what will be the metric
type of this derived counter? The increase function documentation mentions
that increase(v range-vector) calculates the increase in the time series in
the range vector, and that breaks in monotonicity (such as counter resets due
to target restarts) are automatically adjusted for.

Also, why did it work on the original metric, given that both the derived and
the original metric show reductions?

Thanks n Regards,
Chalapathi


On Wed, Feb 26, 2020 at 10:49 PM Stuart Clark 
wrote:

> Counters must only increase. Any reduction is seen as a counter reset.
>
> If this isn't a counter use the derivative function rather than rate
>
> On 26 February 2020 14:05:31 GMT, Venkata Bhagavatula <
> venkat.cha...@gmail.com> wrote:
>>
>> Hi,
>>
>> In our application, there is one metric that we are deriving from another
>> metric using recording rules.
>> When we plot the graphs of recording metric and the original metric in
>> grafana, we see the graph to follow the same trend. But when we applied
>> increase , then we have seen recording metric is having huge spikes,
>> whereas original metric is not having these spikes.
>>
>> following is the plotted graph:
>> [image: image.png]
>> The bottom panels show the increase of both these  metrics over time. As
>> you can see, there are points where the metric values goes down. Prometheus
>> handles these as resets for metric type “Counter”, and the increase
>> function handles it gracefully.
>>
>> Can you let us know how these recording metrics are treated in
>> prometheus? and any pointers on how to debug this issue.
>>
>> Thanks n Regards,
>> Chalapathi.
>>
>>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
