[prometheus-users] PromCon 2023 in Berlin

2023-07-27 Thread Matthias Loibl
Hey everyone,

Many of you have been asking for a while.

We’re pleased to announce that PromCon 2023 EU is happening in Berlin,
Germany, on September 28th & 29th at Radialsystem.

Tickets are available via the PromCon website:
https://promcon.io/2023-berlin/register/

We are looking for talk proposals until August 18th:
https://promcon.io/2023-berlin/submit/

If your company wants to sponsor PromCon you can also find more information
on our website: https://promcon.io/2023-berlin/sponsor/

If there are any other questions, please reach out to our mailing list:
promcon-organiz...@googlegroups.com

We are super excited to see everyone in Berlin!

Cheers,
Matthias on behalf of the PromCon Organizers

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAN4-%2B%2BNnhwzcJ%3D%2Bcog-iDFhjXRuNPQuTDfVQ2358gUMN17HuFw%40mail.gmail.com.


[prometheus-users] Re: Unusual traffic in prometheus nodes.

2023-07-27 Thread Brian Candler
As Stuart says, that looks correct, assuming your metrics don't have any 
labels other than the ones you've excluded. You'd save a lot of typing just 
by doing:

sum(scrape_samples_scraped)

which is expected to return a single value, with no labels (as it's summed 
across all timeseries of this metric).

The value 7,525,871,918 does seem quite high - what was it before?  You can 
set an evaluation time for this query in the PromQL browser, or draw a graph 
of this expression over time, to see historical values.
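
If you would rather script the historical check than click through the UI, here is a minimal sketch that builds an instant-query URL for the Prometheus HTTP API, using its "time" parameter to pick the evaluation time. The base URL and the timestamp are assumptions for illustration only, and the helper name instant_query_url is made up; you would fetch the resulting URL with curl or any HTTP client.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, expr, at=None):
    """Build a /api/v1/query URL; `at` is an optional RFC 3339 or Unix timestamp."""
    params = {"query": expr}
    if at is not None:
        # Without "time", Prometheus evaluates the query at the current server time.
        params["time"] = at
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


# Assumed server address and an assumed "before the spike" timestamp.
url = instant_query_url(
    "http://localhost:9090",
    "sum(scrape_samples_scraped)",
    "2023-07-26T12:00:00Z",
)
print(url)
```

Running the same query for a few timestamps before and after the traffic change should show whether the total really jumped.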

You could also look at
count(scrape_samples_scraped)

or more simply
count(up)

and see if that has jumped up: it would imply that lots more targets have 
been added (e.g. more pods are being monitored).

If not, then as well as Stuart's suggestion of graphing 
"scrape_samples_scraped" by itself to see if one particular target is 
generating way more metrics than usual, you could try different summary 
variants like

sum by (instance,job) (scrape_samples_scraped)
sum by (clusterName) (scrape_samples_scraped)
... etc

and see if there's a spike in any of these.  This may help you drill down 
to the offending item(s).
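
As a rough sketch of that drill-down, suppose you have exported the result of one of these grouped queries at two points in time (before and after the jump). The numbers and job names below are invented for illustration; the idea is simply to rank the groups by how much they grew.

```python
def biggest_increases(before, after, top=3):
    """Rank groups by how much their summed sample count grew between snapshots."""
    # Groups that did not exist before count as growing from zero.
    deltas = {key: after[key] - before.get(key, 0) for key in after}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)[:top]


# Hypothetical results of `sum by (job) (scrape_samples_scraped)` at two times.
before = {"node": 2350, "app": 400, "kubelet": 9000}
after = {"node": 2400, "app": 52000, "kubelet": 9100, "new-exporter": 700}

for job, growth in biggest_increases(before, after):
    print(job, growth)
```

The same comparison works for any grouping (instance, clusterName, ...); the group at the top of the list is the one to look at first.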

On Thursday, 27 July 2023 at 15:51:24 UTC+1 Uvais Ibrahim wrote:

> Hi Brian,
>
> This is the query that I have used.
>
> sum(scrape_samples_scraped)without(app,app_kubernetes_io_managed_by,clusterName,release,environment,instance,job,k8s_cluster,kubernetes_name,kubernetes_namespace,ou,app_kubernetes_io_component,app_kubernetes_io_name,app_kubernetes_io_version,kustomize_toolkit_fluxcd_io_name,kustomize_toolkit_fluxcd_io_namespace,application,name,role,app_kubernetes_io_instance,app_kubernetes_io_part_of,control_plane,beta_kubernetes_io_arch,beta_kubernetes_io_instance_type,
>  
> beta_kubernetes_io_os, failure_domain_beta_kubernetes_io_region, 
> failure_domain_beta_kubernetes_io_zone,kubernetes_io_arch, 
> kubernetes_io_hostname, kubernetes_io_os, node_kubernetes_io_instance_type, 
> nodegroup, topology_kubernetes_io_region, 
> topology_kubernetes_io_zone,chart,heritage,revised,transit,component,namespace,
>  
> pod_name, pod_template_hash, security_istio_io_tlsMode, 
> service_istio_io_canonical_name, 
> service_istio_io_canonical_revision,k8s_app,kubernetes_io_cluster_service,kubernetes_io_name,route_reflector)
>
> This simply excludes every label, but I am still getting a result like this:
>
> {}  7525871918
>
>
> It shouldn't return any results, right?
>
> Prometheus version: 2.36.2
>
> By increased traffic I mean that the Prometheus servers have been getting 
> high traffic since a specific point in time. Prometheus is currently 
> receiving 13 million packets; earlier it was around 2 to 3 million packets 
> on average. The Prometheus endpoint is not public.



Re: [prometheus-users] Re: Unusual traffic in prometheus nodes.

2023-07-27 Thread Stuart Clark

On 27/07/2023 15:51, Uvais Ibrahim wrote:

> Hi Brian,
>
> This is the query that I have used.
>
> sum(scrape_samples_scraped)without(app,app_kubernetes_io_managed_by,clusterName,release,environment,instance,job,k8s_cluster,kubernetes_name,kubernetes_namespace,ou,app_kubernetes_io_component,app_kubernetes_io_name,app_kubernetes_io_version,kustomize_toolkit_fluxcd_io_name,kustomize_toolkit_fluxcd_io_namespace,application,name,role,app_kubernetes_io_instance,app_kubernetes_io_part_of,control_plane,beta_kubernetes_io_arch,beta_kubernetes_io_instance_type,
> beta_kubernetes_io_os, failure_domain_beta_kubernetes_io_region,
> failure_domain_beta_kubernetes_io_zone,kubernetes_io_arch,
> kubernetes_io_hostname, kubernetes_io_os,
> node_kubernetes_io_instance_type, nodegroup,
> topology_kubernetes_io_region,
> topology_kubernetes_io_zone,chart,heritage,revised,transit,component,namespace,
> pod_name, pod_template_hash, security_istio_io_tlsMode,
> service_istio_io_canonical_name,
> service_istio_io_canonical_revision,k8s_app,kubernetes_io_cluster_service,kubernetes_io_name,route_reflector)
>
> This simply excludes every label, but I am still getting a result like this:
>
> {}  7525871918

I'm not sure what you are expecting, as that sounds about right. The 
query is adding together all the different variants of the 
scrape_samples_scraped metric (removing all the different labels), so if 
that is indeed a list of every label the query is going to return a 
value without any associated labels.
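
To make that concrete, here is a small Python sketch that mimics what "sum without (...)" does: drop the listed labels, then add up every series that ends up with the same remaining label set. The sample series and their values are made up, and sum_without is just an illustrative helper, not anything from Prometheus itself.

```python
from collections import defaultdict


def sum_without(series, excluded):
    """Mimic PromQL `sum without (...)`: drop the excluded labels, then add up
    the values of all series that share the same remaining label set."""
    groups = defaultdict(float)
    for labels, value in series:
        remaining = tuple(sorted((k, v) for k, v in labels.items() if k not in excluded))
        groups[remaining] += value
    return dict(groups)


# Hypothetical samples of scrape_samples_scraped from three targets.
series = [
    ({"job": "node", "instance": "10.0.0.1:9100"}, 1200),
    ({"job": "node", "instance": "10.0.0.2:9100"}, 1150),
    ({"job": "app", "instance": "10.0.0.3:8080"}, 400),
]

all_removed = sum_without(series, {"job", "instance"})
only_instance_removed = sum_without(series, {"instance"})
print(all_removed)            # one group keyed by the empty label set
print(only_instance_removed)  # one group per job
```

Once every label is excluded, all series collapse into a single group whose label set is empty, which is what the "{}" in the result stands for. The query still returns a value; it just has no labels left to show.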


Instead, you want to graph the raw scrape_samples_scraped metric (no sum 
or without) and see how it varies over time. Is there a particular job or 
target which shows a huge increase in the graph, or are new series 
appearing? That could happen for many different reasons, but ideas 
include:


* new version of software which increases number of exposed metrics (or 
more granular labels)
* bug in software where a label is set to something with high 
cardinality (e.g. there is a "path" label from a web app, which means a 
potentially infinite cardinality, and you could have had a web scan 
producing millions of combinations)
* lots of changes to the targets, such as new instances of software or 
high churn of applications restarting


--
Stuart Clark



[prometheus-users] Re: Unusual traffic in prometheus nodes.

2023-07-27 Thread Uvais Ibrahim
Hi Brian,

This is the query that I have used.

sum(scrape_samples_scraped)without(app,app_kubernetes_io_managed_by,clusterName,release,environment,instance,job,k8s_cluster,kubernetes_name,kubernetes_namespace,ou,app_kubernetes_io_component,app_kubernetes_io_name,app_kubernetes_io_version,kustomize_toolkit_fluxcd_io_name,kustomize_toolkit_fluxcd_io_namespace,application,name,role,app_kubernetes_io_instance,app_kubernetes_io_part_of,control_plane,beta_kubernetes_io_arch,beta_kubernetes_io_instance_type,
 
beta_kubernetes_io_os, failure_domain_beta_kubernetes_io_region, 
failure_domain_beta_kubernetes_io_zone,kubernetes_io_arch, 
kubernetes_io_hostname, kubernetes_io_os, node_kubernetes_io_instance_type, 
nodegroup, topology_kubernetes_io_region, 
topology_kubernetes_io_zone,chart,heritage,revised,transit,component,namespace, 
pod_name, pod_template_hash, security_istio_io_tlsMode, 
service_istio_io_canonical_name, 
service_istio_io_canonical_revision,k8s_app,kubernetes_io_cluster_service,kubernetes_io_name,route_reflector)

This simply excludes every label, but I am still getting a result like this:

{}  7525871918


It shouldn't return any results, right?

Prometheus version: 2.36.2

By increased traffic I mean that the Prometheus servers have been getting 
high traffic since a specific point in time. Prometheus is currently 
receiving 13 million packets; earlier it was around 2 to 3 million packets 
on average. The Prometheus endpoint is not public.


On Thursday, July 27, 2023 at 6:06:10 PM UTC+5:30 Brian Candler wrote:

> scrape_samples_scraped always has the labels which prometheus itself adds 
> (i.e. job and instance).
>
> Extraordinary claims require extraordinary evidence. Are you saying that 
> the PromQL query *scrape_samples_scraped{job="",instance=""}* returns a 
> result?  If so, what's the number?  What do you mean by "with increased 
> size" - increased as compared to what? And what version of prometheus are 
> you running?
>
> In any case, what you see with scrape_samples_scraped may be completely 
> unrelated to the "high traffic" issue.  Is your prometheus server exposed 
> to the Internet? Maybe someone is accessing it remotely.  Even if not, you 
> can use packet capture to work out where the traffic is going to and from.  
> A tool like https://www.sniffnet.net/ may be helpful.



[prometheus-users] Re: Unusual traffic in prometheus nodes.

2023-07-27 Thread Brian Candler
scrape_samples_scraped always has the labels which prometheus itself adds 
(i.e. job and instance).

Extraordinary claims require extraordinary evidence. Are you saying that 
the PromQL query *scrape_samples_scraped{job="",instance=""}* returns a 
result?  If so, what's the number?  What do you mean by "with increased 
size" - increased as compared to what? And what version of prometheus are 
you running?

In any case, what you see with scrape_samples_scraped may be completely 
unrelated to the "high traffic" issue.  Is your prometheus server exposed 
to the Internet? Maybe someone is accessing it remotely.  Even if not, you 
can use packet capture to work out where the traffic is going to and from.  
A tool like https://www.sniffnet.net/ may be helpful.

On Thursday, 27 July 2023 at 13:14:25 UTC+1 Uvais Ibrahim wrote:

> Hi,
>
> Since last night, my Prometheus EC2 servers have been getting unusually 
> high traffic. When I checked in Prometheus, I could see the metric 
> scrape_samples_scraped with increased size but without any labels. What 
> could be the reason?
>
>
> Thanks,
> Uvais Ibrahim



[prometheus-users] Unusual traffic in prometheus nodes.

2023-07-27 Thread Uvais Ibrahim
Hi,

Since last night, my Prometheus EC2 servers have been getting unusually 
high traffic. When I checked in Prometheus, I could see the metric 
scrape_samples_scraped with increased size but without any labels. What 
could be the reason?


Thanks,
Uvais Ibrahim


