[prometheus-users] PromCon 2023 in Berlin
Hey everyone,

Many of you have been asking for a while, so we’re pleased to announce that PromCon 2023 EU is happening in Berlin, Germany, on September 28th & 29th at Radialsystem.

Tickets are available via the PromCon website: https://promcon.io/2023-berlin/register/

We are looking for talk proposals until August 18th: https://promcon.io/2023-berlin/submit/

If your company wants to sponsor PromCon, you can also find more information on our website: https://promcon.io/2023-berlin/sponsor/

If there are any other questions, please reach out to our mailing list: promcon-organiz...@googlegroups.com

We are super excited to see everyone in Berlin!

Cheers,
Matthias on behalf of the PromCon Organizers

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/CAN4-%2B%2BNnhwzcJ%3D%2Bcog-iDFhjXRuNPQuTDfVQ2358gUMN17HuFw%40mail.gmail.com.
[prometheus-users] Re: Unusual traffic in prometheus nodes.
As Stuart says, that looks correct, assuming your metrics don't have any labels other than the ones you've excluded. You'd save a lot of typing just by doing:

    sum(scrape_samples_scraped)

which is expected to return a single value, with no labels (as it's summed across all timeseries of this metric).

The value 7,525,871,918 does seem quite high - what was it before? You can set an evaluation time for this query in the PromQL browser, or draw a graph of this expression over time, to see historical values.

You could also look at

    count(scrape_samples_scraped)

or more simply

    count(up)

and see if that has jumped up: it would imply that lots more targets have been added (e.g. more pods are being monitored).

If not, then as well as Stuart's suggestion of graphing "scrape_samples_scraped" by itself to see if one particular target is generating way more metrics than usual, you could try different summary variants like

    sum by (instance,job) (scrape_samples_scraped)
    sum by (clusterName) (scrape_samples_scraped)
    ... etc

and see if there's a spike in any of these. This may help you drill down to the offending item(s).

On Thursday, 27 July 2023 at 15:51:24 UTC+1 Uvais Ibrahim wrote:

> Hi Brian,
>
> This is the query that I have used.
>
> sum(scrape_samples_scraped)without(app,app_kubernetes_io_managed_by,clusterName,release,environment,instance,job,k8s_cluster,kubernetes_name,kubernetes_namespace,ou,app_kubernetes_io_component,app_kubernetes_io_name,app_kubernetes_io_version,kustomize_toolkit_fluxcd_io_name,kustomize_toolkit_fluxcd_io_namespace,application,name,role,app_kubernetes_io_instance,app_kubernetes_io_part_of,control_plane,beta_kubernetes_io_arch,beta_kubernetes_io_instance_type,beta_kubernetes_io_os,failure_domain_beta_kubernetes_io_region,failure_domain_beta_kubernetes_io_zone,kubernetes_io_arch,kubernetes_io_hostname,kubernetes_io_os,node_kubernetes_io_instance_type,nodegroup,topology_kubernetes_io_region,topology_kubernetes_io_zone,chart,heritage,revised,transit,component,namespace,pod_name,pod_template_hash,security_istio_io_tlsMode,service_istio_io_canonical_name,service_istio_io_canonical_revision,k8s_app,kubernetes_io_cluster_service,kubernetes_io_name,route_reflector)
>
> Which simply excluded every label but still I am getting a result like this
>
> {} 7525871918
>
> It shouldn't return any results, right?
>
> Prometheus version: 2.36.2
>
> By increased traffic I meant that the Prometheus servers have been getting high traffic since a specific point in time. Currently Prometheus is getting 13 million packets; earlier it was around 2 to 3 million packets on average. And the Prometheus endpoint is not public.
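The before/after comparison suggested in this reply can also be scripted. Instant queries on the Prometheus HTTP API (/api/v1/query) accept an optional "time" parameter, so the same expression can be evaluated now and at a point before the traffic increase. The following is only a sketch that builds the request URLs: the server address, the example timestamp and the helper name instant_query_url are assumptions for illustration, not something taken from the thread.

```python
from urllib.parse import urlencode

# Assumed address of the Prometheus server; adjust for your setup.
PROMETHEUS_URL = "http://localhost:9090"

# Queries suggested in the thread, from coarse to fine.
DRILL_DOWN_QUERIES = [
    "sum(scrape_samples_scraped)",
    "count(up)",
    "sum by (instance,job) (scrape_samples_scraped)",
    "sum by (clusterName) (scrape_samples_scraped)",
]


def instant_query_url(query, at=None, base=PROMETHEUS_URL):
    """Build an instant-query URL for /api/v1/query (hypothetical helper).

    `at` is an optional evaluation time, given as a Unix timestamp or an
    RFC 3339 string. Without it, Prometheus evaluates the query at "now".
    """
    params = {"query": query}
    if at is not None:
        params["time"] = at
    return f"{base}/api/v1/query?{urlencode(params)}"


if __name__ == "__main__":
    # Example timestamp only: pick a moment before the increase started.
    before = "2023-07-26T12:00:00Z"
    for q in DRILL_DOWN_QUERIES:
        print(instant_query_url(q))             # evaluated now
        print(instant_query_url(q, at=before))  # evaluated in the past
```

Fetching each pair of URLs (with curl or any HTTP client) and comparing the results per job, instance or cluster should show which group grew, if any.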
Re: [prometheus-users] Re: Unusual traffic in prometheus nodes.
On 27/07/2023 15:51, Uvais Ibrahim wrote:

> Hi Brian,
>
> This is the query that I have used.
>
> sum(scrape_samples_scraped)without(app,app_kubernetes_io_managed_by,clusterName,release,environment,instance,job,k8s_cluster,kubernetes_name,kubernetes_namespace,ou,app_kubernetes_io_component,app_kubernetes_io_name,app_kubernetes_io_version,kustomize_toolkit_fluxcd_io_name,kustomize_toolkit_fluxcd_io_namespace,application,name,role,app_kubernetes_io_instance,app_kubernetes_io_part_of,control_plane,beta_kubernetes_io_arch,beta_kubernetes_io_instance_type,beta_kubernetes_io_os,failure_domain_beta_kubernetes_io_region,failure_domain_beta_kubernetes_io_zone,kubernetes_io_arch,kubernetes_io_hostname,kubernetes_io_os,node_kubernetes_io_instance_type,nodegroup,topology_kubernetes_io_region,topology_kubernetes_io_zone,chart,heritage,revised,transit,component,namespace,pod_name,pod_template_hash,security_istio_io_tlsMode,service_istio_io_canonical_name,service_istio_io_canonical_revision,k8s_app,kubernetes_io_cluster_service,kubernetes_io_name,route_reflector)
>
> Which simply excluded every label but still I am getting a result like this
>
> {} 7525871918

I'm not sure what you are expecting, as that sounds about right. The query is adding together all the different variants of the scrape_samples_scraped metric (removing all the different labels), so if that is indeed a list of every label, the query is going to return a value without any associated labels.

Instead, you want to graph the raw scrape_samples_scraped metric (no sum or without) and see how it varies over time. Is there a particular job or target which has a huge increase in the graph, or new series appearing?

As to why that might happen, it could be many different reasons, but ideas could include:

* new version of software which increases the number of exposed metrics (or more granular labels)
* bug in software where a label is set to something with high cardinality (e.g. there is a "path" label from a web app, which means a potentially infinite cardinality, and you could have had a web scan producing millions of combinations)
* lots of changes to the targets, such as new instances of software or high churn of applications restarting

--
Stuart Clark
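The behaviour Stuart describes (one value with an empty label set once every label has been dropped) can be illustrated with a small simulation. This is only a sketch of the grouping idea, not how Prometheus implements aggregation; the sample data and the helper name sum_without are made up for illustration.

```python
from collections import defaultdict


def sum_without(series, dropped):
    """Mimic the grouping of PromQL `sum without (<dropped>) (...)`.

    `series` is a list of (labels_dict, value) pairs. Each sample is grouped
    by whatever labels remain after removing the `dropped` label names, and
    the values in each group are added together.
    """
    groups = defaultdict(float)
    for labels, value in series:
        remaining = tuple(sorted(
            (k, v) for k, v in labels.items() if k not in dropped
        ))
        groups[remaining] += value
    return dict(groups)


# Made-up sample data, for illustration only.
samples = [
    ({"job": "node", "instance": "10.0.0.1:9100"}, 1200),
    ({"job": "node", "instance": "10.0.0.2:9100"}, 1300),
    ({"job": "app", "instance": "10.0.0.3:8080"}, 50000),
]

# Dropping every label leaves a single group whose label set is empty,
# which corresponds to the `{}` result seen in the thread.
print(sum_without(samples, {"job", "instance"}))

# Dropping only `instance` keeps one group per job, which makes it
# visible that the "app" job accounts for most of the samples.
print(sum_without(samples, {"instance"}))
```

In the same way, keeping one or two labels in the real query rather than excluding all of them keeps the large contributor visible.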
[prometheus-users] Re: Unusual traffic in prometheus nodes.
Hi Brian,

This is the query that I have used.

sum(scrape_samples_scraped)without(app,app_kubernetes_io_managed_by,clusterName,release,environment,instance,job,k8s_cluster,kubernetes_name,kubernetes_namespace,ou,app_kubernetes_io_component,app_kubernetes_io_name,app_kubernetes_io_version,kustomize_toolkit_fluxcd_io_name,kustomize_toolkit_fluxcd_io_namespace,application,name,role,app_kubernetes_io_instance,app_kubernetes_io_part_of,control_plane,beta_kubernetes_io_arch,beta_kubernetes_io_instance_type,beta_kubernetes_io_os,failure_domain_beta_kubernetes_io_region,failure_domain_beta_kubernetes_io_zone,kubernetes_io_arch,kubernetes_io_hostname,kubernetes_io_os,node_kubernetes_io_instance_type,nodegroup,topology_kubernetes_io_region,topology_kubernetes_io_zone,chart,heritage,revised,transit,component,namespace,pod_name,pod_template_hash,security_istio_io_tlsMode,service_istio_io_canonical_name,service_istio_io_canonical_revision,k8s_app,kubernetes_io_cluster_service,kubernetes_io_name,route_reflector)

Which simply excluded every label but still I am getting a result like this

{} 7525871918

It shouldn't return any results, right?

Prometheus version: 2.36.2

By increased traffic I meant that the Prometheus servers have been getting high traffic since a specific point in time. Currently Prometheus is getting 13 million packets; earlier it was around 2 to 3 million packets on average. And the Prometheus endpoint is not public.

On Thursday, July 27, 2023 at 6:06:10 PM UTC+5:30 Brian Candler wrote:

> scrape_samples_scraped always has the labels which Prometheus itself adds (i.e. job and instance).
>
> Extraordinary claims require extraordinary evidence. Are you saying that the PromQL query *scrape_samples_scraped{job="",instance=""}* returns a result? If so, what's the number? What do you mean by "with increased size" - increased as compared to what? And what version of Prometheus are you running?
>
> In any case, what you see with scrape_samples_scraped may be completely unrelated to the "high traffic" issue. Is your Prometheus server exposed to the Internet? Maybe someone is accessing it remotely. Even if not, you can use packet capture to work out where the traffic is going to and from. A tool like https://www.sniffnet.net/ may be helpful.
[prometheus-users] Re: Unusual traffic in prometheus nodes.
scrape_samples_scraped always has the labels which Prometheus itself adds (i.e. job and instance).

Extraordinary claims require extraordinary evidence. Are you saying that the PromQL query *scrape_samples_scraped{job="",instance=""}* returns a result? If so, what's the number? What do you mean by "with increased size" - increased as compared to what? And what version of Prometheus are you running?

In any case, what you see with scrape_samples_scraped may be completely unrelated to the "high traffic" issue. Is your Prometheus server exposed to the Internet? Maybe someone is accessing it remotely. Even if not, you can use packet capture to work out where the traffic is going to and from. A tool like https://www.sniffnet.net/ may be helpful.

On Thursday, 27 July 2023 at 13:14:25 UTC+1 Uvais Ibrahim wrote:

> Hi,
>
> Since last night, my Prometheus EC2 servers have been getting unusually high traffic. When I checked in Prometheus, I could see the metric scrape_samples_scraped with increased size but without any labels. What could be the reason?
>
> Thanks,
> Uvais Ibrahim
[prometheus-users] Unusual traffic in prometheus nodes.
Hi,

Since last night, my Prometheus EC2 servers have been getting unusually high traffic. When I checked in Prometheus, I could see the metric scrape_samples_scraped with increased size but without any labels. What could be the reason?

Thanks,
Uvais Ibrahim