We have around 50 nodes with 64 GB of RAM. By the way, we found that our backend had added a metric that flooded Prometheus until it crashed :) They removed the metric and the server seems to be stable now. It is still using around 30 GB of RAM, but at least it is not crashing.
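For anyone who runs into the same thing, here is a rough sketch (not something we ran in this thread) of one way to look for a runaway metric. It assumes the TSDB status endpoint of the Prometheus HTTP API, `/api/v1/status/tsdb`, which reports series counts per metric name. The server URL, the helper names, and the sample payload are made up for illustration.

```python
import json
import urllib.request


def fetch_tsdb_status(base_url):
    """Fetch TSDB head statistics from a Prometheus server.

    base_url is an assumption, e.g. "http://prometheus:9090".
    """
    url = base_url.rstrip("/") + "/api/v1/status/tsdb"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def top_metrics(status, limit=5):
    """Return (metric name, series count) pairs, largest first."""
    entries = status.get("data", {}).get("seriesCountByMetricName") or []
    pairs = [(e["name"], int(e["value"])) for e in entries]
    pairs.sort(key=lambda p: p[1], reverse=True)
    return pairs[:limit]


# Hypothetical payload shaped like the API response; the metric names
# and counts are invented and are not the numbers from this thread.
SAMPLE = {
    "status": "success",
    "data": {
        "seriesCountByMetricName": [
            {"name": "container_cpu_usage_seconds_total", "value": 400000},
            {"name": "backend_request_debug_total", "value": 1800000},
            {"name": "up", "value": 1200},
        ]
    },
}

if __name__ == "__main__":
    for name, count in top_metrics(SAMPLE, limit=2):
        print(f"{count:>10}  {name}")
```

Against a live server you would pass the result of `fetch_tsdb_status(...)` instead of `SAMPLE`. A single metric that owns most of the series is a good candidate for the kind of problem described above.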
On Monday, 23 August 2021 at 16:25:21 UTC+3, [email protected] wrote:

> Seems about correct for that many series. Kubernetes use includes a lot of
> label data/cardinality that requires extra memory for tracking.
>
> How big is your cluster in terms of total memory for all nodes?
>
> On Mon, Aug 23, 2021 at 2:18 PM Yaron B <[email protected]> wrote:
>
>> That makes sense, but if I look at the numbers in the URL you gave me:
>> Number of Series 2514033
>> Number of Chunks 3098707
>> Number of Label Pairs 1088507
>> and use them in a memory calculator I found, it shows me much less RAM than
>> what I am using now.
>>
>> Do you see any number here that should be a red light for me? Something
>> that is not right?
>>
>> On Monday, 23 August 2021 at 14:58:36 UTC+3, [email protected] wrote:
>>
>>> Prometheus needs memory to buffer incoming data before writing it to
>>> disk. The more you scrape, the more it needs.
>>>
>>> You can see a summary of this information on prometheus:9090/tsdb-status
>>>
>>> On Mon, Aug 23, 2021 at 1:55 PM Yaron B <[email protected]> wrote:
>>>
>>>> Can anyone understand from this image why the server is using so
>>>> much?
>>>> production-prometheus-server-869bffc459-r92nh
>>>> 1186m 54937Mi
>>>> That's crazy!
>>>>
>>>> On Monday, 23 August 2021 at 13:35:18 UTC+3, Yaron B wrote:
>>>>
>>>>> At the moment we did add some scrape jobs that bumped the memory usage
>>>>> from around 30 GB to 40 GB, but we are not sure why the self-scraping
>>>>> takes so much RAM.
>>>>> It's not a new implementation. We did notice it is using a lot of
>>>>> memory, but it didn't crash on us so we let it run.
>>>>> Today, as you can see in the attached image, it crashed and the memory
>>>>> usage skyrocketed to 60 GB. Then we started to disable jobs until the
>>>>> server didn't crash anymore, but it is using more than it used in the
>>>>> last 15 days.
>>>>>
>>>>> On Monday, 23 August 2021 at 13:29:59 UTC+3, Stuart Clark wrote:
>>>>>
>>>>>> On 23/08/2021 11:23, Yaron B wrote:
>>>>>>
>>>>>> I am attaching the heap.svg if someone can help me figure out what is
>>>>>> using the memory.
>>>>>>
>>>>>> On Monday, 23 August 2021 at 12:23:33 UTC+3, Yaron B wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> We are facing an issue with the Prometheus server memory usage.
>>>>>>> When starting the server it starts with around 30 GB of RAM, even
>>>>>>> without any jobs configured other than the self one.
>>>>>>> In the image attached you can see the heap size usage for the
>>>>>>> Prometheus job.
>>>>>>> Is there a way to reduce this size? When we add our Kubernetes
>>>>>>> scrape job we reach our node limit and get OOMKilled.
>>>>>>>
>>>>>> So at the moment it isn't scraping anything other than itself via the
>>>>>> /metrics endpoint?
>>>>>>
>>>>>> Is this a brand new service (i.e. no existing data stored on disk)?
>>>>>>
>>>>>> Is there anything querying the server (e.g. Grafana dashboards, etc.)?
>>>>>>
>>>>>> --
>>>>>> Stuart Clark
>>>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Prometheus Users" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/prometheus-users/0659c262-daeb-452e-8dc4-4df8df22021dn%40googlegroups.com
--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/8fefbe26-6cf0-498a-96d2-0bb21f536ee5n%40googlegroups.com.

