We have around 50 nodes with 64 GB of RAM.

By the way, we found that our backend had added a metric that flooded
Prometheus until it crashed :)
They removed the metric and the server seems to be stable.
It is still using around 30 GB of RAM, but at least it is not crashing.
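
For anyone who finds this thread later, here is a minimal sketch of how a metric like that might be spotted. It assumes the Prometheus HTTP API endpoint /api/v1/status/tsdb (the JSON behind the tsdb-status page) and its seriesCountByMetricName list; the helper names and the base URL are hypothetical, and this is not necessarily how the metric in this thread was found.

```python
import json
import urllib.request


def fetch_tsdb_status(base_url):
    """Fetch the TSDB status JSON from a Prometheus server.

    base_url is an assumption, e.g. "http://prometheus:9090".
    """
    url = f"{base_url}/api/v1/status/tsdb"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def top_series_by_metric(status, limit=5):
    """Return (metric name, series count) pairs, largest first.

    Expects the parsed API response; a missing section yields an empty list.
    """
    entries = status.get("data", {}).get("seriesCountByMetricName", [])
    ranked = sorted(entries, key=lambda e: e["value"], reverse=True)
    return [(e["name"], e["value"]) for e in ranked[:limit]]
```

Calling top_series_by_metric(fetch_tsdb_status("http://prometheus:9090")) lists the metric names with the most series in the head block. A metric whose series count dwarfs the rest is a candidate for the kind of cardinality problem described above. Note that the endpoint itself only reports the top entries, not every metric.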

On Monday, 23 August 2021 at 16:25:21 UTC+3, [email protected] wrote:

> Seems about correct for that many series. Kubernetes use includes a lot of 
> label data/cardinality that requires extra memory for tracking.
>
> How big is your cluster in terms of total memory for all nodes?
>
> On Mon, Aug 23, 2021 at 2:18 PM Yaron B <[email protected]> wrote:
>
>> That makes sense, but if I take the numbers from the URL you gave me:
>> Number of Series 2514033
>> Number of Chunks 3098707
>> Number of Label Pairs 1088507
>> and put them into a memory calculator I found, it shows much less RAM than
>> what I am using now.
>>
>> Do you see any number here that should be a red flag for me? Something
>> that is not right?
>> On Monday, 23 August 2021 at 14:58:36 UTC+3, [email protected] wrote:
>>
>>> Prometheus needs memory to buffer incoming data before writing it to 
>>> disk. The more you scrape, the more it needs.
>>>
>>> You can see a summary of this information on prometheus:9090/tsdb-status
>>>
>>> On Mon, Aug 23, 2021 at 1:55 PM Yaron B <[email protected]> wrote:
>>>
>>>> Can anyone understand from this image why the server is using so
>>>> much?
>>>> production-prometheus-server-869bffc459-r92nh   1186m   54937Mi
>>>> That's crazy!
>>>> On Monday, 23 August 2021 at 13:35:18 UTC+3, Yaron B wrote:
>>>>
>>>>> At the moment we have added some scrape jobs that bumped the memory usage
>>>>> from around 30 GB to 40 GB, but we are not sure why the self-scraping takes
>>>>> so much RAM.
>>>>> It's not a new implementation. We did notice it was using a lot of
>>>>> memory, but it didn't crash on us, so we let it run. Today,
>>>>> as you can see in the attached image, it crashed and memory usage
>>>>> skyrocketed to 60 GB. We then started disabling jobs until the server
>>>>> stopped crashing, but it is using more than it did in the last 15 days.
>>>>>
>>>>> On Monday, 23 August 2021 at 13:29:59 UTC+3, Stuart Clark wrote:
>>>>>
>>>>>> On 23/08/2021 11:23, Yaron B wrote:
>>>>>>
>>>>>> I am attaching the heap.svg in case someone can help me figure out what is
>>>>>> using the memory.
>>>>>> On Monday, 23 August 2021 at 12:23:33 UTC+3, Yaron B wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> We are facing an issue with the Prometheus server's memory usage.
>>>>>>> When the server starts, it uses around 30 GB of RAM, even
>>>>>>> without any jobs configured other than the self-scrape one.
>>>>>>> In the attached image you can see the heap usage for the
>>>>>>> prometheus job.
>>>>>>> Is there a way to reduce this? When we add our Kubernetes
>>>>>>> scrape job we reach our node limit and get OOMKilled.
>>>>>>>
>>>>>> So at the moment it isn't scraping anything other than itself via the 
>>>>>> /metrics endpoint?
>>>>>>
>>>>>> Is this a brand new service (i.e. no existing data stored on disk)?
>>>>>>
>>>>>> Is there anything querying the server (e.g. Grafana dashboards, etc.)?
>>>>>>
>>>>>> -- 
>>>>>> Stuart Clark
>>>>>>
>>>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "Prometheus Users" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/prometheus-users/0659c262-daeb-452e-8dc4-4df8df22021dn%40googlegroups.com
>>>>  
>>>> <https://groups.google.com/d/msgid/prometheus-users/0659c262-daeb-452e-8dc4-4df8df22021dn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>

