[prometheus-users] Prometheus High RAM Investigation

Shubham Shrivastav Tue, 25 Jan 2022 21:58:01 -0800

Hi all, 

I've been investigating Prometheus memory utilization over the last couple 
of days.


Based on the *pprof* output, a lot of memory is attributed to the 
*getOrSet* function, but according to the docs it is only used when creating 
new series, so I'm not sure what I can do about it.


Pprof "top" output: 
https://pastebin.com/bAF3fGpN

Also, to figure out whether I have any metrics that I can remove, I ran 
./tsdb analyze on memory *(output here: https://pastebin.com/twsFiuRk)*

I did find some metrics with higher cardinality than others, but the 
difference was not massive.

With ~100 nodes, Prometheus uses around 15 GB of RAM.

We're getting an *average of 8257 metrics per node*.

We estimate that we will grow to around 200 nodes, which will make our RAM 
usage go through the roof.
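
For a rough sense of scale, here is a back-of-the-envelope sketch using only 
the figures quoted above. It rests on two assumptions that may not hold: that 
"metrics per node" means active series per node, and that memory grows 
linearly with the number of series. It also treats "15 Gigs" as GiB. Under 
those assumptions, doubling the node count roughly doubles RAM, to about 30 GB.

```python
# Back-of-the-envelope projection. All inputs are the figures from this post;
# the linear-scaling model is an assumption, not measured behaviour.
nodes_now = 100
series_per_node = 8257   # "average metrics per node", assumed to mean active series
ram_now_gb = 15.0        # "15 Gigs", treated as GiB for the per-series figure


def project_ram_gb(target_nodes: int) -> float:
    """Scale current RAM linearly with the total series count (assumption)."""
    total_series_now = nodes_now * series_per_node
    gb_per_series = ram_now_gb / total_series_now
    return gb_per_series * target_nodes * series_per_node


total_series = nodes_now * series_per_node
kib_per_series = ram_now_gb * 1024**2 / total_series

print(f"active series now:          {total_series}")
print(f"approx. KiB per series:     {kib_per_series:.1f}")
print(f"projected RAM at 200 nodes: {project_ram_gb(200):.0f} GB")
```

The per-series figure is only an average over everything in the process 
(head series, labels, index, query overhead), so it is a planning number, not 
a statement about how any single series is stored.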

Apart from distributing our load over multiple Prometheus nodes, are there 
any alternatives?

TIA,
Shubham

