I have a Prometheus pod running on my cluster with a 24 GB memory request.
It runs fine most of the time, but when it needs to replay a large WAL it
gets OOM-killed and the WAL has to be deleted manually.
I've seen multiple messages suggesting that the instance just needs more 
memory in order to replay the WAL, but I wanted to get a better grasp of 
how the size of the WAL relates to the memory the pod needs. If I can find 
out that ratio, I can probably alert well in advance when the WAL grows too 
big and needs to be dealt with before we end up in a crash loop.
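
To make the kind of check described above concrete, here is a minimal
Python sketch. It only measures the on-disk size of the WAL directory and
compares it with a fraction of the pod's memory. It assumes the WAL lives
in a "wal" directory under the Prometheus data directory. The function
names and the max_ratio parameter are hypothetical, and the ratio itself is
exactly the unknown being asked about, so any value passed in is a
placeholder to be replaced once it has been measured.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # A segment can disappear mid-walk (e.g. WAL truncation);
                # skip it rather than fail the whole measurement.
                pass
    return total


def wal_exceeds_ratio(wal_bytes, memory_bytes, max_ratio):
    """Return True if the WAL is larger than max_ratio * memory_bytes.

    max_ratio is a hypothetical, empirically determined value; it is not
    a number taken from the Prometheus documentation.
    """
    return wal_bytes > memory_bytes * max_ratio
```

For example, wal_exceeds_ratio(dir_size_bytes("/prometheus/wal"), 24 * 2**30, 0.25)
would flag a WAL larger than a quarter of the 24 GB request; both the path
and the 0.25 are illustrative only.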

Any idea how I can find out this information?

Thanks.
