I have a Prometheus pod running on my cluster requesting 24 GB of memory. It runs great most of the time, but when it needs to read a large WAL it OOMs and the WAL has to be deleted manually. I've seen multiple messages suggesting that the instance just needs more memory in order to replay the WAL, but I wanted to get a better grasp of how the size of the WAL relates to the memory the pod needs - if I can find out that ratio, I can probably alert well in advance when the WAL grows too big and needs to be handled before we end up in a crash loop.
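To make the question concrete, here is a rough sketch of the kind of check I have in mind. It is only an illustration: `wal_size_bytes`, `wal_too_big`, `replay_ratio` and `headroom` are names I made up, and `replay_ratio` (bytes of RAM needed per byte of WAL) is exactly the number I don't know, so any value passed in is a placeholder.

```python
import os


def wal_size_bytes(wal_dir):
    """Sum the sizes of all regular files under wal_dir, including subdirectories."""
    total = 0
    for root, _dirs, files in os.walk(wal_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def wal_too_big(wal_bytes, memory_limit_bytes, replay_ratio, headroom=0.8):
    """Return True when the estimated replay memory exceeds a share of the limit.

    replay_ratio is a hypothetical "RAM bytes per WAL byte" factor (the unknown
    I am asking about); headroom is the fraction of the memory limit that I am
    willing to let the replay use before alerting.
    """
    estimated_replay_bytes = wal_bytes * replay_ratio
    return estimated_replay_bytes > memory_limit_bytes * headroom
```

With a 24 GiB limit and a made-up ratio of 2.0, `wal_too_big(10 * 2**30, 24 * 2**30, 2.0)` would fire, since the estimated 20 GiB is above 80% of the limit. Whether the relationship is even close to linear is part of what I'd like to understand.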
Any idea how I can find out this information? Thanks.