Hello,
We are running Flink 1.13.6 on Azure Kubernetes Service (AKS). The job worked
fine for many months, but right after a recent upgrade of AKS to 1.25.5, the
taskmanager started getting OOM-killed every day. I suspected this might be
because AKS 1.25.x uses cgroups v2, which is not supported by JDK 8 builds
older than 8u372. On the other hand, Flink explicitly sets -Xmx, so it looks
like the taskmanager should not depend on cgroups v2 support in the JDK. Does
RocksDB use cgroups to calculate its limits? Any ideas what else it could be?
Is anyone running Flink on AKS 1.25.x, or on any other K8s with cgroups v2
enabled? How has your experience been?
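
For anyone who wants to compare setups, here is a minimal sketch of how one
might confirm which cgroup version a pod sees and what memory limit is exposed
to it. It assumes the usual /sys/fs/cgroup mount point and would need to be run
inside the taskmanager container; the function names are made up for
illustration and are not part of Flink or the JDK.

```python
from pathlib import Path


def cgroup_version(root="/sys/fs/cgroup"):
    """Return 2 if `root` looks like a cgroup v2 unified hierarchy, else 1."""
    # A cgroup v2 hierarchy exposes a cgroup.controllers file in each cgroup dir.
    return 2 if (Path(root) / "cgroup.controllers").is_file() else 1


def memory_limit_bytes(root="/sys/fs/cgroup"):
    """Return the memory limit in bytes, or None if unlimited or not found."""
    base = Path(root)
    # v2 stores the limit in memory.max; v1 uses memory/memory.limit_in_bytes.
    candidates = (base / "memory.max", base / "memory" / "memory.limit_in_bytes")
    for candidate in candidates:
        if candidate.is_file():
            raw = candidate.read_text().strip()
            # v2 writes the literal string "max" when no limit is set;
            # v1 reports a very large integer instead, which is returned as-is.
            return None if raw == "max" else int(raw)
    return None


if __name__ == "__main__":
    print("cgroup version:", cgroup_version())
    print("memory limit (bytes):", memory_limit_bytes())
```

Comparing the reported limit with the configured taskmanager process size
would at least show whether the container limit itself changed with the
upgrade.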

Thanks,
Alexey
