Re: Pod Disruption in Flink Kubernetes Cluster

2022-01-09 Thread Yang Wang
Maybe the Flink applications could run more stably if you configure enough resources (e.g. memory, CPU, ephemeral-storage) for the JobManager and TaskManager pods. Best, Yang David Morávek wrote on Wed, Jan 5, 2022 at 16:46: > Hi Tianyi, > > this really depends on your Kubernetes setup (e.g. if autoscaling is

Re: Pod Disruption in Flink Kubernetes Cluster

2022-01-05 Thread David Morávek
Hi Tianyi, this really depends on your Kubernetes setup (e.g. if autoscaling is enabled or you're using spot / preemptible instances). In general, applications that run on Kubernetes need to be resilient to these kinds of failures, and Flink is no exception. In case of a failure, Flink needs to restart

Pod Disruption in Flink Kubernetes Cluster

2022-01-04 Thread Tianyi Deng
Hello Flink community, We have a Flink cluster deployed to AWS EKS along with many other applications. This cluster is managed by Spotify’s Flink operator. After deployment I noticed that the Stateful pods of the job manager and task managers intermittently received SIGTERM and terminated. I