Re: Flink scheduler keeps trying to schedule the pods indefinitely

2024-05-05 Thread Gyula Fóra
Hey! Let me first answer your questions and then hopefully provide an actual solution :) 1. The adaptive scheduler would not reduce the vertex's desired parallelism in this case, but it should allow the job to start depending on the lower/upper bound resource config. There have been some changes in ho

Re: Looking for help with Job Initialisation issue

2024-05-05 Thread Biao Geng
Hi Abhi, If your case can be reproduced reliably, have you ever tried to get a thread dump of the TM in which the problematic operator resides? Maybe we can get more clues from the thread dump to see where the operator is getting stuck. Best, Biao Geng Abhi Sagar Khatri via user wrote on April 30, 2024,

Flink scheduler keeps trying to schedule the pods indefinitely

2024-05-05 Thread Chetas Joshi
Hello, I am running a Flink job in application mode on k8s. It's deployed as a FlinkDeployment and its life cycle is managed by the flink-k8s-operator. The autoscaler is being used with the following config: job.autoscaler.enabled: true job.autoscaler.metrics.window: 5m job.autoscaler.stabiliz

Re: Coordinator of operator ... does not exist or the job vertex this operator belongs to is not initialized.

2024-05-05 Thread Yunfeng Zhou
Hi Eduard, You may need to set the log level to INFO to see whether any other error messages appear in the JM's or TM's logs. The current exception message seems to be a resulting error generated by the JM, while the error that caused it should still be somewhere in the TM's log. Best Yunfeng