Hey!
Let me first answer your questions, then hopefully provide an actual
solution :)
1. The adaptive scheduler would not reduce the desired parallelism of the
vertex in this case, but it should allow the job to start, depending on the
lower/upper bound resource config. There have been some changes in ho
Hi Abhi,
If your case can be reproduced reliably, have you tried taking a thread
dump of the TM in which the problematic operator resides?
The thread dump may give us more clues about where the operator is
getting stuck.
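For reference, here is a minimal sketch of what a thread dump contains, using the standard JDK ThreadMXBean API. The class and method names (ThreadDumpSketch, dumpAllThreads) are made up for illustration, and the code only dumps the JVM it runs in. For a running TM you would more likely run jstack against the TM process inside the pod, or use the thread dump view in the Flink web UI if your version has one.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Flink: prints every live thread of the
// current JVM with its state and full stack trace.
public class ThreadDumpSketch {

    static String dumpAllThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip locked monitor / synchronizer details, which
        // not every JVM supports.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

In a dump like this, a stuck operator usually shows up as a task thread whose stack stays at the same frames across several dumps taken a few seconds apart.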
Best,
Biao Geng
On April 30, 2024, Abhi Sagar Khatri via user wrote:
Hello,
I am running a Flink job in application mode on k8s. It is deployed as a
FlinkDeployment, and its life cycle is managed by the flink-k8s-operator.
The autoscaler is being used with the following config:
job.autoscaler.enabled: true
job.autoscaler.metrics.window: 5m
job.autoscaler.stabiliz
Hi Eduard,
You may need to set the log level to INFO to see whether any other error
messages appear in the JM or TM logs. The current exception appears to be
a resulting error reported by the JM; the underlying cause should still be
somewhere in the TM log.
Best,
Yunfeng