Re: Queryable state on task managers that are not running the job

2020-12-23 Thread Yun Tang
Hi Martin,

Which deployment mode are you using? If you launch jobs in per-job mode [1], 
there may be only idle slots rather than idle taskmanagers. 
Currently, queryable state is bound to a specific job, and if the idle 
taskmanager is not registered with the target job's resource manager, no 
queryable state can be queried from it.
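
Until this is handled inside Flink, one client-side workaround is to try the 
proxies of several taskmanagers in turn and use the first one that answers. 
The following is only a minimal sketch under stated assumptions: the class 
ProxyFallback, the method queryFirstAvailable and the host names are 
hypothetical, and the actual query is passed in as a function (in a real 
client it would wrap a QueryableStateClient lookup and wait for its result).

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper, not part of Flink: tries each queryable state proxy
// in order and returns the first result that comes back without an error.
final class ProxyFallback {

    private ProxyFallback() {
    }

    /**
     * @param proxyHosts taskmanager hosts to try, in order (assumed to be known to the caller)
     * @param query      performs the lookup against one host; it is assumed to throw a
     *                   RuntimeException when that host cannot serve the job's state
     * @return the first non-null result, or empty if no host could answer
     */
    public static <T> Optional<T> queryFirstAvailable(
            List<String> proxyHosts, Function<String, T> query) {
        for (String host : proxyHosts) {
            try {
                T result = query.apply(host);
                if (result != null) {
                    return Optional.of(result);
                }
            } catch (RuntimeException e) {
                // This taskmanager is probably idle for the job; fall through to the next host.
            }
        }
        return Optional.empty();
    }
}
```

With hosts "tm-idle" and "tm-busy", where the lookup against "tm-idle" throws 
because the job is unknown there, the call returns the value obtained from 
"tm-busy". This costs one extra round trip per idle taskmanager hit, so it is 
a stopgap rather than a replacement for sizing the cluster.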


[1] https://ci.apache.org/projects/flink/flink-docs-stable/deployment/resource-providers/yarn.html#per-job-cluster-mode

Best
Yun Tang

From: Martin Boyanov 
Sent: Monday, December 21, 2020 19:04
To: user@flink.apache.org 
Subject: Queryable state on task managers that are not running the job

Hi,
I'm running a long-running Flink job in cluster mode and I'm interested in 
using the queryable state functionality.
I have the following problem: when I query the Flink task managers (i.e. the 
queryable state proxy), it is possible to hit a task manager which doesn't have 
the requested state, because the job is not running on that task manager.
For example, I might have a cluster with 5 task managers, but the job is 
deployed only on 3 of those. If my query hits any of the two idle task 
managers, I naturally get an error message that the job does not exist.
My current solution is to size the cluster appropriately so that there are no 
idle task managers. I was wondering if there was a better solution or if this 
could be handled better in the future?
Thanks in advance.
Kind regards,
Martin


Queryable state on task managers that are not running the job

2020-12-21 Thread Martin Boyanov
Hi,
I'm running a long-running Flink job in cluster mode and I'm interested in
using the queryable state functionality.
I have the following problem: when I query the Flink task managers (i.e.
the queryable state proxy), it is possible to hit a task manager which
doesn't have the requested state, because the job is not running on that
task manager.
For example, I might have a cluster with 5 task managers, but the job is
deployed only on 3 of those. If my query hits any of the two idle task
managers, I naturally get an error message that the job does not exist.
My current solution is to size the cluster appropriately so that there are
no idle task managers. I was wondering if there was a better solution or if
this could be handled better in the future?
Thanks in advance.
Kind regards,
Martin