Hi Pedro,

That’s interesting, and something we’d like to be able to control as well.

I did a little research, and it seems that (with some stunts) there could be a 
way to achieve this via CoLocationConstraint/CoLocationGroup magic:
<https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/runtime/jobmanager/scheduler/CoLocationConstraint.html>

Note, though, that CoLocationConstraint is meant to ensure that the subtasks of 
different JobVertices are executed on the same Instance (Task Manager), which is 
the opposite of ensuring they're executed on different Task Managers:
<https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/runtime/instance/Instance.html>

The only thing I found on the list was this snippet from Till, posted a few 
years back:

> If your requirement is that O_i will be executed in the same slot as P_i, 
> then you have to add the corresponding JobVertices to a CoLocationGroup. At 
> the moment this is not really exposed but you could try to get the JobGraph 
> from the StreamGraph.getJobGraph and then use JobGraph.getVertices to get the 
> JobVertices. Then you have to find out which JobVertices accommodate your 
> operators. Once this is done, you can colocate them via the 
> JobVertex.setStrictlyCoLocatedWith method. This might solve your problem, but 
> I haven’t tested it myself.
> 
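For what it's worth, here is how I read Till's steps as code. This is an 
untested sketch, assuming the runtime API of that era (JobGraph.getVertices, 
JobVertex.getName, JobVertex.setStrictlyCoLocatedWith). The class and method 
names (CoLocationSketch, findVertex, colocate) and the idea of matching 
vertices by a fragment of their name are my own, purely for illustration.

```java
import org.apache.flink.runtime.jobgraph.JobGraph;
import org.apache.flink.runtime.jobgraph.JobVertex;

/** Hypothetical helper; not part of Flink. */
public final class CoLocationSketch {

    private CoLocationSketch() {}

    /** Returns the first JobVertex whose name contains the given fragment. */
    public static JobVertex findVertex(JobGraph jobGraph, String nameFragment) {
        for (JobVertex vertex : jobGraph.getVertices()) {
            // Chained operators share one vertex, so its name may list several operators.
            if (vertex.getName().contains(nameFragment)) {
                return vertex;
            }
        }
        throw new IllegalArgumentException("No JobVertex matches: " + nameFragment);
    }

    /** Puts the vertices hosting the two operators into one co-location group. */
    public static void colocate(JobGraph jobGraph, String fragmentO, String fragmentP) {
        JobVertex o = findVertex(jobGraph, fragmentO);
        JobVertex p = findVertex(jobGraph, fragmentP);
        if (o == p) {
            // Both operators were chained into the same vertex; nothing to do.
            return;
        }
        // Assumption: both vertices are in the same slot sharing group,
        // otherwise this call is expected to throw IllegalArgumentException.
        o.setStrictlyCoLocatedWith(p);
    }
}
```

You would call colocate(...) on the JobGraph obtained from 
StreamGraph.getJobGraph, as Till describes, and then submit that modified 
JobGraph yourself, which I have not tried. Note that this pins subtasks 
together; it does not spread them across Task Managers.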
Hoping someone with actual knowledge of the task-to-slot allocation logic can 
chime in here with a solution :)

— Ken



> On Apr 18, 2018, at 9:10 AM, PedroMrChaves <pedro.mr.cha...@gmail.com> wrote:
> 
> Hello,
> 
> I have a job that has one async operational node (i.e. one that implements
> AsyncFunction). This operational node spawns multiple threads that perform
> heavy, CPU-bound tasks.
> 
> I have a Flink standalone cluster deployed on two machines, each with 32 cores
> and 128 GB of RAM. Each machine runs one Task Manager and one Job Manager. When
> I deploy the job, all of the subtasks from the async operational node end up
> on the same machine, which causes it to have a much higher CPU load than the
> other.
> 
> I've researched ways to overcome this issue, but I haven't found a solution
> to my problem. 
> Ideally, the subtasks would be evenly split across both machines. 
> 
> Can this problem be solved somehow? 
> 
> Regards,
> Pedro Chaves. 
> 
> 
> 
> -----
> Best Regards,
> Pedro Chaves
> --
> Sent from: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

--------------------------------------------
http://about.me/kkrugler
+1 530-210-6378
