[ 
https://issues.apache.org/jira/browse/BEAM-5713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16646815#comment-16646815
 ] 

Maximilian Michels commented on BEAM-5713:
------------------------------------------

After testing this with native Flink jobs, looking inside the Scheduler code, 
and checking out FLINK-1003, it is clear there is no round-robin task 
scheduling logic for distributing tasks across TaskManagers. The location of 
task slots is transparent for normal operators. If you have more task slots per 
TaskManagers than tasks, then very likely all tasks will end up on the same 
TaskManager.

If the cluster is sized to the Job, this shouldn't impact performance. Also, if 
all task slots are filled by multiple jobs, this should be fine. It is only 
problematic if the cluster is not fully utilized. Then, spreading the load 
across nodes should lead to a better performance.

> Flink portable runner schedules all tasks of streaming job on same task 
> manager
> -------------------------------------------------------------------------------
>
>                 Key: BEAM-5713
>                 URL: https://issues.apache.org/jira/browse/BEAM-5713
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>    Affects Versions: 2.8.0
>            Reporter: Thomas Weise
>            Assignee: Maximilian Michels
>            Priority: Major
>              Labels: portability, portability-flink
>         Attachments: Different SlotSharingGroup.png, With 
> RichParallelSourceFunction and parallelism 5.png
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The cluster has 9 task managers and 144 task slots total. A simple streaming 
> pipeline with parallelism of 8 will get all tasks scheduled on the same task 
> manager, causing the host to be fully booked and the remaining cluster idle.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to