[ https://issues.apache.org/jira/browse/FLINK-12122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16845062#comment-16845062 ]
Ben La Monica commented on FLINK-12122:
---------------------------------------

I'm running into this exact problem. I have a CoProcessFunction that holds a large amount of state, and its tasks are spread primarily across only 2 of my 6 task managers. This causes memory issues on those boxes, while 60GB of RAM sits unused on the third box.

||TaskManager||Num Slots Used for Memory Intensive Tasks||
|ip-10-255-58-174:39389|17|
|ip-10-255-58-174:45161|8|
|ip-10-255-58-179:33657|1|
|ip-10-255-58-179:38439|0|
|ip-10-255-58-44:40181|6|
|ip-10-255-58-44:45435|18|

And then I end up with resource usage in my YARN cluster that looks like this:

!image-2019-05-21-12-28-29-538.png!

Is there an estimate on when this problem will be fixed? I'm pretty much blocked unless I move to much larger servers, and that is wasteful of money :).

> Spread out tasks evenly across all available registered TaskManagers
> --------------------------------------------------------------------
>
>                 Key: FLINK-12122
>                 URL: https://issues.apache.org/jira/browse/FLINK-12122
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.6.4, 1.7.2, 1.8.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Major
>             Fix For: 1.7.3, 1.9.0, 1.8.1
>
>         Attachments: image-2019-05-21-12-28-29-538.png
>
>
> With Flip-6, we changed the default behaviour of how slots are assigned to
> {{TaskManagers}}. Instead of evenly spreading them out over all registered
> {{TaskManagers}}, we randomly pick slots from {{TaskManagers}} with a
> tendency to first fill up a TM before using another one. This is a regression
> wrt the pre Flip-6 code.
> I suggest to change the behaviour so that we try to evenly distribute slots
> across all available {{TaskManagers}} by considering how many of their slots
> are already allocated.
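
The suggestion in the issue description (choose slots by considering how many of a TaskManager's slots are already allocated) can be illustrated with a small sketch. This is not Flink's scheduler code. The class `TaskManagerSlots`, the methods `pickLeastUtilized` and `allocate`, the `tm-N` names, and the capacity of 18 slots per TaskManager are all assumptions made for the example. The 50 requested slots match the total of the table in the comment.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class EvenSlotSpreadSketch {

    /** Hypothetical stand-in for a registered TaskManager; not a Flink class. */
    public static final class TaskManagerSlots {
        private final String id;
        private final int totalSlots;
        private int allocatedSlots;

        public TaskManagerSlots(String id, int totalSlots) {
            this.id = id;
            this.totalSlots = totalSlots;
        }

        public String id() { return id; }

        public int allocatedSlots() { return allocatedSlots; }

        public boolean hasFreeSlot() { return allocatedSlots < totalSlots; }

        /** Fraction of this TaskManager's slots that are already allocated. */
        public double utilization() { return (double) allocatedSlots / totalSlots; }

        public void allocateSlot() {
            if (!hasFreeSlot()) {
                throw new IllegalStateException("No free slot on " + id);
            }
            allocatedSlots++;
        }
    }

    /** Picks the TaskManager with the lowest utilization that still has a free slot. */
    public static Optional<TaskManagerSlots> pickLeastUtilized(List<TaskManagerSlots> taskManagers) {
        return taskManagers.stream()
                .filter(TaskManagerSlots::hasFreeSlot)
                .min(Comparator.comparingDouble(TaskManagerSlots::utilization));
    }

    /** Places slot requests one at a time; returns how many could be placed. */
    public static int allocate(List<TaskManagerSlots> taskManagers, int requestedSlots) {
        int placed = 0;
        for (int i = 0; i < requestedSlots; i++) {
            Optional<TaskManagerSlots> target = pickLeastUtilized(taskManagers);
            if (!target.isPresent()) {
                break; // every TaskManager is full
            }
            target.get().allocateSlot();
            placed++;
        }
        return placed;
    }

    public static void main(String[] args) {
        // Assumed setup: 6 TaskManagers with 18 slots each.
        List<TaskManagerSlots> taskManagers = new ArrayList<>();
        for (int i = 1; i <= 6; i++) {
            taskManagers.add(new TaskManagerSlots("tm-" + i, 18));
        }
        int placed = allocate(taskManagers, 50);
        System.out.println("placed " + placed + " slots");
        for (TaskManagerSlots tm : taskManagers) {
            System.out.println(tm.id() + ": " + tm.allocatedSlots());
        }
    }
}
```

Under these assumptions every TaskManager ends up with 8 or 9 of the 50 slots, instead of the 0 to 18 spread shown in the table. Comparing by utilization rather than by raw allocated count is a design choice that also behaves sensibly if TaskManagers were to offer different numbers of slots.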