[ https://issues.apache.org/jira/browse/FLINK-33960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815475#comment-17815475 ]
Rui Fan edited comment on FLINK-33960 at 2/18/24 9:38 AM:
----------------------------------------------------------

Merged to:

master(1.20) via: 777e96f0fbd90e5a45366c0fd54bda85dc813b94 and 71336d6c874cd3e4da3b694e22df132dff51a6a8

1.19 via: 891a9755eba1ba1be7598c35e8b3060dbf28b482 and d039c6f8ace1861517064779999bfa95ac312218

1.18 via: ae8f8b1eb839dff388f3435a3d16e0db33e1e41e and 6cd71506fe474eca3b02fbb064912fba9f242b94


was (Author: fanrui):
    Merged to:

master(1.20) via: 777e96f0fbd90e5a45366c0fd54bda85dc813b94 and 71336d6c874cd3e4da3b694e22df132dff51a6a8

1.18 via: ae8f8b1eb839dff388f3435a3d16e0db33e1e41e and 6cd71506fe474eca3b02fbb064912fba9f242b94

> Adaptive Scheduler doesn't respect the lowerBound for tasks
> -----------------------------------------------------------
>
>                 Key: FLINK-33960
>                 URL: https://issues.apache.org/jira/browse/FLINK-33960
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.17.2, 1.18.1
>            Reporter: Rui Fan
>            Assignee: Rui Fan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.19.0, 1.18.2, 1.20.0
>
>
> The Adaptive Scheduler doesn't respect the lowerBound for tasks when a Flink
> job has more than one task.
>
> When using the adaptive scheduler and the rescale API, users set a lowerBound
> and an upperBound for each job vertex, and they expect the parallelism of
> every vertex to stay between its lowerBound and upperBound.
> But when a Flink job has more than one vertex and resources are insufficient,
> some of the lowerBounds are not respected.
> h2. How to reproduce this bug:
> A job has 2 job vertices, with the following resource requirements:
> * Vertex1: lowerBound=2, upperBound=2
> * Vertex2: lowerBound=8, upperBound=8
> They are in the same slotSharingGroup, and only 5 slots are available. This
> job shouldn't run, because the slots cannot meet the resource requirement of
> vertex2.
> But the job does run, and the parallelism of vertex2 is 5.
>
> h2. Why does this bug happen?
> Flink calculates the minimumRequiredSlots for each slot sharing group. It
> should be the {color:#FF0000}max{color} lowerBound across all vertices of the
> current slot sharing group, but it is using the {color:#FF0000}minimum{color}
> lowerBound.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
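
For illustration only, the root cause described in the issue can be sketched in a few lines of Java using the numbers from the reproduction steps. This is not the Flink code and not the merged fix: the class MinRequiredSlotsSketch, the VertexBounds holder, and all method names are hypothetical, and a plain list stands in for one slot sharing group.

```java
import java.util.Arrays;
import java.util.List;

public class MinRequiredSlotsSketch {

    /** Hypothetical holder for one job vertex's parallelism bounds. */
    public static final class VertexBounds {
        final String name;
        final int lowerBound;
        final int upperBound;

        public VertexBounds(String name, int lowerBound, int upperBound) {
            this.name = name;
            this.lowerBound = lowerBound;
            this.upperBound = upperBound;
        }
    }

    /** Behaviour described as the bug: the smallest lowerBound in the group. */
    public static int buggyMinimumRequiredSlots(List<VertexBounds> group) {
        return group.stream().mapToInt(v -> v.lowerBound).min().orElse(0);
    }

    /** Behaviour the issue asks for: the largest lowerBound in the group. */
    public static int expectedMinimumRequiredSlots(List<VertexBounds> group) {
        return group.stream().mapToInt(v -> v.lowerBound).max().orElse(0);
    }

    /** The group is schedulable only if enough slots are available. */
    public static boolean canSchedule(int requiredSlots, int availableSlots) {
        return availableSlots >= requiredSlots;
    }

    public static void main(String[] args) {
        // Both vertices share one slot sharing group, as in the reproduction steps.
        List<VertexBounds> group = Arrays.asList(
                new VertexBounds("Vertex1", 2, 2),
                new VertexBounds("Vertex2", 8, 8));
        int availableSlots = 5;

        int buggy = buggyMinimumRequiredSlots(group);       // 2
        int expected = expectedMinimumRequiredSlots(group); // 8

        // With min, 5 slots look sufficient, so the job starts.
        System.out.println("min-based: required=" + buggy
                + ", schedulable=" + canSchedule(buggy, availableSlots));
        // With max, 5 slots are not enough for Vertex2, so the job should wait.
        System.out.println("max-based: required=" + expected
                + ", schedulable=" + canSchedule(expected, availableSlots));
    }
}
```

With the min-based value, the group appears to need only 2 slots, which matches the reported behaviour of the job starting with 5 slots. With the max-based value, the group needs 8 slots and the job would not be scheduled until Vertex2's lowerBound can be met.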