eric666666 opened a new pull request, #28626:
URL: https://github.com/apache/flink/pull/28626
## What is the purpose of the change
When the cluster runs in application mode and
`JobManagerOptions#SCHEDULER_PREFER_MINIMAL_TASKMANAGERS_ENABLED` is enabled,
`DefaultSlotAssigner` tries to satisfy the requested slots from as few
TaskManagers as possible so that the remaining TaskManagers can be released.
`pickSlotsInMinimalTaskExecutors` walks the TaskManagers in descending order of
their free-slot count and adds **all** slots of each one until the request is
met. Since the last (boundary) TaskManager is always added in full via
`addAll`, the method can return **more** slots than requested. For example, with
`requestedGroups = 5` and two TaskManagers offering 4 and 3 free slots, it
returns 7 slots instead of 5.

The surplus does not corrupt the final assignment, because
`SimpleSlotMatchingResolver` only consumes the first `requestedGroups` slots.
However, the returned "minimal" set pulls in extra slots from the boundary
TaskManager that were never needed, which works against the method's own goal
of minimizing the TaskManagers involved and the fragmentation on the boundary
TaskManager. The loop also has no `hasNext()` guard, so it terminates only
because the caller has already checked that enough free slots exist.
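
Both observations above can be reproduced with a small stand-alone model. The sketch below is not Flink's code: slots are plain strings, the map key stands in for a TaskManager's `ResourceID`, and the names `OverPickSketch` and `pickWithAddAll` are hypothetical. It only mirrors the described shape of the loop (descending sort, `addAll` per TaskManager, no `hasNext()` check) and assumes Java 9+ for `List.of`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

/** Simplified, hypothetical model of the pre-fix selection loop; not Flink's actual code. */
final class OverPickSketch {

    /** Picks slots TaskManager by TaskManager, always taking a TaskManager's slots in full. */
    static List<String> pickWithAddAll(
            Map<String, List<String>> freeSlotsPerTm, int requestedGroups) {
        // Visit the TaskManagers with the most free slots first.
        Iterator<String> sortedTms =
                freeSlotsPerTm.keySet().stream()
                        .sorted(
                                Comparator.comparingInt(
                                                (String tm) -> freeSlotsPerTm.get(tm).size())
                                        .reversed())
                        .iterator();

        List<String> picked = new ArrayList<>();
        while (picked.size() < requestedGroups) {
            // No hasNext() check: termination depends on enough slots being available.
            picked.addAll(freeSlotsPerTm.get(sortedTms.next()));
        }
        return picked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> free = new LinkedHashMap<>();
        free.put("tm-a", List.of("a1", "a2", "a3", "a4"));
        free.put("tm-b", List.of("b1", "b2", "b3"));

        // 4 slots from tm-a are not enough, so all 3 slots of tm-b are added as well.
        System.out.println(pickWithAddAll(free, 5).size()); // prints 7

        // Without a pre-check, a request larger than the 7 free slots runs off the iterator.
        try {
            pickWithAddAll(free, 8);
        } catch (NoSuchElementException e) {
            System.out.println("iterator exhausted");
        }
    }
}
```

In this model the request for 5 slots yields 7, matching the example in the description, and the request for 8 fails only because nothing in the loop itself bounds the iteration.
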

This PR makes the method return exactly the requested number of slots and
removes the unguarded loop, while keeping the "prefer the fewest TaskManagers"
behavior unchanged.
## Brief change log
- Rewrite `DefaultSlotAssigner#pickSlotsInMinimalTaskExecutors` to flatten the
  sorted TaskManagers' slots into a single stream and keep only the first
  `requestedGroups` via `Stream#limit`. Because `limit` is lazy and
  short-circuiting, TaskManagers beyond the boundary one are never materialized,
  and the boundary TaskManager contributes exactly the slots still needed.
- Change `getSortedTaskExecutors` to return a `Stream<ResourceID>` instead of an
  `Iterator<ResourceID>` so its result can feed the `flatMap` pipeline directly.
- Strengthen `DefaultSlotAssignerTest`: assert the exact total slot count and
  the per-TaskManager slot contribution (instead of only which TaskManagers are
  involved), and add regression cases for single-TaskManager over-pick, the
  cumulative boundary trim, and the smallest TaskManager being left untouched.
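
The change log above can be illustrated with a minimal sketch of the `flatMap` plus `limit` pipeline and the kind of per-TaskManager check the tests describe. This is an assumption-laden model rather than the code in this PR: slots are strings named `<tm>/<index>`, the map key stands in for a `ResourceID`, and `ExactPickSketch`, `sortedTaskExecutors`, `pickExactly`, and `contributionPerTm` are hypothetical names. It assumes Java 9+ for `List.of` and `Map.of`.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Simplified, hypothetical model of the stream-based selection; not Flink's actual code. */
final class ExactPickSketch {

    /** Returns TaskManager ids as a Stream, ordered by free-slot count, largest first. */
    static Stream<String> sortedTaskExecutors(Map<String, List<String>> freeSlotsPerTm) {
        return freeSlotsPerTm.keySet().stream()
                .sorted(
                        Comparator.comparingInt((String tm) -> freeSlotsPerTm.get(tm).size())
                                .reversed());
    }

    /** Flattens the sorted TaskManagers' slots and keeps only the first requestedGroups. */
    static List<String> pickExactly(
            Map<String, List<String>> freeSlotsPerTm, int requestedGroups) {
        return sortedTaskExecutors(freeSlotsPerTm)
                .flatMap(tm -> freeSlotsPerTm.get(tm).stream())
                .limit(requestedGroups) // trims the boundary TaskManager's contribution
                .collect(Collectors.toList());
    }

    /** Counts picked slots per TaskManager, relying on the "<tm>/<index>" naming of this sketch. */
    static Map<String, Long> contributionPerTm(List<String> picked) {
        return picked.stream()
                .collect(
                        Collectors.groupingBy(
                                slot -> slot.substring(0, slot.indexOf('/')),
                                Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, List<String>> free = new LinkedHashMap<>();
        free.put("tm-small", List.of("tm-small/0", "tm-small/1"));
        free.put("tm-big", List.of("tm-big/0", "tm-big/1", "tm-big/2", "tm-big/3"));
        free.put("tm-mid", List.of("tm-mid/0", "tm-mid/1", "tm-mid/2"));

        List<String> picked = pickExactly(free, 5);
        System.out.println(picked.size()); // prints 5

        // tm-big contributes 4 slots, tm-mid only the 1 still needed, tm-small none.
        Map<String, Long> contribution = contributionPerTm(picked);
        System.out.println(contribution.equals(Map.of("tm-big", 4L, "tm-mid", 1L))); // prints true
    }
}
```

Unlike an iterator loop, this pipeline simply returns fewer elements when the request exceeds the available slots, so it needs no explicit termination guard.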