Yangze Guo created FLINK-20864:
----------------------------------
Summary: Apply exact matching rules in fulfilling resource
requirement with slot resource
Key: FLINK-20864
URL: https://issues.apache.org/jira/browse/FLINK-20864
Project: Flink
Issue Type: Sub-task
Components: Runtime / Coordination
Reporter: Yangze Guo
Fix For: 1.13.0
Attachments: 屏幕快照 2021-01-06 下午5.34.17.png
Currently, ResourceProfile::isMatching uses the following rules (hereinafter,
*loose matching*) to decide whether a slot resource can be used to fulfill the
given resource requirement, in both SlotManager and SlotPool:
* An unspecified requirement (ResourceProfile::UNKNOWN) can be fulfilled by
any resource.
* A specified requirement can be fulfilled by any resource that is greater
than or equal to itself. Note that this rule currently has no effect, since
no requirements are specified at the moment.
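The two loose matching rules above can be sketched as follows. This is a simplified illustration, not the actual ResourceProfile::isMatching code: it assumes a resource is a single integer amount, and the class name, method name, and the UNKNOWN marker value are hypothetical.

```java
public class LooseMatchingSketch {
    /** Hypothetical marker for an unspecified requirement (stands in for ResourceProfile::UNKNOWN). */
    public static final int UNKNOWN = -1;

    /** Returns true if the slot resource can fulfill the requirement under the loose rules. */
    public static boolean isLooselyMatching(int slotResource, int requirement) {
        // Rule 1: an unspecified requirement is fulfilled by any resource.
        if (requirement == UNKNOWN) {
            return true;
        }
        // Rule 2: a specified requirement is fulfilled by any resource >= itself.
        return slotResource >= requirement;
    }

    public static void main(String[] args) {
        System.out.println(isLooselyMatching(4, UNKNOWN)); // true
        System.out.println(isLooselyMatching(4, 2));       // true
        System.out.println(isLooselyMatching(1, 2));       // false
    }
}
```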
The loose matching rules were designed before dynamic slot allocation was
introduced. Under the assumption that the resources of slots are decided when
the TM is started and cannot be changed, the loose matching rules have the
following advantages.
* For standalone deployments, they allow slot requests to be fulfilled even
though the slots of pre-launched TMs rarely have exactly the required resources.
* For active resource manager deployments, they increase the chance of slots
being reused, thus reducing the cost of starting new TMs for various resource
requirements.
With dynamic slot allocation introduced in FLIP-56, the benefits of the loose
matching rules have been significantly reduced. Since slots can be created
dynamically after the TMs are started, with any desired resources as long as
they are available, the only benefit the loose matching rules retain is
avoiding the allocation of new slots when existing slots can be reused on the
JM side. This benefit is insignificant, since no new TMs need to be started.
On the other hand, the loose matching rules also introduce some problems.
* Reusing larger slots for fulfilling smaller requirements can harm resource
utilization.
* It’s not straightforward to always find a feasible matching solution
(assuming there is one) when matching a set of requirements against a set of
slots, as happens in job failovers or with the declarative slot allocation
protocol.
!屏幕快照 2021-01-06 下午5.34.17.png!
The figure above demonstrates how the loose matching rules can fail to find a
feasible matching solution. Assume there are two resource requirements, A and
B, and two slots, X and Y. The number below each requirement/slot represents
the amount of resource. A can be fulfilled by either X or Y, while B can only
be fulfilled by Y. A feasible matching is shown on the left, where both
requirements are fulfilled. However, the loose matching rules can also produce
the matching shown on the right, where A is fulfilled by Y, leaving B and X
unmatched.
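The scenario can be reproduced with a small first-fit sketch. The amounts used here (A=1, B=2, X=1, Y=2) are illustrative assumptions chosen to satisfy the relations described above, not values read from the figure, and the class and method names are hypothetical. Under the loose rule, the outcome depends on the order in which slots are considered.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LooseGreedySketch {
    /** First-fit: each requirement takes the first free slot with resource >= requirement. */
    public static Map<String, String> greedyLooseMatch(
            Map<String, Integer> requirements, Map<String, Integer> slots) {
        Map<String, String> assignment = new LinkedHashMap<>();
        List<String> freeSlots = new ArrayList<>(slots.keySet());
        for (Map.Entry<String, Integer> req : requirements.entrySet()) {
            Iterator<String> it = freeSlots.iterator();
            while (it.hasNext()) {
                String slot = it.next();
                if (slots.get(slot) >= req.getValue()) {
                    assignment.put(req.getKey(), slot);
                    it.remove(); // the slot is now occupied
                    break;
                }
            }
        }
        return assignment;
    }

    /** Builds a two-entry map that preserves insertion order. */
    public static Map<String, Integer> ordered(String k1, int v1, String k2, int v2) {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put(k1, v1);
        m.put(k2, v2);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Integer> requirements = ordered("A", 1, "B", 2);
        // Slots considered as X, Y: both requirements are fulfilled.
        System.out.println(greedyLooseMatch(requirements, ordered("X", 1, "Y", 2))); // {A=X, B=Y}
        // Slots considered as Y, X: A takes Y, so B and X stay unmatched.
        System.out.println(greedyLooseMatch(requirements, ordered("Y", 2, "X", 1))); // {A=Y}
    }
}
```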
Given the reduced benefits and the problems they introduce, we propose to
replace the loose matching rules with the following *exact matching* rules.
* An unspecified requirement (ResourceProfile::UNKNOWN) can only be fulfilled
by a TM's default slot resource.
* A specified requirement can only be fulfilled by a resource that is equal to
itself.
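A minimal sketch of the exact matching rules, under the same simplification that a resource is a single integer amount. The class name, method name, UNKNOWN marker value, and the defaultSlotResource parameter are hypothetical and only stand in for the real ResourceProfile comparison and the TM's default slot resource.

```java
public class ExactMatchingSketch {
    /** Hypothetical marker for an unspecified requirement (stands in for ResourceProfile::UNKNOWN). */
    public static final int UNKNOWN = -1;

    /**
     * Returns true if the slot resource can fulfill the requirement under the exact rules.
     * defaultSlotResource is the (assumed) default slot resource of the TM owning the slot.
     */
    public static boolean isExactlyMatching(
            int slotResource, int requirement, int defaultSlotResource) {
        // Rule 1: an unspecified requirement is only fulfilled by the TM's default slot resource.
        if (requirement == UNKNOWN) {
            return slotResource == defaultSlotResource;
        }
        // Rule 2: a specified requirement is only fulfilled by an equal resource.
        return slotResource == requirement;
    }

    public static void main(String[] args) {
        int defaultSlot = 4;
        System.out.println(isExactlyMatching(1, 1, defaultSlot));       // true
        System.out.println(isExactlyMatching(2, 1, defaultSlot));       // false
        System.out.println(isExactlyMatching(4, UNKNOWN, defaultSlot)); // true
        System.out.println(isExactlyMatching(2, UNKNOWN, defaultSlot)); // false
    }
}
```

Because a requirement then only competes for slots of its own size, a larger slot can no longer be taken by a smaller requirement.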
--
This message was sent by Atlassian Jira
(v8.3.4#803005)