Yangze Guo created FLINK-20864:
----------------------------------

             Summary: Apply exact matching rules in fulfilling resource 
requirement with slot resource
                 Key: FLINK-20864
                 URL: https://issues.apache.org/jira/browse/FLINK-20864
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
            Reporter: Yangze Guo
             Fix For: 1.13.0
         Attachments: 屏幕快照 2021-01-06 下午5.34.17.png

Currently, ResourceProfile::isMatching uses the following rules (hereinafter, 
*loose matching*) to decide whether a slot resource can be used to fulfill the 
given resource requirement, in both SlotManager and SlotPool:
 * An unspecified requirement (ResourceProfile::UNKNOWN) can be fulfilled by 
any resource.
 * A specified requirement can be fulfilled by any resource that is greater 
than or equal to itself. Note that this rule currently has no effect, since no 
requirement is specified at the moment.
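
The two loose matching rules above can be sketched as follows. This is only an 
illustration, not the actual ResourceProfile::isMatching implementation: the 
Profile class, its two fields (cpuCores, memoryMb) and the method name 
isLooseMatching are hypothetical, and only two resource dimensions are assumed.

```java
class LooseMatchingSketch {

    /** Hypothetical stand-in for ResourceProfile; models only two dimensions. */
    static final class Profile {
        /** Sentinel for an unspecified requirement, mirroring ResourceProfile::UNKNOWN. */
        static final Profile UNKNOWN = new Profile(-1.0, -1);

        final double cpuCores;
        final int memoryMb;

        Profile(double cpuCores, int memoryMb) {
            this.cpuCores = cpuCores;
            this.memoryMb = memoryMb;
        }
    }

    /** Loose matching: can the given slot resource fulfill the requirement? */
    static boolean isLooseMatching(Profile slot, Profile requirement) {
        // Rule 1: an unspecified requirement is fulfilled by any resource.
        if (requirement == Profile.UNKNOWN) {
            return true;
        }
        // Rule 2: a specified requirement is fulfilled by any resource that is
        // greater than or equal to it in every dimension.
        return slot.cpuCores >= requirement.cpuCores
                && slot.memoryMb >= requirement.memoryMb;
    }

    public static void main(String[] args) {
        Profile small = new Profile(1.0, 1024);
        Profile large = new Profile(2.0, 2048);
        System.out.println(isLooseMatching(large, small));           // larger slot is accepted
        System.out.println(isLooseMatching(small, large));           // smaller slot is rejected
        System.out.println(isLooseMatching(small, Profile.UNKNOWN)); // anything fulfills UNKNOWN
    }
}
```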

The loose matching rules were designed before dynamic slot allocation was 
introduced. Under the assumption that the resources of slots are decided when 
the TM is started and cannot be changed, the loose matching rules have the 
following advantages.
 * For standalone deployments, they allow slot requests to be fulfilled even 
though the slots of pre-launched TMs rarely have exactly the required resources.
 * For active resource manager deployments, they increase the chance of slots 
being reused, thus reducing the cost of starting new TMs for various resource 
requirements.

With dynamic slot allocation introduced in FLIP-56, the benefits of the loose 
matching rules have been significantly reduced. As slots can be dynamically 
created after the TMs are started, with any desired resources as long as they 
are available, the only benefit the loose matching rules retain is avoiding 
the allocation of new slots when existing slots can be reused on the JM side. 
This benefit is insignificant, since there’s no need to start new TMs.

On the other hand, the loose matching rules also introduce some problems.
 * Reusing larger slots for fulfilling smaller requirements can harm resource 
utilization.
 * When matching a set of requirements against a set of slots, as happens in 
job failovers or under the declarative slot allocation protocol, it’s not 
straightforward to always find a feasible matching solution (assuming there is 
one).

!屏幕快照 2021-01-06 下午5.34.17.png!

The above figure demonstrates how the loose matching rules can fail to find a 
feasible matching solution. Assume there are two resource requirements, A and 
B, and two slots, X and Y. The number below each requirement/slot represents 
the amount of resource. A can be fulfilled by either X or Y, while B can only 
be fulfilled by Y. A feasible matching is shown on the left, where both 
requirements are fulfilled. However, the loose matching rules can also result 
in another matching, shown on the right, where A is fulfilled by Y, leaving B 
and X unmatched.
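
The scenario can be reproduced with a small sketch. The resource amounts 
(A = 1, B = 2, X = 1, Y = 2) are assumed values chosen to satisfy the 
description above, not read from the figure, and the greedy first-fit 
assignment is a hypothetical stand-in for whatever order slots happen to be 
examined in. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

class MatchingOrderSketch {

    /** Loose rule: the slot amount must be greater than or equal to the requirement. */
    static final BiPredicate<Integer, Integer> LOOSE =
            (slot, req) -> slot.intValue() >= req.intValue();

    /** Exact rule: the slot amount must equal the requirement. */
    static final BiPredicate<Integer, Integer> EXACT =
            (slot, req) -> slot.intValue() == req.intValue();

    /** Builds a two-entry map that preserves insertion order. */
    static Map<String, Integer> ordered(String k1, int v1, String k2, int v2) {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put(k1, v1);
        map.put(k2, v2);
        return map;
    }

    /** Greedy first-fit: each requirement takes the first free slot the rule accepts. */
    static Map<String, String> assign(
            Map<String, Integer> requirements,
            Map<String, Integer> slots,
            BiPredicate<Integer, Integer> rule) {
        Map<String, String> result = new LinkedHashMap<>();
        List<String> freeSlots = new ArrayList<>(slots.keySet());
        for (Map.Entry<String, Integer> req : requirements.entrySet()) {
            Iterator<String> it = freeSlots.iterator();
            while (it.hasNext()) {
                String slotName = it.next();
                if (rule.test(slots.get(slotName), req.getValue())) {
                    result.put(req.getKey(), slotName);
                    it.remove(); // the slot is no longer free
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> requirements = ordered("A", 1, "B", 2);

        // Loose matching, slots examined as X then Y: both requirements are fulfilled.
        System.out.println(assign(requirements, ordered("X", 1, "Y", 2), LOOSE)); // {A=X, B=Y}

        // Loose matching, slots examined as Y then X: A takes Y, B stays unmatched.
        System.out.println(assign(requirements, ordered("Y", 2, "X", 1), LOOSE)); // {A=Y}

        // Exact matching: the result does not depend on the order.
        System.out.println(assign(requirements, ordered("Y", 2, "X", 1), EXACT)); // {A=X, B=Y}
    }
}
```

With exact matching each requirement has only one kind of acceptable slot, so 
a simple greedy pass finds a feasible matching whenever one exists.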

Given the reduced benefits of the loose matching rules and the problems they 
introduce, we propose to replace them with the following *exact matching* 
rules.
 * An unspecified requirement (ResourceProfile::UNKNOWN) can only be fulfilled 
by a TM's default slot resource.
 * A specified requirement can only be fulfilled by a resource that is equal to 
itself.
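
A minimal sketch of the proposed exact matching rules is shown below. It is an 
illustration under stated assumptions, not the proposed patch: the Profile 
class, the method name isExactMatching and the tmDefaultSlot parameter are 
hypothetical, and a TM's default slot resource is assumed to be available to 
the caller as a plain value.

```java
import java.util.Objects;

class ExactMatchingSketch {

    /** Hypothetical stand-in for ResourceProfile; models only two dimensions. */
    static final class Profile {
        /** Sentinel for an unspecified requirement, mirroring ResourceProfile::UNKNOWN. */
        static final Profile UNKNOWN = new Profile(-1.0, -1);

        final double cpuCores;
        final int memoryMb;

        Profile(double cpuCores, int memoryMb) {
            this.cpuCores = cpuCores;
            this.memoryMb = memoryMb;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Profile)) {
                return false;
            }
            Profile other = (Profile) o;
            return Double.compare(cpuCores, other.cpuCores) == 0
                    && memoryMb == other.memoryMb;
        }

        @Override
        public int hashCode() {
            return Objects.hash(cpuCores, memoryMb);
        }
    }

    /** Exact matching: can the given slot resource fulfill the requirement? */
    static boolean isExactMatching(Profile slot, Profile requirement, Profile tmDefaultSlot) {
        // Rule 1: an unspecified requirement is only fulfilled by the TM's default slot resource.
        if (requirement.equals(Profile.UNKNOWN)) {
            return slot.equals(tmDefaultSlot);
        }
        // Rule 2: a specified requirement is only fulfilled by an equal resource.
        return slot.equals(requirement);
    }

    public static void main(String[] args) {
        Profile defaultSlot = new Profile(1.0, 1024);
        Profile large = new Profile(2.0, 2048);
        System.out.println(isExactMatching(defaultSlot, Profile.UNKNOWN, defaultSlot)); // default slot fulfills UNKNOWN
        System.out.println(isExactMatching(large, Profile.UNKNOWN, defaultSlot));       // non-default slot does not
        System.out.println(isExactMatching(large, defaultSlot, defaultSlot));           // larger slot is no longer accepted
    }
}
```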



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
