[ 
https://issues.apache.org/jira/browse/MESOS-9324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16731423#comment-16731423
 ] 

Meng Zhu edited comment on MESOS-9324 at 1/9/19 2:35 AM:
---------------------------------------------------------

We have landed MESOS-9516. It is essentially option 3 of the "Near-term 
Mitigations" listed in the description.


was (Author: mzhu):
https://reviews.apache.org/r/69603/

> Resource fragmentation: frameworks may be starved of port resources in the 
> presence of a large number of frameworks with quota.
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-9324
>                 URL: https://issues.apache.org/jira/browse/MESOS-9324
>             Project: Mesos
>          Issue Type: Bug
>          Components: allocation
>            Reporter: Meng Zhu
>            Assignee: Meng Zhu
>            Priority: Major
>              Labels: mesosphere
>
> In our environment, where there are 1.5k frameworks and quota is heavily 
> utilized, we experience a severe resource fragmentation issue. 
> Specifically, we observed a large number of port-less offers circulating in 
> the cluster. As a result, frameworks that need port resources are not able 
> to launch tasks even if their roles have quota (because currently we can 
> only set quota for scalar resources, not port range resources).
> Most of the 1.5k frameworks do not suppress today, and we believe the 
> situation will significantly improve once they do. Still, I think there are 
> some improvements the Mesos allocator can make to help.
> h3. How resources become fragmented
> These port-less offers stem from quota chopping. Specifically, when 
> chopping an agent to satisfy a role’s quota, we also hand out resources 
> that this role does not have quota for (as long as doing so does not break 
> other roles’ quota). These “extra resources” include ALL the remaining port 
> resources on the agent. After this offer, the agent is left with no port 
> resources even though it still has CPUs and other resources. Later, these 
> remaining resources may be offered to other frameworks, but they are 
> useless without ports. Now we have some “bad offers” in the cluster.
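
The chopping behavior described above can be sketched with a toy model. This is not the Mesos allocator; `Resources` and `chop_for_quota` are hypothetical names, and the sketch assumes scalar resources are capped by the role's unsatisfied quota while the agent's ports are handed out whole:

```python
from dataclasses import dataclass, field


@dataclass
class Resources:
    cpus: float = 0.0
    mem_mb: float = 0.0
    # Inclusive (begin, end) port ranges.
    ports: list = field(default_factory=list)


def chop_for_quota(agent, cpus_needed, mem_needed):
    """Toy quota chopping (an assumption, not Mesos code).

    Scalars are capped by the role's unsatisfied quota. Ports cannot have
    quota, so all of them go into the offer.
    """
    offer = Resources(
        cpus=min(agent.cpus, cpus_needed),
        mem_mb=min(agent.mem_mb, mem_needed),
        ports=list(agent.ports),
    )
    remainder = Resources(
        cpus=agent.cpus - offer.cpus,
        mem_mb=agent.mem_mb - offer.mem_mb,
        ports=[],  # nothing left: this is the future "bad offer"
    )
    return offer, remainder


agent = Resources(cpus=16.0, mem_mb=65536.0, ports=[(31000, 32000)])
offer, remainder = chop_for_quota(agent, cpus_needed=4.0, mem_needed=8192.0)
print(offer)
print(remainder)
```

In this model the remainder still holds 12 CPUs but no ports, so any framework that needs a port has to decline it.
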
> h3. How resource fragmentation is prolonged
> A resource offer, once declined (e.g. due to no ports), is recovered by the 
> allocator and offered to other frameworks again. Before this happens, the 
> declined resources may merge with either the remaining resources or other 
> declined resources on the same agent. However, it is not uncommon for the 
> declined offer to be handed out again *as-is*. This is especially likely if 
> the allocator makes offers faster than frameworks respond to them. As a 
> result, we observe bad offers circulating across different frameworks. 
> These bad offers exist for a long time before being consolidated again. For 
> how long? *The longevity of a bad offer is roughly proportional to the 
> number of active frameworks*. In the worst case, only once all the active 
> frameworks have declined the bad offer (hopefully for a long duration) does 
> it have nowhere to go and finally start to merge with other resources on 
> that agent.
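
The proportionality claim above can be illustrated with a deliberately simplified model. The function name and its assumptions are hypothetical: one framework sees the bad offer per allocation cycle, every framework declines it with a filter that outlasts the run, and suppressed frameworks are never offered it. Real allocator behavior is more involved.

```python
def cycles_until_consolidated(num_frameworks, num_suppressed=0):
    """Count the cycles a port-less offer circulates in the toy model.

    The offer keeps moving until every non-suppressed framework has
    declined it once; only then can it merge with other resources.
    """
    candidates = list(range(max(num_frameworks - num_suppressed, 0)))
    declined = set()
    cycles = 0
    while len(declined) < len(candidates):
        # Hand the unchanged offer to the next framework, which declines it.
        framework = candidates[cycles % len(candidates)]
        declined.add(framework)
        cycles += 1
    return cycles


for n in (10, 100, 1500):
    print(n, cycles_until_consolidated(n))
print("1400 suppressed:", cycles_until_consolidated(1500, num_suppressed=1400))
```

Under these assumptions the circulation time grows linearly with the number of frameworks that still receive offers, which is why suppression shortens it.
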
> Note that allocator performance has greatly improved in the past several 
> months, so the scenario described here could become increasingly common. 
> Also, as we introduce quota limits and hierarchical quota, there will be much 
> more agent chopping, making resource fragmentation even worse.
> h3. Near-term Mitigations
> As mentioned above, the longevity of a bad offer is proportional to the 
> number of active frameworks. Thus framework suppression will certainly 
> help. In 
> addition, from the Mesos side, a couple of mitigation measures are worth 
> considering (other than the long-term optimistic allocation strategy):
> 1. Add a periodic defragmentation interval to the allocator. For 
> example, every minute or every dozen allocation cycles or so, we would pause 
> allocation, rescind all the offers and start allocating again. This 
> essentially eliminates all the circulating bad offers by giving them a chance 
> to be consolidated. Think of this as a periodic “reboot” of the allocator.
> 2. Consider chopping non-quota resources as well. Right now, for resources 
> such as ports (or any other resources that the role does not have quota for), 
> all are allocated in a single offer. We could choose to chop these non-quota 
> resources as well. For example, port resources can be distributed 
> proportionally to allocated CPU resources.
> 3. Provide support for specifying port quantities. With this, we can utilize 
> the existing quota or `min_allocatable_resources` APIs to guarantee a certain 
> number of port resources.
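
Option 3 above rests on turning port ranges into a countable quantity. A minimal sketch of that idea follows. The helper names and the dictionary layout are assumptions for illustration, not the Mesos API or the format of `min_allocatable_resources`:

```python
def port_quantity(ranges):
    """Number of ports in a list of inclusive (begin, end) ranges."""
    return sum(end - begin + 1 for begin, end in ranges)


def meets_minimum(offer, minimum):
    """True if the offer holds at least `minimum` of every named resource.

    `offer` maps a resource name to a scalar, or to a list of ranges for
    ports. `minimum` maps a resource name to a required quantity.
    """
    for name, required in minimum.items():
        value = offer.get(name, 0)
        quantity = port_quantity(value) if isinstance(value, list) else value
        if quantity < required:
            return False
    return True


minimum = {"cpus": 0.01, "ports": 1}
port_less = {"cpus": 12.0, "mem": 57344.0, "ports": []}
usable = {"cpus": 4.0, "mem": 8192.0, "ports": [(31000, 31009), (31500, 31500)]}

print(port_quantity(usable["ports"]))
print(meets_minimum(port_less, minimum))
print(meets_minimum(usable, minimum))
```

With a threshold expressed this way, a port-less remainder fails the check and would not be offered on its own.
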



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
