[
https://issues.apache.org/jira/browse/MESOS-9324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16731423#comment-16731423
]
Meng Zhu edited comment on MESOS-9324 at 1/9/19 2:35 AM:
---------------------------------------------------------
We have landed MESOS-9516. It is essentially option 3 of the "Near-term
Mitigations" listed in the description.
was (Author: mzhu):
https://reviews.apache.org/r/69603/
> Resource fragmentation: frameworks may be starved of port resources in the
> presence of a large number of frameworks with quota.
> --------------------------------------------------------------------------------------------------------------------------
>
> Key: MESOS-9324
> URL: https://issues.apache.org/jira/browse/MESOS-9324
> Project: Mesos
> Issue Type: Bug
> Components: allocation
> Reporter: Meng Zhu
> Assignee: Meng Zhu
> Priority: Major
> Labels: mesosphere
>
> In our environment, where there are 1.5k frameworks and quota is heavily
> utilized, we experience a severe resource fragmentation issue.
> Specifically, we observed a large number of port-less offers circulating in
> the cluster. Thus frameworks that need port resources are not able to launch
> tasks even if their roles have quota (because currently, we can only set
> quota for scalar resources, not port range resources).
> Most of the 1.5k frameworks do not suppress offers today, and we believe the
> situation will improve significantly once they do. Still, I think there are
> some improvements the Mesos allocator can make to help.
> h3. How resources become fragmented
> These port-less offers stem from quota chopping. Specifically, when chopping
> an agent to satisfy a role’s quota, we also hand out resources that the role
> does not have quota for (as long as doing so does not break other roles’
> quota). These “extra resources” include ALL the remaining port resources on
> the agent. After this offer, the agent is left with no port resources even
> though it still has CPUs, etc. Later, the remaining resources may be offered
> to other frameworks, but they are useless because they come with no ports.
> Now we have some “bad offers” in the cluster.
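
The chopping behaviour described above can be illustrated with a toy model. This is only a sketch and not the Mesos allocator: the `Agent` class, the `chop_for_quota` function and all the numbers are hypothetical, chosen to show how the first quota-driven offer takes every port and leaves later offers from the same agent port-less.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical agent with scalar CPUs and a list of port ranges."""
    cpus: float
    ports: list  # list of (begin, end) inclusive port ranges


def chop_for_quota(agent, quota_cpus):
    """Toy model of quota chopping.

    Offers at most `quota_cpus` CPUs, plus ALL remaining ports, because
    ports are a non-quota resource in this model.
    """
    offered_cpus = min(agent.cpus, quota_cpus)
    offer = {"cpus": offered_cpus, "ports": list(agent.ports)}
    agent.cpus -= offered_cpus
    agent.ports = []  # nothing is held back for later offers
    return offer


agent = Agent(cpus=8.0, ports=[(31000, 32000)])
first = chop_for_quota(agent, quota_cpus=2.0)   # gets CPUs and every port
second = chop_for_quota(agent, quota_cpus=2.0)  # gets CPUs but no ports
```

In this model `second` is a "bad offer": it carries CPUs but an empty port list, so a framework that needs ports can only decline it.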
> h3. How resource fragmentation is prolonged
> A resource offer, once it is declined (e.g. due to no ports), is recovered by
> the allocator and offered to other frameworks again. Before this happens, the
> offer may be able to merge with either the remaining resources or other
> declined resources on the same agent. However, it is not uncommon for the
> declined offer to be handed out again *as-is*. This is especially probable if
> the allocator makes offers faster than frameworks respond to them. As a
> result, we observe bad offers circulating across different frameworks. These
> bad offers exist for a long time before being consolidated again. For how
> long? *The longevity of a bad offer is roughly proportional to the number of
> active frameworks*. In the worst case, only once all the active frameworks
> have declined the bad offer (hopefully for a long duration) does it have
> nowhere to go and finally start to merge with other resources on that agent.
> Note that allocator performance has greatly improved in the past several
> months, so the scenario described here could become increasingly common.
> Also, as we introduce quota limits and hierarchical quota, there will be much
> more agent chopping, making resource fragmentation even worse.
> h3. Near-term Mitigations
> As mentioned above, the longevity of a bad offer is proportional to the
> number of active frameworks. Thus framework suppression will certainly help.
> In addition, on the Mesos side, a couple of mitigation measures are worth
> considering (other than the long-term optimistic allocation strategy):
> 1. Add a periodic defragmentation interval to the allocator. For example,
> every minute or every dozen allocation cycles or so, we pause allocation,
> rescind all offers and start allocating again. This essentially eliminates
> all the circulating bad offers by giving them a chance to be consolidated.
> Think of this as a periodic “reboot” of the allocator.
> 2. Consider chopping non-quota resources as well. Right now, resources such
> as ports (or any other resources that the role does not have quota for) are
> all allocated in a single offer. We could choose to chop these non-quota
> resources as well. For example, port resources could be distributed in
> proportion to the allocated CPU resources.
> 3. Provide support for specifying port quantities. With this, we can utilize
> the existing quota or `min_allocatable_resources` APIs to guarantee a certain
> number of port resources.
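
Mitigation 2 above (distributing ports in proportion to allocated CPUs) could look roughly like the following. This is a sketch under stated assumptions rather than anything taken from the Mesos code base: `port_count`, `take_ports` and `proportional_ports` are hypothetical helpers, ports are modelled as inclusive `(begin, end)` ranges, and the share is simply rounded down.

```python
def port_count(ranges):
    """Number of ports in a list of inclusive (begin, end) ranges."""
    return sum(end - begin + 1 for begin, end in ranges)


def take_ports(ranges, n):
    """Take the first n ports from `ranges`; return (taken, remaining)."""
    taken, remaining = [], []
    for begin, end in ranges:
        size = end - begin + 1
        if n <= 0:
            remaining.append((begin, end))
        elif size <= n:
            taken.append((begin, end))
            n -= size
        else:
            # Split this range: the first n ports are taken, the rest stay.
            taken.append((begin, begin + n - 1))
            remaining.append((begin + n, end))
            n = 0
    return taken, remaining


def proportional_ports(ranges, total_cpus, offered_cpus):
    """Chop ports in the same proportion as the CPUs being offered."""
    share = int(port_count(ranges) * offered_cpus / total_cpus)
    return take_ports(ranges, share)


# Offering 2 of 8 CPUs takes a quarter of the agent's 1000 ports.
taken, remaining = proportional_ports(
    [(31000, 31999)], total_cpus=8.0, offered_cpus=2.0)
```

With this split, the resources left on the agent still carry ports, so later offers from the same agent remain usable by frameworks that need them.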