[ 
https://issues.apache.org/jira/browse/MESOS-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956346#comment-16956346
 ] 

Andrei Sekretenko commented on MESOS-10015:
-------------------------------------------

https://issues.apache.org/jira/browse/MESOS-9942 and related work will address the
`total number of frameworks` factor.

To fix the quadratic growth in the number of reservations, we can avoid using
`Resources::operator+=`, `Resources::operator-=` and `Resources::contains()`
when re-adding a slave to a framework sorter.
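
To illustrate where the quadratic term comes from, here is a minimal toy model,
not the actual Mesos code. `ToyResource`, `ToyResources`, `toyAddable()` and
`callsToReAdd()` are hypothetical names invented for this sketch; the only
assumption carried over from the issue is that adding one object to a container
has to check it against the objects already stored there.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a single resource object; NOT the Mesos `Resource` protobuf.
struct ToyResource {
  std::string key;  // hypothetical identity, e.g. role plus reservation labels
  double amount;
};

// Counts merge checks so that the growth can be observed.
static std::size_t addableCalls = 0;

// Hypothetical merge check: two objects can be merged iff their keys match.
bool toyAddable(const ToyResource& a, const ToyResource& b) {
  ++addableCalls;
  return a.key == b.key;
}

// Toy container that scans its items for a mergeable entry on every addition.
struct ToyResources {
  std::vector<ToyResource> items;

  ToyResources& operator+=(const ToyResource& r) {
    for (ToyResource& existing : items) {
      if (toyAddable(existing, r)) {
        existing.amount += r.amount;
        return *this;
      }
    }
    items.push_back(r);  // no mergeable entry found: store as a new object
    return *this;
  }
};

// Adds n mutually non-mergeable objects one at a time and returns the
// number of toyAddable() calls this required.
std::size_t callsToReAdd(std::size_t n) {
  addableCalls = 0;
  ToyResources total;
  for (std::size_t i = 0; i < n; ++i) {
    total += ToyResource{"reservation-" + std::to_string(i), 1.0};
  }
  return addableCalls;
}
```

In this model the i-th addition scans the i objects added before it, so
`callsToReAdd(n)` performs n * (n - 1) / 2 checks: 6 for n = 4 and 44850 for
n = 300. Any approach that does not go through such per-object operators when
re-adding a slave avoids this term.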


> HierarchicalAllocatorProcess::updateAvailable() can stall the allocator with 
> a huge number of reservations on an agent.
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-10015
>                 URL: https://issues.apache.org/jira/browse/MESOS-10015
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 1.5.3, 1.6.2, 1.7.2, 1.8.1, 1.9.0
>            Reporter: Andrei Sekretenko
>            Assignee: Andrei Sekretenko
>            Priority: Critical
>              Labels: resource-management
>
> Currently, a call to updateAvailable() with a single-object Resources for a
> single framework on a single slave requires `(total number of frameworks) *
> (number of resource objects on this slave)^2` calls to `Resource::addable()`.
> In a cluster with a large number of frameworks, this results in severe
> degradation of allocator performance when a burst of RESERVE/UNRESERVE
> operations occurs for an agent with hundreds of unique resources.
> On our testing cluster we observed task scheduling delays of up to 30
> minutes due to the allocator being occupied with processing UNRESERVE operations.



