[jira] [Commented] (MESOS-9806) Address allocator performance regression due to the addition of quota limits.

Meng Zhu (Jira) Fri, 23 Aug 2019 14:57:08 -0700


    [ 
https://issues.apache.org/jira/browse/MESOS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16914672#comment-16914672
 ]


Meng Zhu commented on MESOS-9806:
---------------------------------

Small vector optimization for ResourceQuantities, ResourceLimits and Resources:

{noformat}
commit 73033130de7872c6f240b9b05dced039d7666138
Author: Meng Zhu
Date:   Thu Aug 22 17:19:30 2019 -0700

    Used boost `small_vector` in `Resources`.

    Master + previous patch:
    *HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
    Made 3500 allocations in 16.307044003secs
    Made 0 allocation in 14.948262599secs

    Master + previous patch + this patch:
    *HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
    Made 3500 allocations in 15.385276405secs
    Made 0 allocation in 13.718502414secs

    Review: https://reviews.apache.org/r/71357

commit 95201cbe4dc87eae2fde5754d16f5effbb6c1974
Author: Meng Zhu
Date:   Thu Aug 22 16:55:34 2019 -0700

    Used boost `small_vector` in Resource Quantities and Limits.

    Master + previous patch:
    *HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
    Made 3500 allocations in 16.831380548secs
    Made 0 allocation in 15.102885644secs

    Master + previous patch + this patch:
    *HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
    Made 3500 allocations in 16.307044003secs
    Made 0 allocation in 14.948262599secs

    Review: https://reviews.apache.org/r/71355

commit 25070f232a9bb97d1b78f8a7e5b774bbd50654f9
Author: Meng Zhu
Date:   Thu Aug 22 16:54:42 2019 -0700

    Updated the boost library.

    This update includes adding `container/small_vector.hpp`.

    Review: https://reviews.apache.org/r/71356
{noformat}
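
To illustrate why a small vector can help here, below is a minimal, self-contained sketch of the small-buffer idea. It is not boost's `small_vector` and not the code from the patches above: `InlineVector`, `Quantity`, `typicalQuantities` and the inline capacity of 4 are hypothetical names and values chosen for illustration. The assumption is that a typical resource quantity set holds only a handful of entries, so keeping the first few elements inside the object avoids a heap allocation for element storage in the common case.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a small-buffer-optimized vector: the first N elements
// are stored inline in the object, and only elements beyond N go to a
// heap-backed overflow vector. Hypothetical illustration only.
template <typename T, std::size_t N>
class InlineVector
{
public:
  void push_back(T value)
  {
    if (size_ < N) {
      inline_[size_] = std::move(value);
    } else {
      overflow_.push_back(std::move(value));
    }
    ++size_;
  }

  const T& operator[](std::size_t i) const
  {
    return i < N ? inline_[i] : overflow_[i - N];
  }

  std::size_t size() const { return size_; }

  // True while all elements still fit in the inline buffer.
  bool isInline() const { return size_ <= N; }

private:
  std::array<T, N> inline_{};
  std::vector<T> overflow_;
  std::size_t size_ = 0;
};

// Hypothetical (name, scalar quantity) entry, e.g. ("cpus", 4.0).
using Quantity = std::pair<std::string, double>;

// Builds a small quantity set that fits entirely in the inline buffer.
inline InlineVector<Quantity, 4> typicalQuantities()
{
  InlineVector<Quantity, 4> quantities;
  quantities.push_back({"cpus", 4.0});
  quantities.push_back({"mem", 1024.0});
  quantities.push_back({"disk", 2048.0});
  return quantities;
}
```

With three entries the container stays inline; pushing a fifth entry would spill into the overflow vector. A production container such as boost's `small_vector` also keeps the elements contiguous after spilling, which this toy version does not attempt.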


> Address allocator performance regression due to the addition of quota limits.
> -----------------------------------------------------------------------------
>
>                 Key: MESOS-9806
>                 URL: https://issues.apache.org/jira/browse/MESOS-9806
>             Project: Mesos
>          Issue Type: Improvement
>          Components: allocation
>            Reporter: Meng Zhu
>            Assignee: Meng Zhu
>            Priority: Critical
>              Labels: resource-management
>
> In MESOS-9802, we removed the quota role sorter, which was tech debt.
> However, this slows down the allocator. The problem is that in the first
> stage, even if a cluster has no active roles with non-default quota, the
> allocator now has to sort and go through every role in the cluster.
> Benchmark results show that for 1k roles with 2k frameworks, the allocator
> could experience ~50% performance degradation.
>
> There are a couple of ways to address this issue. For example, we could make
> the sorter aware of quota and add a method, say `sortQuotaRoles`, that returns
> all the roles with non-default quota. Alternatively, an even better approach
> would be to deprecate the sorter concept and just have two standalone
> functions, e.g. sortRoles() and sortQuotaRoles(), that take in the role tree
> structure (which does not yet exist in the allocator) and return the sorted
> roles.
>
> In addition, implementing MESOS-8068 requires doing more work during the
> allocation cycle. In particular, we need to call shrink many more times than
> before. All of this contributes to the performance slowdown. Specifically, for
> the quota-oriented benchmark
> `HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2` we can observe a
> 2-3x slowdown compared to the previous release (1.8.1):
> Current master:
> QuotaParam/BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter
> Made 3500 allocations in 32.051382735secs
> Made 0 allocation in 27.976022773secs
> 1.8.1:
> HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
> Made 3500 allocations in 13.810811063secs
> Made 0 allocation in 9.885972984secs
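
The two standalone functions proposed in the description could look roughly like the sketch below. Everything here is an assumption for illustration: `RoleNode`, its `share` sort key, the `hasQuota` flag and `makeRole` are hypothetical, since the description notes that a role tree did not yet exist in the allocator. The point of the sketch is only that `sortQuotaRoles()` filters before sorting, so a cluster with no quota roles sorts nothing in that stage.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical role tree node (not an existing Mesos type).
struct RoleNode
{
  std::string name;
  double share = 0.0;      // Assumed sort key, e.g. a DRF share.
  bool hasQuota = false;   // True if the role has a non-default quota.
  std::vector<RoleNode> children;
};

// Convenience constructor for the hypothetical node type.
inline RoleNode makeRole(std::string name, double share, bool hasQuota)
{
  RoleNode node;
  node.name = std::move(name);
  node.share = share;
  node.hasQuota = hasQuota;
  return node;
}

namespace detail {

// Collects every descendant of `parent` that satisfies `keep`.
template <typename Pred>
void collect(
    const RoleNode& parent, Pred keep, std::vector<const RoleNode*>& out)
{
  for (const RoleNode& child : parent.children) {
    if (keep(child)) {
      out.push_back(&child);
    }
    collect(child, keep, out);
  }
}

// Sorts the collected roles by ascending share and returns their names.
inline std::vector<std::string> sortedNames(std::vector<const RoleNode*> roles)
{
  std::stable_sort(
      roles.begin(), roles.end(),
      [](const RoleNode* a, const RoleNode* b) { return a->share < b->share; });

  std::vector<std::string> names;
  for (const RoleNode* role : roles) {
    names.push_back(role->name);
  }
  return names;
}

} // namespace detail

// Returns all roles under `root`, sorted by share.
inline std::vector<std::string> sortRoles(const RoleNode& root)
{
  std::vector<const RoleNode*> roles;
  detail::collect(root, [](const RoleNode&) { return true; }, roles);
  return detail::sortedNames(std::move(roles));
}

// Returns only the roles with non-default quota, sorted by share.
inline std::vector<std::string> sortQuotaRoles(const RoleNode& root)
{
  std::vector<const RoleNode*> roles;
  detail::collect(root, [](const RoleNode& r) { return r.hasQuota; }, roles);
  return detail::sortedNames(std::move(roles));
}
```

This sketch still walks the whole tree to find quota roles; a real implementation might instead track the set of quota roles incrementally so that the first allocation stage does not need a full traversal.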



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
