-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71646/#review218327
-----------------------------------------------------------
Thanks Andrei! Nice work.

The Testing Done section doesn't make its way into the commit message, but I find it helpful for the commit message to contain a concise summary of the performance improvements that the patch provides, for posterity, e.g.: https://github.com/apache/mesos/commit/de90b2b3078e06975ab2cccc061db821cfe7dda8


src/master/allocator/mesos/sorter/drf/sorter.cpp
Lines 456-458 (original), 456-458 (patched)
<https://reviews.apache.org/r/71646/#comment305985>

I'm a little puzzled about this case: do we still need to treat empty resources as a special case? I'm not sure why it's there in the first place, and it seems simpler without the special casing. I assume this is what made double removal not an error.


src/master/allocator/mesos/sorter/drf/sorter.cpp
Line 464 (original), 460-463 (patched)
<https://reviews.apache.org/r/71646/#comment305984>

How about s/const auto insertionResult/bool inserted/?


src/master/allocator/mesos/sorter/drf/sorter.cpp
Line 483 (original), 477 (patched)
<https://reviews.apache.org/r/71646/#comment305986>

How about s/resourcesIter/agent/?


src/master/allocator/mesos/sorter/random/sorter.cpp
Lines 390-423 (original), 390-410 (patched)
<https://reviews.apache.org/r/71646/#comment305987>

Ditto the above comments here.


src/master/allocator/mesos/sorter/sorter.hpp
Lines 126-131 (original), 126-134 (patched)
<https://reviews.apache.org/r/71646/#comment305983>

If double insertion fails, let's have double removal fail as well (rather than being a silent no-op)? This means the internal special-case logic for empty resources needs to go away?

Perhaps also add a note about how adding/removing the entire agent is cheaper than the old code that updated an agent's resources when they change? That seems non-obvious (i.e. if reservations get made, we have to remove and re-add the agent; there's no ability to update the total).


src/tests/sorter_tests.cpp
Lines 514-515 (original), 514-515 (patched)
<https://reviews.apache.org/r/71646/#comment305988>

Is this totalResources variable needed? Maybe just inline all of these to simplify?

```
sorter.addSlave(slaveId, *Resources::parse("cpus:100;mem:100"));
```

Seems more readable to me. Ditto for the other tests.


src/tests/sorter_tests.cpp
Line 1656 (original), 1655 (patched)
<https://reviews.apache.org/r/71646/#comment305989>

s/consider/Consider/

Can this TODO expand on why?


src/tests/sorter_tests.cpp
Lines 1831-1832 (original)
<https://reviews.apache.org/r/71646/#comment305990>

Maybe the note on addSlave/removeSlave also needs to mention that the immutable approach (add/remove and no update) doesn't have to deal with the shared resource accounting trickiness that update has to deal with?
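
To make the sorter.hpp suggestion above concrete, here is a rough standalone sketch (hypothetical names and simplified types, with assert() standing in for CHECK(); not the actual Sorter code) in which both adding an already-added agent and removing an unknown agent fail loudly rather than being silent no-ops:

```
#include <cassert>
#include <map>
#include <string>

struct ToySorter
{
  // agentId -> that agent's total resources (simplified to a string here).
  std::map<std::string, std::string> agents;

  void addSlave(const std::string& slaveId, const std::string& total)
  {
    bool inserted = agents.emplace(slaveId, total).second;
    assert(inserted && "adding an already added agent should fail");
  }

  void removeSlave(const std::string& slaveId)
  {
    auto agent = agents.find(slaveId);
    assert(agent != agents.end() && "removing an unknown agent should fail");
    agents.erase(agent);
  }
};

int main()
{
  ToySorter sorter;
  sorter.addSlave("agent1", "cpus:100;mem:100");
  sorter.removeSlave("agent1");     // OK: the agent is known.
  // sorter.removeSlave("agent1");  // Would now trip the assert/CHECK.
  return 0;
}
```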

- Benjamin Mahler


On Oct. 22, 2019, 1:41 p.m., Andrei Sekretenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71646/
> -----------------------------------------------------------
> 
> (Updated Oct. 22, 2019, 1:41 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-10015
>     https://issues.apache.org/jira/browse/MESOS-10015
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch replaces Sorter methods which modify `Resources` of an agent
> in the Sorter with methods which add/remove an agent as a whole
> (which was actually the only use case of the old methods).
> 
> Thus, subtracting/adding `Resources` of the whole agent no longer
> occurs when updating resources of the agent in the Sorter. This
> mitigates the issue with poor performance of
> `HierarchicalAllocatorProcess::updateAllocation()` for agents with
> a huge number of non-addable resources (see MESOS-10015).
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.cpp 21010de363f25c516bb031e4ae48888e53621128 
>   src/master/allocator/mesos/sorter/drf/sorter.hpp 3f6c7413f1b76f3fa86388360983763c8b76079f 
>   src/master/allocator/mesos/sorter/drf/sorter.cpp ef79083b710fba628b4a7e93f903883899f8a71b 
>   src/master/allocator/mesos/sorter/random/sorter.hpp a3097be98d175d2b47714eb8b70b1ce8c5c2bba8 
>   src/master/allocator/mesos/sorter/random/sorter.cpp 86aeb1b8136eaffd2d52d3b603636b01383a9024 
>   src/master/allocator/mesos/sorter/sorter.hpp 6b6b4a1811ba36e0212de17b9a6e63a6f8678a7f 
>   src/tests/sorter_tests.cpp d7fdee8f2cab4c930230750f0bd1a55eb08f89bb 
> 
> 
> Diff: https://reviews.apache.org/r/71646/diff/1/
> 
> 
> Testing
> -------
> 
> **make check**
> 
> **Variant of
> `ReservationParam/HierarchicalAllocator__BENCHMARK_WithReservationParam`**
> from https://reviews.apache.org/r/71639/ (work in progress)
> shows a significant improvement and a change from O(number_of_roles^3) to
> O(number_of_roles^2):
> 
> **Before**:
> Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 2.08586secs
> Average UNRESERVE duration: 51.491561ms
> Average RESERVE duration: 52.801438ms
> 
> Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 13.8449005secs
> Average UNRESERVE duration: 347.624639ms
> Average RESERVE duration: 344.620385ms
> 
> Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 2.19253121188333mins
> Average UNRESERVE duration: 3.285422441secs
> Average RESERVE duration: 3.292171194secs
> 
> Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
> (killed after several minutes)
> 
> **After:**
> 
> Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 559.381603ms
> Average UNRESERVE duration: 14.221942ms
> Average RESERVE duration: 13.747137ms
> 
> Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 1.186726648secs
> Average UNRESERVE duration: 29.569935ms
> Average RESERVE duration: 29.766396ms
> 
> Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 3.269639813secs
> Average UNRESERVE duration: 82.652908ms
> Average RESERVE duration: 80.829081ms
> 
> Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 11.156949417secs
> Average UNRESERVE duration: 278.158063ms
> Average RESERVE duration: 279.689407ms
> 
> Agent resources size: 3200 (800 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 44.796308954secs
> Average UNRESERVE duration: 1.114141457secs
> Average RESERVE duration: 1.12567399secs
> 
> **No significant performance changes in
> `QuotaParam/BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota`.**
> 
> **Before:**
> 
> Added 30 agents in 1.175593ms
> Added 30 frameworks in 6.829173ms
> Benchmark setup: 30 agents, 30 roles, 30 frameworks, with drf sorter
> Made 36 allocations in 8.294832ms
> Made 0 allocation in 3.674923ms
> 
> Added 300 agents in 7.860046ms
> Added 300 frameworks in 149.743858ms
> Benchmark setup: 300 agents, 300 roles, 300 frameworks, with drf sorter
> Made 350 allocations in 132.796102ms
> Made 0 allocation in 107.887758ms
> 
> Added 3000 agents in 36.944587ms
> Added 3000 frameworks in 10.688501403secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter
> Made 3500 allocations in 12.6020582secs
> Made 0 allocation in 9.716229696secs
> 
> Added 30 agents in 1.010362ms
> Added 30 frameworks in 6.272027ms
> Benchmark setup: 30 agents, 30 roles, 30 frameworks, with random sorter
> Made 38 allocations in 9.119976ms
> Made 0 allocation in 5.460369ms
> 
> Added 300 agents in 7.442897ms
> Added 300 frameworks in 152.016597ms
> Benchmark setup: 300 agents, 300 roles, 300 frameworks, with random sorter
> Made 391 allocations in 195.242282ms
> Made 0 allocation in 139.638551ms
> 
> Added 3000 agents in 36.003028ms
> Added 3000 frameworks in 11.203697649secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 17.807913455secs
> Made 0 allocation in 13.524946653secs
> 
> **After:**
> 
> Added 30 agents in 1.196576ms
> Added 30 frameworks in 6.814792ms
> Benchmark setup: 30 agents, 30 roles, 30 frameworks, with drf sorter
> Made 36 allocations in 8.263036ms
> Made 0 allocation in 3.947283ms
> 
> Added 300 agents in 8.497121ms
> Added 300 frameworks in 156.578165ms
> Benchmark setup: 300 agents, 300 roles, 300 frameworks, with drf sorter
> Made 350 allocations in 168.745307ms
> Made 0 allocation in 95.505069ms
> 
> Added 3000 agents in 38.074525ms
> Added 3000 frameworks in 11.249150205secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter
> Made 3500 allocations in 12.772526049secs
> Made 0 allocation in 10.132801781secs
> 
> Added 30 agents in 799844ns
> Added 30 frameworks in 5.8663ms
> Benchmark setup: 30 agents, 30 roles, 30 frameworks, with random sorter
> Made 38 allocations in 9.612524ms
> Made 0 allocation in 5.150924ms
> 
> Added 300 agents in 5.560583ms
> Added 300 frameworks in 138.469712ms
> Benchmark setup: 300 agents, 300 roles, 300 frameworks, with random sorter
> Made 391 allocations in 175.021255ms
> Made 0 allocation in 138.181869ms
> 
> Added 3000 agents in 42.921689ms
> Added 3000 frameworks in 10.825018278secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 15.29232742secs
> Made 0 allocation in 14.202057473secs
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
> 
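
As a toy illustration of the change described above (hypothetical names and simplified types; not the Mesos code): the old update-style path subtracted and re-added an agent's entire resource set element by element, while the add/remove path just erases and re-inserts the agent's entry as a whole, so no per-resource arithmetic happens when reservations change.

```
#include <map>
#include <string>
#include <vector>

// Stand-in for `Resources`: a flat list of (possibly non-addable) entries.
using AgentResources = std::vector<std::string>;

// Old shape: one flat collection for all agents; updating an agent subtracts
// its old resources and adds its new ones, scanning the collection each time.
struct FlatTotal
{
  AgentResources all;

  void update(const AgentResources& oldRes, const AgentResources& newRes)
  {
    for (const std::string& r : oldRes) {              // O(|all| * |oldRes|)
      for (auto it = all.begin(); it != all.end(); ++it) {
        if (*it == r) { all.erase(it); break; }
      }
    }
    all.insert(all.end(), newRes.begin(), newRes.end());
  }
};

// New shape: one entry per agent; a change is modeled as remove + add of the
// whole agent, with no element-wise arithmetic over resources.
struct PerAgentTotal
{
  std::map<std::string, AgentResources> agents;

  void removeSlave(const std::string& id) { agents.erase(id); }
  void addSlave(const std::string& id, AgentResources res)
  {
    agents.emplace(id, std::move(res));
  }
};

int main()
{
  PerAgentTotal total;
  total.addSlave("agent1", {"cpus(roleA):1", "mem(roleA):128"});

  // A reservation change on agent1: remove and re-add the whole agent.
  total.removeSlave("agent1");
  total.addSlave("agent1", {"cpus(roleB):1", "mem(roleB):128"});
  return 0;
}
```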
