> On Oct. 22, 2019, 5:29 p.m., Benjamin Mahler wrote:
> > src/master/allocator/mesos/sorter/drf/sorter.cpp
> > Lines 456-458 (original), 456-458 (patched)
> > <https://reviews.apache.org/r/71646/diff/1/?file=2169764#file2169764line456>
> >
> >     I'm a little puzzled about this case, do we still need to treat empty 
> > resources as a special case? I'm not sure why it's there in the first place 
> > and it seems simpler without the special casing?
> >     
> >     I assume this is what made double removal not an error

You are right, this special case is not needed anymore.
It was introduced in https://reviews.apache.org/r/35473 to maintain the 
invariant "there are no SlaveIDs with empty resources".
Now that an agent's resources are explicitly immutable in the sorter, this 
invariant is no longer useful (and, as you noted, it prevents CHECKing 
against double removal).

Nor do we need this as a fast path: (re-)adding an agent with empty resources 
should not happen frequently in the wild.
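To make the resulting contract concrete, here is a minimal sketch (with hypothetical stand-in types and `assert` in place of glog's `CHECK`, so it is not the actual Mesos sorter code) of a sorter that stores empty totals like any other and fails on double insertion or double removal:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins: the real sorter keys on mesos SlaveID and
// stores Resources; ints keep the sketch self-contained.
using SlaveID = std::string;
using Resources = int;

class Sorter {
public:
  // Double insertion is an error (CHECK in the real code).
  void add(const SlaveID& slaveId, const Resources& total) {
    bool inserted = totals.emplace(slaveId, total).second;
    assert(inserted);
    (void)inserted;
  }

  // Double removal is an error too: with no special case for empty
  // resources, every stored agent (empty or not) must be found here.
  void remove(const SlaveID& slaveId) {
    size_t erased = totals.erase(slaveId);
    assert(erased == 1);
    (void)erased;
  }

  bool contains(const SlaveID& slaveId) const {
    return totals.count(slaveId) == 1;
  }

private:
  std::unordered_map<SlaveID, Resources> totals;
};
```

Dropping the empty-resources branch leaves a single code path, so the CHECK catches a double removal instead of silently no-op'ing.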


> On Oct. 22, 2019, 5:29 p.m., Benjamin Mahler wrote:
> > src/master/allocator/mesos/sorter/sorter.hpp
> > Lines 126-131 (original), 126-134 (patched)
> > <https://reviews.apache.org/r/71646/diff/1/?file=2169767#file2169767line126>
> >
> >     If double insertion fails, let's have double removal fail? (rather than 
> > silently a no-op)
> >     
> >     This means the internal special case logic for empty resources needs to 
> > go away?
> >     
> >     Perhaps a note about how adding / removing the entire agent is cheaper 
> > than the old code of updating an agent's resources when they change? That 
> > seems non-obvious. (i.e. if reservations get made, have to remove and add 
> > agent, there's no ability to update the total).

Got rid of the special case for empty resources; double removal now fails.
Added a note about this to the sorter interface.
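The non-obvious point from the review — that remove-and-add of the whole agent is cheaper than the old update path — can be sketched as follows (hypothetical names, not the patch itself): with no in-place update in the interface, a total change such as a new reservation swaps the stored total once, instead of subtracting the old `Resources` and adding the new ones item by item.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for mesos SlaveID and Resources.
using SlaveID = std::string;
using Resources = int;

std::unordered_map<SlaveID, Resources> totals;

// A changed agent total is expressed as remove + add of the whole agent.
// The stored total is replaced wholesale; nothing is subtracted or added
// element by element, which is what made the old update path expensive.
void updateAgentTotal(const SlaveID& slaveId, const Resources& newTotal) {
  size_t erased = totals.erase(slaveId);
  assert(erased == 1);  // double removal is an error
  (void)erased;
  totals[slaveId] = newTotal;
}
```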


- Andrei


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71646/#review218327
-----------------------------------------------------------


On Oct. 24, 2019, 4:52 p.m., Andrei Sekretenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71646/
> -----------------------------------------------------------
> 
> (Updated Oct. 24, 2019, 4:52 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-10015
>     https://issues.apache.org/jira/browse/MESOS-10015
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch replaces Sorter methods which modify `Resources` of an agent
> in the Sorter with methods which add/remove an agent as a whole (which
> was actually the only use case of the old methods). Thus, subtracting/
> adding `Resources` of the whole agent no longer occurs when updating
> resources of the agent in the Sorter.
> 
> This mitigates the issue with poor performance of
> `HierarchicalAllocatorProcess::updateAllocation()` for agents with
> a huge number of non-addable resources (see MESOS-10015).
> 
> Results of `*BENCHMARK_WithReservationParam.UpdateAllocation*`:
> 
> Master:
> Agent resources size: 200 (50 frameworks)
> Made 20 reserve and unreserve operations in 2.08586secs
> Agent resources size: 400 (100 frameworks)
> Made 20 reserve and unreserve operations in 13.8449005secs
> Agent resources size: 800 (200 frameworks)
> Made 20 reserve and unreserve operations in 2.19253121188333mins
> 
> Master + this patch:
> Agent resources size: 200 (50 frameworks)
> Made 20 reserve and unreserve operations in 471.781183ms
> Agent resources size: 400 (100 frameworks)
> Made 20 reserve and unreserve operations in 1.022879058secs
> Agent resources size: 800 (200 frameworks)
> Made 20 reserve and unreserve operations in 2.622324521secs
> ...
> Agent resources size: 6400 (1600 frameworks)
> Made 20 reserve and unreserve operations in 2.04261335795mins
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.cpp 
> 21010de363f25c516bb031e4ae48888e53621128 
>   src/master/allocator/mesos/sorter/drf/sorter.hpp 
> 3f6c7413f1b76f3fa86388360983763c8b76079f 
>   src/master/allocator/mesos/sorter/drf/sorter.cpp 
> ef79083b710fba628b4a7e93f903883899f8a71b 
>   src/master/allocator/mesos/sorter/random/sorter.hpp 
> a3097be98d175d2b47714eb8b70b1ce8c5c2bba8 
>   src/master/allocator/mesos/sorter/random/sorter.cpp 
> 86aeb1b8136eaffd2d52d3b603636b01383a9024 
>   src/master/allocator/mesos/sorter/sorter.hpp 
> 6b6b4a1811ba36e0212de17b9a6e63a6f8678a7f 
>   src/tests/sorter_tests.cpp d7fdee8f2cab4c930230750f0bd1a55eb08f89bb 
> 
> 
> Diff: https://reviews.apache.org/r/71646/diff/2/
> 
> 
> Testing
> -------
> 
> **make check**
> 
> **Variant of 
> `ReservationParam/HierarchicalAllocator__BENCHMARK_WithReservationParam`**
> from https://reviews.apache.org/r/71639/ (work in progress) 
> shows significant improvement and change from O(number_of_roles^3) to 
> O(number_of_roles^2):
> **Before**:
> Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 2.08586secs
> Average UNRESERVE duration: 51.491561ms
> Average RESERVE duration: 52.801438ms
> 
> Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 13.8449005secs
> Average UNRESERVE duration: 347.624639ms
> Average RESERVE duration: 344.620385ms
> 
> Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 2.19253121188333mins
> Average UNRESERVE duration: 3.285422441secs
> Average RESERVE duration: 3.292171194secs
> 
> Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
> (killed after several minutes)
> 
> **After:**
> Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 471.781183ms
> Average UNRESERVE duration: 12.112223ms
> Average RESERVE duration: 11.476835ms
> 
> Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 1.022879058secs
> Average UNRESERVE duration: 25.53819ms
> Average RESERVE duration: 25.605762ms
> 
> Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 2.622324521secs
> Average UNRESERVE duration: 65.166039ms
> Average RESERVE duration: 65.950186ms
> 
> Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 8.419455875secs
> Average UNRESERVE duration: 209.886948ms
> Average RESERVE duration: 211.085845ms
> 
> Agent resources size: 3200 (800 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 32.82614382secs
> Average UNRESERVE duration: 823.126069ms
> Average RESERVE duration: 818.181121ms
> 
> Agent resources size: 6400 (1600 roles, 1 reservations per role, 1 port 
> ranges)
> Made 20 reserve and unreserve operations in 2.04261335795mins
> Average UNRESERVE duration: 3.063538394secs
> Average RESERVE duration: 3.064301679secs
> 
> > **No significant performance changes in 
> `QuotaParam/BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota`.**
> 
> **Before:**
> 
> Added 30 agents in 1.175593ms
> Added 30 frameworks in 6.829173ms
> Benchmark setup: 30 agents, 30 roles, 30 frameworks, with drf sorter
> Made 36 allocations in 8.294832ms
> Made 0 allocation in 3.674923ms
> 
> Added 300 agents in 7.860046ms
> Added 300 frameworks in 149.743858ms
> Benchmark setup: 300 agents, 300 roles, 300 frameworks, with drf sorter
> Made 350 allocations in 132.796102ms
> Made 0 allocation in 107.887758ms
> 
> Added 3000 agents in 36.944587ms
> Added 3000 frameworks in 10.688501403secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter
> Made 3500 allocations in 12.6020582secs
> Made 0 allocation in 9.716229696secs
> 
> Added 30 agents in 1.010362ms
> Added 30 frameworks in 6.272027ms
> Benchmark setup: 30 agents, 30 roles, 30 frameworks, with random sorter
> Made 38 allocations in 9.119976ms
> Made 0 allocation in 5.460369ms
> 
> Added 300 agents in 7.442897ms
> Added 300 frameworks in 152.016597ms
> Benchmark setup: 300 agents, 300 roles, 300 frameworks, with random sorter
> Made 391 allocations in 195.242282ms
> Made 0 allocation in 139.638551ms
> 
> Added 3000 agents in 36.003028ms
> Added 3000 frameworks in 11.203697649secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 17.807913455secs
> Made 0 allocation in 13.524946653secs
> 
> **After:**
> 
> Added 30 agents in 1.196576ms
> Added 30 frameworks in 6.814792ms
> Benchmark setup: 30 agents, 30 roles, 30 frameworks, with drf sorter
> Made 36 allocations in 8.263036ms
> Made 0 allocation in 3.947283ms
> 
> Added 300 agents in 8.497121ms
> Added 300 frameworks in 156.578165ms
> Benchmark setup: 300 agents, 300 roles, 300 frameworks, with drf sorter
> Made 350 allocations in 168.745307ms
> Made 0 allocation in 95.505069ms
> 
> Added 3000 agents in 38.074525ms
> Added 3000 frameworks in 11.249150205secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter
> Made 3500 allocations in 12.772526049secs
> Made 0 allocation in 10.132801781secs
> 
> Added 30 agents in 799844ns
> Added 30 frameworks in 5.8663ms
> Benchmark setup: 30 agents, 30 roles, 30 frameworks, with random sorter
> Made 38 allocations in 9.612524ms
> Made 0 allocation in 5.150924ms
> 
> Added 300 agents in 5.560583ms
> Added 300 frameworks in 138.469712ms
> Benchmark setup: 300 agents, 300 roles, 300 frameworks, with random sorter
> Made 391 allocations in 175.021255ms
> Made 0 allocation in 138.181869ms
> 
> Added 3000 agents in 42.921689ms
> Added 3000 frameworks in 10.825018278secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 15.29232742secs
> Made 0 allocation in 14.202057473secs
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>
