-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71646/
-----------------------------------------------------------
(Updated Oct. 24, 2019, 4:52 p.m.)
Review request for mesos, Benjamin Mahler and Meng Zhu.
Changes
-------
Addressed issues; modified sorters to store per-agent `resourceQuantities`
instead of `Resources`.
Summary (updated)
-----------------
Modified Sorter interface to add/remove agent resources as a whole.
Bugs: MESOS-10015
https://issues.apache.org/jira/browse/MESOS-10015
Repository: mesos
Description (updated)
-------
This patch replaces Sorter methods which modify `Resources` of an agent
in the Sorter with methods which add/remove an agent as a whole (which
was actually the only use case of the old methods). Thus, subtracting/
adding `Resources` of the whole agent no longer occurs when updating
resources of the agent in the Sorter.
This mitigates the issue with poor performance of
`HierarchicalAllocatorProcess::updateAllocation()` for agents with
a huge number of non-addable resources (see MESOS-10015).
Results of `*BENCHMARK_WithReservationParam.UpdateAllocation*`:
Master:
Agent resources size: 200 (50 frameworks)
Made 20 reserve and unreserve operations in 2.08586secs
Agent resources size: 400 (100 frameworks)
Made 20 reserve and unreserve operations in 13.8449005secs
Agent resources size: 800 (200 frameworks)
Made 20 reserve and unreserve operations in 2.19253121188333mins
Master + this patch:
Agent resources size: 200 (50 frameworks)
Made 20 reserve and unreserve operations in 471.781183ms
Agent resources size: 400 (100 frameworks)
Made 20 reserve and unreserve operations in 1.022879058secs
Agent resources size: 800 (200 frameworks)
Made 20 reserve and unreserve operations in 2.622324521secs
...
Agent resources size: 6400 (1600 frameworks)
Made 20 reserve and unreserve operations in 2.04261335795mins
Diffs (updated)
-----
src/master/allocator/mesos/hierarchical.cpp
21010de363f25c516bb031e4ae48888e53621128
src/master/allocator/mesos/sorter/drf/sorter.hpp
3f6c7413f1b76f3fa86388360983763c8b76079f
src/master/allocator/mesos/sorter/drf/sorter.cpp
ef79083b710fba628b4a7e93f903883899f8a71b
src/master/allocator/mesos/sorter/random/sorter.hpp
a3097be98d175d2b47714eb8b70b1ce8c5c2bba8
src/master/allocator/mesos/sorter/random/sorter.cpp
86aeb1b8136eaffd2d52d3b603636b01383a9024
src/master/allocator/mesos/sorter/sorter.hpp
6b6b4a1811ba36e0212de17b9a6e63a6f8678a7f
src/tests/sorter_tests.cpp d7fdee8f2cab4c930230750f0bd1a55eb08f89bb
Diff: https://reviews.apache.org/r/71646/diff/2/
Changes: https://reviews.apache.org/r/71646/diff/1-2/
Testing (updated)
-------
**make check**
**Variant of
`ReservationParam/HierarchicalAllocator__BENCHMARK_WithReservationParam`**
from https://reviews.apache.org/r/71639/ (work in progress)
shows significant improvement and change from O(number_of_roles^3) to
O(number_of_roles^2):
**Before**:
Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 2.08586secs
Average UNRESERVE duration: 51.491561ms
Average RESERVE duration: 52.801438ms
Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 13.8449005secs
Average UNRESERVE duration: 347.624639ms
Average RESERVE duration: 344.620385ms
Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 2.19253121188333mins
Average UNRESERVE duration: 3.285422441secs
Average RESERVE duration: 3.292171194secs
Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
(killed after several minutes)
**After:**
Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 471.781183ms
Average UNRESERVE duration: 12.112223ms
Average RESERVE duration: 11.476835ms
Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 1.022879058secs
Average UNRESERVE duration: 25.53819ms
Average RESERVE duration: 25.605762ms
Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 2.622324521secs
Average UNRESERVE duration: 65.166039ms
Average RESERVE duration: 65.950186ms
Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 8.419455875secs
Average UNRESERVE duration: 209.886948ms
Average RESERVE duration: 211.085845ms
Agent resources size: 3200 (800 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 32.82614382secs
Average UNRESERVE duration: 823.126069ms
Average RESERVE duration: 818.181121ms
Agent resources size: 6400 (1600 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 2.04261335795mins
Average UNRESERVE duration: 3.063538394secs
Average RESERVE duration: 3.064301679secs
**No significant performnce changes in
`QuotaParam/BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota`.**
**Before:**
Added 30 agents in 1.175593ms
Added 30 frameworks in 6.829173ms
Benchmark setup: 30 agents, 30 roles, 30 frameworks, with drf sorter
Made 36 allocations in 8.294832ms
Made 0 allocation in 3.674923ms
Added 300 agents in 7.860046ms
Added 300 frameworks in 149.743858ms
Benchmark setup: 300 agents, 300 roles, 300 frameworks, with drf sorter
Made 350 allocations in 132.796102ms
Made 0 allocation in 107.887758ms
Added 3000 agents in 36.944587ms
Added 3000 frameworks in 10.688501403secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter
Made 3500 allocations in 12.6020582secs
Made 0 allocation in 9.716229696secs
Added 30 agents in 1.010362ms
Added 30 frameworks in 6.272027ms
Benchmark setup: 30 agents, 30 roles, 30 frameworks, with random sorter
Made 38 allocations in 9.119976ms
Made 0 allocation in 5.460369ms
Added 300 agents in 7.442897ms
Added 300 frameworks in 152.016597ms
Benchmark setup: 300 agents, 300 roles, 300 frameworks, with random sorter
Made 391 allocations in 195.242282ms
Made 0 allocation in 139.638551ms
Added 3000 agents in 36.003028ms
Added 3000 frameworks in 11.203697649secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 17.807913455secs
Made 0 allocation in 13.524946653secs
**After:**
Added 30 agents in 1.196576ms
Added 30 frameworks in 6.814792ms
Benchmark setup: 30 agents, 30 roles, 30 frameworks, with drf sorter
Made 36 allocations in 8.263036ms
Made 0 allocation in 3.947283ms
Added 300 agents in 8.497121ms
Added 300 frameworks in 156.578165ms
Benchmark setup: 300 agents, 300 roles, 300 frameworks, with drf sorter
Made 350 allocations in 168.745307ms
Made 0 allocation in 95.505069ms
Added 3000 agents in 38.074525ms
Added 3000 frameworks in 11.249150205secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter
Made 3500 allocations in 12.772526049secs
Made 0 allocation in 10.132801781secs
Added 30 agents in 799844ns
Added 30 frameworks in 5.8663ms
Benchmark setup: 30 agents, 30 roles, 30 frameworks, with random sorter
Made 38 allocations in 9.612524ms
Made 0 allocation in 5.150924ms
Added 300 agents in 5.560583ms
Added 300 frameworks in 138.469712ms
Benchmark setup: 300 agents, 300 roles, 300 frameworks, with random sorter
Made 391 allocations in 175.021255ms
Made 0 allocation in 138.181869ms
Added 3000 agents in 42.921689ms
Added 3000 frameworks in 10.825018278secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 15.29232742secs
Made 0 allocation in 14.202057473secs
Thanks,
Andrei Sekretenko