----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71646/ -----------------------------------------------------------
(Updated Oct. 28, 2019, 6:45 p.m.) Review request for mesos, Benjamin Mahler and Meng Zhu. Changes ------- Merged https://reviews.apache.org/r/71672 and https://reviews.apache.org/r/71673/ into this patch. Summary (updated) ----------------- Got rid of `Reources::add()/subtract()` for agent resources in Sorter. Bugs: MESOS-10015 https://issues.apache.org/jira/browse/MESOS-10015 Repository: mesos Description (updated) ------- This patch addresses the issue with poor performance of `HierarchicalAllocatorProcess::updateAllocation()` for agents with a huge number of non-addable resources (see MESOS-10015) Sorter methods that modify `Resources` of an agent in the Sorter are replaced with methods that add/remove resource quantities of an agent as a whole (which was actually the only use case of the old methods). Thus subtracting/adding of `Resources` of the whole agent no longer occurs when updating resources of the agent in the Sorter. Further, this patch fully removes tracking agent resources from the random sorter by implementing tracking of the cluster totals inside of the allocator. Results of `*BENCHMARK_WithReservationParam.UpdateAllocation*` (for the DRF sorter): Master: Agent resources size: 200 (50 frameworks) Made 20 reserve and unreserve operations in 2.08586secs Agent resources size: 400 (100 frameworks) Made 20 reserve and unreserve operations in 13.8449005secs Agent resources size: 800 (200 frameworks) Made 20 reserve and unreserve operations in 2.19253121188333mins Master + this patch: Agent resources size: 200 (50 frameworks) Made 20 reserve and unreserve operations in 463.223084ms Agent resources size: 400 (100 frameworks) Made 20 reserve and unreserve operations in 930.097972ms Agent resources size: 800 (200 frameworks) Made 20 reserve and unreserve operations in 2.160506847secs ... Agent resources size: 6400 (1600 frameworks) Made 20 reserve and unreserve operations in 1.50729369885mins Diffs (updated) ----- src/master/allocator/mesos/hierarchical.hpp 9d0fbe771868ea60e66b9e25b0c666d5416d6e85 src/master/allocator/mesos/hierarchical.cpp 21010de363f25c516bb031e4ae48888e53621128 src/master/allocator/mesos/sorter/drf/sorter.hpp 3f6c7413f1b76f3fa86388360983763c8b76079f src/master/allocator/mesos/sorter/drf/sorter.cpp ef79083b710fba628b4a7e93f903883899f8a71b src/master/allocator/mesos/sorter/random/sorter.hpp a3097be98d175d2b47714eb8b70b1ce8c5c2bba8 src/master/allocator/mesos/sorter/random/sorter.cpp 86aeb1b8136eaffd2d52d3b603636b01383a9024 src/master/allocator/mesos/sorter/sorter.hpp 6b6b4a1811ba36e0212de17b9a6e63a6f8678a7f src/tests/sorter_tests.cpp d7fdee8f2cab4c930230750f0bd1a55eb08f89bb Diff: https://reviews.apache.org/r/71646/diff/3/ Changes: https://reviews.apache.org/r/71646/diff/2-3/ Testing ------- **make check** **Variant of `ReservationParam/HierarchicalAllocator__BENCHMARK_WithReservationParam`** from https://reviews.apache.org/r/71639/ (work in progress) shows significant improvement and change from O(number_of_roles^3) to O(number_of_roles^2): **Before**: Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges) Made 20 reserve and unreserve operations in 2.08586secs Average UNRESERVE duration: 51.491561ms Average RESERVE duration: 52.801438ms Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges) Made 20 reserve and unreserve operations in 13.8449005secs Average UNRESERVE duration: 347.624639ms Average RESERVE duration: 344.620385ms Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges) Made 20 reserve and unreserve operations in 2.19253121188333mins Average UNRESERVE duration: 3.285422441secs Average RESERVE duration: 3.292171194secs Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges) (killed after several minutes) **After:** Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges) Made 20 reserve and unreserve operations in 471.781183ms Average UNRESERVE duration: 12.112223ms Average RESERVE duration: 11.476835ms Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges) Made 20 reserve and unreserve operations in 1.022879058secs Average UNRESERVE duration: 25.53819ms Average RESERVE duration: 25.605762ms Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges) Made 20 reserve and unreserve operations in 2.622324521secs Average UNRESERVE duration: 65.166039ms Average RESERVE duration: 65.950186ms Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges) Made 20 reserve and unreserve operations in 8.419455875secs Average UNRESERVE duration: 209.886948ms Average RESERVE duration: 211.085845ms Agent resources size: 3200 (800 roles, 1 reservations per role, 1 port ranges) Made 20 reserve and unreserve operations in 32.82614382secs Average UNRESERVE duration: 823.126069ms Average RESERVE duration: 818.181121ms Agent resources size: 6400 (1600 roles, 1 reservations per role, 1 port ranges) Made 20 reserve and unreserve operations in 2.04261335795mins Average UNRESERVE duration: 3.063538394secs Average RESERVE duration: 3.064301679secs **No significant performnce changes in `QuotaParam/BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota`.** **Before:** Added 30 agents in 1.175593ms Added 30 frameworks in 6.829173ms Benchmark setup: 30 agents, 30 roles, 30 frameworks, with drf sorter Made 36 allocations in 8.294832ms Made 0 allocation in 3.674923ms Added 300 agents in 7.860046ms Added 300 frameworks in 149.743858ms Benchmark setup: 300 agents, 300 roles, 300 frameworks, with drf sorter Made 350 allocations in 132.796102ms Made 0 allocation in 107.887758ms Added 3000 agents in 36.944587ms Added 3000 frameworks in 10.688501403secs Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter Made 3500 allocations in 12.6020582secs Made 0 allocation in 9.716229696secs Added 30 agents in 1.010362ms Added 30 frameworks in 6.272027ms Benchmark setup: 30 agents, 30 roles, 30 frameworks, with random sorter Made 38 allocations in 9.119976ms Made 0 allocation in 5.460369ms Added 300 agents in 7.442897ms Added 300 frameworks in 152.016597ms Benchmark setup: 300 agents, 300 roles, 300 frameworks, with random sorter Made 391 allocations in 195.242282ms Made 0 allocation in 139.638551ms Added 3000 agents in 36.003028ms Added 3000 frameworks in 11.203697649secs Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter Made 3856 allocations in 17.807913455secs Made 0 allocation in 13.524946653secs **After:** Added 30 agents in 1.196576ms Added 30 frameworks in 6.814792ms Benchmark setup: 30 agents, 30 roles, 30 frameworks, with drf sorter Made 36 allocations in 8.263036ms Made 0 allocation in 3.947283ms Added 300 agents in 8.497121ms Added 300 frameworks in 156.578165ms Benchmark setup: 300 agents, 300 roles, 300 frameworks, with drf sorter Made 350 allocations in 168.745307ms Made 0 allocation in 95.505069ms Added 3000 agents in 38.074525ms Added 3000 frameworks in 11.249150205secs Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter Made 3500 allocations in 12.772526049secs Made 0 allocation in 10.132801781secs Added 30 agents in 799844ns Added 30 frameworks in 5.8663ms Benchmark setup: 30 agents, 30 roles, 30 frameworks, with random sorter Made 38 allocations in 9.612524ms Made 0 allocation in 5.150924ms Added 300 agents in 5.560583ms Added 300 frameworks in 138.469712ms Benchmark setup: 300 agents, 300 roles, 300 frameworks, with random sorter Made 391 allocations in 175.021255ms Made 0 allocation in 138.181869ms Added 3000 agents in 42.921689ms Added 3000 frameworks in 10.825018278secs Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter Made 3856 allocations in 15.29232742secs Made 0 allocation in 14.202057473secs Thanks, Andrei Sekretenko
