-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71697/
-----------------------------------------------------------

Review request for mesos, Benjamin Mahler and Meng Zhu.


Bugs: MESOS-10015
    https://issues.apache.org/jira/browse/MESOS-10015


Repository: mesos


Description
-------

This patch addresses poor performance of
`HierarchicalAllocatorProcess::updateAllocation()` for agents with a huge
number of non-addable resources in a many-framework case (see MESOS-10015).

The Sorter methods for totals tracking that modified the `Resources` of an
agent in the Sorter are replaced with methods that add/remove the resource
quantities of an agent as a whole (which was actually the only use case of
the old methods). Thus, subtracting/adding the `Resources` of a whole agent
no longer occurs when updating the resources of an agent in a Sorter.

Further, this patch completely removes agent resource tracking logic from
the random sorter (which itself makes no use of it) by implementing cluster
totals tracking in the allocator.

Results of `*BENCHMARK_WithReservationParam.UpdateAllocation*`
(for the DRF sorter):

1.8.x branch:

Agent resources size: 200 (50 frameworks)
Made 20 reserve and unreserve operations in 1.938801227secs
Agent resources size: 400 (100 frameworks)
Made 20 reserve and unreserve operations in 13.861857374secs
Agent resources size: 800 (200 frameworks)
Made 20 reserve and unreserve operations in 2.13412983136667mins

1.8.x branch + this patch:

Agent resources size: 200 (50 frameworks)
Made 20 reserve and unreserve operations in 214.063821ms
Agent resources size: 400 (100 frameworks)
Made 20 reserve and unreserve operations in 425.278671ms
Agent resources size: 800 (200 frameworks)
Made 20 reserve and unreserve operations in 1.136214374secs
...
Agent resources size: 6400 (1600 frameworks)
Made 20 reserve and unreserve operations in 50.094194999secs

This is a backport of https://reviews.apache.org/r/71646


Diffs
-----

  src/master/allocator/mesos/hierarchical.hpp 4f716820748e070569e988f8dad15670367a74b7
  src/master/allocator/mesos/hierarchical.cpp 061b70258f4874f4f2b26a57705b9ba1543c7553
  src/master/allocator/sorter/drf/sorter.hpp 7daf1bfd2dfe88e2d8e0af07c8af8aa823f80935
  src/master/allocator/sorter/drf/sorter.cpp 9367469132e426f0b4b66a80ad300c157fba6bf2
  src/master/allocator/sorter/random/sorter.hpp c8e777be256b4faf931bf1a106185d7f91b3ba6f
  src/master/allocator/sorter/random/sorter.cpp 9899cfd570607a60dbd7980d340a8e7d9d3e6df5
  src/master/allocator/sorter/sorter.hpp d56a1166a9e82b034564842ac071874ec2885004
  src/tests/sorter_tests.cpp 1e4a7893411d2107049a7bb92ee159526588c58c


Diff: https://reviews.apache.org/r/71697/diff/1/


Testing
-------

make check

`*BENCHMARK_WithReservationParam.UpdateAllocation*`:

**Before:**

Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 1.938801227secs
Average UNRESERVE duration: 49.161884ms
Average RESERVE duration: 47.778177ms

Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 13.861857374secs
Average UNRESERVE duration: 346.822609ms
Average RESERVE duration: 346.270259ms

Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 2.13412983136667mins
Average UNRESERVE duration: 3.200348465secs
Average RESERVE duration: 3.202041028secs

Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
(killed after several minutes)

**After:**

Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 214.063821ms
Average UNRESERVE duration: 5.134867ms
Average RESERVE duration: 5.568323ms

Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 425.278671ms
Average UNRESERVE duration: 10.201193ms
Average RESERVE duration: 11.06274ms

Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 1.136214374secs
Average UNRESERVE duration: 28.336427ms
Average RESERVE duration: 28.474291ms

Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 3.773618637secs
Average UNRESERVE duration: 93.619424ms
Average RESERVE duration: 95.061507ms

Agent resources size: 3200 (800 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 13.881966194secs
Average UNRESERVE duration: 350.46368ms
Average RESERVE duration: 343.634628ms

Agent resources size: 6400 (1600 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 50.094194999secs
Average UNRESERVE duration: 1.252057472secs
Average RESERVE duration: 1.252652277secs


Thanks,

Andrei Sekretenko
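

Illustration (not part of the patch): the idea from the Description, tracking each agent's resource quantities as a whole rather than subtracting/adding individual `Resources`, might be sketched roughly as below. This is a minimal sketch under stated assumptions: `TotalsTracker`, `Quantities`, `addAgent`, `removeAgent`, `updateAgent` and `total` are hypothetical names chosen for this example and are not the Mesos Sorter or allocator API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for scalar quantities keyed by resource name,
// e.g. {"cpus": 4, "mem": 1024}. This is NOT the Mesos type.
using Quantities = std::map<std::string, double>;

// Hypothetical cluster totals tracker (illustrative names only).
class TotalsTracker
{
public:
  // Adds an agent by its quantities as a whole.
  void addAgent(const std::string& agentId, const Quantities& quantities)
  {
    assert(agents.count(agentId) == 0);
    for (const auto& q : quantities) {
      totals[q.first] += q.second;
    }
    agents[agentId] = quantities;
  }

  // Removes everything previously recorded for the agent.
  void removeAgent(const std::string& agentId)
  {
    auto it = agents.find(agentId);
    if (it == agents.end()) {
      return;
    }
    for (const auto& q : it->second) {
      auto t = totals.find(q.first);
      t->second -= q.second;
      if (t->second <= 0.0) {
        totals.erase(t);
      }
    }
    agents.erase(it);
  }

  // An update is "remove the old whole, add the new whole"; the cost
  // depends on the number of resource names, not on the number of
  // individual resource objects held by the agent.
  void updateAgent(const std::string& agentId, const Quantities& quantities)
  {
    removeAgent(agentId);
    addAgent(agentId, quantities);
  }

  // Returns the cluster-wide total for a resource name (0 if absent).
  double total(const std::string& name) const
  {
    auto it = totals.find(name);
    return it == totals.end() ? 0.0 : it->second;
  }

private:
  std::map<std::string, Quantities> agents;
  Quantities totals;
};
```

In this sketch, a reserve or unreserve operation that leaves an agent's quantities unchanged would only call `updateAgent()` with the same small map, which is the kind of work the patch description says replaces whole-agent `Resources` arithmetic.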