[ https://issues.apache.org/jira/browse/MESOS-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15515183#comment-15515183 ]
Benjamin Mahler commented on MESOS-4694:
----------------------------------------

{noformat}
commit fba3108123442c78d4cee6047e2b1d64aab5a37a
Author: Dario Rexin <dre...@apple.com>
Date:   Thu Sep 22 16:00:03 2016 -0700

    Improve DRF sorter performance by bypassing `Resources`.

    Currently in `DRFSorter::calculateShare()` (which is called very
    frequently), the use of `Resources::get<Scalar>()` is expensive as it
    needs to loop over the `Resource` objects and do string comparison on
    the `Resource::name` strings.

    This patch avoids using `Resources::get<Scalar>()` in
    `DRFSorter::calculateShare()` in favor of maintaining a map of
    resource names to scalars. Note that we had to maintain both this new
    map and the previously added `strippedScalarQuantities` since the
    latter stores resource roles.

    Review: https://reviews.apache.org/r/43665/
{noformat}

> DRFAllocator takes very long to allocate resources with a large number of frameworks
> ------------------------------------------------------------------------------------
>
>                 Key: MESOS-4694
>                 URL: https://issues.apache.org/jira/browse/MESOS-4694
>             Project: Mesos
>          Issue Type: Improvement
>          Components: allocation
>    Affects Versions: 0.26.0, 0.27.0, 0.27.1, 0.27.2, 0.28.0, 0.28.1
>            Reporter: Dario Rexin
>            Assignee: Dario Rexin
>
> With a growing number of connected frameworks, the allocation time grows to
> very high numbers. The addition of quota in 0.27 had an additional impact on
> these numbers. Running
> `mesos-tests.sh --benchmark --gtest_filter=HierarchicalAllocator_BENCHMARK_Test.DeclineOffers`
> gives us the following numbers:
> {noformat}
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from HierarchicalAllocator_BENCHMARK_Test
> [ RUN      ] HierarchicalAllocator_BENCHMARK_Test.DeclineOffers
> Using 2000 slaves and 200 frameworks
> round 0 allocate took 2.921202secs to make 200 offers
> round 1 allocate took 2.85045secs to make 200 offers
> round 2 allocate took 2.823768secs to make 200 offers
> {noformat}
> Increasing the number of frameworks to 2000:
> {noformat}
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from HierarchicalAllocator_BENCHMARK_Test
> [ RUN      ] HierarchicalAllocator_BENCHMARK_Test.DeclineOffers
> Using 2000 slaves and 2000 frameworks
> round 0 allocate took 28.209454secs to make 2000 offers
> round 1 allocate took 28.469419secs to make 2000 offers
> round 2 allocate took 28.138086secs to make 2000 offers
> {noformat}
> I was able to reduce this time by a substantial amount. After applying the
> patches:
> {noformat}
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from HierarchicalAllocator_BENCHMARK_Test
> [ RUN      ] HierarchicalAllocator_BENCHMARK_Test.DeclineOffers
> Using 2000 slaves and 200 frameworks
> round 0 allocate took 1.016226secs to make 2000 offers
> round 1 allocate took 1.102729secs to make 2000 offers
> round 2 allocate took 1.102624secs to make 2000 offers
> {noformat}
> And with 2000 frameworks:
> {noformat}
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from HierarchicalAllocator_BENCHMARK_Test
> [ RUN      ] HierarchicalAllocator_BENCHMARK_Test.DeclineOffers
> Using 2000 slaves and 2000 frameworks
> round 0 allocate took 12.563203secs to make 2000 offers
> round 1 allocate took 12.437517secs to make 2000 offers
> round 2 allocate took 12.470708secs to make 2000 offers
> {noformat}
> The patches do 3 things to improve the performance of the allocator.
> 1) The total values in the DRFSorter are precalculated per resource type.
> 2) In the allocate method, when no resources are available to allocate, we
> break out of the innermost loop to avoid looping over a large number of
> frameworks when we have nothing to allocate.
> 3) When a framework suppresses offers, we remove it from the sorter instead
> of just calling continue in the allocation loop. This greatly improves
> performance in the sorter and prevents looping over frameworks that don't
> need resources.
>
> Assuming that most of the frameworks behave nicely and suppress offers when
> they have nothing to schedule, it is fair to assume that point 3) has the
> biggest impact on the performance. If we suppress offers for 90% of the
> frameworks in the benchmark test, we see the following numbers:
> {noformat}
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from HierarchicalAllocator_BENCHMARK_Test
> [ RUN      ] HierarchicalAllocator_BENCHMARK_Test.DeclineOffers
> Using 200 slaves and 2000 frameworks
> round 0 allocate took 11626us to make 200 offers
> round 1 allocate took 22890us to make 200 offers
> round 2 allocate took 21346us to make 200 offers
> {noformat}
> And for 200 frameworks:
> {noformat}
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from HierarchicalAllocator_BENCHMARK_Test
> [ RUN      ] HierarchicalAllocator_BENCHMARK_Test.DeclineOffers
> Using 2000 slaves and 2000 frameworks
> round 0 allocate took 1.11178secs to make 2000 offers
> round 1 allocate took 1.062649secs to make 2000 offers
> round 2 allocate took 1.080181secs to make 2000 offers
> {noformat}
> Review requests:
> https://reviews.apache.org/r/43665/
> https://reviews.apache.org/r/43666/
> https://reviews.apache.org/r/43668/

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
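
Points 1) and 3) of the issue description, together with the commit's "map of resource names to scalars", can be illustrated with a minimal C++ sketch. This is not the Mesos implementation: `ToyDrfSorter` and every member on it are hypothetical names chosen for illustration, resources are reduced to name/double pairs, and roles, weights and quota are ignored. Point 2), the early exit from the allocation loop, is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for a DRF sorter.
// It is NOT the Mesos DRFSorter; it only illustrates the two ideas above.
class ToyDrfSorter {
public:
  // Cluster capacity, kept per resource name so that a share calculation
  // is a map lookup instead of a scan over resource objects.
  void addTotal(const std::string& name, double amount) {
    assert(amount >= 0.0);
    totals_[name] += amount;
  }

  // Record resources allocated to a client (framework).
  void allocate(const std::string& client, const std::string& name,
                double amount) {
    allocations_[client][name] += amount;
  }

  // Only active clients take part in sorting. A client that suppresses
  // offers is removed from the active set rather than skipped in a loop.
  void activate(const std::string& client) { active_.insert(client); }
  void deactivate(const std::string& client) { active_.erase(client); }

  // Dominant share: the largest allocated/total ratio over all resources.
  double share(const std::string& client) const {
    double dominant = 0.0;
    auto allocation = allocations_.find(client);
    if (allocation == allocations_.end()) return dominant;
    for (const auto& entry : allocation->second) {
      auto total = totals_.find(entry.first);
      if (total == totals_.end() || total->second <= 0.0) continue;
      dominant = std::max(dominant, entry.second / total->second);
    }
    return dominant;
  }

  // Active clients ordered by ascending dominant share.
  // Ties keep the name order of the underlying std::set.
  std::vector<std::string> sorted() const {
    std::vector<std::string> clients(active_.begin(), active_.end());
    std::stable_sort(clients.begin(), clients.end(),
                     [this](const std::string& a, const std::string& b) {
                       return share(a) < share(b);
                     });
    return clients;
  }

private:
  typedef std::map<std::string, double> Scalars;
  Scalars totals_;                              // resource name -> capacity
  std::map<std::string, Scalars> allocations_;  // client -> allocated scalars
  std::set<std::string> active_;                // clients eligible for offers
};
```

In this sketch a deactivated client keeps its allocation, so its share is still correct when it is activated again; it simply costs nothing while sorting. That is the effect the issue attributes to removing suppressed frameworks from the sorter.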