> On Sept. 23, 2016, 2:40 a.m., Guangya Liu wrote:
> > It is really weird that the performance of
> > `SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7`
> > does not improve much when calling `addSlave`; we need to check further why
> > `addSlave` takes the same time. Without the fix, `addSlave` will call
> > `allocate` for each agent, but with the fix, only one `allocate` will be called....
> >
> > ```
> > without fix:
> > [==========] Running 1 test from 1 test case.
> > [----------] Global test environment set-up.
> > [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 122268us
> > Added 1000 agents in 42.037104secs
> >
> > With fix:
> > [==========] Running 1 test from 1 test case.
> > [----------] Global test environment set-up.
> > [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 116107us
> > Added 1000 agents in 41.615396secs
> > ```
>
> Guangya Liu wrote:
>     Jacob, I did more testing with the code from Aug 23, when I posted some
>     results in this RR, and found that the test result is different. I did the
>     following to get the Aug 23 code.
>
>     ```
>     LiuGuangyas-MacBook-Pro:build gyliu$ git checkout 2f78a440ef4201c5b11fb92c225694e84a60369c
>
>     LiuGuangyas-MacBook-Pro:build gyliu$ git log -1
>     commit 2f78a440ef4201c5b11fb92c225694e84a60369c
>     Author: Gilbert Song <songzihao1...@gmail.com>
>     Date:   Mon Aug 22 13:00:58 2016 -0700
>
>         Fixed potential flakiness in ROOT_RecoverOrphanedPersistentVolume.
>
>         Review: https://reviews.apache.org/r/51271/
>     ```
>
>     The test result still seems the same as now (without your patch, with the
>     code taken from Aug 23):
>
>     ```
>     [==========] Running 1 test from 1 test case.
>     [----------] Global test environment set-up.
>     [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
>     [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
>     Using 1000 agents and 6000 frameworks
>     Added 6000 frameworks in 144272us
>     Added 1000 agents in 43.107001secs
>     ```
>
>     But anyway, I think that we need to find out why the performance of
>     `addSlave` was not improved by your patch.
>
> Jacob Janco wrote:
>     Yes agreed, per our Slack discussions, I'll look into this. Thanks for
>     posting the followup.
>
> Benjamin Mahler wrote:
>     `addSlave()` is asynchronous and we do not wait for all of the
>     `addSlave()` futures to complete, so any speedup in `addSlave()` will only
>     affect the next caller that waits for a result from the allocator.
>
> Benjamin Mahler wrote:
>     Ah I missed that we do a `Clock::settle()`, never mind :)
>
> Guangya Liu wrote:
>     Some thoughts on why `addSlave` does not improve much...
>
>     Without Jacob's patch, the logic would be:
>
>     ```
>     addSlave -> allocate the single slave
>     addSlave -> allocate the single slave
>     addSlave -> allocate the single slave
>     ...
>     addSlave -> allocate the single slave
>     ```
>
>     With Jacob's patch, the logic would be:
>
>     ```
>     addSlave
>     addSlave
>     addSlave
>     ...
>     addSlave -> allocate for **all** of the slaves
>     ```
>
>     The time taken by `allocate a single slave N times` and by `allocate N
>     slaves in one allocate` request should not differ much; the only
>     difference is that one loops over the event queue while the other loops
>     inside the allocator. That's why there is not much performance change here.
>
>     But the patch will have a large impact when adding frameworks, or on other
>     allocator events that call `allocate(slaves)`. One proposal is to add
>     new benchmark test cases that follow the order below. In that order, each
>     `addFramework` operation calls `allocate(slaves)` without Jacob's patch,
>     but `allocate(slaves)` is called only once with Jacob's patch.
>
>     ```
>     1) Add slaves first
>     2) Add frameworks
>     ```
>
>     We may get some performance improvement in the above case.
>
>     Currently, all of the benchmark tests use
>
>     ```
>     1) Add frameworks
>     2) Add agents
>     ```
>
>     That's why there is not much performance improvement...
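
The reasoning in the quoted comment can be sketched with a toy model that only counts how many times an allocation pass runs and how many slaves those passes loop over. Everything here is an assumption made for illustration: `ToyAllocator`, `Counts`, and `simulate` are hypothetical names, not the Mesos `HierarchicalAllocator` API, and the model ignores the per-framework work done inside a real `allocate()`.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Toy model only: names and structure are hypothetical and are not the
// Mesos HierarchicalAllocator API.
class ToyAllocator {
 public:
  explicit ToyAllocator(bool coalesce) : coalesce_(coalesce) {}

  // A new slave only needs an allocation pass over itself.
  void addSlave(int id) {
    slaves_.insert(id);
    trigger({id});
  }

  // A new framework needs an allocation pass over every known slave.
  void addFramework() { trigger(slaves_); }

  // Stands in for draining the event queue (e.g. after Clock::settle()).
  void settle() {
    if (!pending_) return;
    run(candidates_);
    candidates_.clear();
    pending_ = false;
  }

  std::size_t runs() const { return runs_; }
  std::size_t visits() const { return visits_; }

 private:
  void trigger(const std::set<int>& ids) {
    if (!coalesce_) {  // Unpatched: every event runs its own pass.
      run(ids);
      return;
    }
    // Patched: accumulate candidates and keep at most one pass pending.
    candidates_.insert(ids.begin(), ids.end());
    pending_ = true;
  }

  void run(const std::set<int>& ids) {
    ++runs_;
    visits_ += ids.size();
  }

  bool coalesce_;
  bool pending_ = false;
  std::set<int> slaves_;
  std::set<int> candidates_;
  std::size_t runs_ = 0;    // Number of allocation passes executed.
  std::size_t visits_ = 0;  // Total slaves looped over by those passes.
};

struct Counts {
  std::size_t runs;
  std::size_t visits;
};

// Replays one of the two event orders discussed in the thread.
Counts simulate(bool coalesce, bool slavesFirst, int nSlaves, int nFrameworks) {
  assert(nSlaves >= 0 && nFrameworks >= 0);
  ToyAllocator a(coalesce);
  if (slavesFirst) {
    for (int i = 0; i < nSlaves; ++i) a.addSlave(i);
    a.settle();
    for (int f = 0; f < nFrameworks; ++f) a.addFramework();
    a.settle();
  } else {
    for (int f = 0; f < nFrameworks; ++f) a.addFramework();
    a.settle();
    for (int i = 0; i < nSlaves; ++i) a.addSlave(i);
    a.settle();
  }
  return {a.runs(), a.visits()};
}
```

In this model, with 10 slaves and 6 frameworks, the "frameworks first" order visits 10 slaves in total with or without coalescing, while the "slaves first" order visits 70 slaves without coalescing and 20 with it. That matches the expectation that the current benchmarks would show little change and a slaves-then-frameworks benchmark might show more.
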
This makes sense, Guangya. I'm in the process of creating a minimal benchmark that adds a set of slaves and then adds frameworks. I'll post here if the results are interesting.


- Jacob


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review150123
-----------------------------------------------------------


On Sept. 23, 2016, 4:32 p.m., Jacob Janco wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> -----------------------------------------------------------
>
> (Updated Sept. 23, 2016, 4:32 p.m.)
>
>
> Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus
> Ma, and Jiang Yan Xu.
>
>
> Bugs: MESOS-3157
>     https://issues.apache.org/jira/browse/MESOS-3157
>
>
> Repository: mesos
>
>
> Description
> -------
>
> - Triggered allocations dispatch allocate() only
>   if there is no pending allocation in the queue.
> - Allocation candidates are accumulated and only
>   cleared when enqueued allocations are processed.
>
>
> Diffs
> -----
>
>   src/master/allocator/mesos/hierarchical.hpp 2c31471ee0f5d6836393bf87ff9ecfd8df835013
>   src/master/allocator/mesos/hierarchical.cpp 2d56bd011f2c87c67a02d0ae467a4a537d36867e
>
> Diff: https://reviews.apache.org/r/51027/diff/
>
>
> Testing
> -------
>
> make check
>
> note: check without filters depends on https://reviews.apache.org/r/51028
>
> With new benchmark https://reviews.apache.org/r/49617:
> Sample output without 51027:
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 10000 agents and 3000 frameworks
> Added 3000 frameworks in 57251us
> Added 10000 agents in 3.21345353333333mins
> allocator settled after 1.61236038333333mins
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22 (290578 ms)
>
> Sample output with 51027:
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 10000 agents and 3000 frameworks
> Added 3000 frameworks in 39817us
> Added 10000 agents in 3.22860541666667mins
> allocator settled after 25.525654secs
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22 (220137 ms)
>
>
> Thanks,
>
> Jacob Janco
>
>
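
As a rough reading of the two `FrameworkFailover/22` samples quoted above, the figures can be put into common units and compared. The constants are copied from those samples; the helper names (`ratio`, `settleSpeedup`, `totalSpeedup`) are hypothetical, and a single run of each configuration is of course not a statistically meaningful comparison.

```cpp
#include <cassert>

// Figures copied from the quoted samples (without 51027 vs. with 51027).
constexpr double kSettleWithoutSecs = 1.61236038333333 * 60.0;  // mins -> secs
constexpr double kSettleWithSecs = 25.525654;
constexpr double kTotalWithoutMs = 290578.0;
constexpr double kTotalWithMs = 220137.0;

// Hypothetical helper: how many times faster "after" is than "before".
double ratio(double before, double after) {
  assert(after > 0.0);
  return before / after;
}

double settleSpeedup() { return ratio(kSettleWithoutSecs, kSettleWithSecs); }
double totalSpeedup() { return ratio(kTotalWithoutMs, kTotalWithMs); }
```

On these two samples the allocator settle time drops by a factor of roughly 3.8 and the total test time by a factor of roughly 1.3, while the time to add the 10000 agents is essentially unchanged (about 3.21 mins versus 3.23 mins).
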