-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70325/#review214141
-----------------------------------------------------------

src/master/master.cpp
Lines 10538-10541 (patched)
<https://reviews.apache.org/r/70325/#comment300321>

    Looking at this again, I guess I should build up a `hashmap<SlaveID, std::pair<Resources, Resources>>` and make just one `addAgentResources()` call per agent.


- Greg Mann


On March 27, 2019, 7:59 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70325/
> -----------------------------------------------------------
> 
> (Updated March 27, 2019, 7:59 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Gastón Kleiman, Joseph Wu, and Meng Zhu.
> 
> 
> Bugs: MESOS-9635
>     https://issues.apache.org/jira/browse/MESOS-9635
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch updates the master's framework recovery code to use
> the allocator's `addAgentResources()` method rather than
> `updateSlave()` when recovering orphan operations, which has the
> benefit of tracking the allocation of the operations' consumed
> resources, avoiding situations in which those resources would be
> incorrectly offered to frameworks while the operation is still
> in a pending state.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp acc67d3763ddee9027e6cf375f1d495ff5805026 
> 
> 
> Diff: https://reviews.apache.org/r/70325/diff/1/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> To verify the flaky test fix, the following command was executed both before
> and after the patches were applied, while `stress -c <num_cores_on_machine>`
> was being run:
> `bin/mesos-tests.sh --gtest_filter="*AgentPendingOperationAfterMasterFailover*" --gtest_repeat=-1 --gtest_break_on_failure`
> 
> Before the patches were applied, the test would reliably fail after less than
> 50 repetitions. After the patches are applied, the test can be run for
> hundreds of repetitions with no failures.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>
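
For reference, the batching idea from the review comment (accumulate per agent, then make a single `addAgentResources()` call for each agent) might look roughly like the sketch below. This is not the patch itself: `SlaveID`, `Resources`, `OrphanOperation`, and `FakeAllocator` are hypothetical stand-ins so the example compiles on its own, `std::unordered_map` stands in for the `hashmap` type, and both the three-argument `addAgentResources()` signature and the reading of the pair as (total, used) are assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the Mesos types, so the sketch compiles alone.
using SlaveID = std::string;

struct Resources {
  std::map<std::string, double> quantities;  // e.g. "cpus" -> 2.0

  Resources& operator+=(const Resources& that) {
    for (const auto& entry : that.quantities) {
      quantities[entry.first] += entry.second;
    }
    return *this;
  }
};

// Hypothetical view of an orphan operation: the agent it belongs to and
// the resources it consumes.
struct OrphanOperation {
  SlaveID slaveId;
  Resources consumed;
};

// Fake allocator that only records calls. The three-argument signature is
// an assumption, not the real allocator interface.
struct FakeAllocator {
  struct Call {
    SlaveID slaveId;
    Resources total;
    Resources used;
  };

  std::vector<Call> calls;

  void addAgentResources(
      const SlaveID& slaveId,
      const Resources& total,
      const Resources& used) {
    calls.push_back(Call{slaveId, total, used});
  }
};

using PerAgent = std::unordered_map<SlaveID, std::pair<Resources, Resources>>;

// Sum the consumed resources of all orphan operations per agent.
// Assumption: `first` is the total to add and `second` is the used portion,
// and a pending operation's consumed resources count toward both.
PerAgent accumulate(const std::vector<OrphanOperation>& operations) {
  PerAgent perAgent;
  for (const OrphanOperation& operation : operations) {
    std::pair<Resources, Resources>& entry = perAgent[operation.slaveId];
    entry.first += operation.consumed;
    entry.second += operation.consumed;
  }
  return perAgent;
}

// Make exactly one allocator call per agent instead of one per operation.
void recoverOrphanOperations(
    const std::vector<OrphanOperation>& operations,
    FakeAllocator& allocator) {
  for (const auto& entry : accumulate(operations)) {
    allocator.addAgentResources(
        entry.first, entry.second.first, entry.second.second);
  }
}
```

With three orphan operations spread over two agents, `recoverOrphanOperations()` in this sketch makes two allocator calls rather than three, and each call carries the summed resources for its agent.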