-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65482/#review197641
-----------------------------------------------------------

src/master/master.cpp
Lines 7643 (patched)
<https://reviews.apache.org/r/65482/#comment277863>

    s/know/knows/


src/master/master.cpp
Lines 7644 (patched)
<https://reviews.apache.org/r/65482/#comment277864>

    s/added/adding/


src/master/master.cpp
Lines 7645 (patched)
<https://reviews.apache.org/r/65482/#comment277865>

    s/`RunTaskMessage`, see/`RunTaskMessage`. See/


src/master/master.cpp
Lines 7647-7654 (patched)
<https://reviews.apache.org/r/65482/#comment277874>

    I'm sitting here trying to think of ways we might avoid crashing if the
    framework subscribes before the operation becomes terminal...

    Would it be reasonable to add an `if (framework == nullptr)` check to
    `updateOperation()` so that we only recover resources if the framework
    is known to the master?


src/master/master.cpp
Lines 7652 (patched)
<https://reviews.apache.org/r/65482/#comment277862>

    s/MESOS-8356/MESOS-8536/


- Greg Mann


On Feb. 14, 2018, 2:21 p.m., Benjamin Bannier wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65482/
> -----------------------------------------------------------
>
> (Updated Feb. 14, 2018, 2:21 p.m.)
>
>
> Review request for mesos, Greg Mann, Jie Yu, and Jan Schlicht.
>
>
> Bugs: MESOS-8536
>     https://issues.apache.org/jira/browse/MESOS-8536
>
>
> Repository: mesos
>
>
> Description
> -------
>
> This patch fixes the handling of non-terminal operations learned by a
> newly elected master after a master failover, so that only these
> operations are counted as using resources. Previously we did not count
> any operations as using resources which by accident produced expected
> behavior if the operation was already terminal when the master learned
> about them.
>
> We do not address the issue of being unable to properly account for
> operations triggered by frameworks unknown to the master, see
> MESOS-8582. Instead we emit a warning for now since the master might
> continue to abort due to assertion failures due to incomplete resource
> accounting.
>
>
> Diffs
> -----
>
>   src/master/master.cpp b06d7a6e2fbbb81b97eaf537d5b6745c73dc867d
>
>
> Diff: https://reviews.apache.org/r/65482/diff/3/
>
>
> Testing
> -------
>
> `make check`, also tested with a version of the test added in r/65045 which
> triggered this issue.
>
>
> Thanks,
>
> Benjamin Bannier
>
>
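
To make the `updateOperation()` suggestion in the comment on lines 7647-7654 concrete, here is a minimal, self-contained sketch of the idea. It does not use the real Mesos types or signatures: `ToyMaster`, `ToyFramework`, `getFramework()`, and the integer `usedCpus` counter are hypothetical stand-ins, and the actual `updateOperation()` in `src/master/master.cpp` may look quite different. The only point illustrated is the guard itself: recover resources only when the framework is known, and warn otherwise.

```cpp
#include <cassert>
#include <iostream>
#include <map>
#include <string>

// Hypothetical stand-in for per-framework resource accounting.
// This is not a Mesos type.
struct ToyFramework {
  int usedCpus = 0;
};

// Hypothetical stand-in for the master. This is not the real `Master` class.
struct ToyMaster {
  std::map<std::string, ToyFramework> frameworks;

  // Returns the framework if the master knows it, `nullptr` otherwise.
  ToyFramework* getFramework(const std::string& id) {
    auto it = frameworks.find(id);
    return it == frameworks.end() ? nullptr : &it->second;
  }

  // Called when an operation becomes terminal. Returns true if the
  // operation's resources were recovered from the framework's accounting.
  bool updateOperation(const std::string& frameworkId, int cpus) {
    ToyFramework* framework = getFramework(frameworkId);

    // The guard proposed in the review: skip recovery (and only warn)
    // when the framework is unknown to the master.
    if (framework == nullptr) {
      std::cerr << "WARNING: not recovering resources for unknown framework "
                << frameworkId << std::endl;
      return false;
    }

    framework->usedCpus -= cpus;
    return true;
  }
};
```

With this toy model, an update for a known framework reduces its `usedCpus`, while an update for an unknown framework leaves all accounting untouched and returns `false`. The sketch deliberately says nothing about what should happen if the framework subscribes before the operation becomes terminal; that remains the open question raised in the comment.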