----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/63732/#review191911 -----------------------------------------------------------
src/master/master.cpp Lines 7119 (patched) <https://reviews.apache.org/r/63732/#comment269852> Can you add a comment saying that if the key is None, it means the agent default resources that have no resource provider id. src/master/master.cpp Lines 7128-7129 (patched) <https://reviews.apache.org/r/63732/#comment269845> Please use `Resources::hasResourceProvider` which has more CHECKs. src/master/master.cpp Lines 7211-7218 (patched) <https://reviews.apache.org/r/63732/#comment269854> i feel this is unnecessary. It should be the responsibility of the agent to convert the operation to post refinement format if it is every downgraded before checkpointing. We can add some CHECK here. src/master/master.cpp Lines 7231-7244 (patched) <https://reviews.apache.org/r/63732/#comment269857> Hum, what if this is a failover master and the UpdateSlaveMessage is a result of an LRP re-registration with the agent. In that case, master does not know about the operation but the agent does? Also, i would be future proof and does not check the latest state of the operation. It's possible that the operaiton is terminal but still waiting for ack from the framework, and master will still maintain that operation. I would still check the invariant that if the master knows about the RP, then it should know more operation than the agent. There might be one edge case worth documenting: master receives an ack from the framework, and removed the operation and send an Ack message to the agent. The agent didn't receive the ack yet before it sends an UpdateSlaveMessage. In that case, what we should do? Adding back the operation does not sound right because the agent will then delete the operation on receiving ack. Then we have an inconsistency between master and agent. src/master/master.cpp Lines 7302-7308 (patched) <https://reviews.apache.org/r/63732/#comment269863> We do want to support that when a CSI plugin detects a chagne to the total resources. src/master/master.cpp Lines 7352-7362 (patched) <https://reviews.apache.org/r/63732/#comment269864> This should go to addOfferOperation. src/master/master.cpp Lines 7389-7403 (patched) <https://reviews.apache.org/r/63732/#comment269870> That's `Slave::recoverResources(operation)`? - Jie Yu On Nov. 24, 2017, 2:50 p.m., Benjamin Bannier wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/63732/ > ----------------------------------------------------------- > > (Updated Nov. 24, 2017, 2:50 p.m.) > > > Review request for mesos, Jie Yu and Jan Schlicht. > > > Bugs: MESOS-8207 > https://issues.apache.org/jira/browse/MESOS-8207 > > > Repository: mesos > > > Description > ------- > > See summary. > > > Diffs > ----- > > src/master/master.cpp 53263e499d88b906b6406c24c0dfb737e589e813 > > > Diff: https://reviews.apache.org/r/63732/diff/7/ > > > Testing > ------- > > `make check`, tested as part of https://reviews.apache.org/r/63843/. > > This patch requires `protobuf::isSpeculativeOperation` from > https://reviews.apache.org/r/63751/ which is _not_ part of this review chain > (to avoid multiple dependent RRs). > > > Thanks, > > Benjamin Bannier > >