> On July 13, 2016, 11:04 a.m., Neil Conway wrote:
> > BTW, one thought: rather than writing out a new checkpoint and then
> > deleting the target checkpoint file, what about renaming target -> current
> > checkpoint? Rename is typically atomic (within a single filesystem), which
> > is nice, and it would avoid the need to separately delete the target
> > checkpoint afterward.
>
> Anindya Sinha wrote:
>     SGTM. Updating the changes.
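For reference, a minimal sketch of the rename idea from the quoted comment. This is not the code in the patch: `writeFile` and `commitCheckpoint` are hypothetical names used only for illustration, and it relies on POSIX semantics, where `std::rename` replaces an existing destination atomically when both paths are on the same filesystem.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: write `data` to `path`, truncating any existing file.
// Returns false on I/O failure.
bool writeFile(const std::string& path, const std::string& data)
{
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  out << data;
  out.flush();
  return out.good();
}

// Hypothetical helper: commit the target checkpoint by renaming it over the
// committed checkpoint. On POSIX this replaces the committed file in one
// step, so no separate delete of the target file is needed afterward.
// (Assumption: both paths live on the same filesystem.)
bool commitCheckpoint(const std::string& target, const std::string& committed)
{
  assert(!target.empty() && !committed.empty());
  return std::rename(target.c_str(), committed.c_str()) == 0;
}
```

After a successful `commitCheckpoint`, the target path no longer exists and the committed path holds the new contents. A real implementation that needs durability across power loss would also have to sync the file and its directory, which this sketch leaves out.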
@neilc, @xujyan: I discarded https://reviews.apache.org/r/48314 and https://reviews.apache.org/r/48315, so can this be submitted now?


- Anindya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48313/#review142052
-----------------------------------------------------------


On July 14, 2016, 12:11 a.m., Anindya Sinha wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48313/
> -----------------------------------------------------------
>
> (Updated July 14, 2016, 12:11 a.m.)
>
>
> Review request for mesos, Neil Conway and Jiang Yan Xu.
>
>
> Bugs: MESOS-5448
>     https://issues.apache.org/jira/browse/MESOS-5448
>
>
> Repository: mesos
>
>
> Description
> -------
>
> When the agent receives CheckpointedResourcesMessage, we store the
> target checkpoint on disk. On successful create and destroy of
> persistent volumes as a part of handling this message, we commit
> the checkpoint to disk and clear the target checkpoint.
>
> However, in case of any failure, we do not commit the checkpoint to
> disk, and the agent exits. When the agent restarts and there is a
> target checkpoint present on disk which differs from the committed
> checkpoint, we retry syncing the target and committed checkpoints.
> On success, we reregister the agent with the master; if the retry
> fails, we do not commit the checkpoint and the agent exits.
>
>
> Diffs
> -----
>
>   src/slave/paths.hpp 339e539863c678b6ed4d4670d75c7ff4c54daa79
>   src/slave/paths.cpp 03157f93b1e703006f95ef6d0a30afae375dcdb5
>   src/slave/slave.hpp 9864cf43b8c1a5cce31b886ae4dc20ec5cfafcb9
>   src/slave/slave.cpp 02982d542c9e6b5b5f7fc8b3c73db6f5bac01358
>   src/slave/state.hpp 0de2a4ee4fabaad612c4526166157b001c380bdb
>   src/slave/state.cpp 9cec0868b1187ed3ccac7f065e8a21c2f52178d9
>
> Diff: https://reviews.apache.org/r/48313/diff/
>
>
> Testing
> -------
>
> All tests passed.
>
>
> Thanks,
>
> Anindya Sinha
>
>
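
To make the restart behaviour in the Description concrete, here is a rough sketch of the recovery decision. It is not taken from the diff: `Recovery`, `readFile`, `recoverCheckpoint`, and the `sync` callback (standing in for the create/destroy of persistent volumes) are all hypothetical names, the commit step uses the rename suggested earlier in the thread, and clearing a target that already equals the committed checkpoint is my assumption, since the Description only covers the case where they differ.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <functional>
#include <sstream>
#include <string>

// Outcome of the recovery attempt (hypothetical type).
enum class Recovery { UpToDate, Committed, Failed };

// Hypothetical helper: read a whole file into `data`.
// Returns false if the file cannot be opened.
bool readFile(const std::string& path, std::string* data)
{
  std::ifstream in(path, std::ios::binary);
  if (!in) {
    return false;
  }
  std::ostringstream buffer;
  buffer << in.rdbuf();
  *data = buffer.str();
  return true;
}

// On restart: if a target checkpoint exists and differs from the committed
// one, retry the sync; commit only if the sync succeeds.
Recovery recoverCheckpoint(
    const std::string& targetPath,
    const std::string& committedPath,
    const std::function<bool(const std::string&)>& sync)
{
  assert(static_cast<bool>(sync));

  std::string target;
  if (!readFile(targetPath, &target)) {
    return Recovery::UpToDate;  // No pending target checkpoint.
  }

  std::string committed;
  const bool haveCommitted = readFile(committedPath, &committed);

  if (haveCommitted && target == committed) {
    // Assumption: a target identical to the committed state is just cleared.
    std::remove(targetPath.c_str());
    return Recovery::UpToDate;
  }

  if (!sync(target)) {
    // Leave both files untouched so the next restart retries;
    // the caller is expected to exit the agent.
    return Recovery::Failed;
  }

  // Commit by renaming target over committed (atomic on POSIX within a
  // single filesystem).
  return std::rename(targetPath.c_str(), committedPath.c_str()) == 0
    ? Recovery::Committed
    : Recovery::Failed;
}
```

In this sketch the caller would reregister with the master on `UpToDate` or `Committed` and exit on `Failed`. Because a failed sync leaves the target checkpoint on disk, the same retry happens on the next restart.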