> On Jan. 23, 2015, 6:44 p.m., Jie Yu wrote: > > src/slave/state.hpp, line 120 > > <https://reviews.apache.org/r/29918/diff/9/?file=829596#file829596line120> > > > > One more thought: > > > > Can we introduce a third parameter for this function to let users opt > > out the temp + rename (we can use a default value)? > > > > This function is super critical. What if /tmp has too many links? It'll > > cause slave to flap. For majority of the checkpointing work, we don't need > > this temp + rename because we know the old file does not exist. > > > > What do you think? > > Michael Park wrote: > But if we want to guarantee that the state of the checkpointed file is > always valid, I don't think we can get away from writing to temp + renaming. > For the case where the old file does not exist, we'd do something like, > write directly to the file if it doesn't already exist, and delete on error? > We would end up with an invalid checkpoint file if we crash while > writing, or run into an error and crash before deleting the file. > > I'm also not sure if /tmp having too many links is the level of problem > we need to worry about? It seems like a very unlikely case to me, but I'm not > sure. > > Just my thoughts.
Looks like it causes this bug: https://issues.apache.org/jira/browse/MESOS-2319 - Jie ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29918/#review69435 ----------------------------------------------------------- On Jan. 24, 2015, 1:21 a.m., Michael Park wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/29918/ > ----------------------------------------------------------- > > (Updated Jan. 24, 2015, 1:21 a.m.) > > > Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, and > Vinod Kone. > > > Repository: mesos > > > Description > ------- > > Introduced a generic checkpoint function. > > > Diffs > ----- > > src/slave/state.hpp 70777cf6ab681c29ca4df601fe47903e1dbdf41f > src/slave/state.cpp a36fa53099300ee03f051b0f5eaaafe9f1da68d1 > src/tests/slave_recovery_tests.cpp 809822e63b05a21418cd9297c927d656d6fd871d > > Diff: https://reviews.apache.org/r/29918/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Michael Park > >