> On Jan. 23, 2015, 6:44 p.m., Jie Yu wrote:
> > src/slave/state.hpp, line 120
> > <https://reviews.apache.org/r/29918/diff/9/?file=829596#file829596line120>
> >
> >     One more thought:
> >     
> >     Can we introduce a third parameter for this function to let users opt 
> > out the temp + rename (we can use a default value)?
> >     
> >     This function is super critical. What if /tmp has too many links? It'll 
> > cause slave to flap. For majority of the checkpointing work, we don't need 
> > this temp + rename because we know the old file does not exist.
> >     
> >     What do you think?

But if we want to guarantee that the state of the checkpointed file is always 
valid, I don't think we can get away from writing to temp + renaming.
For the case where the old file does not exist, we'd do something like, write 
directly to the file if it doesn't already exist, and delete on error?
We would end up with an invalid checkpoint file if we crash while writing, or 
run into an error and crash before deleting the file.

I'm also not sure if /tmp having too many links is the level of problem we need 
to worry about? It seems like a very unlikely case to me, but I'm not sure.

Just my thoughts.


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29918/#review69435
-----------------------------------------------------------


On Jan. 24, 2015, 1:21 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29918/
> -----------------------------------------------------------
> 
> (Updated Jan. 24, 2015, 1:21 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, and 
> Vinod Kone.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Introduced a generic checkpoint function.
> 
> 
> Diffs
> -----
> 
>   src/slave/state.hpp 70777cf6ab681c29ca4df601fe47903e1dbdf41f 
>   src/slave/state.cpp a36fa53099300ee03f051b0f5eaaafe9f1da68d1 
>   src/tests/slave_recovery_tests.cpp 809822e63b05a21418cd9297c927d656d6fd871d 
> 
> Diff: https://reviews.apache.org/r/29918/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Michael Park
> 
>

Reply via email to