> On March 24, 2015, 8:01 p.m., Vinod Kone wrote:
> > The recovery code is a bit confusing with a bunch of linkers and linkees. 
> > Is it possible to clean that up a bit?
> > 
> > More importantly, IIUC there are 3 entities here. links (veth), pids (file 
> > bind mounted) and containerids (symlink to pid). Why is that there is no 
> > symmetry in their creation (isolate(): pid -> cid -> link) and deletion 
> > (cleanup(): cid -> pid -> link)? Shouldn't 'link' be removed first in 
> > deletion?
> > 
> > Also, have you considered all the cases when a slave fails at any point 
> > during isolate() or cleanup()?
> > 
> > 1) pid, no cid, no link  (slave dies during isolate() or cleanup())
> > 2) pid, cid, no link     (slave dies during isolate() or cleanup())
> > 3) pid, cid, link        (steady state)
> > 
> > If there is symmetry these are the only cases to consider. If not, there 
> > are more cases to consider.

Yeah, I agree with you. Previously, I want to remove the link in the end so 
that we don't leak a bind mount if slave crashes between veth removal and 
umount. However, as you pointed out, the bind mount will be leaked anyway if 
the slave crashes in isolate() since we create the bind mount first (and we 
must do because we need to have a reference to the namespace handle before 
creating veth, otherwise, it'll be removed automatically if the child process 
exits early).

So I'll swap the deletion order in cleanup() and augument the stale bind mounts 
cleanup (https://issues.apache.org/jira/browse/MESOS-2547).


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32426/#review77596
-----------------------------------------------------------


On March 24, 2015, 2:44 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32426/
> -----------------------------------------------------------
> 
> (Updated March 24, 2015, 2:44 a.m.)
> 
> 
> Review request for mesos, Chi Zhang, Ian Downes, and Vinod Kone.
> 
> 
> Bugs: MESOS-2528
>     https://issues.apache.org/jira/browse/MESOS-2528
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Symlink the namespace handle with ContainerID for the port mapping isolator.
> 
> See ticket for details. This patch will allow a smooth upgrade (i.e., rolling 
> foward and back are both safe).
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/isolators/network/port_mapping.hpp 
> 4dd066a47d43cb1d52f93294d86309151738743e 
>   src/slave/containerizer/isolators/network/port_mapping.cpp 
> 4bf0adeeac1cb6fe59f9c2ca8d5980b1500f5ddd 
>   src/tests/port_mapping_tests.cpp 8192deac8d9b7ea1896bb62a8b5961ef90326fa4 
> 
> Diff: https://reviews.apache.org/r/32426/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> Many existing tests should already capture the regression. For example, 
> ROOT_CleanUpOrphanTest checks if the bind mount dir is empty after the 
> container is destroyed.
> 
> Will add a compatibility test tomorrow.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>

Reply via email to