-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56195/#review166040
-----------------------------------------------------------



Patch looks great!

Reviews applied: [56814, 56195]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On Feb. 19, 2017, 4:37 p.m., Kevin Klues wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56195/
> -----------------------------------------------------------
> 
> (Updated Feb. 19, 2017, 4:37 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Gastón Kleiman, Gilbert Song, 
> Jie Yu, Joseph Wu, and Vinod Kone.
> 
> 
> Bugs: MESOS-7050
>     https://issues.apache.org/jira/browse/MESOS-7050
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously, the containizer launch path would leak FDs if the
> containerizer launch path failed between successfully calling
> prepare() on either the ContainerLogger (in the case of the Docker
> containerizer) or the IOSwitchboard (in the case of the mesos
> containerizer) and forking the actual container.
> 
> These components relied on the Subprocess call inside launcher->fork()
> to close these FDS on their behalf. If the containerizer launch path
> failed somewhere between calling prepare() and making this fork()
> call, these FDs would never be closed.
> 
> In the case of the IOSwitchboard, this would lead to deadlock in the
> destroy path because the future returned by the IOSwitchboard's
> cleanup function would never be satisfied. The IOSwitchboard doesn't
> shutdown until the FDs it allocates to the container have been closed.
> 
> This commit fixes this problem by updating the
> ContainerLogger::SubprocessInfo::FD abstraction to change the way
> manages FDS. Instead of tagging each FD with the Subprocess::IO::OWNED
> label and forcing the launcher->fork() call to deal with closing the
> FDs once it's forked a new subprocess, we now do things slightly
> differently now.
> 
> We now keep the default DUP label on each FD (instead fo changing it
> to OWNED) to cause launcher->fork() to dup it before mapping it onto
> the stdin/stdout/stderr of the subprocess. It then only closes the
> duped FD, leaving the original one open.
> 
> In doing so, it s now the contaienrizer's responsibility to ensure
> that these FDs are closed properly (whether that's between a
> successful prepare() call and launcher->fork()) or after
> launcher->fork() has completed successfully. While this has the
> potential to complicate things slightly on the SUCCESS path,
> at least it is now the containerizers's responsibility to close these
> FDS in *all* cases, rather than splitting that responsibility across
> components.
> 
> In order to simplify this, we've also modified the
> ContainerLogger::SubprocessInfo::FD abstraction to hold a Shared
> pointer to its underlying file descriptor and (optionally) close it on
> destruction. With this, we can ensure that all file descriptors
> created through this abstraction will be automatically closed onced
> their final reference goes out of scope (even if its been copied
> around several times).
> 
> In essence, this release the containerizer from the burden of manually
> closing these FDS itself. So long as it holds the final reference to
> these FDs (which it does), they will be automatically closed along
> *any* path out of containerizer->launch(). These are exactly the
> semantics we wanted to achieve.
> 
> In the case of the the ContainerLogger, ownership of these FDs (and
> thus their final reference) is passed to the containerizer in the
> SubprocessInfo struct returned by prepare(). In the case of the
> IOSwitchboard, we had to add a new API call to transfer ownership
> (since it is an isolator and prepare() can only return a protobuf),
> but the end result is the same.
> 
> 
> Diffs
> -----
> 
>   include/mesos/slave/container_logger.hpp 
> a3f619b79ca0188df9e231c600dfa396f39ab29a 
>   include/mesos/slave/containerizer.proto 
> c70d437ac3f8f813fcb71c04275cc8eeeb946065 
>   src/slave/containerizer/mesos/containerizer.hpp 
> 10a9b57660388ac2409458a4d07af64cc3b185e2 
>   src/slave/containerizer/mesos/containerizer.cpp 
> d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
>   src/slave/containerizer/mesos/io/switchboard.hpp 
> 7a7ad2da047ef316f4382e45526d503c87a903a0 
>   src/slave/containerizer/mesos/io/switchboard.cpp 
> b948f3c44dd2e1af2228ca1579f24ae3a94160b0 
> 
> Diff: https://reviews.apache.org/r/56195/diff/
> 
> 
> Testing
> -------
> 
> Linux CentOS 7:
> ```
> GTEST_FILTER="" make -j check
> sudo src/mesos-tests
> ```
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>

Reply via email to