----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/56195/#review166829 -----------------------------------------------------------
include/mesos/slave/container_logger.hpp (lines 104 - 106) <https://reviews.apache.org/r/56195/#comment238904> I don't think we need those anymore? include/mesos/slave/container_logger.hpp (line 121) <https://reviews.apache.org/r/56195/#comment238925> This will probably break windows build? src/slave/containerizer/mesos/containerizer.cpp (line 1252) <https://reviews.apache.org/r/56195/#comment238905> s/containerIo/containerIO/ so that it's the same as that in the header. src/slave/containerizer/mesos/containerizer.cpp (line 2089) <https://reviews.apache.org/r/56195/#comment238983> This is a bit weird here, especially the name of the function suggests that it is a const function, but in fact it has side effects. I'd suggest we simply rely on isolator cleanup to cleanup the reference to the ContainerIO struct. src/slave/containerizer/mesos/io/switchboard.hpp (lines 56 - 57) <https://reviews.apache.org/r/56195/#comment238927> I would suggest we rename ContainerLogger::SubprocessInfo to ContainerIO. Also see my comments below. src/slave/containerizer/mesos/io/switchboard.cpp (lines 310 - 314) <https://reviews.apache.org/r/56195/#comment238928> If we only have `ContainerIO`, then the logic here can just be using the default copy constructor. Also, I'd suggest we don't use 'Owned' pointer here since ContainerIO is copyable. src/slave/containerizer/mesos/io/switchboard.cpp (lines 311 - 313) <https://reviews.apache.org/r/56195/#comment238926> The `std::move` here is not necessary. We haven't defined move constructor for Subprocess::IO. src/slave/containerizer/mesos/io/switchboard.cpp (line 626) <https://reviews.apache.org/r/56195/#comment238980> I'd suggest we try to remove this in `cleanup` as a safetynet. Also, as I mentioned above, let's cleanup the reference in cleanup so that we don't need to perform the extra step in containerizer destroy path. - Jie Yu On Feb. 22, 2017, 7:50 p.m., Kevin Klues wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/56195/ > ----------------------------------------------------------- > > (Updated Feb. 22, 2017, 7:50 p.m.) > > > Review request for mesos, Alexander Rukletsov, Gastón Kleiman, Gilbert Song, > Jie Yu, Joseph Wu, and Vinod Kone. > > > Bugs: MESOS-7050 > https://issues.apache.org/jira/browse/MESOS-7050 > > > Repository: mesos > > > Description > ------- > > Previously, the containizer launch path would leak FDs if the > containerizer launch path failed between successfully calling > prepare() on either the ContainerLogger (in the case of the Docker > containerizer) or the IOSwitchboard (in the case of the mesos > containerizer) and forking the actual container. > > These components relied on the Subprocess call inside launcher->fork() > to close these FDS on their behalf. If the containerizer launch path > failed somewhere between calling prepare() and making this fork() > call, these FDs would never be closed. > > In the case of the IOSwitchboard, this would lead to deadlock in the > destroy path because the future returned by the IOSwitchboard's > cleanup function would never be satisfied. The IOSwitchboard doesn't > shutdown until the FDs it allocates to the container have been closed. > > This commit fixes this problem by updating the > ContainerLogger::SubprocessInfo::FD abstraction to change the way > manages FDS. Instead of tagging each FD with the Subprocess::IO::OWNED > label and forcing the launcher->fork() call to deal with closing the > FDs once it's forked a new subprocess, we now do things slightly > differently now. > > We now keep the default DUP label on each FD (instead fo changing it > to OWNED) to cause launcher->fork() to dup it before mapping it onto > the stdin/stdout/stderr of the subprocess. It then only closes the > duped FD, leaving the original one open. > > In doing so, it s now the contaienrizer's responsibility to ensure > that these FDs are closed properly (whether that's between a > successful prepare() call and launcher->fork()) or after > launcher->fork() has completed successfully. While this has the > potential to complicate things slightly on the SUCCESS path, > at least it is now the containerizers's responsibility to close these > FDS in *all* cases, rather than splitting that responsibility across > components. > > In order to simplify this, we've also modified the > ContainerLogger::SubprocessInfo::FD abstraction to hold a Shared > pointer to its underlying file descriptor and (optionally) close it on > destruction. With this, we can ensure that all file descriptors > created through this abstraction will be automatically closed onced > their final reference goes out of scope (even if its been copied > around several times). > > In essence, this release the containerizer from the burden of manually > closing these FDS itself. So long as it holds the final reference to > these FDs (which it does), they will be automatically closed along > *any* path out of containerizer->launch(). These are exactly the > semantics we wanted to achieve. > > In the case of the the ContainerLogger, ownership of these FDs (and > thus their final reference) is passed to the containerizer in the > SubprocessInfo struct returned by prepare(). In the case of the > IOSwitchboard, we had to add a new API call to transfer ownership > (since it is an isolator and prepare() can only return a protobuf), > but the end result is the same. > > > Diffs > ----- > > include/mesos/slave/container_logger.hpp > a3f619b79ca0188df9e231c600dfa396f39ab29a > include/mesos/slave/containerizer.proto > c70d437ac3f8f813fcb71c04275cc8eeeb946065 > src/slave/containerizer/mesos/containerizer.hpp > 10a9b57660388ac2409458a4d07af64cc3b185e2 > src/slave/containerizer/mesos/containerizer.cpp > d2b4f75a55dbe4746bc2dfc180335fa831a554ef > src/slave/containerizer/mesos/io/switchboard.hpp > 7a7ad2da047ef316f4382e45526d503c87a903a0 > src/slave/containerizer/mesos/io/switchboard.cpp > b948f3c44dd2e1af2228ca1579f24ae3a94160b0 > > Diff: https://reviews.apache.org/r/56195/diff/ > > > Testing > ------- > > Linux CentOS 7: > ``` > GTEST_FILTER="" make -j check > sudo src/mesos-tests > ``` > > > Thanks, > > Kevin Klues > >