----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/69684/#review211790 -----------------------------------------------------------
Fix it, then Ship it! src/slave/containerizer/mesos/containerizer.cpp Lines 1385-1386 (patched) <https://reviews.apache.org/r/69684/#comment297389> 4 space ahead if you make it to the newline after ( I will fix it for you - Gilbert Song On Jan. 7, 2019, 7:27 a.m., Andrei Budnik wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/69684/ > ----------------------------------------------------------- > > (Updated Jan. 7, 2019, 7:27 a.m.) > > > Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Vinod Kone. > > > Bugs: MESOS-6632 > https://issues.apache.org/jira/browse/MESOS-6632 > > > Repository: mesos > > > Description > ------- > > For the period between IOSB server is up and container process exec, > if the containerizer launch returns failure or discarded, the FD used > for signaling from the container process to the IOSB finish redirect > will be leaked, which would cause the IOSB stuck at `io::redirect` > forever. It would block the containerizer from cleaning up this > container at IOSB. > > This issue could be exposed if there are frequent check containers > launch on an agent with heavy loads. > > This patch fixes the issue by handling discard of a `launch` future, > so the container IO is cleaned up and therefore all FDs are closed. > > > Diffs > ----- > > src/slave/containerizer/mesos/containerizer.cpp > a5cf2da55c046c5c45e0c2ca3400f64de12de62b > > > Diff: https://reviews.apache.org/r/69684/diff/6/ > > > Testing > ------- > > Manual testing: > > Add `return Failure("Test!");` to `IOSwitchboard::_prepare()`: > https://github.com/apache/mesos/blob/7a378597d6a4359de0b1eabb563d4c12cdb4290e/src/slave/containerizer/mesos/io/switchboard.cpp#L657 > > 1) Compile without this patch applied > 2) Run `src/mesos-tests --verbose > --gtest_filter=ContentType/AgentAPITest.AttachContainerInputRepeat/0` > 3) This test fails due to a hanging IOSB (15 seconds timeout) after trying to > destroy IOSB on the HTTP error "Test!" > > 1) Compile with this patch applied > 2) Run `src/mesos-tests --verbose > --gtest_filter=ContentType/AgentAPITest.AttachContainerInputRepeat/0` > 3) This test fails due to the HTTP error "Test!" and terminates immediately > (as expected) > > > Thanks, > > Andrei Budnik > >