-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52446/#review151092
-----------------------------------------------------------

Fix it, then Ship it!


src/slave/containerizer/mesos/containerizer.cpp (lines 711 - 713)
<https://reviews.apache.org/r/52446/#comment219207>

    Can you introduce a helper in paths.hpp|cpp:
    ```
    paths::getContainerTerminationPath(...);
    ```


src/slave/containerizer/mesos/containerizer.cpp (lines 1699 - 1705)
<https://reviews.apache.org/r/52446/#comment219206>

    Let's do that only for nested containers.


src/slave/containerizer/mesos/containerizer.cpp (line 1708)
<https://reviews.apache.org/r/52446/#comment219208>

    Failed to get container termination file: ...


src/slave/containerizer/mesos/containerizer.cpp (lines 2197 - 2203)
<https://reviews.apache.org/r/52446/#comment219210>

    No need for this? I think checkpoint supports a Protobuf Message directly.


src/slave/containerizer/mesos/containerizer.cpp (line 2211)
<https://reviews.apache.org/r/52446/#comment219211>

    Totally orthogonal issue: I am wondering if we should prevent people from creating a nested container under a legacy container?


src/slave/containerizer/mesos/paths.cpp (lines 142 - 154)
<https://reviews.apache.org/r/52446/#comment219212>

    You can use protobuf::read<ContainerTermination>.


src/tests/containerizer/nested_container_tests.cpp (line 32)
<https://reviews.apache.org/r/52446/#comment219213>

    Please avoid this. Use 'using' explicitly.


src/tests/containerizer/nested_container_tests.cpp (line 48)
<https://reviews.apache.org/r/52446/#comment219214>

    2 lines apart


src/tests/containerizer/nested_container_tests.cpp (line 59)
<https://reviews.apache.org/r/52446/#comment219215>

    s/false/true/


src/tests/containerizer/nested_container_tests.cpp (lines 95 - 97)
<https://reviews.apache.org/r/52446/#comment219216>

    No need for this.


src/tests/containerizer/nested_container_tests.cpp (line 116)
<https://reviews.apache.org/r/52446/#comment219217>

    This looks redundant? The fact that the first 'wait' returns means that the container has been destroyed.


- Jie Yu


On Sept. 30, 2016, 9:27 p.m., Kevin Klues wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52446/
> -----------------------------------------------------------
>
> (Updated Sept. 30, 2016, 9:27 p.m.)
>
>
> Review request for mesos, Gilbert Song, Jie Yu, and Vinod Kone.
>
>
> Bugs: MESOS-6287
>     https://issues.apache.org/jira/browse/MESOS-6287
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Previously, when a nested container was being destroyed, its runtime
> directory was being deleted (just the same as a top-level container).
> However, this meant that calling 'wait()' on a previously terminated
> nested container would return 'None()' since its status had already
> been reaped. The problem with this, however, is that this will cause
> an entire pod to be terminated since it thinks that the container it
> is calling wait on cannot be found.
>
> To fix this, we leave the runtime directory of nested containers
> around until their top-level containers are destroyed. Additionally,
> we checkpoint the entire termination state of the nested container
> into its runtime directory, so that subsequent calls to 'wait()' can
> retrieve the full termination state for the lifetime of the top-level
> container.
>
>
> Diffs
> -----
>
>   src/Makefile.am f093000e0282a8d5ac17e7ba33711690ccdfe68a
>   src/slave/containerizer/mesos/containerizer.cpp 522d2c37229b07b66a0824c3e246c32f8d803b10
>   src/slave/containerizer/mesos/paths.hpp 1051c219c55253d03199045b6d2f43377ae93e53
>   src/slave/containerizer/mesos/paths.cpp 6c6b4dcc39fbc00485552caab88457918e622e08
>   src/tests/containerizer/nested_container_tests.cpp PRE-CREATION
>
> Diff: https://reviews.apache.org/r/52446/diff/
>
>
> Testing
> -------
>
> GTEST_FILTER="" make -j check
> sudo src/mesos-tests
>
>
> Thanks,
>
> Kevin Klues
>
>
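
For reference, the idea in the description (checkpoint the termination state into the nested container's runtime directory so that a later 'wait()' can still find it) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the patch: it uses plain C++17 and std::filesystem rather than stout or protobuf, and the names `Termination`, `terminationPath`, `checkpointTermination`, `readTermination`, and the "termination" file name are all hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical stand-in for the ContainerTermination protobuf message.
struct Termination
{
  int status;
  std::string message;
};

// Hypothetical path helper, similar in spirit to the helper requested in
// the review; the real signature and directory layout are not shown there.
fs::path terminationPath(
    const fs::path& runtimeDir,
    const std::string& containerId)
{
  return runtimeDir / containerId / "termination";
}

// Writes the termination state into the container's runtime directory.
// Returns false if the directory or file could not be written.
bool checkpointTermination(
    const fs::path& runtimeDir,
    const std::string& containerId,
    const Termination& termination)
{
  const fs::path path = terminationPath(runtimeDir, containerId);

  std::error_code ec;
  fs::create_directories(path.parent_path(), ec);
  if (ec) {
    return false;
  }

  std::ofstream out(path, std::ios::trunc);
  if (!out) {
    return false;
  }

  out << termination.status << '\n' << termination.message << '\n';
  out.flush();
  return out.good();
}

// Reads the checkpointed state back. An empty result means no termination
// file exists, e.g. because the runtime directory was already removed.
std::optional<Termination> readTermination(
    const fs::path& runtimeDir,
    const std::string& containerId)
{
  std::ifstream in(terminationPath(runtimeDir, containerId));
  if (!in) {
    return std::nullopt;
  }

  Termination termination;
  if (!(in >> termination.status)) {
    return std::nullopt;
  }

  in.ignore(); // Skip the newline that follows the status.
  std::getline(in, termination.message);
  return termination;
}
```

In this sketch, a second lookup keeps returning the stored state for as long as the directory exists, and returns nothing once it has been removed, which mirrors the lifetime described above (the runtime directory stays until the top-level container is destroyed).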