> On Aug. 7, 2018, 4:03 p.m., Vinod Kone wrote: > > src/slave/containerizer/mesos/utils.cpp > > Line 102 (original), 105 (patched) > > <https://reviews.apache.org/r/68257/diff/1/?file=2069850#file2069850line108> > > > > Are we guaranteed that there are no short-lived processes, other than > > the task process, at the 2nd level? If not, we will have the same issue > > right? > > > > Modulo the above question, the change LGTM. > > Andrei Budnik wrote: > There are two types of 2nd-level processes: > 1) the command executor's task > 2) the nested container's task > > E.g. the process tree can be like the following: > 0. mesos-containerizer (`init`) > 1. mesos-executor (command executor) > 2. sleep 1000 (command executor's task) > 1. mesos-containerizer (`init` of a nested container) <- enters `mnt` > namespace of command executor's task before forking a task > 2. echo "echo" (nested container's task) > > Since we skip 1st-level processes whose `mnt` namespace is not the same > as `init` process (PID 1), the algorithm doesn't iterate over their 2nd-level > processes. That gives a gurarantee that we only detect command executor's > task, but not the short-lived processes. > > Alexander Rukletsov wrote: > Let's make it explicit to the reader. I suggest to: > 1. Collect _candidates_ (2nd level children) first. > 2. Assert there are >= 1 candidates. > 3. Filter all candidates except those whose (`mnt` differs from root's > `mnt` && parent's `mnt` equals to root's `mnt`). > 4. Assert there are now 0 or 1 candidates left. > 5. Return pid of that single candidate or root's pid if there are none.
@andrei Thanks for the info. IIUC, any processes launched by the mesos-executor or the task using `LAUNCH_NESTED_CONTAINER` API will end up with the process tree you showed above. I just wanted to make sure it is not possible for us to see a child process of command executor that is a sibling of the task process (e.g., task launched a process without using LNC API but it ended up as a sibling, command executor runs a script/binary without using LNC API etc). - Vinod ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68257/#review206947 ----------------------------------------------------------- On Aug. 7, 2018, 1:46 p.m., Andrei Budnik wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68257/ > ----------------------------------------------------------- > > (Updated Aug. 7, 2018, 1:46 p.m.) > > > Review request for mesos, Alexander Rukletsov, Gilbert Song, Jie Yu, and > Kevin Klues. > > > Bugs: MESOS-9116 > https://issues.apache.org/jira/browse/MESOS-9116 > > > Repository: mesos > > > Description > ------- > > Previously, we were walking the process tree from the container's > `init` process to find the first process along the way whose `mnt` > namespace differs from the `init` process. We expected this algorithm > to always return the PID of the command executor's task. However, if > someone launches multiple nested containers within the process tree, > the algorithm might detect the PID of the nested container instead of > the command executor's task. The detected PID might belong to a > short-lived container, so the container's process might terminate at > the moment the containerizer launcher (aka `nanny`) process tries to > enter its `mnt` namespace. This patch fixes the detection algorithm > so that it always returns PID of the command executor's task. > > > Diffs > ----- > > src/slave/containerizer/mesos/utils.cpp > 30e76d1d91651975033078f5450e45f5f2fd8ba0 > > > Diff: https://reviews.apache.org/r/68257/diff/1/ > > > Testing > ------- > > 1) Internal CI with disabled > `ROOT_CGROUPS_LaunchNestedContainerSessionsInParallel` test (see previous > patch). > 2) Fedora 25: `./src/mesos-tests > --gtest_filter=*AgentAPITest.LaunchNestedContainerSessionInParallel* > --gtest_break_on_failure --gtest_repeat=100 --verbose` > > > Thanks, > > Andrei Budnik > >