[ https://issues.apache.org/jira/browse/MESOS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Lambert updated MESOS-1149:
---------------------------------
    Sprint: Q2'14 Sprint 2, Q2'14 Sprint 3  (was: Q2'14 Sprint 2)

> SlaveRecovery.Reboot test doesn't reap executor
> -----------------------------------------------
>
>                 Key: MESOS-1149
>                 URL: https://issues.apache.org/jira/browse/MESOS-1149
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.18.0
>            Reporter: Ian Downes
>            Assignee: Ian Downes
>
> The executor pid should be reaped after the slave is "rebooted" and before
> the next slave is started to correctly simulate a host reboot, otherwise
> there's a race and it may be present (as a zombie) when the test completes.
> {noformat}
> # ./bin/mesos-tests.sh --gtest_filter="SlaveRecovery*Reboot"
> --gtest_repeat=100 --gtest_break_on_failure=1
> Source directory: /home/idownes/workspace/mesos
> Build directory: /home/idownes/workspace/mesos/build
> -------------------------------------------------------------
> We cannot run any cgroups tests that require mounting
> hierarchies because you have the following hierarchies mounted:
> /sys/fs/cgroup/cpu, /sys/fs/cgroup/cpuacct, /sys/fs/cgroup/freezer,
> /sys/fs/cgroup/memory
> We'll disable the CgroupsNoHierarchyTest test fixture for now.
> -------------------------------------------------------------
> Repeating all tests (iteration 1) . . .
> Note: Google Test filter =
> SlaveRecovery*Reboot-CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy:
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from SlaveRecoveryTest/0, where TypeParam =
> mesos::internal::slave::MesosContainerizer
> [ RUN      ] SlaveRecoveryTest/0.Reboot
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> I0326 22:44:23.032676 34814 exec.cpp:131] Version: 0.19.0
> I0326 22:44:23.035573 34835 exec.cpp:205] Executor registered on slave
> 20140326-224421-1828659978-41066-34759-0
> Registered executor on smfd-atr-11-sr1.devel.twitter.com
> Starting task f503d996-3e82-43f6-861b-38bacd5e4855
> sh -c 'sleep 1000'
> Forked command at 34854
> I0326 22:44:23.263057 34852 exec.cpp:378] Executor asked to shutdown
> Shutting down
> Killing process tree at pid 34854
> Killed the following process trees:
> [
> --- 34854 sleep 1000
> ]
> [       OK ] SlaveRecoveryTest/0.Reboot (1997 ms)
> [----------] 1 test from SlaveRecoveryTest/0 (1998 ms total)
> [----------] Global test environment tear-down
> ../../src/tests/environment.cpp:244: Failure
> Failed
> Tests completed with child processes remaining:
> -+- 34759 /home/idownes/workspace/mesos/build/src/.libs/lt-mesos-tests
> --gtest_filter=SlaveRecovery*Reboot --gtest_repeat=100
> --gtest_break_on_failure=1
>  \--- 34814 ()
> {noformat}
> 34814 () is the zombied executor.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
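
For readers unfamiliar with the zombie behaviour described in the issue, here is a minimal sketch of the general POSIX mechanism. It is not the Mesos test code or the eventual fix. The helper names (`spawn_sleeper`, `kill_without_reaping`, `reap`, `pid_exists`) are hypothetical, the sketch assumes Linux and Python 3, and a plain `sleep 1000` child stands in for the executor. A killed child stays in the process table until its parent waits on it, which is presumably why the executor (34814) still shows up as `()` under the test binary at tear-down.

```python
import os
import signal


def pid_exists(pid):
    """Return True if pid still has a process-table entry (zombies count)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but owned by another user
    return True


def spawn_sleeper():
    """Fork a child running `sleep 1000`, a stand-in for the executor."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp("sleep", ["sleep", "1000"])
        finally:
            os._exit(127)  # reached only if exec failed
    return pid


def kill_without_reaping(pid):
    """Kill the child and wait until it is dead, but leave it unreaped."""
    os.kill(pid, signal.SIGKILL)
    # WNOWAIT reports the termination yet keeps the child waitable,
    # so it remains a zombie, as in the failing test run.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)


def reap(pid):
    """Collect the child's exit status, removing the zombie entry."""
    _, status = os.waitpid(pid, 0)
    return status
```

With these helpers, `pid_exists(pid)` is expected to stay true after `kill_without_reaping(pid)` and to become false only once `reap(pid)` has returned. That ordering is what the issue asks for: reap the executor after the simulated reboot and before the next slave starts.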