[ https://issues.apache.org/jira/browse/MESOS-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Lambert updated MESOS-976:
--------------------------------
    Sprint: Q3 Sprint 1, Q3 Sprint 2 (was: Q3 Sprint 1)

> SlaveRecoveryTest/1.SchedulerFailover is flaky
> ----------------------------------------------
>
> Key: MESOS-976
> URL: https://issues.apache.org/jira/browse/MESOS-976
> Project: Mesos
> Issue Type: Bug
> Components: test
> Affects Versions: 0.18.0
> Reporter: Vinod Kone
> Assignee: Ian Downes
>
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from SlaveRecoveryTest/1, where TypeParam =
> mesos::internal::slave::CgroupsIsolator
> [ RUN ] SlaveRecoveryTest/1.SchedulerFailover
> I0206 20:18:31.525116 56447 master.cpp:239] Master ID:
> 2014-02-06-20:18:31-1740121354-55566-56447 Hostname:
> smfd-bkq-03-sr4.devel.twitter.com
> I0206 20:18:31.525295 56481 master.cpp:321] Master started on
> 10.37.184.103:55566
> I0206 20:18:31.525315 56481 master.cpp:324] Master only allowing
> authenticated frameworks to register!
> I0206 20:18:31.527093 56481 master.cpp:756] The newly elected leader is
> master@10.37.184.103:55566
> I0206 20:18:31.527122 56481 master.cpp:764] Elected as the leading master!
> I0206 20:18:31.530642 56473 slave.cpp:112] Slave started on > 9)@10.37.184.103:55566 > I0206 20:18:31.530802 56473 slave.cpp:212] Slave resources: cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I0206 20:18:31.531203 56473 slave.cpp:240] Slave hostname: > smfd-bkq-03-sr4.devel.twitter.com > I0206 20:18:31.531221 56473 slave.cpp:241] Slave checkpoint: true > I0206 20:18:31.531991 56482 cgroups_isolator.cpp:225] Using > /tmp/mesos_test_cgroup as cgroups hierarchy root > I0206 20:18:31.532470 56478 state.cpp:33] Recovering state from > '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta' > I0206 20:18:31.532698 56469 status_update_manager.cpp:188] Recovering status > update manager > I0206 20:18:31.533962 56472 sched.cpp:265] Authenticating with master > master@10.37.184.103:55566 > I0206 20:18:31.534102 56472 sched.cpp:234] Detecting new master > I0206 20:18:31.534124 56484 authenticatee.hpp:124] Creating new client SASL > connection > I0206 20:18:31.534299 56473 master.cpp:2317] Authenticating framework at > scheduler(9)@10.37.184.103:55566 > I0206 20:18:31.534459 56461 authenticator.hpp:140] Creating new server SASL > connection > I0206 20:18:31.534572 56466 authenticatee.hpp:212] Received SASL > authentication mechanisms: CRAM-MD5 > I0206 20:18:31.534595 56466 authenticatee.hpp:238] Attempting to authenticate > with mechanism 'CRAM-MD5' > I0206 20:18:31.534667 56474 authenticator.hpp:243] Received SASL > authentication start > I0206 20:18:31.534732 56474 authenticator.hpp:325] Authentication requires > more steps > I0206 20:18:31.534814 56468 authenticatee.hpp:258] Received SASL > authentication step > I0206 20:18:31.534946 56466 authenticator.hpp:271] Received SASL > authentication step > I0206 20:18:31.535007 56466 authenticator.hpp:317] Authentication success > I0206 20:18:31.535084 56471 authenticatee.hpp:298] Authentication success > I0206 20:18:31.535107 56461 master.cpp:2357] Successfully authenticated > framework at 
scheduler(9)@10.37.184.103:55566 > I0206 20:18:31.535392 56476 sched.cpp:339] Successfully authenticated with > master master@10.37.184.103:55566 > I0206 20:18:31.535512 56465 master.cpp:812] Received registration request > from scheduler(9)@10.37.184.103:55566 > I0206 20:18:31.535570 56465 master.cpp:830] Registering framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 at > scheduler(9)@10.37.184.103:55566 > I0206 20:18:31.535856 56465 hierarchical_allocator_process.hpp:332] Added > framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.537802 56482 cgroups_isolator.cpp:840] Recovering isolator > I0206 20:18:31.538462 56472 slave.cpp:2760] Finished recovery > I0206 20:18:31.538910 56472 slave.cpp:508] New master detected at > master@10.37.184.103:55566 > I0206 20:18:31.539036 56478 status_update_manager.cpp:162] New master > detected at master@10.37.184.103:55566 > I0206 20:18:31.539223 56464 master.cpp:1834] Attempting to register slave on > smfd-bkq-03-sr4.devel.twitter.com at slave(9)@10.37.184.103:55566 > I0206 20:18:31.539271 56472 slave.cpp:533] Detecting new master > I0206 20:18:31.539330 56464 master.cpp:2804] Adding slave > 2014-02-06-20:18:31-1740121354-55566-56447-0 at > smfd-bkq-03-sr4.devel.twitter.com with cpus(*):2; mem(*):1024; disk(*):1024; > ports(*):[31000-32000] > I0206 20:18:31.539454 56472 slave.cpp:551] Registered with master > master@10.37.184.103:55566; given slave ID > 2014-02-06-20:18:31-1740121354-55566-56447-0 > I0206 20:18:31.539620 56472 slave.cpp:564] Checkpointing SlaveInfo to > '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/slave.info' > I0206 20:18:31.539834 56475 hierarchical_allocator_process.hpp:445] Added > slave 2014-02-06-20:18:31-1740121354-55566-56447-0 > (smfd-bkq-03-sr4.devel.twitter.com) with cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] (and cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] available) 
> I0206 20:18:31.540341 56472 master.cpp:2272] Sending 1 offers to framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.543433 56472 master.cpp:1568] Processing reply for offers: [ > 2014-02-06-20:18:31-1740121354-55566-56447-0 ] on slave > 2014-02-06-20:18:31-1740121354-55566-56447-0 > (smfd-bkq-03-sr4.devel.twitter.com) for framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.543642 56472 master.hpp:411] Adding task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 with resources cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] on slave > 2014-02-06-20:18:31-1740121354-55566-56447-0 > (smfd-bkq-03-sr4.devel.twitter.com) > I0206 20:18:31.543781 56472 master.cpp:2441] Launching task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 with resources cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave > 2014-02-06-20:18:31-1740121354-55566-56447-0 > (smfd-bkq-03-sr4.devel.twitter.com) > I0206 20:18:31.544002 56484 slave.cpp:736] Got assigned task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 for framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.544097 56484 slave.cpp:2899] Checkpointing FrameworkInfo to > '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/framework.info' > I0206 20:18:31.544272 56484 slave.cpp:2906] Checkpointing framework pid > 'scheduler(9)@10.37.184.103:55566' to > '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/framework.pid' > I0206 20:18:31.544617 56484 slave.cpp:845] Launching task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 for framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.546721 56484 slave.cpp:3169] Checkpointing ExecutorInfo to > 
'/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/executor.info' > I0206 20:18:31.547317 56484 slave.cpp:3257] Checkpointing TaskInfo to > '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986/tasks/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/task.info' > I0206 20:18:31.547514 56484 slave.cpp:955] Queuing task > 'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' for executor > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > '2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.547590 56481 cgroups_isolator.cpp:517] Launching > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 > (/home/vinod/mesos/build/src/mesos-executor) in > /tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986 > with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] > for framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 in cgroup > mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986 > I0206 20:18:31.548408 56481 cgroups_isolator.cpp:717] Changing cgroup > controls for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 with resources cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I0206 20:18:31.548833 56481 cgroups_isolator.cpp:1007] Updated 'cpu.shares' > to 2048 for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 
2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.549294 56481 cgroups_isolator.cpp:1117] Updated > 'memory.soft_limit_in_bytes' to 1GB for executor > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.550107 56481 cgroups_isolator.cpp:1147] Updated > 'memory.limit_in_bytes' to 1GB for executor > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.550571 56481 cgroups_isolator.cpp:1174] Started listening for > OOM events for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.551553 56481 cgroups_isolator.cpp:569] Forked executor at = > 56671 > Checkpointing executor's forked pid 56671 to > '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986/pids/forked.pid' > I0206 20:18:31.552222 56472 slave.cpp:2098] Monitoring executor > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 forked at pid 56671 > Fetching resources into > '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986' > I0206 20:18:31.604012 56472 slave.cpp:1431] Got registration for executor > 'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.604167 56472 slave.cpp:1516] Checkpointing executor pid > 'executor(1)@10.37.184.103:46181' to > 
'/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986/pids/libprocess.pid' > I0206 20:18:31.605183 56472 slave.cpp:1552] Flushing queued task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 for executor > 'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > Registered executor on smfd-bkq-03-sr4.devel.twitter.com > Starting task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 > sh -c 'sleep 1000' > Forked command at 56712 > I0206 20:18:31.613098 56481 slave.cpp:1765] Handling status update > TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 from > executor(1)@10.37.184.103:46181 > I0206 20:18:31.613628 56469 status_update_manager.cpp:314] Received status > update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.614006 56469 status_update_manager.hpp:342] Checkpointing > UPDATE for status update TASK_RUNNING (UUID: > fc151a46-751b-4c4b-b048-1727752f34e3) for task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.795529 56469 status_update_manager.cpp:367] Forwarding status > update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 to master@10.37.184.103:55566 > I0206 20:18:31.795992 56480 slave.cpp:1890] Sending acknowledgement for > status update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for > task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 
2014-02-06-20:18:31-1740121354-55566-56447-0000 to > executor(1)@10.37.184.103:46181 > I0206 20:18:31.796131 56471 master.cpp:2020] Status update TASK_RUNNING > (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 from > slave(9)@10.37.184.103:55566 > I0206 20:18:31.797099 56483 status_update_manager.cpp:392] Received status > update acknowledgement (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.797165 56483 status_update_manager.hpp:342] Checkpointing ACK > for status update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) > for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.882767 56481 slave.cpp:394] Slave terminating > I0206 20:18:31.883112 56481 master.cpp:641] Slave > 2014-02-06-20:18:31-1740121354-55566-56447-0 > (smfd-bkq-03-sr4.devel.twitter.com) disconnected > I0206 20:18:31.883200 56476 hierarchical_allocator_process.hpp:484] Slave > 2014-02-06-20:18:31-1740121354-55566-56447-0 disconnected > I0206 20:18:31.888206 56473 sched.cpp:265] Authenticating with master > master@10.37.184.103:55566 > I0206 20:18:31.888473 56473 sched.cpp:234] Detecting new master > I0206 20:18:31.888556 56469 authenticatee.hpp:124] Creating new client SASL > connection > I0206 20:18:31.888978 56484 master.cpp:2317] Authenticating framework at > scheduler(10)@10.37.184.103:55566 > I0206 20:18:31.889348 56469 authenticator.hpp:140] Creating new server SASL > connection > I0206 20:18:31.889925 56469 authenticatee.hpp:212] Received SASL > authentication mechanisms: CRAM-MD5 > I0206 20:18:31.889989 56469 authenticatee.hpp:238] Attempting to authenticate > with mechanism 'CRAM-MD5' > I0206 20:18:31.890059 56469 authenticator.hpp:243] Received SASL > authentication start > I0206 
20:18:31.890233 56469 authenticator.hpp:325] Authentication requires > more steps > I0206 20:18:31.890399 56468 authenticatee.hpp:258] Received SASL > authentication step > I0206 20:18:31.890554 56484 authenticator.hpp:271] Received SASL > authentication step > I0206 20:18:31.890630 56484 authenticator.hpp:317] Authentication success > I0206 20:18:31.890728 56470 authenticatee.hpp:298] Authentication success > I0206 20:18:31.890748 56484 master.cpp:2357] Successfully authenticated > framework at scheduler(10)@10.37.184.103:55566 > I0206 20:18:31.892210 56469 sched.cpp:339] Successfully authenticated with > master master@10.37.184.103:55566 > I0206 20:18:31.892410 56473 master.cpp:900] Re-registering framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 at > scheduler(10)@10.37.184.103:55566 > I0206 20:18:31.892460 56473 master.cpp:926] Framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 failed over > W0206 20:18:31.892691 56465 master.cpp:1048] Ignoring deactivate framework > message for framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 from > 'scheduler(9)@10.37.184.103:55566' because it is not from the registered > framework 'scheduler(10)@10.37.184.103:55566' > I0206 20:18:31.897049 56466 slave.cpp:112] Slave started on > 10)@10.37.184.103:55566 > I0206 20:18:31.897207 56466 slave.cpp:212] Slave resources: cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I0206 20:18:31.897536 56466 slave.cpp:240] Slave hostname: > smfd-bkq-03-sr4.devel.twitter.com > I0206 20:18:31.897554 56466 slave.cpp:241] Slave checkpoint: true > I0206 20:18:31.898388 56463 cgroups_isolator.cpp:225] Using > /tmp/mesos_test_cgroup as cgroups hierarchy root > I0206 20:18:31.898936 56472 state.cpp:33] Recovering state from > '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta' > I0206 20:18:31.901702 56465 slave.cpp:2828] Recovering framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.901759 56465 slave.cpp:3020] Recovering 
executor > 'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:31.902716 56464 status_update_manager.cpp:188] Recovering status > update manager > I0206 20:18:31.902884 56464 status_update_manager.cpp:196] Recovering > executor 'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:34.475915 56463 cgroups_isolator.cpp:840] Recovering isolator > I0206 20:18:34.476066 56463 cgroups_isolator.cpp:847] Recovering executor > 'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:34.477478 56463 cgroups_isolator.cpp:1174] Started listening for > OOM events for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:34.478728 56463 slave.cpp:2700] Sending reconnect request to > executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 at > executor(1)@10.37.184.103:46181 > I0206 20:18:34.480114 56476 slave.cpp:1597] Re-registering executor > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:34.480566 56476 cgroups_isolator.cpp:717] Changing cgroup > controls for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 with resources cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I0206 20:18:34.481370 56476 cgroups_isolator.cpp:1007] Updated 'cpu.shares' > to 2048 for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:34.481827 56476 cgroups_isolator.cpp:1117] Updated > 'memory.soft_limit_in_bytes' to 1GB for executor > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > Re-registered executor on 
smfd-bkq-03-sr4.devel.twitter.com > I0206 20:18:34.489497 56471 slave.cpp:1713] Cleaning up un-reregistered > executors > I0206 20:18:34.489588 56471 slave.cpp:2760] Finished recovery > I0206 20:18:34.490048 56463 slave.cpp:508] New master detected at > master@10.37.184.103:55566 > I0206 20:18:34.490257 56475 status_update_manager.cpp:162] New master > detected at master@10.37.184.103:55566 > I0206 20:18:34.490357 56463 slave.cpp:533] Detecting new master > W0206 20:18:34.490603 56480 master.cpp:1878] Slave at > slave(10)@10.37.184.103:55566 (smfd-bkq-03-sr4.devel.twitter.com) is being > allowed to re-register with an already in use id > (2014-02-06-20:18:31-1740121354-55566-56447-0) > I0206 20:18:34.490927 56479 slave.cpp:601] Re-registered with master > master@10.37.184.103:55566 > I0206 20:18:34.491322 56461 hierarchical_allocator_process.hpp:498] Slave > 2014-02-06-20:18:31-1740121354-55566-56447-0 reconnected > I0206 20:18:34.491421 56468 slave.cpp:1312] Updating framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 pid to > scheduler(10)@10.37.184.103:55566 > I0206 20:18:34.491444 56480 master.cpp:1673] Asked to kill task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:34.491488 56468 slave.cpp:1320] Checkpointing framework pid > 'scheduler(10)@10.37.184.103:55566' to > '/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/framework.pid' > I0206 20:18:34.491497 56480 master.cpp:1707] Telling slave > 2014-02-06-20:18:31-1740121354-55566-56447-0 > (smfd-bkq-03-sr4.devel.twitter.com) to kill task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:34.491657 56468 slave.cpp:1013] Asked to kill task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > Shutting down > 
Killing process tree at pid 56712 > Killed the following process trees: > [ > --- 56712 sleep 1000 > ] > Command terminated with signal Killed (pid: 56712) > I0206 20:18:34.615216 56463 slave.cpp:1765] Handling status update > TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 from > executor(1)@10.37.184.103:46181 > I0206 20:18:34.615556 56483 cgroups_isolator.cpp:717] Changing cgroup > controls for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 with resources > I0206 20:18:34.615624 56476 status_update_manager.cpp:314] Received status > update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:34.615701 56476 status_update_manager.hpp:342] Checkpointing > UPDATE for status update TASK_KILLED (UUID: > d9d37827-3002-4a67-8659-fa36f1986fc7) for task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:34.706945 56476 status_update_manager.cpp:367] Forwarding status > update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 to master@10.37.184.103:55566 > I0206 20:18:34.707263 56476 slave.cpp:1890] Sending acknowledgement for > status update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for > task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 to > executor(1)@10.37.184.103:46181 > I0206 20:18:34.707352 56469 master.cpp:2020] Status update TASK_KILLED (UUID: > d9d37827-3002-4a67-8659-fa36f1986fc7) for task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 from > 
slave(10)@10.37.184.103:55566 > I0206 20:18:34.707620 56469 master.hpp:429] Removing task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 with resources cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] on slave > 2014-02-06-20:18:31-1740121354-55566-56447-0 > (smfd-bkq-03-sr4.devel.twitter.com) > I0206 20:18:34.708348 56466 hierarchical_allocator_process.hpp:637] Recovered > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total > allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on > slave 2014-02-06-20:18:31-1740121354-55566-56447-0 from framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:34.708673 56469 status_update_manager.cpp:392] Received status > update acknowledgement (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task > d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:34.708749 56469 status_update_manager.hpp:342] Checkpointing ACK > for status update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) > for task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:34.709411 56470 master.cpp:2272] Sending 1 offers to framework > 2014-02-06-20:18:31-1740121354-55566-56447-0000 > I0206 20:18:34.809782 56447 master.cpp:583] Master terminating > I0206 20:18:34.810066 56447 master.cpp:246] Shutting down master > I0206 20:18:34.810134 56482 slave.cpp:1965] master@10.37.184.103:55566 exited > W0206 20:18:34.810184 56482 slave.cpp:1968] Master disconnected! 
Waiting for > a new master to be elected > I0206 20:18:34.810652 56447 master.cpp:289] Removing slave > 2014-02-06-20:18:31-1740121354-55566-56447-0 > (smfd-bkq-03-sr4.devel.twitter.com) > I0206 20:18:34.813144 56447 slave.cpp:394] Slave terminating > I0206 20:18:34.821583 56467 cgroups.cpp:1209] Trying to freeze cgroup > /tmp/mesos_test_cgroup/mesos_test > I0206 20:18:34.821652 56467 cgroups.cpp:1248] Successfully froze cgroup > /tmp/mesos_test_cgroup/mesos_test after 1 attempts > I0206 20:18:34.823129 56471 cgroups.cpp:1224] Trying to thaw cgroup > /tmp/mesos_test_cgroup/mesos_test > I0206 20:18:34.823247 56471 cgroups.cpp:1334] Successfully thawed > /tmp/mesos_test_cgroup/mesos_test > I0206 20:18:34.923945 56470 cgroups.cpp:1209] Trying to freeze cgroup > /tmp/mesos_test_cgroup/mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986 > I0206 20:18:34.924018 56470 cgroups.cpp:1248] Successfully froze cgroup > /tmp/mesos_test_cgroup/mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986 > after 1 attempts > I0206 20:18:34.925506 56461 cgroups.cpp:1224] Trying to thaw cgroup > /tmp/mesos_test_cgroup/mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986 > I0206 20:18:34.925580 56461 cgroups.cpp:1334] Successfully thawed > /tmp/mesos_test_cgroup/mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986 > [ OK ] SlaveRecoveryTest/1.SchedulerFailover (3408 ms) > [----------] 1 test from SlaveRecoveryTest/1 (3409 ms total) > [----------] Global test environment tear-down > ../../src/tests/environment.cpp:247: Failure > Failed > Tests completed with child processes remaining: > 
-+- 56447 /home/vinod/mesos/build/src/.libs/lt-mesos-tests --verbose
> --gtest_filter=*SlaveRecoveryTest/1.SchedulerFailover* --gtest_repeat=10
> \--- 56671 ()