[ https://issues.apache.org/jira/browse/MESOS-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin Mahler reopened MESOS-2228: ------------------------------------ Turns out this change didn't increase 3 second timeout in the command executor after all, I'll follow up. > SlaveTest.MesosExecutorGracefulShutdown is flaky > ------------------------------------------------ > > Key: MESOS-2228 > URL: https://issues.apache.org/jira/browse/MESOS-2228 > Project: Mesos > Issue Type: Bug > Components: test > Affects Versions: 0.22.0 > Reporter: Vinod Kone > Assignee: Benjamin Mahler > Labels: twitter > Fix For: 0.22.0 > > > Observed this on internal CI > {noformat} > [ RUN ] SlaveTest.MesosExecutorGracefulShutdown > Using temporary directory > '/tmp/SlaveTest_MesosExecutorGracefulShutdown_AWdtVJ' > I0124 08:14:04.399211 7926 leveldb.cpp:176] Opened db in 27.364056ms > I0124 08:14:04.402632 7926 leveldb.cpp:183] Compacted db in 3.357646ms > I0124 08:14:04.402691 7926 leveldb.cpp:198] Created db iterator in 23822ns > I0124 08:14:04.402708 7926 leveldb.cpp:204] Seeked to beginning of db in > 1913ns > I0124 08:14:04.402716 7926 leveldb.cpp:273] Iterated through 0 keys in the > db in 458ns > I0124 08:14:04.402767 7926 replica.cpp:744] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0124 08:14:04.403728 7951 recover.cpp:449] Starting replica recovery > I0124 08:14:04.404011 7951 recover.cpp:475] Replica is in EMPTY status > I0124 08:14:04.407765 7950 replica.cpp:641] Replica in EMPTY status received > a broadcasted recover request > I0124 08:14:04.408710 7951 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I0124 08:14:04.419666 7951 recover.cpp:566] Updating replica status to > STARTING > I0124 08:14:04.429719 7953 master.cpp:262] Master > 20150124-081404-16842879-47787-7926 (utopic) started on 127.0.1.1:47787 > I0124 08:14:04.429790 7953 master.cpp:308] Master only allowing > authenticated frameworks to register > I0124 08:14:04.429802 7953 master.cpp:313] Master only allowing > authenticated slaves to register > I0124 08:14:04.429826 7953 credentials.hpp:36] Loading credentials for > authentication from > '/tmp/SlaveTest_MesosExecutorGracefulShutdown_AWdtVJ/credentials' > I0124 08:14:04.430277 7953 master.cpp:357] Authorization enabled > I0124 08:14:04.432682 7953 master.cpp:1219] The newly elected leader is > master@127.0.1.1:47787 with id 20150124-081404-16842879-47787-7926 > I0124 08:14:04.432816 7953 master.cpp:1232] Elected as the leading master! > I0124 08:14:04.432894 7953 master.cpp:1050] Recovering from registrar > I0124 08:14:04.433212 7950 registrar.cpp:313] Recovering registrar > I0124 08:14:04.434226 7951 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 14.323302ms > I0124 08:14:04.434270 7951 replica.cpp:323] Persisted replica status to > STARTING > I0124 08:14:04.434489 7951 recover.cpp:475] Replica is in STARTING status > I0124 08:14:04.436164 7951 replica.cpp:641] Replica in STARTING status > received a broadcasted recover request > I0124 08:14:04.439368 7947 recover.cpp:195] Received a recover response from > a replica in STARTING status > I0124 08:14:04.440626 7947 recover.cpp:566] Updating replica status to VOTING > I0124 08:14:04.443667 7947 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 2.698664ms > I0124 08:14:04.443759 7947 replica.cpp:323] Persisted replica status to > VOTING > I0124 08:14:04.443925 7947 recover.cpp:580] Successfully joined the Paxos > group > I0124 08:14:04.444160 7947 recover.cpp:464] Recover process terminated > I0124 08:14:04.444543 7949 log.cpp:660] Attempting to start the writer > I0124 08:14:04.446331 7949 replica.cpp:477] Replica received implicit > promise request with proposal 1 > I0124 08:14:04.449329 7949 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 2.690453ms > I0124 08:14:04.449388 7949 replica.cpp:345] Persisted promised to 1 > I0124 08:14:04.450637 7947 coordinator.cpp:230] Coordinator attemping to > fill missing position > I0124 08:14:04.452271 7949 replica.cpp:378] Replica received explicit > promise request for position 0 with proposal 2 > I0124 08:14:04.455124 7949 leveldb.cpp:343] Persisting action (8 bytes) to > leveldb took 2.593522ms > I0124 08:14:04.455157 7949 replica.cpp:679] Persisted action at 0 > I0124 08:14:04.456594 7951 replica.cpp:511] Replica received write request > for position 0 > I0124 08:14:04.456657 7951 leveldb.cpp:438] Reading position from leveldb > took 30358ns > I0124 08:14:04.464860 7951 leveldb.cpp:343] Persisting action (14 bytes) to > leveldb took 8.164646ms > I0124 08:14:04.464903 7951 replica.cpp:679] Persisted action at 0 > I0124 08:14:04.465947 7949 replica.cpp:658] Replica received learned notice > for position 0 > I0124 08:14:04.471567 7949 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 5.587838ms > I0124 08:14:04.471601 7949 replica.cpp:679] Persisted action at 0 > I0124 08:14:04.471622 7949 replica.cpp:664] Replica learned NOP action at > position 0 > I0124 08:14:04.472682 7951 log.cpp:676] Writer started with ending position 0 > I0124 08:14:04.473919 7951 leveldb.cpp:438] Reading position from leveldb > took 28676ns > I0124 08:14:04.491591 7951 registrar.cpp:346] Successfully fetched the > registry (0B) in 58.337024ms > I0124 08:14:04.491704 7951 registrar.cpp:445] Applied 1 operations in > 28163ns; attempting to update the 'registry' > I0124 08:14:04.493938 7953 log.cpp:684] Attempting to append 118 bytes to > the log > I0124 08:14:04.494122 7953 coordinator.cpp:340] Coordinator attempting to > write APPEND action at position 1 > I0124 08:14:04.495069 7953 replica.cpp:511] Replica received write request > for position 1 > I0124 08:14:04.500089 7953 leveldb.cpp:343] Persisting action (135 bytes) to > leveldb took 4.989356ms > I0124 08:14:04.500123 7953 replica.cpp:679] Persisted action at 1 > I0124 08:14:04.501271 7950 replica.cpp:658] Replica received learned notice > for position 1 > I0124 08:14:04.505698 7950 leveldb.cpp:343] Persisting action (137 bytes) to > leveldb took 4.396221ms > I0124 08:14:04.505734 7950 replica.cpp:679] Persisted action at 1 > I0124 08:14:04.505755 7950 replica.cpp:664] Replica learned APPEND action at > position 1 > I0124 08:14:04.507313 7950 registrar.cpp:490] Successfully updated the > 'registry' in 15.52896ms > I0124 08:14:04.507478 7953 log.cpp:703] Attempting to truncate the log to 1 > I0124 08:14:04.507848 7953 coordinator.cpp:340] Coordinator attempting to > write TRUNCATE action at position 2 > I0124 08:14:04.508743 7953 replica.cpp:511] Replica received write request > for position 2 > I0124 08:14:04.509214 7950 registrar.cpp:376] Successfully recovered > registrar > I0124 08:14:04.509682 7946 master.cpp:1077] Recovered 0 slaves from the > Registry (82B) ; allowing 10mins for slaves to re-register > I0124 08:14:04.514654 7953 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 5.880031ms > I0124 08:14:04.514689 7953 replica.cpp:679] Persisted action at 2 > I0124 08:14:04.515736 7953 replica.cpp:658] Replica received learned notice > for position 2 > I0124 08:14:04.522014 7953 leveldb.cpp:343] Persisting action (18 bytes) to > leveldb took 6.245138ms > I0124 08:14:04.522086 7953 leveldb.cpp:401] Deleting ~1 keys from leveldb > took 37803ns > I0124 08:14:04.522107 7953 replica.cpp:679] Persisted action at 2 > I0124 08:14:04.522128 7953 replica.cpp:664] Replica learned TRUNCATE action > at position 2 > I0124 08:14:04.531460 7926 containerizer.cpp:103] Using isolation: > posix/cpu,posix/mem > I0124 08:14:04.547194 7951 slave.cpp:173] Slave started on > 208)@127.0.1.1:47787 > I0124 08:14:04.555682 7951 credentials.hpp:84] Loading credential for > authentication from > '/tmp/SlaveTest_MesosExecutorGracefulShutdown_kB74xo/credential' > I0124 08:14:04.556622 7951 slave.cpp:282] Slave using credential for: > test-principal > I0124 08:14:04.557052 7951 slave.cpp:300] Slave resources: cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I0124 08:14:04.557842 7951 slave.cpp:329] Slave hostname: utopic > I0124 08:14:04.558091 7951 slave.cpp:330] Slave checkpoint: false > W0124 08:14:04.558352 7951 slave.cpp:332] Disabling checkpointing is > deprecated and the --checkpoint flag will be removed in a future release. > Please avoid using this flag > I0124 08:14:04.566864 7948 state.cpp:33] Recovering state from > '/tmp/SlaveTest_MesosExecutorGracefulShutdown_kB74xo/meta' > I0124 08:14:04.575711 7951 status_update_manager.cpp:197] Recovering status > update manager > I0124 08:14:04.575904 7951 containerizer.cpp:300] Recovering containerizer > I0124 08:14:04.577112 7951 slave.cpp:3519] Finished recovery > I0124 08:14:04.577374 7926 sched.cpp:151] Version: 0.22.0 > I0124 08:14:04.578663 7950 sched.cpp:248] New master detected at > master@127.0.1.1:47787 > I0124 08:14:04.578759 7950 sched.cpp:304] Authenticating with master > master@127.0.1.1:47787 > I0124 08:14:04.578781 7950 sched.cpp:311] Using default CRAM-MD5 > authenticatee > I0124 08:14:04.579071 7950 authenticatee.hpp:138] Creating new client SASL > connection > I0124 08:14:04.579550 7947 master.cpp:4129] Authenticating > scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787 > I0124 08:14:04.579582 7947 master.cpp:4140] Using default CRAM-MD5 > authenticator > I0124 08:14:04.580031 7947 authenticator.hpp:170] Creating new server SASL > connection > I0124 08:14:04.580402 7947 authenticatee.hpp:229] Received SASL > authentication mechanisms: CRAM-MD5 > I0124 08:14:04.580430 7947 authenticatee.hpp:255] Attempting to authenticate > with mechanism 'CRAM-MD5' > I0124 08:14:04.580538 7947 authenticator.hpp:276] Received SASL > authentication start > I0124 08:14:04.580581 7947 authenticator.hpp:398] Authentication requires > more steps > I0124 08:14:04.580651 7947 authenticatee.hpp:275] Received SASL > authentication step > I0124 08:14:04.580746 7947 authenticator.hpp:304] Received SASL > authentication step > I0124 08:14:04.580837 7947 authenticator.hpp:390] Authentication success > I0124 08:14:04.580940 7947 authenticatee.hpp:315] Authentication success > I0124 08:14:04.581009 7947 master.cpp:4187] Successfully authenticated > principal 'test-principal' at > scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787 > I0124 08:14:04.581328 7947 sched.cpp:392] Successfully authenticated with > master master@127.0.1.1:47787 > I0124 08:14:04.581509 7947 master.cpp:1420] Received registration request > for framework 'default' at > scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787 > I0124 08:14:04.581585 7947 master.cpp:1298] Authorizing framework principal > 'test-principal' to receive offers for role '*' > I0124 08:14:04.582033 7947 master.cpp:1484] Registering framework > 20150124-081404-16842879-47787-7926-0000 (default) at > scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787 > I0124 08:14:04.582595 7947 hierarchical_allocator_process.hpp:319] Added > framework 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:04.583051 7947 sched.cpp:442] Framework registered with > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:04.584087 7951 slave.cpp:613] New master detected at > master@127.0.1.1:47787 > I0124 08:14:04.584388 7951 slave.cpp:676] Authenticating with master > master@127.0.1.1:47787 > I0124 08:14:04.584564 7951 slave.cpp:681] Using default CRAM-MD5 > authenticatee > I0124 08:14:04.584951 7951 slave.cpp:649] Detecting new master > I0124 08:14:04.585219 7951 status_update_manager.cpp:171] Pausing sending > status updates > I0124 08:14:04.585604 7951 authenticatee.hpp:138] Creating new client SASL > connection > I0124 08:14:04.587666 7953 master.cpp:4129] Authenticating > slave(208)@127.0.1.1:47787 > I0124 08:14:04.587702 7953 master.cpp:4140] Using default CRAM-MD5 > authenticator > I0124 08:14:04.588434 7953 authenticator.hpp:170] Creating new server SASL > connection > I0124 08:14:04.588764 7953 authenticatee.hpp:229] Received SASL > authentication mechanisms: CRAM-MD5 > I0124 08:14:04.588790 7953 authenticatee.hpp:255] Attempting to authenticate > with mechanism 'CRAM-MD5' > I0124 08:14:04.588896 7953 authenticator.hpp:276] Received SASL > authentication start > I0124 08:14:04.588935 7953 authenticator.hpp:398] Authentication requires > more steps > I0124 08:14:04.589005 7953 authenticatee.hpp:275] Received SASL > authentication step > I0124 08:14:04.589082 7953 authenticator.hpp:304] Received SASL > authentication step > I0124 08:14:04.589140 7953 authenticator.hpp:390] Authentication success > I0124 08:14:04.589232 7953 authenticatee.hpp:315] Authentication success > I0124 08:14:04.589300 7953 master.cpp:4187] Successfully authenticated > principal 'test-principal' at slave(208)@127.0.1.1:47787 > I0124 08:14:04.589587 7953 slave.cpp:747] Successfully authenticated with > master master@127.0.1.1:47787 > I0124 08:14:04.589913 7953 master.cpp:3275] Registering slave at > slave(208)@127.0.1.1:47787 (utopic) with id > 20150124-081404-16842879-47787-7926-S0 > I0124 08:14:04.590322 7953 registrar.cpp:445] Applied 1 operations in > 60404ns; attempting to update the 'registry' > I0124 08:14:04.595336 7948 log.cpp:684] Attempting to append 283 bytes to > the log > I0124 08:14:04.595552 7948 coordinator.cpp:340] Coordinator attempting to > write APPEND action at position 3 > I0124 08:14:04.596535 7948 replica.cpp:511] Replica received write request > for position 3 > I0124 08:14:04.597846 7951 master.cpp:3263] Ignoring register slave message > from slave(208)@127.0.1.1:47787 (utopic) as admission is already in progress > I0124 08:14:04.602326 7948 leveldb.cpp:343] Persisting action (302 bytes) to > leveldb took 5.758211ms > I0124 08:14:04.602363 7948 replica.cpp:679] Persisted action at 3 > I0124 08:14:04.603492 7951 replica.cpp:658] Replica received learned notice > for position 3 > I0124 08:14:04.608952 7951 leveldb.cpp:343] Persisting action (304 bytes) to > leveldb took 5.427195ms > I0124 08:14:04.608985 7951 replica.cpp:679] Persisted action at 3 > I0124 08:14:04.609007 7951 replica.cpp:664] Replica learned APPEND action at > position 3 > I0124 08:14:04.610643 7951 registrar.cpp:490] Successfully updated the > 'registry' in 20.258048ms > I0124 08:14:04.610800 7948 log.cpp:703] Attempting to truncate the log to 3 > I0124 08:14:04.611184 7948 coordinator.cpp:340] Coordinator attempting to > write TRUNCATE action at position 4 > I0124 08:14:04.612076 7948 replica.cpp:511] Replica received write request > for position 4 > I0124 08:14:04.613061 7946 master.cpp:3329] Registered slave > 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic) > with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I0124 08:14:04.613299 7946 hierarchical_allocator_process.hpp:453] Added > slave 20150124-081404-16842879-47787-7926-S0 (utopic) with cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] available) > I0124 08:14:04.613688 7946 slave.cpp:781] Registered with master > master@127.0.1.1:47787; given slave ID 20150124-081404-16842879-47787-7926-S0 > I0124 08:14:04.614112 7946 master.cpp:4071] Sending 1 offers to framework > 20150124-081404-16842879-47787-7926-0000 (default) at > scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787 > I0124 08:14:04.614228 7946 status_update_manager.cpp:178] Resuming sending > status updates > I0124 08:14:04.617481 7947 master.cpp:2677] Processing ACCEPT call for > offers: [ 20150124-081404-16842879-47787-7926-O0 ] on slave > 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic) > for framework 20150124-081404-16842879-47787-7926-0000 (default) at > scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787 > I0124 08:14:04.617535 7947 master.cpp:2513] Authorizing framework principal > 'test-principal' to launch task 7c16772d-4aed-4719-81c4-658a2cc22543 as user > 'jenkins' > I0124 08:14:04.618736 7947 master.hpp:782] Adding task > 7c16772d-4aed-4719-81c4-658a2cc22543 with resources cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] on slave > 20150124-081404-16842879-47787-7926-S0 (utopic) > I0124 08:14:04.618854 7947 master.cpp:2885] Launching task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 (default) at > scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787 with resources > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave > 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic) > I0124 08:14:04.619209 7947 slave.cpp:1130] Got assigned task > 7c16772d-4aed-4719-81c4-658a2cc22543 for framework > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:04.619472 7948 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 7.364828ms > I0124 08:14:04.619941 7948 replica.cpp:679] Persisted action at 4 > I0124 08:14:04.624851 7953 replica.cpp:658] Replica received learned notice > for position 4 > I0124 08:14:04.625757 7947 slave.cpp:1245] Launching task > 7c16772d-4aed-4719-81c4-658a2cc22543 for framework > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:04.630590 7953 leveldb.cpp:343] Persisting action (18 bytes) to > leveldb took 5.705336ms > I0124 08:14:04.630805 7953 leveldb.cpp:401] Deleting ~2 keys from leveldb > took 51263ns > I0124 08:14:04.630828 7953 replica.cpp:679] Persisted action at 4 > I0124 08:14:04.630851 7953 replica.cpp:664] Replica learned TRUNCATE action > at position 4 > I0124 08:14:04.633968 7947 slave.cpp:3921] Launching executor > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 in work directory > '/tmp/SlaveTest_MesosExecutorGracefulShutdown_kB74xo/slaves/20150124-081404-16842879-47787-7926-S0/frameworks/20150124-081404-16842879-47787-7926-0000/executors/7c16772d-4aed-4719-81c4-658a2cc22543/runs/53887a08-f11d-4a2f-a659-a715d9fcf3d2' > I0124 08:14:04.634963 7951 containerizer.cpp:445] Starting container > '53887a08-f11d-4a2f-a659-a715d9fcf3d2' for executor > '7c16772d-4aed-4719-81c4-658a2cc22543' of framework > '20150124-081404-16842879-47787-7926-0000' > W0124 08:14:04.636931 7951 containerizer.cpp:296] CommandInfo.grace_period > flag is not set, using default value: 3secs > I0124 08:14:04.655591 7947 slave.cpp:1368] Queuing task > '7c16772d-4aed-4719-81c4-658a2cc22543' for executor > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > '20150124-081404-16842879-47787-7926-0000 > I0124 08:14:04.656992 7951 launcher.cpp:137] Forked child with pid '11030' > for container '53887a08-f11d-4a2f-a659-a715d9fcf3d2' > I0124 08:14:04.673646 7951 slave.cpp:2890] Monitoring executor > '7c16772d-4aed-4719-81c4-658a2cc22543' of framework > '20150124-081404-16842879-47787-7926-0000' in container > '53887a08-f11d-4a2f-a659-a715d9fcf3d2' > I0124 08:14:04.964946 11044 exec.cpp:147] Version: 0.22.0 > I0124 08:14:05.113059 7948 slave.cpp:1912] Got registration for executor > '7c16772d-4aed-4719-81c4-658a2cc22543' of framework > 20150124-081404-16842879-47787-7926-0000 from executor(1)@127.0.1.1:49174 > I0124 08:14:05.121086 7948 slave.cpp:2031] Flushing queued task > 7c16772d-4aed-4719-81c4-658a2cc22543 for executor > '7c16772d-4aed-4719-81c4-658a2cc22543' of framework > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:05.266849 11062 exec.cpp:221] Executor registered on slave > 20150124-081404-16842879-47787-7926-S0 > Shutdown timeout is set to 3secsRegistered executor on utopic > Starting task 7c16772d-4aed-4719-81c4-658a2cc22543 > Forked command at 11067 > sh -c 'sleep 1000' > I0124 08:14:05.492084 7953 slave.cpp:2265] Handling status update > TASK_RUNNING (UUID: 54742a87-ef02-4e72-a19b-83b0eeb62568) for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 from executor(1)@127.0.1.1:49174 > I0124 08:14:05.492805 7953 status_update_manager.cpp:317] Received status > update TASK_RUNNING (UUID: 54742a87-ef02-4e72-a19b-83b0eeb62568) for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:05.493762 7953 slave.cpp:2508] Forwarding the update > TASK_RUNNING (UUID: 54742a87-ef02-4e72-a19b-83b0eeb62568) for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 to master@127.0.1.1:47787 > I0124 08:14:05.493948 7953 slave.cpp:2441] Sending acknowledgement for > status update TASK_RUNNING (UUID: 54742a87-ef02-4e72-a19b-83b0eeb62568) for > task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 to executor(1)@127.0.1.1:49174 > I0124 08:14:05.495378 7949 master.cpp:3652] Forwarding status update > TASK_RUNNING (UUID: 54742a87-ef02-4e72-a19b-83b0eeb62568) for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:05.495584 7949 master.cpp:3624] Status update TASK_RUNNING > (UUID: 54742a87-ef02-4e72-a19b-83b0eeb62568) for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 from slave > 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic) > I0124 08:14:05.495678 7949 master.cpp:4934] Updating the latest state of > task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 to TASK_RUNNING > I0124 08:14:05.496422 7949 master.cpp:3125] Forwarding status update > acknowledgement 54742a87-ef02-4e72-a19b-83b0eeb62568 for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 (default) at > scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787 to slave > 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic) > I0124 08:14:05.497735 7946 master.cpp:2961] Asked to kill task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:05.497859 7946 master.cpp:3021] Telling slave > 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic) > to kill task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 (default) at > scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787 > I0124 08:14:05.498589 7947 status_update_manager.cpp:389] Received status > update acknowledgement (UUID: 54742a87-ef02-4e72-a19b-83b0eeb62568) for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:05.499006 7953 slave.cpp:1424] Asked to kill task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 > Shutting down > Sending SIGTERM to process tree at pid 11067 > Killing the following process trees: > [ > -+- 11067 sh -c sleep 1000 > \--- 11068 sleep 1000 > ] > 2015-01-24 > 08:14:07,295:7926(0x7f30b1b34700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:57753] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > Process 11067 did not terminate after 3secs, sending SIGKILL to process tree > at 11067 > Killed the following process trees: > [ > -+- 11067 sh -c sleep 1000 > \--- 11068 sleep 1000 > ] > Command terminated with signal Killed (pid: 11067) > I0124 08:14:09.063453 7953 slave.cpp:2265] Handling status update > TASK_KILLED (UUID: 4bd05372-2705-46e5-8182-5cb6907fbab3) for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 from executor(1)@127.0.1.1:49174 > I0124 08:14:09.069545 7953 status_update_manager.cpp:317] Received status > update TASK_KILLED (UUID: 4bd05372-2705-46e5-8182-5cb6907fbab3) for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:09.070265 7953 slave.cpp:2508] Forwarding the update TASK_KILLED > (UUID: 4bd05372-2705-46e5-8182-5cb6907fbab3) for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 to master@127.0.1.1:47787 > I0124 08:14:09.070996 7947 master.cpp:3652] Forwarding status update > TASK_KILLED (UUID: 4bd05372-2705-46e5-8182-5cb6907fbab3) for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:09.071182 7947 master.cpp:3624] Status update TASK_KILLED (UUID: > 4bd05372-2705-46e5-8182-5cb6907fbab3) for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 from slave > 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic) > I0124 08:14:09.071260 7947 master.cpp:4934] Updating the latest state of > task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 to TASK_KILLED > I0124 08:14:09.072052 7947 hierarchical_allocator_process.hpp:653] Recovered > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total > allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on > slave 20150124-081404-16842879-47787-7926-S0 from framework > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:09.072449 7947 master.cpp:4993] Removing task > 7c16772d-4aed-4719-81c4-658a2cc22543 with resources cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] of framework > 20150124-081404-16842879-47787-7926-0000 on slave > 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic) > I0124 08:14:09.072700 7947 master.cpp:3125] Forwarding status update > acknowledgement 4bd05372-2705-46e5-8182-5cb6907fbab3 for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 (default) at > scheduler-4a6c5cde-c54a-455a-aaad-6fc4e8ee99ef@127.0.1.1:47787 to slave > 20150124-081404-16842879-47787-7926-S0 at slave(208)@127.0.1.1:47787 (utopic) > ../../src/tests/slave_tests.cpp:1736: Failure > Expected: (std::string::npos) != > (statusKilled.get().message().find("Terminated")), actual: > 18446744073709551615 vs 18446744073709551615 > I0124 08:14:09.073422 7926 sched.cpp:1471] Asked to stop the driver > I0124 08:14:09.073629 7926 master.cpp:654] Master terminating > I0124 08:14:09.075768 7950 sched.cpp:808] Stopping framework > '20150124-081404-16842879-47787-7926-0000' > I0124 08:14:09.079352 7953 slave.cpp:2441] Sending acknowledgement for > status update TASK_KILLED (UUID: 4bd05372-2705-46e5-8182-5cb6907fbab3) for > task 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 to executor(1)@127.0.1.1:49174 > I0124 08:14:09.085199 7953 slave.cpp:2673] master@127.0.1.1:47787 exited > W0124 08:14:09.085232 7953 slave.cpp:2676] Master disconnected! Waiting for > a new master to be elected > I0124 08:14:09.085263 7953 status_update_manager.cpp:389] Received status > update acknowledgement (UUID: 4bd05372-2705-46e5-8182-5cb6907fbab3) for task > 7c16772d-4aed-4719-81c4-658a2cc22543 of framework > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:09.120879 7946 containerizer.cpp:879] Destroying container > '53887a08-f11d-4a2f-a659-a715d9fcf3d2' > I0124 08:14:09.216553 7952 containerizer.cpp:1084] Executor for container > '53887a08-f11d-4a2f-a659-a715d9fcf3d2' has exited > I0124 08:14:09.218641 7952 slave.cpp:2948] Executor > '7c16772d-4aed-4719-81c4-658a2cc22543' of framework > 20150124-081404-16842879-47787-7926-0000 terminated with signal Killed > I0124 08:14:09.218855 7952 slave.cpp:3057] Cleaning up executor > '7c16772d-4aed-4719-81c4-658a2cc22543' of framework > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:09.223268 7947 gc.cpp:56] Scheduling > '/tmp/SlaveTest_MesosExecutorGracefulShutdown_kB74xo/slaves/20150124-081404-16842879-47787-7926-S0/frameworks/20150124-081404-16842879-47787-7926-0000/executors/7c16772d-4aed-4719-81c4-658a2cc22543/runs/53887a08-f11d-4a2f-a659-a715d9fcf3d2' > for gc 6.99999746482667days in the future > I0124 08:14:09.224205 7947 gc.cpp:56] Scheduling > '/tmp/SlaveTest_MesosExecutorGracefulShutdown_kB74xo/slaves/20150124-081404-16842879-47787-7926-S0/frameworks/20150124-081404-16842879-47787-7926-0000/executors/7c16772d-4aed-4719-81c4-658a2cc22543' > for gc 6.99999746293926days in the future > I0124 08:14:09.227552 7952 slave.cpp:3136] Cleaning up framework > 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:09.229786 7949 status_update_manager.cpp:279] Closing status > update streams for framework 20150124-081404-16842879-47787-7926-0000 > I0124 08:14:09.230849 7952 slave.cpp:495] Slave terminating > I0124 08:14:09.230989 7952 gc.cpp:56] Scheduling > '/tmp/SlaveTest_MesosExecutorGracefulShutdown_kB74xo/slaves/20150124-081404-16842879-47787-7926-S0/frameworks/20150124-081404-16842879-47787-7926-0000' > for gc 6.99999732935407days in the future > [ FAILED ] SlaveTest.MesosExecutorGracefulShutdown (4881 ms) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)