[ https://issues.apache.org/jira/browse/MESOS-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinod Kone resolved MESOS-1945. ------------------------------- Resolution: Fixed Fix Version/s: 0.21.0 commit 5ccf43e5ad037e42b7488da80e61504cc721ae37 Author: Bernd Mathiske <be...@mesosphere.io> Date: Tue Oct 21 10:04:54 2014 -0700 Fixed flaky SlaveTest.KillTaskBetweenRunTaskParts test. Moving EXPECT_CALL(removeFramework) before killTask() removes a race, which made the expected call occur before it is expected. Review: https://reviews.apache.org/r/26929 > SlaveTest.KillTaskBetweenRunTaskParts is flaky > ---------------------------------------------- > > Key: MESOS-1945 > URL: https://issues.apache.org/jira/browse/MESOS-1945 > Project: Mesos > Issue Type: Bug > Components: test > Affects Versions: 0.21.0 > Reporter: Vinod Kone > Assignee: Bernd Mathiske > Fix For: 0.21.0 > > > Observed this on the internal CI. > [~bernd-mesos] Can you take a look? > {code} > [ RUN ] SlaveTest.KillTaskBetweenRunTaskParts > Using temporary directory '/tmp/SlaveTest_KillTaskBetweenRunTaskParts_RmlPwG' > I1017 13:42:13.066948 13328 leveldb.cpp:176] Opened db in 102.342262ms > I1017 13:42:13.096580 13328 leveldb.cpp:183] Compacted db in 29.603997ms > I1017 13:42:13.096628 13328 leveldb.cpp:198] Created db iterator in 10276ns > I1017 13:42:13.096638 13328 leveldb.cpp:204] Seeked to beginning of db in > 4732ns > I1017 13:42:13.096644 13328 leveldb.cpp:273] Iterated through 0 keys in the > db in 3353ns > I1017 13:42:13.096659 13328 replica.cpp:741] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1017 13:42:13.096951 13349 recover.cpp:437] Starting replica recovery > I1017 13:42:13.097007 13349 recover.cpp:463] Replica is in EMPTY status > I1017 13:42:13.097256 13349 replica.cpp:638] Replica in EMPTY status received > a broadcasted recover request > I1017 13:42:13.097306 13349 recover.cpp:188] Received a recover response from > a replica in EMPTY status > I1017 13:42:13.097378 13349 recover.cpp:554] Updating replica status to > STARTING > I1017 13:42:13.101631 13345 master.cpp:312] Master > 20141017-134213-16842879-48221-13328 (trusty) started on 127.0.1.1:48221 > I1017 13:42:13.102226 13345 master.cpp:358] Master only allowing > authenticated frameworks to register > I1017 13:42:13.102473 13345 master.cpp:363] Master only allowing > authenticated slaves to register > I1017 13:42:13.102738 13345 credentials.hpp:36] Loading credentials for > authentication from > '/tmp/SlaveTest_KillTaskBetweenRunTaskParts_RmlPwG/credentials' > I1017 13:42:13.103142 13345 master.cpp:392] Authorization enabled > I1017 13:42:13.103667 13346 hierarchical_allocator_process.hpp:299] > Initializing hierarchical allocator process with master : > master@127.0.1.1:48221 > I1017 13:42:13.103966 13342 master.cpp:120] No whitelist given. Advertising > offers for all slaves > I1017 13:42:13.104833 13345 master.cpp:1242] The newly elected leader is > master@127.0.1.1:48221 with id 20141017-134213-16842879-48221-13328 > I1017 13:42:13.105020 13345 master.cpp:1255] Elected as the leading master! > I1017 13:42:13.105200 13345 master.cpp:1073] Recovering from registrar > I1017 13:42:13.105465 13347 registrar.cpp:313] Recovering registrar > I1017 13:42:13.112493 13349 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 15.090983ms > I1017 13:42:13.112735 13349 replica.cpp:320] Persisted replica status to > STARTING > I1017 13:42:13.113172 13349 recover.cpp:463] Replica is in STARTING status > I1017 13:42:13.113713 13349 replica.cpp:638] Replica in STARTING status > received a broadcasted recover request > I1017 13:42:13.113998 13349 recover.cpp:188] Received a recover response from > a replica in STARTING status > I1017 13:42:13.114323 13349 recover.cpp:554] Updating replica status to VOTING > I1017 13:42:13.131239 13349 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 16.594369ms > I1017 13:42:13.131573 13349 replica.cpp:320] Persisted replica status to > VOTING > I1017 13:42:13.131916 13344 recover.cpp:568] Successfully joined the Paxos > group > I1017 13:42:13.132225 13342 recover.cpp:452] Recover process terminated > I1017 13:42:13.132542 13343 log.cpp:656] Attempting to start the writer > I1017 13:42:13.134614 13343 replica.cpp:474] Replica received implicit > promise request with proposal 1 > I1017 13:42:13.155139 13343 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 20.162122ms > I1017 13:42:13.155519 13343 replica.cpp:342] Persisted promised to 1 > I1017 13:42:13.155941 13343 coordinator.cpp:230] Coordinator attemping to > fill missing position > I1017 13:42:13.156524 13343 replica.cpp:375] Replica received explicit > promise request for position 0 with proposal 2 > I1017 13:42:13.170680 13343 leveldb.cpp:343] Persisting action (8 bytes) to > leveldb took 13.967251ms > I1017 13:42:13.171041 13343 replica.cpp:676] Persisted action at 0 > I1017 13:42:13.171551 13343 replica.cpp:508] Replica received write request > for position 0 > I1017 13:42:13.171787 13343 leveldb.cpp:438] Reading position from leveldb > took 30854ns > I1017 13:42:13.182826 13343 leveldb.cpp:343] Persisting action (14 bytes) to > leveldb took 10.867833ms > I1017 13:42:13.183177 13343 replica.cpp:676] Persisted action at 0 > I1017 13:42:13.186982 13348 replica.cpp:655] Replica received learned notice > for position 0 > I1017 13:42:13.201468 13348 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 14.332407ms > I1017 13:42:13.201720 13348 replica.cpp:676] Persisted action at 0 > I1017 13:42:13.202015 13348 replica.cpp:661] Replica learned NOP action at > position 0 > I1017 13:42:13.202435 13348 log.cpp:672] Writer started with ending position 0 > I1017 13:42:13.202925 13348 leveldb.cpp:438] Reading position from leveldb > took 27747ns > I1017 13:42:13.204242 13348 registrar.cpp:346] Successfully fetched the > registry (0B) in 98.612992ms > I1017 13:42:13.204498 13348 registrar.cpp:445] Applied 1 operations in > 14313ns; attempting to update the 'registry' > I1017 13:42:13.205639 13349 log.cpp:680] Attempting to append 119 bytes to > the log > I1017 13:42:13.205909 13349 coordinator.cpp:340] Coordinator attempting to > write APPEND action at position 1 > I1017 13:42:13.206362 13349 replica.cpp:508] Replica received write request > for position 1 > I1017 13:42:13.221566 13349 leveldb.cpp:343] Persisting action (136 bytes) to > leveldb took 15.040219ms > I1017 13:42:13.221835 13349 replica.cpp:676] Persisted action at 1 > I1017 13:42:13.222483 13349 replica.cpp:655] Replica received learned notice > for position 1 > I1017 13:42:13.241530 13349 leveldb.cpp:343] Persisting action (138 bytes) to > leveldb took 18.838403ms > I1017 13:42:13.241823 13349 replica.cpp:676] Persisted action at 1 > I1017 13:42:13.242197 13349 replica.cpp:661] Replica learned APPEND action at > position 1 > I1017 13:42:13.242813 13349 registrar.cpp:490] Successfully updated the > 'registry' in 38.1248ms > I1017 13:42:13.243016 13349 registrar.cpp:376] Successfully recovered > registrar > I1017 13:42:13.243237 13349 log.cpp:699] Attempting to truncate the log to 1 > I1017 13:42:13.243494 13349 master.cpp:1100] Recovered 0 slaves from the > Registry (83B) ; allowing 10mins for slaves to re-register > I1017 13:42:13.243643 13342 coordinator.cpp:340] Coordinator attempting to > write TRUNCATE action at position 2 > I1017 13:42:13.244231 13342 replica.cpp:508] Replica received write request > for position 2 > I1017 13:42:13.261124 13342 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 16.733315ms > I1017 13:42:13.261438 13342 replica.cpp:676] Persisted action at 2 > I1017 13:42:13.264452 13345 replica.cpp:655] Replica received learned notice > for position 2 > I1017 13:42:13.281764 13345 leveldb.cpp:343] Persisting action (18 bytes) to > leveldb took 17.088595ms > I1017 13:42:13.282115 13345 leveldb.cpp:401] Deleting ~1 keys from leveldb > took 63443ns > I1017 13:42:13.282454 13345 replica.cpp:676] Persisted action at 2 > I1017 13:42:13.282613 13345 replica.cpp:661] Replica learned TRUNCATE action > at position 2 > I1017 13:42:13.288882 13348 slave.cpp:169] Slave started on 127.0.1.1:48221 > I1017 13:42:13.289211 13348 credentials.hpp:84] Loading credential for > authentication from > '/tmp/SlaveTest_KillTaskBetweenRunTaskParts_Z7tpIo/credential' > I1017 13:42:13.289453 13348 slave.cpp:276] Slave using credential for: > test-principal > I1017 13:42:13.289701 13348 slave.cpp:289] Slave resources: cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I1017 13:42:13.289937 13348 slave.cpp:318] Slave hostname: trusty > I1017 13:42:13.290114 13348 slave.cpp:319] Slave checkpoint: false > I1017 13:42:13.290704 13348 state.cpp:33] Recovering state from > '/tmp/SlaveTest_KillTaskBetweenRunTaskParts_Z7tpIo/meta' > I1017 13:42:13.291016 13348 status_update_manager.cpp:193] Recovering status > update manager > I1017 13:42:13.291292 13348 slave.cpp:3362] Finished recovery > I1017 13:42:13.291786 13348 slave.cpp:600] New master detected at > master@127.0.1.1:48221 > I1017 13:42:13.292021 13344 status_update_manager.cpp:167] New master > detected at master@127.0.1.1:48221 > I1017 13:42:13.292275 13348 slave.cpp:674] Authenticating with master > master@127.0.1.1:48221 > I1017 13:42:13.292551 13342 authenticatee.hpp:133] Creating new client SASL > connection > I1017 13:42:13.292855 13342 master.cpp:3787] Authenticating > (623)@127.0.1.1:48221 > I1017 13:42:13.293114 13342 authenticator.hpp:161] Creating new server SASL > connection > I1017 13:42:13.293398 13342 authenticatee.hpp:224] Received SASL > authentication mechanisms: CRAM-MD5 > I1017 13:42:13.293592 13342 authenticatee.hpp:250] Attempting to authenticate > with mechanism 'CRAM-MD5' > I1017 13:42:13.293789 13342 authenticator.hpp:267] Received SASL > authentication start > I1017 13:42:13.293995 13342 authenticator.hpp:389] Authentication requires > more steps > I1017 13:42:13.294193 13342 authenticatee.hpp:270] Received SASL > authentication step > I1017 13:42:13.294400 13342 authenticator.hpp:295] Received SASL > authentication step > I1017 13:42:13.294586 13342 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' > SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false > SASL_AUXPROP_AUTHZID: false > I1017 13:42:13.294780 13342 auxprop.cpp:153] Looking up auxiliary property > '*userPassword' > I1017 13:42:13.294963 13342 auxprop.cpp:153] Looking up auxiliary property > '*cmusaslsecretCRAM-MD5' > I1017 13:42:13.295145 13342 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' > SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false > SASL_AUXPROP_AUTHZID: true > I1017 13:42:13.295318 13342 auxprop.cpp:103] Skipping auxiliary property > '*userPassword' since SASL_AUXPROP_AUTHZID == true > I1017 13:42:13.295490 13342 auxprop.cpp:103] Skipping auxiliary property > '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true > I1017 13:42:13.295668 13342 authenticator.hpp:381] Authentication success > I1017 13:42:13.295869 13342 authenticatee.hpp:310] Authentication success > I1017 13:42:13.296062 13342 master.cpp:3827] Successfully authenticated > principal 'test-principal' at (623)@127.0.1.1:48221 > I1017 13:42:13.296386 13348 slave.cpp:647] Detecting new master > I1017 13:42:13.301076 13348 slave.cpp:731] Successfully authenticated with > master master@127.0.1.1:48221 > I1017 13:42:13.301614 13346 master.cpp:2968] Registering slave at > (623)@127.0.1.1:48221 (trusty) with id 20141017-134213-16842879-48221-13328-0 > I1017 13:42:13.301964 13346 registrar.cpp:445] Applied 1 operations in > 21609ns; attempting to update the 'registry' > I1017 13:42:13.303146 13344 log.cpp:680] Attempting to append 284 bytes to > the log > I1017 13:42:13.303452 13344 coordinator.cpp:340] Coordinator attempting to > write APPEND action at position 3 > I1017 13:42:13.303969 13344 replica.cpp:508] Replica received write request > for position 3 > I1017 13:42:13.304587 13348 slave.cpp:1048] Will retry registration in > 3.95735ms if necessary > I1017 13:42:13.306241 13328 sched.cpp:137] Version: 0.21.0 > I1017 13:42:13.306624 13346 sched.cpp:233] New master detected at > master@127.0.1.1:48221 > I1017 13:42:13.306804 13346 sched.cpp:283] Authenticating with master > master@127.0.1.1:48221 > I1017 13:42:13.307168 13346 authenticatee.hpp:133] Creating new client SASL > connection > I1017 13:42:13.307487 13346 master.cpp:3787] Authenticating > scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 > I1017 13:42:13.307837 13346 authenticator.hpp:161] Creating new server SASL > connection > I1017 13:42:13.308207 13346 authenticatee.hpp:224] Received SASL > authentication mechanisms: CRAM-MD5 > I1017 13:42:13.308398 13346 authenticatee.hpp:250] Attempting to authenticate > with mechanism 'CRAM-MD5' > I1017 13:42:13.308619 13346 authenticator.hpp:267] Received SASL > authentication start > I1017 13:42:13.308823 13346 authenticator.hpp:389] Authentication requires > more steps > I1017 13:42:13.309046 13346 authenticatee.hpp:270] Received SASL > authentication step > I1017 13:42:13.309221 13348 slave.cpp:1048] Will retry registration in > 18.553497ms if necessary > I1017 13:42:13.309278 13345 master.cpp:2956] Ignoring register slave message > from (623)@127.0.1.1:48221 (trusty) as admission is already in progress > I1017 13:42:13.309834 13348 authenticator.hpp:295] Received SASL > authentication step > I1017 13:42:13.310006 13348 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' > SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false > SASL_AUXPROP_AUTHZID: false > I1017 13:42:13.310238 13348 auxprop.cpp:153] Looking up auxiliary property > '*userPassword' > I1017 13:42:13.310441 13348 auxprop.cpp:153] Looking up auxiliary property > '*cmusaslsecretCRAM-MD5' > I1017 13:42:13.310624 13348 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' > SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false > SASL_AUXPROP_AUTHZID: true > I1017 13:42:13.310813 13348 auxprop.cpp:103] Skipping auxiliary property > '*userPassword' since SASL_AUXPROP_AUTHZID == true > I1017 13:42:13.310987 13348 auxprop.cpp:103] Skipping auxiliary property > '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true > I1017 13:42:13.311210 13348 authenticator.hpp:381] Authentication success > I1017 13:42:13.311472 13343 authenticatee.hpp:310] Authentication success > I1017 13:42:13.315773 13344 leveldb.cpp:343] Persisting action (303 bytes) to > leveldb took 11.584908ms > I1017 13:42:13.316170 13344 replica.cpp:676] Persisted action at 3 > I1017 13:42:13.316678 13344 replica.cpp:655] Replica received learned notice > for position 3 > I1017 13:42:13.317164 13348 master.cpp:3827] Successfully authenticated > principal 'test-principal' at > scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 > I1017 13:42:13.317832 13343 sched.cpp:357] Successfully authenticated with > master master@127.0.1.1:48221 > I1017 13:42:13.318039 13343 sched.cpp:476] Sending registration request to > master@127.0.1.1:48221 > I1017 13:42:13.318295 13347 master.cpp:1362] Received registration request > for framework 'default' at > scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 > I1017 13:42:13.318481 13347 master.cpp:1321] Authorizing framework principal > 'test-principal' to receive offers for role '*' > I1017 13:42:13.318783 13347 master.cpp:1426] Registering framework > 20141017-134213-16842879-48221-13328-0000 (default) at > scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 > I1017 13:42:13.319156 13347 hierarchical_allocator_process.hpp:329] Added > framework 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.319458 13347 hierarchical_allocator_process.hpp:697] No > resources available to allocate! > I1017 13:42:13.319684 13347 hierarchical_allocator_process.hpp:659] Performed > allocation for 0 slaves in 241676ns > I1017 13:42:13.320008 13343 sched.cpp:407] Framework registered with > 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.320247 13343 sched.cpp:421] Scheduler::registered took 22306ns > I1017 13:42:13.326701 13344 leveldb.cpp:343] Persisting action (305 bytes) to > leveldb took 9.696717ms > I1017 13:42:13.326954 13344 replica.cpp:676] Persisted action at 3 > I1017 13:42:13.327303 13344 replica.cpp:661] Replica learned APPEND action at > position 3 > I1017 13:42:13.328037 13342 registrar.cpp:490] Successfully updated the > 'registry' in 25.80992ms > I1017 13:42:13.328490 13342 master.cpp:3022] Registered slave > 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty) with > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I1017 13:42:13.328840 13342 slave.cpp:765] Registered with master > master@127.0.1.1:48221; given slave ID 20141017-134213-16842879-48221-13328-0 > I1017 13:42:13.329071 13342 slave.cpp:2436] Received ping from > slave-observer(35)@127.0.1.1:48221 > I1017 13:42:13.328671 13346 hierarchical_allocator_process.hpp:442] Added > slave 20141017-134213-16842879-48221-13328-0 (trusty) with cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] available) > I1017 13:42:13.329674 13346 hierarchical_allocator_process.hpp:734] Offering > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave > 20141017-134213-16842879-48221-13328-0 to framework > 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.329936 13346 hierarchical_allocator_process.hpp:679] Performed > allocation for slave 20141017-134213-16842879-48221-13328-0 in 305301ns > I1017 13:42:13.329319 13349 log.cpp:699] Attempting to truncate the log to 3 > I1017 13:42:13.330471 13349 coordinator.cpp:340] Coordinator attempting to > write TRUNCATE action at position 4 > I1017 13:42:13.330883 13349 replica.cpp:508] Replica received write request > for position 4 > I1017 13:42:13.330140 13348 master.cpp:3729] Sending 1 offers to framework > 20141017-134213-16842879-48221-13328-0000 (default) at > scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 > I1017 13:42:13.331497 13348 sched.cpp:544] Scheduler::resourceOffers took > 29643ns > I1017 13:42:13.332465 13346 master.cpp:2315] Processing reply for offers: [ > 20141017-134213-16842879-48221-13328-0 ] on slave > 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty) for > framework 20141017-134213-16842879-48221-13328-0000 (default) at > scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 > W1017 13:42:13.332710 13346 master.cpp:1969] Executor default for task 1 uses > less CPUs (None) than the minimum required (0.01). Please update your > executor, as this will be mandatory in future releases. > W1017 13:42:13.332919 13346 master.cpp:1980] Executor default for task 1 uses > less memory (None) than the minimum required (32MB). Please update your > executor, as this will be mandatory in future releases. > I1017 13:42:13.333108 13346 master.cpp:2397] Authorizing framework principal > 'test-principal' to launch task 1 as user 'jenkins' > I1017 13:42:13.333549 13346 master.hpp:869] Adding task 1 with resources > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave > 20141017-134213-16842879-48221-13328-0 (trusty) > I1017 13:42:13.333825 13346 master.cpp:2463] Launching task 1 of framework > 20141017-134213-16842879-48221-13328-0000 (default) at > scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 with resources > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave > 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty) > I1017 13:42:13.334203 13346 slave.cpp:1079] Got assigned task 1 for framework > 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.334875 13345 master.cpp:2639] Asked to kill task 1 of > framework 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.335054 13345 master.cpp:2734] Telling slave > 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty) to > kill task 1 of framework 20141017-134213-16842879-48221-13328-0000 (default) > at scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 > I1017 13:42:13.335261 13345 slave.cpp:1371] Asked to kill task 1 of framework > 20141017-134213-16842879-48221-13328-0000 > W1017 13:42:13.335489 13345 slave.cpp:1412] Killing task 1 of framework > 20141017-134213-16842879-48221-13328-0000 before it was launched > I1017 13:42:13.336645 13345 slave.cpp:2200] Handling status update > TASK_KILLED (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of > framework 20141017-134213-16842879-48221-13328-0000 from @0.0.0.0:0 > W1017 13:42:13.336925 13345 slave.cpp:2233] Could not find the executor for > status update TASK_KILLED (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for > task 1 of framework 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.337209 13345 slave.cpp:2971] Cleaning up framework > 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.337323 13342 status_update_manager.cpp:320] Received status > update TASK_KILLED (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of > framework 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.337715 13342 status_update_manager.cpp:499] Creating > StatusUpdate stream for task 1 of framework > 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.338084 13342 status_update_manager.cpp:373] Forwarding status > update TASK_KILLED (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of > framework 20141017-134213-16842879-48221-13328-0000 to master@127.0.1.1:48221 > I1017 13:42:13.338402 13342 status_update_manager.cpp:282] Closing status > update streams for framework 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.338677 13342 status_update_manager.cpp:530] Cleaning up status > update stream for task 1 of framework > 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.338562 13347 master.cpp:3352] Forwarding status update > TASK_KILLED (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of > framework 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.339165 13347 master.cpp:3324] Status update TASK_KILLED (UUID: > fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of framework > 20141017-134213-16842879-48221-13328-0000 from slave > 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty) > I1017 13:42:13.339725 13344 hierarchical_allocator_process.hpp:563] Recovered > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total > allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on > slave 20141017-134213-16842879-48221-13328-0 from framework > 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.339395 13348 sched.cpp:635] Scheduler::statusUpdate took > 25435ns > I1017 13:42:13.340298 13347 master.cpp:4551] Removing task 1 with resources > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] of framework > 20141017-134213-16842879-48221-13328-0000 on slave > 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty) > I1017 13:42:13.340586 13347 master.cpp:2813] Forwarding status update > acknowledgement fad148df-b3ef-44a2-836d-f1675c97a682 for task 1 of framework > 20141017-134213-16842879-48221-13328-0000 (default) at > scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 to slave > 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty) > E1017 13:42:13.340823 13345 slave.cpp:3469] Failed to find the mtime of > '/tmp/SlaveTest_KillTaskBetweenRunTaskParts_Z7tpIo/slaves/20141017-134213-16842879-48221-13328-0/frameworks/20141017-134213-16842879-48221-13328-0000': > Error invoking stat for > '/tmp/SlaveTest_KillTaskBetweenRunTaskParts_Z7tpIo/slaves/20141017-134213-16842879-48221-13328-0/frameworks/20141017-134213-16842879-48221-13328-0000': > No such file or directory > I1017 13:42:13.341151 13345 slave.cpp:2357] Status update manager > successfully handled status update TASK_KILLED (UUID: > fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of framework > 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.341425 13345 status_update_manager.cpp:398] Received status > update acknowledgement (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for task > 1 of framework 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.341714 13349 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 10.636355ms > I1017 13:42:13.341894 13349 replica.cpp:676] Persisted action at 4 > I1017 13:42:13.342350 13348 replica.cpp:655] Replica received learned notice > for position 4 > E1017 13:42:13.342881 13345 slave.cpp:1780] Failed to handle status update > acknowledgement (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of > framework 20141017-134213-16842879-48221-13328-0000: Cannot find the status > update stream for task 1 of framework > 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:13.343281 13328 slave.cpp:1189] Launching task 1 for framework > 20141017-134213-16842879-48221-13328-0000 > W1017 13:42:13.343456 13328 slave.cpp:1194] Ignoring run task 1 because the > framework 20141017-134213-16842879-48221-13328-0000 does not exist > I1017 13:42:13.353662 13348 leveldb.cpp:343] Persisting action (18 bytes) to > leveldb took 10.997701ms > I1017 13:42:13.354200 13348 leveldb.cpp:401] Deleting ~2 keys from leveldb > took 260599ns > I1017 13:42:13.354375 13348 replica.cpp:676] Persisted action at 4 > I1017 13:42:13.354562 13348 replica.cpp:661] Replica learned TRUNCATE action > at position 4 > I1017 13:42:14.104835 13348 hierarchical_allocator_process.hpp:734] Offering > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave > 20141017-134213-16842879-48221-13328-0 to framework > 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:14.105361 13348 hierarchical_allocator_process.hpp:659] Performed > allocation for 1 slaves in 615690ns > I1017 13:42:14.105605 13346 master.cpp:3729] Sending 1 offers to framework > 20141017-134213-16842879-48221-13328-0000 (default) at > scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 > I1017 13:42:14.106204 13346 sched.cpp:544] Scheduler::resourceOffers took > 20474ns > I1017 13:42:15.106046 13344 hierarchical_allocator_process.hpp:659] Performed > allocation for 1 slaves in 71533ns > 2014-10-17 > 13:42:15,828:13328(0x2ae8f942b700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:37292] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > I1017 13:42:16.107005 13346 hierarchical_allocator_process.hpp:659] Performed > allocation for 1 slaves in 65058ns > I1017 13:42:17.108440 13344 hierarchical_allocator_process.hpp:659] Performed > allocation for 1 slaves in 85967ns > I1017 13:42:18.104810 13347 master.cpp:120] No whitelist given. Advertising > offers for all slaves > I1017 13:42:18.109485 13346 hierarchical_allocator_process.hpp:659] Performed > allocation for 1 slaves in 47948ns > I1017 13:42:19.110553 13348 hierarchical_allocator_process.hpp:659] Performed > allocation for 1 slaves in 66051ns > 2014-10-17 > 13:42:19,167:13328(0x2ae8f942b700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:37292] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > I1017 13:42:20.111214 13343 hierarchical_allocator_process.hpp:659] Performed > allocation for 1 slaves in 79222ns > I1017 13:42:21.112387 13349 hierarchical_allocator_process.hpp:659] Performed > allocation for 1 slaves in 101805ns > I1017 13:42:22.113950 13345 hierarchical_allocator_process.hpp:659] Performed > allocation for 1 slaves in 87310ns > 2014-10-17 > 13:42:22,504:13328(0x2ae8f942b700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:37292] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > I1017 13:42:23.105613 13348 master.cpp:120] No whitelist given. Advertising > offers for all slaves > I1017 13:42:23.115574 13345 hierarchical_allocator_process.hpp:659] Performed > allocation for 1 slaves in 159780ns > tests/slave_tests.cpp:1141: Failure > Failed to wait 10secs for removeFramework > tests/slave_tests.cpp:1133: Failure > Actual function call count doesn't match EXPECT_CALL(slave, > removeFramework(_))... > Expected: to be called once > Actual: never called - unsatisfied and active > I1017 13:42:23.350111 13342 master.cpp:768] Framework > 20141017-134213-16842879-48221-13328-0000 (default) at > scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 disconnected > I1017 13:42:23.350391 13342 master.cpp:1731] Disconnecting framework > 20141017-134213-16842879-48221-13328-0000 (default) at > scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 > I1017 13:42:23.350551 13342 master.cpp:1747] Deactivating framework > 20141017-134213-16842879-48221-13328-0000 (default) at > scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 > I1017 13:42:23.350930 13342 master.cpp:790] Giving framework > 20141017-134213-16842879-48221-13328-0000 (default) at > scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 0ns to failover > I1017 13:42:23.351084 13347 hierarchical_allocator_process.hpp:405] > Deactivated framework 20141017-134213-16842879-48221-13328-0000 > I1017 13:42:23.351521 13347 hierarchical_allocator_process.hpp:563] Recovered > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total > allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on > slave 20141017-134213-16842879-48221-13328-0 from framework > 20141017-134213-16842879-48221-13328-0000 > lt-mesos-tests: > ../3rdparty/libprocess/include/process/c++11/dispatch.hpp:155: auto > process::dispatch(const PID<mesos::internal::slave::Slave> &, void > (mesos::internal::slave::Slave::*)(const > process::Future<Option<mesos::MasterInfo> > &), > process::Future<Option<mesos::MasterInfo> >)::<anonymous > class>::operator()(process::ProcessBase *) const: Assertion `t != __null' > failed. > *** Aborted at 1413578543 (unix time) try "date -d @1413578543" if you are > using GNU date *** > I1017 13:42:23.357753 13347 master.cpp:677] Master terminating > I1017 13:42:23.357992 13347 master.cpp:4580] Removing executor 'default' with > resources of framework 20141017-134213-16842879-48221-13328-0000 on slave > 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty) > PC: @ 0x2ae8cb23fbb9 (unknown) > [ FAILED ] SlaveTest.KillTaskBetweenRunTaskParts (10399 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)