[ 
https://issues.apache.org/jira/browse/MESOS-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone resolved MESOS-1945.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 0.21.0

commit 5ccf43e5ad037e42b7488da80e61504cc721ae37
Author: Bernd Mathiske <be...@mesosphere.io>
Date:   Tue Oct 21 10:04:54 2014 -0700

    Fixed flaky SlaveTest.KillTaskBetweenRunTaskParts test.
    
    Moving EXPECT_CALL(removeFramework) before killTask() removes a race
    in which the expected call could occur before the expectation was set.
    
    Review: https://reviews.apache.org/r/26929


> SlaveTest.KillTaskBetweenRunTaskParts is flaky
> ----------------------------------------------
>
>                 Key: MESOS-1945
>                 URL: https://issues.apache.org/jira/browse/MESOS-1945
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.21.0
>            Reporter: Vinod Kone
>            Assignee: Bernd Mathiske
>             Fix For: 0.21.0
>
>
> Observed this on the internal CI.
> [~bernd-mesos] Can you take a look?
> {code}
> [ RUN      ] SlaveTest.KillTaskBetweenRunTaskParts
> Using temporary directory '/tmp/SlaveTest_KillTaskBetweenRunTaskParts_RmlPwG'
> I1017 13:42:13.066948 13328 leveldb.cpp:176] Opened db in 102.342262ms
> I1017 13:42:13.096580 13328 leveldb.cpp:183] Compacted db in 29.603997ms
> I1017 13:42:13.096628 13328 leveldb.cpp:198] Created db iterator in 10276ns
> I1017 13:42:13.096638 13328 leveldb.cpp:204] Seeked to beginning of db in 
> 4732ns
> I1017 13:42:13.096644 13328 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 3353ns
> I1017 13:42:13.096659 13328 replica.cpp:741] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1017 13:42:13.096951 13349 recover.cpp:437] Starting replica recovery
> I1017 13:42:13.097007 13349 recover.cpp:463] Replica is in EMPTY status
> I1017 13:42:13.097256 13349 replica.cpp:638] Replica in EMPTY status received 
> a broadcasted recover request
> I1017 13:42:13.097306 13349 recover.cpp:188] Received a recover response from 
> a replica in EMPTY status
> I1017 13:42:13.097378 13349 recover.cpp:554] Updating replica status to 
> STARTING
> I1017 13:42:13.101631 13345 master.cpp:312] Master 
> 20141017-134213-16842879-48221-13328 (trusty) started on 127.0.1.1:48221
> I1017 13:42:13.102226 13345 master.cpp:358] Master only allowing 
> authenticated frameworks to register
> I1017 13:42:13.102473 13345 master.cpp:363] Master only allowing 
> authenticated slaves to register
> I1017 13:42:13.102738 13345 credentials.hpp:36] Loading credentials for 
> authentication from 
> '/tmp/SlaveTest_KillTaskBetweenRunTaskParts_RmlPwG/credentials'
> I1017 13:42:13.103142 13345 master.cpp:392] Authorization enabled
> I1017 13:42:13.103667 13346 hierarchical_allocator_process.hpp:299] 
> Initializing hierarchical allocator process with master : 
> master@127.0.1.1:48221
> I1017 13:42:13.103966 13342 master.cpp:120] No whitelist given. Advertising 
> offers for all slaves
> I1017 13:42:13.104833 13345 master.cpp:1242] The newly elected leader is 
> master@127.0.1.1:48221 with id 20141017-134213-16842879-48221-13328
> I1017 13:42:13.105020 13345 master.cpp:1255] Elected as the leading master!
> I1017 13:42:13.105200 13345 master.cpp:1073] Recovering from registrar
> I1017 13:42:13.105465 13347 registrar.cpp:313] Recovering registrar
> I1017 13:42:13.112493 13349 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 15.090983ms
> I1017 13:42:13.112735 13349 replica.cpp:320] Persisted replica status to 
> STARTING
> I1017 13:42:13.113172 13349 recover.cpp:463] Replica is in STARTING status
> I1017 13:42:13.113713 13349 replica.cpp:638] Replica in STARTING status 
> received a broadcasted recover request
> I1017 13:42:13.113998 13349 recover.cpp:188] Received a recover response from 
> a replica in STARTING status
> I1017 13:42:13.114323 13349 recover.cpp:554] Updating replica status to VOTING
> I1017 13:42:13.131239 13349 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 16.594369ms
> I1017 13:42:13.131573 13349 replica.cpp:320] Persisted replica status to 
> VOTING
> I1017 13:42:13.131916 13344 recover.cpp:568] Successfully joined the Paxos 
> group
> I1017 13:42:13.132225 13342 recover.cpp:452] Recover process terminated
> I1017 13:42:13.132542 13343 log.cpp:656] Attempting to start the writer
> I1017 13:42:13.134614 13343 replica.cpp:474] Replica received implicit 
> promise request with proposal 1
> I1017 13:42:13.155139 13343 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 20.162122ms
> I1017 13:42:13.155519 13343 replica.cpp:342] Persisted promised to 1
> I1017 13:42:13.155941 13343 coordinator.cpp:230] Coordinator attemping to 
> fill missing position
> I1017 13:42:13.156524 13343 replica.cpp:375] Replica received explicit 
> promise request for position 0 with proposal 2
> I1017 13:42:13.170680 13343 leveldb.cpp:343] Persisting action (8 bytes) to 
> leveldb took 13.967251ms
> I1017 13:42:13.171041 13343 replica.cpp:676] Persisted action at 0
> I1017 13:42:13.171551 13343 replica.cpp:508] Replica received write request 
> for position 0
> I1017 13:42:13.171787 13343 leveldb.cpp:438] Reading position from leveldb 
> took 30854ns
> I1017 13:42:13.182826 13343 leveldb.cpp:343] Persisting action (14 bytes) to 
> leveldb took 10.867833ms
> I1017 13:42:13.183177 13343 replica.cpp:676] Persisted action at 0
> I1017 13:42:13.186982 13348 replica.cpp:655] Replica received learned notice 
> for position 0
> I1017 13:42:13.201468 13348 leveldb.cpp:343] Persisting action (16 bytes) to 
> leveldb took 14.332407ms
> I1017 13:42:13.201720 13348 replica.cpp:676] Persisted action at 0
> I1017 13:42:13.202015 13348 replica.cpp:661] Replica learned NOP action at 
> position 0
> I1017 13:42:13.202435 13348 log.cpp:672] Writer started with ending position 0
> I1017 13:42:13.202925 13348 leveldb.cpp:438] Reading position from leveldb 
> took 27747ns
> I1017 13:42:13.204242 13348 registrar.cpp:346] Successfully fetched the 
> registry (0B) in 98.612992ms
> I1017 13:42:13.204498 13348 registrar.cpp:445] Applied 1 operations in 
> 14313ns; attempting to update the 'registry'
> I1017 13:42:13.205639 13349 log.cpp:680] Attempting to append 119 bytes to 
> the log
> I1017 13:42:13.205909 13349 coordinator.cpp:340] Coordinator attempting to 
> write APPEND action at position 1
> I1017 13:42:13.206362 13349 replica.cpp:508] Replica received write request 
> for position 1
> I1017 13:42:13.221566 13349 leveldb.cpp:343] Persisting action (136 bytes) to 
> leveldb took 15.040219ms
> I1017 13:42:13.221835 13349 replica.cpp:676] Persisted action at 1
> I1017 13:42:13.222483 13349 replica.cpp:655] Replica received learned notice 
> for position 1
> I1017 13:42:13.241530 13349 leveldb.cpp:343] Persisting action (138 bytes) to 
> leveldb took 18.838403ms
> I1017 13:42:13.241823 13349 replica.cpp:676] Persisted action at 1
> I1017 13:42:13.242197 13349 replica.cpp:661] Replica learned APPEND action at 
> position 1
> I1017 13:42:13.242813 13349 registrar.cpp:490] Successfully updated the 
> 'registry' in 38.1248ms
> I1017 13:42:13.243016 13349 registrar.cpp:376] Successfully recovered 
> registrar
> I1017 13:42:13.243237 13349 log.cpp:699] Attempting to truncate the log to 1
> I1017 13:42:13.243494 13349 master.cpp:1100] Recovered 0 slaves from the 
> Registry (83B) ; allowing 10mins for slaves to re-register
> I1017 13:42:13.243643 13342 coordinator.cpp:340] Coordinator attempting to 
> write TRUNCATE action at position 2
> I1017 13:42:13.244231 13342 replica.cpp:508] Replica received write request 
> for position 2
> I1017 13:42:13.261124 13342 leveldb.cpp:343] Persisting action (16 bytes) to 
> leveldb took 16.733315ms
> I1017 13:42:13.261438 13342 replica.cpp:676] Persisted action at 2
> I1017 13:42:13.264452 13345 replica.cpp:655] Replica received learned notice 
> for position 2
> I1017 13:42:13.281764 13345 leveldb.cpp:343] Persisting action (18 bytes) to 
> leveldb took 17.088595ms
> I1017 13:42:13.282115 13345 leveldb.cpp:401] Deleting ~1 keys from leveldb 
> took 63443ns
> I1017 13:42:13.282454 13345 replica.cpp:676] Persisted action at 2
> I1017 13:42:13.282613 13345 replica.cpp:661] Replica learned TRUNCATE action 
> at position 2
> I1017 13:42:13.288882 13348 slave.cpp:169] Slave started on 127.0.1.1:48221
> I1017 13:42:13.289211 13348 credentials.hpp:84] Loading credential for 
> authentication from 
> '/tmp/SlaveTest_KillTaskBetweenRunTaskParts_Z7tpIo/credential'
> I1017 13:42:13.289453 13348 slave.cpp:276] Slave using credential for: 
> test-principal
> I1017 13:42:13.289701 13348 slave.cpp:289] Slave resources: cpus(*):2; 
> mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I1017 13:42:13.289937 13348 slave.cpp:318] Slave hostname: trusty
> I1017 13:42:13.290114 13348 slave.cpp:319] Slave checkpoint: false
> I1017 13:42:13.290704 13348 state.cpp:33] Recovering state from 
> '/tmp/SlaveTest_KillTaskBetweenRunTaskParts_Z7tpIo/meta'
> I1017 13:42:13.291016 13348 status_update_manager.cpp:193] Recovering status 
> update manager
> I1017 13:42:13.291292 13348 slave.cpp:3362] Finished recovery
> I1017 13:42:13.291786 13348 slave.cpp:600] New master detected at 
> master@127.0.1.1:48221
> I1017 13:42:13.292021 13344 status_update_manager.cpp:167] New master 
> detected at master@127.0.1.1:48221
> I1017 13:42:13.292275 13348 slave.cpp:674] Authenticating with master 
> master@127.0.1.1:48221
> I1017 13:42:13.292551 13342 authenticatee.hpp:133] Creating new client SASL 
> connection
> I1017 13:42:13.292855 13342 master.cpp:3787] Authenticating 
> (623)@127.0.1.1:48221
> I1017 13:42:13.293114 13342 authenticator.hpp:161] Creating new server SASL 
> connection
> I1017 13:42:13.293398 13342 authenticatee.hpp:224] Received SASL 
> authentication mechanisms: CRAM-MD5
> I1017 13:42:13.293592 13342 authenticatee.hpp:250] Attempting to authenticate 
> with mechanism 'CRAM-MD5'
> I1017 13:42:13.293789 13342 authenticator.hpp:267] Received SASL 
> authentication start
> I1017 13:42:13.293995 13342 authenticator.hpp:389] Authentication requires 
> more steps
> I1017 13:42:13.294193 13342 authenticatee.hpp:270] Received SASL 
> authentication step
> I1017 13:42:13.294400 13342 authenticator.hpp:295] Received SASL 
> authentication step
> I1017 13:42:13.294586 13342 auxprop.cpp:81] Request to lookup properties for 
> user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' 
> SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false 
> SASL_AUXPROP_AUTHZID: false 
> I1017 13:42:13.294780 13342 auxprop.cpp:153] Looking up auxiliary property 
> '*userPassword'
> I1017 13:42:13.294963 13342 auxprop.cpp:153] Looking up auxiliary property 
> '*cmusaslsecretCRAM-MD5'
> I1017 13:42:13.295145 13342 auxprop.cpp:81] Request to lookup properties for 
> user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' 
> SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false 
> SASL_AUXPROP_AUTHZID: true 
> I1017 13:42:13.295318 13342 auxprop.cpp:103] Skipping auxiliary property 
> '*userPassword' since SASL_AUXPROP_AUTHZID == true
> I1017 13:42:13.295490 13342 auxprop.cpp:103] Skipping auxiliary property 
> '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> I1017 13:42:13.295668 13342 authenticator.hpp:381] Authentication success
> I1017 13:42:13.295869 13342 authenticatee.hpp:310] Authentication success
> I1017 13:42:13.296062 13342 master.cpp:3827] Successfully authenticated 
> principal 'test-principal' at (623)@127.0.1.1:48221
> I1017 13:42:13.296386 13348 slave.cpp:647] Detecting new master
> I1017 13:42:13.301076 13348 slave.cpp:731] Successfully authenticated with 
> master master@127.0.1.1:48221
> I1017 13:42:13.301614 13346 master.cpp:2968] Registering slave at 
> (623)@127.0.1.1:48221 (trusty) with id 20141017-134213-16842879-48221-13328-0
> I1017 13:42:13.301964 13346 registrar.cpp:445] Applied 1 operations in 
> 21609ns; attempting to update the 'registry'
> I1017 13:42:13.303146 13344 log.cpp:680] Attempting to append 284 bytes to 
> the log
> I1017 13:42:13.303452 13344 coordinator.cpp:340] Coordinator attempting to 
> write APPEND action at position 3
> I1017 13:42:13.303969 13344 replica.cpp:508] Replica received write request 
> for position 3
> I1017 13:42:13.304587 13348 slave.cpp:1048] Will retry registration in 
> 3.95735ms if necessary
> I1017 13:42:13.306241 13328 sched.cpp:137] Version: 0.21.0
> I1017 13:42:13.306624 13346 sched.cpp:233] New master detected at 
> master@127.0.1.1:48221
> I1017 13:42:13.306804 13346 sched.cpp:283] Authenticating with master 
> master@127.0.1.1:48221
> I1017 13:42:13.307168 13346 authenticatee.hpp:133] Creating new client SASL 
> connection
> I1017 13:42:13.307487 13346 master.cpp:3787] Authenticating 
> scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221
> I1017 13:42:13.307837 13346 authenticator.hpp:161] Creating new server SASL 
> connection
> I1017 13:42:13.308207 13346 authenticatee.hpp:224] Received SASL 
> authentication mechanisms: CRAM-MD5
> I1017 13:42:13.308398 13346 authenticatee.hpp:250] Attempting to authenticate 
> with mechanism 'CRAM-MD5'
> I1017 13:42:13.308619 13346 authenticator.hpp:267] Received SASL 
> authentication start
> I1017 13:42:13.308823 13346 authenticator.hpp:389] Authentication requires 
> more steps
> I1017 13:42:13.309046 13346 authenticatee.hpp:270] Received SASL 
> authentication step
> I1017 13:42:13.309221 13348 slave.cpp:1048] Will retry registration in 
> 18.553497ms if necessary
> I1017 13:42:13.309278 13345 master.cpp:2956] Ignoring register slave message 
> from (623)@127.0.1.1:48221 (trusty) as admission is already in progress
> I1017 13:42:13.309834 13348 authenticator.hpp:295] Received SASL 
> authentication step
> I1017 13:42:13.310006 13348 auxprop.cpp:81] Request to lookup properties for 
> user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' 
> SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false 
> SASL_AUXPROP_AUTHZID: false 
> I1017 13:42:13.310238 13348 auxprop.cpp:153] Looking up auxiliary property 
> '*userPassword'
> I1017 13:42:13.310441 13348 auxprop.cpp:153] Looking up auxiliary property 
> '*cmusaslsecretCRAM-MD5'
> I1017 13:42:13.310624 13348 auxprop.cpp:81] Request to lookup properties for 
> user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' 
> SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false 
> SASL_AUXPROP_AUTHZID: true 
> I1017 13:42:13.310813 13348 auxprop.cpp:103] Skipping auxiliary property 
> '*userPassword' since SASL_AUXPROP_AUTHZID == true
> I1017 13:42:13.310987 13348 auxprop.cpp:103] Skipping auxiliary property 
> '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> I1017 13:42:13.311210 13348 authenticator.hpp:381] Authentication success
> I1017 13:42:13.311472 13343 authenticatee.hpp:310] Authentication success
> I1017 13:42:13.315773 13344 leveldb.cpp:343] Persisting action (303 bytes) to 
> leveldb took 11.584908ms
> I1017 13:42:13.316170 13344 replica.cpp:676] Persisted action at 3
> I1017 13:42:13.316678 13344 replica.cpp:655] Replica received learned notice 
> for position 3
> I1017 13:42:13.317164 13348 master.cpp:3827] Successfully authenticated 
> principal 'test-principal' at 
> scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221
> I1017 13:42:13.317832 13343 sched.cpp:357] Successfully authenticated with 
> master master@127.0.1.1:48221
> I1017 13:42:13.318039 13343 sched.cpp:476] Sending registration request to 
> master@127.0.1.1:48221
> I1017 13:42:13.318295 13347 master.cpp:1362] Received registration request 
> for framework 'default' at 
> scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221
> I1017 13:42:13.318481 13347 master.cpp:1321] Authorizing framework principal 
> 'test-principal' to receive offers for role '*'
> I1017 13:42:13.318783 13347 master.cpp:1426] Registering framework 
> 20141017-134213-16842879-48221-13328-0000 (default) at 
> scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221
> I1017 13:42:13.319156 13347 hierarchical_allocator_process.hpp:329] Added 
> framework 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.319458 13347 hierarchical_allocator_process.hpp:697] No 
> resources available to allocate!
> I1017 13:42:13.319684 13347 hierarchical_allocator_process.hpp:659] Performed 
> allocation for 0 slaves in 241676ns
> I1017 13:42:13.320008 13343 sched.cpp:407] Framework registered with 
> 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.320247 13343 sched.cpp:421] Scheduler::registered took 22306ns
> I1017 13:42:13.326701 13344 leveldb.cpp:343] Persisting action (305 bytes) to 
> leveldb took 9.696717ms
> I1017 13:42:13.326954 13344 replica.cpp:676] Persisted action at 3
> I1017 13:42:13.327303 13344 replica.cpp:661] Replica learned APPEND action at 
> position 3
> I1017 13:42:13.328037 13342 registrar.cpp:490] Successfully updated the 
> 'registry' in 25.80992ms
> I1017 13:42:13.328490 13342 master.cpp:3022] Registered slave 
> 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty) with 
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I1017 13:42:13.328840 13342 slave.cpp:765] Registered with master 
> master@127.0.1.1:48221; given slave ID 20141017-134213-16842879-48221-13328-0
> I1017 13:42:13.329071 13342 slave.cpp:2436] Received ping from 
> slave-observer(35)@127.0.1.1:48221
> I1017 13:42:13.328671 13346 hierarchical_allocator_process.hpp:442] Added 
> slave 20141017-134213-16842879-48221-13328-0 (trusty) with cpus(*):2; 
> mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2; 
> mem(*):1024; disk(*):1024; ports(*):[31000-32000] available)
> I1017 13:42:13.329674 13346 hierarchical_allocator_process.hpp:734] Offering 
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 
> 20141017-134213-16842879-48221-13328-0 to framework 
> 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.329936 13346 hierarchical_allocator_process.hpp:679] Performed 
> allocation for slave 20141017-134213-16842879-48221-13328-0 in 305301ns
> I1017 13:42:13.329319 13349 log.cpp:699] Attempting to truncate the log to 3
> I1017 13:42:13.330471 13349 coordinator.cpp:340] Coordinator attempting to 
> write TRUNCATE action at position 4
> I1017 13:42:13.330883 13349 replica.cpp:508] Replica received write request 
> for position 4
> I1017 13:42:13.330140 13348 master.cpp:3729] Sending 1 offers to framework 
> 20141017-134213-16842879-48221-13328-0000 (default) at 
> scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221
> I1017 13:42:13.331497 13348 sched.cpp:544] Scheduler::resourceOffers took 
> 29643ns
> I1017 13:42:13.332465 13346 master.cpp:2315] Processing reply for offers: [ 
> 20141017-134213-16842879-48221-13328-0 ] on slave 
> 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty) for 
> framework 20141017-134213-16842879-48221-13328-0000 (default) at 
> scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221
> W1017 13:42:13.332710 13346 master.cpp:1969] Executor default for task 1 uses 
> less CPUs (None) than the minimum required (0.01). Please update your 
> executor, as this will be mandatory in future releases.
> W1017 13:42:13.332919 13346 master.cpp:1980] Executor default for task 1 uses 
> less memory (None) than the minimum required (32MB). Please update your 
> executor, as this will be mandatory in future releases.
> I1017 13:42:13.333108 13346 master.cpp:2397] Authorizing framework principal 
> 'test-principal' to launch task 1 as user 'jenkins'
> I1017 13:42:13.333549 13346 master.hpp:869] Adding task 1 with resources 
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 
> 20141017-134213-16842879-48221-13328-0 (trusty)
> I1017 13:42:13.333825 13346 master.cpp:2463] Launching task 1 of framework 
> 20141017-134213-16842879-48221-13328-0000 (default) at 
> scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 with resources 
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 
> 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty)
> I1017 13:42:13.334203 13346 slave.cpp:1079] Got assigned task 1 for framework 
> 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.334875 13345 master.cpp:2639] Asked to kill task 1 of 
> framework 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.335054 13345 master.cpp:2734] Telling slave 
> 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty) to 
> kill task 1 of framework 20141017-134213-16842879-48221-13328-0000 (default) 
> at scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221
> I1017 13:42:13.335261 13345 slave.cpp:1371] Asked to kill task 1 of framework 
> 20141017-134213-16842879-48221-13328-0000
> W1017 13:42:13.335489 13345 slave.cpp:1412] Killing task 1 of framework 
> 20141017-134213-16842879-48221-13328-0000 before it was launched
> I1017 13:42:13.336645 13345 slave.cpp:2200] Handling status update 
> TASK_KILLED (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of 
> framework 20141017-134213-16842879-48221-13328-0000 from @0.0.0.0:0
> W1017 13:42:13.336925 13345 slave.cpp:2233] Could not find the executor for 
> status update TASK_KILLED (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for 
> task 1 of framework 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.337209 13345 slave.cpp:2971] Cleaning up framework 
> 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.337323 13342 status_update_manager.cpp:320] Received status 
> update TASK_KILLED (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of 
> framework 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.337715 13342 status_update_manager.cpp:499] Creating 
> StatusUpdate stream for task 1 of framework 
> 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.338084 13342 status_update_manager.cpp:373] Forwarding status 
> update TASK_KILLED (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of 
> framework 20141017-134213-16842879-48221-13328-0000 to master@127.0.1.1:48221
> I1017 13:42:13.338402 13342 status_update_manager.cpp:282] Closing status 
> update streams for framework 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.338677 13342 status_update_manager.cpp:530] Cleaning up status 
> update stream for task 1 of framework 
> 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.338562 13347 master.cpp:3352] Forwarding status update 
> TASK_KILLED (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of 
> framework 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.339165 13347 master.cpp:3324] Status update TASK_KILLED (UUID: 
> fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of framework 
> 20141017-134213-16842879-48221-13328-0000 from slave 
> 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty)
> I1017 13:42:13.339725 13344 hierarchical_allocator_process.hpp:563] Recovered 
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total 
> allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on 
> slave 20141017-134213-16842879-48221-13328-0 from framework 
> 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.339395 13348 sched.cpp:635] Scheduler::statusUpdate took 
> 25435ns
> I1017 13:42:13.340298 13347 master.cpp:4551] Removing task 1 with resources 
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] of framework 
> 20141017-134213-16842879-48221-13328-0000 on slave 
> 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty)
> I1017 13:42:13.340586 13347 master.cpp:2813] Forwarding status update 
> acknowledgement fad148df-b3ef-44a2-836d-f1675c97a682 for task 1 of framework 
> 20141017-134213-16842879-48221-13328-0000 (default) at 
> scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 to slave 
> 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty)
> E1017 13:42:13.340823 13345 slave.cpp:3469] Failed to find the mtime of 
> '/tmp/SlaveTest_KillTaskBetweenRunTaskParts_Z7tpIo/slaves/20141017-134213-16842879-48221-13328-0/frameworks/20141017-134213-16842879-48221-13328-0000':
>  Error invoking stat for 
> '/tmp/SlaveTest_KillTaskBetweenRunTaskParts_Z7tpIo/slaves/20141017-134213-16842879-48221-13328-0/frameworks/20141017-134213-16842879-48221-13328-0000':
>  No such file or directory
> I1017 13:42:13.341151 13345 slave.cpp:2357] Status update manager 
> successfully handled status update TASK_KILLED (UUID: 
> fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of framework 
> 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.341425 13345 status_update_manager.cpp:398] Received status 
> update acknowledgement (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for task 
> 1 of framework 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.341714 13349 leveldb.cpp:343] Persisting action (16 bytes) to 
> leveldb took 10.636355ms
> I1017 13:42:13.341894 13349 replica.cpp:676] Persisted action at 4
> I1017 13:42:13.342350 13348 replica.cpp:655] Replica received learned notice 
> for position 4
> E1017 13:42:13.342881 13345 slave.cpp:1780] Failed to handle status update 
> acknowledgement (UUID: fad148df-b3ef-44a2-836d-f1675c97a682) for task 1 of 
> framework 20141017-134213-16842879-48221-13328-0000: Cannot find the status 
> update stream for task 1 of framework 
> 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:13.343281 13328 slave.cpp:1189] Launching task 1 for framework 
> 20141017-134213-16842879-48221-13328-0000
> W1017 13:42:13.343456 13328 slave.cpp:1194] Ignoring run task 1 because the 
> framework 20141017-134213-16842879-48221-13328-0000 does not exist
> I1017 13:42:13.353662 13348 leveldb.cpp:343] Persisting action (18 bytes) to 
> leveldb took 10.997701ms
> I1017 13:42:13.354200 13348 leveldb.cpp:401] Deleting ~2 keys from leveldb 
> took 260599ns
> I1017 13:42:13.354375 13348 replica.cpp:676] Persisted action at 4
> I1017 13:42:13.354562 13348 replica.cpp:661] Replica learned TRUNCATE action 
> at position 4
> I1017 13:42:14.104835 13348 hierarchical_allocator_process.hpp:734] Offering 
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 
> 20141017-134213-16842879-48221-13328-0 to framework 
> 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:14.105361 13348 hierarchical_allocator_process.hpp:659] Performed 
> allocation for 1 slaves in 615690ns
> I1017 13:42:14.105605 13346 master.cpp:3729] Sending 1 offers to framework 
> 20141017-134213-16842879-48221-13328-0000 (default) at 
> scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221
> I1017 13:42:14.106204 13346 sched.cpp:544] Scheduler::resourceOffers took 
> 20474ns
> I1017 13:42:15.106046 13344 hierarchical_allocator_process.hpp:659] Performed 
> allocation for 1 slaves in 71533ns
> 2014-10-17 
> 13:42:15,828:13328(0x2ae8f942b700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:37292] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> I1017 13:42:16.107005 13346 hierarchical_allocator_process.hpp:659] Performed 
> allocation for 1 slaves in 65058ns
> I1017 13:42:17.108440 13344 hierarchical_allocator_process.hpp:659] Performed 
> allocation for 1 slaves in 85967ns
> I1017 13:42:18.104810 13347 master.cpp:120] No whitelist given. Advertising 
> offers for all slaves
> I1017 13:42:18.109485 13346 hierarchical_allocator_process.hpp:659] Performed 
> allocation for 1 slaves in 47948ns
> I1017 13:42:19.110553 13348 hierarchical_allocator_process.hpp:659] Performed 
> allocation for 1 slaves in 66051ns
> 2014-10-17 
> 13:42:19,167:13328(0x2ae8f942b700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:37292] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> I1017 13:42:20.111214 13343 hierarchical_allocator_process.hpp:659] Performed 
> allocation for 1 slaves in 79222ns
> I1017 13:42:21.112387 13349 hierarchical_allocator_process.hpp:659] Performed 
> allocation for 1 slaves in 101805ns
> I1017 13:42:22.113950 13345 hierarchical_allocator_process.hpp:659] Performed 
> allocation for 1 slaves in 87310ns
> 2014-10-17 
> 13:42:22,504:13328(0x2ae8f942b700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:37292] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> I1017 13:42:23.105613 13348 master.cpp:120] No whitelist given. Advertising 
> offers for all slaves
> I1017 13:42:23.115574 13345 hierarchical_allocator_process.hpp:659] Performed 
> allocation for 1 slaves in 159780ns
> tests/slave_tests.cpp:1141: Failure
> Failed to wait 10secs for removeFramework
> tests/slave_tests.cpp:1133: Failure
> Actual function call count doesn't match EXPECT_CALL(slave, 
> removeFramework(_))...
>          Expected: to be called once
>            Actual: never called - unsatisfied and active
> I1017 13:42:23.350111 13342 master.cpp:768] Framework 
> 20141017-134213-16842879-48221-13328-0000 (default) at 
> scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 disconnected
> I1017 13:42:23.350391 13342 master.cpp:1731] Disconnecting framework 
> 20141017-134213-16842879-48221-13328-0000 (default) at 
> scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221
> I1017 13:42:23.350551 13342 master.cpp:1747] Deactivating framework 
> 20141017-134213-16842879-48221-13328-0000 (default) at 
> scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221
> I1017 13:42:23.350930 13342 master.cpp:790] Giving framework 
> 20141017-134213-16842879-48221-13328-0000 (default) at 
> scheduler-f07acea1-8d44-4b35-bf61-3215b6258c97@127.0.1.1:48221 0ns to failover
> I1017 13:42:23.351084 13347 hierarchical_allocator_process.hpp:405] 
> Deactivated framework 20141017-134213-16842879-48221-13328-0000
> I1017 13:42:23.351521 13347 hierarchical_allocator_process.hpp:563] Recovered 
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total 
> allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on 
> slave 20141017-134213-16842879-48221-13328-0 from framework 
> 20141017-134213-16842879-48221-13328-0000
> lt-mesos-tests: 
> ../3rdparty/libprocess/include/process/c++11/dispatch.hpp:155: auto 
> process::dispatch(const PID<mesos::internal::slave::Slave> &, void 
> (mesos::internal::slave::Slave::*)(const 
> process::Future<Option<mesos::MasterInfo> > &), 
> process::Future<Option<mesos::MasterInfo> >)::<anonymous 
> class>::operator()(process::ProcessBase *) const: Assertion `t != __null' 
> failed.
> *** Aborted at 1413578543 (unix time) try "date -d @1413578543" if you are 
> using GNU date ***
> I1017 13:42:23.357753 13347 master.cpp:677] Master terminating
> I1017 13:42:23.357992 13347 master.cpp:4580] Removing executor 'default' with 
> resources  of framework 20141017-134213-16842879-48221-13328-0000 on slave 
> 20141017-134213-16842879-48221-13328-0 at (623)@127.0.1.1:48221 (trusty)
> PC: @     0x2ae8cb23fbb9 (unknown)
> [  FAILED  ] SlaveTest.KillTaskBetweenRunTaskParts (10399 ms)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)