[ https://issues.apache.org/jira/browse/MESOS-1543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Lambert updated MESOS-1543: --------------------------------- Sprint: Q2 Sprint 5 > MasterTest.OrphanTasks is flaky > ------------------------------- > > Key: MESOS-1543 > URL: https://issues.apache.org/jira/browse/MESOS-1543 > Project: Mesos > Issue Type: Bug > Components: test > Reporter: Vinod Kone > Assignee: Yifan Gu > > Looks like the slave's detector was discarded before the slave shutdown. > {code} > [ RUN ] MasterTest.OrphanTasks > Using temporary directory '/tmp/MasterTest_OrphanTasks_s8r6NU' > I0625 16:07:58.861265 26203 leveldb.cpp:176] Opened db in 48.741714ms > I0625 16:07:58.877020 26203 leveldb.cpp:183] Compacted db in 15.444957ms > I0625 16:07:58.877488 26203 leveldb.cpp:198] Created db iterator in 8308ns > I0625 16:07:58.877691 26203 leveldb.cpp:204] Seeked to beginning of db in > 2133ns > I0625 16:07:58.878000 26203 leveldb.cpp:273] Iterated through 0 keys in the > db in 1896ns > I0625 16:07:58.878221 26203 replica.cpp:741] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0625 16:07:58.878715 26229 recover.cpp:425] Starting replica recovery > I0625 16:07:58.878783 26229 recover.cpp:451] Replica is in EMPTY status > I0625 16:07:58.879034 26229 replica.cpp:638] Replica in EMPTY status received > a broadcasted recover request > I0625 16:07:58.879088 26229 recover.cpp:188] Received a recover response from > a replica in EMPTY status > I0625 16:07:58.879166 26229 recover.cpp:542] Updating replica status to > STARTING > I0625 16:07:58.882308 26228 master.cpp:288] Master > 20140625-160758-16842879-47563-26203 (lucid) started on 127.0.1.1:47563 > I0625 16:07:58.882339 26228 master.cpp:325] Master only allowing > authenticated frameworks to register > I0625 16:07:58.882345 26228 master.cpp:330] Master only allowing > authenticated slaves to register > I0625 16:07:58.882351 26228 credentials.hpp:35] Loading credentials for > authentication from '/tmp/MasterTest_OrphanTasks_s8r6NU/credentials' > I0625 16:07:58.882427 26228 master.cpp:356] Authorization enabled > I0625 16:07:58.883036 26228 hierarchical_allocator_process.hpp:301] > Initializing hierarchical allocator process with master : > master@127.0.1.1:47563 > I0625 16:07:58.883080 26228 master.cpp:122] No whitelist given. Advertising > offers for all slaves > I0625 16:07:58.883249 26228 master.cpp:1122] The newly elected leader is > master@127.0.1.1:47563 with id 20140625-160758-16842879-47563-26203 > I0625 16:07:58.883262 26228 master.cpp:1135] Elected as the leading master! > I0625 16:07:58.883268 26228 master.cpp:953] Recovering from registrar > I0625 16:07:58.883323 26228 registrar.cpp:313] Recovering registrar > I0625 16:07:58.895361 26229 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 16.126207ms > I0625 16:07:58.895427 26229 replica.cpp:320] Persisted replica status to > STARTING > I0625 16:07:58.895555 26229 recover.cpp:451] Replica is in STARTING status > I0625 16:07:58.895905 26229 replica.cpp:638] Replica in STARTING status > received a broadcasted recover request > I0625 16:07:58.895972 26229 recover.cpp:188] Received a recover response from > a replica in STARTING status > I0625 16:07:58.896081 26229 recover.cpp:542] Updating replica status to VOTING > I0625 16:07:58.914243 26229 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 18.082424ms > I0625 16:07:58.914566 26229 replica.cpp:320] Persisted replica status to > VOTING > I0625 16:07:58.914927 26229 recover.cpp:556] Successfully joined the Paxos > group > I0625 16:07:58.915011 26229 recover.cpp:440] Recover process terminated > I0625 16:07:58.915163 26229 log.cpp:656] Attempting to start the writer > I0625 16:07:58.915468 26229 replica.cpp:474] Replica received implicit > promise request with proposal 1 > I0625 16:07:58.935538 26229 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 20.034561ms > I0625 16:07:58.936040 26229 replica.cpp:342] Persisted promised to 1 > I0625 16:07:58.947181 26230 coordinator.cpp:230] Coordinator attemping to > fill missing position > I0625 16:07:58.947559 26230 replica.cpp:375] Replica received explicit > promise request for position 0 with proposal 2 > I0625 16:07:58.962541 26230 leveldb.cpp:343] Persisting action (8 bytes) to > leveldb took 14.946528ms > I0625 16:07:58.962818 26230 replica.cpp:676] Persisted action at 0 > I0625 16:07:58.963426 26230 replica.cpp:508] Replica received write request > for position 0 > I0625 16:07:58.963462 26230 leveldb.cpp:438] Reading position from leveldb > took 19380ns > I0625 16:07:58.982928 26230 leveldb.cpp:343] Persisting action (14 bytes) to > leveldb took 19.433595ms > I0625 16:07:58.983219 26230 replica.cpp:676] Persisted action at 0 > I0625 16:07:58.987048 26223 replica.cpp:655] Replica received learned notice > for position 0 > I0625 16:07:59.000260 26223 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 13.156945ms > I0625 16:07:59.000320 26223 replica.cpp:676] Persisted action at 0 > I0625 16:07:59.000331 26223 replica.cpp:661] Replica learned NOP action at > position 0 > I0625 16:07:59.000578 26223 log.cpp:672] Writer started with ending position 0 > I0625 16:07:59.000834 26223 leveldb.cpp:438] Reading position from leveldb > took 22576ns > I0625 16:07:59.002019 26223 registrar.cpp:346] Successfully fetched the > registry (0B) > I0625 16:07:59.002043 26223 registrar.cpp:422] Attempting to update the > 'registry' > I0625 16:07:59.003206 26223 log.cpp:680] Attempting to append 118 bytes to > the log > I0625 16:07:59.003253 26223 coordinator.cpp:340] Coordinator attempting to > write APPEND action at position 1 > I0625 16:07:59.003466 26223 replica.cpp:508] Replica received write request > for position 1 > I0625 16:07:59.018249 26223 leveldb.cpp:343] Persisting action (135 bytes) to > leveldb took 14.747648ms > I0625 16:07:59.018712 26223 replica.cpp:676] Persisted action at 1 > I0625 16:07:59.027173 26227 replica.cpp:655] Replica received learned notice > for position 1 > I0625 16:07:59.041352 26227 leveldb.cpp:343] Persisting action (137 bytes) to > leveldb took 14.12408ms > I0625 16:07:59.041412 26227 replica.cpp:676] Persisted action at 1 > I0625 16:07:59.041424 26227 replica.cpp:661] Replica learned APPEND action at > position 1 > I0625 16:07:59.041815 26227 registrar.cpp:479] Successfully updated 'registry' > I0625 16:07:59.041858 26227 registrar.cpp:372] Successfully recovered > registrar > I0625 16:07:59.041903 26227 log.cpp:699] Attempting to truncate the log to 1 > I0625 16:07:59.041981 26227 master.cpp:980] Recovered 0 slaves from the > Registry (82B) ; allowing 10mins for slaves to re-register > I0625 16:07:59.042021 26227 coordinator.cpp:340] Coordinator attempting to > write TRUNCATE action at position 2 > I0625 16:07:59.042628 26226 replica.cpp:508] Replica received write request > for position 2 > I0625 16:07:59.046061 26228 slave.cpp:168] Slave started on > 117)@127.0.1.1:47563 > I0625 16:07:59.046083 26228 credentials.hpp:35] Loading credentials for > authentication from '/tmp/MasterTest_OrphanTasks_R5G2iM/credential' > I0625 16:07:59.046162 26228 slave.cpp:268] Slave using credential for: > test-principal > I0625 16:07:59.046252 26228 slave.cpp:281] Slave resources: cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I0625 16:07:59.046324 26228 slave.cpp:309] Slave hostname: lucid > I0625 16:07:59.046334 26228 slave.cpp:310] Slave checkpoint: false > I0625 16:07:59.046794 26228 state.cpp:33] Recovering state from > '/tmp/MasterTest_OrphanTasks_R5G2iM/meta' > I0625 16:07:59.046892 26228 status_update_manager.cpp:193] Recovering status > update manager > I0625 16:07:59.046967 26228 slave.cpp:3111] Finished recovery > I0625 16:07:59.047200 26228 slave.cpp:584] New master detected at > master@127.0.1.1:47563 > I0625 16:07:59.047224 26228 slave.cpp:660] Authenticating with master > master@127.0.1.1:47563 > I0625 16:07:59.047263 26228 slave.cpp:633] Detecting new master > I0625 16:07:59.047297 26228 status_update_manager.cpp:167] New master > detected at master@127.0.1.1:47563 > I0625 16:07:59.047333 26228 authenticatee.hpp:128] Creating new client SASL > connection > I0625 16:07:59.047412 26228 master.cpp:3499] Authenticating > slave(117)@127.0.1.1:47563 > I0625 16:07:59.047493 26228 authenticator.hpp:156] Creating new server SASL > connection > I0625 16:07:59.047550 26228 authenticatee.hpp:219] Received SASL > authentication mechanisms: CRAM-MD5 > I0625 16:07:59.047564 26228 authenticatee.hpp:245] Attempting to authenticate > with mechanism 'CRAM-MD5' > I0625 16:07:59.047585 26228 authenticator.hpp:262] Received SASL > authentication start > I0625 16:07:59.047621 26228 authenticator.hpp:384] Authentication requires > more steps > I0625 16:07:59.047646 26228 authenticatee.hpp:265] Received SASL > authentication step > I0625 16:07:59.047678 26228 authenticator.hpp:290] Received SASL > authentication step > I0625 16:07:59.047693 26228 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'lucid' server FQDN: 'lucid' > SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false > I0625 16:07:59.047699 26228 auxprop.cpp:153] Looking up auxiliary property > '*userPassword' > I0625 16:07:59.047708 26228 auxprop.cpp:153] Looking up auxiliary property > '*cmusaslsecretCRAM-MD5' > I0625 16:07:59.047716 26228 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'lucid' server FQDN: 'lucid' > SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true > I0625 16:07:59.047721 26228 auxprop.cpp:103] Skipping auxiliary property > '*userPassword' since SASL_AUXPROP_AUTHZID == true > I0625 16:07:59.047726 26228 auxprop.cpp:103] Skipping auxiliary property > '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true > I0625 16:07:59.047736 26228 authenticator.hpp:376] Authentication success > I0625 16:07:59.047760 26228 authenticatee.hpp:305] Authentication success > I0625 16:07:59.047780 26228 master.cpp:3539] Successfully authenticated > principal 'test-principal' at slave(117)@127.0.1.1:47563 > I0625 16:07:59.047847 26228 slave.cpp:717] Successfully authenticated with > master master@127.0.1.1:47563 > I0625 16:07:59.047886 26228 slave.cpp:955] Will retry registration in > 11.156008ms if necessary > I0625 16:07:59.047976 26223 master.cpp:2781] Registering slave at > slave(117)@127.0.1.1:47563 (lucid) with id > 20140625-160758-16842879-47563-26203-0 > I0625 16:07:59.048074 26223 registrar.cpp:422] Attempting to update the > 'registry' > I0625 16:07:59.051329 26203 sched.cpp:139] Version: 0.20.0 > I0625 16:07:59.051686 26225 sched.cpp:235] New master detected at > master@127.0.1.1:47563 > I0625 16:07:59.051718 26225 sched.cpp:285] Authenticating with master > master@127.0.1.1:47563 > I0625 16:07:59.051816 26225 authenticatee.hpp:128] Creating new client SASL > connection > I0625 16:07:59.051900 26225 master.cpp:3499] Authenticating > scheduler-949b4db2-5d34-4876-bff8-a706ff8e4c72@127.0.1.1:47563 > I0625 16:07:59.051978 26225 authenticator.hpp:156] Creating new server SASL > connection > I0625 16:07:59.052034 26225 authenticatee.hpp:219] Received SASL > authentication mechanisms: CRAM-MD5 > I0625 16:07:59.052047 26225 authenticatee.hpp:245] Attempting to authenticate > with mechanism 'CRAM-MD5' > I0625 16:07:59.052067 26225 authenticator.hpp:262] Received SASL > authentication start > I0625 16:07:59.052099 26225 authenticator.hpp:384] Authentication requires > more steps > I0625 16:07:59.052124 26225 authenticatee.hpp:265] Received SASL > authentication step > I0625 16:07:59.052155 26225 authenticator.hpp:290] Received SASL > authentication step > I0625 16:07:59.052168 26225 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'lucid' server FQDN: 'lucid' > SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false > I0625 16:07:59.052175 26225 auxprop.cpp:153] Looking up auxiliary property > '*userPassword' > I0625 16:07:59.052183 26225 auxprop.cpp:153] Looking up auxiliary property > '*cmusaslsecretCRAM-MD5' > I0625 16:07:59.052191 26225 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'lucid' server FQDN: 'lucid' > SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true > I0625 16:07:59.052196 26225 auxprop.cpp:103] Skipping auxiliary property > '*userPassword' since SASL_AUXPROP_AUTHZID == true > I0625 16:07:59.052201 26225 auxprop.cpp:103] Skipping auxiliary property > '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true > I0625 16:07:59.052211 26225 authenticator.hpp:376] Authentication success > I0625 16:07:59.052234 26225 authenticatee.hpp:305] Authentication success > I0625 16:07:59.052255 26225 master.cpp:3539] Successfully authenticated > principal 'test-principal' at > scheduler-949b4db2-5d34-4876-bff8-a706ff8e4c72@127.0.1.1:47563 > I0625 16:07:59.052322 26225 sched.cpp:359] Successfully authenticated with > master master@127.0.1.1:47563 > I0625 16:07:59.052335 26225 sched.cpp:478] Sending registration request to > master@127.0.1.1:47563 > I0625 16:07:59.052376 26225 master.cpp:1241] Received registration request > from scheduler-949b4db2-5d34-4876-bff8-a706ff8e4c72@127.0.1.1:47563 > I0625 16:07:59.052393 26225 master.cpp:1201] Authorizing framework principal > 'test-principal' to receive offers for role '*' > I0625 16:07:59.052490 26225 master.cpp:1300] Registering framework > 20140625-160758-16842879-47563-26203-0000 at > scheduler-949b4db2-5d34-4876-bff8-a706ff8e4c72@127.0.1.1:47563 > I0625 16:07:59.052580 26225 sched.cpp:409] Framework registered with > 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.052602 26225 sched.cpp:423] Scheduler::registered took 8220ns > I0625 16:07:59.052634 26225 hierarchical_allocator_process.hpp:331] Added > framework 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.052642 26225 hierarchical_allocator_process.hpp:724] No > resources available to allocate! > I0625 16:07:59.052647 26225 hierarchical_allocator_process.hpp:686] Performed > allocation for 0 slaves in 5845ns > I0625 16:07:59.061020 26226 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 18.344559ms > I0625 16:07:59.061079 26226 replica.cpp:676] Persisted action at 2 > I0625 16:07:59.061353 26226 replica.cpp:655] Replica received learned notice > for position 2 > I0625 16:07:59.066944 26223 slave.cpp:955] Will retry registration in > 32.268831ms if necessary > I0625 16:07:59.067044 26223 master.cpp:2769] Ignoring register slave message > from slave(117)@127.0.1.1:47563 (lucid) as admission is already in progress > I0625 16:07:59.077322 26226 leveldb.cpp:343] Persisting action (18 bytes) to > leveldb took 15.931404ms > I0625 16:07:59.077587 26226 leveldb.cpp:401] Deleting ~1 keys from leveldb > took 51614ns > I0625 16:07:59.077605 26226 replica.cpp:676] Persisted action at 2 > I0625 16:07:59.077615 26226 replica.cpp:661] Replica learned TRUNCATE action > at position 2 > I0625 16:07:59.078006 26226 log.cpp:680] Attempting to append 289 bytes to > the log > I0625 16:07:59.078053 26226 coordinator.cpp:340] Coordinator attempting to > write APPEND action at position 3 > I0625 16:07:59.078299 26226 replica.cpp:508] Replica received write request > for position 3 > I0625 16:07:59.094300 26226 leveldb.cpp:343] Persisting action (308 bytes) to > leveldb took 15.951731ms > I0625 16:07:59.094590 26226 replica.cpp:676] Persisted action at 3 > I0625 16:07:59.107091 26227 replica.cpp:655] Replica received learned notice > for position 3 > I0625 16:07:59.107347 26224 slave.cpp:955] Will retry registration in > 75.040916ms if necessary > I0625 16:07:59.107457 26224 master.cpp:2769] Ignoring register slave message > from slave(117)@127.0.1.1:47563 (lucid) as admission is already in progress > I0625 16:07:59.118427 26227 leveldb.cpp:343] Persisting action (310 bytes) to > leveldb took 11.273294ms > I0625 16:07:59.118489 26227 replica.cpp:676] Persisted action at 3 > I0625 16:07:59.118499 26227 replica.cpp:661] Replica learned APPEND action at > position 3 > I0625 16:07:59.118921 26227 registrar.cpp:479] Successfully updated 'registry' > I0625 16:07:59.119007 26227 log.cpp:699] Attempting to truncate the log to 3 > I0625 16:07:59.119071 26227 master.cpp:2821] Registered slave > 20140625-160758-16842879-47563-26203-0 at slave(117)@127.0.1.1:47563 (lucid) > I0625 16:07:59.119088 26227 master.cpp:3967] Adding slave > 20140625-160758-16842879-47563-26203-0 at slave(117)@127.0.1.1:47563 (lucid) > with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I0625 16:07:59.119207 26227 coordinator.cpp:340] Coordinator attempting to > write TRUNCATE action at position 4 > I0625 16:07:59.119288 26227 slave.cpp:751] Registered with master > master@127.0.1.1:47563; given slave ID 20140625-160758-16842879-47563-26203-0 > I0625 16:07:59.119355 26227 hierarchical_allocator_process.hpp:444] Added > slave 20140625-160758-16842879-47563-26203-0 (lucid) with cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] available) > I0625 16:07:59.119410 26227 hierarchical_allocator_process.hpp:750] Offering > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave > 20140625-160758-16842879-47563-26203-0 to framework > 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.119496 26227 hierarchical_allocator_process.hpp:706] Performed > allocation for slave 20140625-160758-16842879-47563-26203-0 in 110886ns > I0625 16:07:59.119549 26227 slave.cpp:2308] Received ping from > slave-observer(97)@127.0.1.1:47563 > I0625 16:07:59.119604 26227 master.hpp:794] Adding offer > 20140625-160758-16842879-47563-26203-0 with resources cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] on slave > 20140625-160758-16842879-47563-26203-0 (lucid) > I0625 16:07:59.119644 26227 master.cpp:3446] Sending 1 offers to framework > 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.119900 26227 sched.cpp:546] Scheduler::resourceOffers took > 125487ns > I0625 16:07:59.120072 26227 master.hpp:804] Removing offer > 20140625-160758-16842879-47563-26203-0 with resources cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] on slave > 20140625-160758-16842879-47563-26203-0 (lucid) > I0625 16:07:59.120121 26227 master.cpp:2125] Processing reply for offers: [ > 20140625-160758-16842879-47563-26203-0 ] on slave > 20140625-160758-16842879-47563-26203-0 at slave(117)@127.0.1.1:47563 (lucid) > for framework 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.120141 26227 master.cpp:2211] Authorizing framework principal > 'test-principal' to launch task 0 as user 'jenkins' > I0625 16:07:59.120472 26227 master.hpp:766] Adding task 0 with resources > cpus(*):1; mem(*):64 on slave 20140625-160758-16842879-47563-26203-0 (lucid) > I0625 16:07:59.120501 26227 master.cpp:2277] Launching task 0 of framework > 20140625-160758-16842879-47563-26203-0000 with resources cpus(*):1; mem(*):64 > on slave 20140625-160758-16842879-47563-26203-0 at slave(117)@127.0.1.1:47563 > (lucid) > I0625 16:07:59.120621 26227 replica.cpp:508] Replica received write request > for position 4 > I0625 16:07:59.121064 26224 slave.cpp:986] Got assigned task 0 for framework > 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.121354 26224 slave.cpp:1096] Launching task 0 for framework > 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.124053 26224 exec.cpp:131] Version: 0.20.0 > I0625 16:07:59.124178 26224 slave.cpp:1206] Queuing task '0' for executor > default of framework '20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.124223 26224 slave.cpp:2453] Monitoring executor 'default' of > framework '20140625-160758-16842879-47563-26203-0000' in container > '42b56b06-5a72-41bb-8137-67926c47c9e9' > I0625 16:07:59.124274 26224 hierarchical_allocator_process.hpp:546] Framework > 20140625-160758-16842879-47563-26203-0000 left cpus(*):1; mem(*):960; > disk(*):1024; ports(*):[31000-32000] unused on slave > 20140625-160758-16842879-47563-26203-0 > I0625 16:07:59.124358 26224 hierarchical_allocator_process.hpp:588] Framework > 20140625-160758-16842879-47563-26203-0000 filtered slave > 20140625-160758-16842879-47563-26203-0 for 5secs > I0625 16:07:59.124452 26224 exec.cpp:181] Executor started at: > executor(36)@127.0.1.1:47563 with pid 26203 > I0625 16:07:59.124528 26224 slave.cpp:545] Successfully attached file > '/tmp/MasterTest_OrphanTasks_R5G2iM/slaves/20140625-160758-16842879-47563-26203-0/frameworks/20140625-160758-16842879-47563-26203-0000/executors/default/runs/42b56b06-5a72-41bb-8137-67926c47c9e9' > I0625 16:07:59.124557 26224 slave.cpp:1717] Got registration for executor > 'default' of framework 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.124634 26224 slave.cpp:1836] Flushing queued task 0 for > executor 'default' of framework 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.124703 26224 exec.cpp:205] Executor registered on slave > 20140625-160758-16842879-47563-26203-0 > I0625 16:07:59.125650 26224 exec.cpp:217] Executor::registered took 15164ns > I0625 16:07:59.125702 26224 exec.cpp:292] Executor asked to run task '0' > I0625 16:07:59.125732 26224 exec.cpp:301] Executor::launchTask took 21us > I0625 16:07:59.127326 26224 exec.cpp:524] Executor sending status update > TASK_RUNNING (UUID: 77f58dcb-2ae3-45d2-a071-2c808614ab3a) for task 0 of > framework 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.127782 26223 slave.cpp:2071] Handling status update > TASK_RUNNING (UUID: 77f58dcb-2ae3-45d2-a071-2c808614ab3a) for task 0 of > framework 20140625-160758-16842879-47563-26203-0000 from > executor(36)@127.0.1.1:47563 > I0625 16:07:59.127879 26223 status_update_manager.cpp:320] Received status > update TASK_RUNNING (UUID: 77f58dcb-2ae3-45d2-a071-2c808614ab3a) for task 0 > of framework 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.127893 26223 status_update_manager.cpp:499] Creating > StatusUpdate stream for task 0 of framework > 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.127962 26223 status_update_manager.cpp:373] Forwarding status > update TASK_RUNNING (UUID: 77f58dcb-2ae3-45d2-a071-2c808614ab3a) for task 0 > of framework 20140625-160758-16842879-47563-26203-0000 to > master@127.0.1.1:47563 > I0625 16:07:59.128090 26223 master.cpp:3107] Status update TASK_RUNNING > (UUID: 77f58dcb-2ae3-45d2-a071-2c808614ab3a) for task 0 of framework > 20140625-160758-16842879-47563-26203-0000 from slave > 20140625-160758-16842879-47563-26203-0 at slave(117)@127.0.1.1:47563 (lucid) > I0625 16:07:59.128124 26223 slave.cpp:2229] Status update manager > successfully handled status update TASK_RUNNING (UUID: > 77f58dcb-2ae3-45d2-a071-2c808614ab3a) for task 0 of framework > 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.128134 26223 slave.cpp:2235] Sending acknowledgement for > status update TASK_RUNNING (UUID: 77f58dcb-2ae3-45d2-a071-2c808614ab3a) for > task 0 of framework 20140625-160758-16842879-47563-26203-0000 to > executor(36)@127.0.1.1:47563 > I0625 16:07:59.128206 26223 sched.cpp:637] Scheduler::statusUpdate took > 18758ns > I0625 16:07:59.128267 26223 master.cpp:2631] Forwarding status update > acknowledgement 77f58dcb-2ae3-45d2-a071-2c808614ab3a for task 0 of framework > 20140625-160758-16842879-47563-26203-0000 to slave > 20140625-160758-16842879-47563-26203-0 at slave(117)@127.0.1.1:47563 (lucid) > I0625 16:07:59.128363 26223 status_update_manager.cpp:398] Received status > update acknowledgement (UUID: 77f58dcb-2ae3-45d2-a071-2c808614ab3a) for task > 0 of framework 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.128412 26223 slave.cpp:1657] Status update manager > successfully handled status update acknowledgement (UUID: > 77f58dcb-2ae3-45d2-a071-2c808614ab3a) for task 0 of framework > 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.129055 26226 process.cpp:3335] Handling HTTP event for process > 'master' with path: '/master/state.json' > I0625 16:07:59.129106 26226 http.cpp:452] HTTP request for > '/master/state.json' > I0625 16:07:59.129865 26231 process.cpp:1037] Socket closed while receiving > I0625 16:07:59.130645 26203 master.cpp:619] Master terminating > I0625 16:07:59.130695 26203 master.hpp:784] Removing task 0 with resources > cpus(*):1; mem(*):64 on slave 20140625-160758-16842879-47563-26203-0 (lucid) > W0625 16:07:59.130738 26203 master.cpp:4253] Removing task 0 of framework > 20140625-160758-16842879-47563-26203-0000 and slave > 20140625-160758-16842879-47563-26203-0 in non-terminal state TASK_RUNNING > I0625 16:07:59.131604 26226 slave.cpp:2315] master@127.0.1.1:47563 exited > W0625 16:07:59.131628 26226 slave.cpp:2318] Master disconnected! Waiting for > a new master to be elected > I0625 16:07:59.132665 26224 exec.cpp:338] Executor received status update > acknowledgement 77f58dcb-2ae3-45d2-a071-2c808614ab3a for task 0 of framework > 20140625-160758-16842879-47563-26203-0000 > I0625 16:07:59.138859 26227 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 18.191907ms > I0625 16:07:59.138913 26227 replica.cpp:676] Persisted action at 4 > I0625 16:07:59.190522 26203 leveldb.cpp:176] Opened db in 42.905633ms > I0625 16:07:59.229253 26203 leveldb.cpp:183] Compacted db in 38.417135ms > I0625 16:07:59.229497 26203 leveldb.cpp:198] Created db iterator in 9638ns > I0625 16:07:59.229696 26203 leveldb.cpp:204] Seeked to beginning of db in > 19529ns > I0625 16:07:59.229907 26203 leveldb.cpp:273] Iterated through 5 keys in the > db in 46457ns > I0625 16:07:59.230082 26203 replica.cpp:741] Replica recovered with log > positions 1 -> 4 with 0 holes and 1 unlearned > I0625 16:07:59.230548 26226 recover.cpp:425] Starting replica recovery > I0625 16:07:59.230633 26226 recover.cpp:451] Replica is in VOTING status > I0625 16:07:59.230676 26226 recover.cpp:440] Recover process terminated > I0625 16:07:59.234074 26224 master.cpp:288] Master > 20140625-160759-16842879-47563-26203 (lucid) started on 127.0.1.1:47563 > I0625 16:07:59.234104 26224 master.cpp:325] Master only allowing > authenticated frameworks to register > I0625 16:07:59.234110 26224 master.cpp:330] Master only allowing > authenticated slaves to register > I0625 16:07:59.234118 26224 credentials.hpp:35] Loading credentials for > authentication from '/tmp/MasterTest_OrphanTasks_s8r6NU/credentials' > I0625 16:07:59.234195 26224 master.cpp:356] Authorization enabled > I0625 16:07:59.234711 26224 hierarchical_allocator_process.hpp:301] > Initializing hierarchical allocator process with master : > master@127.0.1.1:47563 > I0625 16:07:59.234755 26224 master.cpp:122] No whitelist given. Advertising > offers for all slaves > I0625 16:07:59.234928 26224 master.cpp:1122] The newly elected leader is > master@127.0.1.1:47563 with id 20140625-160759-16842879-47563-26203 > I0625 16:07:59.234941 26224 master.cpp:1135] Elected as the leading master! > I0625 16:07:59.234948 26224 master.cpp:953] Recovering from registrar > I0625 16:07:59.235004 26224 registrar.cpp:313] Recovering registrar > I0625 16:07:59.235229 26224 log.cpp:656] Attempting to start the writer > I0625 16:07:59.235543 26224 replica.cpp:474] Replica received implicit > promise request with proposal 2 > I0625 16:07:59.246944 26224 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 11.362596ms > I0625 16:07:59.247259 26224 replica.cpp:342] Persisted promised to 2 > I0625 16:07:59.257194 26230 coordinator.cpp:230] Coordinator attemping to > fill missing position > I0625 16:07:59.257627 26230 replica.cpp:375] Replica received explicit > promise request for position 4 with proposal 3 > I0625 16:07:59.257693 26230 leveldb.cpp:438] Reading position from leveldb > took 45817ns > I0625 16:07:59.270833 26230 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 13.105094ms > I0625 16:07:59.271126 26230 replica.cpp:676] Persisted action at 4 > I0625 16:07:59.271867 26230 replica.cpp:508] Replica received write request > for position 4 > I0625 16:07:59.272071 26230 leveldb.cpp:438] Reading position from leveldb > took 24316ns > I0625 16:07:59.283974 26230 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 11.714283ms > I0625 16:07:59.284390 26230 replica.cpp:676] Persisted action at 4 > I0625 16:07:59.287039 26226 replica.cpp:655] Replica received learned notice > for position 4 > I0625 16:07:59.297823 26226 leveldb.cpp:343] Persisting action (18 bytes) to > leveldb took 10.735977ms > I0625 16:07:59.297930 26226 leveldb.cpp:401] Deleting ~2 keys from leveldb > took 49963ns > I0625 16:07:59.297947 26226 replica.cpp:676] Persisted action at 4 > I0625 16:07:59.297956 26226 replica.cpp:661] Replica learned TRUNCATE action > at position 4 > I0625 16:07:59.298205 26226 log.cpp:672] Writer started with ending position 4 > I0625 16:07:59.298506 26226 leveldb.cpp:438] Reading position from leveldb > took 37502ns > I0625 16:07:59.298533 26226 leveldb.cpp:438] Reading position from leveldb > took 6627ns > I0625 16:07:59.298846 26226 registrar.cpp:346] Successfully fetched the > registry (250B) > I0625 16:07:59.298873 26226 registrar.cpp:422] Attempting to update the > 'registry' > I0625 16:07:59.300153 26226 log.cpp:680] Attempting to append 289 bytes to > the log > I0625 16:07:59.300201 26226 coordinator.cpp:340] Coordinator attempting to > write APPEND action at position 5 > I0625 16:07:59.300427 26226 replica.cpp:508] Replica received write request > for position 5 > I0625 16:07:59.313813 26226 leveldb.cpp:343] Persisting action (308 bytes) to > leveldb took 13.351282ms > I0625 16:07:59.314107 26226 replica.cpp:676] Persisted action at 5 > I0625 16:07:59.327086 26224 replica.cpp:655] Replica received learned notice > for position 5 > I0625 16:07:59.338047 26224 leveldb.cpp:343] Persisting action (310 bytes) to > leveldb took 10.688745ms > I0625 16:07:59.338348 26224 replica.cpp:676] Persisted action at 5 > I0625 16:07:59.338714 26224 replica.cpp:661] Replica learned APPEND action at > position 5 > I0625 16:07:59.339380 26224 registrar.cpp:479] Successfully updated 'registry' > I0625 16:07:59.339436 26223 log.cpp:699] Attempting to truncate the log to 5 > I0625 16:07:59.339745 26223 coordinator.cpp:340] Coordinator attempting to > write TRUNCATE action at position 6 > I0625 16:07:59.339993 26223 replica.cpp:508] Replica received write request > for position 6 > I0625 16:07:59.340585 26224 registrar.cpp:372] Successfully recovered > registrar > I0625 16:07:59.341030 26228 master.cpp:980] Recovered 1 slaves from the > Registry (250B) ; allowing 10mins for slaves to re-register > I0625 16:07:59.342396 26225 slave.cpp:584] New master detected at > master@127.0.1.1:47563 > I0625 16:07:59.342427 26225 slave.cpp:660] Authenticating with master > master@127.0.1.1:47563 > I0625 16:07:59.342473 26225 slave.cpp:633] Detecting new master > I0625 16:07:59.342536 26225 sched.cpp:229] Scheduler::disconnected took 8596ns > I0625 16:07:59.342546 26225 sched.cpp:235] New master detected at > master@127.0.1.1:47563 > I0625 16:07:59.342557 26225 sched.cpp:285] Authenticating with master > master@127.0.1.1:47563 > I0625 16:07:59.342622 26225 status_update_manager.cpp:167] New master > detected at master@127.0.1.1:47563 > I0625 16:07:59.342663 26225 authenticatee.hpp:128] Creating new client SASL > connection > I0625 16:07:59.342738 26225 authenticatee.hpp:128] Creating new client SASL > connection > I0625 16:07:59.342778 26225 master.cpp:843] Dropping > 'mesos.internal.AuthenticateMessage' message since not recovered yet > I0625 16:07:59.342794 26225 master.cpp:843] Dropping > 'mesos.internal.AuthenticateMessage' message since not recovered yet > I0625 16:07:59.353075 26223 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 13.023984ms > I0625 16:07:59.353137 26223 replica.cpp:676] Persisted action at 6 > I0625 16:07:59.353423 26223 replica.cpp:655] Replica received learned notice > for position 6 > I0625 16:07:59.364477 26223 leveldb.cpp:343] Persisting action (18 bytes) to > leveldb took 11.017392ms > I0625 16:07:59.364581 26223 leveldb.cpp:401] Deleting ~2 keys from leveldb > took 48053ns > I0625 16:07:59.364598 26223 replica.cpp:676] Persisted action at 6 > I0625 16:07:59.364608 26223 replica.cpp:661] Replica learned TRUNCATE action > at position 6 > 2014-06-25 > 16:08:00,021:26203(0x2b44fb781700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:53090] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > 2014-06-25 > 16:08:03,358:26203(0x2b44fb781700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:53090] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > 2014-06-25 > 16:08:06,695:26203(0x2b44fb781700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:53090] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > tests/master_tests.cpp:1936: Failure > Failed to wait 10secs for slaveReregisteredMessage > I0625 16:08:09.357589 26203 master.cpp:619] Master terminating > F0625 16:08:09.357723 26226 slave.cpp:571] Check failed: > !_master.isDiscarded() > *** Check failure stack trace: *** > @ 0x2b44e959da2d google::LogMessage::Fail() > @ 0x2b44e95a292f google::LogMessage::SendToLog() > @ 0x2b44e959f90a google::LogMessage::Flush() > @ 0x2b44e959fd1d google::LogMessageFatal::~LogMessageFatal() > @ 0x2b44e90d82f0 mesos::internal::slave::Slave::detected() > @ 0x4dca91 std::tr1::_Function_handler<>::_M_invoke() > @ 0x2b44e94b42da process::ProcessManager::resume() > @ 0x2b44e94b4bec process::schedule() > @ 0x2b44eb4ba9ca start_thread > @ 0x2b44eb7bb1cd (unknown) > make[3]: *** [check-local] Aborted > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)