[ https://issues.apache.org/jira/browse/MESOS-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Lambert updated MESOS-1437: --------------------------------- Sprint: Q2'14 Sprint 3 > SlaveRecoveryTest/0.RestartBeforeContainerizerLaunch is flaky > ------------------------------------------------------------- > > Key: MESOS-1437 > URL: https://issues.apache.org/jira/browse/MESOS-1437 > Project: Mesos > Issue Type: Bug > Components: test > Affects Versions: 0.19.0 > Environment: ubuntu-14.04-gcc > Reporter: Yan Xu > > {noformat} > [ RUN ] SlaveRecoveryTest/0.RestartBeforeContainerizerLaunch > Using temporary directory > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_lUbOAD' > I0529 09:29:24.491727 18175 leveldb.cpp:176] Opened db in 3.742412ms > I0529 09:29:24.493413 18175 leveldb.cpp:183] Compacted db in 1.183942ms > I0529 09:29:24.493865 18175 leveldb.cpp:198] Created db iterator in 6242ns > I0529 09:29:24.494138 18175 leveldb.cpp:204] Seeked to beginning of db in > 1213ns > I0529 09:29:24.494526 18175 leveldb.cpp:273] Iterated through 0 keys in the > db in 638ns > I0529 09:29:24.494839 18175 replica.cpp:741] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0529 09:29:24.495407 18197 recover.cpp:425] Starting replica recovery > I0529 09:29:24.495491 18197 recover.cpp:451] Replica is in EMPTY status > I0529 09:29:24.495844 18197 replica.cpp:638] Replica in EMPTY status received > a broadcasted recover request > I0529 09:29:24.495916 18197 recover.cpp:188] Received a recover response from > a replica in EMPTY status > I0529 09:29:24.496022 18197 recover.cpp:542] Updating replica status to > STARTING > I0529 09:29:24.497161 18194 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 844433ns > I0529 09:29:24.497185 18194 replica.cpp:320] Persisted replica status to > STARTING > I0529 09:29:24.497251 18194 recover.cpp:451] Replica is in STARTING status > I0529 09:29:24.497478 18194 replica.cpp:638] Replica in STARTING status > received a broadcasted recover request > I0529 09:29:24.497534 18194 recover.cpp:188] Received a recover response from > a replica in STARTING status > I0529 09:29:24.497622 18194 recover.cpp:542] Updating replica status to VOTING > I0529 09:29:24.498047 18194 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 376402ns > I0529 09:29:24.498067 18194 replica.cpp:320] Persisted replica status to > VOTING > I0529 09:29:24.498108 18194 recover.cpp:556] Successfully joined the Paxos > group > I0529 09:29:24.498159 18194 recover.cpp:440] Recover process terminated > I0529 09:29:24.503008 18196 master.cpp:272] Master > 20140529-092924-16842879-48472-18175 (trusty) started on 127.0.1.1:48472 > I0529 09:29:24.503432 18196 master.cpp:309] Master only allowing > authenticated frameworks to register > I0529 09:29:24.503765 18196 master.cpp:314] Master only allowing > authenticated slaves to register > I0529 09:29:24.504178 18196 credentials.hpp:35] Loading credentials for > authentication from > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_lUbOAD/credentials' > I0529 09:29:24.504432 18196 master.cpp:340] Master enabling authorization > I0529 09:29:24.504879 18200 hierarchical_allocator_process.hpp:301] > Initializing hierarchical allocator process with master : > master@127.0.1.1:48472 > I0529 09:29:24.505163 18198 master.cpp:108] No whitelist given. Advertising > offers for all slaves > I0529 09:29:24.506216 18196 master.cpp:957] The newly elected leader is > master@127.0.1.1:48472 with id 20140529-092924-16842879-48472-18175 > I0529 09:29:24.506657 18196 master.cpp:970] Elected as the leading master! > I0529 09:29:24.506963 18196 master.cpp:788] Recovering from registrar > I0529 09:29:24.507419 18201 registrar.cpp:313] Recovering registrar > I0529 09:29:24.507659 18194 log.cpp:656] Attempting to start the writer > I0529 09:29:24.507969 18194 replica.cpp:474] Replica received implicit > promise request with proposal 1 > I0529 09:29:24.508533 18194 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 540486ns > I0529 09:29:24.508553 18194 replica.cpp:342] Persisted promised to 1 > I0529 09:29:24.508718 18194 coordinator.cpp:230] Coordinator attemping to > fill missing position > I0529 09:29:24.509006 18194 replica.cpp:375] Replica received explicit > promise request for position 0 with proposal 2 > I0529 09:29:24.509305 18194 leveldb.cpp:343] Persisting action (8 bytes) to > leveldb took 278308ns > I0529 09:29:24.509552 18194 replica.cpp:676] Persisted action at 0 > I0529 09:29:24.510025 18194 replica.cpp:508] Replica received write request > for position 0 > I0529 09:29:24.510303 18194 leveldb.cpp:438] Reading position from leveldb > took 16831ns > I0529 09:29:24.510577 18194 leveldb.cpp:343] Persisting action (14 bytes) to > leveldb took 252132ns > I0529 09:29:24.510615 18194 replica.cpp:676] Persisted action at 0 > I0529 09:29:24.514835 18197 replica.cpp:655] Replica received learned notice > for position 0 > I0529 09:29:24.515504 18197 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 639727ns > I0529 09:29:24.515524 18197 replica.cpp:676] Persisted action at 0 > I0529 09:29:24.515534 18197 replica.cpp:661] Replica learned NOP action at > position 0 > I0529 09:29:24.515691 18197 log.cpp:672] Writer started with ending position 0 > I0529 09:29:24.516027 18198 leveldb.cpp:438] Reading position from leveldb > took 11426ns > I0529 09:29:24.517470 18198 registrar.cpp:346] Successfully fetched the > registry (0B) > I0529 09:29:24.517500 18198 registrar.cpp:422] Attempting to update the > 'registry' > I0529 09:29:24.519122 18199 log.cpp:680] Attempting to append 119 bytes to > the log > I0529 09:29:24.519177 18199 coordinator.cpp:340] Coordinator attempting to > write APPEND action at position 1 > I0529 09:29:24.519358 18199 replica.cpp:508] Replica received write request > for position 1 > I0529 09:29:24.520508 18199 leveldb.cpp:343] Persisting action (136 bytes) to > leveldb took 1.12638ms > I0529 09:29:24.520544 18199 replica.cpp:676] Persisted action at 1 > I0529 09:29:24.520712 18199 replica.cpp:655] Replica received learned notice > for position 1 > I0529 09:29:24.521198 18199 leveldb.cpp:343] Persisting action (138 bytes) to > leveldb took 464062ns > I0529 09:29:24.521219 18199 replica.cpp:676] Persisted action at 1 > I0529 09:29:24.521229 18199 replica.cpp:661] Replica learned APPEND action at > position 1 > I0529 09:29:24.521467 18199 registrar.cpp:479] Successfully updated 'registry' > I0529 09:29:24.521507 18199 registrar.cpp:372] Successfully recovered > registrar > I0529 09:29:24.521543 18199 log.cpp:699] Attempting to truncate the log to 1 > I0529 09:29:24.521605 18199 master.cpp:815] Recovered 0 slaves from the > Registry (83B) ; allowing 10mins for slaves to re-register > I0529 09:29:24.521646 18199 coordinator.cpp:340] Coordinator attempting to > write TRUNCATE action at position 2 > I0529 09:29:24.521832 18199 replica.cpp:508] Replica received write request > for position 2 > I0529 09:29:24.522260 18199 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 405954ns > I0529 09:29:24.522593 18199 replica.cpp:676] Persisted action at 2 > I0529 09:29:24.522948 18199 replica.cpp:655] Replica received learned notice > for position 2 > I0529 09:29:24.524375 18199 leveldb.cpp:343] Persisting action (18 bytes) to > leveldb took 1.142005ms > I0529 09:29:24.524786 18199 leveldb.cpp:401] Deleting ~1 keys from leveldb > took 24473ns > I0529 09:29:24.525178 18199 replica.cpp:676] Persisted action at 2 > I0529 09:29:24.525440 18199 replica.cpp:661] Replica learned TRUNCATE action > at position 2 > I0529 09:29:24.528745 18175 sched.cpp:126] Version: 0.19.0 > I0529 09:29:24.529137 18199 slave.cpp:143] Slave started on > 136)@127.0.1.1:48472 > I0529 09:29:24.529464 18199 credentials.hpp:35] Loading credentials for > authentication from > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/credential' > I0529 09:29:24.529770 18199 slave.cpp:242] Slave using credential for: > test-principal > I0529 09:29:24.529885 18199 slave.cpp:255] Slave resources: cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I0529 09:29:24.529955 18199 slave.cpp:283] Slave hostname: trusty > I0529 09:29:24.529968 18199 slave.cpp:284] Slave checkpoint: true > I0529 09:29:24.530727 18199 state.cpp:33] Recovering state from > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta' > I0529 09:29:24.530822 18199 status_update_manager.cpp:193] Recovering status > update manager > I0529 09:29:24.530894 18199 slave.cpp:3018] Finished recovery > I0529 09:29:24.531134 18199 slave.cpp:536] New master detected at > master@127.0.1.1:48472 > I0529 09:29:24.531162 18199 slave.cpp:612] Authenticating with master > master@127.0.1.1:48472 > I0529 09:29:24.531199 18199 slave.cpp:585] Detecting new master > I0529 09:29:24.531229 18199 status_update_manager.cpp:167] New master > detected at master@127.0.1.1:48472 > I0529 09:29:24.531265 18199 authenticatee.hpp:128] Creating new client SASL > connection > I0529 09:29:24.531412 18199 master.cpp:2986] Authenticating > slave(136)@127.0.1.1:48472 > I0529 09:29:24.531489 18199 authenticator.hpp:156] Creating new server SASL > connection > I0529 09:29:24.531599 18199 authenticatee.hpp:219] Received SASL > authentication mechanisms: CRAM-MD5 > I0529 09:29:24.531620 18199 authenticatee.hpp:245] Attempting to authenticate > with mechanism 'CRAM-MD5' > I0529 09:29:24.531644 18199 authenticator.hpp:262] Received SASL > authentication start > I0529 09:29:24.531680 18199 authenticator.hpp:384] Authentication requires > more steps > I0529 09:29:24.531707 18199 authenticatee.hpp:265] Received SASL > authentication step > I0529 09:29:24.531741 18199 authenticator.hpp:290] Received SASL > authentication step > I0529 09:29:24.531759 18199 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' > SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false > SASL_AUXPROP_AUTHZID: false > I0529 09:29:24.531767 18199 auxprop.cpp:153] Looking up auxiliary property > '*userPassword' > I0529 09:29:24.531780 18199 auxprop.cpp:153] Looking up auxiliary property > '*cmusaslsecretCRAM-MD5' > I0529 09:29:24.531790 18199 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' > SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false > SASL_AUXPROP_AUTHZID: true > I0529 09:29:24.531797 18199 auxprop.cpp:103] Skipping auxiliary property > '*userPassword' since SASL_AUXPROP_AUTHZID == true > I0529 09:29:24.531803 18199 auxprop.cpp:103] Skipping auxiliary property > '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true > I0529 09:29:24.531827 18199 authenticator.hpp:376] Authentication success > I0529 09:29:24.531857 18199 authenticatee.hpp:305] Authentication success > I0529 09:29:24.531882 18199 master.cpp:3026] Successfully authenticated > principal 'test-principal' at slave(136)@127.0.1.1:48472 > I0529 09:29:24.531945 18199 slave.cpp:669] Successfully authenticated with > master master@127.0.1.1:48472 > I0529 09:29:24.531986 18199 slave.cpp:902] Will retry registration in > 3.076227ms if necessary > I0529 09:29:24.532193 18196 master.cpp:2302] Registering slave at > slave(136)@127.0.1.1:48472 (trusty) with id > 20140529-092924-16842879-48472-18175-0 > I0529 09:29:24.532804 18197 registrar.cpp:422] Attempting to update the > 'registry' > I0529 09:29:24.533882 18201 log.cpp:680] Attempting to append 292 bytes to > the log > I0529 09:29:24.533936 18201 coordinator.cpp:340] Coordinator attempting to > write APPEND action at position 3 > I0529 09:29:24.534139 18201 replica.cpp:508] Replica received write request > for position 3 > I0529 09:29:24.534694 18201 leveldb.cpp:343] Persisting action (311 bytes) to > leveldb took 532639ns > I0529 09:29:24.534715 18201 replica.cpp:676] Persisted action at 3 > I0529 09:29:24.535266 18194 replica.cpp:655] Replica received learned notice > for position 3 > I0529 09:29:24.536165 18196 sched.cpp:222] New master detected at > master@127.0.1.1:48472 > I0529 09:29:24.536190 18196 sched.cpp:273] Authenticating with master > master@127.0.1.1:48472 > I0529 09:29:24.536586 18194 leveldb.cpp:343] Persisting action (313 bytes) to > leveldb took 1.293153ms > I0529 09:29:24.536613 18194 replica.cpp:676] Persisted action at 3 > I0529 09:29:24.536623 18194 replica.cpp:661] Replica learned APPEND action at > position 3 > I0529 09:29:24.536893 18194 registrar.cpp:479] Successfully updated 'registry' > I0529 09:29:24.536964 18194 log.cpp:699] Attempting to truncate the log to 3 > I0529 09:29:24.537014 18194 master.cpp:2342] Registered slave > 20140529-092924-16842879-48472-18175-0 at slave(136)@127.0.1.1:48472 (trusty) > I0529 09:29:24.537031 18194 master.cpp:3472] Adding slave > 20140529-092924-16842879-48472-18175-0 at slave(136)@127.0.1.1:48472 (trusty) > with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I0529 09:29:24.537122 18194 coordinator.cpp:340] Coordinator attempting to > write TRUNCATE action at position 4 > I0529 09:29:24.537215 18194 hierarchical_allocator_process.hpp:444] Added > slave 20140529-092924-16842879-48472-18175-0 (trusty) with cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] available) > I0529 09:29:24.537257 18194 hierarchical_allocator_process.hpp:707] Performed > allocation for slave 20140529-092924-16842879-48472-18175-0 in 9613ns > I0529 09:29:24.537389 18194 replica.cpp:508] Replica received write request > for position 4 > I0529 09:29:24.536259 18196 authenticatee.hpp:128] Creating new client SASL > connection > I0529 09:29:24.538015 18196 master.cpp:2986] Authenticating > scheduler(130)@127.0.1.1:48472 > I0529 09:29:24.538091 18196 authenticator.hpp:156] Creating new server SASL > connection > I0529 09:29:24.538194 18196 authenticatee.hpp:219] Received SASL > authentication mechanisms: CRAM-MD5 > I0529 09:29:24.538214 18196 authenticatee.hpp:245] Attempting to authenticate > with mechanism 'CRAM-MD5' > I0529 09:29:24.538238 18196 authenticator.hpp:262] Received SASL > authentication start > I0529 09:29:24.538272 18196 authenticator.hpp:384] Authentication requires > more steps > I0529 09:29:24.538297 18196 authenticatee.hpp:265] Received SASL > authentication step > I0529 09:29:24.538331 18196 authenticator.hpp:290] Received SASL > authentication step > I0529 09:29:24.538346 18196 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' > SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false > SASL_AUXPROP_AUTHZID: false > I0529 09:29:24.538354 18196 auxprop.cpp:153] Looking up auxiliary property > '*userPassword' > I0529 09:29:24.538365 18196 auxprop.cpp:153] Looking up auxiliary property > '*cmusaslsecretCRAM-MD5' > I0529 09:29:24.538374 18196 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' > SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false > SASL_AUXPROP_AUTHZID: true > I0529 09:29:24.538383 18196 auxprop.cpp:103] Skipping auxiliary property > '*userPassword' since SASL_AUXPROP_AUTHZID == true > I0529 09:29:24.538388 18196 auxprop.cpp:103] Skipping auxiliary property > '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true > I0529 09:29:24.538399 18196 authenticator.hpp:376] Authentication success > I0529 09:29:24.538424 18196 authenticatee.hpp:305] Authentication success > I0529 09:29:24.538570 18195 sched.cpp:347] Successfully authenticated with > master master@127.0.1.1:48472 > I0529 09:29:24.538591 18195 sched.cpp:466] Sending registration request to > master@127.0.1.1:48472 > I0529 09:29:24.538947 18201 slave.cpp:902] Will retry registration in > 14.852555ms if necessary > I0529 09:29:24.539002 18201 slave.cpp:703] Registered with master > master@127.0.1.1:48472; given slave ID 20140529-092924-16842879-48472-18175-0 > I0529 09:29:24.539147 18201 slave.cpp:716] Checkpointing SlaveInfo to > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/slave.info' > I0529 09:29:24.539445 18194 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 2.035571ms > I0529 09:29:24.539471 18194 replica.cpp:676] Persisted action at 4 > I0529 09:29:24.539649 18194 replica.cpp:655] Replica received learned notice > for position 4 > I0529 09:29:24.539990 18196 master.cpp:3026] Successfully authenticated > principal 'test-principal' at scheduler(130)@127.0.1.1:48472 > I0529 09:29:24.540099 18196 master.cpp:1041] Received registration request > from scheduler(130)@127.0.1.1:48472 > I0529 09:29:24.540127 18196 master.cpp:1059] Registering framework > 20140529-092924-16842879-48472-18175-0000 at scheduler(130)@127.0.1.1:48472 > I0529 09:29:24.540189 18196 master.cpp:2278] Slave > 20140529-092924-16842879-48472-18175-0 at slave(136)@127.0.1.1:48472 (trusty) > already registered, resending acknowledgement > I0529 09:29:24.540230 18196 sched.cpp:397] Framework registered with > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.540249 18196 sched.cpp:411] Scheduler::registered took 9083ns > I0529 09:29:24.540282 18196 hierarchical_allocator_process.hpp:331] Added > framework 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.540318 18196 hierarchical_allocator_process.hpp:751] Offering > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave > 20140529-092924-16842879-48472-18175-0 to framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.540398 18196 hierarchical_allocator_process.hpp:687] Performed > allocation for 1 slaves in 103378ns > W0529 09:29:24.540426 18196 slave.cpp:728] Already registered with master > master@127.0.1.1:48472 > I0529 09:29:24.540478 18196 master.hpp:683] Adding offer > 20140529-092924-16842879-48472-18175-0 with resources cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] on slave > 20140529-092924-16842879-48472-18175-0 (trusty) > I0529 09:29:24.540520 18196 master.cpp:2933] Sending 1 offers to framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.540613 18196 sched.cpp:534] Scheduler::resourceOffers took > 13742ns > I0529 09:29:24.541463 18194 leveldb.cpp:343] Persisting action (18 bytes) to > leveldb took 1.792617ms > I0529 09:29:24.541507 18194 leveldb.cpp:401] Deleting ~2 keys from leveldb > took 19301ns > I0529 09:29:24.541520 18194 replica.cpp:676] Persisted action at 4 > I0529 09:29:24.541529 18194 replica.cpp:661] Replica learned TRUNCATE action > at position 4 > I0529 09:29:24.542551 18196 master.hpp:693] Removing offer > 20140529-092924-16842879-48472-18175-0 with resources cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] on slave > 20140529-092924-16842879-48472-18175-0 (trusty) > I0529 09:29:24.542899 18196 master.cpp:1889] Processing reply for offers: [ > 20140529-092924-16842879-48472-18175-0 ] on slave > 20140529-092924-16842879-48472-18175-0 at slave(136)@127.0.1.1:48472 (trusty) > for framework 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.543246 18196 master.hpp:655] Adding task > 24a52e8f-0b30-4970-8829-a7383f19980e with resources cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] on slave > 20140529-092924-16842879-48472-18175-0 (trusty) > I0529 09:29:24.543572 18196 master.cpp:3111] Launching task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 with resources cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave > 20140529-092924-16842879-48472-18175-0 at slave(136)@127.0.1.1:48472 (trusty) > I0529 09:29:24.544034 18195 slave.cpp:933] Got assigned task > 24a52e8f-0b30-4970-8829-a7383f19980e for framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.544123 18195 slave.cpp:3290] Checkpointing FrameworkInfo to > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/framework.info' > I0529 09:29:24.544325 18195 slave.cpp:3297] Checkpointing framework pid > 'scheduler(130)@127.0.1.1:48472' to > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/framework.pid' > I0529 09:29:24.544543 18195 slave.cpp:1043] Launching task > 24a52e8f-0b30-4970-8829-a7383f19980e for framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.545670 18195 slave.cpp:3612] Checkpointing ExecutorInfo to > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e/executor.info' > I0529 09:29:24.546325 18195 slave.cpp:3727] Checkpointing TaskInfo to > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e/runs/7fbaa2b9-d41e-4d3e-897f-3d593b1308c7/tasks/24a52e8f-0b30-4970-8829-a7383f19980e/task.info' > I0529 09:29:24.546923 18195 slave.cpp:1153] Queuing task > '24a52e8f-0b30-4970-8829-a7383f19980e' for executor > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > '20140529-092924-16842879-48472-18175-0000 > E0529 09:29:24.547292 18195 slave.cpp:2397] Termination of executor > '24a52e8f-0b30-4970-8829-a7383f19980e' of framework > '20140529-092924-16842879-48472-18175-0000' failed: Unknown container: > 7fbaa2b9-d41e-4d3e-897f-3d593b1308c7 > E0529 09:29:24.547698 18195 slave.cpp:2686] Failed to unmonitor container for > executor 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000: Not monitored > I0529 09:29:24.548846 18195 slave.cpp:2018] Handling status update > TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 from @0.0.0.0:0 > I0529 09:29:24.549253 18195 slave.cpp:3660] Terminating task > 24a52e8f-0b30-4970-8829-a7383f19980e > I0529 09:29:24.549710 18194 status_update_manager.cpp:320] Received status > update TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.549733 18194 status_update_manager.cpp:499] Creating > StatusUpdate stream for task 24a52e8f-0b30-4970-8829-a7383f19980e of > framework 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.549865 18194 status_update_manager.hpp:342] Checkpointing > UPDATE for status update TASK_FAILED (UUID: > aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.550242 18195 slave.cpp:425] Slave terminating > I0529 09:29:24.550930 18198 master.cpp:712] Slave > 20140529-092924-16842879-48472-18175-0 at slave(136)@127.0.1.1:48472 (trusty) > disconnected > I0529 09:29:24.550952 18198 master.cpp:1344] Disconnecting slave > 20140529-092924-16842879-48472-18175-0 > I0529 09:29:24.551013 18194 status_update_manager.cpp:373] Forwarding status > update TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 to master@127.0.1.1:48472 > I0529 09:29:24.551261 18198 hierarchical_allocator_process.hpp:483] Slave > 20140529-092924-16842879-48472-18175-0 disconnected > I0529 09:29:24.551419 18194 master.cpp:2628] Status update TASK_FAILED (UUID: > aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 from slave > 20140529-092924-16842879-48472-18175-0 at slave(136)@127.0.1.1:48472 (trusty) > I0529 09:29:24.551475 18194 master.hpp:673] Removing task > 24a52e8f-0b30-4970-8829-a7383f19980e with resources cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] on slave > 20140529-092924-16842879-48472-18175-0 (trusty) > GMOCK WARNING: > Uninteresting mock function call - returning directly. > Function call: statusUpdate(0x7fff69ebf410, @0x2ad9c80de320 72-byte > object <30-17 18-A5 D9-2A 00-00 00-00 00-00 00-00 00-00 D0-6B 0E-C8 D9-2A > 00-00 50-4A 0E-C8 D9-2A 00-00 38-30 D1-00 00-00 00-00 90-F1 0D-C8 D9-2A 00-00 > 1B-13 63-6B 13-EB D4-41 03-00 00-00 00-00 00-00 37-00 00-00 00-00 00-00>) > Stack trace: > I0529 09:29:24.551668 18194 sched.cpp:625] Scheduler::statusUpdate took > 87069ns > I0529 09:29:24.552121 18194 hierarchical_allocator_process.hpp:636] Recovered > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total > allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on > slave 20140529-092924-16842879-48472-18175-0 from framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.560880 18197 slave.cpp:143] Slave started on > 137)@127.0.1.1:48472 > I0529 09:29:24.560909 18197 credentials.hpp:35] Loading credentials for > authentication from > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/credential' > I0529 09:29:24.560972 18197 slave.cpp:242] Slave using credential for: > test-principal > I0529 09:29:24.561058 18197 slave.cpp:255] Slave resources: cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I0529 09:29:24.561120 18197 slave.cpp:283] Slave hostname: trusty > I0529 09:29:24.561133 18197 slave.cpp:284] Slave checkpoint: true > I0529 09:29:24.561491 18197 state.cpp:33] Recovering state from > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta' > W0529 09:29:24.562824 18197 state.cpp:426] Failed to find executor forked pid > file > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e/runs/7fbaa2b9-d41e-4d3e-897f-3d593b1308c7/pids/forked.pid' > I0529 09:29:24.563506 18201 slave.cpp:3086] Recovering framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.563535 18201 slave.cpp:3462] Recovering executor > '24a52e8f-0b30-4970-8829-a7383f19980e' of framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.563608 18201 slave.cpp:3660] Terminating task > 24a52e8f-0b30-4970-8829-a7383f19980e > I0529 09:29:24.564241 18201 status_update_manager.cpp:193] Recovering status > update manager > I0529 09:29:24.564257 18201 status_update_manager.cpp:201] Recovering > executor '24a52e8f-0b30-4970-8829-a7383f19980e' of framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.564275 18201 status_update_manager.cpp:499] Creating > StatusUpdate stream for task 24a52e8f-0b30-4970-8829-a7383f19980e of > framework 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.564371 18201 status_update_manager.hpp:306] Replaying status > update stream for task 24a52e8f-0b30-4970-8829-a7383f19980e > W0529 09:29:24.564391 18201 status_update_manager.cpp:381] Not forwarding > status update TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for > task 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 because no master is elected yet > I0529 09:29:24.564512 18201 slave.cpp:497] Successfully attached file > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e/runs/7fbaa2b9-d41e-4d3e-897f-3d593b1308c7' > I0529 09:29:24.564915 18194 slave.cpp:2967] Unable to reconnect to executor > '24a52e8f-0b30-4970-8829-a7383f19980e' of framework > 20140529-092924-16842879-48472-18175-0000 because no libprocess PID was found > E0529 09:29:24.564950 18194 slave.cpp:2397] Termination of executor > '24a52e8f-0b30-4970-8829-a7383f19980e' of framework > '20140529-092924-16842879-48472-18175-0000' failed: Unknown container: > 7fbaa2b9-d41e-4d3e-897f-3d593b1308c7 > E0529 09:29:24.565001 18194 slave.cpp:2686] Failed to unmonitor container for > executor 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000: Not monitored > I0529 09:29:24.581280 18198 hierarchical_allocator_process.hpp:687] Performed > allocation for 1 slaves in 56677ns > I0529 09:29:24.581333 18198 slave.cpp:1967] Cleaning up un-reregistered > executors > I0529 09:29:24.581362 18198 slave.cpp:3018] Finished recovery > I0529 09:29:24.581876 18198 slave.cpp:536] New master detected at > master@127.0.1.1:48472 > I0529 09:29:24.581909 18198 slave.cpp:612] Authenticating with master > master@127.0.1.1:48472 > I0529 09:29:24.581959 18198 slave.cpp:585] Detecting new master > I0529 09:29:24.581992 18198 status_update_manager.cpp:167] New master > detected at master@127.0.1.1:48472 > W0529 09:29:24.582008 18198 status_update_manager.cpp:181] Resending status > update TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.582025 18198 status_update_manager.cpp:373] Forwarding status > update TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 to master@127.0.1.1:48472 > I0529 09:29:24.582099 18198 authenticatee.hpp:128] Creating new client SASL > connection > W0529 09:29:24.582298 18198 master.cpp:2621] Could not lookup task for status > update TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 from slave > 20140529-092924-16842879-48472-18175-0 at slave(136)@127.0.1.1:48472 (trusty) > I0529 09:29:24.582356 18198 master.cpp:2986] Authenticating > slave(137)@127.0.1.1:48472 > I0529 09:29:24.582557 18195 authenticator.hpp:156] Creating new server SASL > connection > I0529 09:29:24.582679 18195 authenticatee.hpp:219] Received SASL > authentication mechanisms: CRAM-MD5 > I0529 09:29:24.582700 18195 authenticatee.hpp:245] Attempting to authenticate > with mechanism 'CRAM-MD5' > I0529 09:29:24.582726 18195 authenticator.hpp:262] Received SASL > authentication start > I0529 09:29:24.582770 18195 authenticator.hpp:384] Authentication requires > more steps > I0529 09:29:24.582798 18195 authenticatee.hpp:265] Received SASL > authentication step > I0529 09:29:24.582834 18195 authenticator.hpp:290] Received SASL > authentication step > I0529 09:29:24.582852 18195 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' > SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false > SASL_AUXPROP_AUTHZID: false > I0529 09:29:24.582861 18195 auxprop.cpp:153] Looking up auxiliary property > '*userPassword' > I0529 09:29:24.582875 18195 auxprop.cpp:153] Looking up auxiliary property > '*cmusaslsecretCRAM-MD5' > I0529 09:29:24.582885 18195 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'trusty' server FQDN: 'trusty' > SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false > SASL_AUXPROP_AUTHZID: true > I0529 09:29:24.582892 18195 auxprop.cpp:103] Skipping auxiliary property > '*userPassword' since SASL_AUXPROP_AUTHZID == true > I0529 09:29:24.582898 18195 auxprop.cpp:103] Skipping auxiliary property > '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true > I0529 09:29:24.582911 18195 authenticator.hpp:376] Authentication success > I0529 09:29:24.582938 18195 authenticatee.hpp:305] Authentication success > I0529 09:29:24.582962 18195 master.cpp:3026] Successfully authenticated > principal 'test-principal' at slave(137)@127.0.1.1:48472 > I0529 09:29:24.583034 18195 slave.cpp:669] Successfully authenticated with > master master@127.0.1.1:48472 > I0529 09:29:24.583108 18195 slave.cpp:902] Will retry registration in > 13.697338ms if necessary > W0529 09:29:24.583199 18195 master.cpp:2417] Slave at > slave(137)@127.0.1.1:48472 (trusty) is being allowed to re-register with an > already in use id (20140529-092924-16842879-48472-18175-0) > I0529 09:29:24.583282 18195 slave.cpp:753] Re-registered with master > master@127.0.1.1:48472 > I0529 09:29:24.583312 18195 slave.cpp:1514] Updating framework > 20140529-092924-16842879-48472-18175-0000 pid to > scheduler(130)@127.0.1.1:48472 > I0529 09:29:24.583348 18195 slave.cpp:1522] Checkpointing framework pid > 'scheduler(130)@127.0.1.1:48472' to > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/framework.pid' > I0529 09:29:24.583536 18195 hierarchical_allocator_process.hpp:497] Slave > 20140529-092924-16842879-48472-18175-0 reconnected > W0529 09:29:24.583560 18195 status_update_manager.cpp:181] Resending status > update TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.583573 18195 status_update_manager.cpp:373] Forwarding status > update TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 to master@127.0.1.1:48472 > W0529 09:29:24.583658 18195 master.cpp:2621] Could not lookup task for status > update TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 from slave > 20140529-092924-16842879-48472-18175-0 at slave(137)@127.0.1.1:48472 (trusty) > I0529 09:29:24.584329 18198 sched.cpp:625] Scheduler::statusUpdate took > 16101ns > tests/slave_recovery_tests.cpp:3068: Failure > Mock function called more times than expected - returning directly. > Function call: statusUpdate(0x7fff69ebf410, @0x2ad9b40bc400 72-byte > object <30-17 18-A5 D9-2A 00-00 00-00 00-00 00-00 00-00 90-31 03-B4 D9-2A > 00-00 B0-B1 0F-B4 D9-2A 00-00 38-30 D1-00 00-00 00-00 A0-39 0C-B4 D9-2A 00-00 > 1B-13 63-6B 13-EB D4-41 03-00 00-00 00-00 00-00 37-00 00-00 00-00 00-00>) > Expected: to be called once > Actual: called twice - over-saturated and active > I0529 09:29:24.584525 18198 sched.cpp:625] Scheduler::statusUpdate took > 121892ns > I0529 09:29:24.584609 18198 status_update_manager.cpp:398] Received status > update acknowledgement (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.584632 18198 status_update_manager.hpp:342] Checkpointing ACK > for status update TASK_FAILED (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) > for task 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.585109 18194 sched.cpp:730] Stopping framework > '20140529-092924-16842879-48472-18175-0000' > I0529 09:29:24.585196 18196 master.cpp:1272] Asked to unregister framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.585211 18196 master.cpp:3344] Removing framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.585304 18196 hierarchical_allocator_process.hpp:407] > Deactivated framework 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.585342 18196 hierarchical_allocator_process.hpp:362] Removed > framework 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.585366 18196 slave.cpp:1337] Asked to shut down framework > 20140529-092924-16842879-48472-18175-0000 by master@127.0.1.1:48472 > I0529 09:29:24.585381 18196 slave.cpp:1362] Shutting down framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.585396 18196 slave.cpp:2552] Cleaning up executor > '24a52e8f-0b30-4970-8829-a7383f19980e' of framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.585631 18196 slave.cpp:2627] Cleaning up framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.585710 18196 gc.cpp:56] Scheduling > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e/runs/7fbaa2b9-d41e-4d3e-897f-3d593b1308c7' > for gc 6.99999322304296days in the future > I0529 09:29:24.585954 18196 gc.cpp:56] Scheduling > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e' > for gc 6.99999322254518days in the future > I0529 09:29:24.586061 18196 gc.cpp:56] Scheduling > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e/runs/7fbaa2b9-d41e-4d3e-897f-3d593b1308c7' > for gc 6.99999322225778days in the future > I0529 09:29:24.586094 18196 gc.cpp:56] Scheduling > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000/executors/24a52e8f-0b30-4970-8829-a7383f19980e' > for gc 6.99999322199111days in the future > I0529 09:29:24.586122 18196 gc.cpp:56] Scheduling > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000' > for gc 6.99999322143407days in the future > I0529 09:29:24.586148 18196 gc.cpp:56] Scheduling > '/tmp/SlaveRecoveryTest_0_RestartBeforeContainerizerLaunch_q4MPqy/meta/slaves/20140529-092924-16842879-48472-18175-0/frameworks/20140529-092924-16842879-48472-18175-0000' > for gc 6.99999322119111days in the future > I0529 09:29:24.586599 18196 master.cpp:574] Master terminating > I0529 09:29:24.589107 18199 slave.cpp:2224] master@127.0.1.1:48472 exited > W0529 09:29:24.589130 18199 slave.cpp:2227] Master disconnected! Waiting for > a new master to be elected > I0529 09:29:24.589537 18198 status_update_manager.cpp:530] Cleaning up status > update stream for task 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.589920 18198 status_update_manager.cpp:398] Received status > update acknowledgement (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.590104 18199 slave.cpp:1604] Status update manager > successfully handled status update acknowledgement (UUID: > aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 > E0529 09:29:24.590286 18199 slave.cpp:1615] Status update acknowledgement > (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of unknown framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.590569 18198 status_update_manager.cpp:282] Closing status > update streams for framework 20140529-092924-16842879-48472-18175-0000 > E0529 09:29:24.590759 18194 slave.cpp:1597] Failed to handle status update > acknowledgement (UUID: aee0620e-c349-42ce-8cce-7b3835adad95) for task > 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000: Cannot find the status update > stream for task 24a52e8f-0b30-4970-8829-a7383f19980e of framework > 20140529-092924-16842879-48472-18175-0000 > I0529 09:29:24.592489 18175 slave.cpp:425] Slave terminating > [ FAILED ] SlaveRecoveryTest/0.RestartBeforeContainerizerLaunch, where > TypeParam = mesos::internal::slave::MesosContainerizer (112 ms) > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)