[ https://issues.apache.org/jira/browse/MESOS-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin Mahler reopened MESOS-1613: ------------------------------------ Looks like it's still flaky: {noformat} Changes Summary Made ephemeral ports a resource and killed private resources. (details) Do not send ephemeral_ports resource to frameworks. (details) Create static mesos library. (details) Re-enabled HealthCheckTest.ConsecutiveFailures test. (details) Merge resourcesRecovered and resourcesUnused. (details) Added executor metrics for slave. (details) [ RUN ] HealthCheckTest.ConsecutiveFailures Using temporary directory '/tmp/HealthCheckTest_ConsecutiveFailures_fBrAEu' I0806 15:06:59.268267 9210 leveldb.cpp:176] Opened db in 29.926087ms I0806 15:06:59.275971 9210 leveldb.cpp:183] Compacted db in 7.40006ms I0806 15:06:59.276254 9210 leveldb.cpp:198] Created db iterator in 7678ns I0806 15:06:59.276741 9210 leveldb.cpp:204] Seeked to beginning of db in 2076ns I0806 15:06:59.277034 9210 leveldb.cpp:273] Iterated through 0 keys in the db in 1908ns I0806 15:06:59.277307 9210 replica.cpp:741] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned I0806 15:06:59.277868 9233 recover.cpp:425] Starting replica recovery I0806 15:06:59.277946 9233 recover.cpp:451] Replica is in EMPTY status I0806 15:06:59.278240 9233 replica.cpp:638] Replica in EMPTY status received a broadcasted recover request I0806 15:06:59.278296 9233 recover.cpp:188] Received a recover response from a replica in EMPTY status I0806 15:06:59.278391 9233 recover.cpp:542] Updating replica status to STARTING I0806 15:06:59.282282 9234 master.cpp:287] Master 20140806-150659-16842879-60888-9210 (lucid) started on 127.0.1.1:60888 I0806 15:06:59.282316 9234 master.cpp:324] Master only allowing authenticated frameworks to register I0806 15:06:59.282322 9234 master.cpp:329] Master only allowing authenticated slaves to register I0806 15:06:59.282330 9234 credentials.hpp:36] Loading credentials for authentication from '/tmp/HealthCheckTest_ConsecutiveFailures_fBrAEu/credentials' I0806 15:06:59.282508 9234 master.cpp:358] Authorization enabled I0806 15:06:59.283121 9234 hierarchical_allocator_process.hpp:296] Initializing hierarchical allocator process with master : master@127.0.1.1:60888 I0806 15:06:59.283174 9234 master.cpp:121] No whitelist given. Advertising offers for all slaves I0806 15:06:59.283413 9234 master.cpp:1127] The newly elected leader is master@127.0.1.1:60888 with id 20140806-150659-16842879-60888-9210 I0806 15:06:59.283429 9234 master.cpp:1140] Elected as the leading master! I0806 15:06:59.283435 9234 master.cpp:958] Recovering from registrar I0806 15:06:59.283491 9234 registrar.cpp:313] Recovering registrar I0806 15:06:59.284046 9233 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 5.600113ms I0806 15:06:59.284080 9233 replica.cpp:320] Persisted replica status to STARTING I0806 15:06:59.284226 9233 recover.cpp:451] Replica is in STARTING status I0806 15:06:59.284580 9233 replica.cpp:638] Replica in STARTING status received a broadcasted recover request I0806 15:06:59.284643 9233 recover.cpp:188] Received a recover response from a replica in STARTING status I0806 15:06:59.284747 9233 recover.cpp:542] Updating replica status to VOTING I0806 15:06:59.289934 9233 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 5.119357ms I0806 15:06:59.290256 9233 replica.cpp:320] Persisted replica status to VOTING I0806 15:06:59.290876 9237 recover.cpp:556] Successfully joined the Paxos group I0806 15:06:59.291131 9232 recover.cpp:440] Recover process terminated I0806 15:06:59.300732 9236 log.cpp:656] Attempting to start the writer I0806 15:06:59.301061 9236 replica.cpp:474] Replica received implicit promise request with proposal 1 I0806 15:06:59.306172 9236 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 5.070193ms I0806 15:06:59.306229 9236 replica.cpp:342] Persisted promised to 1 I0806 15:06:59.306747 9236 coordinator.cpp:230] Coordinator attemping to fill missing position I0806 15:06:59.307143 9236 replica.cpp:375] Replica received explicit promise request for position 0 with proposal 2 I0806 15:06:59.309715 9236 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 2.521311ms I0806 15:06:59.310199 9236 replica.cpp:676] Persisted action at 0 I0806 15:06:59.320276 9234 replica.cpp:508] Replica received write request for position 0 I0806 15:06:59.320335 9234 leveldb.cpp:438] Reading position from leveldb took 26656ns I0806 15:06:59.325726 9234 leveldb.cpp:343] Persisting action (14 bytes) to leveldb took 5.358479ms I0806 15:06:59.325781 9234 replica.cpp:676] Persisted action at 0 I0806 15:06:59.325999 9234 replica.cpp:655] Replica received learned notice for position 0 I0806 15:06:59.328487 9234 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 2.458504ms I0806 15:06:59.328887 9234 replica.cpp:676] Persisted action at 0 I0806 15:06:59.329144 9234 replica.cpp:661] Replica learned NOP action at position 0 I0806 15:06:59.340193 9236 log.cpp:672] Writer started with ending position 0 I0806 15:06:59.340481 9236 leveldb.cpp:438] Reading position from leveldb took 30496ns I0806 15:06:59.341670 9236 registrar.cpp:346] Successfully fetched the registry (0B) I0806 15:06:59.341696 9236 registrar.cpp:422] Attempting to update the 'registry' I0806 15:06:59.342859 9236 log.cpp:680] Attempting to append 117 bytes to the log I0806 15:06:59.342912 9236 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1 I0806 15:06:59.343128 9236 replica.cpp:508] Replica received write request for position 1 I0806 15:06:59.350430 9236 leveldb.cpp:343] Persisting action (134 bytes) to leveldb took 7.261692ms I0806 15:06:59.350914 9236 replica.cpp:676] Persisted action at 1 I0806 15:06:59.360164 9234 replica.cpp:655] Replica received learned notice for position 1 I0806 15:06:59.365715 9234 leveldb.cpp:343] Persisting action (136 bytes) to leveldb took 5.492077ms I0806 15:06:59.365774 9234 replica.cpp:676] Persisted action at 1 I0806 15:06:59.365784 9234 replica.cpp:661] Replica learned APPEND action at position 1 I0806 15:06:59.366268 9234 registrar.cpp:479] Successfully updated 'registry' I0806 15:06:59.366322 9234 registrar.cpp:372] Successfully recovered registrar I0806 15:06:59.366368 9234 log.cpp:699] Attempting to truncate the log to 1 I0806 15:06:59.366452 9234 master.cpp:985] Recovered 0 slaves from the Registry (81B) ; allowing 10mins for slaves to re-register I0806 15:06:59.366508 9234 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 2 I0806 15:06:59.367105 9232 replica.cpp:508] Replica received write request for position 2 I0806 15:06:59.369526 9232 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 2.392686ms I0806 15:06:59.369560 9232 replica.cpp:676] Persisted action at 2 I0806 15:06:59.369807 9232 replica.cpp:655] Replica received learned notice for position 2 I0806 15:06:59.377322 9232 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 7.463724ms I0806 15:06:59.377411 9232 leveldb.cpp:401] Deleting ~1 keys from leveldb took 35552ns I0806 15:06:59.377426 9232 replica.cpp:676] Persisted action at 2 I0806 15:06:59.377435 9232 replica.cpp:661] Replica learned TRUNCATE action at position 2 I0806 15:06:59.378401 9210 containerizer.cpp:124] Using isolation: posix/cpu,posix/mem I0806 15:06:59.381873 9236 slave.cpp:167] Slave started on 96)@127.0.1.1:60888 I0806 15:06:59.381899 9236 credentials.hpp:84] Loading credential for authentication from '/tmp/HealthCheckTest_ConsecutiveFailures_jnWV1D/credential' I0806 15:06:59.381995 9236 slave.cpp:265] Slave using credential for: test-principal I0806 15:06:59.382102 9236 slave.cpp:278] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] I0806 15:06:59.382159 9236 slave.cpp:306] Slave hostname: lucid I0806 15:06:59.382169 9236 slave.cpp:307] Slave checkpoint: false I0806 15:06:59.382702 9236 state.cpp:33] Recovering state from '/tmp/HealthCheckTest_ConsecutiveFailures_jnWV1D/meta' I0806 15:06:59.382812 9236 status_update_manager.cpp:193] Recovering status update manager I0806 15:06:59.382863 9236 containerizer.cpp:287] Recovering containerizer I0806 15:06:59.383121 9236 slave.cpp:3175] Finished recovery I0806 15:06:59.383474 9236 slave.cpp:589] New master detected at master@127.0.1.1:60888 I0806 15:06:59.383501 9236 slave.cpp:663] Authenticating with master master@127.0.1.1:60888 I0806 15:06:59.383543 9236 slave.cpp:636] Detecting new master I0806 15:06:59.383576 9236 status_update_manager.cpp:167] New master detected at master@127.0.1.1:60888 I0806 15:06:59.383646 9236 authenticatee.hpp:128] Creating new client SASL connection I0806 15:06:59.383748 9236 master.cpp:3545] Authenticating slave(96)@127.0.1.1:60888 I0806 15:06:59.383833 9236 authenticator.hpp:156] Creating new server SASL connection I0806 15:06:59.383906 9236 authenticatee.hpp:219] Received SASL authentication mechanisms: CRAM-MD5 I0806 15:06:59.383926 9236 authenticatee.hpp:245] Attempting to authenticate with mechanism 'CRAM-MD5' I0806 15:06:59.383949 9236 authenticator.hpp:262] Received SASL authentication start I0806 15:06:59.383987 9236 authenticator.hpp:384] Authentication requires more steps I0806 15:06:59.384011 9236 authenticatee.hpp:265] Received SASL authentication step I0806 15:06:59.384042 9236 authenticator.hpp:290] Received SASL authentication step I0806 15:06:59.384055 9236 auxprop.cpp:81] Request to lookup properties for user: 'test-principal' realm: 'lucid' server FQDN: 'lucid' SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false I0806 15:06:59.384062 9236 auxprop.cpp:153] Looking up auxiliary property '*userPassword' I0806 15:06:59.384069 9236 auxprop.cpp:153] Looking up auxiliary property '*cmusaslsecretCRAM-MD5' I0806 15:06:59.384076 9236 auxprop.cpp:81] Request to lookup properties for user: 'test-principal' realm: 'lucid' server FQDN: 'lucid' SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true I0806 15:06:59.384081 9236 auxprop.cpp:103] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true I0806 15:06:59.384086 9236 auxprop.cpp:103] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true I0806 15:06:59.384096 9236 authenticator.hpp:376] Authentication success I0806 15:06:59.384119 9236 authenticatee.hpp:305] Authentication success I0806 15:06:59.384138 9236 master.cpp:3585] Successfully authenticated principal 'test-principal' at slave(96)@127.0.1.1:60888 I0806 15:06:59.384209 9236 slave.cpp:720] Successfully authenticated with master master@127.0.1.1:60888 I0806 15:06:59.384249 9236 slave.cpp:971] Will retry registration in 2.692473ms if necessary I0806 15:06:59.384337 9233 master.cpp:2767] Registering slave at slave(96)@127.0.1.1:60888 (lucid) with id 20140806-150659-16842879-60888-9210-0 I0806 15:06:59.384434 9233 registrar.cpp:422] Attempting to update the 'registry' I0806 15:06:59.385726 9231 log.cpp:680] Attempting to append 287 bytes to the log I0806 15:06:59.385782 9231 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 3 I0806 15:06:59.385999 9231 replica.cpp:508] Replica received write request for position 3 I0806 15:06:59.388566 9231 leveldb.cpp:343] Persisting action (306 bytes) to leveldb took 2.532556ms I0806 15:06:59.388595 9231 replica.cpp:676] Persisted action at 3 I0806 15:06:59.388803 9231 replica.cpp:655] Replica received learned notice for position 3 I0806 15:06:59.389863 9233 slave.cpp:971] Will retry registration in 39.249021ms if necessary I0806 15:06:59.389991 9233 master.cpp:2755] Ignoring register slave message from slave(96)@127.0.1.1:60888 (lucid) as admission is already in progress I0806 15:06:59.391331 9210 sched.cpp:139] Version: 0.20.0 I0806 15:06:59.391752 9235 sched.cpp:235] New master detected at master@127.0.1.1:60888 I0806 15:06:59.391777 9235 sched.cpp:285] Authenticating with master master@127.0.1.1:60888 I0806 15:06:59.391863 9235 authenticatee.hpp:128] Creating new client SASL connection I0806 15:06:59.391944 9235 master.cpp:3545] Authenticating scheduler-c7f52aa8-e390-465a-aad0-7e440c08940a@127.0.1.1:60888 I0806 15:06:59.392031 9235 authenticator.hpp:156] Creating new server SASL connection I0806 15:06:59.392089 9235 authenticatee.hpp:219] Received SASL authentication mechanisms: CRAM-MD5 I0806 15:06:59.392102 9235 authenticatee.hpp:245] Attempting to authenticate with mechanism 'CRAM-MD5' I0806 15:06:59.392122 9235 authenticator.hpp:262] Received SASL authentication start I0806 15:06:59.392155 9235 authenticator.hpp:384] Authentication requires more steps I0806 15:06:59.392179 9235 authenticatee.hpp:265] Received SASL authentication step I0806 15:06:59.392213 9235 authenticator.hpp:290] Received SASL authentication step I0806 15:06:59.392230 9235 auxprop.cpp:81] Request to lookup properties for user: 'test-principal' realm: 'lucid' server FQDN: 'lucid' SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false I0806 15:06:59.392242 9235 auxprop.cpp:153] Looking up auxiliary property '*userPassword' I0806 15:06:59.392258 9235 auxprop.cpp:153] Looking up auxiliary property '*cmusaslsecretCRAM-MD5' I0806 15:06:59.392273 9235 auxprop.cpp:81] Request to lookup properties for user: 'test-principal' realm: 'lucid' server FQDN: 'lucid' SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true I0806 15:06:59.392310 9235 auxprop.cpp:103] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true I0806 15:06:59.392323 9235 auxprop.cpp:103] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true I0806 15:06:59.392343 9235 authenticator.hpp:376] Authentication success I0806 15:06:59.392387 9235 authenticatee.hpp:305] Authentication success I0806 15:06:59.392452 9235 master.cpp:3585] Successfully authenticated principal 'test-principal' at scheduler-c7f52aa8-e390-465a-aad0-7e440c08940a@127.0.1.1:60888 I0806 15:06:59.392570 9235 sched.cpp:359] Successfully authenticated with master master@127.0.1.1:60888 I0806 15:06:59.392585 9235 sched.cpp:478] Sending registration request to master@127.0.1.1:60888 I0806 15:06:59.392628 9235 master.cpp:1246] Received registration request from scheduler-c7f52aa8-e390-465a-aad0-7e440c08940a@127.0.1.1:60888 I0806 15:06:59.392645 9235 master.cpp:1206] Authorizing framework principal 'test-principal' to receive offers for role '*' I0806 15:06:59.392765 9235 master.cpp:1305] Registering framework 20140806-150659-16842879-60888-9210-0000 at scheduler-c7f52aa8-e390-465a-aad0-7e440c08940a@127.0.1.1:60888 I0806 15:06:59.392875 9235 sched.cpp:409] Framework registered with 20140806-150659-16842879-60888-9210-0000 I0806 15:06:59.392906 9235 sched.cpp:423] Scheduler::registered took 16129ns I0806 15:06:59.392942 9235 hierarchical_allocator_process.hpp:326] Added framework 20140806-150659-16842879-60888-9210-0000 I0806 15:06:59.392951 9235 hierarchical_allocator_process.hpp:688] No resources available to allocate! I0806 15:06:59.392956 9235 hierarchical_allocator_process.hpp:650] Performed allocation for 0 slaves in 5598ns I0806 15:06:59.399893 9231 leveldb.cpp:343] Persisting action (308 bytes) to leveldb took 11.054696ms I0806 15:06:59.399946 9231 replica.cpp:676] Persisted action at 3 I0806 15:06:59.399956 9231 replica.cpp:661] Replica learned APPEND action at position 3 I0806 15:06:59.400360 9231 registrar.cpp:479] Successfully updated 'registry' I0806 15:06:59.400446 9231 log.cpp:699] Attempting to truncate the log to 3 I0806 15:06:59.400507 9231 master.cpp:2807] Registered slave 20140806-150659-16842879-60888-9210-0 at slave(96)@127.0.1.1:60888 (lucid) I0806 15:06:59.400524 9231 master.cpp:4018] Adding slave 20140806-150659-16842879-60888-9210-0 at slave(96)@127.0.1.1:60888 (lucid) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] I0806 15:06:59.400640 9231 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 4 I0806 15:06:59.400725 9231 slave.cpp:754] Registered with master master@127.0.1.1:60888; given slave ID 20140806-150659-16842879-60888-9210-0 I0806 15:06:59.400805 9231 hierarchical_allocator_process.hpp:439] Added slave 20140806-150659-16842879-60888-9210-0 (lucid) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] available) I0806 15:06:59.400858 9231 hierarchical_allocator_process.hpp:714] Offering cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20140806-150659-16842879-60888-9210-0 to framework 20140806-150659-16842879-60888-9210-0000 I0806 15:06:59.400949 9231 hierarchical_allocator_process.hpp:670] Performed allocation for slave 20140806-150659-16842879-60888-9210-0 in 114359ns I0806 15:06:59.401000 9231 slave.cpp:2324] Received ping from slave-observer(80)@127.0.1.1:60888 I0806 15:06:59.401064 9231 master.hpp:816] Adding offer 20140806-150659-16842879-60888-9210-0 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20140806-150659-16842879-60888-9210-0 (lucid) I0806 15:06:59.401111 9231 master.cpp:3492] Sending 1 offers to framework 20140806-150659-16842879-60888-9210-0000 I0806 15:06:59.401257 9231 sched.cpp:546] Scheduler::resourceOffers took 20565ns I0806 15:06:59.401422 9231 replica.cpp:508] Replica received write request for position 4 I0806 15:06:59.402755 9234 master.hpp:826] Removing offer 20140806-150659-16842879-60888-9210-0 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20140806-150659-16842879-60888-9210-0 (lucid) I0806 15:06:59.402817 9234 master.cpp:2130] Processing reply for offers: [ 20140806-150659-16842879-60888-9210-0 ] on slave 20140806-150659-16842879-60888-9210-0 at slave(96)@127.0.1.1:60888 (lucid) for framework 20140806-150659-16842879-60888-9210-0000 I0806 15:06:59.402847 9234 master.cpp:2216] Authorizing framework principal 'test-principal' to launch task 1 as user 'jenkins' I0806 15:06:59.403132 9234 master.hpp:788] Adding task 1 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20140806-150659-16842879-60888-9210-0 (lucid) I0806 15:06:59.403168 9234 master.cpp:2282] Launching task 1 of framework 20140806-150659-16842879-60888-9210-0000 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20140806-150659-16842879-60888-9210-0 at slave(96)@127.0.1.1:60888 (lucid) I0806 15:06:59.403311 9234 slave.cpp:1002] Got assigned task 1 for framework 20140806-150659-16842879-60888-9210-0000 I0806 15:06:59.403586 9234 slave.cpp:1112] Launching task 1 for framework 20140806-150659-16842879-60888-9210-0000 I0806 15:06:59.405397 9234 slave.cpp:1222] Queuing task '1' for executor 1 of framework '20140806-150659-16842879-60888-9210-0000 I0806 15:06:59.405566 9234 containerizer.cpp:427] Starting container '64988b19-cec6-4228-80e5-6e80a271bb8a' for executor '1' of framework '20140806-150659-16842879-60888-9210-0000' I0806 15:06:59.405800 9234 slave.cpp:543] Successfully attached file '/tmp/HealthCheckTest_ConsecutiveFailures_jnWV1D/slaves/20140806-150659-16842879-60888-9210-0/frameworks/20140806-150659-16842879-60888-9210-0000/executors/1/runs/64988b19-cec6-4228-80e5-6e80a271bb8a' I0806 15:06:59.410970 9231 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 9.489363ms I0806 15:06:59.411043 9231 replica.cpp:676] Persisted action at 4 I0806 15:06:59.411597 9231 replica.cpp:655] Replica received learned notice for position 4 I0806 15:06:59.412961 9234 launcher.cpp:137] Forked child with pid '12834' for container '64988b19-cec6-4228-80e5-6e80a271bb8a' I0806 15:06:59.413980 9234 containerizer.cpp:537] Fetching URIs for container '64988b19-cec6-4228-80e5-6e80a271bb8a' using command '/var/jenkins/workspace/mesos-ubuntu-10.04-gcc/src/mesos-fetcher' I0806 15:06:59.425834 9231 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 14.188783ms I0806 15:06:59.425940 9231 leveldb.cpp:401] Deleting ~2 keys from leveldb took 49144ns I0806 15:06:59.425961 9231 replica.cpp:676] Persisted action at 4 I0806 15:06:59.425974 9231 replica.cpp:661] Replica learned TRUNCATE action at position 4 E0806 15:07:00.245468 9231 slave.cpp:2471] Container '64988b19-cec6-4228-80e5-6e80a271bb8a' for executor '1' of framework '20140806-150659-16842879-60888-9210-0000' failed to start: TaskInfo/ExecutorInfo not supported I0806 15:07:00.301451 9231 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 41375ns I0806 15:07:00.611855 9231 slave.cpp:1733] Got registration for executor '1' of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:00.612200 9231 slave.cpp:1852] Flushing queued task 1 for executor '1' of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:00.612619 9238 process.cpp:1098] Socket closed while receiving I0806 15:07:00.617409 9231 slave.cpp:2087] Handling status update TASK_RUNNING (UUID: d3e56bf8-179e-4d3b-bfe2-f78b824968ae) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 from executor(1)@127.0.1.1:52559 I0806 15:07:00.617502 9231 status_update_manager.cpp:320] Received status update TASK_RUNNING (UUID: d3e56bf8-179e-4d3b-bfe2-f78b824968ae) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:00.617517 9231 status_update_manager.cpp:499] Creating StatusUpdate stream for task 1 of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:00.617617 9231 status_update_manager.cpp:373] Forwarding status update TASK_RUNNING (UUID: d3e56bf8-179e-4d3b-bfe2-f78b824968ae) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 to master@127.0.1.1:60888 I0806 15:07:00.617749 9231 master.cpp:3136] Forwarding status update TASK_RUNNING (UUID: d3e56bf8-179e-4d3b-bfe2-f78b824968ae) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:00.617787 9231 master.cpp:3102] Status update TASK_RUNNING (UUID: d3e56bf8-179e-4d3b-bfe2-f78b824968ae) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 from slave 20140806-150659-16842879-60888-9210-0 at slave(96)@127.0.1.1:60888 (lucid) I0806 15:07:00.617817 9231 slave.cpp:2245] Status update manager successfully handled status update TASK_RUNNING (UUID: d3e56bf8-179e-4d3b-bfe2-f78b824968ae) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:00.617827 9231 slave.cpp:2251] Sending acknowledgement for status update TASK_RUNNING (UUID: d3e56bf8-179e-4d3b-bfe2-f78b824968ae) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 to executor(1)@127.0.1.1:52559 I0806 15:07:00.618036 9231 sched.cpp:637] Scheduler::statusUpdate took 33324ns I0806 15:07:00.618113 9231 master.cpp:2617] Forwarding status update acknowledgement d3e56bf8-179e-4d3b-bfe2-f78b824968ae for task 1 of framework 20140806-150659-16842879-60888-9210-0000 to slave 20140806-150659-16842879-60888-9210-0 at slave(96)@127.0.1.1:60888 (lucid) I0806 15:07:00.618216 9231 status_update_manager.cpp:398] Received status update acknowledgement (UUID: d3e56bf8-179e-4d3b-bfe2-f78b824968ae) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:00.618271 9231 slave.cpp:1673] Status update manager successfully handled status update acknowledgement (UUID: d3e56bf8-179e-4d3b-bfe2-f78b824968ae) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:00.618535 9238 process.cpp:1098] Socket closed while receiving 2014-08-06 15:07:00,854:9210(0x2b245964a700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:36839] zk retcode=-4, errno=111(Connection refused): server refused to accept the client I0806 15:07:01.330152 9230 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 43958ns I0806 15:07:01.677136 9230 slave.cpp:2087] Handling status update TASK_RUNNING (UUID: 0b3404a9-3e62-4780-b2bb-9a6c1c47dc27) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 from executor(1)@127.0.1.1:52559 I0806 15:07:01.677273 9230 status_update_manager.cpp:320] Received status update TASK_RUNNING (UUID: 0b3404a9-3e62-4780-b2bb-9a6c1c47dc27) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:01.677304 9230 status_update_manager.cpp:373] Forwarding status update TASK_RUNNING (UUID: 0b3404a9-3e62-4780-b2bb-9a6c1c47dc27) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 to master@127.0.1.1:60888 I0806 15:07:01.677428 9230 master.cpp:3136] Forwarding status update TASK_RUNNING (UUID: 0b3404a9-3e62-4780-b2bb-9a6c1c47dc27) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:01.677466 9230 master.cpp:3102] Status update TASK_RUNNING (UUID: 0b3404a9-3e62-4780-b2bb-9a6c1c47dc27) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 from slave 20140806-150659-16842879-60888-9210-0 at slave(96)@127.0.1.1:60888 (lucid) I0806 15:07:01.677496 9230 slave.cpp:2245] Status update manager successfully handled status update TASK_RUNNING (UUID: 0b3404a9-3e62-4780-b2bb-9a6c1c47dc27) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:01.677505 9230 slave.cpp:2251] Sending acknowledgement for status update TASK_RUNNING (UUID: 0b3404a9-3e62-4780-b2bb-9a6c1c47dc27) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 to executor(1)@127.0.1.1:52559 I0806 15:07:01.677682 9230 sched.cpp:637] Scheduler::statusUpdate took 22626ns I0806 15:07:01.677749 9230 master.cpp:2617] Forwarding status update acknowledgement 0b3404a9-3e62-4780-b2bb-9a6c1c47dc27 for task 1 of framework 20140806-150659-16842879-60888-9210-0000 to slave 20140806-150659-16842879-60888-9210-0 at slave(96)@127.0.1.1:60888 (lucid) I0806 15:07:01.677850 9230 status_update_manager.cpp:398] Received status update acknowledgement (UUID: 0b3404a9-3e62-4780-b2bb-9a6c1c47dc27) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:01.677902 9230 slave.cpp:1673] Status update manager successfully handled status update acknowledgement (UUID: 0b3404a9-3e62-4780-b2bb-9a6c1c47dc27) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:01.678419 9238 process.cpp:1098] Socket closed while receiving I0806 15:07:02.330945 9233 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 68481ns I0806 15:07:02.683594 9233 slave.cpp:2087] Handling status update TASK_RUNNING (UUID: 77ba4fce-7925-43af-abeb-3cdb53a0fb75) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 from executor(1)@127.0.1.1:52559 I0806 15:07:02.683812 9233 status_update_manager.cpp:320] Received status update TASK_RUNNING (UUID: 77ba4fce-7925-43af-abeb-3cdb53a0fb75) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:02.683856 9233 status_update_manager.cpp:373] Forwarding status update TASK_RUNNING (UUID: 77ba4fce-7925-43af-abeb-3cdb53a0fb75) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 to master@127.0.1.1:60888 I0806 15:07:02.684006 9233 master.cpp:3136] Forwarding status update TASK_RUNNING (UUID: 77ba4fce-7925-43af-abeb-3cdb53a0fb75) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:02.684054 9233 master.cpp:3102] Status update TASK_RUNNING (UUID: 77ba4fce-7925-43af-abeb-3cdb53a0fb75) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 from slave 20140806-150659-16842879-60888-9210-0 at slave(96)@127.0.1.1:60888 (lucid) I0806 15:07:02.684089 9233 slave.cpp:2245] Status update manager successfully handled status update TASK_RUNNING (UUID: 77ba4fce-7925-43af-abeb-3cdb53a0fb75) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:02.684098 9233 slave.cpp:2251] Sending acknowledgement for status update TASK_RUNNING (UUID: 77ba4fce-7925-43af-abeb-3cdb53a0fb75) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 to executor(1)@127.0.1.1:52559 I0806 15:07:02.684299 9233 sched.cpp:637] Scheduler::statusUpdate took 40572ns I0806 15:07:02.684434 9233 master.cpp:2617] Forwarding status update acknowledgement 77ba4fce-7925-43af-abeb-3cdb53a0fb75 for task 1 of framework 20140806-150659-16842879-60888-9210-0000 to slave 20140806-150659-16842879-60888-9210-0 at slave(96)@127.0.1.1:60888 (lucid) I0806 15:07:02.684790 9238 process.cpp:1098] Socket closed while receiving I0806 15:07:02.684916 9232 status_update_manager.cpp:398] Received status update acknowledgement (UUID: 77ba4fce-7925-43af-abeb-3cdb53a0fb75) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:02.684984 9232 slave.cpp:1673] Status update manager successfully handled status update acknowledgement (UUID: 77ba4fce-7925-43af-abeb-3cdb53a0fb75) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:03.340453 9235 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 44308ns I0806 15:07:03.692570 9235 slave.cpp:2087] Handling status update TASK_RUNNING (UUID: e83bd7a9-a974-443e-9292-9b38ee0c8a8c) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 from executor(1)@127.0.1.1:52559 I0806 15:07:03.692721 9235 status_update_manager.cpp:320] Received status update TASK_RUNNING (UUID: e83bd7a9-a974-443e-9292-9b38ee0c8a8c) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:03.692755 9235 status_update_manager.cpp:373] Forwarding status update TASK_RUNNING (UUID: e83bd7a9-a974-443e-9292-9b38ee0c8a8c) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 to master@127.0.1.1:60888 I0806 15:07:03.692875 9235 master.cpp:3136] Forwarding status update TASK_RUNNING (UUID: e83bd7a9-a974-443e-9292-9b38ee0c8a8c) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:03.692911 9235 master.cpp:3102] Status update TASK_RUNNING (UUID: e83bd7a9-a974-443e-9292-9b38ee0c8a8c) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 from slave 20140806-150659-16842879-60888-9210-0 at slave(96)@127.0.1.1:60888 (lucid) I0806 15:07:03.692942 9235 slave.cpp:2245] Status update manager successfully handled status update TASK_RUNNING (UUID: e83bd7a9-a974-443e-9292-9b38ee0c8a8c) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:03.692952 9235 slave.cpp:2251] Sending acknowledgement for status update TASK_RUNNING (UUID: e83bd7a9-a974-443e-9292-9b38ee0c8a8c) for task 1 in health state unhealthy of framework 20140806-150659-16842879-60888-9210-0000 to executor(1)@127.0.1.1:52559 I0806 15:07:03.693111 9235 sched.cpp:637] Scheduler::statusUpdate took 23054ns I0806 15:07:03.693174 9235 master.cpp:2617] Forwarding status update acknowledgement e83bd7a9-a974-443e-9292-9b38ee0c8a8c for task 1 of framework 20140806-150659-16842879-60888-9210-0000 to slave 20140806-150659-16842879-60888-9210-0 at slave(96)@127.0.1.1:60888 (lucid) I0806 15:07:03.693272 9235 status_update_manager.cpp:398] Received status update acknowledgement (UUID: e83bd7a9-a974-443e-9292-9b38ee0c8a8c) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:03.693323 9235 slave.cpp:1673] Status update manager successfully handled status update acknowledgement (UUID: e83bd7a9-a974-443e-9292-9b38ee0c8a8c) for task 1 of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:03.693723 9238 process.cpp:1098] Socket closed while receiving 2014-08-06 15:07:04,190:9210(0x2b245964a700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:36839] zk retcode=-4, errno=111(Connection refused): server refused to accept the client I0806 15:07:04.289959 9234 master.cpp:121] No whitelist given. Advertising offers for all slaves I0806 15:07:04.353030 9234 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 75401ns I0806 15:07:05.360039 9230 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 39386ns I0806 15:07:06.371474 9233 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 125156ns I0806 15:07:07.380746 9233 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 38124ns 2014-08-06 15:07:07,527:9210(0x2b245964a700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:36839] zk retcode=-4, errno=111(Connection refused): server refused to accept the client I0806 15:07:08.391015 9235 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 72442ns I0806 15:07:09.299963 9235 master.cpp:121] No whitelist given. Advertising offers for all slaves I0806 15:07:09.404326 9237 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 44315ns I0806 15:07:10.411216 9236 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 80194ns 2014-08-06 15:07:10,864:9210(0x2b245964a700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:36839] zk retcode=-4, errno=111(Connection refused): server refused to accept the client I0806 15:07:11.420480 9236 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 55269ns I0806 15:07:12.806651 9236 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 107843ns tests/health_check_tests.cpp:336: Failure Failed to wait 10secs for status4 tests/health_check_tests.cpp:311: Failure Actual function call count doesn't match EXPECT_CALL(sched, statusUpdate(&driver, _))... Expected: to be called 6 times Actual: called 4 times - unsatisfied and active I0806 15:07:13.794618 9232 master.cpp:715] Framework 20140806-150659-16842879-60888-9210-0000 disconnected I0806 15:07:13.794680 9232 master.cpp:1586] Deactivating framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:13.794806 9232 master.cpp:737] Giving framework 20140806-150659-16842879-60888-9210-0000 0ns to failover I0806 15:07:13.795145 9232 hierarchical_allocator_process.hpp:402] Deactivated framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:13.795831 9236 master.cpp:3372] Framework failover timeout, removing framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:13.795866 9236 master.cpp:3868] Removing framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:13.796226 9236 master.hpp:806] Removing task 1 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20140806-150659-16842879-60888-9210-0 (lucid) W0806 15:07:13.796470 9236 master.cpp:4311] Removing task 1 of framework 20140806-150659-16842879-60888-9210-0000 and slave 20140806-150659-16842879-60888-9210-0 in non-terminal state TASK_RUNNING I0806 15:07:13.796775 9236 slave.cpp:1406] Asked to shut down framework 20140806-150659-16842879-60888-9210-0000 by master@127.0.1.1:60888 I0806 15:07:13.796835 9236 slave.cpp:1431] Shutting down framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:13.796895 9236 slave.cpp:2855] Shutting down executor '1' of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:13.805369 9235 hierarchical_allocator_process.hpp:560] Recovered cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on slave 20140806-150659-16842879-60888-9210-0 from framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:13.805450 9235 hierarchical_allocator_process.hpp:591] Framework 20140806-150659-16842879-60888-9210-0000 filtered slave 20140806-150659-16842879-60888-9210-0 for 5secs I0806 15:07:13.805552 9235 hierarchical_allocator_process.hpp:357] Removed framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:13.806340 9234 master.cpp:624] Master terminating I0806 15:07:13.813288 9238 process.cpp:1098] Socket closed while receiving I0806 15:07:13.813813 9234 hierarchical_allocator_process.hpp:650] Performed allocation for 1 slaves in 25871ns I0806 15:07:13.819965 9236 slave.cpp:2356] master@127.0.1.1:60888 exited W0806 15:07:13.820013 9236 slave.cpp:2359] Master disconnected! Waiting for a new master to be elected I0806 15:07:13.832304 9235 containerizer.cpp:909] Destroying container '64988b19-cec6-4228-80e5-6e80a271bb8a' I0806 15:07:13.902968 9238 process.cpp:1037] Socket closed while receiving 2014-08-06 15:07:14,199:9210(0x2b245964a700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:36839] zk retcode=-4, errno=111(Connection refused): server refused to accept the client I0806 15:07:14.370625 9234 containerizer.cpp:1025] Executor for container '64988b19-cec6-4228-80e5-6e80a271bb8a' has exited I0806 15:07:14.371238 9234 slave.cpp:2573] Executor '1' of framework 20140806-150659-16842879-60888-9210-0000 terminated with signal Killed I0806 15:07:14.371309 9234 slave.cpp:2709] Cleaning up executor '1' of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:14.371554 9234 slave.cpp:2784] Cleaning up framework 20140806-150659-16842879-60888-9210-0000 E0806 15:07:14.371716 9234 slave.cpp:2843] Failed to unmonitor container for executor 1 of framework 20140806-150659-16842879-60888-9210-0000: Not monitored I0806 15:07:14.371762 9234 gc.cpp:56] Scheduling '/tmp/HealthCheckTest_ConsecutiveFailures_jnWV1D/slaves/20140806-150659-16842879-60888-9210-0/frameworks/20140806-150659-16842879-60888-9210-0000/executors/1/runs/64988b19-cec6-4228-80e5-6e80a271bb8a' for gc 6.99999570068444days in the future I0806 15:07:14.371861 9234 gc.cpp:56] Scheduling '/tmp/HealthCheckTest_ConsecutiveFailures_jnWV1D/slaves/20140806-150659-16842879-60888-9210-0/frameworks/20140806-150659-16842879-60888-9210-0000/executors/1' for gc 6.99999569990815days in the future I0806 15:07:14.371892 9234 gc.cpp:56] Scheduling '/tmp/HealthCheckTest_ConsecutiveFailures_jnWV1D/slaves/20140806-150659-16842879-60888-9210-0/frameworks/20140806-150659-16842879-60888-9210-0000' for gc 6.99999569887704days in the future I0806 15:07:14.371935 9234 status_update_manager.cpp:282] Closing status update streams for framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:14.371968 9234 status_update_manager.cpp:530] Cleaning up status update stream for task 1 of framework 20140806-150659-16842879-60888-9210-0000 I0806 15:07:14.375880 9210 slave.cpp:466] Slave terminating [ FAILED ] HealthCheckTest.ConsecutiveFailures (15139 ms) {noformat} > HealthCheckTest.ConsecutiveFailures is flaky > -------------------------------------------- > > Key: MESOS-1613 > URL: https://issues.apache.org/jira/browse/MESOS-1613 > Project: Mesos > Issue Type: Bug > Components: test > Affects Versions: 0.20.0 > Environment: Ubuntu 10.04 GCC > Reporter: Vinod Kone > Assignee: Timothy Chen > > {code} > [ RUN ] HealthCheckTest.ConsecutiveFailures > Using temporary directory '/tmp/HealthCheckTest_ConsecutiveFailures_AzK0OV' > I0717 04:39:59.288471 5009 leveldb.cpp:176] Opened db in 21.575631ms > I0717 04:39:59.295274 5009 leveldb.cpp:183] Compacted db in 6.471982ms > I0717 04:39:59.295552 5009 leveldb.cpp:198] Created db iterator in 16783ns > I0717 04:39:59.296026 5009 leveldb.cpp:204] Seeked to beginning of db in > 2125ns > I0717 04:39:59.296257 5009 leveldb.cpp:273] Iterated through 0 keys in the > db in 10747ns > I0717 04:39:59.296584 5009 replica.cpp:741] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0717 04:39:59.297322 5033 recover.cpp:425] Starting replica recovery > I0717 04:39:59.297413 5033 recover.cpp:451] Replica is in EMPTY status > I0717 04:39:59.297824 5033 replica.cpp:638] Replica in EMPTY status received > a broadcasted recover request > I0717 04:39:59.297899 5033 recover.cpp:188] Received a recover response from > a replica in EMPTY status > I0717 04:39:59.297997 5033 recover.cpp:542] Updating replica status to > STARTING > I0717 04:39:59.301985 5031 master.cpp:288] Master > 20140717-043959-16842879-40280-5009 (lucid) started on 127.0.1.1:40280 > I0717 04:39:59.302026 5031 master.cpp:325] Master only allowing > authenticated frameworks to register > I0717 04:39:59.302032 5031 master.cpp:330] Master only allowing > authenticated slaves to register > I0717 04:39:59.302039 5031 credentials.hpp:36] Loading credentials for > authentication from > '/tmp/HealthCheckTest_ConsecutiveFailures_AzK0OV/credentials' > I0717 04:39:59.302283 5031 master.cpp:359] Authorization enabled > I0717 04:39:59.302971 5031 hierarchical_allocator_process.hpp:301] > Initializing hierarchical allocator process with master : > master@127.0.1.1:40280 > I0717 04:39:59.303022 5031 master.cpp:122] No whitelist given. Advertising > offers for all slaves > I0717 04:39:59.303390 5033 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 5.325097ms > I0717 04:39:59.303419 5033 replica.cpp:320] Persisted replica status to > STARTING > I0717 04:39:59.304076 5030 master.cpp:1128] The newly elected leader is > master@127.0.1.1:40280 with id 20140717-043959-16842879-40280-5009 > I0717 04:39:59.304095 5030 master.cpp:1141] Elected as the leading master! > I0717 04:39:59.304102 5030 master.cpp:959] Recovering from registrar > I0717 04:39:59.304182 5030 registrar.cpp:313] Recovering registrar > I0717 04:39:59.304635 5033 recover.cpp:451] Replica is in STARTING status > I0717 04:39:59.304962 5033 replica.cpp:638] Replica in STARTING status > received a broadcasted recover request > I0717 04:39:59.305026 5033 recover.cpp:188] Received a recover response from > a replica in STARTING status > I0717 04:39:59.305130 5033 recover.cpp:542] Updating replica status to VOTING > I0717 04:39:59.310416 5033 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 5.204157ms > I0717 04:39:59.310459 5033 replica.cpp:320] Persisted replica status to > VOTING > I0717 04:39:59.310534 5033 recover.cpp:556] Successfully joined the Paxos > group > I0717 04:39:59.310607 5033 recover.cpp:440] Recover process terminated > I0717 04:39:59.310773 5033 log.cpp:656] Attempting to start the writer > I0717 04:39:59.311157 5033 replica.cpp:474] Replica received implicit > promise request with proposal 1 > I0717 04:39:59.313451 5033 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 2.271822ms > I0717 04:39:59.313627 5033 replica.cpp:342] Persisted promised to 1 > I0717 04:39:59.318038 5031 coordinator.cpp:230] Coordinator attemping to > fill missing position > I0717 04:39:59.318430 5031 replica.cpp:375] Replica received explicit > promise request for position 0 with proposal 2 > I0717 04:39:59.323459 5031 leveldb.cpp:343] Persisting action (8 bytes) to > leveldb took 5.004323ms > I0717 04:39:59.323493 5031 replica.cpp:676] Persisted action at 0 > I0717 04:39:59.323799 5031 replica.cpp:508] Replica received write request > for position 0 > I0717 04:39:59.323837 5031 leveldb.cpp:438] Reading position from leveldb > took 21901ns > I0717 04:39:59.329038 5031 leveldb.cpp:343] Persisting action (14 bytes) to > leveldb took 5.175998ms > I0717 04:39:59.329244 5031 replica.cpp:676] Persisted action at 0 > I0717 04:39:59.332335 5037 replica.cpp:655] Replica received learned notice > for position 0 > I0717 04:39:59.337894 5037 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 5.532845ms > I0717 04:39:59.337926 5037 replica.cpp:676] Persisted action at 0 > I0717 04:39:59.337934 5037 replica.cpp:661] Replica learned NOP action at > position 0 > I0717 04:39:59.338152 5037 log.cpp:672] Writer started with ending position 0 > I0717 04:39:59.338438 5037 leveldb.cpp:438] Reading position from leveldb > took 15499ns > I0717 04:39:59.339648 5037 registrar.cpp:346] Successfully fetched the > registry (0B) > I0717 04:39:59.339676 5037 registrar.cpp:422] Attempting to update the > 'registry' > I0717 04:39:59.340885 5037 log.cpp:680] Attempting to append 117 bytes to > the log > I0717 04:39:59.340929 5037 coordinator.cpp:340] Coordinator attempting to > write APPEND action at position 1 > I0717 04:39:59.341130 5037 replica.cpp:508] Replica received write request > for position 1 > I0717 04:39:59.343456 5037 leveldb.cpp:343] Persisting action (134 bytes) to > leveldb took 2.30603ms > I0717 04:39:59.343638 5037 replica.cpp:676] Persisted action at 1 > I0717 04:39:59.352447 5034 replica.cpp:655] Replica received learned notice > for position 1 > I0717 04:39:59.365022 5034 leveldb.cpp:343] Persisting action (136 bytes) to > leveldb took 12.534333ms > I0717 04:39:59.365072 5034 replica.cpp:676] Persisted action at 1 > I0717 04:39:59.365082 5034 replica.cpp:661] Replica learned APPEND action at > position 1 > I0717 04:39:59.365470 5034 registrar.cpp:479] Successfully updated 'registry' > I0717 04:39:59.365520 5034 registrar.cpp:372] Successfully recovered > registrar > I0717 04:39:59.365578 5034 log.cpp:699] Attempting to truncate the log to 1 > I0717 04:39:59.365656 5034 master.cpp:986] Recovered 0 slaves from the > Registry (81B) ; allowing 10mins for slaves to re-register > I0717 04:39:59.365696 5034 coordinator.cpp:340] Coordinator attempting to > write TRUNCATE action at position 2 > I0717 04:39:59.366221 5030 replica.cpp:508] Replica received write request > for position 2 > I0717 04:39:59.374423 5030 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 8.172997ms > I0717 04:39:59.374459 5030 replica.cpp:676] Persisted action at 2 > I0717 04:39:59.374671 5030 replica.cpp:655] Replica received learned notice > for position 2 > I0717 04:39:59.383867 5030 leveldb.cpp:343] Persisting action (18 bytes) to > leveldb took 9.168201ms > I0717 04:39:59.383939 5030 leveldb.cpp:401] Deleting ~1 keys from leveldb > took 33989ns > I0717 04:39:59.383952 5030 replica.cpp:676] Persisted action at 2 > I0717 04:39:59.383960 5030 replica.cpp:661] Replica learned TRUNCATE action > at position 2 > I0717 04:39:59.387097 5009 containerizer.cpp:124] Using isolation: > posix/cpu,posix/mem > I0717 04:39:59.391551 5009 sched.cpp:139] Version: 0.20.0 > I0717 04:39:59.392025 5036 slave.cpp:168] Slave started on > 116)@127.0.1.1:40280 > I0717 04:39:59.392042 5036 credentials.hpp:84] Loading credential for > authentication from > '/tmp/HealthCheckTest_ConsecutiveFailures_sOluTu/credential' > I0717 04:39:59.392148 5036 slave.cpp:266] Slave using credential for: > test-principal > I0717 04:39:59.392262 5036 slave.cpp:279] Slave resources: cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I0717 04:39:59.392325 5036 slave.cpp:324] Slave hostname: lucid > I0717 04:39:59.392336 5036 slave.cpp:325] Slave checkpoint: false > I0717 04:39:59.392853 5036 state.cpp:33] Recovering state from > '/tmp/HealthCheckTest_ConsecutiveFailures_sOluTu/meta' > I0717 04:39:59.393005 5036 sched.cpp:235] New master detected at > master@127.0.1.1:40280 > I0717 04:39:59.393023 5036 sched.cpp:285] Authenticating with master > master@127.0.1.1:40280 > I0717 04:39:59.393102 5036 status_update_manager.cpp:193] Recovering status > update manager > I0717 04:39:59.393179 5036 authenticatee.hpp:128] Creating new client SASL > connection > I0717 04:39:59.393326 5036 master.cpp:3512] Authenticating > scheduler-37c02322-fe0b-4cd3-b1b5-b5aab929c19c@127.0.1.1:40280 > I0717 04:39:59.393432 5036 containerizer.cpp:287] Recovering containerizer > I0717 04:39:59.393569 5036 authenticator.hpp:156] Creating new server SASL > connection > I0717 04:39:59.393748 5036 authenticatee.hpp:219] Received SASL > authentication mechanisms: CRAM-MD5 > I0717 04:39:59.393764 5036 authenticatee.hpp:245] Attempting to authenticate > with mechanism 'CRAM-MD5' > I0717 04:39:59.393842 5036 authenticator.hpp:262] Received SASL > authentication start > I0717 04:39:59.393888 5036 authenticator.hpp:384] Authentication requires > more steps > I0717 04:39:59.393934 5036 slave.cpp:3127] Finished recovery > I0717 04:39:59.394132 5036 authenticatee.hpp:265] Received SASL > authentication step > I0717 04:39:59.394208 5036 authenticator.hpp:290] Received SASL > authentication step > I0717 04:39:59.394234 5036 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'lucid' server FQDN: 'lucid' > SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false > I0717 04:39:59.394243 5036 auxprop.cpp:153] Looking up auxiliary property > '*userPassword' > I0717 04:39:59.394255 5036 auxprop.cpp:153] Looking up auxiliary property > '*cmusaslsecretCRAM-MD5' > I0717 04:39:59.394268 5036 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'lucid' server FQDN: 'lucid' > SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true > I0717 04:39:59.394274 5036 auxprop.cpp:103] Skipping auxiliary property > '*userPassword' since SASL_AUXPROP_AUTHZID == true > I0717 04:39:59.394279 5036 auxprop.cpp:103] Skipping auxiliary property > '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true > I0717 04:39:59.394290 5036 authenticator.hpp:376] Authentication success > I0717 04:39:59.394330 5036 slave.cpp:600] New master detected at > master@127.0.1.1:40280 > I0717 04:39:59.394348 5036 slave.cpp:676] Authenticating with master > master@127.0.1.1:40280 > I0717 04:39:59.394392 5036 slave.cpp:649] Detecting new master > I0717 04:39:59.394434 5036 authenticatee.hpp:305] Authentication success > I0717 04:39:59.394461 5036 master.cpp:3552] Successfully authenticated > principal 'test-principal' at > scheduler-37c02322-fe0b-4cd3-b1b5-b5aab929c19c@127.0.1.1:40280 > I0717 04:39:59.394516 5036 status_update_manager.cpp:167] New master > detected at master@127.0.1.1:40280 > I0717 04:39:59.394542 5036 authenticatee.hpp:128] Creating new client SASL > connection > I0717 04:39:59.394605 5036 sched.cpp:359] Successfully authenticated with > master master@127.0.1.1:40280 > I0717 04:39:59.394618 5036 sched.cpp:478] Sending registration request to > master@127.0.1.1:40280 > I0717 04:39:59.394666 5036 master.cpp:3512] Authenticating > slave(116)@127.0.1.1:40280 > I0717 04:39:59.394728 5036 master.cpp:1247] Received registration request > from scheduler-37c02322-fe0b-4cd3-b1b5-b5aab929c19c@127.0.1.1:40280 > I0717 04:39:59.394757 5036 master.cpp:1207] Authorizing framework principal > 'test-principal' to receive offers for role '*' > I0717 04:39:59.394816 5036 authenticator.hpp:156] Creating new server SASL > connection > I0717 04:39:59.394917 5036 authenticatee.hpp:219] Received SASL > authentication mechanisms: CRAM-MD5 > I0717 04:39:59.394932 5036 authenticatee.hpp:245] Attempting to authenticate > with mechanism 'CRAM-MD5' > I0717 04:39:59.394989 5036 master.cpp:1306] Registering framework > 20140717-043959-16842879-40280-5009-0000 at > scheduler-37c02322-fe0b-4cd3-b1b5-b5aab929c19c@127.0.1.1:40280 > I0717 04:39:59.395094 5036 authenticator.hpp:262] Received SASL > authentication start > I0717 04:39:59.395126 5036 authenticator.hpp:384] Authentication requires > more steps > I0717 04:39:59.395166 5036 sched.cpp:409] Framework registered with > 20140717-043959-16842879-40280-5009-0000 > I0717 04:39:59.395187 5036 sched.cpp:423] Scheduler::registered took 9209ns > I0717 04:39:59.395221 5036 hierarchical_allocator_process.hpp:331] Added > framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:39:59.395228 5036 hierarchical_allocator_process.hpp:724] No > resources available to allocate! > I0717 04:39:59.395234 5036 hierarchical_allocator_process.hpp:686] Performed > allocation for 0 slaves in 5941ns > I0717 04:39:59.395278 5036 authenticatee.hpp:265] Received SASL > authentication step > I0717 04:39:59.395310 5036 authenticator.hpp:290] Received SASL > authentication step > I0717 04:39:59.395323 5036 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'lucid' server FQDN: 'lucid' > SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false > I0717 04:39:59.395329 5036 auxprop.cpp:153] Looking up auxiliary property > '*userPassword' > I0717 04:39:59.395336 5036 auxprop.cpp:153] Looking up auxiliary property > '*cmusaslsecretCRAM-MD5' > I0717 04:39:59.395344 5036 auxprop.cpp:81] Request to lookup properties for > user: 'test-principal' realm: 'lucid' server FQDN: 'lucid' > SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true > I0717 04:39:59.395349 5036 auxprop.cpp:103] Skipping auxiliary property > '*userPassword' since SASL_AUXPROP_AUTHZID == true > I0717 04:39:59.395354 5036 auxprop.cpp:103] Skipping auxiliary property > '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true > I0717 04:39:59.395362 5036 authenticator.hpp:376] Authentication success > I0717 04:39:59.395385 5036 authenticatee.hpp:305] Authentication success > I0717 04:39:59.395402 5036 master.cpp:3552] Successfully authenticated > principal 'test-principal' at slave(116)@127.0.1.1:40280 > I0717 04:39:59.395457 5036 slave.cpp:733] Successfully authenticated with > master master@127.0.1.1:40280 > I0717 04:39:59.395503 5036 slave.cpp:971] Will retry registration in > 15.135662ms if necessary > I0717 04:39:59.395601 5037 master.cpp:2789] Registering slave at > slave(116)@127.0.1.1:40280 (lucid) with id > 20140717-043959-16842879-40280-5009-0 > I0717 04:39:59.395699 5037 registrar.cpp:422] Attempting to update the > 'registry' > I0717 04:39:59.396891 5031 log.cpp:680] Attempting to append 287 bytes to > the log > I0717 04:39:59.396939 5031 coordinator.cpp:340] Coordinator attempting to > write APPEND action at position 3 > I0717 04:39:59.397140 5031 replica.cpp:508] Replica received write request > for position 3 > I0717 04:39:59.405050 5031 leveldb.cpp:343] Persisting action (306 bytes) to > leveldb took 7.878137ms > I0717 04:39:59.405097 5031 replica.cpp:676] Persisted action at 3 > I0717 04:39:59.405342 5031 replica.cpp:655] Replica received learned notice > for position 3 > I0717 04:39:59.412286 5037 slave.cpp:971] Will retry registration in > 9.096666ms if necessary > I0717 04:39:59.412365 5037 master.cpp:2777] Ignoring register slave message > from slave(116)@127.0.1.1:40280 (lucid) as admission is already in progress > I0717 04:39:59.414619 5031 leveldb.cpp:343] Persisting action (308 bytes) to > leveldb took 9.256058ms > I0717 04:39:59.414646 5031 replica.cpp:676] Persisted action at 3 > I0717 04:39:59.414654 5031 replica.cpp:661] Replica learned APPEND action at > position 3 > I0717 04:39:59.415129 5031 registrar.cpp:479] Successfully updated 'registry' > I0717 04:39:59.415212 5031 log.cpp:699] Attempting to truncate the log to 3 > I0717 04:39:59.415269 5031 master.cpp:2829] Registered slave > 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid) > I0717 04:39:59.415293 5031 master.cpp:3980] Adding slave > 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid) > with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] > I0717 04:39:59.415415 5031 coordinator.cpp:340] Coordinator attempting to > write TRUNCATE action at position 4 > I0717 04:39:59.415485 5031 slave.cpp:767] Registered with master > master@127.0.1.1:40280; given slave ID 20140717-043959-16842879-40280-5009-0 > I0717 04:39:59.415550 5031 hierarchical_allocator_process.hpp:444] Added > slave 20140717-043959-16842879-40280-5009-0 (lucid) with cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] available) > I0717 04:39:59.415602 5031 hierarchical_allocator_process.hpp:750] Offering > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave > 20140717-043959-16842879-40280-5009-0 to framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:39:59.415699 5031 hierarchical_allocator_process.hpp:706] Performed > allocation for slave 20140717-043959-16842879-40280-5009-0 in 119984ns > I0717 04:39:59.415757 5031 slave.cpp:2324] Received ping from > slave-observer(104)@127.0.1.1:40280 > I0717 04:39:59.415824 5031 master.hpp:800] Adding offer > 20140717-043959-16842879-40280-5009-0 with resources cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] on slave > 20140717-043959-16842879-40280-5009-0 (lucid) > I0717 04:39:59.415864 5031 master.cpp:3459] Sending 1 offers to framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:39:59.416179 5033 replica.cpp:508] Replica received write request > for position 4 > I0717 04:39:59.416697 5031 sched.cpp:546] Scheduler::resourceOffers took > 34644ns > I0717 04:39:59.417361 5037 master.hpp:810] Removing offer > 20140717-043959-16842879-40280-5009-0 with resources cpus(*):2; mem(*):1024; > disk(*):1024; ports(*):[31000-32000] on slave > 20140717-043959-16842879-40280-5009-0 (lucid) > I0717 04:39:59.417414 5037 master.cpp:2133] Processing reply for offers: [ > 20140717-043959-16842879-40280-5009-0 ] on slave > 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid) > for framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:39:59.417453 5037 master.cpp:2219] Authorizing framework principal > 'test-principal' to launch task 1 as user 'jenkins' > I0717 04:39:59.417733 5037 master.hpp:772] Adding task 1 with resources > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave > 20140717-043959-16842879-40280-5009-0 (lucid) > I0717 04:39:59.417768 5037 master.cpp:2285] Launching task 1 of framework > 20140717-043959-16842879-40280-5009-0000 with resources cpus(*):2; > mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave > 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid) > I0717 04:39:59.417913 5037 slave.cpp:1002] Got assigned task 1 for framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:39:59.418223 5037 slave.cpp:1112] Launching task 1 for framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:39:59.419910 5037 slave.cpp:1222] Queuing task '1' for executor 1 > of framework '20140717-043959-16842879-40280-5009-0000 > I0717 04:39:59.420047 5037 containerizer.cpp:427] Starting container > '31868c4e-911b-4198-88b2-2ec5a7022adb' for executor '1' of framework > '20140717-043959-16842879-40280-5009-0000' > I0717 04:39:59.420281 5037 slave.cpp:561] Successfully attached file > '/tmp/HealthCheckTest_ConsecutiveFailures_sOluTu/slaves/20140717-043959-16842879-40280-5009-0/frameworks/20140717-043959-16842879-40280-5009-0000/executors/1/runs/31868c4e-911b-4198-88b2-2ec5a7022adb' > I0717 04:39:59.424384 5033 leveldb.cpp:343] Persisting action (16 bytes) to > leveldb took 8.154005ms > I0717 04:39:59.424438 5033 replica.cpp:676] Persisted action at 4 > I0717 04:39:59.424990 5035 replica.cpp:655] Replica received learned notice > for position 4 > I0717 04:39:59.425606 5037 launcher.cpp:137] Forked child with pid '5746' > for container '31868c4e-911b-4198-88b2-2ec5a7022adb' > I0717 04:39:59.428802 5037 containerizer.cpp:537] Fetching URIs for > container '31868c4e-911b-4198-88b2-2ec5a7022adb' using command > '/var/jenkins/workspace/mesos-ubuntu-10.04-gcc/src/mesos-fetcher' > I0717 04:39:59.434170 5035 leveldb.cpp:343] Persisting action (18 bytes) to > leveldb took 9.084582ms > I0717 04:39:59.434267 5035 leveldb.cpp:401] Deleting ~2 keys from leveldb > took 45795ns > I0717 04:39:59.434283 5035 replica.cpp:676] Persisted action at 4 > I0717 04:39:59.434299 5035 replica.cpp:661] Replica learned TRUNCATE action > at position 4 > 2014-07-17 > 04:40:00,248:5009(0x2ad17a97f700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:46988] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > I0717 04:40:00.276789 5035 slave.cpp:2469] Monitoring executor '1' of > framework '20140717-043959-16842879-40280-5009-0000' in container > '31868c4e-911b-4198-88b2-2ec5a7022adb' > I0717 04:40:00.312453 5035 hierarchical_allocator_process.hpp:686] Performed > allocation for 1 slaves in 43185ns > I0717 04:40:00.318514 5035 slave.cpp:1733] Got registration for executor '1' > of framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:00.319555 5035 slave.cpp:1852] Flushing queued task 1 for > executor '1' of framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:00.320169 5038 process.cpp:1098] Socket closed while receiving > I0717 04:40:00.325037 5035 slave.cpp:2087] Handling status update > TASK_RUNNING (UUID: a769a032-5aa7-41fb-9830-d6745df0e18e) for task 1 of > framework 20140717-043959-16842879-40280-5009-0000 from > executor(1)@127.0.1.1:59033 > I0717 04:40:00.325140 5035 status_update_manager.cpp:320] Received status > update TASK_RUNNING (UUID: a769a032-5aa7-41fb-9830-d6745df0e18e) for task 1 > of framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:00.325155 5035 status_update_manager.cpp:499] Creating > StatusUpdate stream for task 1 of framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:00.325295 5035 status_update_manager.cpp:373] Forwarding status > update TASK_RUNNING (UUID: a769a032-5aa7-41fb-9830-d6745df0e18e) for task 1 > of framework 20140717-043959-16842879-40280-5009-0000 to > master@127.0.1.1:40280 > I0717 04:40:00.325455 5035 master.cpp:3120] Status update TASK_RUNNING > (UUID: a769a032-5aa7-41fb-9830-d6745df0e18e) for task 1 of framework > 20140717-043959-16842879-40280-5009-0000 from slave > 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid) > I0717 04:40:00.325495 5035 slave.cpp:2245] Status update manager > successfully handled status update TASK_RUNNING (UUID: > a769a032-5aa7-41fb-9830-d6745df0e18e) for task 1 of framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:00.325506 5035 slave.cpp:2251] Sending acknowledgement for > status update TASK_RUNNING (UUID: a769a032-5aa7-41fb-9830-d6745df0e18e) for > task 1 of framework 20140717-043959-16842879-40280-5009-0000 to > executor(1)@127.0.1.1:59033 > I0717 04:40:00.325717 5035 sched.cpp:637] Scheduler::statusUpdate took > 31980ns > I0717 04:40:00.325794 5035 master.cpp:2639] Forwarding status update > acknowledgement a769a032-5aa7-41fb-9830-d6745df0e18e for task 1 of framework > 20140717-043959-16842879-40280-5009-0000 to slave > 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid) > I0717 04:40:00.325899 5035 status_update_manager.cpp:398] Received status > update acknowledgement (UUID: a769a032-5aa7-41fb-9830-d6745df0e18e) for task > 1 of framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:00.325950 5035 slave.cpp:1673] Status update manager > successfully handled status update acknowledgement (UUID: > a769a032-5aa7-41fb-9830-d6745df0e18e) for task 1 of framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:00.326431 5038 process.cpp:1098] Socket closed while receiving > I0717 04:40:01.322517 5031 hierarchical_allocator_process.hpp:686] Performed > allocation for 1 slaves in 42022ns > I0717 04:40:01.376111 5031 slave.cpp:2087] Handling status update > TASK_RUNNING (UUID: 8e06aa77-eb92-486e-a20f-0d67752a7754) for task 1 in > health state unhealthy of framework 20140717-043959-16842879-40280-5009-0000 > from executor(1)@127.0.1.1:59033 > I0717 04:40:01.376238 5031 status_update_manager.cpp:320] Received status > update TASK_RUNNING (UUID: 8e06aa77-eb92-486e-a20f-0d67752a7754) for task 1 > in health state unhealthy of framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:01.376266 5031 status_update_manager.cpp:373] Forwarding status > update TASK_RUNNING (UUID: 8e06aa77-eb92-486e-a20f-0d67752a7754) for task 1 > in health state unhealthy of framework > 20140717-043959-16842879-40280-5009-0000 to master@127.0.1.1:40280 > I0717 04:40:01.376397 5031 master.cpp:3120] Status update TASK_RUNNING > (UUID: 8e06aa77-eb92-486e-a20f-0d67752a7754) for task 1 in health state > unhealthy of framework 20140717-043959-16842879-40280-5009-0000 from slave > 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid) > I0717 04:40:01.376454 5031 slave.cpp:2245] Status update manager > successfully handled status update TASK_RUNNING (UUID: > 8e06aa77-eb92-486e-a20f-0d67752a7754) for task 1 in health state unhealthy of > framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:01.376466 5031 slave.cpp:2251] Sending acknowledgement for > status update TASK_RUNNING (UUID: 8e06aa77-eb92-486e-a20f-0d67752a7754) for > task 1 in health state unhealthy of framework > 20140717-043959-16842879-40280-5009-0000 to executor(1)@127.0.1.1:59033 > I0717 04:40:01.376648 5031 sched.cpp:637] Scheduler::statusUpdate took > 18560ns > I0717 04:40:01.376713 5031 master.cpp:2639] Forwarding status update > acknowledgement 8e06aa77-eb92-486e-a20f-0d67752a7754 for task 1 of framework > 20140717-043959-16842879-40280-5009-0000 to slave > 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid) > I0717 04:40:01.376857 5035 status_update_manager.cpp:398] Received status > update acknowledgement (UUID: 8e06aa77-eb92-486e-a20f-0d67752a7754) for task > 1 of framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:01.376911 5035 slave.cpp:1673] Status update manager > successfully handled status update acknowledgement (UUID: > 8e06aa77-eb92-486e-a20f-0d67752a7754) for task 1 of framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:01.377177 5038 process.cpp:1098] Socket closed while receiving > 2014-07-17 > 04:40:04,062:5009(0x2ad17a97f700):ZOO_WARN@zookeeper_interest@1557: Exceeded > deadline by 483ms > 2014-07-17 > 04:40:04,062:5009(0x2ad17a97f700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:46988] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > I0717 04:40:04.063637 5030 hierarchical_allocator_process.hpp:686] Performed > allocation for 1 slaves in 51405ns > I0717 04:40:04.134610 5034 slave.cpp:2087] Handling status update > TASK_RUNNING (UUID: cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task 1 in > health state unhealthy of framework 20140717-043959-16842879-40280-5009-0000 > from executor(1)@127.0.1.1:59033 > I0717 04:40:04.134814 5034 status_update_manager.cpp:320] Received status > update TASK_RUNNING (UUID: cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task 1 > in health state unhealthy of framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:04.134857 5034 status_update_manager.cpp:373] Forwarding status > update TASK_RUNNING (UUID: cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task 1 > in health state unhealthy of framework > 20140717-043959-16842879-40280-5009-0000 to master@127.0.1.1:40280 > I0717 04:40:04.135036 5034 master.cpp:3120] Status update TASK_RUNNING > (UUID: cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task 1 in health state > unhealthy of framework 20140717-043959-16842879-40280-5009-0000 from slave > 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid) > I0717 04:40:04.135073 5034 slave.cpp:2245] Status update manager > successfully handled status update TASK_RUNNING (UUID: > cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task 1 in health state unhealthy of > framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:04.135083 5034 slave.cpp:2251] Sending acknowledgement for > status update TASK_RUNNING (UUID: cc04d02c-753e-4fc8-b35a-61cc33b4d796) for > task 1 in health state unhealthy of framework > 20140717-043959-16842879-40280-5009-0000 to executor(1)@127.0.1.1:59033 > I0717 04:40:04.135267 5034 sched.cpp:637] Scheduler::statusUpdate took > 27539ns > I0717 04:40:04.135339 5034 master.cpp:2639] Forwarding status update > acknowledgement cc04d02c-753e-4fc8-b35a-61cc33b4d796 for task 1 of framework > 20140717-043959-16842879-40280-5009-0000 to slave > 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid) > I0717 04:40:04.135452 5034 status_update_manager.cpp:398] Received status > update acknowledgement (UUID: cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task > 1 of framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:04.135503 5034 slave.cpp:1673] Status update manager > successfully handled status update acknowledgement (UUID: > cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task 1 of framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:04.135925 5038 process.cpp:1098] Socket closed while receiving > I0717 04:40:04.312355 5036 master.cpp:122] No whitelist given. Advertising > offers for all slaves > I0717 04:40:05.072487 5036 hierarchical_allocator_process.hpp:686] Performed > allocation for 1 slaves in 49651ns > I0717 04:40:05.084954 5036 slave.cpp:2087] Handling status update > TASK_RUNNING (UUID: 5ec41670-9320-4ecb-9767-6c020d771cd9) for task 1 in > health state unhealthy of framework 20140717-043959-16842879-40280-5009-0000 > from executor(1)@127.0.1.1:59033 > I0717 04:40:05.085073 5036 status_update_manager.cpp:320] Received status > update TASK_RUNNING (UUID: 5ec41670-9320-4ecb-9767-6c020d771cd9) for task 1 > in health state unhealthy of framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:05.085100 5036 status_update_manager.cpp:373] Forwarding status > update TASK_RUNNING (UUID: 5ec41670-9320-4ecb-9767-6c020d771cd9) for task 1 > in health state unhealthy of framework > 20140717-043959-16842879-40280-5009-0000 to master@127.0.1.1:40280 > I0717 04:40:05.085227 5036 master.cpp:3120] Status update TASK_RUNNING > (UUID: 5ec41670-9320-4ecb-9767-6c020d771cd9) for task 1 in health state > unhealthy of framework 20140717-043959-16842879-40280-5009-0000 from slave > 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid) > I0717 04:40:05.085263 5036 slave.cpp:2245] Status update manager > successfully handled status update TASK_RUNNING (UUID: > 5ec41670-9320-4ecb-9767-6c020d771cd9) for task 1 in health state unhealthy of > framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:05.085273 5036 slave.cpp:2251] Sending acknowledgement for > status update TASK_RUNNING (UUID: 5ec41670-9320-4ecb-9767-6c020d771cd9) for > task 1 in health state unhealthy of framework > 20140717-043959-16842879-40280-5009-0000 to executor(1)@127.0.1.1:59033 > I0717 04:40:05.087087 5037 sched.cpp:637] Scheduler::statusUpdate took > 19205ns > I0717 04:40:05.087158 5030 master.cpp:2639] Forwarding status update > acknowledgement 5ec41670-9320-4ecb-9767-6c020d771cd9 for task 1 of framework > 20140717-043959-16842879-40280-5009-0000 to slave > 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid) > I0717 04:40:05.087779 5038 process.cpp:1098] Socket closed while receiving > I0717 04:40:05.088122 5032 status_update_manager.cpp:398] Received status > update acknowledgement (UUID: 5ec41670-9320-4ecb-9767-6c020d771cd9) for task > 1 of framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:05.088302 5036 slave.cpp:1673] Status update manager > successfully handled status update acknowledgement (UUID: > 5ec41670-9320-4ecb-9767-6c020d771cd9) for task 1 of framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:06.082578 5031 hierarchical_allocator_process.hpp:686] Performed > allocation for 1 slaves in 71517ns > I0717 04:40:07.092687 5037 hierarchical_allocator_process.hpp:686] Performed > allocation for 1 slaves in 34535ns > 2014-07-17 > 04:40:07,399:5009(0x2ad17a97f700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:46988] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > I0717 04:40:08.102656 5035 hierarchical_allocator_process.hpp:686] Performed > allocation for 1 slaves in 70999ns > I0717 04:40:09.112658 5031 hierarchical_allocator_process.hpp:686] Performed > allocation for 1 slaves in 35887ns > I0717 04:40:09.322443 5037 master.cpp:122] No whitelist given. Advertising > offers for all slaves > 2014-07-17 > 04:40:11,494:5009(0x2ad17a97f700):ZOO_WARN@zookeeper_interest@1557: Exceeded > deadline by 764ms > 2014-07-17 > 04:40:11,495:5009(0x2ad17a97f700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:46988] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > I0717 04:40:11.499094 5037 hierarchical_allocator_process.hpp:686] Performed > allocation for 1 slaves in 46842ns > I0717 04:40:12.502545 5034 hierarchical_allocator_process.hpp:686] Performed > allocation for 1 slaves in 34124ns > I0717 04:40:13.512511 5032 hierarchical_allocator_process.hpp:686] Performed > allocation for 1 slaves in 33974ns > I0717 04:40:14.332535 5031 master.cpp:122] No whitelist given. Advertising > offers for all slaves > I0717 04:40:14.422514 5037 slave.cpp:2324] Received ping from > slave-observer(104)@127.0.1.1:40280 > I0717 04:40:14.522536 5037 hierarchical_allocator_process.hpp:686] Performed > allocation for 1 slaves in 46113ns > 2014-07-17 > 04:40:14,832:5009(0x2ad17a97f700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:46988] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > tests/health_check_tests.cpp:247: Failure > Failed to wait 10secs for status4 > tests/health_check_tests.cpp:222: Failure > Actual function call count doesn't match EXPECT_CALL(sched, > statusUpdate(&driver, _))... > Expected: to be called 6 times > Actual: called 4 times - unsatisfied and active > I0717 04:40:15.093816 5009 master.cpp:625] Master terminating > I0717 04:40:15.093973 5009 master.hpp:790] Removing task 1 with resources > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave > 20140717-043959-16842879-40280-5009-0 (lucid) > W0717 04:40:15.094081 5009 master.cpp:4266] Removing task 1 of framework > 20140717-043959-16842879-40280-5009-0000 and slave > 20140717-043959-16842879-40280-5009-0 in non-terminal state TASK_RUNNING > I0717 04:40:15.097363 5036 slave.cpp:2331] master@127.0.1.1:40280 exited > W0717 04:40:15.097385 5036 slave.cpp:2334] Master disconnected! Waiting for > a new master to be elected > I0717 04:40:15.097913 5032 containerizer.cpp:903] Destroying container > '31868c4e-911b-4198-88b2-2ec5a7022adb' > I0717 04:40:15.146953 5038 process.cpp:1037] Socket closed while receiving > I0717 04:40:15.532558 5035 containerizer.cpp:1019] Executor for container > '31868c4e-911b-4198-88b2-2ec5a7022adb' has exited > I0717 04:40:15.532965 5035 slave.cpp:2527] Executor '1' of framework > 20140717-043959-16842879-40280-5009-0000 terminated with signal Killed > I0717 04:40:15.534054 5035 slave.cpp:2087] Handling status update > TASK_FAILED (UUID: 360ed575-2fac-4a2b-9a5f-794f703546bf) for task 1 of > framework 20140717-043959-16842879-40280-5009-0000 from @0.0.0.0:0 > I0717 04:40:15.534097 5035 slave.cpp:3769] Terminating task 1 > W0717 04:40:15.534314 5035 containerizer.cpp:809] Ignoring update for > unknown container: 31868c4e-911b-4198-88b2-2ec5a7022adb > I0717 04:40:15.534819 5030 status_update_manager.cpp:320] Received status > update TASK_FAILED (UUID: 360ed575-2fac-4a2b-9a5f-794f703546bf) for task 1 of > framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:15.534883 5030 status_update_manager.cpp:373] Forwarding status > update TASK_FAILED (UUID: 360ed575-2fac-4a2b-9a5f-794f703546bf) for task 1 of > framework 20140717-043959-16842879-40280-5009-0000 to master@127.0.1.1:40280 > I0717 04:40:15.535047 5030 slave.cpp:2245] Status update manager > successfully handled status update TASK_FAILED (UUID: > 360ed575-2fac-4a2b-9a5f-794f703546bf) for task 1 of framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:15.535524 5009 slave.cpp:484] Slave terminating > I0717 04:40:15.535552 5009 slave.cpp:1406] Asked to shut down framework > 20140717-043959-16842879-40280-5009-0000 by @0.0.0.0:0 > I0717 04:40:15.535562 5009 slave.cpp:1431] Shutting down framework > 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:15.535588 5009 slave.cpp:2661] Cleaning up executor '1' of > framework 20140717-043959-16842879-40280-5009-0000 > I0717 04:40:15.535840 5009 slave.cpp:2736] Cleaning up framework > 20140717-043959-16842879-40280-5009-0000 > [ FAILED ] HealthCheckTest.ConsecutiveFailures (16270 ms) > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)