[jira] [Commented] (MESOS-1613) HealthCheckTest.ConsecutiveFailures is flaky

2015-12-09 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049850#comment-15049850
 ] 

Benjamin Mahler commented on MESOS-1613:


For posterity: I also wasn't able to reproduce this just by running the test in 
repetition. However, when I ran one {{openssl speed}} per core on my laptop to 
induce load, I could reproduce it easily. We probably want to direct folks to 
try this when they are having trouble reproducing a flaky test from CI.
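
The recipe above can be sketched as a small script: start one {{openssl speed}} worker per core, then loop the test under that load. The runner path ({{bin/mesos-tests.sh}}) is an assumption, adjust it to your build tree; the {{--gtest_*}} flags are standard googletest options.

```shell
# Saturate every core, then loop the flaky test until it breaks.
NCPU="$(getconf _NPROCESSORS_ONLN)"

PIDS=""
for _ in $(seq 1 "$NCPU"); do
  openssl speed >/dev/null 2>&1 &   # one busy benchmark worker per core
  PIDS="$PIDS $!"
done

# Repeat the test under load; stop on the first failure.
if [ -x ./bin/mesos-tests.sh ]; then
  ./bin/mesos-tests.sh \
    --gtest_filter='HealthCheckTest.ConsecutiveFailures' \
    --gtest_repeat=100 \
    --gtest_break_on_failure
fi

kill $PIDS 2>/dev/null || true      # tear down the load generators
```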

I will post a fix through MESOS-4106.

> HealthCheckTest.ConsecutiveFailures is flaky
> 
>
> Key: MESOS-1613
> URL: https://issues.apache.org/jira/browse/MESOS-1613
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.20.0
> Environment: Ubuntu 10.04 GCC
>Reporter: Vinod Kone
>Assignee: Timothy Chen
>  Labels: flaky, mesosphere
>
> {code}
> [ RUN  ] HealthCheckTest.ConsecutiveFailures
> Using temporary directory '/tmp/HealthCheckTest_ConsecutiveFailures_AzK0OV'
> I0717 04:39:59.288471  5009 leveldb.cpp:176] Opened db in 21.575631ms
> I0717 04:39:59.295274  5009 leveldb.cpp:183] Compacted db in 6.471982ms
> I0717 04:39:59.295552  5009 leveldb.cpp:198] Created db iterator in 16783ns
> I0717 04:39:59.296026  5009 leveldb.cpp:204] Seeked to beginning of db in 
> 2125ns
> I0717 04:39:59.296257  5009 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 10747ns
> I0717 04:39:59.296584  5009 replica.cpp:741] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0717 04:39:59.297322  5033 recover.cpp:425] Starting replica recovery
> I0717 04:39:59.297413  5033 recover.cpp:451] Replica is in EMPTY status
> I0717 04:39:59.297824  5033 replica.cpp:638] Replica in EMPTY status received 
> a broadcasted recover request
> I0717 04:39:59.297899  5033 recover.cpp:188] Received a recover response from 
> a replica in EMPTY status
> I0717 04:39:59.297997  5033 recover.cpp:542] Updating replica status to 
> STARTING
> I0717 04:39:59.301985  5031 master.cpp:288] Master 
> 20140717-043959-16842879-40280-5009 (lucid) started on 127.0.1.1:40280
> I0717 04:39:59.302026  5031 master.cpp:325] Master only allowing 
> authenticated frameworks to register
> I0717 04:39:59.302032  5031 master.cpp:330] Master only allowing 
> authenticated slaves to register
> I0717 04:39:59.302039  5031 credentials.hpp:36] Loading credentials for 
> authentication from 
> '/tmp/HealthCheckTest_ConsecutiveFailures_AzK0OV/credentials'
> I0717 04:39:59.302283  5031 master.cpp:359] Authorization enabled
> I0717 04:39:59.302971  5031 hierarchical_allocator_process.hpp:301] 
> Initializing hierarchical allocator process with master : 
> master@127.0.1.1:40280
> I0717 04:39:59.303022  5031 master.cpp:122] No whitelist given. Advertising 
> offers for all slaves
> I0717 04:39:59.303390  5033 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 5.325097ms
> I0717 04:39:59.303419  5033 replica.cpp:320] Persisted replica status to 
> STARTING
> I0717 04:39:59.304076  5030 master.cpp:1128] The newly elected leader is 
> master@127.0.1.1:40280 with id 20140717-043959-16842879-40280-5009
> I0717 04:39:59.304095  5030 master.cpp:1141] Elected as the leading master!
> I0717 04:39:59.304102  5030 master.cpp:959] Recovering from registrar
> I0717 04:39:59.304182  5030 registrar.cpp:313] Recovering registrar
> I0717 04:39:59.304635  5033 recover.cpp:451] Replica is in STARTING status
> I0717 04:39:59.304962  5033 replica.cpp:638] Replica in STARTING status 
> received a broadcasted recover request
> I0717 04:39:59.305026  5033 recover.cpp:188] Received a recover response from 
> a replica in STARTING status
> I0717 04:39:59.305130  5033 recover.cpp:542] Updating replica status to VOTING
> I0717 04:39:59.310416  5033 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 5.204157ms
> I0717 04:39:59.310459  5033 replica.cpp:320] Persisted replica status to 
> VOTING
> I0717 04:39:59.310534  5033 recover.cpp:556] Successfully joined the Paxos 
> group
> I0717 04:39:59.310607  5033 recover.cpp:440] Recover process terminated
> I0717 04:39:59.310773  5033 log.cpp:656] Attempting to start the writer
> I0717 04:39:59.311157  5033 replica.cpp:474] Replica received implicit 
> promise request with proposal 1
> I0717 04:39:59.313451  5033 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 2.271822ms
> I0717 04:39:59.313627  5033 replica.cpp:342] Persisted promised to 1
> I0717 04:39:59.318038  5031 coordinator.cpp:230] Coordinator attemping to 
> fill missing position
> I0717 04:39:59.318430  5031 replica.cpp:375] Replica received explicit 
> promise request for position 0 with proposal 2
> I0717 04:39:59.323459  5031 leveldb.cpp:343] Persisting action (8 bytes) to 
> leveldb took 

[jira] [Commented] (MESOS-1613) HealthCheckTest.ConsecutiveFailures is flaky

2014-09-29 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152062#comment-14152062
 ] 

Timothy Chen commented on MESOS-1613:
-

I've added output to stdout and stderr in my branch; hopefully that will show what's going on.


[jira] [Commented] (MESOS-1613) HealthCheckTest.ConsecutiveFailures is flaky

2014-08-07 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089385#comment-14089385
 ] 

Timothy Chen commented on MESOS-1613:
-

Yes, I will need the task logs, and I'll need your help getting the output from the 
Twitter machines since I can't repro this.
I'll try to log the task output in the tests to help debug this.


[jira] [Commented] (MESOS-1613) HealthCheckTest.ConsecutiveFailures is flaky

2014-08-07 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090065#comment-14090065
 ] 

Benjamin Mahler commented on MESOS-1613:


[~tnachen] it's failing on ASF CI as well. Can you triage it, or disable it in the 
interim?


[jira] [Commented] (MESOS-1613) HealthCheckTest.ConsecutiveFailures is flaky

2014-08-07 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090266#comment-14090266
 ] 

Vinod Kone commented on MESOS-1613:
---

Chatted with [~tnachen] on IRC; we decided to temporarily disable the test. He is 
going to send out a review that adds more instrumentation to the test to find the 
root cause.

{code}
commit 7814140991217209494f5ec0f266e1d7b0f1d897
Author: Vinod Kone vinodk...@gmail.com
Date:   Thu Aug 7 20:50:14 2014 -0700

Temporarily disabled HealthcheckTest.ConsecutiveFailures test due to
MESOS-1613.
{code}
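
For reference, a flaky test can also be excluded at run time, with no source change, via googletest's negative test filter. A minimal sketch (the {{bin/mesos-tests.sh}} path is an assumption; any gtest binary accepts these flags):

```shell
# A leading '-' in --gtest_filter excludes the matching tests.
TESTS=./bin/mesos-tests.sh
FILTER='-HealthCheckTest.ConsecutiveFailures'

if [ -x "$TESTS" ]; then
  "$TESTS" --gtest_filter="$FILTER"
else
  echo "runner not found; would run: $TESTS --gtest_filter=$FILTER"
fi
```

The in-source equivalent (googletest's usual convention for disabling) is renaming the test with a {{DISABLED_}} prefix, which keeps it compiled but skipped unless {{--gtest_also_run_disabled_tests}} is passed.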




[jira] [Commented] (MESOS-1613) HealthCheckTest.ConsecutiveFailures is flaky

2014-08-06 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088425#comment-14088425
 ] 

Benjamin Mahler commented on MESOS-1613:


So far only Twitter CI is exposing this flakiness. I've pasted the full logs in 
the comment above; are you looking for logging from the command executor? We may 
want to investigate wiring the tests up to expose executor output, to make this 
easier for you to debug.


[jira] [Commented] (MESOS-1613) HealthCheckTest.ConsecutiveFailures is flaky

2014-07-23 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072336#comment-14072336
 ] 

Timothy Chen commented on MESOS-1613:
-

[~vi...@twitter.com] Vinod, I already have a review out on ReviewBoard. I saw you 
commented earlier; can you try it, or commit it if it looks good?

Thanks!

 I0717 04:39:59.323459  5031 leveldb.cpp:343] Persisting action (8 bytes) to 
 leveldb took 5.004323ms
 I0717 04:39:59.323493  5031 replica.cpp:676] Persisted action at 0
 I0717 04:39:59.323799  5031 replica.cpp:508] Replica received write request 
 for position 0
 I0717 04:39:59.323837  5031 leveldb.cpp:438] Reading position from leveldb 
 took 21901ns
 I0717 04:39:59.329038  5031 leveldb.cpp:343] Persisting action 

[jira] [Commented] (MESOS-1613) HealthCheckTest.ConsecutiveFailures is flaky

2014-07-17 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065720#comment-14065720
 ] 

Timothy Chen commented on MESOS-1613:
-

From the log, it's actually failing because it couldn't find the healthy_state 
set when the task is killed, which was added in 169ec89.
Both Apache Jenkins and my Ubuntu box pass, so I'm not really sure how this is 
failing, as the change should get picked up.

How does the Twitter CI run? Does it do a make clean and re-clone before 
running it again?


[jira] [Commented] (MESOS-1613) HealthCheckTest.ConsecutiveFailures is flaky

2014-07-17 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065888#comment-14065888
 ] 

Vinod Kone commented on MESOS-1613:
---

Disabled on HEAD.

Updated the mesos-tnachen Jenkins build (which watches your health_check 
branch) as follows:

GLOG_v=1 MESOS_VERBOSE=1 make distcheck GTEST_SHUFFLE=1 
GTEST_FILTER=*ConsecutiveFailures* GTEST_REPEAT=100 GTEST_BREAK_ON_FAILURE=1

Let's hope we can repro. Also, feel free to add debugging info in your branch 
and push it upstream to your repo.
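For ad-hoc local runs, the stop-on-first-failure repetition that GTEST_REPEAT=100 
plus GTEST_BREAK_ON_FAILURE=1 gives you inside gtest can also be sketched as a 
plain shell wrapper around any command (the function name and loop below are my 
own illustration, not part of the Mesos build):

```shell
# Run a command up to N times and stop at the first failure,
# mimicking GTEST_REPEAT + GTEST_BREAK_ON_FAILURE for arbitrary commands.
run_until_failure() {
  local n="$1"; shift
  for i in $(seq 1 "$n"); do
    if ! "$@"; then
      echo "failed on iteration $i" >&2
      return 1
    fi
  done
  echo "passed all $n iterations"
}

# Example (test binary path is hypothetical; adjust for your build tree):
#   run_until_failure 100 ./bin/mesos-tests.sh --gtest_filter='*ConsecutiveFailures*'
```

This is handy when the flake only shows up under external load, since you can run it concurrently with a load generator.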
