Bernd Mathiske created MESOS-3280:
-------------------------------------

             Summary: Master fails to access replicated log after network partition
                 Key: MESOS-3280
                 URL: https://issues.apache.org/jira/browse/MESOS-3280
             Project: Mesos
          Issue Type: Bug
          Components: master
            Reporter: Bernd Mathiske


In a 5-node cluster with 3 masters and 2 slaves, and ZooKeeper (ZK) running on 
each node, forcing a network partition apparently causes all of the masters to 
lose access to their replicated log. The leading master halts; the reason is 
unknown, but it is presumably related to replicated log access. The other 
masters fail to recover from the replicated log, also for unknown reasons. 
This could be caused by the ZK setup, but it might also be a Mesos bug. 
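
As background for anyone reproducing this, here is a minimal sketch of 
majority-quorum arithmetic. It assumes the 3 masters form a majority-quorum 
replicated log; the quorum size actually configured in this setup is not 
stated in this report. The function names are hypothetical and are not part 
of Mesos. The sketch only shows which side of a partition could still reach 
a majority; it does not explain the observed failure.

```python
def majority_quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def can_reach_quorum(reachable: int, replicas: int) -> bool:
    """True if a partition side that sees `reachable` replicas
    (including itself) still holds a majority of `replicas`."""
    return reachable >= majority_quorum(replicas)


if __name__ == "__main__":
    masters = 3  # assumption: one log replica per master, as in the report's cluster
    for reachable in range(1, masters + 1):
        print(reachable, can_reach_quorum(reachable, masters))
```

Under these assumptions, a master isolated on its own (1 of 3 reachable) 
cannot reach a majority, while a side holding 2 of the 3 masters can.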

This was observed in a Chronos test drive scenario described in detail here:
https://github.com/mesos/chronos/issues/511

With setup instructions here:
https://github.com/mesos/chronos/issues/508





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
