[ https://issues.apache.org/jira/browse/MESOS-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804028#comment-14804028 ]
Jie Yu commented on MESOS-3280:
-------------------------------

I'd be happy to assist as well. Will be useful to attach the master's log (related to replicated log).

> Master fails to access replicated log after network partition
> -------------------------------------------------------------
>
> Key: MESOS-3280
> URL: https://issues.apache.org/jira/browse/MESOS-3280
> Project: Mesos
> Issue Type: Bug
> Components: master, replicated log
> Affects Versions: 0.23.0
> Environment: Zookeeper version 3.4.5--1
> Reporter: Bernd Mathiske
> Assignee: Neil Conway
> Labels: mesosphere
>
> In a 5 node cluster with 3 masters and 2 slaves, and ZK on each node, when a
> network partition is forced, all the masters apparently lose access to their
> replicated log. The leading master halts. Unknown reasons, but presumably
> related to replicated log access. The others fail to recover from the
> replicated log. Unknown reasons. This could have to do with ZK setup, but it
> might also be a Mesos bug.
> This was observed in a Chronos test drive scenario described in detail here:
> https://github.com/mesos/chronos/issues/511
> With setup instructions here:
> https://github.com/mesos/chronos/issues/508

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)