[jira] [Comment Edited] (MESOS-3280) Master fails to access replicated log after network partition

Neil Conway (JIRA) Tue, 03 Nov 2015 13:49:09 -0800

    [ https://issues.apache.org/jira/browse/MESOS-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14988199#comment-14988199 ]


Neil Conway edited comment on MESOS-3280 at 11/3/15 9:47 PM:
-------------------------------------------------------------

Fix merged in 82b6112cabc838f9bfa; it should be in 0.26.


was (Author: neilc):
Merged in 82b6112cabc838f9bfa.

> Master fails to access replicated log after network partition
> -------------------------------------------------------------
>
>                 Key: MESOS-3280
>                 URL: https://issues.apache.org/jira/browse/MESOS-3280
>             Project: Mesos
>          Issue Type: Bug
>          Components: master, replicated log
>    Affects Versions: 0.23.0
>         Environment: Zookeeper version 3.4.5--1
>            Reporter: Bernd Mathiske
>            Assignee: Neil Conway
>              Labels: mesosphere
>             Fix For: 0.26.0
>
>         Attachments: rep-log-race-cond-logs.tar.gz, 
> rep-log-startup-race-test-1.patch
>
>
> In a 5 node cluster with 3 masters and 2 slaves, and ZK on each node, when a 
> network partition is forced, all the masters apparently lose access to their 
> replicated log. The leading master halts. Unknown reasons, but presumably 
> related to replicated log access. The others fail to recover from the 
> replicated log. Unknown reasons. This could have to do with ZK setup, but it 
> might also be a Mesos bug. 
> This was observed in a Chronos test drive scenario described in detail here:
> https://github.com/mesos/chronos/issues/511
> With setup instructions here:
> https://github.com/mesos/chronos/issues/508



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
