[jira] [Updated] (MESOS-2934) Mesos master crashes when quorum set to 4
[ https://issues.apache.org/jira/browse/MESOS-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Rukletsov updated MESOS-2934:
---
    Labels: documentation  (was: documentaion)

> Mesos master crashes when quorum set to 4
> -
>
>                 Key: MESOS-2934
>                 URL: https://issues.apache.org/jira/browse/MESOS-2934
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.22.1
>         Environment: CentOS 7
> Java 1.7.0_55
>            Reporter: Craig W
>            Priority: Minor
>              Labels: documentation
>
> When deploying 5 mesos masters, with quorum set to 4, the masters start up
> but fail to stay running. Instead they exit and then restart (Monit is used
> to supervise the process) within a few seconds. This cycle continues non-stop.
> The logs on the master look like this:
> {noformat}
> Received a recover response from a replica in EMPTY status
> Received a recover response from a replica in EMPTY status
> Replica in EMPTY status received a broadcasted recover request
> Recovery failed: Failed to recover registrar: Failed to perform fetch within
> 1mins
> Replica in EMPTY status received a broadcasted recover request
> Received a recover response from a replica in EMPTY status
> Received a recover response from a replica in EMPTY status
> Replica in EMPTY status received a broadcasted recover
> The newly elected leader is master@:5050 with id
> 20150625-102436-748881418-5050-2157
> Elected as the leading master!
> Recovering from registrar
> Recovering registrar
> Unable to finish the recover protocol in 10secs, retrying
> Unable to finish the recover protocol in 10secs, retrying
> Recovery failed: Failed to recover registrar: Failed to perform fetch within
> 1mins
> {noformat}
> When I change the quorum to 2 and run just 3 mesos master processes, the
> cluster stays up without a hitch.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (MESOS-2934) Mesos master crashes when quorum set to 4
[ https://issues.apache.org/jira/browse/MESOS-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Craig W updated MESOS-2934:
---
      Labels: documentaion  (was: )
    Priority: Minor  (was: Major)

> Mesos master crashes when quorum set to 4
> -
>
>                 Key: MESOS-2934
>                 URL: https://issues.apache.org/jira/browse/MESOS-2934
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.22.1
>         Environment: CentOS 7
> Java 1.7.0_55
>            Reporter: Craig W
>            Priority: Minor
>              Labels: documentaion
>
> When deploying 5 mesos masters, with quorum set to 4, the masters start up
> but fail to stay running. Instead they exit and then restart (Monit is used
> to supervise the process) within a few seconds. This cycle continues non-stop.
> The logs on the master look like this:
> {noformat}
> Received a recover response from a replica in EMPTY status
> Received a recover response from a replica in EMPTY status
> Replica in EMPTY status received a broadcasted recover request
> Recovery failed: Failed to recover registrar: Failed to perform fetch within
> 1mins
> Replica in EMPTY status received a broadcasted recover request
> Received a recover response from a replica in EMPTY status
> Received a recover response from a replica in EMPTY status
> Replica in EMPTY status received a broadcasted recover
> The newly elected leader is master@:5050 with id
> 20150625-102436-748881418-5050-2157
> Elected as the leading master!
> Recovering from registrar
> Recovering registrar
> Unable to finish the recover protocol in 10secs, retrying
> Unable to finish the recover protocol in 10secs, retrying
> Recovery failed: Failed to recover registrar: Failed to perform fetch within
> 1mins
> {noformat}
> When I change the quorum to 2 and run just 3 mesos master processes, the
> cluster stays up without a hitch.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
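
For reference, the two configurations mentioned in the report (5 masters with quorum 4, and 3 masters with quorum 2) can be compared with some simple quorum arithmetic. The sketch below assumes the usual majority rule for a replicated log, where the quorum must be strictly greater than half the number of masters. The function names are hypothetical and are not part of Mesos, and the sketch does not attempt to explain the crash itself.

```python
def majority_quorum(masters: int) -> int:
    """Smallest quorum that is a strict majority of `masters` (assumed rule)."""
    if masters < 1:
        raise ValueError("need at least one master")
    return masters // 2 + 1


def is_majority(masters: int, quorum: int) -> bool:
    """True if `quorum` is a strict majority and does not exceed `masters`."""
    return masters / 2 < quorum <= masters


def tolerated_failures(masters: int, quorum: int) -> int:
    """Number of masters that can be down while a quorum is still reachable."""
    return masters - quorum


if __name__ == "__main__":
    # The two setups from the report, plus the minimal majority for 5 masters.
    for masters, quorum in [(5, 4), (5, majority_quorum(5)), (3, 2)]:
        print(
            f"masters={masters} quorum={quorum} "
            f"majority={is_majority(masters, quorum)} "
            f"tolerated_failures={tolerated_failures(masters, quorum)}"
        )
```

Under this rule, a quorum of 4 out of 5 is still a majority, but it leaves room for only one unavailable master, the same as a quorum of 2 out of 3; the minimal majority for 5 masters would be 3.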