In fact, I also tried launching 2 masters on two separate machines. At 
first, one of them was successfully elected as leader, and both of them 
printed several lines of messages:

Replica in EMPTY status received a broadcasted recover request
Received a recover response from a replica in EMPTY status

Then the leading master aborted after printing these errors:

Recovery failed: Failed to recover registrar: Failed to perform fetch within 
1mins 
*** Check failure stack trace: ***
 @ 0x7f3c1ea105cd google::LogMessage::Fail() 
..............................

Next, the second master became the new leader. It also tried to recover 
from the registrar, but it failed as well and printed these errors before aborting:

Recovery failed: Failed to recover registrar: Failed to perform fetch within 
1mins 
*** Check failure stack trace: *** 
@ 0x7f3c1ea105cd google::LogMessage::Fail() 
...............................

So I guess this is not a ZooKeeper problem; rather, the elected leader cannot 
recover from the registrar. Could somebody kindly explain how the Mesos 
registry works, or give me some suggestions?

THANKS.
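
For reference, the quorum point in the quoted reply below can be sketched in a 
few lines of shell. This is only a sketch, assuming the usual rule that 
--quorum should be a strict majority of the number of masters. quorum_for and 
master_cmd are hypothetical helper names, and the printed command reuses only 
the flags and placeholder hosts already shown in this thread.

```shell
#!/bin/sh
# quorum_for N: smallest strict majority of N masters (N/2 rounded down, plus 1).
# Hypothetical helper, not part of Mesos.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

# master_cmd N: print (do not run) the master command from this thread with
# --quorum sized for N masters. master-ip and zoo1..zoo3 are placeholders.
master_cmd() {
    echo "./mesos-master.sh --ip=master-ip --zk=zk://zoo1:2181,zoo2:2181,zoo3:2181/mesos --quorum=$(quorum_for "$1")"
}

# Show the quorum for a few cluster sizes.
for n in 1 2 3 5; do
    echo "masters=$n quorum=$(quorum_for "$n")"
done
```

With one master this gives --quorum=1. With two masters it gives --quorum=2, 
which, as far as I understand, means both masters have to be up and able to 
reach each other before the registrar can recover.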

"david.j.palaitis" <david.j.palai...@gmail.com> wrote:

>With a single master,  you should not set quorum=2
>
>
>-------- Original message --------
>From: sujinzhao <sujinz...@gmail.com> 
>Date:11/06/2014  4:01 PM  (GMT-05:00) 
>To: user@mesos.apache.org 
>Cc:  
>Subject: Problems of running mesos-0.20.0 with zookeeper 
>
>Hi all,
>
>I set up a ZooKeeper service on three machines (zoo1, zoo2, zoo3) and 
>installed 1 Mesos master and 2 slaves on another three nodes. I tried to run 
>the master and slaves with:
>./mesos-master.sh --ip=master-ip --zk=zk://zoo1:2181,zoo2:2181,zoo3:2181/mesos 
>--quorum=2
>
>./mesos-slave.sh --ip=slave-ip 
>--master=zk://zoo1:2181,zoo2:2181,zoo3:2181/mesos 
>
>I also created the /mesos znode before running the above commands, but I got 
>the following error:
>
>Recovering from registrar
>Recovering registrar
>Recovery failed: Failed to recover registrar: Failed to perform fetch within 
>1mins
>*** Check failure stack trace: ***
>    @  0x7f3c1ea105cd google::LogMessage::Fail()
>...............................
>
>After reading the master log, I found that before the error occurred, the 
>master had already been elected successfully, but the leader failed to recover 
>from the registrar, so I guess this error has little to do with ZooKeeper.
>
>After googling, I found that other people have also encountered this problem, 
>but with no solution. I also ruled out SSH with no password between the 
>master/slave and ZooKeeper servers as a possible cause.
>
>So, could somebody kindly tell me how to solve this error? Any 
>suggestions would be appreciated.
>
>THANKS.
