Somehow the logs did not attach the first time. The ZooKeeper logs should be attached to this message.

> -----Original Message-----
> From: Todd Greenwood [mailto:to...@audiencescience.com]
> Sent: Friday, July 31, 2009 7:15 PM
> To: zookeeper-user@hadoop.apache.org
> Subject: Unending Leader Elections in WAN deploy
>
> This repro's in both branch-3.2, and branch-3.2+patches(473, 479, 481).
>
> Basically, it seems like the nodes are electing pd4-zook02 to be the
> leader. However, pd4-zook02 seems to realize it's not supposed to be and
> then disconnects everyone. Then they re-elect it again, and it loops
> over and over.
>
> -------------
> Server config
> -------------
>
> server.1=dc1-zook01.dc01.revsci.net:2888:3888
> server.2=dc1-zook02.dc01.revsci.net:2888:3888
> server.3=dc1-zook03.dc01.revsci.net:2888:3888
> server.4=dc1-zook04.dc01.revsci.net:2888:3888
> server.5=dc1-zook05.dc01.revsci.net:2888:3888
> server.6=pd1-zook01.pd01.revsci.net:2888:3888
> server.7=pd1-zook02.pd01.revsci.net:2888:3888
> server.8=pd4-zook01.iad1.audsci.net:2888:3888
> server.9=pd4-zook02.iad1.audsci.net:2888:3888
>
> group.1:1:2:3:4:5
> weight.1=1
> weight.2=1
> weight.3=1
> weight.4=1
> weight.5=1
>
> group.2:6:7:8:9
> weight.6=0
> weight.7=0
> weight.8=0
> weight.9=0
>
> Note that we have 2 groups, composed of machines in 3 different
> locations (dc1, pd1, and pd4). The idea is that only machines in dc1
> have voting rights, and the ability to become a leader. The machines in
> the pods all have a weight of zero, and are not expected to become
> leaders, or to vote on transactions.
>
> Let me know what I can do to help resolve this issue.
>
> -Todd
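
To make the intent of the group/weight settings above concrete, here is a rough sketch of the quorum rule the config is meant to express. It assumes the hierarchical-quorum rule as I understand it: a set of servers forms a quorum when it holds more than half the weight in a majority of groups, and groups whose total weight is zero are not counted. The function names (group_weight, voting_groups, is_quorum) are made up for illustration; this is not the ZooKeeper implementation, and the branch-3.2 code may behave differently.

```python
# Illustrative sketch only; NOT ZooKeeper's QuorumHierarchical code.
# Server ids, groups, and weights mirror the config quoted above.
GROUPS = {1: [1, 2, 3, 4, 5], 2: [6, 7, 8, 9]}
WEIGHTS = {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 0, 7: 0, 8: 0, 9: 0}


def group_weight(gid, groups=GROUPS, weights=WEIGHTS):
    """Total weight of all servers in one group."""
    return sum(weights[sid] for sid in groups[gid])


def voting_groups(groups=GROUPS, weights=WEIGHTS):
    """Groups with non-zero total weight (assumed to be the only ones counted)."""
    return [gid for gid in sorted(groups) if group_weight(gid, groups, weights) > 0]


def is_quorum(acks, groups=GROUPS, weights=WEIGHTS):
    """True if `acks` holds more than half the weight in a majority of voting groups."""
    live = voting_groups(groups, weights)
    won = 0
    for gid in live:
        acked = sum(weights[sid] for sid in groups[gid] if sid in acks)
        if acked > group_weight(gid, groups, weights) / 2:
            won += 1
    return won > len(live) / 2


if __name__ == "__main__":
    print(voting_groups())                   # groups that can vote
    print(is_quorum({1, 2, 3}))              # three dc1 servers
    print(is_quorum({6, 7, 8, 9}))           # all pod servers
    print(is_quorum({1, 2, 6, 7, 8, 9}))     # two dc1 servers plus all pods
```

Under that reading, only group 1 (dc1) counts toward a quorum, so any three dc1 servers are enough and no combination of pod servers can form one on their own.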