This reproduces in both branch-3.2 and branch-3.2 plus patches (473, 479, 481). Basically, it seems like the nodes are electing pd4-zook02 to be the leader. However, pd4-zook02 seems to realize it's not supposed to be the leader and then disconnects everyone. Then they elect it again, and this loops over and over.
------------- Server config -------------
server.1=dc1-zook01.dc01.revsci.net:2888:3888
server.2=dc1-zook02.dc01.revsci.net:2888:3888
server.3=dc1-zook03.dc01.revsci.net:2888:3888
server.4=dc1-zook04.dc01.revsci.net:2888:3888
server.5=dc1-zook05.dc01.revsci.net:2888:3888
server.6=pd1-zook01.pd01.revsci.net:2888:3888
server.7=pd1-zook02.pd01.revsci.net:2888:3888
server.8=pd4-zook01.iad1.audsci.net:2888:3888
server.9=pd4-zook02.iad1.audsci.net:2888:3888

group.1:1:2:3:4:5
weight.1=1
weight.2=1
weight.3=1
weight.4=1
weight.5=1

group.2:6:7:8:9
weight.6=0
weight.7=0
weight.8=0
weight.9=0

Note that we have 2 groups, composed of machines in 3 different locations (dc1, pd1, and pd4). The idea is that only machines in dc1 have voting rights, and the ability to become a leader. The machines in the pods all have a weight of zero, and are not expected to become leaders, or to vote on transactions.

Let me know what I can do to help resolve this issue.

-Todd