Flavio,

I notice that you've updated the patches referenced for the WAN deployment. There appears to be an order dependency with respect to these four patches...
ZOOKEEPER-473.patch
ZOOKEEPER-479-branch3.2.patch
ZOOKEEPER-481-branch3.2.patch
ZOOKEEPER-491.patch

473 -> 479 (479 fails)

to...@toddg01lt:~/asi/workspaces/main/Main/RSI/etc/holmes/main/zookeeper/src/patched/branch-3.2$ patch -p0 < ../patches/ZOOKEEPER-479-branch3.2.patch
patching file src/java/main/org/apache/zookeeper/server/quorum/flexible/QuorumHierarchical.java
patching file src/java/main/org/apache/zookeeper/server/quorum/flexible/QuorumMaj.java
patching file src/java/main/org/apache/zookeeper/server/quorum/flexible/QuorumVerifier.java
patching file src/java/test/org/apache/zookeeper/test/HierarchicalQuorumTest.java
Hunk #1 FAILED at 93.
Hunk #2 FAILED at 145.
2 out of 2 hunks FAILED -- saving rejects to file src/java/test/org/apache/zookeeper/test/HierarchicalQuorumTest.java.rej
to...@toddg01lt:~/asi/workspaces/main/Main/RSI/etc/holmes/main/zookeeper/src/patched/branch-3.2$ h ../patches/

Could you advise as to which patches I need to apply, and in what order?

-Todd

> -----Original Message-----
> From: Flavio Junqueira [mailto:f...@yahoo-inc.com]
> Sent: Friday, July 31, 2009 9:51 PM
> To: zookeeper-user@hadoop.apache.org
> Subject: Re: Unending Leader Elections in WAN deploy
>
> Perfect! Thanks for the update, Todd.
>
> -Flavio
>
> On Jul 31, 2009, at 8:17 PM, Todd Greenwood wrote:
>
> > Thanks. You were right, I had a stale version of 479. Compilation
> > succeeds and all tests pass on branch-3.2 with the latest patches 473,
> > 479, 481, and 491.
> >
> > -Todd
> >
> >> -----Original Message-----
> >> From: Flavio Junqueira [mailto:f...@yahoo-inc.com]
> >> Sent: Friday, July 31, 2009 7:48 PM
> >> To: zookeeper-user@hadoop.apache.org
> >> Subject: Re: Unending Leader Elections in WAN deploy
> >>
> >> It should be in 479. Perhaps you have a stale version of the patch.
> >>
> >> -Flavio
> >>
> >> On Jul 31, 2009, at 7:46 PM, Todd Greenwood wrote:
> >>
> >>> Flavio,
> >>>
> >>> I'm getting a compilation error for patch 491:
> >>>
> >>> compile-main:
> >>> [javac] Compiling 1 source file to
> >>> /home/toddg/asi/workspaces/main/Main/RSI/etc/holmes/main/zookeeper/src/patched/branch-3.2/build/classes
> >>> [javac]
> >>> /home/toddg/asi/workspaces/main/Main/RSI/etc/holmes/main/zookeeper/src/patched/branch-3.2/src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java:601: cannot find symbol
> >>> [javac] symbol : method getWeight(long)
> >>> [javac] location: interface org.apache.zookeeper.server.quorum.flexible.QuorumVerifier
> >>> [javac] if(self.getQuorumVerifier().getWeight(n.sid) != 0)
> >>> [javac] ^
> >>> [javac] 1 error
> >>>
> >>> I see a reference to getWeight in both FastLeaderElection.java in
> >>> patch 491:
> >>>
> >>> patches/ZOOKEEPER-491.patch:+ if(self.getQuorumVerifier().getWeight(n.sid) != 0)
> >>> src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java: if(self.getQuorumVerifier().getWeight(n.sid) != 0)
> >>>
> >>> However, I don't see a reference to this method in patches 473, 479,
> >>> or 481. I also don't see a reference to this method in the trunk...
> >>>
> >>> -Todd
> >>>
> >>>> -----Original Message-----
> >>>> From: Todd Greenwood [mailto:to...@audiencescience.com]
> >>>> Sent: Friday, July 31, 2009 7:30 PM
> >>>> To: zookeeper-user@hadoop.apache.org
> >>>> Subject: RE: Unending Leader Elections in WAN deploy
> >>>>
> >>>> Ok, I'll apply that patch and report back.
> >>>>
> >>>> -Todd
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Flavio Junqueira [mailto:f...@yahoo-inc.com]
> >>>>> Sent: Friday, July 31, 2009 7:18 PM
> >>>>> To: zookeeper-user@hadoop.apache.org
> >>>>> Subject: Re: Unending Leader Elections in WAN deploy
> >>>>>
> >>>>> You're missing 491 from your set of patches.
> >>>>>
> >>>>> -Flavio
> >>>>>
> >>>>> On Jul 31, 2009, at 7:15 PM, Todd Greenwood wrote:
> >>>>>
> >>>>>> This repro's in both branch-3.2, and branch-3.2+patches(473, 479,
> >>>>>> 481).
> >>>>>>
> >>>>>> Basically, it seems like the nodes are electing pd4-zook02 to be the
> >>>>>> leader. However, pd4-zook02 seems to realize it's not supposed to be
> >>>>>> and then disconnects everyone. Then they re-elect it again, and it
> >>>>>> loops over and over.
> >>>>>>
> >>>>>> -------------
> >>>>>> Server config
> >>>>>> -------------
> >>>>>>
> >>>>>> server.1=dc1-zook01.dc01.revsci.net:2888:3888
> >>>>>> server.2=dc1-zook02.dc01.revsci.net:2888:3888
> >>>>>> server.3=dc1-zook03.dc01.revsci.net:2888:3888
> >>>>>> server.4=dc1-zook04.dc01.revsci.net:2888:3888
> >>>>>> server.5=dc1-zook05.dc01.revsci.net:2888:3888
> >>>>>> server.6=pd1-zook01.pd01.revsci.net:2888:3888
> >>>>>> server.7=pd1-zook02.pd01.revsci.net:2888:3888
> >>>>>> server.8=pd4-zook01.iad1.audsci.net:2888:3888
> >>>>>> server.9=pd4-zook02.iad1.audsci.net:2888:3888
> >>>>>>
> >>>>>> group.1:1:2:3:4:5
> >>>>>> weight.1=1
> >>>>>> weight.2=1
> >>>>>> weight.3=1
> >>>>>> weight.4=1
> >>>>>> weight.5=1
> >>>>>>
> >>>>>> group.2:6:7:8:9
> >>>>>> weight.6=0
> >>>>>> weight.7=0
> >>>>>> weight.8=0
> >>>>>> weight.9=0
> >>>>>>
> >>>>>> Note that we have 2 groups, composed of machines in 3 different
> >>>>>> locations (dc1, pd1, and pd4). The idea is that only machines in dc1
> >>>>>> have voting rights, and the ability to become a leader.
> >>>>>> The machines in the pods all have a weight of zero, and are not
> >>>>>> expected to become leaders, or to vote on transactions.
> >>>>>>
> >>>>>> Let me know what I can do to help resolve this issue.
> >>>>>>
> >>>>>> -Todd
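
To make the intent of the weights in the quoted config concrete (only the dc1 servers should vote or lead), here is a minimal Python sketch. It is not ZooKeeper code. The names `GROUPS`, `WEIGHTS`, `can_lead`, and `has_quorum` are hypothetical, and the quorum rule used (a strict majority of the groups that carry any weight, each with more than half of its weight acknowledged) is an assumption about how a hierarchical quorum is meant to behave, not a statement of what branch-3.2 does. `can_lead` simply mirrors the `getWeight(n.sid) != 0` test quoted from patch 491.

```python
# Groups and weights copied by hand from the server config quoted in the thread.
GROUPS = {1: [1, 2, 3, 4, 5], 2: [6, 7, 8, 9]}
WEIGHTS = {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 0, 7: 0, 8: 0, 9: 0}


def can_lead(sid, weights):
    """Hypothetical helper: a server is a leader candidate only if its weight is non-zero."""
    return weights[sid] != 0


def has_quorum(acks, groups, weights):
    """Assumed rule: ignore groups whose total weight is zero, then require a
    strict majority of the remaining groups to each have more than half of
    their weight in `acks`."""
    live_groups = [
        members for members in groups.values()
        if sum(weights[s] for s in members) > 0
    ]
    groups_won = 0
    for members in live_groups:
        total = sum(weights[s] for s in members)
        acked = sum(weights[s] for s in members if s in acks)
        if 2 * acked > total:
            groups_won += 1
    return 2 * groups_won > len(live_groups)


if __name__ == "__main__":
    print("leader candidates:", [s for s in sorted(WEIGHTS) if can_lead(s, WEIGHTS)])
    print("dc1 majority forms quorum:", has_quorum({1, 2, 3}, GROUPS, WEIGHTS))
    print("pods alone form quorum:", has_quorum({6, 7, 8, 9}, GROUPS, WEIGHTS))
```

Under these assumptions, pd4-zook02 (server 9) would never be a leader candidate, which is the behavior the config is trying to express.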