[
https://issues.apache.org/jira/browse/ZOOKEEPER-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834049#comment-15834049
]
ASF GitHub Bot commented on ZOOKEEPER-2080:
-------------------------------------------
Github user shralex commented on a diff in the pull request:
https://github.com/apache/zookeeper/pull/92#discussion_r97266698
--- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java ---
@@ -468,31 +469,33 @@ synchronized private boolean connectOne(long sid, InetSocketAddress electionAddr
*/
synchronized void connectOne(long sid){
+ connectOne(sid, self.getLastSeenQuorumVerifier());
+ }
+
+ synchronized void connectOne(long sid, QuorumVerifier lastSeenQV){
if (senderWorkerMap.get(sid) != null) {
- LOG.debug("There is a connection already for server " + sid);
- return;
+ LOG.debug("There is a connection already for server " + sid);
+ return;
}
- synchronized(self) {
- boolean knownId = false;
- // Resolve hostname for the remote server before attempting to
- // connect in case the underlying ip address has changed.
- self.recreateSocketAddresses(sid);
- if (self.getView().containsKey(sid)) {
- knownId = true;
- if (connectOne(sid, self.getView().get(sid).electionAddr))
- return;
- }
- if (self.getLastSeenQuorumVerifier()!=null && self.getLastSeenQuorumVerifier().getAllMembers().containsKey(sid)
- && (!knownId || (self.getLastSeenQuorumVerifier().getAllMembers().get(sid).electionAddr !=
- self.getView().get(sid).electionAddr))) {
- knownId = true;
- if (connectOne(sid, self.getLastSeenQuorumVerifier().getAllMembers().get(sid).electionAddr))
- return;
- }
- if (!knownId) {
- LOG.warn("Invalid server id: " + sid);
+ boolean knownId = false;
+ // Resolve hostname for the remote server before attempting to
+ // connect in case the underlying ip address has changed.
+ self.recreateSocketAddresses(sid);
+ if (self.getView().containsKey(sid)) {
--- End diff --
How about also passing in the last committed view, so that you don't need to
call getView() multiple times?
I know you're protecting this with the lock in QuorumPeer, but before I
read the other file I thought there might be a race, because the config is
accessed multiple times here. A comment would help.
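To illustrate the suggestion, here is a minimal sketch and not the actual patch. It assumes both views are captured once by the caller and passed in, so the method reads each of them only once. The class name ConnectOneSketch, the method candidateAddrs, and the use of plain Map<Long, InetSocketAddress> in place of the real QuorumServer and QuorumVerifier types are all hypothetical stand-ins. It also compares addresses with equals() rather than by reference, which is a choice of this sketch.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the lookup logic in connectOne(); not ZooKeeper code.
class ConnectOneSketch {

    /**
     * Returns the election addresses to try for the given server id, in order:
     * first the address from the last committed view, then the address from the
     * last seen config if it is present and differs from the committed one.
     * Both maps are snapshots supplied by the caller, so each is read once.
     * An empty result corresponds to the "Invalid server id" case.
     */
    public static List<InetSocketAddress> candidateAddrs(
            long sid,
            Map<Long, InetSocketAddress> committedView,
            Map<Long, InetSocketAddress> lastSeenView) {
        List<InetSocketAddress> candidates = new ArrayList<>();

        InetSocketAddress committed = committedView.get(sid);
        if (committed != null) {
            candidates.add(committed);
        }

        // lastSeenView may be null, mirroring the null check on
        // getLastSeenQuorumVerifier() in the diff above.
        InetSocketAddress lastSeen =
                (lastSeenView == null) ? null : lastSeenView.get(sid);
        if (lastSeen != null && !lastSeen.equals(committed)) {
            candidates.add(lastSeen);
        }
        return candidates;
    }

    public static void main(String[] args) {
        // createUnresolved avoids any DNS lookup in this example.
        InetSocketAddress oldAddr = InetSocketAddress.createUnresolved("host-a", 3888);
        InetSocketAddress newAddr = InetSocketAddress.createUnresolved("host-b", 3888);
        System.out.println(
                candidateAddrs(2L, Map.of(2L, oldAddr), Map.of(2L, newAddr)));
    }
}
```

With this shape the caller takes the snapshots while holding the QuorumPeer lock, and a reader of this method no longer has to check another file to see that the repeated config accesses are consistent.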
> ReconfigRecoveryTest fails intermittently
> -----------------------------------------
>
> Key: ZOOKEEPER-2080
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2080
> Project: ZooKeeper
> Issue Type: Sub-task
> Reporter: Ted Yu
> Assignee: Michael Han
> Fix For: 3.5.3, 3.6.0
>
> Attachments: jacoco-ZOOKEEPER-2080.unzip-grows-to-70MB.7z,
> repro-20150816.log, threaddump.log, ZOOKEEPER-2080.patch,
> ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch,
> ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch
>
>
> I got the following test failure on a MacBook with trunk code:
> {code}
> Testcase: testCurrentObserverIsParticipantInNewConfig took 93.628 sec
> FAILED
> waiting for server 2 being up
> junit.framework.AssertionFailedError: waiting for server 2 being up
> at org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentObserverIsParticipantInNewConfig(ReconfigRecoveryTest.java:529)
> at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
> {code}