ZooKeeper_branch34 - Build # 784 - Failure

2013-11-04 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch34/784/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 234528 lines...]
[junit] 2013-11-04 08:52:47,630 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] 2013-11-04 08:52:47,631 [myid:] - INFO  [main:JMXEnv@105] - 
expect:InMemoryDataTree
[junit] 2013-11-04 08:52:47,632 [myid:] - INFO  [main:JMXEnv@108] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] 2013-11-04 08:52:47,632 [myid:] - INFO  [main:JMXEnv@105] - 
expect:StandaloneServer_port
[junit] 2013-11-04 08:52:47,632 [myid:] - INFO  [main:JMXEnv@108] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-11-04 08:52:47,632 [myid:] - INFO  [main:ClientBase@421] - 
STOPPING server
[junit] 2013-11-04 08:52:47,632 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@224] - 
NIOServerCnxn factory exited run method
[junit] 2013-11-04 08:52:47,633 [myid:] - INFO  [main:ZooKeeperServer@441] 
- shutting down
[junit] 2013-11-04 08:52:47,633 [myid:] - INFO  
[main:SessionTrackerImpl@225] - Shutting down
[junit] 2013-11-04 08:52:47,633 [myid:] - INFO  
[main:PrepRequestProcessor@761] - Shutting down
[junit] 2013-11-04 08:52:47,633 [myid:] - INFO  
[main:SyncRequestProcessor@209] - Shutting down
[junit] 2013-11-04 08:52:47,633 [myid:] - INFO  [ProcessThread(sid:0 
cport:-1)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop!
[junit] 2013-11-04 08:52:47,633 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@187] - SyncRequestProcessor exited!
[junit] 2013-11-04 08:52:47,634 [myid:] - INFO  
[main:FinalRequestProcessor@415] - shutdown of request processor complete
[junit] 2013-11-04 08:52:47,634 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2013-11-04 08:52:47,635 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[]
[junit] 2013-11-04 08:52:47,636 [myid:] - INFO  [main:ClientBase@414] - 
STARTING server
[junit] 2013-11-04 08:52:47,636 [myid:] - INFO  [main:ZooKeeperServer@162] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34/branch-3.4/build/test/tmp/test4289329406240412390.junit.dir/version-2
 snapdir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34/branch-3.4/build/test/tmp/test4289329406240412390.junit.dir/version-2
[junit] 2013-11-04 08:52:47,637 [myid:] - INFO  
[main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2013-11-04 08:52:47,640 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2013-11-04 08:52:47,641 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@197] - 
Accepted socket connection from /127.0.0.1:55748
[junit] 2013-11-04 08:52:47,641 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@817] - Processing 
stat command from /127.0.0.1:55748
[junit] 2013-11-04 08:52:47,641 [myid:] - INFO  
[Thread-5:NIOServerCnxn$StatCommand@653] - Stat command output
[junit] 2013-11-04 08:52:47,642 [myid:] - INFO  
[Thread-5:NIOServerCnxn@997] - Closed socket connection for client 
/127.0.0.1:55748 (no session established for client)
[junit] 2013-11-04 08:52:47,642 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] 2013-11-04 08:52:47,643 [myid:] - INFO  [main:JMXEnv@105] - 
expect:InMemoryDataTree
[junit] 2013-11-04 08:52:47,643 [myid:] - INFO  [main:JMXEnv@108] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] 2013-11-04 08:52:47,643 [myid:] - INFO  [main:JMXEnv@105] - 
expect:StandaloneServer_port
[junit] 2013-11-04 08:52:47,644 [myid:] - INFO  [main:JMXEnv@108] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-11-04 08:52:47,644 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@57] - FINISHED TEST METHOD testQuota
[junit] 2013-11-04 08:52:47,644 [myid:] - INFO  [main:ClientBase@451] - 
tearDown starting
[junit] 2013-11-04 08:52:47,719 [myid:] - INFO  [main:ZooKeeper@684] - 
Session: 0x1422250748a closed
[junit] 2013-11-04 08:52:47,719 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@509] - EventThread shut down
[junit] 2013-11-04 08:52:47,719 [myid:] - INFO  [main:ClientBase@421] - 
STOPPING server
[junit] 2013-11-04 08:52:47,720 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@224] - 
NIOServerCnxn factory exited run method
[junit] 2013-11-04 08:52:47,720 

[jira] [Commented] (ZOOKEEPER-1805) Don't care value in ZooKeeper election breaks rolling upgrades

2013-11-04 Thread JIRA

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812694#comment-13812694
 ] 

Germán Blanco commented on ZOOKEEPER-1805:
--

Thank you for considering my comments. Here are a few more ...
Personally, I would check that both Votes are out of the election before 
ignoring the fields in Vote.java. And I believe peerEpoch should be taken into 
account for normal votes:
{noformat}
 +if ((state != ServerState.LOOKING) && (other.state != ServerState.LOOKING)) {
 +   return (id == other.id);
 +} else {
 +   return (id == other.id
 +           && (zxid == other.zxid)
 +           && (electionEpoch == other.electionEpoch)
 +           && (peerEpoch == other.peerEpoch));
 +}
{noformat}
I think that the previous test case (testJoinInconsistentEnsemble in 
FLETest.java) would look better if we now also change the peerEpoch:
{noformat}
 +Vote newVote = new Vote(leaderSid, zxid+100, electionEpoch+100, 
peerEpoch+100, state);
{noformat}
In this way, this test case also verifies the new changes.
As indicated before, I would also remove the method updateElectionVote in 
QuorumPeer.java and the line under the comment for ZOOKEEPER-1732 in 
Leader.java and Learner.java. The value of peerEpoch will be ignored, so 
updating it looks like a waste of time to me.
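
The comparison proposed above can be sketched in isolation as follows. This is a minimal illustration and not the actual Vote.java: the class name VoteSketch, the enum ServerStateSketch, and the hashCode choice are assumptions made only for this example.

```java
// Simplified stand-ins for illustration; not the real ZooKeeper Vote or ServerState.
enum ServerStateSketch { LOOKING, FOLLOWING, LEADING, OBSERVING }

final class VoteSketch {
    final long id;
    final long zxid;
    final long electionEpoch;
    final long peerEpoch;
    final ServerStateSketch state;

    VoteSketch(long id, long zxid, long electionEpoch, long peerEpoch,
               ServerStateSketch state) {
        this.id = id;
        this.zxid = zxid;
        this.electionEpoch = electionEpoch;
        this.peerEpoch = peerEpoch;
        this.state = state;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof VoteSketch)) {
            return false;
        }
        VoteSketch other = (VoteSketch) o;
        // Both votes come from servers that have already left the election:
        // only the leader id is compared.
        if (state != ServerStateSketch.LOOKING
                && other.state != ServerStateSketch.LOOKING) {
            return id == other.id;
        }
        // At least one side is still LOOKING: compare every field,
        // including peerEpoch.
        return id == other.id
                && zxid == other.zxid
                && electionEpoch == other.electionEpoch
                && peerEpoch == other.peerEpoch;
    }

    @Override
    public int hashCode() {
        // id is the only field compared in both branches, so hashing on it
        // alone keeps hashCode consistent with equals.
        return Long.hashCode(id);
    }
}
```

With this sketch, two votes for the same leader from servers that are both out of the election compare equal even when zxid and the epochs differ, while a vote from a LOOKING server must match on every field. Note that an equality which depends on state in this way is not transitive, so it is probably best kept to the election logic rather than relied on in general-purpose collections.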

 Don't care value in ZooKeeper election breaks rolling upgrades
 

 Key: ZOOKEEPER-1805
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1805
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Flavio Junqueira
Assignee: Flavio Junqueira
Priority: Blocker
 Fix For: 3.4.6, 3.5.0

 Attachments: ZOOKEEPER-1805-b3.4.patch, ZOOKEEPER-1805.patch, 
 ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, 
 ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, 
 ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch


 This is an issue that has been originally reported in ZOOKEEPER-1732.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


ZooKeeper-trunk-solaris - Build # 721 - Still Failing

2013-11-04 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-solaris/721/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 217092 lines...]
[junit] 2013-11-04 10:50:33,130 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-0:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2013-11-04 10:50:33,130 [myid:] - INFO  [main:ZooKeeperServer@428] 
- shutting down
[junit] 2013-11-04 10:50:33,131 [myid:] - INFO  
[main:SessionTrackerImpl@183] - Shutting down
[junit] 2013-11-04 10:50:33,131 [myid:] - INFO  
[main:PrepRequestProcessor@972] - Shutting down
[junit] 2013-11-04 10:50:33,131 [myid:] - INFO  
[main:SyncRequestProcessor@190] - Shutting down
[junit] 2013-11-04 10:50:33,131 [myid:] - INFO  [ProcessThread(sid:0 
cport:-1)::PrepRequestProcessor@156] - PrepRequestProcessor exited loop!
[junit] 2013-11-04 10:50:33,131 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@168] - SyncRequestProcessor exited!
[junit] 2013-11-04 10:50:33,131 [myid:] - INFO  
[main:FinalRequestProcessor@442] - shutdown of request processor complete
[junit] 2013-11-04 10:50:33,132 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2013-11-04 10:50:33,132 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[]
[junit] 2013-11-04 10:50:33,133 [myid:] - INFO  [main:ClientBase@414] - 
STARTING server
[junit] 2013-11-04 10:50:33,134 [myid:] - INFO  [main:ZooKeeperServer@149] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test4149150081156729558.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test4149150081156729558.junit.dir/version-2
[junit] 2013-11-04 10:50:33,134 [myid:] - INFO  
[main:NIOServerCnxnFactory@670] - Configuring NIO connection handler with 10s 
sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 
kB direct buffers.
[junit] 2013-11-04 10:50:33,135 [myid:] - INFO  
[main:NIOServerCnxnFactory@683] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2013-11-04 10:50:33,136 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test4149150081156729558.junit.dir/version-2/snapshot.b
[junit] 2013-11-04 10:50:33,138 [myid:] - INFO  [main:FileTxnSnapLog@297] - 
Snapshotting: 0xb to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test4149150081156729558.junit.dir/version-2/snapshot.b
[junit] 2013-11-04 10:50:33,139 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2013-11-04 10:50:33,140 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:57739
[junit] 2013-11-04 10:50:33,141 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@828] - Processing stat command from 
/127.0.0.1:57739
[junit] 2013-11-04 10:50:33,141 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn$StatCommand@677] - Stat command output
[junit] 2013-11-04 10:50:33,141 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@999] - Closed socket connection for client 
/127.0.0.1:57739 (no session established for client)
[junit] 2013-11-04 10:50:33,141 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] 2013-11-04 10:50:33,142 [myid:] - INFO  [main:JMXEnv@105] - 
expect:InMemoryDataTree
[junit] 2013-11-04 10:50:33,143 [myid:] - INFO  [main:JMXEnv@108] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] 2013-11-04 10:50:33,143 [myid:] - INFO  [main:JMXEnv@105] - 
expect:StandaloneServer_port
[junit] 2013-11-04 10:50:33,143 [myid:] - INFO  [main:JMXEnv@108] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-11-04 10:50:33,143 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@57] - FINISHED TEST METHOD testQuota
[junit] 2013-11-04 10:50:33,143 [myid:] - INFO  [main:ClientBase@451] - 
tearDown starting
[junit] 2013-11-04 10:50:33,217 [myid:] - INFO  [main:ZooKeeper@777] - 
Session: 0x14222bc4408 closed
[junit] 2013-11-04 10:50:33,217 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down
[junit] 2013-11-04 10:50:33,218 [myid:] - INFO  [main:ClientBase@421] - 
STOPPING server
[junit] 2013-11-04 10:50:33,218 [myid:] - INFO  

[jira] [Commented] (ZOOKEEPER-1805) Don't care value in ZooKeeper election breaks rolling upgrades

2013-11-04 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812776#comment-13812776
 ] 

Flavio Junqueira commented on ZOOKEEPER-1805:
-

It still bothers me that we can't distinguish between old and new notification 
messages. I was thinking about introducing a format version field so that we 
can get around this problem and make the check in the way proposed instead of 
working around it.

I have a patch mostly ready, but I'd like to know if this is a direction that is 
OK to pursue. If it is, I can add a sub-task here so that we can work 
this out separately, before fixing this issue.
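
One way a format version field could let a receiver tell old and new notifications apart is sketched below. This is not the patch mentioned above: the class name NotificationSketch, the field order, the field sizes, and the idea of detecting unversioned messages by payload length are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical wire layout for illustration; field order and sizes are assumptions.
final class NotificationSketch {
    static final int CURRENT_VERSION = 1;
    // state (int) followed by leader, zxid, electionEpoch, peerEpoch (four longs).
    static final int UNVERSIONED_LENGTH = Integer.BYTES + 4 * Long.BYTES;
    // Same fields plus a trailing version (int).
    static final int VERSIONED_LENGTH = UNVERSIONED_LENGTH + Integer.BYTES;

    final int state;
    final long leader;
    final long zxid;
    final long electionEpoch;
    final long peerEpoch;
    final int version; // 0 means "sent by a server that has no version field"

    NotificationSketch(int state, long leader, long zxid, long electionEpoch,
                       long peerEpoch, int version) {
        this.state = state;
        this.leader = leader;
        this.zxid = zxid;
        this.electionEpoch = electionEpoch;
        this.peerEpoch = peerEpoch;
        this.version = version;
    }

    byte[] encode() {
        boolean versioned = version > 0;
        ByteBuffer buf = ByteBuffer.allocate(
                versioned ? VERSIONED_LENGTH : UNVERSIONED_LENGTH);
        buf.putInt(state).putLong(leader).putLong(zxid)
           .putLong(electionEpoch).putLong(peerEpoch);
        if (versioned) {
            buf.putInt(version); // appended last so older fields keep their offsets
        }
        return buf.array();
    }

    static NotificationSketch decode(byte[] payload) {
        if (payload.length != UNVERSIONED_LENGTH
                && payload.length != VERSIONED_LENGTH) {
            throw new IllegalArgumentException(
                    "unexpected notification length: " + payload.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(payload);
        int state = buf.getInt();
        long leader = buf.getLong();
        long zxid = buf.getLong();
        long electionEpoch = buf.getLong();
        long peerEpoch = buf.getLong();
        // A message without the trailing int is treated as version 0.
        int version = buf.hasRemaining() ? buf.getInt() : 0;
        return new NotificationSketch(state, leader, zxid, electionEpoch,
                peerEpoch, version);
    }
}
```

Appending the version at the end keeps the existing fields at the same offsets, so in this sketch a receiver can still read the leading fields of a message regardless of which kind of server sent it.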

 Don't care value in ZooKeeper election breaks rolling upgrades
 

 Key: ZOOKEEPER-1805
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1805
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Flavio Junqueira
Assignee: Flavio Junqueira
Priority: Blocker
 Fix For: 3.4.6, 3.5.0

 Attachments: ZOOKEEPER-1805-b3.4.patch, ZOOKEEPER-1805.patch, 
 ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, 
 ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, 
 ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch




ZooKeeper-trunk - Build # 2110 - Failure

2013-11-04 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk/2110/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 310215 lines...]
[junit] 2013-11-04 12:14:39,783 [myid:] - INFO  [ProcessThread(sid:0 
cport:-1)::PrepRequestProcessor@156] - PrepRequestProcessor exited loop!
[junit] 2013-11-04 12:14:39,784 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@168] - SyncRequestProcessor exited!
[junit] 2013-11-04 12:14:39,784 [myid:] - INFO  
[main:FinalRequestProcessor@442] - shutdown of request processor complete
[junit] 2013-11-04 12:14:39,784 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2013-11-04 12:14:39,785 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[]
[junit] 2013-11-04 12:14:39,786 [myid:] - INFO  [main:ClientBase@414] - 
STARTING server
[junit] 2013-11-04 12:14:39,786 [myid:] - INFO  [main:ZooKeeperServer@149] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build/test/tmp/test7233425283476512048.junit.dir/version-2
 snapdir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build/test/tmp/test7233425283476512048.junit.dir/version-2
[junit] 2013-11-04 12:14:39,787 [myid:] - INFO  
[main:NIOServerCnxnFactory@670] - Configuring NIO connection handler with 10s 
sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 
kB direct buffers.
[junit] 2013-11-04 12:14:39,787 [myid:] - INFO  
[main:NIOServerCnxnFactory@683] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2013-11-04 12:14:39,788 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build/test/tmp/test7233425283476512048.junit.dir/version-2/snapshot.b
[junit] 2013-11-04 12:14:39,791 [myid:] - INFO  [main:FileTxnSnapLog@297] - 
Snapshotting: 0xb to 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build/test/tmp/test7233425283476512048.junit.dir/version-2/snapshot.b
[junit] 2013-11-04 12:14:39,793 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2013-11-04 12:14:39,793 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:37685
[junit] 2013-11-04 12:14:39,794 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@828] - Processing stat command from 
/127.0.0.1:37685
[junit] 2013-11-04 12:14:39,794 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn$StatCommand@677] - Stat command output
[junit] 2013-11-04 12:14:39,795 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@999] - Closed socket connection for client 
/127.0.0.1:37685 (no session established for client)
[junit] 2013-11-04 12:14:39,795 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] 2013-11-04 12:14:39,803 [myid:] - INFO  [main:JMXEnv@105] - 
expect:InMemoryDataTree
[junit] 2013-11-04 12:14:39,803 [myid:] - INFO  [main:JMXEnv@108] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] 2013-11-04 12:14:39,803 [myid:] - INFO  [main:JMXEnv@105] - 
expect:StandaloneServer_port
[junit] 2013-11-04 12:14:39,803 [myid:] - INFO  [main:JMXEnv@108] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-11-04 12:14:39,804 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@57] - FINISHED TEST METHOD testQuota
[junit] 2013-11-04 12:14:39,804 [myid:] - INFO  [main:ClientBase@451] - 
tearDown starting
[junit] 2013-11-04 12:14:39,864 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down
[junit] 2013-11-04 12:14:39,864 [myid:] - INFO  [main:ZooKeeper@777] - 
Session: 0x14223094596 closed
[junit] 2013-11-04 12:14:39,864 [myid:] - INFO  [main:ClientBase@421] - 
STOPPING server
[junit] 2013-11-04 12:14:39,864 [myid:] - INFO  
[ConnnectionExpirer:NIOServerCnxnFactory$ConnectionExpirerThread@583] - 
ConnnectionExpirerThread interrupted
[junit] 2013-11-04 12:14:39,864 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-1:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2013-11-04 12:14:39,864 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@219]
 - accept thread exitted run method
[junit] 2013-11-04 12:14:39,864 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-0:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2013-11-04 12:14:39,865 [myid:] - INFO  [main:ZooKeeperServer@428] 
- shutting down
[junit] 2013-11-04 12:14:39,865 [myid:] - INFO  

[jira] [Updated] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-04 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-1807:
--

Issue Type: Bug  (was: New Feature)

 Observers spam each other creating connections to the election addr
 ---

 Key: ZOOKEEPER-1807
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1807
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Raul Gutierrez Segales
Assignee: Raul Gutierrez Segales
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1807.patch


 Hey [~shralex],
 I noticed today that my Observers are spamming each other trying to open 
 connections to the election port. I've got tons of these:
 {noformat}
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 9
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 10
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 6
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 12
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 14
 {noformat}
 and so on and so on, ad nauseam. 
 Now, looking around I found this inside FastLeaderElection.java from when you 
 committed ZOOKEEPER-107:
 {noformat}
  private void sendNotifications() {
 -for (QuorumServer server : self.getVotingView().values()) {
 -long sid = server.id;
 -
 +for (long sid : self.getAllKnownServerIds()) {
 +QuorumVerifier qv = self.getQuorumVerifier();
 {noformat}
 Is that really desired? I suspect that is what's causing Observers to try to 
 connect to each other (as opposed to just connecting to participants). I'll 
 give it a try now and let you know. (Also, we use observer ids that are  0, 
 and I saw some parts of the code that might not deal with that assumption - 
 so it could be that too..). 
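
The effect of the change quoted above can be illustrated with a small sketch that contrasts the two target sets. It is not FastLeaderElection.java: the class NotificationTargetsSketch, the Role enum, and the example ensemble in main are hypothetical and only mirror the idea of a voting view versus all known server ids.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical model of an ensemble; not the real QuorumPeer API.
final class NotificationTargetsSketch {
    enum Role { PARTICIPANT, OBSERVER }

    // Old behaviour in the quoted diff: notify only servers in the voting view.
    static Set<Long> votingView(Map<Long, Role> servers) {
        Set<Long> ids = new TreeSet<>();
        for (Map.Entry<Long, Role> e : servers.entrySet()) {
            if (e.getValue() == Role.PARTICIPANT) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    // New behaviour in the quoted diff: notify every known server,
    // which includes observers.
    static Set<Long> allKnownServerIds(Map<Long, Role> servers) {
        return new TreeSet<>(servers.keySet());
    }

    public static void main(String[] args) {
        // Invented ensemble: three participants and two observers.
        Map<Long, Role> servers = new TreeMap<>();
        servers.put(1L, Role.PARTICIPANT);
        servers.put(2L, Role.PARTICIPANT);
        servers.put(3L, Role.PARTICIPANT);
        servers.put(13L, Role.OBSERVER);
        servers.put(14L, Role.OBSERVER);

        System.out.println("voting view targets: " + votingView(servers));
        System.out.println("all known targets:   " + allKnownServerIds(servers));
    }
}
```

In this model an observer that iterates over all known ids also targets other observers, which would be consistent with the connection attempts described above, assuming the rest of the election code behaves as the report suggests.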





[jira] [Updated] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-04 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-1807:
--

Attachment: (was: ZOOKEEPER-1807.patch)

 Observers spam each other creating connections to the election addr
 ---

 Key: ZOOKEEPER-1807
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1807
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Raul Gutierrez Segales
Assignee: Raul Gutierrez Segales
 Fix For: 3.5.0




Failed: ZOOKEEPER-9863 PreCommit Build #1739

2013-11-04 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-9863
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1739/

###
## LAST 60 LINES OF THE CONSOLE 
###
Started by remote host 127.0.0.1
Building remotely on hadoop9 in workspace 
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build
Reverting /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk 
to depth infinity with ignoreExternals: false
Updating http://svn.apache.org/repos/asf/zookeeper/trunk at revision 
'2013-11-04T17:42:47.747 +'
At revision 1538690
no change for http://svn.apache.org/repos/asf/zookeeper/trunk since the 
previous build
No emails were triggered.
[PreCommit-ZOOKEEPER-Build] $ /bin/bash /tmp/hudson2500245945366574607.sh
/home/jenkins/tools/java/latest/bin/java
Buildfile: build.xml

check-for-findbugs:

findbugs.check:

forrest.check:

hudson-test-patch:
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Testing patch for ZOOKEEPER-9863.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] At revision 1538690.
 [exec] ZOOKEEPER-9863 is not Patch Available.  Exiting.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD SUCCESSFUL
Total time: 2 seconds
Archiving artifacts
ERROR: No artifacts found that match the file pattern 
trunk/build/test/findbugs/newPatchFindbugsWarnings.html,trunk/patchprocess/*.txt,trunk/patchprocess/*Warnings.xml,trunk/build/test/test-cppunit/*.txt,trunk/build/tmp/zk.log.
 Configuration error?
ERROR: 'trunk/build/test/findbugs/newPatchFindbugsWarnings.html' doesn't match 
anything: 'trunk' exists but not 
'trunk/build/test/findbugs/newPatchFindbugsWarnings.html'
Build step 'Archive the artifacts' changed build result to FAILURE
Recording test results
Description set: ZOOKEEPER-9863
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Commented] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-04 Thread Alexander Shraer (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813071#comment-13813071
 ] 

Alexander Shraer commented on ZOOKEEPER-1807:
-

Probably there's not going to be any more of a loop than for participants.
If you think this is not acceptable for observers, it would be sufficient to 
reply only when the sending server has a bigger config version (the one in 
QuorumVerifier) than the potential receiver. Otherwise there's no benefit for 
the receiver in terms of learning about new configs. 
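
The rule suggested above fits in a few lines. This is a sketch only: ConfigReplyPolicySketch and shouldReply are hypothetical names, and the two long parameters stand in for the config version held in each server's QuorumVerifier.

```java
// Hypothetical helper; not part of ZooKeeper.
final class ConfigReplyPolicySketch {
    // Reply only when the local server knows a newer config than the peer,
    // since otherwise the peer learns nothing new from the reply.
    static boolean shouldReply(long localConfigVersion, long peerConfigVersion) {
        return localConfigVersion > peerConfigVersion;
    }

    public static void main(String[] args) {
        System.out.println("newer local config: " + shouldReply(7L, 5L));
        System.out.println("same config:        " + shouldReply(5L, 5L));
        System.out.println("older local config: " + shouldReply(3L, 5L));
    }
}
```

Under this policy two observers holding the same config version would not reply to each other.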



 Observers spam each other creating connections to the election addr
 ---

 Key: ZOOKEEPER-1807
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1807
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Raul Gutierrez Segales
Assignee: Raul Gutierrez Segales
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1807.patch




[jira] [Commented] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-04 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813077#comment-13813077
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1807:
---

Thanks for the quick comment, Alex. Yeah, it sounds to me like that might be 
acceptable. Again, for huge deployments it might be a bit of a concern, since 
you'll be putting extra pressure on the cluster after, say, a big network 
partition. Thoughts? Cc: [~thawan], [~fpj]. 

 Observers spam each other creating connections to the election addr
 ---

 Key: ZOOKEEPER-1807
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1807
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Raul Gutierrez Segales
Assignee: Raul Gutierrez Segales
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1807.patch




[jira] [Commented] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813085#comment-13813085
 ] 

Hadoop QA commented on ZOOKEEPER-1807:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12611988/ZOOKEEPER-1807.patch
  against trunk revision 1535491.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1740//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1740//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1740//console

This message is automatically generated.

 Observers spam each other creating connections to the election addr
 ---

 Key: ZOOKEEPER-1807
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1807
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Raul Gutierrez Segales
Assignee: Raul Gutierrez Segales
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1807.patch


 Hey [~shralex],
 I noticed today that my Observers are spamming each other trying to open 
 connections to the election port. I've got tons of these:
 {noformat}
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 9
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 10
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 6
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 12
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 14
 {noformat}
 and so and so on ad nauseam. 
 Now, looking around I found this inside FastLeaderElection.java from when you 
 committed ZOOKEEPER-107:
 {noformat}
      private void sendNotifications() {
 -        for (QuorumServer server : self.getVotingView().values()) {
 -            long sid = server.id;
 -
 +        for (long sid : self.getAllKnownServerIds()) {
 +            QuorumVerifier qv = self.getQuorumVerifier();
 {noformat}
 Is that really desired? I suspect that is what's causing Observers to try to 
 connect to each other (as opposed to just connecting to participants). I'll
 give it a try now and let you know. (Also, we use observer ids that are  0, 
 and I saw some parts of the code that might not deal with that assumption - 
 so it could be that too..). 
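
 To make the suspicion above concrete, here is a minimal, self-contained sketch of how the set of notification targets grows when the loop walks every known server id instead of only the voting view. This is not the actual FastLeaderElection code: the class name, the helper methods and the split of ids into participants and observers are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names exist in ZooKeeper.
class NotificationTargets {
    // Targets when only voting members (participants) are notified.
    static Set<Long> votingOnly(Set<Long> participants) {
        return new TreeSet<>(participants);
    }

    // Targets when every known server id is notified, observers included.
    static Set<Long> allKnown(Set<Long> participants, Set<Long> observers) {
        Set<Long> all = new TreeSet<>(participants);
        all.addAll(observers);
        return all;
    }

    public static void main(String[] args) {
        // Assumed ensemble layout: three participants, six observers.
        Set<Long> participants = new TreeSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> observers = new TreeSet<>(Arrays.asList(6L, 9L, 10L, 12L, 13L, 14L));
        System.out.println("voting view only: " + votingOnly(participants));
        System.out.println("all known ids:    " + allKnown(participants, observers));
    }
}
```

 With the second variant an observer also ends up targeting the other observers, which would be consistent with the "There is a connection already for server N" messages, assuming those ids belong to observers.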



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Assigned] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-04 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Germán Blanco reassigned ZOOKEEPER-1807:


Assignee: Germán Blanco  (was: Raul Gutierrez Segales)


Failed: ZOOKEEPER-1807 PreCommit Build #1740

2013-11-04 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1807
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1740/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 237685 lines...]
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12611988/ZOOKEEPER-1807.patch
 [exec]   against trunk revision 1535491.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] -1 tests included.  The patch doesn't appear to include any new 
or modified tests.
 [exec] Please justify why no new tests are needed 
for this patch.
 [exec] Also please list what manual steps were 
performed to verify this patch.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1740//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1740//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1740//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Adding comment to Jira.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] e450abe5dff16d08a430c5fe301fe1d6d2f1a583 logged out
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Finished build.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1623:
 exec returned: 2

Total time: 36 minutes 29 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1807
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-04 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-1807:
--

Attachment: notifications-loop.png

Here's how notification traffic (on election port 3888 in my case) goes down 
with the patch (i.e.: without the notifications loop). It's pretty dramatic so 
I'd say this is definitely a blocker for 3.5.0. 


[jira] [Commented] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13813108#comment-13813108
 ] 

Hadoop QA commented on ZOOKEEPER-1807:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12611999/notifications-loop.png
  against trunk revision 1535491.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1741//console

This message is automatically generated.


Failed: ZOOKEEPER-1807 PreCommit Build #1741

2013-11-04 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1807
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1741/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 63 lines...]
 [exec] ==
 [exec] Applying patch.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 
 [exec] /usr/bin/patch:  Only garbage was found in the patch input.
 [exec] patch unexpectedly ends in middle of line
 [exec] PATCH APPLICATION FAILED
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12611999/notifications-loop.png
 [exec]   against trunk revision 1535491.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] -1 tests included.  The patch doesn't appear to include any new 
or modified tests.
 [exec] Please justify why no new tests are needed 
for this patch.
 [exec] Also please list what manual steps were 
performed to verify this patch.
 [exec] 
 [exec] -1 patch.  The patch command could not apply the patch.
 [exec] 
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1741//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Adding comment to Jira.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 740c8b4185b4d78f429ca9b61a33f873119c071a logged out
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Finished build.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1623:
 exec returned: 1

Total time: 1 minute 11 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1807
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Commented] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-04 Thread Thawan Kooburat (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13813111#comment-13813111
 ] 

Thawan Kooburat commented on ZOOKEEPER-1807:


I believe we have a rather different concern when using a large number of observers. In
our internal deployment, we made a few hacks which essentially kill all
observer-to-observer communication. Observers only observe the result of the
election algorithm. We also add a random delay when an observer tries to reconnect, so
that participants have a chance to synchronize with the leader and form the
quorum before the observers take away the leader's bandwidth.

My understanding is that with our leader election algorithm, you need to
broadcast your vote whenever your current vote changes, so this will generate a
lot of messages during the initial phase of the algorithm. Also, the N x N
communication needed by LE is not going to scale for large deployments.
I don't think promoting an observer to participant is going to be a common case
(it is only needed for DR purposes), so it would be acceptable to have an optional flag to
disable that feature in order to reduce LE overhead with a large number of
observers.
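
A rough sketch of the two ideas in this comment, for illustration only: a randomized reconnect delay, and a back-of-the-envelope count of directed server-to-server links with and without observer-to-observer traffic. The class, the method names and the link-counting model are assumptions made for this example; they are not taken from ZooKeeper or from the internal deployment described above.

```java
import java.util.Random;

// Hypothetical helpers; names and formulas are illustrative assumptions.
class ObserverReconnectSketch {
    // Base delay plus a uniformly random jitter in [0, maxJitterMs).
    // maxJitterMs must be positive, as required by Random.nextInt(int).
    static long reconnectDelayMs(long baseMs, int maxJitterMs, Random rng) {
        return baseMs + rng.nextInt(maxJitterMs);
    }

    // Directed links if each of n servers notifies every other server.
    static long allToAllLinks(long n) {
        return n * (n - 1);
    }

    // Directed links if the p participants talk to each other and each of
    // the o observers exchanges messages with participants only.
    static long participantsOnlyLinks(long p, long o) {
        return p * (p - 1) + 2 * p * o;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        System.out.println("delay ms: " + reconnectDelayMs(1000, 5000, rng));
        System.out.println("all-to-all, 100 servers: " + allToAllLinks(100));
        System.out.println("5 participants, 95 observers: " + participantsOnlyLinks(5, 95));
    }
}
```

Under this simple model, 100 servers talking all-to-all need 9900 directed links, while 5 participants plus 95 observers that never talk to each other need 970.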


[jira] [Updated] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-04 Thread Alexander Shraer (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Shraer updated ZOOKEEPER-1807:


Attachment: ZOOKEEPER-1807-alex.patch

Sorry for the confusion, everyone, but it seems that for reconfiguration
purposes it's only important to send a notification (containing the new config) to a
server if it's a participant either in the current or in the next configuration.
Only in that case may we need to convince it to adopt its new role as a
participant and help form a quorum. So perhaps the attached patch could work.
What do you think?
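
The rule stated above can be sketched as a set union. This is only an illustration of the idea, not the contents of the attached patch; the class, the method and the example ids are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; not the code in ZOOKEEPER-1807-alex.patch.
class ReconfigNotificationTargets {
    // A server is notified if it is a voting member of the current
    // configuration or of the next (proposed) configuration.
    static Set<Long> targets(Set<Long> currentVoters, Set<Long> nextVoters) {
        Set<Long> result = new TreeSet<>(currentVoters);
        result.addAll(nextVoters);
        return result;
    }

    public static void main(String[] args) {
        // Assumed example: server 4 becomes a participant in the next config,
        // server 1 leaves; any observer id (say 9) is in neither set.
        Set<Long> current = new TreeSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> next = new TreeSet<>(Arrays.asList(2L, 3L, 4L));
        System.out.println(targets(current, next));
    }
}
```

Servers that are observers in both configurations never appear in the result, so they would not receive notifications under this rule.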


[jira] [Commented] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-04 Thread Alexander Shraer (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13813287#comment-13813287
 ] 

Alexander Shraer commented on ZOOKEEPER-1807:
-

This part is described in Section 3.2 of the paper: 
https://www.usenix.org/system/files/conference/atc12/atc12-final74.pdf
Of course the paper doesn't talk about FastLeaderElection and things like that.
So the actual implementation needs to have comments, and it does have them in
many places; here we should probably explain some more.


Failed: ZOOKEEPER-1807 PreCommit Build #1742

2013-11-04 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1807
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1742/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 270048 lines...]
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12612023/ZOOKEEPER-1807-alex.patch
 [exec]   against trunk revision 1535491.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] -1 tests included.  The patch doesn't appear to include any new 
or modified tests.
 [exec] Please justify why no new tests are needed 
for this patch.
 [exec] Also please list what manual steps were 
performed to verify this patch.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1742//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1742//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1742//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Adding comment to Jira.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 703a0aaef1d8bc57a09a2890b34bef39bdde99b1 logged out
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Finished build.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1623:
 exec returned: 1

Total time: 33 minutes 40 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1807
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13813294#comment-13813294
 ] 

Hadoop QA commented on ZOOKEEPER-1807:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12612023/ZOOKEEPER-1807-alex.patch
  against trunk revision 1535491.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1742//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1742//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1742//console

This message is automatically generated.


[jira] [Commented] (ZOOKEEPER-1804) Stat the realtime tps of zookeepr server

2013-11-04 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13813386#comment-13813386
 ] 

Patrick Hunt commented on ZOOKEEPER-1804:
-

Hi [~nileader], in order for the patchbot to do its work you'll need to attach
a patch generated with the --no-prefix option in git, see the guide:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToContribute

 Stat the realtime tps of zookeepr server
 

 Key: ZOOKEEPER-1804
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1804
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Leader Ni
Assignee: Leader Ni
 Attachments: ZOOKEEPER-1804-2.patch, ZOOKEEPER-1804.patch


 At present, when we assess whether ZooKeeper can support a given business scenario,
 we always go by the number of subscribers or the number of clients.
 As you know, sometimes many clients are connected to ZooKeeper but do nothing,
 while others run complex business logic.
 So we need statistics on the real-time TPS of ZooKeeper.
 [-Solution---]
 Solution 1:
 If you only want to know the number of transactions processed in real time, you can use the
 patch ZOOKEEPER-1804.patch.
 Solution 2:
 If you also want to know how clients use ZooKeeper, and the real-time read/write rate per second
 of each ZooKeeper client, you can use the patch ZOOKEEPER-1804-2.patch.
 Set the Java system property -Dserver_process_stats=true to enable the feature.
 Sample:
 $echo rwps|nc localhost 2181
 RealTime R/W Statistics:
 getChildren2:   0.5994005994005994
 createSession:  1.6983016983016983
 closeSession:   0.999000999000999
 setData: 110.18981018981019
 setWatches:   129.17082917082917
 getChildren:    68.83116883116884
 delete:  19.980019980019982
 create:  22.27772227772228
 exists:  1806.2937062937062
 getDate: 729.5704295704296





[jira] [Commented] (ZOOKEEPER-1666) Avoid Reverse DNS lookup if the hostname in connection string is literal IP address.

2013-11-04 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13813548#comment-13813548
 ] 

Camille Fournier commented on ZOOKEEPER-1666:
-

I checked the C code, not a C expert but it looks like we rely on getaddrinfo 
which takes an ip address or a hostname, so I think we're good there. I will 
check this in.

 Avoid Reverse DNS lookup if the hostname in connection string is literal IP 
 address.
 

 Key: ZOOKEEPER-1666
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1666
 Project: ZooKeeper
  Issue Type: Improvement
  Components: java client
Reporter: George Cao
Assignee: George Cao
  Labels: patch, test
 Attachments: ZOOKEEPER-1666.patch, ZOOKEEPER-1666.patch


 In our environment, if InetSocketAddress.getHostName() is called and the host
 name in the connection string is a literal IP address, the call
 triggers a reverse DNS lookup, which is very slow.
 In this situation, the host name can simply be set to the IP address without causing
 any problems.
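
 As a hedged illustration of the idea (not necessarily what the attached patch does), Java 7 and later offer InetSocketAddress.getHostString(), which is documented to return the host name or the string form of the address without attempting a reverse lookup. The class and method names below are hypothetical.

```java
import java.net.InetSocketAddress;

// Hypothetical helper; only InetSocketAddress and getHostString() are real JDK API.
class HostStringSketch {
    // Returns the host part without triggering a reverse DNS lookup.
    static String hostWithoutReverseLookup(InetSocketAddress addr) {
        return addr.getHostString();
    }

    public static void main(String[] args) {
        // A literal IP address, as it might appear in a connection string.
        InetSocketAddress addr = new InetSocketAddress("127.0.0.1", 2181);
        System.out.println(hostWithoutReverseLookup(addr));
    }
}
```

 For an address built from a literal IP, this yields the literal itself, whereas getHostName() may go to the resolver first.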





[jira] [Commented] (ZOOKEEPER-1666) Avoid Reverse DNS lookup if the hostname in connection string is literal IP address.

2013-11-04 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13813559#comment-13813559
 ] 

Camille Fournier commented on ZOOKEEPER-1666:
-

I got this into 3.5, but it requires a bit of a rewrite to work for 3.4.6. If 
we want to put it there, I need you to write it to fit, [~georgecao]. LMK, 
otherwise I will resolve this for just 3.5.


[jira] [Commented] (ZOOKEEPER-1652) zookeeper java client does a reverse dns lookup when connecting

2013-11-04 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13813565#comment-13813565
 ] 

Camille Fournier commented on ZOOKEEPER-1652:
-

I believe this addresses the same issue as ZOOKEEPER-1666, but will work for 
3.4.6. So I'm going to use this for 3.4.6, but not trunk, which was resolved 
with the other patch.

 zookeeper java client does a reverse dns lookup when connecting
 ---

 Key: ZOOKEEPER-1652
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1652
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.4.5
Reporter: Sean Bridges
Assignee: Sean Bridges
Priority: Critical
 Attachments: ZOOKEEPER-1652.patch


 When connecting to ZooKeeper, the client does a reverse DNS lookup on the 
 hostname.  In our environment, the reverse DNS lookup takes 5 seconds to 
 fail, causing ZooKeeper clients to connect slowly.
 The reverse DNS lookup occurs in ClientCnxn, in the calls to addr.getHostName():
 {code}
 setName(getName().replaceAll("\\(.*\\)",
         "(" + addr.getHostName() + ":" + addr.getPort() + ")"));
 try {
     zooKeeperSaslClient = new ZooKeeperSaslClient(
             "zookeeper/" + addr.getHostName());
 } catch (LoginException e) {
 {code}
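
For illustration only: on Java 7 and later, InetSocketAddress.getHostString() returns the host name if one is already known and otherwise the literal address text, without attempting a reverse lookup. The sketch below assumes that method is available (it does not exist on Java 6) and uses a hypothetical ThreadNameSketch helper in place of the ClientCnxn code quoted above; it is not the attached patch.

```java
import java.net.InetSocketAddress;
import java.util.regex.Matcher;

final class ThreadNameSketch {
    /** Formats "(host:port)" using getHostString(), which never triggers reverse DNS. */
    static String label(InetSocketAddress addr) {
        return "(" + addr.getHostString() + ":" + addr.getPort() + ")";
    }

    /** Replaces the parenthesised part of a thread name with the server label. */
    static String rename(String currentName, InetSocketAddress addr) {
        // quoteReplacement guards against '$' or '\' being read as regex replacement syntax.
        return currentName.replaceAll("\\(.*\\)", Matcher.quoteReplacement(label(addr)));
    }
}
```

For example, ThreadNameSketch.rename("main-SendThread()", new InetSocketAddress("192.0.2.10", 2181)) yields "main-SendThread(192.0.2.10:2181)".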





[jira] [Commented] (ZOOKEEPER-1652) zookeeper java client does a reverse dns lookup when connecting

2013-11-04 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813570#comment-13813570
 ] 

Camille Fournier commented on ZOOKEEPER-1652:
-

Actually, when I apply this change with the test for ZOOKEEPER-1666, that test 
fails. [~georgecao], [~sbridges], want to take a look?



[jira] [Updated] (ZOOKEEPER-1666) Avoid Reverse DNS lookup if the hostname in connection string is literal IP address.

2013-11-04 Thread Camille Fournier (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Camille Fournier updated ZOOKEEPER-1666:


Attachment: ZOOKEEPER-1666-34.patch

3.4.6 patch

 Avoid Reverse DNS lookup if the hostname in connection string is literal IP 
 address.
 

 Key: ZOOKEEPER-1666
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1666
 Project: ZooKeeper
  Issue Type: Improvement
  Components: java client
Reporter: George Cao
Assignee: George Cao
  Labels: patch, test
 Attachments: ZOOKEEPER-1666-34.patch, ZOOKEEPER-1666.patch, 
 ZOOKEEPER-1666.patch


 In our environment, if InetSocketAddress.getHostName() is called and the host 
 name in the connection string is a literal IP address, the call triggers a 
 reverse DNS lookup, which is very slow.
 In this situation, the host name can simply be set to the IP address without 
 causing any problems. 





[jira] [Commented] (ZOOKEEPER-1652) zookeeper java client does a reverse dns lookup when connecting

2013-11-04 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813578#comment-13813578
 ] 

Camille Fournier commented on ZOOKEEPER-1652:
-

I elected to make a patch for ZOOKEEPER-1666 that works on 3.4.6. 
Please look there for it, and leave comments.



[jira] [Commented] (ZOOKEEPER-1666) Avoid Reverse DNS lookup if the hostname in connection string is literal IP address.

2013-11-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813637#comment-13813637
 ] 

Hadoop QA commented on ZOOKEEPER-1666:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12612093/ZOOKEEPER-1666-34.patch
  against trunk revision 1538853.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1743//console

This message is automatically generated.



Failed: ZOOKEEPER-1666 PreCommit Build #1743

2013-11-04 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1666
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1743/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 68 lines...]
 [exec] 
==
 [exec] 
 [exec] 
 [exec] patching file src/java/test/org/apache/zookeeper/test/StaticHostProviderTest.java
 [exec] Hunk #1 FAILED at 35.
 [exec] Hunk #2 succeeded at 301 with fuzz 1 (offset 209 lines).
 [exec] 1 out of 2 hunks FAILED -- saving rejects to file src/java/test/org/apache/zookeeper/test/StaticHostProviderTest.java.rej
 [exec] patching file src/java/main/org/apache/zookeeper/client/StaticHostProvider.java
 [exec] Hunk #1 FAILED at 56.
 [exec] 1 out of 1 hunk FAILED -- saving rejects to file src/java/main/org/apache/zookeeper/client/StaticHostProvider.java.rej
 [exec] PATCH APPLICATION FAILED
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12612093/ZOOKEEPER-1666-34.patch
 [exec]   against trunk revision 1538853.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] -1 patch.  The patch command could not apply the patch.
 [exec] 
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1743//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] Adding comment to Jira.
 [exec] Comment added.
 [exec] feef8621728acce1af4f17c9cd65d22bf710ec7c logged out
 [exec] Finished build.
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1623:
 exec returned: 1

Total time: 1 minute 26 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1666
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

ZooKeeper_branch33_solaris - Build # 698 - Still Failing

2013-11-04 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch33_solaris/698/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 98610 lines...]
[junit] 2013-11-05 07:09:01,635 - INFO  [main:ZooKeeperServer@154] - 
Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test5859429153162977043.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test5859429153162977043.junit.dir/version-2
[junit] 2013-11-05 07:09:01,636 - INFO  [main:NIOServerCnxn$Factory@143] - 
binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2013-11-05 07:09:01,638 - INFO  [main:FileSnap@82] - Reading 
snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test5859429153162977043.junit.dir/version-2/snapshot.0
[junit] 2013-11-05 07:09:01,641 - INFO  [main:FileTxnSnapLog@256] - 
Snapshotting: b
[junit] 2013-11-05 07:09:01,644 - INFO  [main:FourLetterWordMain@43] - 
connecting to 127.0.0.1 11221
[junit] 2013-11-05 07:09:01,645 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn$Factory@251] - 
Accepted socket connection from /127.0.0.1:64395
[junit] 2013-11-05 07:09:01,645 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@1237] - Processing 
stat command from /127.0.0.1:64395
[junit] 2013-11-05 07:09:01,646 - INFO  
[Thread-4:NIOServerCnxn$StatCommand@1153] - Stat command output
[junit] ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] 2013-11-05 07:09:01,647 - INFO  [Thread-4:NIOServerCnxn@1435] - 
Closed socket connection for client /127.0.0.1:64395 (no session established 
for client)
[junit] expect:InMemoryDataTree
[junit] found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] expect:StandaloneServer_port
[junit] found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-11-05 07:09:01,649 - INFO  [main:ClientBase@408] - STOPPING 
server
[junit] 2013-11-05 07:09:01,650 - INFO  
[ProcessThread:-1:PrepRequestProcessor@128] - PrepRequestProcessor exited loop!
[junit] 2013-11-05 07:09:01,650 - INFO  
[SyncThread:0:SyncRequestProcessor@151] - SyncRequestProcessor exited!
[junit] 2013-11-05 07:09:01,651 - INFO  [main:FinalRequestProcessor@370] - 
shutdown of request processor complete
[junit] 2013-11-05 07:09:01,652 - INFO  [main:FourLetterWordMain@43] - 
connecting to 127.0.0.1 11221
[junit] ensureOnly:[]
[junit] 2013-11-05 07:09:01,654 - INFO  [main:ClientBase@401] - STARTING 
server
[junit] 2013-11-05 07:09:01,654 - INFO  [main:ZooKeeperServer@154] - 
Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test5859429153162977043.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test5859429153162977043.junit.dir/version-2
[junit] 2013-11-05 07:09:01,655 - INFO  [main:NIOServerCnxn$Factory@143] - 
binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2013-11-05 07:09:01,656 - INFO  [main:FileSnap@82] - Reading 
snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test5859429153162977043.junit.dir/version-2/snapshot.b
[junit] 2013-11-05 07:09:01,659 - INFO  [main:FileTxnSnapLog@256] - 
Snapshotting: b
[junit] 2013-11-05 07:09:01,661 - INFO  [main:FourLetterWordMain@43] - 
connecting to 127.0.0.1 11221
[junit] 2013-11-05 07:09:01,662 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn$Factory@251] - 
Accepted socket connection from /127.0.0.1:64397
[junit] 2013-11-05 07:09:01,662 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@1237] - Processing 
stat command from /127.0.0.1:64397
[junit] 2013-11-05 07:09:01,663 - INFO  
[Thread-5:NIOServerCnxn$StatCommand@1153] - Stat command output
[junit] 2013-11-05 07:09:01,663 - INFO  [Thread-5:NIOServerCnxn@1435] - 
Closed socket connection for client /127.0.0.1:64397 (no session established 
for client)
[junit] ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] expect:InMemoryDataTree
[junit] found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] expect:StandaloneServer_port
[junit] found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-11-05 07:09:01,665 - INFO  

[jira] [Commented] (ZOOKEEPER-1805) Don't care value in ZooKeeper election breaks rolling upgrades

2013-11-04 Thread JIRA

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813720#comment-13813720
 ] 

Germán Blanco commented on ZOOKEEPER-1805:
--

I understand that you want to add a version field to notifications in order to 
know which come from a server that ignores zxid and electionEpoch for an 
established ensemble and which come from a server without this change, correct?
Once that is done, it would be possible to make the correct comparison for 
the epoch when joining an ensemble with a mixture of updated and not-updated 
servers.
That sounds good to me. Having a version field will help in the future if any 
other change is required in notifications for fast leader election. For this 
problem, it means that the comparison between votes only needs to be different 
for the special case in which there is a mixture of servers, and it doesn't 
need to be modified at all for the rest of the cases, which seems to be a safer 
approach.

 Don't care value in ZooKeeper election breaks rolling upgrades
 

 Key: ZOOKEEPER-1805
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1805
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Flavio Junqueira
Assignee: Flavio Junqueira
Priority: Blocker
 Fix For: 3.4.6, 3.5.0

 Attachments: ZOOKEEPER-1805-b3.4.patch, ZOOKEEPER-1805.patch, 
 ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, 
 ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, 
 ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch


 This issue was originally reported in ZOOKEEPER-1732.





[jira] [Assigned] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-04 Thread Alexander Shraer (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Shraer reassigned ZOOKEEPER-1807:
---

Assignee: Alexander Shraer  (was: Germán Blanco)

 Observers spam each other creating connections to the election addr
 ---

 Key: ZOOKEEPER-1807
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1807
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Raul Gutierrez Segales
Assignee: Alexander Shraer
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1807-alex.patch, ZOOKEEPER-1807.patch, 
 notifications-loop.png


 Hey [~shralex],
 I noticed today that my Observers are spamming each other trying to open 
 connections to the election port. I've got tons of these:
 {noformat}
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 9
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 10
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 6
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 12
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 14
 {noformat}
 and so on, ad nauseam. 
 Now, looking around I found this inside FastLeaderElection.java from when you 
 committed ZOOKEEPER-107:
 {noformat}
  private void sendNotifications() {
 -for (QuorumServer server : self.getVotingView().values()) {
 -long sid = server.id;
 -
 +for (long sid : self.getAllKnownServerIds()) {
 +QuorumVerifier qv = self.getQuorumVerifier();
 {noformat}
 Is that really desired? I suspect that is what's causing Observers to try to 
 connect to each other (as opposed to just connecting to participants). I'll 
 give it a try now and let you know. (Also, we use observer ids that are  0, 
 and I saw some parts of the code that might not deal with that assumption - 
 so it could be that too.) 
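
To make the suspicion above concrete, here is a small sketch of how the two loops differ in who receives notifications. It is not ZooKeeper code: the NotificationTargets class and the map from server id to "is a voting participant" are hypothetical stand-ins for getVotingView() and getAllKnownServerIds(), and whether this is the actual cause is still unconfirmed in the report.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class NotificationTargets {
    /** Targets when looping over every known server id (participants and observers). */
    static List<Long> allKnown(Map<Long, Boolean> isParticipant) {
        List<Long> ids = new ArrayList<>(isParticipant.keySet());
        Collections.sort(ids);
        return ids;
    }

    /** Targets when looping over the voting view only (participants). */
    static List<Long> votingOnly(Map<Long, Boolean> isParticipant) {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, Boolean> e : isParticipant.entrySet()) {
            if (e.getValue()) {
                ids.add(e.getKey());
            }
        }
        Collections.sort(ids);
        return ids;
    }
}
```

With participants 1, 2, 3 and observers 13, 14, votingOnly returns [1, 2, 3] while allKnown returns [1, 2, 3, 13, 14], so under the second loop an observer such as 13 would also try to reach observer 14.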


