[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-05-04 Thread Alan Cabrera (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863888#action_12863888 ]

Alan Cabrera commented on ZOOKEEPER-690:


The test still locks up but it seems that it's Ant that freezes.  This process 
then becomes a zombie process.  This zombie process holds on to ports that the  
tests use and so subsequent runs fail.  Notably, QuorumPeerMainTest takes 15 
minutes to fail.
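
A leftover process that still holds the test ports can be confirmed before a re-run. The following is a minimal Java sketch, not part of ZooKeeper: the class name `PortProbe` and the scanned range 11221-11240 are assumptions chosen for illustration (the logs in this thread show ports such as 11222 and 11233), so adjust the range to whatever your run uses.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of ZooKeeper: reports ports that cannot be bound.
final class PortProbe {
    /** Returns true if a server socket can currently be bound to the port. */
    static boolean isFree(int port) {
        try (ServerSocket socket = new ServerSocket()) {
            socket.setReuseAddress(false);
            socket.bind(new InetSocketAddress(port));
            return true;
        } catch (IOException e) {
            // Bind failed, most likely because another process holds the port.
            return false;
        }
    }

    public static void main(String[] args) {
        // The range is an assumption; use the ports your test run binds.
        for (int port = 11221; port <= 11240; port++) {
            if (!isFree(port)) {
                System.out.println("port " + port + " is still in use");
            }
        }
    }
}
```

If any port is reported as in use after a test run has ended, a stale process is a likely cause and should be killed before the next run.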


 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
 Fix For: 3.3.1, 3.4.0

 Attachments: jstack-201004201053.txt, jstack-201004291409.txt, 
 jstack-201004291527.txt, jstack-AsyncHammerTest-201004301209.txt, 
 nohup-201004201053.txt, nohup-201004291409.txt, nohup-201004291527.txt, 
 nohup-AsyncHammerTest-201004301209.txt, 
 nohup-QuorumPeerMainTest-201004301209.txt, 
 TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log, 
 ZOOKEEPER-690.patch, ZOOKEEPER-690.patch, ZOOKEEPER-690.patch


 The Hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
 There is a huge set of CancelledKeyExceptions in the logs. Still going 
 through the logs to find out the reason for the failure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-05-04 Thread Henry Robinson (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863902#action_12863902 ]

Henry Robinson commented on ZOOKEEPER-690:
--

Hi Alan - 

Looking at this attachment: nohup-AsyncHammerTest-201004301209.txt - the tests 
appear to be run twice. The first testObserversHammer completes successfully, 
the second fails. Were you running the tests until you experienced the failure? 

Henry




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-05-04 Thread Alan Cabrera (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863912#action_12863912 ]

Alan Cabrera commented on ZOOKEEPER-690:


The first run is the run where Ant locks up.  The second run is the subsequent 
run after the first run zombies.  

Rebooting clears everything, and then the first test run locks up and becomes 
a zombie again.




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-05-04 Thread Henry Robinson (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863915#action_12863915 ]

Henry Robinson commented on ZOOKEEPER-690:
--

Weird - it looks like the test is shutting down correctly:


[junit] 2010-04-30 11:41:52,896 - INFO  [main:clientb...@222] - connecting to 127.0.0.1 11233
[junit] 2010-04-30 11:41:52,896 - INFO  [main:quorumb...@277] - 127.0.0.1:11233 is no longer accepting client connections
[junit] 2010-04-30 11:41:52,896 - INFO  [main:clientb...@222] - connecting to 127.0.0.1 11234
[junit] 2010-04-30 11:41:52,897 - INFO  [main:quorumb...@277] - 127.0.0.1:11234 is no longer accepting client connections
[junit] 2010-04-30 11:41:52,897 - INFO  [main:clientb...@222] - connecting to 127.0.0.1 11235
[junit] 2010-04-30 11:41:52,897 - INFO  [main:quorumb...@277] - 127.0.0.1:11235 is no longer accepting client connections
[junit] 2010-04-30 11:41:52,897 - INFO  [main:clientb...@222] - connecting to 127.0.0.1 11236
[junit] 2010-04-30 11:41:52,898 - INFO  [main:quorumb...@277] - 127.0.0.1:11236 is no longer accepting client connections
[junit] 2010-04-30 11:41:52,898 - INFO  [main:clientb...@222] - connecting to 127.0.0.1 11237
[junit] 2010-04-30 11:41:52,898 - INFO  [main:quorumb...@277] - 127.0.0.1:11237 is no longer accepting client connections
[junit] 2010-04-30 11:41:52,901 - INFO  [main:junit4zktestrunner$loggedinvokemet...@56] - FINISHED TEST METHOD testObserversHammer
[junit] 2010-04-30 11:41:52,901 - INFO  [main:zktestcas...@59] - SUCCEEDED testObserversHammer
[junit] 2010-04-30 11:41:52,901 - INFO  [main:zktestcas...@54] - FINISHED testObserversHammer

and then it goes into trying the C tests which fail for an unrelated reason - 
does it lock up at this point or does it actually fail out to the CLI? If it 
locks up, is the jstack output you attached from that run?






[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-05-04 Thread Benjamin Reed (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863917#action_12863917 ]

Benjamin Reed commented on ZOOKEEPER-690:
-

Alan, it looks like you took the jstack of the parent process rather than the 
child.




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-05-04 Thread Alan Cabrera (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863922#action_12863922 ]

Alan Cabrera commented on ZOOKEEPER-690:


Correct.  The child seems to have completed.  The parent was the one that 
seemed to have locked up.




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-05-04 Thread Flavio Paiva Junqueira (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863996#action_12863996 ]

Flavio Paiva Junqueira commented on ZOOKEEPER-690:
--

+1, the patch looks good. 

I believe I have the same setup as Alan: Mac OS X 10.5.8 and the same Java 
version. It runs fine for me. I have run it from Eclipse and from the command 
line multiple times, and I observed no problem. I have also asked a colleague 
with the same setup to try to reproduce it, and it also runs fine for my 
colleague. 

Alan, could you make sure that JAVA_HOME is set to the correct path before you 
run ant from the command line? If the variable is not set, then ant might not 
pick the correct java version to run, even if it is marked in the java 
preferences utility.
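
One way to compare the JAVA_HOME setting with the JVM that actually runs is to print both. This is a small sketch under assumptions: `JvmInfo` is a hypothetical class, not part of the ZooKeeper build, and it only reports on the JVM used to launch it, which may differ from the one Ant forks for the tests.

```java
// Hypothetical diagnostic, not part of the ZooKeeper build.
final class JvmInfo {
    /** Describes the JAVA_HOME variable and the JVM running this code. */
    static String describe() {
        String javaHome = System.getenv("JAVA_HOME");
        return "JAVA_HOME=" + (javaHome == null ? "(not set)" : javaHome)
                + "\njava.home=" + System.getProperty("java.home")
                + "\njava.version=" + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If JAVA_HOME is unset, or java.home points somewhere other than the JAVA_HOME installation, the command-line run may be using a different JVM than expected.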

We should keep investigating this problem, but given that it runs on two 
computers with similar configurations, I recommend we remove the blocker flag 
for this issue.




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-05-04 Thread Alan Cabrera (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864009#action_12864009 ]

Alan Cabrera commented on ZOOKEEPER-690:


Are you saying that the AsyncHammerTest also fails on your machines without 
the patch?  I ask this because {{trunk}}, with no patches, builds and tests 
fine on all my other Macs except the one at work.

My JAVA_HOME is properly set.




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-05-04 Thread Patrick Hunt (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864014#action_12864014 ]

Patrick Hunt commented on ZOOKEEPER-690:


I think Flavio is saying it works for him.

Alan, any idea what's different between the versions that are running fine and 
the one where it fails? Java VM version? CPU core count, networking setup, is 
there something that stands out?


 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
 Fix For: 3.4.0

 Attachments: jstack-201004201053.txt, jstack-201004291409.txt, 
 jstack-201004291527.txt, jstack-AsyncHammerTest-201004301209.txt, 
 nohup-201004201053.txt, nohup-201004291409.txt, nohup-201004291527.txt, 
 nohup-AsyncHammerTest-201004301209.txt, 
 nohup-QuorumPeerMainTest-201004301209.txt, 
 TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log, 
 ZOOKEEPER-690.patch, ZOOKEEPER-690.patch, ZOOKEEPER-690.patch


 the hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
  There are huge set of cancelledkeyexceptions in the logs. Still going 
 through the logs to find out the reason for failure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-05-04 Thread Flavio Paiva Junqueira (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864040#action_12864040 ]

Flavio Paiva Junqueira commented on ZOOKEEPER-690:
--

Hi Alan, it does not block for me, with or without the patch. My configuration 
is pretty much like the one that works for you. 




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-30 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862594#action_12862594 ]

Hadoop QA commented on ZOOKEEPER-690:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12443246/ZOOKEEPER-690.patch
  against trunk revision 939172.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 12 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/77/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/77/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/77/console

This message is automatically generated.

 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
 Fix For: 3.3.1, 3.4.0

 Attachments: jstack-201004201053.txt, jstack-201004291409.txt, 
 jstack-201004291527.txt, nohup-201004201053.txt, nohup-201004291409.txt, 
 nohup-201004291527.txt, TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, 
 zoo.log, ZOOKEEPER-690.patch, ZOOKEEPER-690.patch, ZOOKEEPER-690.patch


 the hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
  There are huge set of cancelledkeyexceptions in the logs. Still going 
 through the logs to find out the reason for failure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-29 Thread Henry Robinson (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862351#action_12862351 ]

Henry Robinson commented on ZOOKEEPER-690:
--

Alan - can you try this patch to see if it fixes things? 

Thanks, 

Henry


 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
 Fix For: 3.3.1, 3.4.0

 Attachments: jstack-201004201053.txt, nohup-201004201053.txt, 
 TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log, 
 ZOOKEEPER-690.patch


 the hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
  There are huge set of cancelledkeyexceptions in the logs. Still going 
 through the logs to find out the reason for failure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-29 Thread Alan Cabrera (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862398#action_12862398 ]

Alan Cabrera commented on ZOOKEEPER-690:


Test does not lock up but it does not pass.

 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
 Fix For: 3.3.1, 3.4.0

 Attachments: jstack-201004201053.txt, nohup-201004201053.txt, 
 TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, 
 TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log, 
 ZOOKEEPER-690.patch


 the hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
  There are huge set of cancelledkeyexceptions in the logs. Still going 
 through the logs to find out the reason for failure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-29 Thread Benjamin Reed (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862414#action_12862414 ]

Benjamin Reed commented on ZOOKEEPER-690:
-

i think the key fix is here:

{quote}
public void setLearnerType(LearnerType p) {
    learnerType = p;
    if (quorumPeers.containsValue(this.myid)) {
        this.quorumPeers.get(myid).type = p;
    } else {
        LOG.error("Setting LearnerType to " + p + " but " + myid
                + " not in QuorumPeers. ");
    }
    ...
}
{quote}

right?

the problem i see is that we are only updating the quorumPeers for the one 
peer.  the other peers are going to be thinking it is a participant.

 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
 Fix For: 3.3.1, 3.4.0

 Attachments: jstack-201004201053.txt, jstack-201004291409.txt, 
 nohup-201004201053.txt, nohup-201004291409.txt, 
 TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log, 
 ZOOKEEPER-690.patch


 the hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
  There are huge set of cancelledkeyexceptions in the logs. Still going 
 through the logs to find out the reason for failure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-29 Thread Henry Robinson (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862424#action_12862424 ]

Henry Robinson commented on ZOOKEEPER-690:
--

This map is, I think, shared between the quorum peers for the purposes of the 
test (in general, there aren't two quorum peers sharing this data structure 
when running normally). 

But! The error here is that I'm dumb (and that Java's type-checking leaves a 
little to be desired). I've written quorumPeers.containsValue up there, but 
actually it should be quorumPeers.containsKey. New patch on the way, let's see 
if that fixes it.
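
The reason the compiler accepts the mistake is that `Map.containsValue` and `Map.containsKey` both take a plain `Object`. The sketch below illustrates this with stand-in types; `ContainsKeyDemo`, `Server`, and the string-valued `type` field are hypothetical and only mimic the shape of the code quoted earlier in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: stand-in types, not the real QuorumPeer classes.
final class ContainsKeyDemo {
    static final class Server {
        String type = "PARTICIPANT";
    }

    /** Buggy variant: looks for the id among the values, so it never matches. */
    static boolean updateWithContainsValue(Map<Long, Server> peers, long myid, String type) {
        // Compiles because containsValue(Object) accepts the boxed Long.
        if (peers.containsValue(myid)) {
            peers.get(myid).type = type;
            return true;
        }
        return false;
    }

    /** Corrected variant: looks for the id among the keys. */
    static boolean updateWithContainsKey(Map<Long, Server> peers, long myid, String type) {
        if (peers.containsKey(myid)) {
            peers.get(myid).type = type;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<Long, Server> peers = new HashMap<>();
        peers.put(1L, new Server());
        System.out.println(updateWithContainsValue(peers, 1L, "OBSERVER")); // prints false
        System.out.println(updateWithContainsKey(peers, 1L, "OBSERVER"));   // prints true
    }
}
```

With the `containsValue` check the update is silently skipped and only the error branch runs, which matches the symptom of the learner type never being recorded in the map.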




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-29 Thread Benjamin Reed (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862480#action_12862480 ]

Benjamin Reed commented on ZOOKEEPER-690:
-

henry, i think this may show that we can't really have a setLearnerType() 
method. In the real distributed setting, each peer will have its own list, so 
we should really think of the peers list as immutable.




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-29 Thread Henry Robinson (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862482#action_12862482 ]

Henry Robinson commented on ZOOKEEPER-690:
--

Ben - 

Agreed. I see this as the same as setMyid(...) - it sets an immutable value and 
should only be called once. I'd prefer if these parameters were 'final' in 
QuorumPeer and set in the constructor, but that's not the way that 
runFromConfig (the only place outside of tests that these methods are called) 
is written. Then we could get rid of setLearnerType, for sure. 

The real error here, I think, is duplicating the learner type between 
QuorumPeer and QuorumServer. If we are going to have the list of QuorumServers, 
then getLearnerType should look up the learner type in the peer map. Same for 
the server id, perhaps, and we should just save a reference to the QuorumServer 
that represents our QuorumPeer. 
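
The design described above could look roughly like the following. This is a sketch under assumptions, not the actual ZooKeeper classes: `PeerConfigSketch`, `ServerEntry`, and `Peer` are hypothetical names, and the real QuorumPeer has far more state than shown here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the learner type lives only in the peer map entry.
final class PeerConfigSketch {
    enum LearnerType { PARTICIPANT, OBSERVER }

    static final class ServerEntry {
        final long id;
        final LearnerType type;

        ServerEntry(long id, LearnerType type) {
            this.id = id;
            this.type = type;
        }
    }

    static final class Peer {
        private final Map<Long, ServerEntry> peers; // immutable after construction
        private final ServerEntry self;             // the entry representing this peer

        Peer(long myid, Map<Long, ServerEntry> peers) {
            this.peers = Collections.unmodifiableMap(new HashMap<>(peers));
            ServerEntry entry = this.peers.get(myid);
            if (entry == null) {
                throw new IllegalArgumentException("server " + myid + " is not in the peer map");
            }
            this.self = entry;
        }

        long getId() {
            return self.id;
        }

        LearnerType getLearnerType() {
            return self.type; // no separate copy to drift out of sync
        }
    }

    public static void main(String[] args) {
        Map<Long, ServerEntry> peers = new HashMap<>();
        peers.put(1L, new ServerEntry(1L, LearnerType.PARTICIPANT));
        peers.put(2L, new ServerEntry(2L, LearnerType.OBSERVER));
        System.out.println(new Peer(2L, peers).getLearnerType()); // prints OBSERVER
    }
}
```

Because both values are fixed in the constructor, there is no setter to call twice and no second copy of the learner type that could disagree with the map.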





[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-28 Thread Henry Robinson (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861865#action_12861865 ]

Henry Robinson commented on ZOOKEEPER-690:
--

Progress update - possibly to do with a bug in FLE allowing an Observer to be 
elected. We're looking into this now.

 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Henry Robinson
Priority: Blocker
 Fix For: 3.3.1, 3.4.0

 Attachments: jstack-201004201053.txt, nohup-201004201053.txt, 
 TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, zoo.log


 the hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
  There are huge set of cancelledkeyexceptions in the logs. Still going 
 through the logs to find out the reason for failure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-28 Thread Flavio Paiva Junqueira (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861909#action_12861909 ]

Flavio Paiva Junqueira commented on ZOOKEEPER-690:
--

FLE relies upon a correct implementation of the voting view. My understanding 
is that if an observer is being elected leader, then the following predicate is 
evaluating to false:

{noformat}
(!self.getVotingView().containsKey(response.sid))
{noformat}

This is line 201 of FastLeaderElection.java. If this predicate is true, meaning 
that an observer is not in the voting view of a server, then the server will 
send a response right away and won't consider the vote of the observer. Does it 
make sense?
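
To make the role of the voting view concrete, here is a simplified sketch of a tally that drops votes from servers outside the view. It is not the FastLeaderElection code: `VotingViewSketch`, `tally`, and the use of plain maps are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified illustration, not the FastLeaderElection implementation.
final class VotingViewSketch {
    /**
     * Counts votes per proposed leader. Keys of votesBySender are sender ids,
     * values are the leader ids they propose. Senders outside the voting view
     * (for example observers) are ignored.
     */
    static Map<Long, Integer> tally(Set<Long> votingView, Map<Long, Long> votesBySender) {
        Map<Long, Integer> counts = new HashMap<>();
        for (Map.Entry<Long, Long> vote : votesBySender.entrySet()) {
            if (!votingView.contains(vote.getKey())) {
                continue; // sender is not a voting member
            }
            counts.merge(vote.getValue(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Set<Long> votingView = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        Map<Long, Long> votes = new HashMap<>();
        votes.put(1L, 3L);
        votes.put(2L, 3L);
        votes.put(4L, 4L); // server 4 is an observer voting for itself
        System.out.println(tally(votingView, votes)); // prints {3=2}
    }
}
```

If an observer were wrongly included in the voting view, its vote would be counted here as well, which is the kind of failure the comment above describes.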





[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-20 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12858767#action_12858767
 ] 

Patrick Hunt commented on ZOOKEEPER-690:


The following snippets are from 
TEST-org.apache.zookeeper.test.AsyncHammerTest.txt

Looks like there's a problem with the address already being in use. Maybe you 
can try editing src/java/test/org/apache/zookeeper/PortAssignment.java and see 
if that helps?

2010-04-19 14:31:55,010 - INFO  [main:quorumb...@88] - Ports are: 127.0.0.1:11222,127.0.0.1:11223,127.0.0.1:11224,127.0.0.1:11225,127.0.0.1:11226
2010-04-19 14:31:55,015 - INFO  [main:quorumb...@145] - creating QuorumPeer 1 port 11222
2010-04-19 14:31:55,062 - INFO  [main:nioservercnxn$fact...@144] - binding to port 0.0.0.0/0.0.0.0:11222
2010-04-19 14:31:55,063 - INFO  [main:asynchammert...@68] - Test clients shutting down
2010-04-19 14:31:55,064 - INFO  [main:quorumb...@254] - TearDown started

Testcase: testHammer took 0.443 sec
Caused an ERROR
Address already in use
java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
    at org.apache.zookeeper.server.NIOServerCnxn$Factory.init(NIOServerCnxn.java:145)
    at org.apache.zookeeper.server.NIOServerCnxn$Factory.init(NIOServerCnxn.java:126)
    at org.apache.zookeeper.server.quorum.QuorumPeer.init(QuorumPeer.java:450)
    at org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:146)
    at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:96)
    at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:60)
    at org.apache.zookeeper.test.AsyncHammerTest.setUp(AsyncHammerTest.java:54)

Caused an ERROR
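
Before changing the port assignment, it may help to confirm which of the logged ports are actually taken. The following is a rough sketch using only java.net; the class PortProbeSketch and its methods are hypothetical helpers, not part of the ZooKeeper test code or of PortAssignment.java.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper for checking test ports; not part of ZooKeeper.
class PortProbeSketch {
    // Tries to bind the wildcard address on the given port, as the test server does.
    static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress(port));
            return true;
        } catch (IOException e) {
            // BindException ("Address already in use") is an IOException.
            return false;
        }
    }

    // Asks the OS for an ephemeral port; another process could still grab it
    // between this call and the later bind, so this only narrows the race.
    static int findFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        // Ports taken from the "Ports are:" log line above.
        for (int port = 11222; port <= 11226; port++) {
            System.out.println(port + (isPortFree(port) ? " free" : " in use"));
        }
        System.out.println("OS-assigned free port: " + findFreePort());
    }
}
```

If a port shows as in use while no test is running, a leftover process from an earlier run is a likely suspect.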


 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Patrick Hunt
Priority: Critical
 Fix For: 3.3.1

 Attachments: TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, 
 zoo.log


 the hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
  There is a huge set of CancelledKeyExceptions in the logs. Still going 
 through the logs to find out the reason for failure.




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-20 Thread Alan Cabrera (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12858980#action_12858980
 ] 

Alan Cabrera commented on ZOOKEEPER-690:


I reran the tests a minute or two later, without having changed anything, and 
the test hung.  I attached the thread dump to this issue.

 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Patrick Hunt
Priority: Critical
 Fix For: 3.3.1

 Attachments: TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, 
 zoo.log


 the hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
  There is a huge set of CancelledKeyExceptions in the logs. Still going 
 through the logs to find out the reason for failure.




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-20 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12858984#action_12858984
 ] 

Patrick Hunt commented on ZOOKEEPER-690:


Can you attach the log? That would be very useful for me to review before 
digging into the thread dump.

 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Patrick Hunt
Priority: Critical
 Fix For: 3.3.1

 Attachments: TEST-org.apache.zookeeper.test.AsyncHammerTest.txt, 
 zoo.log


 the hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
  There is a huge set of CancelledKeyExceptions in the logs. Still going 
 through the logs to find out the reason for failure.




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-19 Thread Alan Cabrera (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12858644#action_12858644
 ] 

Alan Cabrera commented on ZOOKEEPER-690:


This consistently hangs on my machine.  I'll try to find some time to 
investigate why.

 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Patrick Hunt
Priority: Critical
 Fix For: 3.3.1


 the hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
  There is a huge set of CancelledKeyExceptions in the logs. Still going 
 through the logs to find out the reason for failure.




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-19 Thread Henry Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12858665#action_12858665
 ] 

Henry Robinson commented on ZOOKEEPER-690:
--

Alan - that would be great. If you can take a jstack dump of the process when 
it hangs we can do some forensics.
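
If attaching jstack to the hung process is awkward, a comparable dump can be produced from inside a JVM with the standard java.lang.management API. This is only a sketch: ThreadDumpSketch is a hypothetical class, it dumps the JVM it runs in (unlike jstack, which attaches to another process), and wiring it into the test, for example on a timeout, is left as an assumption.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical in-process alternative to running jstack against the test JVM.
class ThreadDumpSketch {
    static String dumpAllThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Only request lock details the running JVM says it can provide.
        ThreadInfo[] infos = threads.dumpAllThreads(
                threads.isObjectMonitorUsageSupported(),
                threads.isSynchronizerUsageSupported());
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Threads stuck in BLOCKED or WAITING state with ZooKeeper frames on the stack would be the first place to look.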

 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Patrick Hunt
Priority: Critical
 Fix For: 3.3.1


 the hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
  There is a huge set of CancelledKeyExceptions in the logs. Still going 
 through the logs to find out the reason for failure.




[jira] Commented: (ZOOKEEPER-690) AsyncTestHammer test fails on hudson.

2010-04-19 Thread Alan Cabrera (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12858689#action_12858689
 ] 

Alan Cabrera commented on ZOOKEEPER-690:


{code}
junit.run:
[junit] Running org.apache.zookeeper.test.AsyncHammerTest
[junit] Tests run: 2, Failures: 0, Errors: 2, Time elapsed: 32.958 sec
[junit] Test org.apache.zookeeper.test.AsyncHammerTest FAILED

BUILD FAILED
/Users/acabrera/dev/hadoop/zookeeper/build.xml:818: Tests failed!

Total time: 46 seconds
[acabrera-md:zookeeper 620]$ java -version
java version 1.6.0_17
Java(TM) SE Runtime Environment (build 1.6.0_17-b04-248-9M3125)
Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01-101, mixed mode)
{code}

Mac OS X 10.5.8


 AsyncTestHammer test fails on hudson.
 -

 Key: ZOOKEEPER-690
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-690
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Patrick Hunt
Priority: Critical
 Fix For: 3.3.1


 the hudson test failed on 
 http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/2/testReport/.
  There is a huge set of CancelledKeyExceptions in the logs. Still going 
 through the logs to find out the reason for failure.
