[jira] [Commented] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log

2016-01-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124766#comment-15124766
 ] 

Hadoop QA commented on ZOOKEEPER-2247:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12785335/ZOOKEEPER-2247-10.patch
  against trunk revision 1726354.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3024//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3024//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3024//console

This message is automatically generated.

> Zookeeper service becomes unavailable when leader fails to write transaction 
> log
> 
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.9, 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, 
> ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch, 
> ZOOKEEPER-2247-06.patch, ZOOKEEPER-2247-07.patch, ZOOKEEPER-2247-09.patch, 
> ZOOKEEPER-2247-10.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction 
> log. Bellow are the exceptions
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR 
> [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, 
> from thread : SyncThread:100
> java.io.IOException: Input/output error
>   at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>   at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>   at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>   at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread 
> SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor 
> complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO  [ProcessThread(sid:100 
> cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception Leader server still remains leader. After this non 
> recoverable exception the leader should go down and let other followers 
> become leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Failed: ZOOKEEPER-2247 PreCommit Build #3024

2016-01-29 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3024/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 379396 lines...]
 [exec] 
 [exec] +1 tests included.  The patch appears to include 9 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3024//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3024//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3024//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 8732ae556b518abcc61d9e4d11efa144fb67aeb9 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1816:
 exec returned: 1

Total time: 8 minutes 53 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Compressed 543.84 KB of artifacts by 47.1% relative to #3019
Recording test results
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
[description-setter] Description set: ZOOKEEPER-2247
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  
org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentServersAreObserversInNextConfig

Error Message:
waiting for server 2 being up

Stack Trace:
junit.framework.AssertionFailedError: waiting for server 2 being up
at 
org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentServersAreObserversInNextConfig(ReconfigRecoveryTest.java:217)
at 
org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79)




[jira] [Commented] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log

2016-01-29 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124741#comment-15124741
 ] 

Rakesh R commented on ZOOKEEPER-2247:
-

Attached another patch addressing {{while (self.isRunning() && 
this.isRunning())}} comment. Also, added checks in the {{Leader}}, that was 
missed previously.

> Zookeeper service becomes unavailable when leader fails to write transaction 
> log
> 
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.9, 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, 
> ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch, 
> ZOOKEEPER-2247-06.patch, ZOOKEEPER-2247-07.patch, ZOOKEEPER-2247-09.patch, 
> ZOOKEEPER-2247-10.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction 
> log. Bellow are the exceptions
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR 
> [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, 
> from thread : SyncThread:100
> java.io.IOException: Input/output error
>   at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>   at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>   at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>   at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread 
> SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor 
> complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO  [ProcessThread(sid:100 
> cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception Leader server still remains leader. After this non 
> recoverable exception the leader should go down and let other followers 
> become leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log

2016-01-29 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated ZOOKEEPER-2247:

Attachment: ZOOKEEPER-2247-10.patch

> Zookeeper service becomes unavailable when leader fails to write transaction 
> log
> 
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.9, 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, 
> ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch, 
> ZOOKEEPER-2247-06.patch, ZOOKEEPER-2247-07.patch, ZOOKEEPER-2247-09.patch, 
> ZOOKEEPER-2247-10.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction 
> log. Bellow are the exceptions
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR 
> [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, 
> from thread : SyncThread:100
> java.io.IOException: Input/output error
>   at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>   at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>   at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>   at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread 
> SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor 
> complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO  [ProcessThread(sid:100 
> cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception Leader server still remains leader. After this non 
> recoverable exception the leader should go down and let other followers 
> become leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log

2016-01-29 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124733#comment-15124733
 ] 

Rakesh R commented on ZOOKEEPER-2247:
-


bq. Per discussion in the mailing list, lets punt this to 3.4.9.
Thanks for the patience [~rgs] . If we get a good solution will include this, 
otw agreed to postpone.

Thanks [~fpj] for the review comments.
bq. Instead of doing this while (self.isRunning() && this.isRunning()), why 
don't you do this while (this.isRunning()) and check self.running() in 
this.isRunning()?
Agreed, will change it.

{quote} I don't think we need the health monitor thread. It is just shutting 
down the cnxn factories and you could do it immediately after the server loops. 
For example, in Follower.java, add the cnxn factory shutdown calls after the 
'while (this.isRunning())'. Does it work?{quote}
The solution works well with ensemble. IIUC, there is no server loop in the 
standalone server. He just waits on {{#join}} 
[ZooKeeperServerMain.java#L149|https://github.com/apache/zookeeper/blob/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerMain.java#L149].
 I'm not finding a way to interrupt this without a watchdog in standalone mode. 
Could you please correct me if I'm missing anything. Thanks!

> Zookeeper service becomes unavailable when leader fails to write transaction 
> log
> 
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.9, 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, 
> ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch, 
> ZOOKEEPER-2247-06.patch, ZOOKEEPER-2247-07.patch, ZOOKEEPER-2247-09.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction 
> log. Bellow are the exceptions
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR 
> [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, 
> from thread : SyncThread:100
> java.io.IOException: Input/output error
>   at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>   at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>   at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>   at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread 
> SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor 
> complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO  [ProcessThread(sid:100 
> cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception Leader server still remains leader. After this non 
> recoverable exception the leader should go down and let other followers 
> become leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log

2016-01-29 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124729#comment-15124729
 ] 

Rakesh R commented on ZOOKEEPER-2247:
-

Thanks [~cnauroth] for the interest.

bq. In the latest patch, I only see the new HeartbeatMonitor thread started in 
standalone mode, because it's coupled with ZooKeeperServerMain. In the case of 
a full ensemble, QuorumPeerMain won't start the thread, because it won't call 
ZooKeeperServerMain (unless it's the degenerate case of a single-node ensemble).
Monitor thread is not required for the quorum as they already have a server 
loop where the {{internalError}} checks has been integrated. But in case of 
standalone there is no loops instead it has {{#join()}} like I mentioned 
earlier 
[ZooKeeperServerMain.java#L149|https://github.com/apache/zookeeper/blob/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerMain.java#L149].
 Thats the reason I've added a watch dog there.

> Zookeeper service becomes unavailable when leader fails to write transaction 
> log
> 
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.9, 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, 
> ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch, 
> ZOOKEEPER-2247-06.patch, ZOOKEEPER-2247-07.patch, ZOOKEEPER-2247-09.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction 
> log. Bellow are the exceptions
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR 
> [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, 
> from thread : SyncThread:100
> java.io.IOException: Input/output error
>   at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>   at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>   at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>   at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread 
> SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor 
> complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO  [ProcessThread(sid:100 
> cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception Leader server still remains leader. After this non 
> recoverable exception the leader should go down and let other followers 
> become leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log

2016-01-29 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124723#comment-15124723
 ] 

Flavio Junqueira commented on ZOOKEEPER-2247:
-

[~rakesh_r] Thanks for updating the patch. There are two things I believe we 
can improve here:

# Instead of doing this {{while (self.isRunning() && this.isRunning())}}, why 
don't you do this {{while (this.isRunning())}} and check {{self.running()}} in 
{{this.isRunning()}}?
# I don't think we need the health monitor thread. It is just shutting down the 
cnxn factories and you could do it immediately after the server loops. For 
example, in Follower.java, add the cnxn factory shutdown calls after the 
{{while (this.isRunning()) {...} }}. Does it work? 

> Zookeeper service becomes unavailable when leader fails to write transaction 
> log
> 
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.9, 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, 
> ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch, 
> ZOOKEEPER-2247-06.patch, ZOOKEEPER-2247-07.patch, ZOOKEEPER-2247-09.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction 
> log. Bellow are the exceptions
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR 
> [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, 
> from thread : SyncThread:100
> java.io.IOException: Input/output error
>   at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>   at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>   at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>   at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread 
> SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor 
> complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO  [ProcessThread(sid:100 
> cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception Leader server still remains leader. After this non 
> recoverable exception the leader should go down and let other followers 
> become leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log

2016-01-29 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124549#comment-15124549
 ] 

Chris Nauroth commented on ZOOKEEPER-2247:
--

In the latest patch, I only see the new {{HeartbeatMonitor}} thread started in 
standalone mode, because it's coupled with {{ZooKeeperServerMain}}.  In the 
case of a full ensemble, {{QuorumPeerMain}} won't start the thread, because it 
won't call {{ZooKeeperServerMain}} (unless it's the degenerate case of a 
single-node ensemble).

I've been trying to look for a way to solve this without adding another 
watchdog thread, but unfortunately I don't have another proposal to offer right 
now.

> Zookeeper service becomes unavailable when leader fails to write transaction 
> log
> 
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.9, 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, 
> ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch, 
> ZOOKEEPER-2247-06.patch, ZOOKEEPER-2247-07.patch, ZOOKEEPER-2247-09.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction 
> log. Bellow are the exceptions
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR 
> [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, 
> from thread : SyncThread:100
> java.io.IOException: Input/output error
>   at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>   at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>   at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>   at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread 
> SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor 
> complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO  [ProcessThread(sid:100 
> cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception Leader server still remains leader. After this non 
> recoverable exception the leader should go down and let other followers 
> become leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-1936) Server exits when unable to create data directory due to race

2016-01-29 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124117#comment-15124117
 ] 

Chris Nauroth commented on ZOOKEEPER-1936:
--

There was a test failure in {{WatcherTest#testWatcherAutoResetWithLocal}}, but 
I can't reproduce it.

> Server exits when unable to create data directory due to race 
> --
>
> Key: ZOOKEEPER-1936
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Harald Musum
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: ZOOKEEPER-1936.patch, ZOOKEEPER-1936.v2.patch, 
> ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v4.patch
>
>
> We sometime see issues with ZooKeeper server not starting and seeing this 
> error in the log:
> [2014-05-27 09:29:48.248] ERROR   : -   
> .org.apache.zookeeper.server.ZooKeeperServerMainUnexpected exception,
> exiting abnormally\nexception=\njava.io.IOException: Unable to create data
> directory /home/y/var/zookeeper/version-2\n\tat
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:85)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)\n\t
> [...]
> Stack trace from JVM gives this:
> "PurgeTask" daemon prio=10 tid=0x0201d000 nid=0x1727 runnable
> [0x7f55d7dc7000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:68)
> at
> org.apache.zookeeper.server.DatadirCleanupManager$PurgeTask.run(DatadirCleanupManager.java:140)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> "zookeeper server" prio=10 tid=0x027df800 nid=0x1715 runnable
> [0x7f55d7ed8000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> [...]
> So it seems that when autopurge is used (as it is in our case), it might 
> happen at the same time as starting the server itself. In FileTxnSnapLog() it 
> will check if the directory exists and create it if not. These two tasks do 
> this at the same time, and mkdir fails and server exits the JVM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2355) Ephemeral node is never deleted if follower fails while reading the proposal packet

2016-01-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124088#comment-15124088
 ] 

Hadoop QA commented on ZOOKEEPER-2355:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12785227/ZOOKEEPER-2355-02.patch
  against trunk revision 1726354.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3023//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3023//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3023//console

This message is automatically generated.

> Ephemeral node is never deleted if follower fails while reading the proposal 
> packet
> ---
>
> Key: ZOOKEEPER-2355
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2355
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum, server
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.9
>
> Attachments: ZOOKEEPER-2355-01.patch, ZOOKEEPER-2355-02.patch
>
>
> ZooKeeper ephemeral node is never deleted if follower fail while reading the 
> proposal packet
> The scenario is as follows:
> # Configure three node ZooKeeper cluster, lets say nodes are A, B and C, 
> start all, assume A is leader, B and C are follower
> # Connect to any of the server and create ephemeral node /e1
> # Close the session, ephemeral node /e1 will go for deletion
> # While receiving delete proposal make Follower B to fail with 
> {{SocketTimeoutException}}. This we need to do to reproduce the scenario 
> otherwise in production environment it happens because of network fault.
> # Remove the fault, just check that faulted Follower is now connected with 
> quorum
> # Connect to any of the server, create the same ephemeral node /e1, created 
> is success.
> # Close the session,  ephemeral node /e1 will go for deletion
> # {color:red}/e1 is not deleted from the faulted Follower B, It should have 
> been deleted as it was again created with another session{color}
> # {color:green}/e1 is deleted from Leader A and other Follower C{color}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Failed: ZOOKEEPER-2355 PreCommit Build #3023

2016-01-29 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2355
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3023/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 377543 lines...]
 [exec] 
 [exec] +1 tests included.  The patch appears to include 5 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3023//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3023//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3023//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] a2b6ad36635c4a2034ac87c8c458c2b78c75e511 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1816:
 exec returned: 1

Total time: 22 minutes 47 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Compressed 543.63 KB of artifacts by 23.5% relative to #3019
Recording test results
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
[description-setter] Description set: ZOOKEEPER-2355
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7



###
## FAILED TESTS (if any) 
##
7 tests failed.
FAILED:  
org.apache.zookeeper.server.ZxidRolloverTest.testRolloverThenFollowerRestart

Error Message:
KeeperErrorCode = ConnectionLoss for /foofoofoo-connected

Stack Trace:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /foofoofoo-connected
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1643)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1671)
at 
org.apache.zookeeper.server.ZxidRolloverTest.checkClientConnected(ZxidRolloverTest.java:119)
at 
org.apache.zookeeper.server.ZxidRolloverTest.checkClientsConnected(ZxidRolloverTest.java:90)
at 
org.apache.zookeeper.server.ZxidRolloverTest.start(ZxidRolloverTest.java:165)
at 
org.apache.zookeeper.server.ZxidRolloverTest.testRolloverThenFollowerRestart(ZxidRolloverTest.java:345)
at 
org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79)


FAILED:  org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRun

Error Message:
expected:<4294967297> but was:<4294967296>

Stack Trace:
junit.framework.AssertionFailedError: expected:<4294967297> but

[jira] [Commented] (ZOOKEEPER-2297) NPE is thrown while creating "key manager" and "trust manager"

2016-01-29 Thread Arshad Mohammad (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124061#comment-15124061
 ] 

Arshad Mohammad commented on ZOOKEEPER-2297:


Too many findbugs get induce this way, I think the best way is to create 
ZooKeeper instance level context and use it statically as it was done in 
ZOOKEEPER-2297-02.patch. [~rakeshr] can you have a look on the current patch 
and its findbugs and earlier patch ZOOKEEPER-2297-02.patch. 
I think ZOOKEEPER-2297-02.patch is the better way to handle this secenario

any other opinion??

> NPE is thrown while creating "key manager" and "trust manager" 
> ---
>
> Key: ZOOKEEPER-2297
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2297
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
> Environment: Suse 11 sp 3
>Reporter: Anushri
>Assignee: Arshad Mohammad
>Priority: Blocker
> Fix For: 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-2297-01.patch, ZOOKEEPER-2297-02.patch, 
> ZOOKEEPER-2297-03.patch, ZOOKEEPER-2297-04.patch, ZOOKEEPER-2355-05.patch
>
>
> NPE is thrown while creating "key manager" and "trust manager" , even though 
> the zk setup is in non-secure mode
> bq. 2015-10-19 12:54:12,278 [myid:2] - ERROR [ProcessThread(sid:2 
> cport:-1)::X509AuthenticationProvider@78] - Failed to create key manager
> bq. org.apache.zookeeper.common.X509Exception$KeyManagerException: 
> java.lang.NullPointerException
> at org.apache.zookeeper.common.X509Util.createKeyManager(X509Util.java:129)
> at 
> org.apache.zookeeper.server.auth.X509AuthenticationProvider.(X509AuthenticationProvider.java:75)
> at 
> org.apache.zookeeper.server.auth.ProviderRegistry.initialize(ProviderRegistry.java:42)
> at 
> org.apache.zookeeper.server.auth.ProviderRegistry.getProvider(ProviderRegistry.java:68)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.fixupACL(PrepRequestProcessor.java:952)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest2Txn(PrepRequestProcessor.java:379)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:716)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:144)
> Caused by: java.lang.NullPointerException
> at org.apache.zookeeper.common.X509Util.createKeyManager(X509Util.java:113)
> ... 7 more
> bq. 2015-10-19 12:54:12,279 [myid:2] - ERROR [ProcessThread(sid:2 
> cport:-1)::X509AuthenticationProvider@90] - Failed to create trust manager
> bq.  org.apache.zookeeper.common.X509Exception$TrustManagerException: 
> java.lang.NullPointerException
> at org.apache.zookeeper.common.X509Util.createTrustManager(X509Util.java:158)
> at 
> org.apache.zookeeper.server.auth.X509AuthenticationProvider.(X509AuthenticationProvider.java:87)
> at 
> org.apache.zookeeper.server.auth.ProviderRegistry.initialize(ProviderRegistry.java:42)
> at 
> org.apache.zookeeper.server.auth.ProviderRegistry.getProvider(ProviderRegistry.java:68)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.fixupACL(PrepRequestProcessor.java:952)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest2Txn(PrepRequestProcessor.java:379)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:716)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:144)
> Caused by: java.lang.NullPointerException
> at org.apache.zookeeper.common.X509Util.createTrustManager(X509Util.java:143)
> ... 7 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-1936) Server exits when unable to create data directory due to race

2016-01-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124054#comment-15124054
 ] 

Hadoop QA commented on ZOOKEEPER-1936:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12785224/ZOOKEEPER-1936.v4.patch
  against trunk revision 1726354.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3022//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3022//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3022//console

This message is automatically generated.

> Server exits when unable to create data directory due to race 
> --
>
> Key: ZOOKEEPER-1936
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Harald Musum
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: ZOOKEEPER-1936.patch, ZOOKEEPER-1936.v2.patch, 
> ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v4.patch
>
>
> We sometime see issues with ZooKeeper server not starting and seeing this 
> error in the log:
> [2014-05-27 09:29:48.248] ERROR   : -   
> .org.apache.zookeeper.server.ZooKeeperServerMainUnexpected exception,
> exiting abnormally\nexception=\njava.io.IOException: Unable to create data
> directory /home/y/var/zookeeper/version-2\n\tat
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:85)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)\n\t
> [...]
> Stack trace from JVM gives this:
> "PurgeTask" daemon prio=10 tid=0x0201d000 nid=0x1727 runnable
> [0x7f55d7dc7000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:68)
> at
> org.apache.zookeeper.server.DatadirCleanupManager$PurgeTask.run(DatadirCleanupManager.java:140)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> "zookeeper server" prio=10 tid=0x027df800 nid=0x1715 runnable
> [0x7f55d7ed8000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> [...]
> So it seems that when autopurge is used (as it is in our case), it might 
> happen at the same time as starting the server itself. In FileTxnSnapLog() it 
> will check if the directory exists and create it if not. These two tasks do 
>

Failed: ZOOKEEPER-1936 PreCommit Build #3022

2016-01-29 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3022/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 380756 lines...]
 [exec] Please justify why no new tests are needed 
for this patch.
 [exec] Also please list what manual steps were 
performed to verify this patch.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3022//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3022//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3022//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] ab3a6d0298540a99f457fef29c3d1f2750bab4ec logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1816:
 exec returned: 2

Total time: 12 minutes 12 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Compressed 543.42 KB of artifacts by 29.4% relative to #3019
Recording test results
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
[description-setter] Description set: ZOOKEEPER-1936
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetWithLocal

Error Message:
Did not connect

Stack Trace:
java.util.concurrent.TimeoutException: Did not connect
at 
org.apache.zookeeper.test.ClientBase$CountdownWatcher.waitForConnected(ClientBase.java:132)
at 
org.apache.zookeeper.test.WatcherTest.testWatcherAutoReset(WatcherTest.java:340)
at 
org.apache.zookeeper.test.WatcherTest.testWatcherAutoResetWithLocal(WatcherTest.java:240)
at 
org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79)




[jira] [Updated] (ZOOKEEPER-2355) Ephemeral node is never deleted if follower fails while reading the proposal packet

2016-01-29 Thread Arshad Mohammad (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arshad Mohammad updated ZOOKEEPER-2355:
---
Attachment: ZOOKEEPER-2355-02.patch

Thanks [~rgs], [~eribeiro] for your review comments on test code.
Submitting the solution. [~fpj] please have a look

> Ephemeral node is never deleted if follower fails while reading the proposal 
> packet
> ---
>
> Key: ZOOKEEPER-2355
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2355
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum, server
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.9
>
> Attachments: ZOOKEEPER-2355-01.patch, ZOOKEEPER-2355-02.patch
>
>
> ZooKeeper ephemeral node is never deleted if follower fail while reading the 
> proposal packet
> The scenario is as follows:
> # Configure three node ZooKeeper cluster, lets say nodes are A, B and C, 
> start all, assume A is leader, B and C are follower
> # Connect to any of the server and create ephemeral node /e1
> # Close the session, ephemeral node /e1 will go for deletion
> # While receiving delete proposal make Follower B to fail with 
> {{SocketTimeoutException}}. This we need to do to reproduce the scenario 
> otherwise in production environment it happens because of network fault.
> # Remove the fault, just check that faulted Follower is now connected with 
> quorum
> # Connect to any of the server, create the same ephemeral node /e1, created 
> is success.
> # Close the session,  ephemeral node /e1 will go for deletion
> # {color:red}/e1 is not deleted from the faulted Follower B, It should have 
> been deleted as it was again created with another session{color}
> # {color:green}/e1 is deleted from Leader A and other Follower C{color}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-1936) Server exits when unable to create data directory due to race

2016-01-29 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated ZOOKEEPER-1936:
--
Attachment: (was: ZOOKEEPER-1936.v4.patch)

> Server exits when unable to create data directory due to race 
> --
>
> Key: ZOOKEEPER-1936
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Harald Musum
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: ZOOKEEPER-1936.patch, ZOOKEEPER-1936.v2.patch, 
> ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v4.patch
>
>
> We sometime see issues with ZooKeeper server not starting and seeing this 
> error in the log:
> [2014-05-27 09:29:48.248] ERROR   : -   
> .org.apache.zookeeper.server.ZooKeeperServerMainUnexpected exception,
> exiting abnormally\nexception=\njava.io.IOException: Unable to create data
> directory /home/y/var/zookeeper/version-2\n\tat
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:85)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)\n\t
> [...]
> Stack trace from JVM gives this:
> "PurgeTask" daemon prio=10 tid=0x0201d000 nid=0x1727 runnable
> [0x7f55d7dc7000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:68)
> at
> org.apache.zookeeper.server.DatadirCleanupManager$PurgeTask.run(DatadirCleanupManager.java:140)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> "zookeeper server" prio=10 tid=0x027df800 nid=0x1715 runnable
> [0x7f55d7ed8000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> [...]
> So it seems that when autopurge is used (as it is in our case), it might 
> happen at the same time as starting the server itself. In FileTxnSnapLog() it 
> will check if the directory exists and create it if not. These two tasks do 
> this at the same time, and mkdir fails and server exits the JVM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-1936) Server exits when unable to create data directory due to race

2016-01-29 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated ZOOKEEPER-1936:
--
Attachment: ZOOKEEPER-1936.v4.patch

> Server exits when unable to create data directory due to race 
> --
>
> Key: ZOOKEEPER-1936
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Harald Musum
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: ZOOKEEPER-1936.patch, ZOOKEEPER-1936.v2.patch, 
> ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v4.patch
>
>
> We sometime see issues with ZooKeeper server not starting and seeing this 
> error in the log:
> [2014-05-27 09:29:48.248] ERROR   : -   
> .org.apache.zookeeper.server.ZooKeeperServerMainUnexpected exception,
> exiting abnormally\nexception=\njava.io.IOException: Unable to create data
> directory /home/y/var/zookeeper/version-2\n\tat
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:85)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)\n\t
> [...]
> Stack trace from JVM gives this:
> "PurgeTask" daemon prio=10 tid=0x0201d000 nid=0x1727 runnable
> [0x7f55d7dc7000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:68)
> at
> org.apache.zookeeper.server.DatadirCleanupManager$PurgeTask.run(DatadirCleanupManager.java:140)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> "zookeeper server" prio=10 tid=0x027df800 nid=0x1715 runnable
> [0x7f55d7ed8000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> [...]
> So it seems that when autopurge is used (as it is in our case), it might 
> happen at the same time as starting the server itself. In FileTxnSnapLog() it 
> will check if the directory exists and create it if not. These two tasks do 
> this at the same time, and mkdir fails and server exits the JVM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-1936) Server exits when unable to create data directory due to race

2016-01-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124017#comment-15124017
 ] 

Hadoop QA commented on ZOOKEEPER-1936:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12785223/ZOOKEEPER-1936.v4.patch
  against trunk revision 1726354.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3021//console

This message is automatically generated.

> Server exits when unable to create data directory due to race 
> --
>
> Key: ZOOKEEPER-1936
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Harald Musum
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: ZOOKEEPER-1936.patch, ZOOKEEPER-1936.v2.patch, 
> ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v4.patch
>
>
> We sometime see issues with ZooKeeper server not starting and seeing this 
> error in the log:
> [2014-05-27 09:29:48.248] ERROR   : -   
> .org.apache.zookeeper.server.ZooKeeperServerMainUnexpected exception,
> exiting abnormally\nexception=\njava.io.IOException: Unable to create data
> directory /home/y/var/zookeeper/version-2\n\tat
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:85)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)\n\t
> [...]
> Stack trace from JVM gives this:
> "PurgeTask" daemon prio=10 tid=0x0201d000 nid=0x1727 runnable
> [0x7f55d7dc7000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:68)
> at
> org.apache.zookeeper.server.DatadirCleanupManager$PurgeTask.run(DatadirCleanupManager.java:140)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> "zookeeper server" prio=10 tid=0x027df800 nid=0x1715 runnable
> [0x7f55d7ed8000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> [...]
> So it seems that when autopurge is used (as it is in our case), it might 
> happen at the same time as starting the server itself. In FileTxnSnapLog() it 
> will check if the directory exists and create it if not. These two tasks do 
> this at the same time, and mkdir fails and server exits the JVM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Failed: ZOOKEEPER-1936 PreCommit Build #3021

2016-01-29 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3021/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 1522 lines...]
 [exec] PATCH APPLICATION FAILED
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12785223/ZOOKEEPER-1936.v4.patch
 [exec]   against trunk revision 1726354.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] -1 tests included.  The patch doesn't appear to include any new 
or modified tests.
 [exec] Please justify why no new tests are needed 
for this patch.
 [exec] Also please list what manual steps were 
performed to verify this patch.
 [exec] 
 [exec] -1 patch.  The patch command could not apply the patch.
 [exec] 
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3021//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] ba3d0a903aeb241d845950a7302417ae7a7510ae logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1816:
 exec returned: 1

Total time: 1 minute 15 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Recording test results
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
ERROR: Publisher 'Publish JUnit test result report' failed: No test report 
files were found. Configuration error?
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
[description-setter] Description set: ZOOKEEPER-1936
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7



###
## FAILED TESTS (if any) 
##
No tests ran.

Failed: ZOOKEEPER-2297 PreCommit Build #3020

2016-01-29 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2297
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3020/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 390765 lines...]
 [exec] -1 tests included.  The patch doesn't appear to include any new 
or modified tests.
 [exec] Please justify why no new tests are needed 
for this patch.
 [exec] Also please list what manual steps were 
performed to verify this patch.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] -1 findbugs.  The patch appears to introduce 2 new Findbugs 
(version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3020//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3020//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3020//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] d6ad9365c4cd538cb96d5d61bec699643995be11 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1816:
 exec returned: 2

Total time: 20 minutes 7 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Recording test results
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
[description-setter] Description set: ZOOKEEPER-2297
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (ZOOKEEPER-1936) Server exits when unable to create data directory due to race

2016-01-29 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated ZOOKEEPER-1936:
--
Attachment: ZOOKEEPER-1936.v4.patch

Patch v4 addresses Chris' comment above

> Server exits when unable to create data directory due to race 
> --
>
> Key: ZOOKEEPER-1936
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Harald Musum
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: ZOOKEEPER-1936.patch, ZOOKEEPER-1936.v2.patch, 
> ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v4.patch
>
>
> We sometime see issues with ZooKeeper server not starting and seeing this 
> error in the log:
> [2014-05-27 09:29:48.248] ERROR   : -   
> .org.apache.zookeeper.server.ZooKeeperServerMainUnexpected exception,
> exiting abnormally\nexception=\njava.io.IOException: Unable to create data
> directory /home/y/var/zookeeper/version-2\n\tat
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:85)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)\n\t
> [...]
> Stack trace from JVM gives this:
> "PurgeTask" daemon prio=10 tid=0x0201d000 nid=0x1727 runnable
> [0x7f55d7dc7000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:68)
> at
> org.apache.zookeeper.server.DatadirCleanupManager$PurgeTask.run(DatadirCleanupManager.java:140)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> "zookeeper server" prio=10 tid=0x027df800 nid=0x1715 runnable
> [0x7f55d7ed8000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> [...]
> So it seems that when autopurge is used (as it is in our case), it might 
> happen at the same time as starting the server itself. In FileTxnSnapLog() it 
> will check if the directory exists and create it if not. These two tasks do 
> this at the same time, and mkdir fails and server exits the JVM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2297) NPE is thrown while creating "key manager" and "trust manager"

2016-01-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124008#comment-15124008
 ] 

Hadoop QA commented on ZOOKEEPER-2297:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12785213/ZOOKEEPER-2355-05.patch
  against trunk revision 1726354.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 2 new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3020//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3020//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3020//console

This message is automatically generated.

> NPE is thrown while creating "key manager" and "trust manager" 
> ---
>
> Key: ZOOKEEPER-2297
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2297
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
> Environment: Suse 11 sp 3
>Reporter: Anushri
>Assignee: Arshad Mohammad
>Priority: Blocker
> Fix For: 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-2297-01.patch, ZOOKEEPER-2297-02.patch, 
> ZOOKEEPER-2297-03.patch, ZOOKEEPER-2297-04.patch, ZOOKEEPER-2355-05.patch
>
>
> NPE is thrown while creating "key manager" and "trust manager" , even though 
> the zk setup is in non-secure mode
> bq. 2015-10-19 12:54:12,278 [myid:2] - ERROR [ProcessThread(sid:2 
> cport:-1)::X509AuthenticationProvider@78] - Failed to create key manager
> bq. org.apache.zookeeper.common.X509Exception$KeyManagerException: 
> java.lang.NullPointerException
> at org.apache.zookeeper.common.X509Util.createKeyManager(X509Util.java:129)
> at 
> org.apache.zookeeper.server.auth.X509AuthenticationProvider.(X509AuthenticationProvider.java:75)
> at 
> org.apache.zookeeper.server.auth.ProviderRegistry.initialize(ProviderRegistry.java:42)
> at 
> org.apache.zookeeper.server.auth.ProviderRegistry.getProvider(ProviderRegistry.java:68)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.fixupACL(PrepRequestProcessor.java:952)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest2Txn(PrepRequestProcessor.java:379)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:716)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:144)
> Caused by: java.lang.NullPointerException
> at org.apache.zookeeper.common.X509Util.createKeyManager(X509Util.java:113)
> ... 7 more
> bq. 2015-10-19 12:54:12,279 [myid:2] - ERROR [ProcessThread(sid:2 
> cport:-1)::X509AuthenticationProvider@90] - Failed to create trust manager
> bq.  org.apache.zookeeper.common.X509Exception$TrustManagerException: 
> java.lang.NullPointerException
> at org.apache.zookeeper.common.X509Util.createTrustManager(X509Util.java:158)
> at 
> org.apache.zookeeper.server.auth.X509AuthenticationProvider.(X509AuthenticationProvider.java:87)
> at 
> org.apache.zookeeper.server.auth.ProviderRegistry.initialize(ProviderRegistry.java:42)
> at 
> org.apache.zookeeper.server.auth.ProviderRegistry.getProvider(ProviderRegistry.java:68)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.fixupACL(PrepRequestProcessor.java:952)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest2Txn(PrepRequestProcessor.java:379)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:716)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:144)
> Caused by: java.lang.NullPointerException
> at org.apache.zookeeper.common.X509Util.createTrustManager(X509Util.java:143)
> ... 7 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-1936) Server exits when unable to create data directory due to race

2016-01-29 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123961#comment-15123961
 ] 

Chris Nauroth commented on ZOOKEEPER-1936:
--

It's a tough call then.  I don't see a way to write a deterministic JUnit test 
to prove the fix.  I'm reluctant to accept a code change without a test or a 
manual verification, at least on a stable maintenance line.

Here is my take on it.  Even without a consistent repro, I see the theoretical 
problem in the code.  The logic change in the patch looks correct to me, even 
if there might have been something more happening when Ted reported it.  Let's 
put a fix into trunk and branch-3.5, but not branch-3.4.  In trunk and 
branch-3.5, we also can make the switch to 
[{{Files#createDirectories}}|http://docs.oracle.com/javase/7/docs/api/java/nio/file/Files.html#createDirectories(java.nio.file.Path,%20java.nio.file.attribute.FileAttribute...)]
 that I mentioned earlier, because those branches are compiling to JDK 7.  That 
way, if we see another repro, we'll get additional debugging information to 
help with any subsequent patches.

Would some other committers like to comment on that plan?  Thanks!

> Server exits when unable to create data directory due to race 
> --
>
> Key: ZOOKEEPER-1936
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Harald Musum
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: ZOOKEEPER-1936.patch, ZOOKEEPER-1936.v2.patch, 
> ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v3.patch
>
>
> We sometime see issues with ZooKeeper server not starting and seeing this 
> error in the log:
> [2014-05-27 09:29:48.248] ERROR   : -   
> .org.apache.zookeeper.server.ZooKeeperServerMainUnexpected exception,
> exiting abnormally\nexception=\njava.io.IOException: Unable to create data
> directory /home/y/var/zookeeper/version-2\n\tat
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:85)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)\n\t
> [...]
> Stack trace from JVM gives this:
> "PurgeTask" daemon prio=10 tid=0x0201d000 nid=0x1727 runnable
> [0x7f55d7dc7000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:68)
> at
> org.apache.zookeeper.server.DatadirCleanupManager$PurgeTask.run(DatadirCleanupManager.java:140)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> "zookeeper server" prio=10 tid=0x027df800 nid=0x1715 runnable
> [0x7f55d7ed8000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> [...]
> So it seems that when autopurge is used (as it is in our case), it might 
> happen at the same time as starting the server itself. In FileTxnSnapLog() it 
> will check if the directory exists and create it if not. These two tasks do 
> this at the same time, and mkdir fails and server exits the JVM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-2297) NPE is thrown while creating "key manager" and "trust manager"

2016-01-29 Thread Arshad Mohammad (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arshad Mohammad updated ZOOKEEPER-2297:
---
Attachment: ZOOKEEPER-2355-05.patch

submitting new patch

> NPE is thrown while creating "key manager" and "trust manager" 
> ---
>
> Key: ZOOKEEPER-2297
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2297
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
> Environment: Suse 11 sp 3
>Reporter: Anushri
>Assignee: Arshad Mohammad
>Priority: Blocker
> Fix For: 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-2297-01.patch, ZOOKEEPER-2297-02.patch, 
> ZOOKEEPER-2297-03.patch, ZOOKEEPER-2297-04.patch, ZOOKEEPER-2355-05.patch
>
>
> NPE is thrown while creating "key manager" and "trust manager" , even though 
> the zk setup is in non-secure mode
> bq. 2015-10-19 12:54:12,278 [myid:2] - ERROR [ProcessThread(sid:2 
> cport:-1)::X509AuthenticationProvider@78] - Failed to create key manager
> bq. org.apache.zookeeper.common.X509Exception$KeyManagerException: 
> java.lang.NullPointerException
> at org.apache.zookeeper.common.X509Util.createKeyManager(X509Util.java:129)
> at 
> org.apache.zookeeper.server.auth.X509AuthenticationProvider.(X509AuthenticationProvider.java:75)
> at 
> org.apache.zookeeper.server.auth.ProviderRegistry.initialize(ProviderRegistry.java:42)
> at 
> org.apache.zookeeper.server.auth.ProviderRegistry.getProvider(ProviderRegistry.java:68)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.fixupACL(PrepRequestProcessor.java:952)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest2Txn(PrepRequestProcessor.java:379)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:716)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:144)
> Caused by: java.lang.NullPointerException
> at org.apache.zookeeper.common.X509Util.createKeyManager(X509Util.java:113)
> ... 7 more
> bq. 2015-10-19 12:54:12,279 [myid:2] - ERROR [ProcessThread(sid:2 
> cport:-1)::X509AuthenticationProvider@90] - Failed to create trust manager
> bq.  org.apache.zookeeper.common.X509Exception$TrustManagerException: 
> java.lang.NullPointerException
> at org.apache.zookeeper.common.X509Util.createTrustManager(X509Util.java:158)
> at 
> org.apache.zookeeper.server.auth.X509AuthenticationProvider.(X509AuthenticationProvider.java:87)
> at 
> org.apache.zookeeper.server.auth.ProviderRegistry.initialize(ProviderRegistry.java:42)
> at 
> org.apache.zookeeper.server.auth.ProviderRegistry.getProvider(ProviderRegistry.java:68)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.fixupACL(PrepRequestProcessor.java:952)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest2Txn(PrepRequestProcessor.java:379)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:716)
> at 
> org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:144)
> Caused by: java.lang.NullPointerException
> at org.apache.zookeeper.common.X509Util.createTrustManager(X509Util.java:143)
> ... 7 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-1936) Server exits when unable to create data directory due to race

2016-01-29 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123934#comment-15123934
 ] 

Ted Yu commented on ZOOKEEPER-1936:
---

Haven't got a chance to reproduce the bug.

After some QE fix, hbase un-secure deployment works reliably.

> Server exits when unable to create data directory due to race 
> --
>
> Key: ZOOKEEPER-1936
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Harald Musum
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: ZOOKEEPER-1936.patch, ZOOKEEPER-1936.v2.patch, 
> ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v3.patch
>
>
> We sometime see issues with ZooKeeper server not starting and seeing this 
> error in the log:
> [2014-05-27 09:29:48.248] ERROR   : -   
> .org.apache.zookeeper.server.ZooKeeperServerMainUnexpected exception,
> exiting abnormally\nexception=\njava.io.IOException: Unable to create data
> directory /home/y/var/zookeeper/version-2\n\tat
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:85)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)\n\t
> [...]
> Stack trace from JVM gives this:
> "PurgeTask" daemon prio=10 tid=0x0201d000 nid=0x1727 runnable
> [0x7f55d7dc7000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:68)
> at
> org.apache.zookeeper.server.DatadirCleanupManager$PurgeTask.run(DatadirCleanupManager.java:140)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> "zookeeper server" prio=10 tid=0x027df800 nid=0x1715 runnable
> [0x7f55d7ed8000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> [...]
> So it seems that when autopurge is used (as it is in our case), it might 
> happen at the same time as starting the server itself. In FileTxnSnapLog() it 
> will check if the directory exists and create it if not. These two tasks do 
> this at the same time, and mkdir fails and server exits the JVM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: 3.4.8 release

2016-01-29 Thread Raúl Gutiérrez Segalés
On 29 January 2016 at 10:23, Flavio Junqueira  wrote:

> Give me until the end of today to have a look at ZK-2247. Rakesh has put
> the effort to prepare the patch, so I'd like to consider it. If it is not
> ready, then let's push it to 3.4.9. Does it sound good, Raul?
>

Yeah that's reasonable. I actually won't be able to cut an RC until Sunday,
so we have the whole weekend for that patch get more reviews :-) Thanks
Flavio!


-rgs




>
> -Flavio
>
> > On 29 Jan 2016, at 10:20, Raúl Gutiérrez Segalés 
> wrote:
> >
> > Ok - lets punt them to 3.4.9 then.
> >
> >
> > -rgs
> >
> > On 29 January 2016 at 10:10, Chris Nauroth 
> wrote:
> >
> >> +1 for expediting the 3.4.8 release.  Our goal was simply to correct the
> >> critical bug discovered late in 3.4.7.  Any patches more than that can
> be
> >> deferred to a 3.4.9 at a later date.
> >>
> >> --Chris Nauroth
> >>
> >>
> >>
> >>
> >> On 1/29/16, 10:06 AM, "Flavio Junqueira"  wrote:
> >>
> >>> +1
> >>>
>  On 29 Jan 2016, at 10:02, Ted Yu  wrote:
> 
>  ZOOKEEPER-2355 is in Open state as of now.
> 
>  My two cents:
> 
>  3.4.7 has been retracted.
>  It would be nice to get 3.4.8 out the door soon so that zookeeper
> users
>  can
>  pick up bug fixes in between 3.4.6 and 3.4 branch.
> 
>  On Thu, Jan 28, 2016 at 8:57 AM, Raúl Gutiérrez Segalés
>   > wrote:
> 
> > Hi,
> >
> > On 28 January 2016 at 07:07, Talluri, Chandra <
> > chandra.tall...@fmr.com.invalid> wrote:
> >
> >> Thanks for the updates.
> >>
> >> When can we expect 3.4.8?
> >>
> >
> > I think we need to decide if we want to include these patches:
> >
> > https://issues.apache.org/jira/browse/ZOOKEEPER-2355
> > https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> >
> > They seem to be almost ready, though there might be some subtleties
> > left to
> > be addressed.
> >
> >
> >> What are the plans for 3.5.*  stable version release?
> >>
> >
> > The general plan for making 3.5 stable is here:
> >
> > http://markmail.org/message/ymxliy2rrwjc2pmo
> >
> > I believe Chris Nauroth will be the RM for the upcoming 3.5.2
> release.
> >
> >
> > -rgs
> >
> >>>
> >>>
> >>
> >>
>
>


Re: 3.4.8 release

2016-01-29 Thread Flavio Junqueira
Give me until the end of today to have a look at ZK-2247. Rakesh has put the 
effort to prepare the patch, so I'd like to consider it. If it is not ready, 
then let's push it to 3.4.9. Does it sound good, Raul?

-Flavio

> On 29 Jan 2016, at 10:20, Raúl Gutiérrez Segalés  wrote:
> 
> Ok - lets punt them to 3.4.9 then.
> 
> 
> -rgs
> 
> On 29 January 2016 at 10:10, Chris Nauroth  wrote:
> 
>> +1 for expediting the 3.4.8 release.  Our goal was simply to correct the
>> critical bug discovered late in 3.4.7.  Any patches more than that can be
>> deferred to a 3.4.9 at a later date.
>> 
>> --Chris Nauroth
>> 
>> 
>> 
>> 
>> On 1/29/16, 10:06 AM, "Flavio Junqueira"  wrote:
>> 
>>> +1
>>> 
 On 29 Jan 2016, at 10:02, Ted Yu  wrote:
 
 ZOOKEEPER-2355 is in Open state as of now.
 
 My two cents:
 
 3.4.7 has been retracted.
 It would be nice to get 3.4.8 out the door soon so that zookeeper users
 can
 pick up bug fixes in between 3.4.6 and 3.4 branch.
 
 On Thu, Jan 28, 2016 at 8:57 AM, Raúl Gutiérrez Segalés
  wrote:
 
> Hi,
> 
> On 28 January 2016 at 07:07, Talluri, Chandra <
> chandra.tall...@fmr.com.invalid> wrote:
> 
>> Thanks for the updates.
>> 
>> When can we expect 3.4.8?
>> 
> 
> I think we need to decide if we want to include these patches:
> 
> https://issues.apache.org/jira/browse/ZOOKEEPER-2355
> https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> 
> They seem to be almost ready, though there might be some subtleties
> left to
> be addressed.
> 
> 
>> What are the plans for 3.5.*  stable version release?
>> 
> 
> The general plan for making 3.5 stable is here:
> 
> http://markmail.org/message/ymxliy2rrwjc2pmo
> 
> I believe Chris Nauroth will be the RM for the upcoming 3.5.2 release.
> 
> 
> -rgs
> 
>>> 
>>> 
>> 
>> 



[jira] [Commented] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log

2016-01-29 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123914#comment-15123914
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2247:
---

Per discussion in the mailing list, lets punt this to 3.4.9.

cc: [~fpj]

> Zookeeper service becomes unavailable when leader fails to write transaction 
> log
> 
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.9, 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, 
> ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch, 
> ZOOKEEPER-2247-06.patch, ZOOKEEPER-2247-07.patch, ZOOKEEPER-2247-09.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction 
> log. Bellow are the exceptions
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR 
> [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, 
> from thread : SyncThread:100
> java.io.IOException: Input/output error
>   at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>   at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>   at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>   at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread 
> SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor 
> complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO  [ProcessThread(sid:100 
> cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception Leader server still remains leader. After this non 
> recoverable exception the leader should go down and let other followers 
> become leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2355) Ephemeral node is never deleted if follower fails while reading the proposal packet

2016-01-29 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123912#comment-15123912
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2355:
---

Per discussion in the mailing list, lets punt this to 3.4.9.

cc: [~fpj]

> Ephemeral node is never deleted if follower fails while reading the proposal 
> packet
> ---
>
> Key: ZOOKEEPER-2355
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2355
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum, server
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.9
>
> Attachments: ZOOKEEPER-2355-01.patch
>
>
> ZooKeeper ephemeral node is never deleted if follower fail while reading the 
> proposal packet
> The scenario is as follows:
> # Configure three node ZooKeeper cluster, lets say nodes are A, B and C, 
> start all, assume A is leader, B and C are follower
> # Connect to any of the server and create ephemeral node /e1
> # Close the session, ephemeral node /e1 will go for deletion
> # While receiving delete proposal make Follower B to fail with 
> {{SocketTimeoutException}}. This we need to do to reproduce the scenario 
> otherwise in production environment it happens because of network fault.
> # Remove the fault, just check that faulted Follower is now connected with 
> quorum
> # Connect to any of the server, create the same ephemeral node /e1, created 
> is success.
> # Close the session,  ephemeral node /e1 will go for deletion
> # {color:red}/e1 is not deleted from the faulted Follower B, It should have 
> been deleted as it was again created with another session{color}
> # {color:green}/e1 is deleted from Leader A and other Follower C{color}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log

2016-01-29 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2247:
--
Fix Version/s: (was: 3.4.8)
   3.4.9

> Zookeeper service becomes unavailable when leader fails to write transaction 
> log
> 
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.9, 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, 
> ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch, 
> ZOOKEEPER-2247-06.patch, ZOOKEEPER-2247-07.patch, ZOOKEEPER-2247-09.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction 
> log. Bellow are the exceptions
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR 
> [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, 
> from thread : SyncThread:100
> java.io.IOException: Input/output error
>   at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>   at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>   at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>   at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread 
> SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor 
> complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO  [ProcessThread(sid:100 
> cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception Leader server still remains leader. After this non 
> recoverable exception the leader should go down and let other followers 
> become leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: 3.4.8 release

2016-01-29 Thread Raúl Gutiérrez Segalés
Ok - lets punt them to 3.4.9 then.


-rgs

On 29 January 2016 at 10:10, Chris Nauroth  wrote:

> +1 for expediting the 3.4.8 release.  Our goal was simply to correct the
> critical bug discovered late in 3.4.7.  Any patches more than that can be
> deferred to a 3.4.9 at a later date.
>
> --Chris Nauroth
>
>
>
>
> On 1/29/16, 10:06 AM, "Flavio Junqueira"  wrote:
>
> >+1
> >
> >> On 29 Jan 2016, at 10:02, Ted Yu  wrote:
> >>
> >> ZOOKEEPER-2355 is in Open state as of now.
> >>
> >> My two cents:
> >>
> >> 3.4.7 has been retracted.
> >> It would be nice to get 3.4.8 out the door soon so that zookeeper users
> >>can
> >> pick up bug fixes in between 3.4.6 and 3.4 branch.
> >>
> >> On Thu, Jan 28, 2016 at 8:57 AM, Raúl Gutiérrez Segalés
> >> >>> wrote:
> >>
> >>> Hi,
> >>>
> >>> On 28 January 2016 at 07:07, Talluri, Chandra <
> >>> chandra.tall...@fmr.com.invalid> wrote:
> >>>
>  Thanks for the updates.
> 
>  When can we expect 3.4.8?
> 
> >>>
> >>> I think we need to decide if we want to include these patches:
> >>>
> >>> https://issues.apache.org/jira/browse/ZOOKEEPER-2355
> >>> https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> >>>
> >>> They seem to be almost ready, though there might be some subtleties
> >>>left to
> >>> be addressed.
> >>>
> >>>
>  What are the plans for 3.5.*  stable version release?
> 
> >>>
> >>> The general plan for making 3.5 stable is here:
> >>>
> >>> http://markmail.org/message/ymxliy2rrwjc2pmo
> >>>
> >>> I believe Chris Nauroth will be the RM for the upcoming 3.5.2 release.
> >>>
> >>>
> >>> -rgs
> >>>
> >
> >
>
>


[jira] [Updated] (ZOOKEEPER-2355) Ephemeral node is never deleted if follower fails while reading the proposal packet

2016-01-29 Thread Raul Gutierrez Segales (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raul Gutierrez Segales updated ZOOKEEPER-2355:
--
Fix Version/s: 3.4.9

> Ephemeral node is never deleted if follower fails while reading the proposal 
> packet
> ---
>
> Key: ZOOKEEPER-2355
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2355
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum, server
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.9
>
> Attachments: ZOOKEEPER-2355-01.patch
>
>
> ZooKeeper ephemeral node is never deleted if follower fail while reading the 
> proposal packet
> The scenario is as follows:
> # Configure three node ZooKeeper cluster, lets say nodes are A, B and C, 
> start all, assume A is leader, B and C are follower
> # Connect to any of the server and create ephemeral node /e1
> # Close the session, ephemeral node /e1 will go for deletion
> # While receiving delete proposal make Follower B to fail with 
> {{SocketTimeoutException}}. This we need to do to reproduce the scenario 
> otherwise in production environment it happens because of network fault.
> # Remove the fault, just check that faulted Follower is now connected with 
> quorum
> # Connect to any of the server, create the same ephemeral node /e1, created 
> is success.
> # Close the session,  ephemeral node /e1 will go for deletion
> # {color:red}/e1 is not deleted from the faulted Follower B, It should have 
> been deleted as it was again created with another session{color}
> # {color:green}/e1 is deleted from Leader A and other Follower C{color}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-1936) Server exits when unable to create data directory due to race

2016-01-29 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123906#comment-15123906
 ] 

Chris Nauroth commented on ZOOKEEPER-1936:
--

[~yuzhih...@gmail.com], my understanding is that you only can repro this in 
standalone mode, not when deploying a full ensemble.  Is that correct?  Did you 
get to a point where you had a consistent repro and were able to verify that 
this patch helped fix it?

The logic change looks correct to me, but I'm trying to figure out if there is 
really something more going on, as suggested in earlier comments.  Thanks!

> Server exits when unable to create data directory due to race 
> --
>
> Key: ZOOKEEPER-1936
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6, 3.5.0
>Reporter: Harald Musum
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: ZOOKEEPER-1936.patch, ZOOKEEPER-1936.v2.patch, 
> ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v3.patch
>
>
> We sometime see issues with ZooKeeper server not starting and seeing this 
> error in the log:
> [2014-05-27 09:29:48.248] ERROR   : -   
> .org.apache.zookeeper.server.ZooKeeperServerMainUnexpected exception,
> exiting abnormally\nexception=\njava.io.IOException: Unable to create data
> directory /home/y/var/zookeeper/version-2\n\tat
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:85)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)\n\t
> [...]
> Stack trace from JVM gives this:
> "PurgeTask" daemon prio=10 tid=0x0201d000 nid=0x1727 runnable
> [0x7f55d7dc7000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:68)
> at
> org.apache.zookeeper.server.DatadirCleanupManager$PurgeTask.run(DatadirCleanupManager.java:140)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> "zookeeper server" prio=10 tid=0x027df800 nid=0x1715 runnable
> [0x7f55d7ed8000]
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createDirectory(Native Method)
> at java.io.File.mkdir(File.java:1310)
> at java.io.File.mkdirs(File.java:1337)
> at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.(FileTxnSnapLog.java:84)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> at
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> [...]
> So it seems that when autopurge is used (as it is in our case), it might 
> happen at the same time as starting the server itself. In FileTxnSnapLog() it 
> will check if the directory exists and create it if not. These two tasks do 
> this at the same time, and mkdir fails and server exits the JVM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: 3.4.8 release

2016-01-29 Thread Chris Nauroth
+1 for expediting the 3.4.8 release.  Our goal was simply to correct the
critical bug discovered late in 3.4.7.  Any patches more than that can be
deferred to a 3.4.9 at a later date.

--Chris Nauroth




On 1/29/16, 10:06 AM, "Flavio Junqueira"  wrote:

>+1
>
>> On 29 Jan 2016, at 10:02, Ted Yu  wrote:
>> 
>> ZOOKEEPER-2355 is in Open state as of now.
>> 
>> My two cents:
>> 
>> 3.4.7 has been retracted.
>> It would be nice to get 3.4.8 out the door soon so that zookeeper users
>>can
>> pick up bug fixes in between 3.4.6 and 3.4 branch.
>> 
>> On Thu, Jan 28, 2016 at 8:57 AM, Raúl Gutiérrez Segalés
 wrote:
>> 
>>> Hi,
>>> 
>>> On 28 January 2016 at 07:07, Talluri, Chandra <
>>> chandra.tall...@fmr.com.invalid> wrote:
>>> 
 Thanks for the updates.
 
 When can we expect 3.4.8?
 
>>> 
>>> I think we need to decide if we want to include these patches:
>>> 
>>> https://issues.apache.org/jira/browse/ZOOKEEPER-2355
>>> https://issues.apache.org/jira/browse/ZOOKEEPER-2247
>>> 
>>> They seem to be almost ready, though there might be some subtleties
>>>left to
>>> be addressed.
>>> 
>>> 
 What are the plans for 3.5.*  stable version release?
 
>>> 
>>> The general plan for making 3.5 stable is here:
>>> 
>>> http://markmail.org/message/ymxliy2rrwjc2pmo
>>> 
>>> I believe Chris Nauroth will be the RM for the upcoming 3.5.2 release.
>>> 
>>> 
>>> -rgs
>>> 
>
>



Re: 3.4.8 release

2016-01-29 Thread Flavio Junqueira
+1

> On 29 Jan 2016, at 10:02, Ted Yu  wrote:
> 
> ZOOKEEPER-2355 is in Open state as of now.
> 
> My two cents:
> 
> 3.4.7 has been retracted.
> It would be nice to get 3.4.8 out the door soon so that zookeeper users can
> pick up bug fixes in between 3.4.6 and 3.4 branch.
> 
> On Thu, Jan 28, 2016 at 8:57 AM, Raúl Gutiérrez Segalés > wrote:
> 
>> Hi,
>> 
>> On 28 January 2016 at 07:07, Talluri, Chandra <
>> chandra.tall...@fmr.com.invalid> wrote:
>> 
>>> Thanks for the updates.
>>> 
>>> When can we expect 3.4.8?
>>> 
>> 
>> I think we need to decide if we want to include these patches:
>> 
>> https://issues.apache.org/jira/browse/ZOOKEEPER-2355
>> https://issues.apache.org/jira/browse/ZOOKEEPER-2247
>> 
>> They seem to be almost ready, though there might be some subtleties left to
>> be addressed.
>> 
>> 
>>> What are the plans for 3.5.*  stable version release?
>>> 
>> 
>> The general plan for making 3.5 stable is here:
>> 
>> http://markmail.org/message/ymxliy2rrwjc2pmo
>> 
>> I believe Chris Nauroth will be the RM for the upcoming 3.5.2 release.
>> 
>> 
>> -rgs
>> 



Re: 3.4.8 release

2016-01-29 Thread Ted Yu
ZOOKEEPER-2355 is in Open state as of now.

My two cents:

3.4.7 has been retracted.
It would be nice to get 3.4.8 out the door soon so that zookeeper users can
pick up bug fixes in between 3.4.6 and 3.4 branch.

On Thu, Jan 28, 2016 at 8:57 AM, Raúl Gutiérrez Segalés  wrote:

> Hi,
>
> On 28 January 2016 at 07:07, Talluri, Chandra <
> chandra.tall...@fmr.com.invalid> wrote:
>
> > Thanks for the updates.
> >
> > When can we expect 3.4.8?
> >
>
> I think we need to decide if we want to include these patches:
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-2355
> https://issues.apache.org/jira/browse/ZOOKEEPER-2247
>
> They seem to be almost ready, though there might be some subtleties left to
> be addressed.
>
>
> > What are the plans for 3.5.*  stable version release?
> >
>
> The general plan for making 3.5 stable is here:
>
> http://markmail.org/message/ymxliy2rrwjc2pmo
>
> I believe Chris Nauroth will be the RM for the upcoming 3.5.2 release.
>
>
> -rgs
>


[jira] [Commented] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log

2016-01-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123244#comment-15123244
 ] 

Hadoop QA commented on ZOOKEEPER-2247:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12785133/ZOOKEEPER-2247-09.patch
  against trunk revision 1726354.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3019//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3019//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3019//console

This message is automatically generated.

> Zookeeper service becomes unavailable when leader fails to write transaction 
> log
> 
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.8, 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, 
> ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch, 
> ZOOKEEPER-2247-06.patch, ZOOKEEPER-2247-07.patch, ZOOKEEPER-2247-09.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction 
> log. Bellow are the exceptions
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR 
> [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, 
> from thread : SyncThread:100
> java.io.IOException: Input/output error
>   at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>   at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>   at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>   at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread 
> SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor 
> complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO  [ProcessThread(sid:100 
> cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception Leader server still remains leader. After this non 
> recoverable exception the leader should go down and let other followers 
> become leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Success: ZOOKEEPER-2247 PreCommit Build #3019

2016-01-29 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3019/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 384909 lines...]
 [exec]   
http://issues.apache.org/jira/secure/attachment/12785133/ZOOKEEPER-2247-09.patch
 [exec]   against trunk revision 1726354.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 9 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3019//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3019//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3019//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] daff8d3aeac2a3e58bfba738c5601033b1a22942 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD SUCCESSFUL
Total time: 16 minutes 56 seconds
Archiving artifacts
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Recording test results
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
[description-setter] Description set: ZOOKEEPER-2247
Email was triggered for: Success
Sending email for trigger: Success
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log

2016-01-29 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated ZOOKEEPER-2247:

Attachment: ZOOKEEPER-2247-09.patch

> Zookeeper service becomes unavailable when leader fails to write transaction 
> log
> 
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.8, 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, 
> ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch, 
> ZOOKEEPER-2247-06.patch, ZOOKEEPER-2247-07.patch, ZOOKEEPER-2247-09.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction 
> log. Bellow are the exceptions
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR 
> [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, 
> from thread : SyncThread:100
> java.io.IOException: Input/output error
>   at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>   at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>   at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>   at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread 
> SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor 
> complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO  [ProcessThread(sid:100 
> cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception Leader server still remains leader. After this non 
> recoverable exception the leader should go down and let other followers 
> become leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log

2016-01-29 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated ZOOKEEPER-2247:

Attachment: (was: ZOOKEEPER-2247-09.patch)

> Zookeeper service becomes unavailable when leader fails to write transaction 
> log
> 
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.8, 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, 
> ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch, 
> ZOOKEEPER-2247-06.patch, ZOOKEEPER-2247-07.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction 
> log. Bellow are the exceptions
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR 
> [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, 
> from thread : SyncThread:100
> java.io.IOException: Input/output error
>   at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>   at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>   at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>   at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread 
> SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor 
> complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO  [ProcessThread(sid:100 
> cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception Leader server still remains leader. After this non 
> recoverable exception the leader should go down and let other followers 
> become leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-2247) Zookeeper service becomes unavailable when leader fails to write transaction log

2016-01-29 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated ZOOKEEPER-2247:

Attachment: ZOOKEEPER-2247-09.patch

> Zookeeper service becomes unavailable when leader fails to write transaction 
> log
> 
>
> Key: ZOOKEEPER-2247
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2247
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Arshad Mohammad
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.4.8, 3.5.2
>
> Attachments: ZOOKEEPER-2247-01.patch, ZOOKEEPER-2247-02.patch, 
> ZOOKEEPER-2247-03.patch, ZOOKEEPER-2247-04.patch, ZOOKEEPER-2247-05.patch, 
> ZOOKEEPER-2247-06.patch, ZOOKEEPER-2247-07.patch, ZOOKEEPER-2247-09.patch
>
>
> Zookeeper service becomes unavailable when leader fails to write transaction 
> log. Bellow are the exceptions
> {code}
> 2015-08-14 15:41:18,556 [myid:100] - ERROR 
> [SyncThread:100:ZooKeeperCriticalThread@48] - Severe unrecoverable error, 
> from thread : SyncThread:100
> java.io.IOException: Input/output error
>   at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>   at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>   at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:331)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:380)
>   at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:563)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:178)
>   at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer$ZooKeeperServerListenerImpl@500] - Thread 
> SyncThread:100 exits, error code 1
> 2015-08-14 15:41:18,559 [myid:100] - INFO  
> [SyncThread:100:ZooKeeperServer@523] - shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:SessionTrackerImpl@232] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:LeaderRequestProcessor@77] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:PrepRequestProcessor@1035] - Shutting down
> 2015-08-14 15:41:18,560 [myid:100] - INFO  
> [SyncThread:100:ProposalRequestProcessor@88] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [SyncThread:100:CommitProcessor@356] - Shutting down
> 2015-08-14 15:41:18,561 [myid:100] - INFO  
> [CommitProcessor:100:CommitProcessor@191] - CommitProcessor exited loop!
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:Leader$ToBeAppliedRequestProcessor@915] - Shutting down
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:FinalRequestProcessor@646] - shutdown of request processor 
> complete
> 2015-08-14 15:41:18,562 [myid:100] - INFO  
> [SyncThread:100:SyncRequestProcessor@191] - Shutting down
> 2015-08-14 15:41:18,563 [myid:100] - INFO  [ProcessThread(sid:100 
> cport:-1)::PrepRequestProcessor@159] - PrepRequestProcessor exited loop!
> {code}
> After this exception Leader server still remains leader. After this non 
> recoverable exception the leader should go down and let other followers 
> become leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)