Success: ZOOKEEPER-2383 PreCommit Build #3290

2016-07-27 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2383
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3290/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 435869 lines...]
 [exec]   
http://issues.apache.org/jira/secure/attachment/12819926/ZOOKEEPER-2383.patch
 [exec]   against trunk revision 1754188.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3290//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3290//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3290//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] ccd825d867fa2373b356a075e89b3a755a71778d logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD SUCCESSFUL
Total time: 17 minutes 36 seconds
Archiving artifacts
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Recording test results
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
[description-setter] Description set: ZOOKEEPER-2383
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (ZOOKEEPER-2383) Startup race in ZooKeeperServer

2016-07-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396992#comment-15396992
 ] 

Hadoop QA commented on ZOOKEEPER-2383:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12819926/ZOOKEEPER-2383.patch
  against trunk revision 1754188.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3290//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3290//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3290//console

This message is automatically generated.

> Startup race in ZooKeeperServer
> ---
>
> Key: ZOOKEEPER-2383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: jmx, server
>Affects Versions: 3.4.8
>Reporter: Steve Rowe
>Assignee: Rakesh R
>Priority: Blocker
> Fix For: 3.4.9, 3.5.3, 3.6.0
>
> Attachments: TestZkStandaloneJMXRegistrationRaceConcurrent.java, 
> ZOOKEEPER-2383.patch, release-3.4.8-extra-logging.patch, 
> zk-3.4.8-MBeanRegistry.log, zk-3.4.8-NPE.log
>
>
> In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 
> (SOLR-8724) I ran into test failures where attempts to create a node in a 
> newly started standalone ZooKeeperServer were failing because of an assertion 
> in MBeanRegistry.
> ZooKeeperServer.startup() first sets up its request processor chain then 
> registers itself in JMX, but if a connection comes in before the server's JMX 
> registration happens, registration of the connection will fail because it 
> trips the assertion that (effectively) its parent (the server) has already 
> registered itself.
> {code:java|title=ZooKeeperServer.java}
> public synchronized void startup() {
>     if (sessionTracker == null) {
>         createSessionTracker();
>     }
>     startSessionTracker();
>     setupRequestProcessors();
>     registerJMX();
>     state = State.RUNNING;
>     notifyAll();
> }
> {code}
> {code:java|title=MBeanRegistry.java}
> public void register(ZKMBeanInfo bean, ZKMBeanInfo parent)
>     throws JMException
> {
>     assert bean != null;
>     String path = null;
>     if (parent != null) {
>         path = mapBean2Path.get(parent);
>         assert path != null;
>     }
> {code}
> This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this 
> issue with ZK 3.4.6. 
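
The ordering problem described in the issue can be reproduced in miniature. What follows is a minimal sketch under stated assumptions, not ZooKeeper code and not the attached patch: ToyRegistry is a hypothetical stand-in for a registry that rejects a child whose parent is not yet registered, and the bean names are invented. The two methods contrast a connection arriving before the server has registered itself with the same steps in the opposite order.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a registry that requires a parent to be
// registered before its children; this is not the real MBeanRegistry API.
class ToyRegistry {
    private final Map<String, String> beanToPath = new HashMap<>();

    // Registers a bean under its parent's path; a null parent means a root bean.
    synchronized void register(String bean, String parent) {
        String path = "";
        if (parent != null) {
            path = beanToPath.get(parent);
            if (path == null) {
                throw new IllegalStateException("parent not registered: " + parent);
            }
        }
        beanToPath.put(bean, path + "/" + bean);
    }

    synchronized boolean isRegistered(String bean) {
        return beanToPath.containsKey(bean);
    }
}

class StartupOrderDemo {
    // A connection shows up (simulated inline) before the server has
    // registered itself, as in the race described in the issue.
    static boolean connectionRegistersWhenServerLast() {
        ToyRegistry registry = new ToyRegistry();
        try {
            registry.register("connection-1", "server"); // arrives early
            registry.register("server", null);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    // One possible reordering (an assumption, not necessarily what the
    // patch does): register the server before anything can attach to it.
    static boolean connectionRegistersWhenServerFirst() {
        ToyRegistry registry = new ToyRegistry();
        registry.register("server", null);
        registry.register("connection-1", "server");
        return registry.isRegistered("connection-1");
    }

    public static void main(String[] args) {
        System.out.println("server registered last:  " + connectionRegistersWhenServerLast());
        System.out.println("server registered first: " + connectionRegistersWhenServerFirst());
    }
}
```

In this toy model the first call fails and the second succeeds, which is the same shape as the assertion failure reported above.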



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-2490) infinitely connect on windows

2016-07-27 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-2490:

Assignee: spooky000

> infinitely connect on windows
> -
>
> Key: ZOOKEEPER-2490
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2490
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.5.2
> Environment: Windows
>Reporter: spooky000
>Assignee: spooky000
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2490.patch
>
>
> In the addrvec_contains function, this memcmp check is always false in 
> Windows release builds:
> for (i = 0; i < avec->count; i++)
> {
>     if (memcmp(&avec->data[i], addr, INET_ADDRSTRLEN) == 0)
>         return 1;
> }
> This is because INET_ADDRSTRLEN differs between platforms:
> #define INET_ADDRSTRLEN  16 on Linux.
> #define INET_ADDRSTRLEN  22 on Windows.



[jira] [Commented] (ZOOKEEPER-1884) zkCli silently ignores commands with missing parameters

2016-07-27 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396855#comment-15396855
 ] 

Rakesh R commented on ZOOKEEPER-1884:
-

ping [~rgs]

> zkCli silently ignores commands with missing parameters
> ---
>
> Key: ZOOKEEPER-1884
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1884
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.6
>Reporter: Flavio Junqueira
>Assignee: Raul Gutierrez Segales
>Priority: Minor
> Fix For: 3.4.9
>
> Attachments: ZOOKEEPER-1884.patch
>
>
> Apparently, we have fixed this in trunk, but not in the 3.4 branch. When we 
> pass only the path to create, the command is not executed because it expects 
> an additional parameter and there is no error message because the create 
> command exists.



[jira] [Commented] (ZOOKEEPER-2192) Port "Introduce new ZNode type: container" to 3.4.x

2016-07-27 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396852#comment-15396852
 ] 

Rakesh R commented on ZOOKEEPER-2192:
-

Yes, I think so. Since we haven't gotten much community feedback, I'm just 
moving this out of the 3.4.9 version.

> Port "Introduce new ZNode type: container" to 3.4.x
> ---
>
> Key: ZOOKEEPER-2192
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2192
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: c client, java client, server
>Affects Versions: 3.4.6
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Attachments: ZOOKEEPER-2192.patch, ZOOKEEPER-2192.patch
>
>
> ZOOKEEPER-2163 applies to the trunk branch. This feature is needed too 
> urgently to wait for 3.5.x, so port it to the 3.4.x branch so it can be 
> released ahead of 3.5.x.



[jira] [Updated] (ZOOKEEPER-2192) Port "Introduce new ZNode type: container" to 3.4.x

2016-07-27 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated ZOOKEEPER-2192:

Fix Version/s: (was: 3.4.10)

> Port "Introduce new ZNode type: container" to 3.4.x
> ---
>
> Key: ZOOKEEPER-2192
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2192
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: c client, java client, server
>Affects Versions: 3.4.6
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Attachments: ZOOKEEPER-2192.patch, ZOOKEEPER-2192.patch
>
>
> ZOOKEEPER-2163 applies to the trunk branch. This feature is needed too 
> urgently to wait for 3.5.x, so port it to the 3.4.x branch so it can be 
> released ahead of 3.5.x.



[jira] [Commented] (ZOOKEEPER-2397) Undocumented SASL properties

2016-07-27 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396846#comment-15396846
 ] 

Rakesh R commented on ZOOKEEPER-2397:
-

Hi all, how is this issue progressing? Any chance of concluding it?

> Undocumented SASL properties
> 
>
> Key: ZOOKEEPER-2397
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2397
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 3.4.8, 3.5.1
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
> Fix For: 3.4.9, 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2397.patch
>
>
> There are a number of properties spread across the code that do not appear in 
> the docs. For example, zookeeper.allowSaslFailedClients isn't documented 
> afaict.



[jira] [Updated] (ZOOKEEPER-2492) gethostname return error before Win32WSAStartup on windows.

2016-07-27 Thread spooky000 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

spooky000 updated ZOOKEEPER-2492:
-
Attachment: ZOOKEEPER-2492.patch

> gethostname return error before Win32WSAStartup on windows.
> ---
>
> Key: ZOOKEEPER-2492
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2492
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.2
> Environment: windows
>Reporter: spooky000
>Priority: Trivial
> Fix For: 3.5.3
>
> Attachments: ZOOKEEPER-2492.patch
>
>
> gethostname returns an error if it is called before Win32WSAStartup on Windows.
> In the log_env function:
> gethostname(buf, sizeof(buf));
> LOG_INFO(LOGCALLBACK(zh), "Client environment:host.name=%s", buf);
> buf will be an uninitialized buffer.



[jira] [Created] (ZOOKEEPER-2492) gethostname return error before Win32WSAStartup on windows.

2016-07-27 Thread spooky000 (JIRA)
spooky000 created ZOOKEEPER-2492:


 Summary: gethostname return error before Win32WSAStartup on 
windows.
 Key: ZOOKEEPER-2492
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2492
 Project: ZooKeeper
  Issue Type: Bug
Affects Versions: 3.5.2
 Environment: windows
Reporter: spooky000
Priority: Trivial
 Fix For: 3.5.3


gethostname returns an error if it is called before Win32WSAStartup on Windows.

In the log_env function:
gethostname(buf, sizeof(buf));
LOG_INFO(LOGCALLBACK(zh), "Client environment:host.name=%s", buf);

buf will be an uninitialized buffer.




[jira] [Updated] (ZOOKEEPER-2491) C client build error in vs 2015

2016-07-27 Thread spooky000 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

spooky000 updated ZOOKEEPER-2491:
-
Attachment: ZOOKEEPER-2491.patch

> C client build error in vs 2015 
> 
>
> Key: ZOOKEEPER-2491
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2491
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.5.2
> Environment: windows vs 2015
>Reporter: spooky000
> Fix For: 3.6.0
>
> Attachments: ZOOKEEPER-2491.patch
>
>
> Visual Studio 2015 supports snprintf natively, so
> #define snprintf _snprintf causes a build error.



[jira] [Created] (ZOOKEEPER-2491) C client build error in vs 2015

2016-07-27 Thread spooky000 (JIRA)
spooky000 created ZOOKEEPER-2491:


 Summary: C client build error in vs 2015 
 Key: ZOOKEEPER-2491
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2491
 Project: ZooKeeper
  Issue Type: Bug
  Components: c client
Affects Versions: 3.5.2
 Environment: windows vs 2015
Reporter: spooky000
 Fix For: 3.6.0


Visual Studio 2015 supports snprintf natively, so
#define snprintf _snprintf causes a build error.




[jira] [Updated] (ZOOKEEPER-2490) infinitely connect on windows

2016-07-27 Thread spooky000 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

spooky000 updated ZOOKEEPER-2490:
-
Attachment: ZOOKEEPER-2490.patch

> infinitely connect on windows
> -
>
> Key: ZOOKEEPER-2490
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2490
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.5.2
> Environment: Windows
>Reporter: spooky000
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2490.patch
>
>
> In the addrvec_contains function, this memcmp check is always false in 
> Windows release builds:
> for (i = 0; i < avec->count; i++)
> {
>     if (memcmp(&avec->data[i], addr, INET_ADDRSTRLEN) == 0)
>         return 1;
> }
> This is because INET_ADDRSTRLEN differs between platforms:
> #define INET_ADDRSTRLEN  16 on Linux.
> #define INET_ADDRSTRLEN  22 on Windows.



ZooKeeper-trunk - Build # 3010 - Still Failing

2016-07-27 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk/3010/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 1502 lines...]
 [exec] 
 [exec] init-proxy:
 [exec] 
 [exec] fetch-skins-descriptors:
 [exec] 
 [exec] fetch-skin:
 [exec] 
 [exec] unpack-skins:
 [exec] 
 [exec] init-skins:
 [exec] 
 [exec] fetch-plugins-descriptors:
 [exec] Copying 1 file to 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/src/docs/build/tmp
 [exec] Copying 1 file to 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/src/docs/build/tmp
 [exec] Fetching plugins descriptor: 
http://forrest.apache.org/plugins/plugins.xml
 [exec] Getting: http://forrest.apache.org/plugins/plugins.xml
 [exec] To: 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/src/docs/build/tmp/plugins-1.xml
 [exec] local file date : Tue Nov 18 13:39:59 UTC 2014
 [exec] Not modified - so not downloaded
 [exec] Fetching plugins descriptor: 
http://forrest.apache.org/plugins/whiteboard-plugins.xml
 [exec] Getting: http://forrest.apache.org/plugins/whiteboard-plugins.xml
 [exec] To: 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/src/docs/build/tmp/plugins-2.xml
 [exec] local file date : Tue Nov 18 13:39:59 UTC 2014
 [exec] Not modified - so not downloaded
 [exec] Plugin list loaded from 
http://forrest.apache.org/plugins/plugins.xml.
 [exec] Plugin list loaded from 
http://forrest.apache.org/plugins/whiteboard-plugins.xml.
 [exec] 
 [exec] init-plugins:
 [exec] 
 [exec] BUILD FAILED
 [exec] /home/jenkins/tools/forrest/latest/main/targets/plugins.xml:374: 
Directory /home/jenkins/tools/forrest/latest/build/plugins creation was not 
successful for an unknown reason
 [exec] 
 [exec] Total time: 1 second

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build.xml:468: exec 
returned: 1

Total time: 1 second
mv: cannot stat 'build/*.tar.gz': No such file or directory
mv: cannot stat 'build/*.jar': No such file or directory
mv: cannot stat 'build/test/findbugs': No such file or directory
mv: cannot stat 'build/docs/api': No such file or directory
Build Failed
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
[WARNINGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Recording fingerprints
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Recording test results
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
ERROR: Step ‘Publish JUnit test result report’ failed: No test report files 
were found. Configuration error?
Publishing Javadoc
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Commented] (ZOOKEEPER-1256) ClientPortBindTest is failing on Mac OS X

2016-07-27 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396590#comment-15396590
 ] 

Flavio Junqueira commented on ZOOKEEPER-1256:
-

Yeah, it is good for me, +1.

> ClientPortBindTest is failing on Mac OS X
> -
>
> Key: ZOOKEEPER-1256
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1256
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
> Environment: Mac OS X
>Reporter: Daniel Gómez Ferro
>Assignee: Flavio Junqueira
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ClientPortBindTest.log, ZOOKEEPER-1256.patch, 
> ZOOKEEPER-1256.patch, ZOOKEEPER-1256.patch, ZOOKEEPER-1256v2.patch, 
> ZOOKEEPER-1256v3.patch
>
>
> ClientPortBindTest is failing consistently on Mac OS X.



[jira] [Commented] (ZOOKEEPER-1045) Support Quorum Peer mutual authentication via SASL

2016-07-27 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396386#comment-15396386
 ] 

Patrick Hunt commented on ZOOKEEPER-1045:
-

Rakesh - I don't think we can rely on the "host" from the zoo.cfg file, it 
might be the host name, it might be an IP address, it might be FQDN, might not 
match whatever is in the kerberos credential.

I chatted with the HDFS and HBase folks briefly, and what they mentioned to me 
was that they look at the user and domain portion of the user/host@domain 
principal, and don't compare the host portion. This is why it's a bit more 
complicated than a simple string comparison as we originally had it in this 
patch. This would provide the authz at the user and domain level, while not 
constraining the host. Given we aren't using shared credentials, I believe this 
is sufficient: each ZK server would authenticate the other ZK servers with 
Kerberos, then check that the user and domain are correct. ZK servers would have 
a dedicated user name such as "zk-quorum-peer" or something like that. Another 
way to say it is that we are authorizing the servers by user id, which is 
unique to an ensemble of quorum peers, and that is sufficient within a domain.

Note that I'm not an expert at this stuff myself. Does what I'm saying make 
sense? Perhaps we should get on a hangout and discuss f2f for a bit? I'm open 
to that if you think it would help figure out the right approach.
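
To make the user-and-domain comparison concrete, here is a minimal sketch, assuming principals arrive as plain user/host@domain strings. The class and method names are hypothetical and are not part of the patch under discussion. Real Kerberos principal parsing has escaping rules that this sketch ignores.

```java
// Hypothetical helper: compares two Kerberos-style principals of the form
// user/host@REALM by user and realm only, ignoring the host part.
class PrincipalMatch {
    // Returns {user, realm}; the host is dropped. Missing parts become "".
    static String[] userAndRealm(String principal) {
        String realm = "";
        String rest = principal;
        int at = principal.lastIndexOf('@');
        if (at >= 0) {
            realm = principal.substring(at + 1);
            rest = principal.substring(0, at);
        }
        int slash = rest.indexOf('/');
        String user = slash >= 0 ? rest.substring(0, slash) : rest;
        return new String[] { user, realm };
    }

    // True when both principals carry the same non-empty user and the same realm.
    static boolean sameUserAndRealm(String a, String b) {
        String[] pa = userAndRealm(a);
        String[] pb = userAndRealm(b);
        return !pa[0].isEmpty() && pa[0].equals(pb[0]) && pa[1].equals(pb[1]);
    }

    public static void main(String[] args) {
        // Same user and realm, different host forms: accepted.
        System.out.println(sameUserAndRealm(
            "zk-quorum-peer/host1.example.com@EXAMPLE.COM",
            "zk-quorum-peer/10.0.0.7@EXAMPLE.COM"));
        // Different user in the same realm: rejected.
        System.out.println(sameUserAndRealm(
            "zk-quorum-peer/host1.example.com@EXAMPLE.COM",
            "someone-else/host1.example.com@EXAMPLE.COM"));
    }
}
```

Because the host part is never compared, it does not matter whether zoo.cfg lists a short host name, an IP address, or an FQDN.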

> Support Quorum Peer mutual authentication via SASL
> --
>
> Key: ZOOKEEPER-1045
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1045
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>Reporter: Eugene Koontz
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.4.9, 3.5.3
>
> Attachments: 0001-ZOOKEEPER-1045-br-3-4.patch, 
> 1045_failing_phunt.tar.gz, 
> TEST-org.apache.zookeeper.server.quorum.auth.QuorumAuthUpgradeTest.txt, 
> ZK-1045-test-case-failure-logs.zip, ZOOKEEPER-1045-00.patch, 
> ZOOKEEPER-1045-Rolling Upgrade Design Proposal.pdf, 
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, 
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, 
> ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, 
> ZOOKEEPER-1045TestValidationDesign.pdf
>
>
> ZOOKEEPER-938 addresses mutual authentication between clients and servers. 
> This bug, on the other hand, is for authentication among quorum peers. 
> Hopefully much of the work done on SASL integration with Zookeeper for 
> ZOOKEEPER-938 can be used as a foundation for this enhancement.



[jira] [Commented] (ZOOKEEPER-1256) ClientPortBindTest is failing on Mac OS X

2016-07-27 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396380#comment-15396380
 ] 

Chris Nauroth commented on ZOOKEEPER-1256:
--

Patch v3 passes for me on both trunk and branch-3.4, tested on both Mac and 
CentOS 7.

+1 for patch v3, pending a pre-commit run (currently blocked on some 
infrastructure issues).  Flavio, are you OK with that too?

I'm leaving some additional notes here too in case it's useful for anyone in 
the future.  The latest patch got me curious about what exactly was going on 
here.  I adapted this to a code snippet that iterates through all interfaces 
and addresses and prints whether they are considered loopback.  Here is a Scala 
REPL session for that on the Mac:

{code}
def printAddresses() = {
  val ifaces = java.net.NetworkInterface.getNetworkInterfaces
  while (ifaces.hasMoreElements) {
    val iface = ifaces.nextElement
    val addrs = iface.getInetAddresses
    while (addrs.hasMoreElements) {
      val addr = addrs.nextElement
      printf("iface=[%s], iface.isLoopback=[%s], addr=[%s], addr.isLoopbackAddress=[%s]%n",
        iface, iface.isLoopback, addr, addr.isLoopbackAddress)
    }
  }
}

scala> printAddresses
printAddresses
iface=[name:awdl0 (awdl0)], iface.isLoopback=[false], addr=[/fe80:0:0:0:e836:36ff:fe8f:2795%6], addr.isLoopbackAddress=[false]
iface=[name:vboxnet0 (vboxnet0)], iface.isLoopback=[false], addr=[/192.168.56.1], addr.isLoopbackAddress=[false]
iface=[name:en0 (en0)], iface.isLoopback=[false], addr=[/fe80:0:0:0:3e15:c2ff:fed1:8136%4], addr.isLoopbackAddress=[false]
iface=[name:en0 (en0)], iface.isLoopback=[false], addr=[/10.22.2.86], addr.isLoopbackAddress=[false]
iface=[name:lo0 (lo0)], iface.isLoopback=[true], addr=[/fe80:0:0:0:0:0:0:1%1], addr.isLoopbackAddress=[false]
iface=[name:lo0 (lo0)], iface.isLoopback=[true], addr=[/0:0:0:0:0:0:0:1], addr.isLoopbackAddress=[true]
iface=[name:lo0 (lo0)], iface.isLoopback=[true], addr=[/127.0.0.1], addr.isLoopbackAddress=[true]
{code}

Notice that for /fe80:0:0:0:0:0:0:1%1, the interface is considered loopback, 
but the address is not considered loopback.  That seems wrong.  I assume it has 
something to do with the JDK bugs linked above.

This is different from my CentOS VM, where all addresses associated with the 
loopback interface are also considered loopback:

{code}
scala> printAddresses
iface=[name:enp0s8 (enp0s8)], iface.isLoopback=[false], addr=[/fe80:0:0:0:a00:27ff:fe06:287a%enp0s8], addr.isLoopbackAddress=[false]
iface=[name:enp0s8 (enp0s8)], iface.isLoopback=[false], addr=[/192.168.56.105], addr.isLoopbackAddress=[false]
iface=[name:enp0s3 (enp0s3)], iface.isLoopback=[false], addr=[/fe80:0:0:0:a00:27ff:fea9:db31%enp0s3], addr.isLoopbackAddress=[false]
iface=[name:enp0s3 (enp0s3)], iface.isLoopback=[false], addr=[/10.0.2.15], addr.isLoopbackAddress=[false]
iface=[name:lo (lo)], iface.isLoopback=[true], addr=[/0:0:0:0:0:0:0:1%lo], addr.isLoopbackAddress=[true]
iface=[name:lo (lo)], iface.isLoopback=[true], addr=[/127.0.0.1], addr.isLoopbackAddress=[true]
{code}
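
Since the interface-level and address-level loopback checks can disagree, as the Mac output shows, one defensive option is to require both and prefer IPv4. The following Java sketch illustrates that idea only. It is an assumption about one reasonable selection rule, not the logic of the v3 patch, and LoopbackPicker is a hypothetical name.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;

class LoopbackPicker {
    // Returns an address for which both the interface and the address itself
    // report loopback, preferring IPv4; returns null if none is found.
    static InetAddress pickLoopback() throws SocketException {
        Enumeration<NetworkInterface> ifaces = NetworkInterface.getNetworkInterfaces();
        if (ifaces == null) {
            return null; // no interfaces reported on this machine
        }
        InetAddress fallback = null;
        for (NetworkInterface iface : Collections.list(ifaces)) {
            if (!iface.isLoopback()) {
                continue;
            }
            for (InetAddress addr : Collections.list(iface.getInetAddresses())) {
                if (!addr.isLoopbackAddress()) {
                    continue; // skips cases like fe80::1%1 on lo0 in the Mac output
                }
                if (addr instanceof Inet4Address) {
                    return addr; // IPv4 loopback wins immediately
                }
                if (fallback == null) {
                    fallback = addr; // remember the first IPv6 loopback
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) throws SocketException {
        System.out.println("chosen loopback address: " + pickLoopback());
    }
}
```

The result depends on the machine's interfaces, so no fixed output is shown here.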


> ClientPortBindTest is failing on Mac OS X
> -
>
> Key: ZOOKEEPER-1256
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1256
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
> Environment: Mac OS X
>Reporter: Daniel Gómez Ferro
>Assignee: Flavio Junqueira
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ClientPortBindTest.log, ZOOKEEPER-1256.patch, 
> ZOOKEEPER-1256.patch, ZOOKEEPER-1256.patch, ZOOKEEPER-1256v2.patch, 
> ZOOKEEPER-1256v3.patch
>
>
> ClientPortBindTest is failing consistently on Mac OS X.



Success: ZOOKEEPER-1256 PreCommit Build #3289

2016-07-27 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1256
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3289/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 440286 lines...]
 [exec]   
http://issues.apache.org/jira/secure/attachment/12820539/ZOOKEEPER-1256v3.patch
 [exec]   against trunk revision 1754188.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3289//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3289//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3289//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] 
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/src/java/test/bin/test-patch.sh:
 line 558: /home/jenkins/tools/jiracli/latest/jira: Permission denied
 [exec] 
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/src/java/test/bin/test-patch.sh:
 line 559: /home/jenkins/tools/jiracli/latest/jira: Permission denied

BUILD SUCCESSFUL
Total time: 17 minutes 23 seconds
Archiving artifacts
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Recording test results
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
[description-setter] Description set: ZOOKEEPER-1256
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (ZOOKEEPER-1256) ClientPortBindTest is failing on Mac OS X

2016-07-27 Thread Camille Fournier (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Camille Fournier updated ZOOKEEPER-1256:

Attachment: ZOOKEEPER-1256v3.patch

all JDK 1.4 and respects loopback check

> ClientPortBindTest is failing on Mac OS X
> -
>
> Key: ZOOKEEPER-1256
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1256
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
> Environment: Mac OS X
>Reporter: Daniel Gómez Ferro
>Assignee: Flavio Junqueira
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ClientPortBindTest.log, ZOOKEEPER-1256.patch, 
> ZOOKEEPER-1256.patch, ZOOKEEPER-1256.patch, ZOOKEEPER-1256v2.patch, 
> ZOOKEEPER-1256v3.patch
>
>
> ClientPortBindTest is failing consistently on Mac OS X.



[jira] [Commented] (ZOOKEEPER-1256) ClientPortBindTest is failing on Mac OS X

2016-07-27 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396242#comment-15396242
 ] 

Flavio Junqueira commented on ZOOKEEPER-1256:
-

bq. Should we look specifically for an IPv4 loopback address?

That's why I suggested setting {{java.net.preferIPv4Stack}}; also see the JDK 
bugs that [~hanm] pointed us to above.

> ClientPortBindTest is failing on Mac OS X
> -
>
> Key: ZOOKEEPER-1256
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1256
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
> Environment: Mac OS X
>Reporter: Daniel Gómez Ferro
>Assignee: Flavio Junqueira
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ClientPortBindTest.log, ZOOKEEPER-1256.patch, 
> ZOOKEEPER-1256.patch, ZOOKEEPER-1256.patch, ZOOKEEPER-1256v2.patch
>
>
> ClientPortBindTest is failing consistently on Mac OS X.



[jira] [Commented] (ZOOKEEPER-2104) Sudden crash of all nodes in the cluster

2016-07-27 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396234#comment-15396234
 ] 

Camille Fournier commented on ZOOKEEPER-2104:
-

Yeah, your init limit needs to be longer. They're not getting into quorum 
because it takes longer than 20s to sync. Dunno why the original node crashed 
but if you increase initLimit that should solve this problem.
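
For sizing the new value, recall that initLimit is counted in ticks, so the sync window is tickTime multiplied by initLimit. The sketch below works through that arithmetic. The tickTime of 2000 ms and initLimit of 10 are assumptions chosen only because they give the 20s window mentioned above, and the 45-second sync time is an invented figure; none of these come from the reporter's configuration.

```java
class InitLimitMath {
    // Sync window in milliseconds: initLimit is counted in ticks of tickTime ms.
    static long initWindowMillis(int tickTimeMillis, int initLimitTicks) {
        return (long) tickTimeMillis * initLimitTicks;
    }

    // Smallest initLimit whose window covers the observed sync time (rounds up).
    static int minInitLimit(int tickTimeMillis, long observedSyncMillis) {
        return (int) ((observedSyncMillis + tickTimeMillis - 1) / tickTimeMillis);
    }

    public static void main(String[] args) {
        // Assumed settings: tickTime=2000, initLimit=10 gives a 20000 ms window.
        System.out.println(initWindowMillis(2000, 10));
        // Invented sync time of 45 s needs at least 23 ticks at tickTime=2000.
        System.out.println(minInitLimit(2000, 45_000));
    }
}
```

In practice it is worth leaving headroom above the computed minimum, since sync time grows with the size of the data being transferred.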

> Sudden crash of all nodes in the cluster
> 
>
> Key: ZOOKEEPER-2104
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2104
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6
>Reporter: Benjamin Jaton
> Attachments: zookeeper-errors.txt, zookeeper-warns.txt
>
>
> In a 3-node ensemble, all the nodes suddenly seem to fail, displaying 
> "ZooKeeper is not running" messages.
> No retry seems to happen after that.
> This is a request to understand what happened and probably to improve the logs 
> when it does.
> See logs below:
> NODE1:
> -- no log for several days before this --
> 2015-01-04 16:18:22,259 [myid:1] - WARN  [SyncThread:1:FileTxnLog@321] - 
> fsync-ing the write ahead log in SyncThread:1 took 11024ms which will 
> adversely effect operation latency. See the ZooKeeper troubleshooting guide
> 2015-01-04 16:18:22,380 [myid:1] - WARN  
> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when 
> following the leader
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
> at 
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
> at 
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
> 2015-01-04 16:18:23,384 [myid:1] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2015-01-04 16:18:23,492 [myid:1] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2015-01-04 16:18:24,060 [myid:1] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> NODE2:
> -- no log for several days before this --
> 2015-01-04 16:18:21,899 [myid:3] - WARN  
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when 
> following the leader
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
> at 
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
> at 
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
> 2015-01-04 16:18:22,760 [myid:3] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2015-01-04 16:18:22,801 [myid:3] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2015-01-04 16:18:22,886 [myid:3] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> NODE3 (leader):
> -- no log for several days before this --
> 2015-01-04 16:18:21,897 [myid:2] - WARN  
> [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:LearnerHandler@687] - Closing 
> connection to peer due to transaction timeout.
> 2015-01-04 16:18:21,898 [myid:2] - WARN  
> [LearnerHandler-/204.53.107.249:43402:LearnerHandler@646] - *** GOODBYE 
> /204.53.107.249:43402 
> 2015-01-04 16:18:21,905 [myid:2] - WARN  
> [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:LearnerHandler@687] - Closing 
> connection to peer due to transaction timeout.
> 2015-01-04 16:18:21,907 [myid:2] - WARN  
> [LearnerHandler-/204.53.107.247:45953:LearnerHandler@646] - *** GOODBYE 

[jira] [Commented] (ZOOKEEPER-2104) Sudden crash of all nodes in the cluster

2016-07-27 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396172#comment-15396172
 ] 

Camille Fournier commented on ZOOKEEPER-2104:
-

It's hard to tell if this is just that the logs were grabbed at different times 
or if it is clock drift but I would check for clock drift.
I'm also seeing this error though:
2016-07-27 11:47:05,709 [myid:2] - WARN  [SyncThread:2:FileTxnLog@321] - 
fsync-ing the write ahead 
log in SyncThread:2 took ms which will adversely effect operation latency. 
See the ZooKeeper troubleshooting guide

It's also taking over 10 seconds to read the snapshot on startup, which is not 
a good sign. Flavio's advice to increase the initLimit is probably good.


Success: ZOOKEEPER-1256 PreCommit Build #3288

2016-07-27 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1256
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3288/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 442084 lines...]
 [exec]   
http://issues.apache.org/jira/secure/attachment/12820518/ZOOKEEPER-1256v2.patch
 [exec]   against trunk revision 1754188.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 2.0.3) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3288//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3288//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3288//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] 
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/src/java/test/bin/test-patch.sh:
 line 558: /home/jenkins/tools/jiracli/latest/jira: Permission denied
 [exec] 
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/src/java/test/bin/test-patch.sh:
 line 559: /home/jenkins/tools/jiracli/latest/jira: Permission denied

BUILD SUCCESSFUL
Total time: 17 minutes 55 seconds
Archiving artifacts
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Recording test results
[description-setter] Description set: ZOOKEEPER-1256
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (ZOOKEEPER-2104) Sudden crash of all nodes in the cluster

2016-07-27 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396130#comment-15396130
 ] 

Camille Fournier commented on ZOOKEEPER-2104:
-

Is it possible this is a clock drift problem? The logs you've provided end at 
12:13:35 for node 1, 12:18:31 for node 2, and 12:14:11 for node 3. I can't 
remember if this degree of clock drift causes issues or not; [~fpj], do you 
recall?
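
For reference, the gap between the earliest and latest of those three end 
times can be worked out as below. This is only a sketch with hypothetical 
names; it assumes all three times fall on the same day, and a gap like this 
does not by itself distinguish clock drift from logs that were simply 
collected at different moments.

```java
import java.time.Duration;
import java.time.LocalTime;

// Sketch only: hypothetical helper for comparing the last timestamps of several logs.
final class LogEndSpreadSketch {
    /** Returns the gap between the earliest and latest of the given HH:mm:ss times. */
    static Duration spread(String... times) {
        LocalTime min = null;
        LocalTime max = null;
        for (String t : times) {
            LocalTime parsed = LocalTime.parse(t); // ISO local time, e.g. 12:13:35
            if (min == null || parsed.isBefore(min)) {
                min = parsed;
            }
            if (max == null || parsed.isAfter(max)) {
                max = parsed;
            }
        }
        if (min == null) {
            throw new IllegalArgumentException("need at least one time");
        }
        return Duration.between(min, max);
    }

    public static void main(String[] args) {
        // End times quoted in the comment above (node 1, node 2, node 3).
        Duration gap = spread("12:13:35", "12:18:31", "12:14:11");
        System.out.println("spread in seconds: " + gap.getSeconds());
    }
}
```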


[jira] [Commented] (ZOOKEEPER-1256) ClientPortBindTest is failing on Mac OS X

2016-07-27 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396115#comment-15396115
 ] 

Camille Fournier commented on ZOOKEEPER-1256:
-

So, the test says in comments that "if we have a loopback, and it has an 
address use it" but we do not request the loopback address when getting the 
bindAddress set. This patch does nothing but specifically request the loopback 
address. It passes on my machine.
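
The patch itself is not reproduced in this thread. As a general illustration 
of what "specifically request the loopback address" can look like in plain 
Java, here is a minimal sketch; the class and method names are hypothetical 
and this is not the ZooKeeper test code.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Sketch only: binds a throwaway server socket to the loopback address.
final class LoopbackBindSketch {
    /** Binds to the loopback address on an ephemeral port and reports what was bound. */
    static InetSocketAddress bindToLoopback() {
        // Ask for the loopback address explicitly rather than taking
        // whichever interface address happens to be enumerated first.
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket()) {
            server.bind(new InetSocketAddress(loopback, 0)); // port 0 = let the OS choose
            return (InetSocketAddress) server.getLocalSocketAddress();
        } catch (IOException e) {
            throw new IllegalStateException("could not bind to loopback", e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress bound = bindToLoopback();
        System.out.println("bound to " + bound.getAddress().getHostAddress()
                + " port " + bound.getPort());
    }
}
```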

> ClientPortBindTest is failing on Mac OS X
> -
>
> Key: ZOOKEEPER-1256
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1256
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
> Environment: Mac OS X
>Reporter: Daniel Gómez Ferro
>Assignee: Flavio Junqueira
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ClientPortBindTest.log, ZOOKEEPER-1256.patch, 
> ZOOKEEPER-1256.patch, ZOOKEEPER-1256.patch, ZOOKEEPER-1256v2.patch
>
>
> ClientPortBindTest is failing consistently on Mac OS X.





[jira] [Updated] (ZOOKEEPER-1256) ClientPortBindTest is failing on Mac OS X

2016-07-27 Thread Camille Fournier (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Camille Fournier updated ZOOKEEPER-1256:

Attachment: ZOOKEEPER-1256v2.patch

patch that specifies that we want the loopback address

> ClientPortBindTest is failing on Mac OS X
> -
>
> Key: ZOOKEEPER-1256
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1256
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
> Environment: Mac OS X
>Reporter: Daniel Gómez Ferro
>Assignee: Flavio Junqueira
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ClientPortBindTest.log, ZOOKEEPER-1256.patch, 
> ZOOKEEPER-1256.patch, ZOOKEEPER-1256.patch, ZOOKEEPER-1256v2.patch
>
>
> ClientPortBindTest is failing consistently on Mac OS X.





[jira] [Commented] (ZOOKEEPER-1256) ClientPortBindTest is failing on Mac OS X

2016-07-27 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396110#comment-15396110
 ] 

Flavio Junqueira commented on ZOOKEEPER-1256:
-

This is also failing in jenkins, btw.

> ClientPortBindTest is failing on Mac OS X
> -
>
> Key: ZOOKEEPER-1256
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1256
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
> Environment: Mac OS X
>Reporter: Daniel Gómez Ferro
>Assignee: Flavio Junqueira
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ClientPortBindTest.log, ZOOKEEPER-1256.patch, 
> ZOOKEEPER-1256.patch, ZOOKEEPER-1256.patch
>
>
> ClientPortBindTest is failing consistently on Mac OS X.





[jira] [Commented] (ZOOKEEPER-1256) ClientPortBindTest is failing on Mac OS X

2016-07-27 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396106#comment-15396106
 ] 

Camille Fournier commented on ZOOKEEPER-1256:
-

I don't absolutely love fixing this by setting that property. Let me look at 
this for a minute.

> ClientPortBindTest is failing on Mac OS X
> -
>
> Key: ZOOKEEPER-1256
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1256
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
> Environment: Mac OS X
>Reporter: Daniel Gómez Ferro
>Assignee: Flavio Junqueira
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ClientPortBindTest.log, ZOOKEEPER-1256.patch, 
> ZOOKEEPER-1256.patch, ZOOKEEPER-1256.patch
>
>
> ClientPortBindTest is failing consistently on Mac OS X.





[jira] [Commented] (ZOOKEEPER-2080) ReconfigRecoveryTest fails intermittently

2016-07-27 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396091#comment-15396091
 ] 

Michael Han commented on ZOOKEEPER-2080:


Hi Alex, thanks for the review :)
bq. do you think that the creation of a new election object won't be interfered 
if the old object shutdown/GC hasn't happened yet 

The new leader election object and the old leader election object do not 
share object state: each object has its own QuorumCnxManager that manages the 
underlying TCP connections used for leader election. They could in theory 
share the same socket address (election address), because I believe 
this address is statically generated from the connection string instead of 
dynamically generated (like the uniquePort utility we had in test), and this 
address seems to be the only thing that different QuorumCnxManagers share. In 
theory we might have two QuorumCnxManagers, one from the old election object 
waiting to be shut down and the other one from the new election object, that 
both try binding to the same address. I haven't found any issues related to 
this though during my stress test on unit tests (in particular for the 
reconfig test), and I think we could possibly address this issue by some retry 
logic with exponential backoff when binding to the socket in QuorumCnxManager.
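
To make the retry idea concrete, here is a minimal sketch of binding a server 
socket with exponential backoff. It is not QuorumCnxManager code: the class, 
method names, and parameters (attempt count, base delay, cap) are all 
hypothetical and only show the shape such logic could take.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Sketch only: hypothetical helper, not taken from ZooKeeper.
final class BindRetrySketch {
    /** Delay before the retry that follows attempt number `attempt` (0-based): base * 2^attempt, capped. */
    static long backoffMs(long baseMs, long capMs, int attempt) {
        // Keep the shift small so realistic base delays do not overflow.
        int shift = Math.min(Math.max(attempt, 0), 20);
        return Math.min(baseMs << shift, capMs);
    }

    /** Tries to bind up to maxAttempts times, sleeping with exponential backoff between failures. */
    static ServerSocket bindWithBackoff(InetSocketAddress addr, int maxAttempts,
                                        long baseMs, long capMs) {
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            ServerSocket socket = null;
            try {
                socket = new ServerSocket(); // created unbound
                socket.bind(addr);
                return socket;
            } catch (IOException e) {
                // For example, the old owner of the address has not released it yet.
                last = e;
                closeQuietly(socket);
                if (attempt + 1 < maxAttempts) {
                    sleep(backoffMs(baseMs, capMs, attempt));
                }
            }
        }
        throw new IllegalStateException("could not bind to " + addr, last);
    }

    private static void closeQuietly(ServerSocket socket) {
        if (socket != null) {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful to do if closing a failed socket also fails.
            }
        }
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry bind", e);
        }
    }

    public static void main(String[] args) throws IOException {
        InetSocketAddress addr = new InetSocketAddress(InetAddress.getLoopbackAddress(), 0);
        try (ServerSocket socket = bindWithBackoff(addr, 5, 100, 5_000)) {
            System.out.println("bound to port " + socket.getLocalPort());
        }
    }
}
```

Capping the delay keeps the worst-case wait bounded, and giving up after a 
fixed number of attempts avoids hiding a genuinely misconfigured address.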

bq. any way to test this using a unit test
I don't have any concrete ideas around this; my thinking is we could possibly 
expose some options from the related classes under test so we can artificially 
inject faults, create race conditions, and control timings. For example, we 
could delay the shutdown of the old leader election object and see what 
happens. As a simple test, I removed the statement completely and 5 out 
of 6 ReconfigRecoveryTest tests failed, which is expected because that 
statement is not supposed to be removed, so maybe instead of removing it we 
can add a delay and make sure everything still works.




> ReconfigRecoveryTest fails intermittently
> -
>
> Key: ZOOKEEPER-2080
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2080
> Project: ZooKeeper
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Michael Han
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch, 
> jacoco-ZOOKEEPER-2080.unzip-grows-to-70MB.7z, repro-20150816.log, 
> threaddump.log
>
>
> I got the following test failure on MacBook with trunk code:
> {code}
> Testcase: testCurrentObserverIsParticipantInNewConfig took 93.628 sec
>   FAILED
> waiting for server 2 being up
> junit.framework.AssertionFailedError: waiting for server 2 being up
>   at 
> org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentObserverIsParticipantInNewConfig(ReconfigRecoveryTest.java:529)
>   at 
> org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
> {code}





ZooKeeper Jenkins builds are broken (environmental)

2016-07-27 Thread Patrick Hunt
Hi folks - just a heads up that the Jenkins environment has changed and as
a result our builds are failing. We're working with the INFRA team to resolve
this:

https://issues.apache.org/jira/browse/INFRA-12327

Patrick


ZooKeeper_branch35_solaris - Build # 185 - Still Failing

2016-07-27 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch35_solaris/185/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 423326 lines...]
[junit] 2016-07-27 17:29:50,372 [myid:] - INFO  [main:JMXEnv@146] - 
ensureOnly:[]
[junit] 2016-07-27 17:29:50,378 [myid:] - INFO  [main:ClientBase@466] - 
STARTING server
[junit] 2016-07-27 17:29:50,378 [myid:] - INFO  [main:ClientBase@386] - 
CREATING server instance 127.0.0.1:11222
[junit] 2016-07-27 17:29:50,378 [myid:] - INFO  
[main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s 
sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 
kB direct buffers.
[junit] 2016-07-27 17:29:50,379 [myid:] - INFO  
[main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:11222
[junit] 2016-07-27 17:29:50,379 [myid:] - INFO  [main:ClientBase@361] - 
STARTING server instance 127.0.0.1:11222
[junit] 2016-07-27 17:29:50,380 [myid:] - INFO  [main:ZooKeeperServer@858] 
- minSessionTimeout set to 6000
[junit] 2016-07-27 17:29:50,380 [myid:] - INFO  [main:ZooKeeperServer@867] 
- maxSessionTimeout set to 6
[junit] 2016-07-27 17:29:50,380 [myid:] - INFO  [main:ZooKeeperServer@156] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test6667677890380192355.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test6667677890380192355.junit.dir/version-2
[junit] 2016-07-27 17:29:50,381 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test6667677890380192355.junit.dir/version-2/snapshot.b
[junit] 2016-07-27 17:29:50,383 [myid:] - INFO  [main:FileTxnSnapLog@298] - 
Snapshotting: 0xb to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test6667677890380192355.junit.dir/version-2/snapshot.b
[junit] 2016-07-27 17:29:50,384 [myid:] - INFO  
[main:FourLetterWordMain@85] - connecting to 127.0.0.1 11222
[junit] 2016-07-27 17:29:50,385 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:35883
[junit] 2016-07-27 17:29:50,385 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@485] - Processing stat command from 
/127.0.0.1:35883
[junit] 2016-07-27 17:29:50,385 [myid:] - INFO  
[NIOWorkerThread-1:StatCommand@49] - Stat command output
[junit] 2016-07-27 17:29:50,386 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@607] - Closed socket connection for client 
/127.0.0.1:35883 (no session established for client)
[junit] 2016-07-27 17:29:50,386 [myid:] - INFO  [main:JMXEnv@228] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2016-07-27 17:29:50,387 [myid:] - INFO  [main:JMXEnv@245] - 
expect:InMemoryDataTree
[junit] 2016-07-27 17:29:50,387 [myid:] - INFO  [main:JMXEnv@249] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222,name1=InMemoryDataTree
[junit] 2016-07-27 17:29:50,388 [myid:] - INFO  [main:JMXEnv@245] - 
expect:StandaloneServer_port
[junit] 2016-07-27 17:29:50,388 [myid:] - INFO  [main:JMXEnv@249] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222
[junit] 2016-07-27 17:29:50,388 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 17871
[junit] 2016-07-27 17:29:50,388 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 24
[junit] 2016-07-27 17:29:50,388 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD 
testQuota
[junit] 2016-07-27 17:29:50,388 [myid:] - INFO  [main:ClientBase@543] - 
tearDown starting
[junit] 2016-07-27 17:29:50,462 [myid:] - INFO  [main:ZooKeeper@1313] - 
Session: 0x122aa63d90e closed
[junit] 2016-07-27 17:29:50,462 [myid:] - INFO  [main:ClientBase@513] - 
STOPPING server
[junit] 2016-07-27 17:29:50,462 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for 
session: 0x122aa63d90e
[junit] 2016-07-27 17:29:50,463 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-1:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2016-07-27 17:29:50,463 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-0:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2016-07-27 17:29:50,463 [myid:] - INFO  

ZooKeeper_branch34_openjdk7 - Build # 1150 - Failure

2016-07-27 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch34_openjdk7/1150/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 217047 lines...]
[junit] 2016-07-27 15:21:33,275 [myid:] - INFO  [main:JMXEnv@246] - 
expect:StandaloneServer_port
[junit] 2016-07-27 15:21:33,275 [myid:] - INFO  [main:JMXEnv@250] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221
[junit] 2016-07-27 15:21:33,276 [myid:] - INFO  [main:ClientBase@490] - 
STOPPING server
[junit] 2016-07-27 15:21:33,276 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@219] - 
NIOServerCnxn factory exited run method
[junit] 2016-07-27 15:21:33,276 [myid:] - INFO  [main:ZooKeeperServer@469] 
- shutting down
[junit] 2016-07-27 15:21:33,276 [myid:] - INFO  
[main:SessionTrackerImpl@225] - Shutting down
[junit] 2016-07-27 15:21:33,277 [myid:] - INFO  
[main:PrepRequestProcessor@765] - Shutting down
[junit] 2016-07-27 15:21:33,277 [myid:] - INFO  
[main:SyncRequestProcessor@209] - Shutting down
[junit] 2016-07-27 15:21:33,277 [myid:] - INFO  [ProcessThread(sid:0 
cport:11221)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop!
[junit] 2016-07-27 15:21:33,277 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@187] - SyncRequestProcessor exited!
[junit] 2016-07-27 15:21:33,277 [myid:] - INFO  
[main:FinalRequestProcessor@402] - shutdown of request processor complete
[junit] 2016-07-27 15:21:33,278 [myid:] - INFO  
[main:FourLetterWordMain@62] - connecting to 127.0.0.1 11221
[junit] 2016-07-27 15:21:33,278 [myid:] - INFO  [main:JMXEnv@146] - 
ensureOnly:[]
[junit] 2016-07-27 15:21:33,280 [myid:] - INFO  [main:ClientBase@443] - 
STARTING server
[junit] 2016-07-27 15:21:33,280 [myid:] - INFO  [main:ClientBase@364] - 
CREATING server instance 127.0.0.1:11221
[junit] 2016-07-27 15:21:33,281 [myid:] - INFO  
[main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2016-07-27 15:21:33,281 [myid:] - INFO  [main:ClientBase@339] - 
STARTING server instance 127.0.0.1:11221
[junit] 2016-07-27 15:21:33,281 [myid:] - INFO  [main:ZooKeeperServer@170] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/x1/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7/branch-3.4/build/test/tmp/test7851434331364075191.junit.dir/version-2
 snapdir 
/x1/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7/branch-3.4/build/test/tmp/test7851434331364075191.junit.dir/version-2
[junit] 2016-07-27 15:21:33,285 [myid:] - INFO  
[main:FourLetterWordMain@62] - connecting to 127.0.0.1 11221
[junit] 2016-07-27 15:21:33,286 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@192] - 
Accepted socket connection from /127.0.0.1:51690
[junit] 2016-07-27 15:21:33,286 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@827] - Processing 
stat command from /127.0.0.1:51690
[junit] 2016-07-27 15:21:33,287 [myid:] - INFO  
[Thread-4:NIOServerCnxn$StatCommand@663] - Stat command output
[junit] 2016-07-27 15:21:33,287 [myid:] - INFO  
[Thread-4:NIOServerCnxn@1008] - Closed socket connection for client 
/127.0.0.1:51690 (no session established for client)
[junit] 2016-07-27 15:21:33,287 [myid:] - INFO  [main:JMXEnv@229] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2016-07-27 15:21:33,289 [myid:] - INFO  [main:JMXEnv@246] - 
expect:InMemoryDataTree
[junit] 2016-07-27 15:21:33,289 [myid:] - INFO  [main:JMXEnv@250] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221,name1=InMemoryDataTree
[junit] 2016-07-27 15:21:33,289 [myid:] - INFO  [main:JMXEnv@246] - 
expect:StandaloneServer_port
[junit] 2016-07-27 15:21:33,290 [myid:] - INFO  [main:JMXEnv@250] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221
[junit] 2016-07-27 15:21:33,290 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@58] - Memory used 26415
[junit] 2016-07-27 15:21:33,290 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@63] - Number of threads 20
[junit] 2016-07-27 15:21:33,291 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@78] - FINISHED TEST METHOD testQuota
[junit] 2016-07-27 15:21:33,291 [myid:] - INFO  [main:ClientBase@520] - 
tearDown starting
[junit] 2016-07-27 15:21:33,361 [myid:] - INFO  [main:ZooKeeper@684] - 
Session: 0x1562cf2501c closed
[junit] 2016-07-27 15:21:33,361 [myid:] - INFO  [main:ClientBase@490] - 
STOPPING server
[junit] 2016-07-27 15:21:33,361 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@519] - EventThread shut down for 
session: 0x1562cf2501c
[junit] 2016-07-27 15:21:33,362 [myid:] - INFO  

[jira] [Commented] (ZOOKEEPER-2104) Sudden crash of all nodes in the cluster

2016-07-27 Thread Daniel Freudenberger (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395387#comment-15395387
 ] 

Daniel Freudenberger commented on ZOOKEEPER-2104:
-

[~fpj] the referenced snapshot is the latest one, and all snapshots are around 
the same size. The log files all look pretty much the same to me, but maybe 
your eyes will catch something that looked fine to me. You can download the log 
files for all nodes if you like 
(https://s3-eu-west-1.amazonaws.com/files.rebuy-cdn.de/logs.tgz). The 
transaction log is written to the same device. This is something we can't really 
change in our current setup.
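
Because the transaction log shares a device with other data in this setup, one quick check is whether the slow-fsync warning from the issue description ("fsync-ing the write ahead log ... took 11024ms") shows up around the time of each outage. A minimal Python sketch under that assumption follows. The function name slow_fsyncs and the 1000 ms default threshold are hypothetical choices, the pattern is based only on the warning text quoted in this issue, and each log entry is assumed to sit on a single line in the log file.

```python
import re
from typing import Iterable, List, Tuple

# Pattern derived from the warning quoted in this issue, e.g.
# "... fsync-ing the write ahead log in SyncThread:1 took 11024ms which will ..."
FSYNC_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}).*"
    r"fsync-ing the write ahead log in \S+ took (?P<ms>\d+)ms"
)


def slow_fsyncs(lines: Iterable[str], threshold_ms: int = 1000) -> List[Tuple[str, int]]:
    """Return (timestamp, duration_ms) for fsync warnings at or above threshold_ms."""
    hits = []
    for line in lines:
        match = FSYNC_RE.search(line)
        if match and int(match.group("ms")) >= threshold_ms:
            hits.append((match.group("ts"), int(match.group("ms"))))
    return hits
```

For example, slow_fsyncs(open("zookeeper.log")) (file name hypothetical) would list each timestamp with a sync at or above the threshold, which can then be compared with the "Exception when following the leader" entries.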


> Sudden crash of all nodes in the cluster
> 
>
> Key: ZOOKEEPER-2104
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2104
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6
>Reporter: Benjamin Jaton
> Attachments: zookeeper-errors.txt, zookeeper-warns.txt
>
>
> In a 3-node ensemble, suddenly all the nodes seem to fail, displaying 
> "ZooKeeper is not running" messages.
> No retry seems to be happening after that.
> This is a request to understand what happened and probably to improve the logs 
> when it does.
> See logs below:
> NODE1:
> -- no log for several days before this --
> 2015-01-04 16:18:22,259 [myid:1] - WARN  [SyncThread:1:FileTxnLog@321] - 
> fsync-ing the write ahead log in SyncThread:1 took 11024ms which will 
> adversely effect operation latency. See the ZooKeeper troubleshooting guide
> 2015-01-04 16:18:22,380 [myid:1] - WARN  
> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when 
> following the leader
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
> at 
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
> at 
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
> 2015-01-04 16:18:23,384 [myid:1] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2015-01-04 16:18:23,492 [myid:1] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2015-01-04 16:18:24,060 [myid:1] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> NODE2:
> -- no log for several days before this --
> 2015-01-04 16:18:21,899 [myid:3] - WARN  
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when 
> following the leader
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
> at 
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
> at 
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
> 2015-01-04 16:18:22,760 [myid:3] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2015-01-04 16:18:22,801 [myid:3] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2015-01-04 16:18:22,886 [myid:3] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> NODE3 (leader):
> -- no log for several days before this --
> 2015-01-04 16:18:21,897 [myid:2] - WARN  
> [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:LearnerHandler@687] - Closing 
> connection to peer due to transaction timeout.
> 2015-01-04 16:18:21,898 [myid:2] - WARN  
> [LearnerHandler-/204.53.107.249:43402:LearnerHandler@646] - *** GOODBYE 
> /204.53.107.249:43402 
> 2015-01-04 16:18:21,905 [myid:2] - WARN  
> 

[jira] [Commented] (ZOOKEEPER-2104) Sudden crash of all nodes in the cluster

2016-07-27 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395346#comment-15395346
 ] 

Flavio Junqueira commented on ZOOKEEPER-2104:
-

[~d.freudenberger] right, 147 MB isn't large. But this indicates that the 
follower has timed out waiting on the leader to sync up:

{noformat}
2016-07-27 11:49:40,346 [myid:2] - WARN 
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when 
following the leader
java.net.SocketTimeoutException: Read timed out
{noformat}

You may want to have a look at the prospective leader logs to see if you spot 
anything odd. If the log files aren't too large, then you may consider posting 
them here.

Also, is the snapshot you checked the latest one? Do all snapshots have roughly 
that size? About devices, are you using a single device, or a dedicated device 
for the txn log?
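
To put a number on the timeout above: per the ZooKeeper administrator's guide, initLimit is the time in ticks that followers are allowed to connect and sync to a leader, and syncLimit is the time in ticks they may lag afterwards. The sketch below (Python, not ZooKeeper code) simply multiplies those settings by tickTime. The helper names and the example values are assumptions for illustration and are not taken from the reporter's cluster. That the "Read timed out" in registerWithLeader is governed by the initLimit budget is also an assumption based on where it appears in the stack trace.

```python
def parse_zoo_cfg(text):
    """Parse simple key=value lines from zoo.cfg-style text, skipping comments."""
    cfg = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        cfg[key.strip()] = value.strip()
    return cfg


def follower_timeouts_ms(cfg):
    """Convert the tick-based limits into milliseconds."""
    tick = int(cfg["tickTime"])
    return {
        "init_ms": tick * int(cfg["initLimit"]),
        "sync_ms": tick * int(cfg["syncLimit"]),
    }


# Illustrative values only; not the configuration of the cluster in this issue.
EXAMPLE_CFG = """
tickTime=2000
initLimit=10
syncLimit=5
"""

print(follower_timeouts_ms(parse_zoo_cfg(EXAMPLE_CFG)))
# prints {'init_ms': 20000, 'sync_ms': 10000}
```

If loading or transferring the snapshot takes longer than the init budget, raising initLimit is one thing to try. A dedicated transaction-log device is configured separately through the dataLogDir setting.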


[jira] [Comment Edited] (ZOOKEEPER-2104) Sudden crash of all nodes in the cluster

2016-07-27 Thread Daniel Freudenberger (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395335#comment-15395335
 ] 

Daniel Freudenberger edited comment on ZOOKEEPER-2104 at 7/27/16 9:52 AM:
--

[~fpj] of course I read through the comments. ZooKeeper recovered after ~15 
minutes. 20 minutes later (right now) it crashed again and is flooding the log 
file with the following errors:

{quote}
2016-07-27 11:49:39,829 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket 
connection for client /10.41.199.233:60522 (no session established for client)
2016-07-27 11:49:39,864 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /10.41.199.201:60524
2016-07-27 11:49:39,865 [myid:2] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception 
causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
running
2016-07-27 11:49:39,865 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket 
connection for client /10.41.199.201:60524 (no session established for client)
2016-07-27 11:49:40,095 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /10.41.199.217:37339
2016-07-27 11:49:40,096 [myid:2] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception 
causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
running
2016-07-27 11:49:40,098 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket 
connection for client /10.41.199.217:37339 (no session established for client)
2016-07-27 11:49:40,245 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /10.41.199.63:33360
2016-07-27 11:49:40,245 [myid:2] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception 
causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
running
2016-07-27 11:49:40,245 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket 
connection for client /10.41.199.63:33360 (no session established for client)
2016-07-27 11:49:40,317 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /10.41.199.111:34965
2016-07-27 11:49:40,320 [myid:2] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception 
causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
running
2016-07-27 11:49:40,320 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket 
connection for client /10.41.199.111:34965 (no session established for client)
2016-07-27 11:49:40,346 [myid:2] - WARN  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when 
following the leader
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at 
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at 
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at 
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
at 
org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
at 
org.apache.zookeeper.server.quorum.Learner.registerWithLeader(Learner.java:272)
at 
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:72)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
2016-07-27 11:49:40,347 [myid:2] - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
at 
org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
2016-07-27 11:49:40,347 [myid:2] - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FollowerZooKeeperServer@139] - 
Shutting down
2016-07-27 11:49:40,347 [myid:2] - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@419] - shutting down
2016-07-27 11:49:40,348 [myid:2] - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@670] - LOOKING
2016-07-27 11:49:40,352 [myid:2] - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FileSnap@83] - Reading snapshot 

[jira] [Commented] (ZOOKEEPER-2104) Sudden crash of all nodes in the cluster

2016-07-27 Thread Daniel Freudenberger (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395335#comment-15395335
 ] 

Daniel Freudenberger commented on ZOOKEEPER-2104:
-

[~fpj] of course I read through the comments. ZooKeeper recovered after ~15 
minutes. 20 minutes later (right now) it crashed again and is flooding the log 
file with the following errors:

2016-07-27 11:49:39,829 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket 
connection for client /10.41.199.233:60522 (no session established for client)
2016-07-27 11:49:39,864 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /10.41.199.201:60524
2016-07-27 11:49:39,865 [myid:2] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception 
causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
running
2016-07-27 11:49:39,865 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket 
connection for client /10.41.199.201:60524 (no session established for client)
2016-07-27 11:49:40,095 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /10.41.199.217:37339
2016-07-27 11:49:40,096 [myid:2] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception 
causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
running
2016-07-27 11:49:40,098 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket 
connection for client /10.41.199.217:37339 (no session established for client)
2016-07-27 11:49:40,245 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /10.41.199.63:33360
2016-07-27 11:49:40,245 [myid:2] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception 
causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
running
2016-07-27 11:49:40,245 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket 
connection for client /10.41.199.63:33360 (no session established for client)
2016-07-27 11:49:40,317 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /10.41.199.111:34965
2016-07-27 11:49:40,320 [myid:2] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception 
causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
running
2016-07-27 11:49:40,320 [myid:2] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket 
connection for client /10.41.199.111:34965 (no session established for client)
2016-07-27 11:49:40,346 [myid:2] - WARN  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when 
following the leader
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at 
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at 
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at 
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
at 
org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
at 
org.apache.zookeeper.server.quorum.Learner.registerWithLeader(Learner.java:272)
at 
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:72)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
2016-07-27 11:49:40,347 [myid:2] - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
at 
org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
2016-07-27 11:49:40,347 [myid:2] - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FollowerZooKeeperServer@139] - 
Shutting down
2016-07-27 11:49:40,347 [myid:2] - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@419] - shutting down
2016-07-27 11:49:40,348 [myid:2] - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@670] - LOOKING
2016-07-27 11:49:40,352 [myid:2] - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FileSnap@83] - Reading snapshot 
/var/lib/zookeeper/version-2/snapshot.3300799266

The 

[jira] [Commented] (ZOOKEEPER-2104) Sudden crash of all nodes in the cluster

2016-07-27 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395321#comment-15395321
 ] 

Flavio Junqueira commented on ZOOKEEPER-2104:
-

[~d.freudenberger] please check the comments in this jira if you haven't done 
it yet.

The "ZooKeeper is not running" messages are due to the server(s) being in 
leader election. If they can't elect a leader and make progress, then we need 
to determine why that's the case. To my knowledge, there is nothing to be fixed 
here unless you provide further evidence of a bug.
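
One way to see which servers are stuck in leader election is to send each of them the four-letter word "stat" (the same command the test log earlier in this digest shows being processed) and look at the reply. The following is a rough Python sketch, not an official tool. The function names are hypothetical, and the "not currently serving requests" text and the "Mode:" line are assumed to match what a 3.4.x server returns.

```python
import socket

# Reply text assumed for a server that has not (re)joined a quorum yet.
NOT_SERVING = "This ZooKeeper instance is not currently serving requests"


def four_letter_word(host, port, cmd="stat", timeout=5.0):
    """Send a four-letter command and return the server's full text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def classify(reply):
    """Map a reply to the reported mode, 'not serving', or 'unknown'."""
    if NOT_SERVING in reply:
        return "not serving"
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return "unknown"
```

Calling classify(four_letter_word("zk1.example.com", 2181)) for each ensemble member (host names hypothetical) would show whether a quorum currently has a leader and followers, or whether every server is still looking.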


[jira] [Commented] (ZOOKEEPER-2104) Sudden crash of all nodes in the cluster

2016-07-27 Thread Daniel Freudenberger (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395310#comment-15395310
 ] 

Daniel Freudenberger commented on ZOOKEEPER-2104:
-

We just ran into exactly the same issue: in a 3-node cluster, the cluster 
suddenly went down and all nodes are reporting "ZooKeeper is not running".

Is there something I can do to get someone to look into this? 


> Sudden crash of all nodes in the cluster
> 
>
> Key: ZOOKEEPER-2104
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2104
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6
>Reporter: Benjamin Jaton
> Attachments: zookeeper-errors.txt, zookeeper-warns.txt
>
>
> In a 3-node ensemble, suddenly all the nodes seem to fail, displaying 
> "ZooKeeper is not running" messages.
> No retry seems to be happening after that.
> This is a request to understand what happened and probably to improve the logs 
> when it does.
> See logs below:
> NODE1:
> -- no log for several days before this --
> 2015-01-04 16:18:22,259 [myid:1] - WARN  [SyncThread:1:FileTxnLog@321] - 
> fsync-ing the write ahead log in SyncThread:1 took 11024ms which will 
> adversely effect operation latency. See the ZooKeeper troubleshooting guide
> 2015-01-04 16:18:22,380 [myid:1] - WARN  
> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when 
> following the leader
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
> at 
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
> at 
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
> 2015-01-04 16:18:23,384 [myid:1] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2015-01-04 16:18:23,492 [myid:1] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2015-01-04 16:18:24,060 [myid:1] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> NODE2:
> -- no log for several days before this --
> 2015-01-04 16:18:21,899 [myid:3] - WARN  
> [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when 
> following the leader
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
> at 
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
> at 
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
> 2015-01-04 16:18:22,760 [myid:3] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2015-01-04 16:18:22,801 [myid:3] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2015-01-04 16:18:22,886 [myid:3] - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> NODE3 (leader):
> -- no log for several days before this --
> 2015-01-04 16:18:21,897 [myid:2] - WARN  
> [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:LearnerHandler@687] - Closing 
> connection to peer due to transaction timeout.
> 2015-01-04 16:18:21,898 [myid:2] - WARN  
> [LearnerHandler-/204.53.107.249:43402:LearnerHandler@646] - *** GOODBYE 
> /204.53.107.249:43402 
> 2015-01-04 16:18:21,905 [myid:2] - WARN  
> [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:LearnerHandler@687] - Closing 
> connection to peer due to transaction timeout.
> 2015-01-04 16:18:21,907 [myid:2] - WARN  
> [LearnerHandler-/204.53.107.247:45953:LearnerHandler@646] - *** GOODBYE 
> 

[jira] [Updated] (ZOOKEEPER-2490) infinitely connect on windows

2016-07-27 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-2490:

Fix Version/s: (was: 3.5.2)
   3.6.0
   3.5.3

> infinitely connect on windows
> -
>
> Key: ZOOKEEPER-2490
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2490
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.5.2
> Environment: Windows
>Reporter: spooky000
> Fix For: 3.5.3, 3.6.0
>
>
> In the addrvec_contains function, this memcmp comparison never succeeds on
> Windows release builds:
> for (i = 0; i < avec->count; i++)
> {
>     if (memcmp(&avec->data[i], addr, INET_ADDRSTRLEN) == 0)
>         return 1;
> }
> This is because INET_ADDRSTRLEN differs between platforms:
> #define INET_ADDRSTRLEN  16 on Linux.
> #define INET_ADDRSTRLEN  22 on Windows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2490) infinitely connect on windows

2016-07-27 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395248#comment-15395248
 ] 

Flavio Junqueira commented on ZOOKEEPER-2490:
-

Thanks for reporting this issue, [~spooky000]. Would you be interested in 
providing a patch?

> infinitely connect on windows
> -
>
> Key: ZOOKEEPER-2490
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2490
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.5.2
> Environment: Windows
>Reporter: spooky000
> Fix For: 3.5.2
>
>
> In the addrvec_contains function, this memcmp comparison never succeeds on
> Windows release builds:
> for (i = 0; i < avec->count; i++)
> {
>     if (memcmp(&avec->data[i], addr, INET_ADDRSTRLEN) == 0)
>         return 1;
> }
> This is because INET_ADDRSTRLEN differs between platforms:
> #define INET_ADDRSTRLEN  16 on Linux.
> #define INET_ADDRSTRLEN  22 on Windows.





ZooKeeper-trunk-solaris - Build # 1244 - Still Failing

2016-07-27 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-solaris/1244/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 425732 lines...]
[junit] 2016-07-27 08:32:47,628 [myid:] - INFO  [main:JMXEnv@146] - 
ensureOnly:[]
[junit] 2016-07-27 08:32:47,629 [myid:] - INFO  [main:ClientBase@466] - 
STARTING server
[junit] 2016-07-27 08:32:47,629 [myid:] - INFO  [main:ClientBase@386] - 
CREATING server instance 127.0.0.1:11222
[junit] 2016-07-27 08:32:47,629 [myid:] - INFO  
[main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s 
sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 
kB direct buffers.
[junit] 2016-07-27 08:32:47,630 [myid:] - INFO  
[main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:11222
[junit] 2016-07-27 08:32:47,631 [myid:] - INFO  [main:ClientBase@361] - 
STARTING server instance 127.0.0.1:11222
[junit] 2016-07-27 08:32:47,631 [myid:] - INFO  [main:ZooKeeperServer@858] 
- minSessionTimeout set to 6000
[junit] 2016-07-27 08:32:47,631 [myid:] - INFO  [main:ZooKeeperServer@867] 
- maxSessionTimeout set to 6
[junit] 2016-07-27 08:32:47,632 [myid:] - INFO  [main:ZooKeeperServer@156] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test4638573641595997860.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test4638573641595997860.junit.dir/version-2
[junit] 2016-07-27 08:32:47,632 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test4638573641595997860.junit.dir/version-2/snapshot.b
[junit] 2016-07-27 08:32:47,634 [myid:] - INFO  [main:FileTxnSnapLog@298] - 
Snapshotting: 0xb to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test4638573641595997860.junit.dir/version-2/snapshot.b
[junit] 2016-07-27 08:32:47,636 [myid:] - INFO  
[main:FourLetterWordMain@85] - connecting to 127.0.0.1 11222
[junit] 2016-07-27 08:32:47,636 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:44086
[junit] 2016-07-27 08:32:47,637 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@485] - Processing stat command from 
/127.0.0.1:44086
[junit] 2016-07-27 08:32:47,637 [myid:] - INFO  
[NIOWorkerThread-1:StatCommand@49] - Stat command output
[junit] 2016-07-27 08:32:47,637 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@607] - Closed socket connection for client 
/127.0.0.1:44086 (no session established for client)
[junit] 2016-07-27 08:32:47,637 [myid:] - INFO  [main:JMXEnv@228] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2016-07-27 08:32:47,639 [myid:] - INFO  [main:JMXEnv@245] - 
expect:InMemoryDataTree
[junit] 2016-07-27 08:32:47,639 [myid:] - INFO  [main:JMXEnv@249] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222,name1=InMemoryDataTree
[junit] 2016-07-27 08:32:47,639 [myid:] - INFO  [main:JMXEnv@245] - 
expect:StandaloneServer_port
[junit] 2016-07-27 08:32:47,639 [myid:] - INFO  [main:JMXEnv@249] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222
[junit] 2016-07-27 08:32:47,640 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 17981
[junit] 2016-07-27 08:32:47,640 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 24
[junit] 2016-07-27 08:32:47,640 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD 
testQuota
[junit] 2016-07-27 08:32:47,640 [myid:] - INFO  [main:ClientBase@543] - 
tearDown starting
[junit] 2016-07-27 08:32:47,722 [myid:] - INFO  [main:ZooKeeper@1313] - 
Session: 0x122a8782f16 closed
[junit] 2016-07-27 08:32:47,722 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for 
session: 0x122a8782f16
[junit] 2016-07-27 08:32:47,722 [myid:] - INFO  [main:ClientBase@513] - 
STOPPING server
[junit] 2016-07-27 08:32:47,722 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@219]
 - accept thread exitted run method
[junit] 2016-07-27 08:32:47,722 [myid:] - INFO  
[ConnnectionExpirer:NIOServerCnxnFactory$ConnectionExpirerThread@583] - 
ConnnectionExpirerThread interrupted
[junit] 2016-07-27 08:32:47,722 [myid:] - INFO  

ZooKeeper_branch35_solaris - Build # 184 - Failure

2016-07-27 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch35_solaris/184/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 424523 lines...]
[junit] 2016-07-27 06:30:10,791 [myid:] - INFO  [main:JMXEnv@146] - 
ensureOnly:[]
[junit] 2016-07-27 06:30:10,793 [myid:] - INFO  [main:ClientBase@466] - 
STARTING server
[junit] 2016-07-27 06:30:10,793 [myid:] - INFO  [main:ClientBase@386] - 
CREATING server instance 127.0.0.1:11222
[junit] 2016-07-27 06:30:10,793 [myid:] - INFO  
[main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s 
sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 
kB direct buffers.
[junit] 2016-07-27 06:30:10,794 [myid:] - INFO  
[main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:11222
[junit] 2016-07-27 06:30:10,794 [myid:] - INFO  [main:ClientBase@361] - 
STARTING server instance 127.0.0.1:11222
[junit] 2016-07-27 06:30:10,795 [myid:] - INFO  [main:ZooKeeperServer@858] 
- minSessionTimeout set to 6000
[junit] 2016-07-27 06:30:10,795 [myid:] - INFO  [main:ZooKeeperServer@867] 
- maxSessionTimeout set to 6
[junit] 2016-07-27 06:30:10,795 [myid:] - INFO  [main:ZooKeeperServer@156] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test1451798615245699791.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test1451798615245699791.junit.dir/version-2
[junit] 2016-07-27 06:30:10,795 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test1451798615245699791.junit.dir/version-2/snapshot.b
[junit] 2016-07-27 06:30:10,797 [myid:] - INFO  [main:FileTxnSnapLog@298] - 
Snapshotting: 0xb to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/branch-3.5/build/test/tmp/test1451798615245699791.junit.dir/version-2/snapshot.b
[junit] 2016-07-27 06:30:10,799 [myid:] - INFO  
[main:FourLetterWordMain@85] - connecting to 127.0.0.1 11222
[junit] 2016-07-27 06:30:10,799 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:59845
[junit] 2016-07-27 06:30:10,800 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@485] - Processing stat command from 
/127.0.0.1:59845
[junit] 2016-07-27 06:30:10,800 [myid:] - INFO  
[NIOWorkerThread-1:StatCommand@49] - Stat command output
[junit] 2016-07-27 06:30:10,800 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@607] - Closed socket connection for client 
/127.0.0.1:59845 (no session established for client)
[junit] 2016-07-27 06:30:10,801 [myid:] - INFO  [main:JMXEnv@228] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2016-07-27 06:30:10,802 [myid:] - INFO  [main:JMXEnv@245] - 
expect:InMemoryDataTree
[junit] 2016-07-27 06:30:10,802 [myid:] - INFO  [main:JMXEnv@249] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222,name1=InMemoryDataTree
[junit] 2016-07-27 06:30:10,802 [myid:] - INFO  [main:JMXEnv@245] - 
expect:StandaloneServer_port
[junit] 2016-07-27 06:30:10,802 [myid:] - INFO  [main:JMXEnv@249] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222
[junit] 2016-07-27 06:30:10,802 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 17872
[junit] 2016-07-27 06:30:10,803 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 24
[junit] 2016-07-27 06:30:10,803 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD 
testQuota
[junit] 2016-07-27 06:30:10,803 [myid:] - INFO  [main:ClientBase@543] - 
tearDown starting
[junit] 2016-07-27 06:30:10,876 [myid:] - INFO  [main:ZooKeeper@1313] - 
Session: 0x122a807eedf closed
[junit] 2016-07-27 06:30:10,876 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for 
session: 0x122a807eedf
[junit] 2016-07-27 06:30:10,876 [myid:] - INFO  [main:ClientBase@513] - 
STOPPING server
[junit] 2016-07-27 06:30:10,877 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@219]
 - accept thread exitted run method
[junit] 2016-07-27 06:30:10,877 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-0:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2016-07-27 06:30:10,877 [myid:] - INFO  

[jira] [Created] (ZOOKEEPER-2490) infinitely connect on windows

2016-07-27 Thread spooky000 (JIRA)
spooky000 created ZOOKEEPER-2490:


 Summary: infinitely connect on windows
 Key: ZOOKEEPER-2490
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2490
 Project: ZooKeeper
  Issue Type: Bug
  Components: c client
Affects Versions: 3.5.2
 Environment: Windows
Reporter: spooky000
 Fix For: 3.5.2


In the addrvec_contains function, this memcmp comparison never succeeds on
Windows release builds:

for (i = 0; i < avec->count; i++)
{
    if (memcmp(&avec->data[i], addr, INET_ADDRSTRLEN) == 0)
        return 1;
}

This is because INET_ADDRSTRLEN differs between platforms:
#define INET_ADDRSTRLEN  16 on Linux.
#define INET_ADDRSTRLEN  22 on Windows.
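
One possible direction for a fix, sketched under assumptions rather than taken from the ZooKeeper source: compare the address fields explicitly instead of comparing INET_ADDRSTRLEN raw bytes, so the result does not depend on a platform-specific constant or on bytes beyond the populated part of the address. The helper names sockaddr_equal and addr_list_contains are hypothetical and are not part of the ZooKeeper C client. The sketch assumes POSIX socket headers; a Windows build would presumably include winsock2.h and ws2tcpip.h instead.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper (not ZooKeeper code): compare two socket addresses
 * by family, port, and address only, ignoring any unused trailing bytes. */
static int sockaddr_equal(const struct sockaddr_storage *a,
                          const struct sockaddr_storage *b)
{
    if (a->ss_family != b->ss_family)
        return 0;

    if (a->ss_family == AF_INET) {
        const struct sockaddr_in *a4 = (const struct sockaddr_in *)a;
        const struct sockaddr_in *b4 = (const struct sockaddr_in *)b;
        return a4->sin_port == b4->sin_port &&
               a4->sin_addr.s_addr == b4->sin_addr.s_addr;
    }

    if (a->ss_family == AF_INET6) {
        const struct sockaddr_in6 *a6 = (const struct sockaddr_in6 *)a;
        const struct sockaddr_in6 *b6 = (const struct sockaddr_in6 *)b;
        return a6->sin6_port == b6->sin6_port &&
               memcmp(&a6->sin6_addr, &b6->sin6_addr,
                      sizeof(a6->sin6_addr)) == 0;
    }

    /* Unknown address family: treat as not equal. */
    return 0;
}

/* Hypothetical stand-in for addrvec_contains, operating on a plain array
 * instead of the real address vector type. Returns 1 if addr is present. */
static int addr_list_contains(const struct sockaddr_storage *data, int count,
                              const struct sockaddr_storage *addr)
{
    int i;

    assert(addr != NULL);
    for (i = 0; i < count; i++) {
        if (sockaddr_equal(&data[i], addr))
            return 1;
    }
    return 0;
}
```

With a field-wise comparison like this, two IPv4 entries with the same address and port match even if the remaining bytes of the sockaddr_storage differ, which is one plausible reason a 22-byte memcmp could fail where a 16-byte one happens to succeed.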







