[jira] [Updated] (ZOOKEEPER-2266) Integrate JaCoCo Coverage Library

2016-03-08 Thread Akihiro Suda (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiro Suda updated ZOOKEEPER-2266:

Attachment: ZOOKEEPER-2266-v4.patch

Hello [~phunt],
Thank you for the comment.

Attached v4.patch:

 * Set {{jacoco.enabled}} to {{false}} by default
 * JaCoCo jars are pulled only when {{jacoco.enabled}} is {{true}}
 * Use fileset for JaCoCo jars in {{build.xml}}
 * Eliminated tabs

The check for {{jacoco.enabled}} is implemented as follows:

{code:xml}


..


..

{code}


Unfortunately there is a {{}} code clone between 
{{junit.run-concurrent.nojacoco}} and {{junit.run-concurrent.jacoco}}, but I 
could not find a way to deduplicate this.

If I split the {{}} as an independent task, and put {{}} in {{}}, I get an error {{antcall 
is not a valid child of the coverage task}}.
{{}} throws a similar error as well.



> Integrate JaCoCo Coverage Library
> -
>
> Key: ZOOKEEPER-2266
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2266
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: tests
>Reporter: Akihiro Suda
>Assignee: Akihiro Suda
>Priority: Minor
> Attachments: ZOOKEEPER-2266-v1.patch, ZOOKEEPER-2266-v2.patch, 
> ZOOKEEPER-2266-v3.patch, ZOOKEEPER-2266-v4.patch, jacoco-report-example.zip
>
>
> I would like to propose integration of 
> [JaCoCo|http://www.eclemma.org/jacoco/] coverage library with ZooKeeper.
> h4. Purposes
>  - To find poorly covered methods, and improve JUnit testcases to cover them
>  - To estimate causes of flaky testcases (e.g. ZOOKEEPER-2080, 
> ZOOKEEPER-2252, ZOOKEEPER-1868) by comparing reports from succeeded 
> experiments and failed ones (I'm recently interested in how we can 
> systematically realize this.)
> h4. Advantages of JaCoCo
>  - Support recent JDKs (including JDK 8)
>  - Low overhead
>  - Released under EPL
>  -- Note: cobertura has been removed from the code base because it is 
> released under GPL ( ZOOKEEPER-75, 
> http://www.apache.org/legal/resolved.html#category-x )
> h4. Usage
> {panel}
> $ ant test #(plus optionally, -Dtestcase=.. -Dtest.method=..)
> $ ant jacoco-report
> $ x-www-browser build/test/jacoco/reports/index.html &
> {panel}
> Example: jacoco-report-example.zip
> h4. Possible Future Work
> - Integrate with the Jenkins buildbot so that we can check coverage after 
> each build



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2383) Startup race in ZooKeeperServer

2016-03-08 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186556#comment-15186556
 ] 

Rakesh R commented on ZOOKEEPER-2383:
-

Thanks [~steve_rowe] for reporting this issue and for the good analysis.

bq. According to git blame, the latest changes around the startup method in 
ZooKeeperServer are due to ZOOKEEPER-1907, which actually turned out to be 
quite problematic, so this could be another issue due to that patch, I'm not 
sure.

[~fpj], sure, I'm happy to investigate this. To understand the impact of 
ZOOKEEPER-1907, I first took the code from before the ZOOKEEPER-1907 commit, 
version {{da3e7e0d4b66ac5a25d40ae2d0102b1b57994b62}}. I debugged that code and 
was able to reproduce the issue even without the ZOOKEEPER-1907 changes.

Coming back to this jira, two issues are reported here. IIUC, both are caused 
by a race between server startup and the processing of a client connection 
request. I have tried to work out what happens; please see the sequences below 
that cause the trouble.
# NullPointerException while creating session
{code}
2016-03-08 11:29:00,374 [myid:] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:5:NIOServerCnxnFactory@213] - 
Ignoring unexpected runtime exception
java.lang.NullPointerException
at 
org.apache.zookeeper.server.ZooKeeperServer.createSession(ZooKeeperServer.java:569)
{code}
+Thread-1: Starting the server+
1=> Invoked cnxnFactory.startup(server);
2=> Started NIOServerCxn.Factory thread and register OP_ACCEPT to accept 
connections
3=> sets zookeeper server to the connection factory
4=> loads zookeeper data
5=> Assume server is about to invoke {{zks.startup();}} and {{sessionTracker}} 
is not yet initialized.
+Thread-2: creating client connection+
1=> sends connection request to the server
2=> NIOServerCnxn reads the request and invokes 
{{NIOServerCnxn#readConnectRequest()}}
3=> It then calls {{zkServer.processConnectRequest(this, incomingBuffer);}}
4=> While processing the request it needs {{sessionTracker}} reference, but 
this is not yet initialized and the server is still in the startup phase 
causing the NPE error.
# MBeanRegistry throws an assertion error because the parent does not exist
{code}
2016-03-08 11:29:00,449 [myid:] - WARN  [Thread-0:MBeanRegistry@118] - 
registered bean 'InMemoryDataTree' with parent 'StandaloneServer_port5' at 
path '/StandaloneServer_port5'
java.lang.Throwable: 
at 
org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:116)
{code}
+Thread-1: Starting the server+
1=> Invoked cnxnFactory.startup(server);
2=> Started NIOServerCxn.Factory thread and register OP_ACCEPT to accept 
connections
3=> sets zookeeper server to the connection factory
4=> loads zookeeper data
5=> Server invoked {{zks.startup();}}
6=> Started session tracker
7=> Finished setting up RequestProcessors
8=> Invoked {{ZooKeeperServer#registerJMX();}}
9=> Now assume ZooKeeperServer has initialized {{jmxServerBean = new 
ZooKeeperServerBean(this);}} and is about to register the bean in the registry 
{{MBeanRegistry.getInstance().register(jmxServerBean, null);}}
+Thread-2: creating client connection+
1=> sends connection request to the server
2=> NIOServerCnxn reads the request and invokes 
{{NIOServerCnxn#readConnectRequest()}}
3=> It then calls {{zkServer.processConnectRequest(this, incomingBuffer);}}
4=> Since all the request processors are ready, it successfully creates the 
session and goes to register the connection bean
5=> Now it invokes {{zkServer.finishSessionInit()}}, which in turn invokes 
{{serverCnxnFactory.registerConnection(cnxn);}} and hits the path error.
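
The first sequence (the NPE) can be replayed deterministically with a toy model. This is only a sketch under stated assumptions: {{ToyServer}} and {{ToySessionTracker}} are hypothetical stand-ins, not ZooKeeper classes, and the two threads are replaced by explicit call ordering so that the outcome does not depend on timing.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy replay of the first race: a connect request handled before startup() has run. */
public class StartupRaceSketch {

    /** Hypothetical stand-in for the session tracker. */
    static class ToySessionTracker {
        private final AtomicLong nextId = new AtomicLong(1);

        long createSession() {
            return nextId.getAndIncrement();
        }
    }

    /** Hypothetical stand-in for the server; only the state relevant to the race. */
    static class ToyServer {
        // Stays null until startup() runs, like the tracker in step 5 of Thread-1.
        private volatile ToySessionTracker sessionTracker;

        void startup() {
            sessionTracker = new ToySessionTracker();
        }

        /** Unguarded path: dereferences the tracker without checking it. */
        long processConnectRequest() {
            return sessionTracker.createSession();
        }
    }

    /** Returns true if a connect request handled before startup() throws an NPE. */
    static boolean connectBeforeStartupFails() {
        ToyServer server = new ToyServer();
        try {
            server.processConnectRequest();
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    /** Returns the session id handed out when the connect arrives after startup(). */
    static long connectAfterStartup() {
        ToyServer server = new ToyServer();
        server.startup();
        return server.processConnectRequest();
    }

    public static void main(String[] args) {
        System.out.println("connect before startup fails: " + connectBeforeStartupFails());
        System.out.println("session id after startup: " + connectAfterStartup());
    }
}
```

The model only shows that the failure depends on which call happens first; it says nothing about how the real server should guard against it.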

> Startup race in ZooKeeperServer
> ---
>
> Key: ZOOKEEPER-2383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: jmx, server
>Affects Versions: 3.4.8
>Reporter: Steve Rowe
>Priority: Blocker
> Fix For: 3.4.9
>
> Attachments: TestZkStandaloneJMXRegistrationRaceConcurrent.java, 
> release-3.4.8-extra-logging.patch, zk-3.4.8-MBeanRegistry.log, 
> zk-3.4.8-NPE.log
>
>
> In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 
> (SOLR-8724) I ran into test failures where attempts to create a node in a 
> newly started standalone ZooKeeperServer were failing because of an assertion 
> in MBeanRegistry.
> ZooKeeperServer.startup() first sets up its request processor chain then 
> registers itself in JMX, but if a connection comes in before the server's JMX 
> registration happens, registration of the connection will fail because it 
> trips the assertion that (effectively) its parent (the server) has already 
> registered itself.
> {code:java|title=ZooKeeperServer.java}
> public synchronized void startup() {
> if (sessionTracker == null) {
>

[jira] [Commented] (ZOOKEEPER-2383) Startup race in ZooKeeperServer

2016-03-08 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185740#comment-15185740
 ] 

Steve Rowe commented on ZOOKEEPER-2383:
---

Similarly to the subclassed ZooKeeperServer above, if I apply the following 
patch to ZK 3.4.8 and run the attached test case with it, the JMX registration 
race no longer happens:

{code:java}
Index: src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java
===
--- src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java  
(revision 1732157)
+++ src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java  
(working copy)
@@ -413,10 +413,9 @@
 createSessionTracker();
 }
 startSessionTracker();
+registerJMX();
 setupRequestProcessors();
 
-registerJMX();
-
 state = State.RUNNING;
 notifyAll();
 }
{code}
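
To illustrate why the ordering matters, here is a minimal sketch of a bean registry that looks up the parent's path before registering a child, loosely modelled on the {{register()}} excerpt quoted in the issue description. {{ToyRegistry}} and the bean names are hypothetical, and an {{IllegalStateException}} stands in for the {{assert}} so that the behaviour does not depend on the JVM's {{-ea}} flag.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model: a child bean can only be registered once its parent has a path. */
public class RegistrationOrderSketch {

    /** Hypothetical registry mapping bean name to path. */
    static class ToyRegistry {
        private final Map<String, String> beanToPath = new HashMap<>();

        void register(String bean, String parent) {
            String path = "";
            if (parent != null) {
                path = beanToPath.get(parent);
                if (path == null) {
                    // Stand-in for "assert path != null" in the quoted excerpt.
                    throw new IllegalStateException("parent '" + parent + "' is not registered");
                }
            }
            beanToPath.put(bean, path + "/" + bean);
        }

        String pathOf(String bean) {
            return beanToPath.get(bean);
        }
    }

    /** Connection bean registered first: the parent lookup fails. */
    static boolean childBeforeParentFails() {
        ToyRegistry registry = new ToyRegistry();
        try {
            registry.register("connection", "server");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    /** Server bean registered first: the connection gets a path under it. */
    static String childAfterParentPath() {
        ToyRegistry registry = new ToyRegistry();
        registry.register("server", null);
        registry.register("connection", "server");
        return registry.pathOf("connection");
    }

    public static void main(String[] args) {
        System.out.println("child before parent fails: " + childBeforeParentFails());
        System.out.println("child after parent path: " + childAfterParentPath());
    }
}
```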


> Startup race in ZooKeeperServer
> ---
>
> Key: ZOOKEEPER-2383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: jmx, server
>Affects Versions: 3.4.8
>Reporter: Steve Rowe
>Priority: Blocker
> Fix For: 3.4.9
>
> Attachments: TestZkStandaloneJMXRegistrationRaceConcurrent.java, 
> release-3.4.8-extra-logging.patch, zk-3.4.8-MBeanRegistry.log, 
> zk-3.4.8-NPE.log
>
>
> In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 
> (SOLR-8724) I ran into test failures where attempts to create a node in a 
> newly started standalone ZooKeeperServer were failing because of an assertion 
> in MBeanRegistry.
> ZooKeeperServer.startup() first sets up its request processor chain then 
> registers itself in JMX, but if a connection comes in before the server's JMX 
> registration happens, registration of the connection will fail because it 
> trips the assertion that (effectively) its parent (the server) has already 
> registered itself.
> {code:java|title=ZooKeeperServer.java}
> public synchronized void startup() {
> if (sessionTracker == null) {
> createSessionTracker();
> }
> startSessionTracker();
> setupRequestProcessors();
> registerJMX();
> state = State.RUNNING;
> notifyAll();
> }
> {code}
> {code:java|title=MBeanRegistry.java}
> public void register(ZKMBeanInfo bean, ZKMBeanInfo parent)
> throws JMException
> {
> assert bean != null;
> String path = null;
> if (parent != null) {
> path = mapBean2Path.get(parent);
> assert path != null;
> }
> {code}
> This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this 
> issue with ZK 3.4.6. 





[jira] [Commented] (ZOOKEEPER-2044) CancelledKeyException in zookeeper 3.4.5

2016-03-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185520#comment-15185520
 ] 

Hadoop QA commented on ZOOKEEPER-2044:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12791627/ZOOKEEPER-2044.patch
  against trunk revision 1733679.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3092//console

This message is automatically generated.

> CancelledKeyException in zookeeper 3.4.5
> 
>
> Key: ZOOKEEPER-2044
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2044
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6
> Environment: Red Hat Enterprise Linux Server release 6.2
>Reporter: shamjith antholi
>Assignee: Flavio Junqueira
>Priority: Minor
> Fix For: 3.4.9
>
> Attachments: ZOOKEEPER-2044.patch, ZOOKEEPER-2044.patch
>
>
> I am getting cancelled key exception in zookeeper (version 3.4.5). Please see 
> the log below. When this error is thrown, the connected solr shard is going 
> down by giving the error "Failed to index metadata in 
> Solr,StackTrace=SolrError: HTTP status 503.Reason: 
> {"responseHeader":{"status":503,"QTime":204},"error":{"msg":"ClusterState 
> says we are the leader, but locally we don't think so","code":503"  and 
> ultimately the current activity is going down. Could you please give a 
> solution for this ?
> ZooKeeper log 
> --
> 2014-09-16 02:58:47,799 [myid:1] - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@832] - Client 
> attempting to renew session 0x24868e7ca980003 at /172.22.0.5:58587
> 2014-09-16 02:58:47,800 [myid:1] - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:Learner@107] - Revalidating 
> client: 0x24868e7ca980003
> 2014-09-16 02:58:47,802 [myid:1] - INFO  
> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@588] - Invalid 
> session 0x24868e7ca980003 for client /172.22.0.5:58587, probably expired
> 2014-09-16 02:58:47,803 [myid:1] - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed 
> socket connection for client /172.22.0.5:58587 which had sessionid 
> 0x24868e7ca980003
> 2014-09-16 02:58:47,810 [myid:1] - ERROR 
> [CommitProcessor:1:NIOServerCnxn@180] - Unexpected Exception:
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
> at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
> at 
> org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
> at 
> org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
> at 
> org.apache.zookeeper.server.NIOServerCnxn.process(NIOServerCnxn.java:1113)
> at org.apache.zookeeper.server.DataTree.setWatches(DataTree.java:1327)
> at 
> org.apache.zookeeper.server.ZKDatabase.setWatches(ZKDatabase.java:384)
> at 
> org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:304)
> at 
> org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
>  





Failed: ZOOKEEPER-2044 PreCommit Build #3092

2016-03-08 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2044
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3092/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 97 lines...]
 [exec] Hunk #1 succeeded at 18 with fuzz 2.
 [exec] Hunk #2 succeeded at 75 (offset 1 line).
 [exec] PATCH APPLICATION FAILED
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12791627/ZOOKEEPER-2044.patch
 [exec]   against trunk revision 1733679.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] -1 patch.  The patch command could not apply the patch.
 [exec] 
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3092//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 171cfaab088165a136330fe41f331d20997b335f logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1646:
 exec returned: 1

Total time: 1 minute 14 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Recording test results
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
ERROR: Step 'Publish JUnit test result report' failed: No test report files 
were found. Configuration error?
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
[description-setter] Description set: ZOOKEEPER-2044
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7
Setting 
LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Commented] (ZOOKEEPER-2383) Startup race in ZooKeeperServer

2016-03-08 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185412#comment-15185412
 ] 

Steve Rowe commented on ZOOKEEPER-2383:
---

FWIW, I was able to work around the problem in Solr tests by subclassing 
ZooKeeperServer and ordering server JMX registration before setting up its 
request processing pipeline, like so:

{code:java}
 private class TestZooKeeperServer extends ZooKeeperServer {
@Override protected void registerJMX() {
  // no-op - super.registerJMX() is called in overridden startup()
}
/** Register in JMX before starting the request processors. */
@Override public void startup() {
  super.registerJMX();
  super.startup();
}
public TestZooKeeperServer(FileTxnSnapLog txnLogFactory, int tickTime,
   int minSessionTimeout, int maxSessionTimeout,
   DataTreeBuilder treeBuilder, ZKDatabase zkDb) {
  super(txnLogFactory, tickTime, minSessionTimeout, maxSessionTimeout, 
treeBuilder, zkDb);
}
  }
{code}
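
The effect of this override pattern on call order can be checked in isolation. The sketch below uses hypothetical classes ({{ToyServer}}, {{ReorderedServer}}) that only record which hook ran when; it assumes, as in the {{startup()}} excerpt quoted in the issue description, that the base class sets up request processors first and registers in JMX second.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of reordering two startup hooks by overriding them in a subclass. */
public class OverrideOrderSketch {

    /** Hypothetical base class: request processors first, JMX registration second. */
    static class ToyServer {
        final List<String> calls = new ArrayList<>();

        protected void registerJMX() {
            calls.add("registerJMX");
        }

        protected void setupRequestProcessors() {
            calls.add("setupRequestProcessors");
        }

        public void startup() {
            setupRequestProcessors();
            registerJMX();
        }
    }

    /** Subclass that moves JMX registration ahead of the base startup sequence. */
    static class ReorderedServer extends ToyServer {
        @Override
        protected void registerJMX() {
            // no-op: the real registration already happened in startup()
        }

        @Override
        public void startup() {
            super.registerJMX(); // runs the base implementation directly
            super.startup();     // its registerJMX() call dispatches to the no-op
        }
    }

    /** Starts the given server and returns the recorded hook order. */
    static List<String> callOrder(ToyServer server) {
        server.startup();
        return server.calls;
    }

    public static void main(String[] args) {
        System.out.println("base:      " + callOrder(new ToyServer()));
        System.out.println("reordered: " + callOrder(new ReorderedServer()));
    }
}
```

The no-op override is what keeps registration from running twice: {{super.startup()}} still calls {{registerJMX()}}, but virtual dispatch routes that call to the subclass.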



[jira] [Updated] (ZOOKEEPER-2383) Startup race in ZooKeeperServer

2016-03-08 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-2383:

Priority: Blocker  (was: Major)



[jira] [Updated] (ZOOKEEPER-2383) Startup race in ZooKeeperServer

2016-03-08 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-2383:

Fix Version/s: 3.4.9



[jira] [Comment Edited] (ZOOKEEPER-2383) Startup race in ZooKeeperServer

2016-03-08 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185323#comment-15185323
 ] 

Flavio Junqueira edited comment on ZOOKEEPER-2383 at 3/8/16 5:39 PM:
-

[~steve_rowe] Thanks for reporting this issue. According to git blame, the 
latest changes around the startup method in ZooKeeperServer are due to 
ZOOKEEPER-1907, which actually turned out to be quite problematic, so this 
could be another issue due to that patch, I'm not sure.

{noformat}
91f579e4 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Hongchao Deng  2015-08-17 20:52:07 +  411) public synchronized void startup() {
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar  2012-01-31 06:50:06 +  412) if (sessionTracker == null) {
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar  2012-01-31 06:50:06 +  413) createSessionTracker();
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar  2012-01-31 06:50:06 +  414) }
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar  2012-01-31 06:50:06 +  415) startSessionTracker();
097b7979 zookeeper/java/src/com/yahoo/zookeeper/server/ZooKeeperServer.java (Benjamin Reed  2008-05-12 23:01:25 +  416) setupRequestProcessors();
87e1e030 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Patrick D. Hunt 2009-01-15 22:57:14 +  417)
87e1e030 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Patrick D. Hunt 2009-01-15 22:57:14 +  418) registerJMX();
87e1e030 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Patrick D. Hunt 2009-01-15 22:57:14 +  419)
91f579e4 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Hongchao Deng  2015-08-17 20:52:07 +  420) state = State.RUNNING;
91f579e4 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Hongchao Deng  2015-08-17 20:52:07 +  421) notifyAll();
097b7979 zookeeper/java/src/com/yahoo/zookeeper/server/ZooKeeperServer.java (Benjamin Reed  2008-05-12 23:01:25 +  422) }

{noformat}

{noformat}
commit 91f579e40755de870ed9123c8fd55925517d9aa6
Author: Hongchao Deng 
Date:   Mon Aug 17 20:52:07 2015 +

ZOOKEEPER-1907 Improve Thread handling (Rakesh R via hdeng)

git-svn-id: 
https://svn.apache.org/repos/asf/zookeeper/branches/branch-3.4@1696337 
13f79535-47bb-0310-9956-ffa450edef68
{noformat}

[~rakesh_r] could you have a look, please?

CC [~rgs] [~phunt]



[jira] [Commented] (ZOOKEEPER-2383) Startup race in ZooKeeperServer

2016-03-08 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185323#comment-15185323
 ] 

Flavio Junqueira commented on ZOOKEEPER-2383:
-

[~steve_rowe] Thanks for reporting this issue. According to git blame, the 
latest changes around the startup method in ZooKeeperServer are due to 
ZOOKEEPER-1907, which actually turned out to be quite problematic, so this 
could be another issue due to that patch, I'm not sure.

{noformat}
91f579e4 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Hongchao Deng  2015-08-17 20:52:07 +  411) public synchronized void startup() {
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar  2012-01-31 06:50:06 +  412) if (sessionTracker == null) {
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar  2012-01-31 06:50:06 +  413) createSessionTracker();
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar  2012-01-31 06:50:06 +  414) }
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar  2012-01-31 06:50:06 +  415) startSessionTracker();
097b7979 zookeeper/java/src/com/yahoo/zookeeper/server/ZooKeeperServer.java (Benjamin Reed  2008-05-12 23:01:25 +  416) setupRequestProcessors();
87e1e030 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Patrick D. Hunt 2009-01-15 22:57:14 +  417)
87e1e030 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Patrick D. Hunt 2009-01-15 22:57:14 +  418) registerJMX();
87e1e030 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Patrick D. Hunt 2009-01-15 22:57:14 +  419)
91f579e4 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Hongchao Deng  2015-08-17 20:52:07 +  420) state = State.RUNNING;
91f579e4 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Hongchao Deng  2015-08-17 20:52:07 +  421) notifyAll();
097b7979 zookeeper/java/src/com/yahoo/zookeeper/server/ZooKeeperServer.java (Benjamin Reed  2008-05-12 23:01:25 +  422) }

{noformat}

{noformat}
commit 91f579e40755de870ed9123c8fd55925517d9aa6
Author: Hongchao Deng 
Date:   Mon Aug 17 20:52:07 2015 +

ZOOKEEPER-1907 Improve Thread handling (Rakesh R via hdeng)

git-svn-id: 
https://svn.apache.org/repos/asf/zookeeper/branches/branch-3.4@1696337 
13f79535-47bb-0310-9956-ffa450edef68
{noformat}

[~rakesh_r] could you have a look, please?

CC [~rgs] [~phunt]



[jira] [Updated] (ZOOKEEPER-2383) Startup race in ZooKeeperServer

2016-03-08 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated ZOOKEEPER-2383:
--
Attachment: zk-3.4.8-NPE.log

Sometimes the attached test case will trigger a NullPointerException in 
ZooKeeperServer.createSession() - see attached zk-3.4.8-NPE.log.  For some 
reason I never saw this failure mode in Solr tests using ZK 3.4.8.

{noformat}
2016-03-08 11:29:00,374 [myid:] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:5:NIOServerCnxnFactory@213] - Ignoring unexpected runtime exception
java.lang.NullPointerException
    at org.apache.zookeeper.server.ZooKeeperServer.createSession(ZooKeeperServer.java:569)
    at org.apache.zookeeper.server.ZooKeeperServer.processConnectRequest(ZooKeeperServer.java:902)
    at org.apache.zookeeper.server.NIOServerCnxn.readConnectRequest(NIOServerCnxn.java:418)
    at org.apache.zookeeper.server.NIOServerCnxn.readPayload(NIOServerCnxn.java:198)
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:244)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
    at java.lang.Thread.run(Thread.java:745)
{noformat} 

> Startup race in ZooKeeperServer
> ---
>
> Key: ZOOKEEPER-2383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: jmx, server
>Affects Versions: 3.4.8
>Reporter: Steve Rowe
> Attachments: TestZkStandaloneJMXRegistrationRaceConcurrent.java, 
> release-3.4.8-extra-logging.patch, zk-3.4.8-MBeanRegistry.log, 
> zk-3.4.8-NPE.log
>
>
> In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 
> (SOLR-8724) I ran into test failures where attempts to create a node in a 
> newly started standalone ZooKeeperServer were failing because of an assertion 
> in MBeanRegistry.
> ZooKeeperServer.startup() first sets up its request processor chain then 
> registers itself in JMX, but if a connection comes in before the server's JMX 
> registration happens, registration of the connection will fail because it 
> trips the assertion that (effectively) its parent (the server) has already 
> registered itself.
> {code:java|title=ZooKeeperServer.java}
> public synchronized void startup() {
>     if (sessionTracker == null) {
>         createSessionTracker();
>     }
>     startSessionTracker();
>     setupRequestProcessors();
>     registerJMX();
>     state = State.RUNNING;
>     notifyAll();
> }
> {code}
> {code:java|title=MBeanRegistry.java}
> public void register(ZKMBeanInfo bean, ZKMBeanInfo parent)
>     throws JMException
> {
>     assert bean != null;
>     String path = null;
>     if (parent != null) {
>         path = mapBean2Path.get(parent);
>         assert path != null;
>     }
> {code}
> This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this 
> issue with ZK 3.4.6. 





[jira] [Updated] (ZOOKEEPER-2383) Startup race in ZooKeeperServer

2016-03-08 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated ZOOKEEPER-2383:
--
Attachment: release-3.4.8-extra-logging.patch
zk-3.4.8-MBeanRegistry.log
TestZkStandaloneJMXRegistrationRaceConcurrent.java

Attaching files:

Test case: TestZkStandaloneJMXRegistrationRaceConcurrent.java
ZK 3.4.8 source patch to add JMX logging: release-3.4.8-extra-logging.patch
Log file exhibiting failure: zk-3.4.8-MBeanRegistry.log


> Startup race in ZooKeeperServer
> ---
>
> Key: ZOOKEEPER-2383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: jmx, server
>Affects Versions: 3.4.8
>Reporter: Steve Rowe
> Attachments: TestZkStandaloneJMXRegistrationRaceConcurrent.java, 
> release-3.4.8-extra-logging.patch, zk-3.4.8-MBeanRegistry.log
>
>
> In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 
> (SOLR-8724) I ran into test failures where attempts to create a node in a 
> newly started standalone ZooKeeperServer were failing because of an assertion 
> in MBeanRegistry.
> ZooKeeperServer.startup() first sets up its request processor chain then 
> registers itself in JMX, but if a connection comes in before the server's JMX 
> registration happens, registration of the connection will fail because it 
> trips the assertion that (effectively) its parent (the server) has already 
> registered itself.
> {code:java|title=ZooKeeperServer.java}
> public synchronized void startup() {
>     if (sessionTracker == null) {
>         createSessionTracker();
>     }
>     startSessionTracker();
>     setupRequestProcessors();
>     registerJMX();
>     state = State.RUNNING;
>     notifyAll();
> }
> {code}
> {code:java|title=MBeanRegistry.java}
> public void register(ZKMBeanInfo bean, ZKMBeanInfo parent)
>     throws JMException
> {
>     assert bean != null;
>     String path = null;
>     if (parent != null) {
>         path = mapBean2Path.get(parent);
>         assert path != null;
>     }
> {code}
> This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this 
> issue with ZK 3.4.6. 





[jira] [Commented] (ZOOKEEPER-2383) Startup race in ZooKeeperServer

2016-03-08 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185285#comment-15185285
 ] 

Steve Rowe commented on ZOOKEEPER-2383:
---

This program triggers the problem for me roughly 10% of the time with ZK 3.4.8 
- note that if I don't use a thread to start ZooKeeperServer, the connection 
always comes in after the server has had a chance to register itself with JMX 
(imports omitted - attaching full file here in a sec): 

{code:java|title=TestZkStandaloneJMXRegistrationRaceConcurrent.java}
public class TestZkStandaloneJMXRegistrationRaceConcurrent {
  public static void main(String[] args) throws IOException, InterruptedException, KeeperException {
    class ServerThread extends Thread {
      private ZooKeeperServer server;
      private ServerCnxnFactory cnxnFactory;
      @Override public void run() {
        try {
          File tempDir = Files.createTempDirectory(FileSystems.getDefault().getPath("."), "test").toFile();
          FileTxnSnapLog txnSnapLog = new FileTxnSnapLog(tempDir, tempDir);
          server = new ZooKeeperServer(txnSnapLog, 2000, 2000, 4000, null, new ZKDatabase(txnSnapLog));
          cnxnFactory = ServerCnxnFactory.createFactory(5, -1);
          cnxnFactory.startup(server);
        } catch (IOException e) {
          throw new RuntimeException(e);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
      public void shutdown() throws IOException, InterruptedException {
        cnxnFactory.shutdown();
        cnxnFactory.join();
        server.shutdown();
      }
    }
    ServerThread serverThread = new ServerThread();
    serverThread.setDaemon(true);
    serverThread.start();
    Thread.sleep(3);
    ZooKeeper zk = new ZooKeeper("127.0.0.1:5", 45000, new Watcher() {
      public void process(WatchedEvent event) {}
    });
    zk.create("/testing123", new byte[]{}, Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
    serverThread.shutdown();
    serverThread.join();
  }
}
{code}
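
For comparison, here is a minimal sketch of how a caller could wait until a background startup has finished before connecting, which is roughly what happens implicitly when the server is started without a thread. It works around the race in the caller rather than fixing it in the server. Everything in it is hypothetical: {{FakeServer}}, {{awaitStarted}} and the latch are illustrative stand-ins and are not part of the ZooKeeper API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class StartupLatchSketch {
    /** Hypothetical stand-in for a server started on a background thread. */
    static class FakeServer extends Thread {
        private final CountDownLatch started = new CountDownLatch(1);
        private volatile boolean registered = false;

        @Override public void run() {
            registered = true;    // stands in for the JMX registration step
            started.countDown();  // signal only after all startup work is done
        }

        boolean awaitStarted(long timeout, TimeUnit unit) throws InterruptedException {
            return started.await(timeout, unit);
        }

        boolean isRegistered() {
            return registered;
        }
    }

    /** Returns true if the "client" only proceeds once registration has happened. */
    public static boolean connectAfterStartup() {
        FakeServer server = new FakeServer();
        server.setDaemon(true);
        server.start();
        try {
            // Block instead of sleeping for a fixed interval and hoping startup is done.
            if (!server.awaitStarted(5, TimeUnit.SECONDS)) {
                return false;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return server.isRegistered();
    }

    public static void main(String[] args) {
        System.out.println("registered before connect: " + connectAfterStartup());
    }
}
```

The latch is counted down after the registration stand-in, so a caller that returns from {{awaitStarted}} is guaranteed to observe the registration.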

Here's an excerpt from a log exhibiting the failure - I'll also attach the full 
log (I've added some logging to ZK 3.4.8 - I'll attach a patch showing those 
additions here in a minute):

{noformat}
2016-03-08 11:32:08,414 [myid:] - WARN  [SyncThread:0:MBeanRegistry@100] - bean 'Connections/127.0.0.1/0x153571244a7' with parent 'StandaloneServer_port5' has null path.
java.lang.Throwable: 
    at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:98)
    at org.apache.zookeeper.server.ServerCnxnFactory.registerConnection(ServerCnxnFactory.java:147)
    at org.apache.zookeeper.server.ZooKeeperServer.finishSessionInit(ZooKeeperServer.java:613)
    at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:181)
    at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200)
    at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131)
2016-03-08 11:32:08,414 [myid:] - WARN  [Thread-0:MBeanRegistry@118] - registered bean 'StandaloneServer_port5' with parent 'null' at path '/'
java.lang.Throwable: 
    at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:116)
    at org.apache.zookeeper.server.ZooKeeperServer.registerJMX(ZooKeeperServer.java:385)
    at org.apache.zookeeper.server.ZooKeeperServer.startup(ZooKeeperServer.java:418)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:119)
    at TestZkStandaloneJMXRegistrationRaceConcurrent$1ServerThread.run(TestZkStandaloneJMXRegistrationRaceConcurrent.java:29)
2016-03-08 11:32:08,415 [myid:] - ERROR [SyncThread:0:ZooKeeperCriticalThread@49] - Severe unrecoverable error, from thread : SyncThread:0
java.lang.AssertionError
    at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:104)
    at org.apache.zookeeper.server.ServerCnxnFactory.registerConnection(ServerCnxnFactory.java:147)
    at org.apache.zookeeper.server.ZooKeeperServer.finishSessionInit(ZooKeeperServer.java:613)
    at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:181)
    at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200)
    at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131)
2016-03-08 11:32:08,416 [myid:] - WARN  [Thread-0:MBeanRegistry@118] - registered bean 'InMemoryDataTree' with parent 'StandaloneServer_port5' at path '/StandaloneServer_port5'
java.lang.Throwable: 
    at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:116)
    at org.apache.zookeeper.server.ZooKeeperServer.registerJMX(ZooKeeperServer.java:389)
at 

[jira] [Created] (ZOOKEEPER-2383) Startup race in ZooKeeperServer

2016-03-08 Thread Steve Rowe (JIRA)
Steve Rowe created ZOOKEEPER-2383:
-

 Summary: Startup race in ZooKeeperServer
 Key: ZOOKEEPER-2383
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383
 Project: ZooKeeper
  Issue Type: Bug
  Components: jmx, server
Affects Versions: 3.4.8
Reporter: Steve Rowe


In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 
(SOLR-8724) I ran into test failures where attempts to create a node in a newly 
started standalone ZooKeeperServer were failing because of an assertion in 
MBeanRegistry.

ZooKeeperServer.startup() first sets up its request processor chain then 
registers itself in JMX, but if a connection comes in before the server's JMX 
registration happens, registration of the connection will fail because it trips 
the assertion that (effectively) its parent (the server) has already registered 
itself.

{code:java|title=ZooKeeperServer.java}
public synchronized void startup() {
    if (sessionTracker == null) {
        createSessionTracker();
    }
    startSessionTracker();
    setupRequestProcessors();

    registerJMX();

    state = State.RUNNING;
    notifyAll();
}
{code}

{code:java|title=MBeanRegistry.java}
public void register(ZKMBeanInfo bean, ZKMBeanInfo parent)
    throws JMException
{
    assert bean != null;
    String path = null;
    if (parent != null) {
        path = mapBean2Path.get(parent);
        assert path != null;
    }
{code}

This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this issue 
with ZK 3.4.6. 
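
The ordering dependency described above can be illustrated with a toy registry. This is only a sketch under stated assumptions: {{RegistryOrderSketch}} is a hypothetical simplification that maps bean names to paths, not ZooKeeper's MBeanRegistry, and it throws an exception where the real code uses an assertion. It shows that registering a child before its parent fails, while registering the parent first succeeds.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy stand-in for a parent/child bean registry; not ZooKeeper's MBeanRegistry. */
public class RegistryOrderSketch {
    // Maps each registered bean name to the path it was registered under.
    private final Map<String, String> beanToPath = new ConcurrentHashMap<>();

    /** Registers a bean; fails if a parent is named but has not been registered yet. */
    public String register(String bean, String parent) {
        String path = "/";
        if (parent != null) {
            String parentPath = beanToPath.get(parent);
            if (parentPath == null) {
                // Corresponds to the failing "assert path != null" in the excerpt.
                throw new IllegalStateException("parent '" + parent + "' is not registered yet");
            }
            path = parentPath.equals("/") ? "/" + parent : parentPath + "/" + parent;
        }
        beanToPath.put(bean, path);
        return path;
    }

    /** Child first, as when a connection arrives before the server has registered. */
    public static boolean childFirstFails() {
        RegistryOrderSketch registry = new RegistryOrderSketch();
        try {
            registry.register("Connections", "Server");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    /** Parent first, the ordering that the assertion takes for granted. */
    public static String parentFirstPath() {
        RegistryOrderSketch registry = new RegistryOrderSketch();
        registry.register("Server", null);
        return registry.register("Connections", "Server");
    }

    public static void main(String[] args) {
        System.out.println("child first fails: " + childFirstFails());
        System.out.println("parent first path: " + parentFirstPath());
    }
}
```

In the real server the two registrations happen on different threads, so which ordering occurs depends on timing; the sketch runs each ordering deterministically.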


