[jira] [Updated] (ZOOKEEPER-2266) Integrate JaCoCo Coverage Library
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akihiro Suda updated ZOOKEEPER-2266: Attachment: ZOOKEEPER-2266-v4.patch Hello [~phunt], Thank you for the comment. Attached v4.patch: * Set {{jacoco.enabled}} to {{false}} by default * JaCoCo jars are pulled only when {{jacoco.enabled}} is {{true}} * Use fileset for JaCoCo jars in {{build.xml}} * Eliminated tabs The check for {{jacoco.enabled}} is implemented as follows: {code:xml} .. .. {code} Unfortunately there is a {{}} code clone between {{junit.run-concurrent.nojacoco}} and {{junit.run-concurrent.jacoco}}, but I could not find a way to deduplicate this. If I split the {{}} as an independent task, and put {{}} in {{}}, I get an error {{antcall is not a valid child of the coverage task}}. {{}} throws a similar error as well. > Integrate JaCoCo Coverage Library > - > > Key: ZOOKEEPER-2266 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2266 > Project: ZooKeeper > Issue Type: Improvement > Components: tests >Reporter: Akihiro Suda >Assignee: Akihiro Suda >Priority: Minor > Attachments: ZOOKEEPER-2266-v1.patch, ZOOKEEPER-2266-v2.patch, > ZOOKEEPER-2266-v3.patch, ZOOKEEPER-2266-v4.patch, jacoco-report-example.zip > > > I would like to propose integration of > [JaCoCo|http://www.eclemma.org/jacoco/] coverage library with ZooKeeper. > h4. Purposes > - To find poorly covered methods, and improve JUnit testcases to cover them > - To estimate causes of flaky testcases (e.g. ZOOKEEPER-2080, > ZOOKEEPER-2252, ZOOKEEPER-1868) by comparing reports from succeeded > experiments and failed ones (I'm recently interested in how we can > systematically realize this.) > h4. Advantages of JaCoCo > - Support recent JDKs (including JDK 8) > - Low overhead > - Released under EPL > -- Note: cobertura has been removed from the code base because it is > released under GPL ( ZOOKEEPER-75, > http://www.apache.org/legal/resolved.html#category-x ) > h4.
Usage > {panel} > $ ant test #(plus optionally, -Dtestcase=.. -Dtest.method=..) > $ ant jacoco-report > $ x-www-browser build/test/jacoco/reports/index.html & > {panel} > Example: jacoco-report-example.zip > h4. Possible Future Work > - Integrate to Jenkins buildbot so that we can check coverage after each of > builds -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ZOOKEEPER-2383) Startup race in ZooKeeperServer
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186556#comment-15186556 ] Rakesh R commented on ZOOKEEPER-2383: -
Thanks [~steve_rowe] for reporting this issue and for the good analysis.
bq. According to git blame, the latest changes around the startup method in ZooKeeperServer are due to ZOOKEEPER-1907, which actually turned out to be quite problematic, so this could be another issue due to that patch, I'm not sure.
[~fpj], sure, I'm happy to investigate this. To understand the impact of ZOOKEEPER-1907, I first took the code from before the ZOOKEEPER-1907 commit, at version {{da3e7e0d4b66ac5a25d40ae2d0102b1b57994b62}}. I debugged that code and was able to reproduce the issue even without the ZOOKEEPER-1907 changes.
Coming back to this jira, there are two issues reported. IIUC, both are caused by a race between server startup and the processing of a client connection request. I've tried to work out what happens; please see the sequences below that cause the trouble.
# NullPointerException while creating session
{code}
2016-03-08 11:29:00,374 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:5:NIOServerCnxnFactory@213] - Ignoring unexpected runtime exception
java.lang.NullPointerException
at org.apache.zookeeper.server.ZooKeeperServer.createSession(ZooKeeperServer.java:569)
{code}
+Thread-1: Starting the server+
1=> Invoked cnxnFactory.startup(server);
2=> Started the NIOServerCxn.Factory thread and registered OP_ACCEPT to accept connections
3=> Sets the ZooKeeper server on the connection factory
4=> Loads the ZooKeeper data
5=> Assume the server is about to invoke {{zks.startup();}} and {{sessionTracker}} is not yet initialized.
+Thread-2: creating client connection+
1=> Sends a connection request to the server
2=> NIOServerCnxn reads the request and invokes {{NIOServerCnxn#readConnectRequest()}}
3=> It then calls {{zkServer.processConnectRequest(this, incomingBuffer);}}
4=> While processing the request it needs the {{sessionTracker}} reference, but this is not yet initialized because the server is still in the startup phase, which causes the NPE.
# MBeanRegistry throws an assertion error because the parent doesn't exist
{code}
2016-03-08 11:29:00,449 [myid:] - WARN [Thread-0:MBeanRegistry@118] - registered bean 'InMemoryDataTree' with parent 'StandaloneServer_port5' at path '/StandaloneServer_port5'
java.lang.Throwable:
at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:116)
{code}
+Thread-1: Starting the server+
1=> Invoked cnxnFactory.startup(server);
2=> Started the NIOServerCxn.Factory thread and registered OP_ACCEPT to accept connections
3=> Sets the ZooKeeper server on the connection factory
4=> Loads the ZooKeeper data
5=> Server invoked {{zks.startup();}}
6=> Started the session tracker
7=> Finished setting up the RequestProcessors
8=> Invoked {{ZooKeeperServer#registerJMX();}}
9=> Now assume ZooKeeperServer has initialized {{jmxServerBean = new ZooKeeperServerBean(this);}} and is about to register the bean in the registry with {{MBeanRegistry.getInstance().register(jmxServerBean, null);}}
+Thread-2: creating client connection+
1=> Sends a connection request to the server
2=> NIOServerCnxn reads the request and invokes {{NIOServerCnxn#readConnectRequest()}}
3=> It then calls {{zkServer.processConnectRequest(this, incomingBuffer);}}
4=> Since all the request processors are ready, it successfully creates the session and goes on to register the connection bean
5=> Now it invokes {{zkServer.finishSessionInit()}}, which invokes {{serverCnxnFactory.registerConnection(cnxn);}} and hits the path error.
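The first interleaving above can be sketched in a few lines of Java. This is a minimal illustration, not ZooKeeper code: {{SketchServer}}, {{createSessionUnguarded}} and {{createSessionGuarded}} are hypothetical names, and waiting for the {{RUNNING}} state is only one possible guard, chosen here because the quoted {{startup()}} already ends with {{state = State.RUNNING; notifyAll();}}. It is not presented as the fix adopted for this issue.

```java
public class StartupRaceSketch {
    enum State { INITIAL, RUNNING }

    /** Hypothetical stand-in for a server; not ZooKeeper's real class. */
    static class SketchServer {
        private State state = State.INITIAL;
        private Object sessionTracker; // stays null until startup() runs

        synchronized void startup() {
            sessionTracker = new Object();
            state = State.RUNNING;
            notifyAll(); // wake up any connection thread waiting in the guard
        }

        /** Unguarded path: dereferences sessionTracker without checking the state. */
        String createSessionUnguarded() {
            return "session via " + sessionTracker.getClass().getSimpleName();
        }

        /** Guarded path: waits (up to timeoutMs) until startup() has finished. */
        synchronized boolean createSessionGuarded(long timeoutMs) throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (state != State.RUNNING) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false; // server never came up in time
                }
                wait(remaining);
            }
            return sessionTracker != null;
        }
    }

    /** Connection request arrives before startup(): the unguarded path throws an NPE. */
    static boolean unguardedFailsBeforeStartup() {
        SketchServer server = new SketchServer();
        try {
            server.createSessionUnguarded();
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    /** Same timing, but the guarded path succeeds whichever thread runs first. */
    static boolean guardedWaitsForStartup() {
        SketchServer server = new SketchServer();
        boolean[] result = new boolean[1];
        Thread client = new Thread(() -> {
            try {
                result[0] = server.createSessionGuarded(5000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        client.start();
        server.startup();
        try {
            client.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("unguarded fails before startup: " + unguardedFailsBeforeStartup());
        System.out.println("guarded waits for startup: " + guardedWaitsForStartup());
    }
}
```

The guarded variant gives the same result whether the client thread or {{startup()}} runs first, which is the property the race described above lacks.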
> Startup race in ZooKeeperServer > --- > > Key: ZOOKEEPER-2383 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383 > Project: ZooKeeper > Issue Type: Bug > Components: jmx, server >Affects Versions: 3.4.8 >Reporter: Steve Rowe >Priority: Blocker > Fix For: 3.4.9 > > Attachments: TestZkStandaloneJMXRegistrationRaceConcurrent.java, > release-3.4.8-extra-logging.patch, zk-3.4.8-MBeanRegistry.log, > zk-3.4.8-NPE.log > > > In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 > (SOLR-8724) I ran into test failures where attempts to create a node in a > newly started standalone ZooKeeperServer were failing because of an assertion > in MBeanRegistry. > ZooKeeperServer.startup() first sets up its request processor chain then > registers itself in JMX, but if a connection comes in before the server's JMX > registration happens, registration of the connection will fail because it > trips the assertion that (effectively) its parent (the server) has already > registered itself. > {code:java|title=ZooKeeperServer.java} > public synchronized void startup() { > if (sessionTracker == null) { >
[jira] [Commented] (ZOOKEEPER-2383) Startup race in ZooKeeperServer
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185740#comment-15185740 ] Steve Rowe commented on ZOOKEEPER-2383: ---
Similarly to the subclassed ZooKeeperServer above, if I apply the following patch to ZK 3.4.8 and run the attached test case with it, the JMX registration race no longer happens:
{code:java}
Index: src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java
===
--- src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (revision 1732157)
+++ src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (working copy)
@@ -413,10 +413,9 @@
 createSessionTracker();
 }
 startSessionTracker();
+registerJMX();
 setupRequestProcessors();
 
-registerJMX();
-
 state = State.RUNNING;
 notifyAll();
 }
{code}
> Startup race in ZooKeeperServer > --- > > Key: ZOOKEEPER-2383 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383 > Project: ZooKeeper > Issue Type: Bug > Components: jmx, server >Affects Versions: 3.4.8 >Reporter: Steve Rowe >Priority: Blocker > Fix For: 3.4.9 > > Attachments: TestZkStandaloneJMXRegistrationRaceConcurrent.java, > release-3.4.8-extra-logging.patch, zk-3.4.8-MBeanRegistry.log, > zk-3.4.8-NPE.log > > > In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 > (SOLR-8724) I ran into test failures where attempts to create a node in a > newly started standalone ZooKeeperServer were failing because of an assertion > in MBeanRegistry. > ZooKeeperServer.startup() first sets up its request processor chain then > registers itself in JMX, but if a connection comes in before the server's JMX > registration happens, registration of the connection will fail because it > trips the assertion that (effectively) its parent (the server) has already > registered itself.
> {code:java|title=ZooKeeperServer.java} > public synchronized void startup() { > if (sessionTracker == null) { > createSessionTracker(); > } > startSessionTracker(); > setupRequestProcessors(); > registerJMX(); > state = State.RUNNING; > notifyAll(); > } > {code} > {code:java|title=MBeanRegistry.java} > public void register(ZKMBeanInfo bean, ZKMBeanInfo parent) > throws JMException > { > assert bean != null; > String path = null; > if (parent != null) { > path = mapBean2Path.get(parent); > assert path != null; > } > {code} > This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this > issue with ZK 3.4.6. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
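The parent-before-child constraint in the quoted {{MBeanRegistry.register()}} can be illustrated with a small stand-alone sketch. Everything here is an assumption made for illustration: {{BeanRegistrySketch}} uses plain strings instead of {{ZKMBeanInfo}}, builds paths by simple concatenation, and throws an {{IllegalStateException}} where the original uses {{assert}} (Java assertions are disabled unless the JVM runs with {{-ea}}). The bean name "Connections" is made up.

```java
import java.util.HashMap;
import java.util.Map;

public class BeanRegistrySketch {
    // Maps a registered bean name to its full path, loosely mirroring mapBean2Path.
    private final Map<String, String> beanToPath = new HashMap<>();

    /** Registers a bean under an optional parent; the parent must already be registered. */
    void register(String bean, String parent) {
        String parentPath = "";
        if (parent != null) {
            parentPath = beanToPath.get(parent);
            if (parentPath == null) {
                // This is the situation a connection hits when it races ahead of the server bean.
                throw new IllegalStateException("parent '" + parent + "' is not registered yet");
            }
        }
        beanToPath.put(bean, parentPath + "/" + bean);
    }

    String pathOf(String bean) {
        return beanToPath.get(bean);
    }

    /** Child first: registration is rejected because the parent path is unknown. */
    static boolean childBeforeParentFails() {
        BeanRegistrySketch registry = new BeanRegistrySketch();
        try {
            registry.register("Connections", "StandaloneServer_port5");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    /** Parent first: the child is placed under the parent's path. */
    static String parentBeforeChildPath() {
        BeanRegistrySketch registry = new BeanRegistrySketch();
        registry.register("StandaloneServer_port5", null);
        registry.register("Connections", "StandaloneServer_port5");
        return registry.pathOf("Connections");
    }

    public static void main(String[] args) {
        System.out.println("child before parent fails: " + childBeforeParentFails());
        System.out.println("parent before child path: " + parentBeforeChildPath());
    }
}
```

With the parent registered first, the child ends up at {{/StandaloneServer_port5/Connections}} in this sketch; with the order reversed, the lookup of the parent path returns null, which is the condition the quoted assertion guards against.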
[jira] [Commented] (ZOOKEEPER-2044) CancelledKeyException in zookeeper 3.4.5
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185520#comment-15185520 ] Hadoop QA commented on ZOOKEEPER-2044: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12791627/ZOOKEEPER-2044.patch against trunk revision 1733679. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3092//console This message is automatically generated. > CancelledKeyException in zookeeper 3.4.5 > > > Key: ZOOKEEPER-2044 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2044 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.4.6 > Environment: Red Hat Enterprise Linux Server release 6.2 >Reporter: shamjith antholi >Assignee: Flavio Junqueira >Priority: Minor > Fix For: 3.4.9 > > Attachments: ZOOKEEPER-2044.patch, ZOOKEEPER-2044.patch > > > I am getting cancelled key exception in zookeeper (version 3.4.5). Please see > the log below. When this error is thrown, the connected solr shard is going > down by giving the error "Failed to index metadata in > Solr,StackTrace=SolrError: HTTP status 503.Reason: > {"responseHeader":{"status":503,"QTime":204},"error":{"msg":"ClusterState > says we are the leader, but locally we don't think so","code":503" and > ultimately the current activity is going down. Could you please give a > solution for this ? 
> Zookeper log > -- > 2014-09-16 02:58:47,799 [myid:1] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@832] - Client > attempting to renew session 0x24868e7ca980003 at /172.22.0.5:58587 > 2014-09-16 02:58:47,800 [myid:1] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:Learner@107] - Revalidating > client: 0x24868e7ca980003 > 2014-09-16 02:58:47,802 [myid:1] - INFO > [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@588] - Invalid > session 0x24868e7ca980003 for client /172.22.0.5:58587, probably expired > 2014-09-16 02:58:47,803 [myid:1] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed > socket connection for client /172.22.0.5:58587 which had sessionid > 0x24868e7ca980003 > 2014-09-16 02:58:47,810 [myid:1] - ERROR > [CommitProcessor:1:NIOServerCnxn@180] - Unexpected Exception: > java.nio.channels.CancelledKeyException > at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) > at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) > at > org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153) > at > org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076) > at > org.apache.zookeeper.server.NIOServerCnxn.process(NIOServerCnxn.java:1113) > at org.apache.zookeeper.server.DataTree.setWatches(DataTree.java:1327) > at > org.apache.zookeeper.server.ZKDatabase.setWatches(ZKDatabase.java:384) > at > org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:304) > at > org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74) > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
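The stack trace above ends in {{SelectionKeyImpl.interestOps()}} being called on a key that has already been cancelled. The following sketch reproduces that JDK behaviour in isolation and shows one defensive pattern: check {{isValid()}} and still catch {{CancelledKeyException}}, since another thread can cancel the key between the check and the call. It uses only standard {{java.nio.channels}} classes; {{enableWrite}} is a hypothetical helper, and nothing here is taken from the attached ZOOKEEPER-2044 patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class CancelledKeySketch {

    /** Returns true if OP_WRITE was enabled, false if the key was already cancelled. */
    static boolean enableWrite(SelectionKey key) {
        if (!key.isValid()) {
            return false; // connection already closed, nothing to send to
        }
        try {
            key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
            return true;
        } catch (CancelledKeyException e) {
            return false; // cancelled by another thread after the isValid() check
        }
    }

    /** Returns {guarded call before cancel, guarded call after cancel, raw call throws}. */
    static boolean[] demo() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open(); // a pipe keeps the example free of network sockets
            try {
                pipe.sink().configureBlocking(false);
                SelectionKey key = pipe.sink().register(selector, 0);

                boolean beforeCancel = enableWrite(key);
                key.cancel(); // what closing the connection does to its key
                boolean afterCancel = enableWrite(key);

                boolean rawCallThrows;
                try {
                    key.interestOps(SelectionKey.OP_WRITE); // unguarded, as in the trace
                    rawCallThrows = false;
                } catch (CancelledKeyException e) {
                    rawCallThrows = true;
                }
                return new boolean[] {beforeCancel, afterCancel, rawCallThrows};
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("guarded before cancel: " + r[0]
                + ", guarded after cancel: " + r[1]
                + ", raw call throws: " + r[2]);
    }
}
```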
Failed: ZOOKEEPER-2044 PreCommit Build #3092
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2044 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3092/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 97 lines...] [exec] Hunk #1 succeeded at 18 with fuzz 2. [exec] Hunk #2 succeeded at 75 (offset 1 line). [exec] PATCH APPLICATION FAILED [exec] [exec] [exec] [exec] [exec] -1 overall. Here are the results of testing the latest attachment [exec] http://issues.apache.org/jira/secure/attachment/12791627/ZOOKEEPER-2044.patch [exec] against trunk revision 1733679. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] -1 patch. The patch command could not apply the patch. [exec] [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3092//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 171cfaab088165a136330fe41f331d20997b335f logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] BUILD FAILED /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1646: exec returned: 1 Total time: 1 minute 14 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Setting LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7 Recording test results Setting LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7 ERROR: Step 'Publish JUnit test result report' failed: No test report files were found. Configuration error?
Setting LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7 [description-setter] Description set: ZOOKEEPER-2044 Email was triggered for: Failure - Any Sending email for trigger: Failure - Any Setting LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7 Setting LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7 Setting LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7 Setting LATEST1_7_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.7 ### ## FAILED TESTS (if any) ## No tests ran.
[jira] [Commented] (ZOOKEEPER-2383) Startup race in ZooKeeperServer
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185412#comment-15185412 ] Steve Rowe commented on ZOOKEEPER-2383: ---
FWIW, I was able to work around the problem in Solr tests by subclassing ZooKeeperServer and ordering server JMX registration before setting up its request processing pipeline, like so:
{code:java}
private class TestZooKeeperServer extends ZooKeeperServer {
  @Override
  protected void registerJMX() {
    // no-op - super.registerJMX() is called in overridden startup()
  }

  /** Register in JMX before starting the request processors. */
  @Override
  public void startup() {
    super.registerJMX();
    super.startup();
  }

  public TestZooKeeperServer(FileTxnSnapLog txnLogFactory, int tickTime, int minSessionTimeout, int maxSessionTimeout, DataTreeBuilder treeBuilder, ZKDatabase zkDb) {
    super(txnLogFactory, tickTime, minSessionTimeout, maxSessionTimeout, treeBuilder, zkDb);
  }
}
{code}
> Startup race in ZooKeeperServer > --- > > Key: ZOOKEEPER-2383 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383 > Project: ZooKeeper > Issue Type: Bug > Components: jmx, server >Affects Versions: 3.4.8 >Reporter: Steve Rowe >Priority: Blocker > Fix For: 3.4.9 > > Attachments: TestZkStandaloneJMXRegistrationRaceConcurrent.java, > release-3.4.8-extra-logging.patch, zk-3.4.8-MBeanRegistry.log, > zk-3.4.8-NPE.log > > > In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 > (SOLR-8724) I ran into test failures where attempts to create a node in a > newly started standalone ZooKeeperServer were failing because of an assertion > in MBeanRegistry. > ZooKeeperServer.startup() first sets up its request processor chain then > registers itself in JMX, but if a connection comes in before the server's JMX > registration happens, registration of the connection will fail because it > trips the assertion that (effectively) its parent (the server) has already > registered itself.
> {code:java|title=ZooKeeperServer.java} > public synchronized void startup() { > if (sessionTracker == null) { > createSessionTracker(); > } > startSessionTracker(); > setupRequestProcessors(); > registerJMX(); > state = State.RUNNING; > notifyAll(); > } > {code} > {code:java|title=MBeanRegistry.java} > public void register(ZKMBeanInfo bean, ZKMBeanInfo parent) > throws JMException > { > assert bean != null; > String path = null; > if (parent != null) { > path = mapBean2Path.get(parent); > assert path != null; > } > {code} > This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this > issue with ZK 3.4.6. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
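The effect of the subclass workaround on the order of startup steps can be checked without ZooKeeper on the classpath. The sketch below uses hypothetical classes ({{BaseServer}}, {{ReorderedServer}}) that only record the order in which the steps run; they mimic the shape of the quoted {{startup()}} and of the {{TestZooKeeperServer}} override, nothing more.

```java
import java.util.ArrayList;
import java.util.List;

public class StartupOrderSketch {

    /** Hypothetical base class with the original step order. */
    static class BaseServer {
        final List<String> steps = new ArrayList<>();

        protected void registerJMX() {
            steps.add("registerJMX");
        }

        protected void setupRequestProcessors() {
            steps.add("setupRequestProcessors");
        }

        public void startup() {
            setupRequestProcessors(); // connections can be processed from here on
            registerJMX();            // server bean registered only afterwards
            steps.add("RUNNING");
        }
    }

    /** Same trick as the workaround: make the late call a no-op and register early. */
    static class ReorderedServer extends BaseServer {
        @Override
        protected void registerJMX() {
            // no-op: the real registration happens in startup() below
        }

        @Override
        public void startup() {
            super.registerJMX(); // non-virtual call to the base implementation
            super.startup();     // its registerJMX() call dispatches to the no-op
        }
    }

    static List<String> run(BaseServer server) {
        server.startup();
        return server.steps;
    }

    public static void main(String[] args) {
        System.out.println("base:      " + run(new BaseServer()));
        System.out.println("reordered: " + run(new ReorderedServer()));
    }
}
```

In the reordered variant the registration step is recorded before the request processors are set up, and it runs exactly once because the virtual call inside the base {{startup()}} lands on the empty override.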
[jira] [Updated] (ZOOKEEPER-2383) Startup race in ZooKeeperServer
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Junqueira updated ZOOKEEPER-2383: Priority: Blocker (was: Major) > Startup race in ZooKeeperServer > --- > > Key: ZOOKEEPER-2383 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383 > Project: ZooKeeper > Issue Type: Bug > Components: jmx, server >Affects Versions: 3.4.8 >Reporter: Steve Rowe >Priority: Blocker > Fix For: 3.4.9 > > Attachments: TestZkStandaloneJMXRegistrationRaceConcurrent.java, > release-3.4.8-extra-logging.patch, zk-3.4.8-MBeanRegistry.log, > zk-3.4.8-NPE.log > > > In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 > (SOLR-8724) I ran into test failures where attempts to create a node in a > newly started standalone ZooKeeperServer were failing because of an assertion > in MBeanRegistry. > ZooKeeperServer.startup() first sets up its request processor chain then > registers itself in JMX, but if a connection comes in before the server's JMX > registration happens, registration of the connection will fail because it > trips the assertion that (effectively) its parent (the server) has already > registered itself. > {code:java|title=ZooKeeperServer.java} > public synchronized void startup() { > if (sessionTracker == null) { > createSessionTracker(); > } > startSessionTracker(); > setupRequestProcessors(); > registerJMX(); > state = State.RUNNING; > notifyAll(); > } > {code} > {code:java|title=MBeanRegistry.java} > public void register(ZKMBeanInfo bean, ZKMBeanInfo parent) > throws JMException > { > assert bean != null; > String path = null; > if (parent != null) { > path = mapBean2Path.get(parent); > assert path != null; > } > {code} > This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this > issue with ZK 3.4.6. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (ZOOKEEPER-2383) Startup race in ZooKeeperServer
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Junqueira updated ZOOKEEPER-2383: Fix Version/s: 3.4.9 > Startup race in ZooKeeperServer > --- > > Key: ZOOKEEPER-2383 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383 > Project: ZooKeeper > Issue Type: Bug > Components: jmx, server >Affects Versions: 3.4.8 >Reporter: Steve Rowe >Priority: Blocker > Fix For: 3.4.9 > > Attachments: TestZkStandaloneJMXRegistrationRaceConcurrent.java, > release-3.4.8-extra-logging.patch, zk-3.4.8-MBeanRegistry.log, > zk-3.4.8-NPE.log > > > In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 > (SOLR-8724) I ran into test failures where attempts to create a node in a > newly started standalone ZooKeeperServer were failing because of an assertion > in MBeanRegistry. > ZooKeeperServer.startup() first sets up its request processor chain then > registers itself in JMX, but if a connection comes in before the server's JMX > registration happens, registration of the connection will fail because it > trips the assertion that (effectively) its parent (the server) has already > registered itself. > {code:java|title=ZooKeeperServer.java} > public synchronized void startup() { > if (sessionTracker == null) { > createSessionTracker(); > } > startSessionTracker(); > setupRequestProcessors(); > registerJMX(); > state = State.RUNNING; > notifyAll(); > } > {code} > {code:java|title=MBeanRegistry.java} > public void register(ZKMBeanInfo bean, ZKMBeanInfo parent) > throws JMException > { > assert bean != null; > String path = null; > if (parent != null) { > path = mapBean2Path.get(parent); > assert path != null; > } > {code} > This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this > issue with ZK 3.4.6. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (ZOOKEEPER-2383) Startup race in ZooKeeperServer
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185323#comment-15185323 ] Flavio Junqueira edited comment on ZOOKEEPER-2383 at 3/8/16 5:39 PM: -
[~steve_rowe] Thanks for reporting this issue. According to git blame, the latest changes around the startup method in ZooKeeperServer are due to ZOOKEEPER-1907, which actually turned out to be quite problematic, so this could be another issue due to that patch, I'm not sure.
{noformat}
91f579e4 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Hongchao Deng 2015-08-17 20:52:07 + 411) public synchronized void startup() {
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar 2012-01-31 06:50:06 + 412) if (sessionTracker == null) {
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar 2012-01-31 06:50:06 + 413) createSessionTracker();
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar 2012-01-31 06:50:06 + 414) }
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar 2012-01-31 06:50:06 + 415) startSessionTracker();
097b7979 zookeeper/java/src/com/yahoo/zookeeper/server/ZooKeeperServer.java (Benjamin Reed 2008-05-12 23:01:25 + 416) setupRequestProcessors();
87e1e030 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Patrick D. Hunt 2009-01-15 22:57:14 + 417)
87e1e030 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Patrick D. Hunt 2009-01-15 22:57:14 + 418) registerJMX();
87e1e030 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Patrick D. Hunt 2009-01-15 22:57:14 + 419)
91f579e4 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Hongchao Deng 2015-08-17 20:52:07 + 420) state = State.RUNNING;
91f579e4 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Hongchao Deng 2015-08-17 20:52:07 + 421) notifyAll();
097b7979 zookeeper/java/src/com/yahoo/zookeeper/server/ZooKeeperServer.java (Benjamin Reed 2008-05-12 23:01:25 + 422) }
{noformat}
{noformat}
commit 91f579e40755de870ed9123c8fd55925517d9aa6
Author: Hongchao Deng
Date: Mon Aug 17 20:52:07 2015 +
ZOOKEEPER-1907 Improve Thread handling (Rakesh R via hdeng)
git-svn-id: https://svn.apache.org/repos/asf/zookeeper/branches/branch-3.4@1696337 13f79535-47bb-0310-9956-ffa450edef68
{noformat}
[~rakesh_r] could you have a look, please? CC [~rgs] [~phunt]
was (Author: fpj): [~steve_rowe] Thanks for reporting this issue. According to git blame, the latest changes around the startup method in ZooKeeperServer is due to ZOOKEEPER-1907, which actually turned out to be quite problematic, so this could be another issue due to that patch, I'm not sure.
[jira] [Commented] (ZOOKEEPER-2383) Startup race in ZooKeeperServer
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185323#comment-15185323 ] Flavio Junqueira commented on ZOOKEEPER-2383: -
[~steve_rowe] Thanks for reporting this issue. According to git blame, the latest changes around the startup method in ZooKeeperServer are due to ZOOKEEPER-1907, which actually turned out to be quite problematic, so this could be another issue due to that patch, I'm not sure.
{noformat}
91f579e4 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Hongchao Deng 2015-08-17 20:52:07 + 411) public synchronized void startup() {
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar 2012-01-31 06:50:06 + 412) if (sessionTracker == null) {
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar 2012-01-31 06:50:06 + 413) createSessionTracker();
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar 2012-01-31 06:50:06 + 414) }
55b03fce src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Mahadev Konar 2012-01-31 06:50:06 + 415) startSessionTracker();
097b7979 zookeeper/java/src/com/yahoo/zookeeper/server/ZooKeeperServer.java (Benjamin Reed 2008-05-12 23:01:25 + 416) setupRequestProcessors();
87e1e030 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Patrick D. Hunt 2009-01-15 22:57:14 + 417)
87e1e030 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Patrick D. Hunt 2009-01-15 22:57:14 + 418) registerJMX();
87e1e030 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Patrick D. Hunt 2009-01-15 22:57:14 + 419)
91f579e4 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Hongchao Deng 2015-08-17 20:52:07 + 420) state = State.RUNNING;
91f579e4 src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java (Hongchao Deng 2015-08-17 20:52:07 + 421) notifyAll();
097b7979 zookeeper/java/src/com/yahoo/zookeeper/server/ZooKeeperServer.java (Benjamin Reed 2008-05-12 23:01:25 + 422) }
{noformat}
{noformat}
commit 91f579e40755de870ed9123c8fd55925517d9aa6
Author: Hongchao Deng
Date: Mon Aug 17 20:52:07 2015 +
ZOOKEEPER-1907 Improve Thread handling (Rakesh R via hdeng)
git-svn-id: https://svn.apache.org/repos/asf/zookeeper/branches/branch-3.4@1696337 13f79535-47bb-0310-9956-ffa450edef68
{noformat}
[~rakesh_r] could you have a look, please? CC [~rgs] [~phunt]
> Startup race in ZooKeeperServer > --- > > Key: ZOOKEEPER-2383 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383 > Project: ZooKeeper > Issue Type: Bug > Components: jmx, server >Affects Versions: 3.4.8 >Reporter: Steve Rowe > Attachments: TestZkStandaloneJMXRegistrationRaceConcurrent.java, > release-3.4.8-extra-logging.patch, zk-3.4.8-MBeanRegistry.log, > zk-3.4.8-NPE.log > > > In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 > (SOLR-8724) I ran into test failures where attempts to create a node in a > newly started standalone ZooKeeperServer were failing because of an assertion > in MBeanRegistry. > ZooKeeperServer.startup() first sets up its request processor chain then > registers itself in JMX, but if a connection comes in before the server's JMX > registration happens, registration of the connection will fail because it > trips the assertion that (effectively) its parent (the server) has already > registered itself.
> {code:java|title=ZooKeeperServer.java} > public synchronized void startup() { > if (sessionTracker == null) { > createSessionTracker(); > } > startSessionTracker(); > setupRequestProcessors(); > registerJMX(); > state = State.RUNNING; > notifyAll(); > } > {code} > {code:java|title=MBeanRegistry.java} > public void register(ZKMBeanInfo bean, ZKMBeanInfo parent) > throws JMException > { > assert bean != null; > String path = null; > if (parent != null) { > path = mapBean2Path.get(parent); > assert path != null; > } > {code} > This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this > issue with ZK 3.4.6. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (ZOOKEEPER-2383) Startup race in ZooKeeperServer
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated ZOOKEEPER-2383: -- Attachment: zk-3.4.8-NPE.log Sometimes the attached test case will trigger a NullPointerException in ZooKeeperServer.createSession() - see attached zk-3.4.8-NPE.log. For some reason I never saw this failure mode in Solr tests using ZK 3.4.8. {noformat} 2016-03-08 11:29:00,374 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:5:NIOServerCnxnFactory@213] - Ignoring unexpected runtime exception java.lang.NullPointerException at org.apache.zookeeper.server.ZooKeeperServer.createSession(ZooKeeperServer.java:569) at org.apache.zookeeper.server.ZooKeeperServer.processConnectRequest(ZooKeeperServer.java:902) at org.apache.zookeeper.server.NIOServerCnxn.readConnectRequest(NIOServerCnxn.java:418) at org.apache.zookeeper.server.NIOServerCnxn.readPayload(NIOServerCnxn.java:198) at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:244) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203) at java.lang.Thread.run(Thread.java:745) {noformat} > Startup race in ZooKeeperServer > --- > > Key: ZOOKEEPER-2383 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383 > Project: ZooKeeper > Issue Type: Bug > Components: jmx, server >Affects Versions: 3.4.8 >Reporter: Steve Rowe > Attachments: TestZkStandaloneJMXRegistrationRaceConcurrent.java, > release-3.4.8-extra-logging.patch, zk-3.4.8-MBeanRegistry.log, > zk-3.4.8-NPE.log > > > In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 > (SOLR-8724) I ran into test failures where attempts to create a node in a > newly started standalone ZooKeeperServer were failing because of an assertion > in MBeanRegistry.
> ZooKeeperServer.startup() first sets up its request processor chain then
> registers itself in JMX, but if a connection comes in before the server's JMX
> registration happens, registration of the connection will fail because it
> trips the assertion that (effectively) its parent (the server) has already
> registered itself.
> {code:java|title=ZooKeeperServer.java}
>   public synchronized void startup() {
>     if (sessionTracker == null) {
>       createSessionTracker();
>     }
>     startSessionTracker();
>     setupRequestProcessors();
>     registerJMX();
>     state = State.RUNNING;
>     notifyAll();
>   }
> {code}
> {code:java|title=MBeanRegistry.java}
>   public void register(ZKMBeanInfo bean, ZKMBeanInfo parent)
>       throws JMException
>   {
>     assert bean != null;
>     String path = null;
>     if (parent != null) {
>       path = mapBean2Path.get(parent);
>       assert path != null;
>     }
> {code}
> This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this
> issue with ZK 3.4.6.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
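
The parent-before-child requirement described in the quoted report can be illustrated with a toy model. The sketch below is not ZooKeeper's MBeanRegistry: `ToyBeanRegistry`, its string-keyed map, and the bean names are hypothetical stand-ins, and an exception replaces the `assert` so the failure shows up without `-ea`. It only demonstrates that a child registered before its parent finds no parent path, while the same call succeeds once the parent is in place.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy stand-in for a parent/child bean registry (hypothetical, not ZooKeeper code). */
final class ToyBeanRegistry {
    // Maps a bean name to the path under which it was registered.
    private final Map<String, String> beanToPath = new ConcurrentHashMap<>();

    /** Registers bean under parent; fails if the parent has no path yet. */
    void register(String bean, String parent) {
        String path = "/";
        if (parent != null) {
            String parentPath = beanToPath.get(parent);
            if (parentPath == null) {
                // This is the situation the assertion in the report guards against.
                throw new IllegalStateException("parent '" + parent + "' not registered yet");
            }
            path = parentPath.equals("/") ? "/" + parent : parentPath + "/" + parent;
        }
        beanToPath.put(bean, path);
    }

    /** Returns the recorded path, or null if the bean was never registered. */
    String pathOf(String bean) {
        return beanToPath.get(bean);
    }

    public static void main(String[] args) {
        ToyBeanRegistry registry = new ToyBeanRegistry();
        try {
            registry.register("Connections", "StandaloneServer"); // child arrives first
        } catch (IllegalStateException e) {
            System.out.println("child-first registration failed: " + e.getMessage());
        }
        registry.register("StandaloneServer", null);          // parent registers
        registry.register("Connections", "StandaloneServer"); // now succeeds
        System.out.println("path: " + registry.pathOf("Connections"));
    }
}
```

In this model the outcome depends only on which call reaches the registry first, which is the timing the report describes.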
[jira] [Updated] (ZOOKEEPER-2383) Startup race in ZooKeeperServer
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Rowe updated ZOOKEEPER-2383:
---
    Attachment: release-3.4.8-extra-logging.patch
                zk-3.4.8-MBeanRegistry.log
                TestZkStandaloneJMXRegistrationRaceConcurrent.java

Attaching files:

Test case: TestZkStandaloneJMXRegistrationRaceConcurrent.java
ZK 3.4.8 source patch to add JMX logging: release-3.4.8-extra-logging.patch
Log file exhibiting failure: zk-3.4.8-MBeanRegistry.log

> Startup race in ZooKeeperServer
> ---
>
>                 Key: ZOOKEEPER-2383
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: jmx, server
>    Affects Versions: 3.4.8
>            Reporter: Steve Rowe
>         Attachments: TestZkStandaloneJMXRegistrationRaceConcurrent.java,
> release-3.4.8-extra-logging.patch, zk-3.4.8-MBeanRegistry.log
>
>
> In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8
> (SOLR-8724) I ran into test failures where attempts to create a node in a
> newly started standalone ZooKeeperServer were failing because of an assertion
> in MBeanRegistry.
> ZooKeeperServer.startup() first sets up its request processor chain then
> registers itself in JMX, but if a connection comes in before the server's JMX
> registration happens, registration of the connection will fail because it
> trips the assertion that (effectively) its parent (the server) has already
> registered itself.
[jira] [Commented] (ZOOKEEPER-2383) Startup race in ZooKeeperServer
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185285#comment-15185285 ]

Steve Rowe commented on ZOOKEEPER-2383:
---

This program triggers the problem for me roughly 10% of the time with ZK 3.4.8 - note that if I don't use a thread to start ZooKeeperServer, the connection always comes in after the server has had a chance to register itself with JMX (imports omitted - attaching full file here in a sec):

{code:java|title=TestZkStandaloneJMXRegistrationRaceConcurrent.java}
public class TestZkStandaloneJMXRegistrationRaceConcurrent {
  public static void main(String[] args) throws IOException, InterruptedException, KeeperException {
    class ServerThread extends Thread {
      private ZooKeeperServer server;
      private ServerCnxnFactory cnxnFactory;

      @Override
      public void run() {
        try {
          File tempDir = Files.createTempDirectory(FileSystems.getDefault().getPath("."),"test").toFile();
          FileTxnSnapLog txnSnapLog = new FileTxnSnapLog(tempDir, tempDir);
          server = new ZooKeeperServer
              (txnSnapLog, 2000, 2000, 4000, null, new ZKDatabase(txnSnapLog));
          cnxnFactory = ServerCnxnFactory.createFactory(5, -1);
          cnxnFactory.startup(server);
        } catch (IOException e) {
          throw new RuntimeException(e);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }

      public void shutdown() throws IOException, InterruptedException {
        cnxnFactory.shutdown();
        cnxnFactory.join();
        server.shutdown();
      }
    }

    ServerThread serverThread = new ServerThread();
    serverThread.setDaemon(true);
    serverThread.start();
    Thread.sleep(3);
    ZooKeeper zk = new ZooKeeper("127.0.0.1:5", 45000, new Watcher() {
      public void process(WatchedEvent event) {}
    });
    zk.create("/testing123", new byte[]{}, Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
    serverThread.shutdown();
    serverThread.join();
  }
}
{code}

Here's an excerpt from a log exhibiting the failure - I'll also attach the full log (I've added some logging to ZK 3.4.8 - I'll attach a patch showing those additions here in a minute):

{noformat}
2016-03-08 11:32:08,414 [myid:] - WARN  [SyncThread:0:MBeanRegistry@100] - bean 'Connections/127.0.0.1/0x153571244a7' with parent 'StandaloneServer_port5' has null path.
java.lang.Throwable:
	at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:98)
	at org.apache.zookeeper.server.ServerCnxnFactory.registerConnection(ServerCnxnFactory.java:147)
	at org.apache.zookeeper.server.ZooKeeperServer.finishSessionInit(ZooKeeperServer.java:613)
	at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:181)
	at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200)
	at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131)
2016-03-08 11:32:08,414 [myid:] - WARN  [Thread-0:MBeanRegistry@118] - registered bean 'StandaloneServer_port5' with parent 'null' at path '/'
java.lang.Throwable:
	at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:116)
	at org.apache.zookeeper.server.ZooKeeperServer.registerJMX(ZooKeeperServer.java:385)
	at org.apache.zookeeper.server.ZooKeeperServer.startup(ZooKeeperServer.java:418)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:119)
	at TestZkStandaloneJMXRegistrationRaceConcurrent$1ServerThread.run(TestZkStandaloneJMXRegistrationRaceConcurrent.java:29)
2016-03-08 11:32:08,415 [myid:] - ERROR [SyncThread:0:ZooKeeperCriticalThread@49] - Severe unrecoverable error, from thread : SyncThread:0
java.lang.AssertionError
	at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:104)
	at org.apache.zookeeper.server.ServerCnxnFactory.registerConnection(ServerCnxnFactory.java:147)
	at org.apache.zookeeper.server.ZooKeeperServer.finishSessionInit(ZooKeeperServer.java:613)
	at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:181)
	at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200)
	at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131)
2016-03-08 11:32:08,416 [myid:] - WARN  [Thread-0:MBeanRegistry@118] - registered bean 'InMemoryDataTree' with parent 'StandaloneServer_port5' at path '/StandaloneServer_port5'
java.lang.Throwable:
	at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:116)
	at org.apache.zookeeper.server.ZooKeeperServer.registerJMX(ZooKeeperServer.java:389)
	at
[jira] [Created] (ZOOKEEPER-2383) Startup race in ZooKeeperServer
Steve Rowe created ZOOKEEPER-2383:
---
             Summary: Startup race in ZooKeeperServer
                 Key: ZOOKEEPER-2383
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383
             Project: ZooKeeper
          Issue Type: Bug
          Components: jmx, server
    Affects Versions: 3.4.8
            Reporter: Steve Rowe

In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8 (SOLR-8724) I ran into test failures where attempts to create a node in a newly started standalone ZooKeeperServer were failing because of an assertion in MBeanRegistry.

ZooKeeperServer.startup() first sets up its request processor chain then registers itself in JMX, but if a connection comes in before the server's JMX registration happens, registration of the connection will fail because it trips the assertion that (effectively) its parent (the server) has already registered itself.

{code:java|title=ZooKeeperServer.java}
  public synchronized void startup() {
    if (sessionTracker == null) {
      createSessionTracker();
    }
    startSessionTracker();
    setupRequestProcessors();
    registerJMX();
    state = State.RUNNING;
    notifyAll();
  }
{code}

{code:java|title=MBeanRegistry.java}
  public void register(ZKMBeanInfo bean, ZKMBeanInfo parent)
      throws JMException
  {
    assert bean != null;
    String path = null;
    if (parent != null) {
      path = mapBean2Path.get(parent);
      assert path != null;
    }
{code}

This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this issue with ZK 3.4.6.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
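
One general way to rule out the ordering described in the report is to make connection handling wait until startup has finished. The sketch below illustrates that idea with a `CountDownLatch`. It is not the fix adopted for this issue, and it is not ZooKeeper code: `GatedStartup`, `handleConnection`, and the recorded event names are hypothetical, chosen only to mirror the steps named in the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: connection handling is gated on a "startup finished" latch. */
final class GatedStartup {
    private final CountDownLatch started = new CountDownLatch(1);
    private final List<String> events = new ArrayList<>();

    /** Runs the startup steps in order, then opens the gate. */
    void startup() {
        record("setupRequestProcessors");
        record("registerJMX");
        started.countDown(); // only now may connections register themselves
    }

    /**
     * Waits up to timeoutMillis for startup to finish before registering the
     * connection. Returns false if startup did not finish in time or the
     * waiting thread was interrupted.
     */
    boolean handleConnection(long timeoutMillis) {
        try {
            if (!started.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        record("registerConnection");
        return true;
    }

    private synchronized void record(String event) {
        events.add(event);
    }

    /** Returns a copy of the events recorded so far, in order. */
    synchronized List<String> events() {
        return new ArrayList<>(events);
    }
}
```

With this arrangement a connection that arrives early either waits or is turned away, so `registerConnection` can never be recorded ahead of `registerJMX`. Whether blocking or rejecting early connections is acceptable depends on the server, which this sketch does not address.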