[ https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185285#comment-15185285 ]
Steve Rowe commented on ZOOKEEPER-2383:
---------------------------------------

This program triggers the problem for me roughly 10% of the time with ZK 3.4.8. Note that if I don't use a thread to start ZooKeeperServer, the connection always comes in after the server has had a chance to register itself with JMX (I'll attach the full file here in a sec):

{code:java|title=TestZkStandaloneJMXRegistrationRaceConcurrent.java}
import java.io.File;
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.server.ServerCnxnFactory;
import org.apache.zookeeper.server.ZKDatabase;
import org.apache.zookeeper.server.ZooKeeperServer;
import org.apache.zookeeper.server.persistence.FileTxnSnapLog;

public class TestZkStandaloneJMXRegistrationRaceConcurrent {
  public static void main(String[] args) throws IOException, InterruptedException, KeeperException {
    // Starts a standalone server on its own thread so that a client can
    // connect while ZooKeeperServer.startup() is still in progress.
    class ServerThread extends Thread {
      private ZooKeeperServer server;
      private ServerCnxnFactory cnxnFactory;

      @Override
      public void run() {
        try {
          File tempDir = Files.createTempDirectory(FileSystems.getDefault().getPath("."), "test").toFile();
          FileTxnSnapLog txnSnapLog = new FileTxnSnapLog(tempDir, tempDir);
          server = new ZooKeeperServer(txnSnapLog, 2000, 2000, 4000, null, new ZKDatabase(txnSnapLog));
          cnxnFactory = ServerCnxnFactory.createFactory(55555, -1);
          cnxnFactory.startup(server);
        } catch (IOException e) {
          throw new RuntimeException(e);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }

      public void shutdown() throws IOException, InterruptedException {
        cnxnFactory.shutdown();
        cnxnFactory.join();
        server.shutdown();
      }
    }

    ServerThread serverThread = new ServerThread();
    serverThread.setDaemon(true);
    serverThread.start();
    // Give the server thread a 3 ms head start, then connect right away.
    Thread.sleep(3);
    ZooKeeper zk = new ZooKeeper("127.0.0.1:55555", 45000, new Watcher() {
      public void process(WatchedEvent event) {}
    });
    zk.create("/testing123", new byte[]{}, Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
    serverThread.shutdown();
    serverThread.join();
  }
}
{code}

Here's an excerpt from a log exhibiting the failure. I'll also attach the full log (I've added some logging to ZK 3.4.8; I'll attach a patch showing those additions here in a minute):

{noformat}
2016-03-08 11:32:08,414 [myid:] - WARN  [SyncThread:0:MBeanRegistry@100] - bean 'Connections/127.0.0.1/0x153571244a70000' with parent 'StandaloneServer_port55555' has null path.
java.lang.Throwable:
	at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:98)
	at org.apache.zookeeper.server.ServerCnxnFactory.registerConnection(ServerCnxnFactory.java:147)
	at org.apache.zookeeper.server.ZooKeeperServer.finishSessionInit(ZooKeeperServer.java:613)
	at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:181)
	at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200)
	at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131)
2016-03-08 11:32:08,414 [myid:] - WARN  [Thread-0:MBeanRegistry@118] - registered bean 'StandaloneServer_port55555' with parent 'null' at path '/'
java.lang.Throwable:
	at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:116)
	at org.apache.zookeeper.server.ZooKeeperServer.registerJMX(ZooKeeperServer.java:385)
	at org.apache.zookeeper.server.ZooKeeperServer.startup(ZooKeeperServer.java:418)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:119)
	at TestZkStandaloneJMXRegistrationRaceConcurrent$1ServerThread.run(TestZkStandaloneJMXRegistrationRaceConcurrent.java:29)
2016-03-08 11:32:08,415 [myid:] - ERROR [SyncThread:0:ZooKeeperCriticalThread@49] - Severe unrecoverable error, from thread : SyncThread:0
java.lang.AssertionError
	at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:104)
	at org.apache.zookeeper.server.ServerCnxnFactory.registerConnection(ServerCnxnFactory.java:147)
	at org.apache.zookeeper.server.ZooKeeperServer.finishSessionInit(ZooKeeperServer.java:613)
	at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:181)
	at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200)
	at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131)
2016-03-08 11:32:08,416 [myid:] - WARN  [Thread-0:MBeanRegistry@118] - registered bean 'InMemoryDataTree' with parent 'StandaloneServer_port55555' at path '/StandaloneServer_port55555'
java.lang.Throwable:
	at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:116)
	at org.apache.zookeeper.server.ZooKeeperServer.registerJMX(ZooKeeperServer.java:389)
	at org.apache.zookeeper.server.ZooKeeperServer.startup(ZooKeeperServer.java:418)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:119)
	at TestZkStandaloneJMXRegistrationRaceConcurrent$1ServerThread.run(TestZkStandaloneJMXRegistrationRaceConcurrent.java:29)
{noformat}

> Startup race in ZooKeeperServer
> -------------------------------
>
>                 Key: ZOOKEEPER-2383
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: jmx, server
>    Affects Versions: 3.4.8
>            Reporter: Steve Rowe
>
> In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8
> (SOLR-8724) I ran into test failures where attempts to create a node in a
> newly started standalone ZooKeeperServer were failing because of an assertion
> in MBeanRegistry.
> ZooKeeperServer.startup() first sets up its request processor chain then
> registers itself in JMX, but if a connection comes in before the server's JMX
> registration happens, registration of the connection will fail because it
> trips the assertion that (effectively) its parent (the server) has already
> registered itself.
> {code:java|title=ZooKeeperServer.java}
>     public synchronized void startup() {
>         if (sessionTracker == null) {
>             createSessionTracker();
>         }
>         startSessionTracker();
>         setupRequestProcessors();
>         registerJMX();
>         state = State.RUNNING;
>         notifyAll();
>     }
> {code}
> {code:java|title=MBeanRegistry.java}
>     public void register(ZKMBeanInfo bean, ZKMBeanInfo parent)
>         throws JMException
>     {
>         assert bean != null;
>         String path = null;
>         if (parent != null) {
>             path = mapBean2Path.get(parent);
>             assert path != null;
>         }
> {code}
> This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this
> issue with ZK 3.4.6.
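
The ordering problem described in the issue can be modelled without ZooKeeper at all. The sketch below is only an illustration under stated assumptions: `ToyRegistry`, `registerAfterParent`, and the bean names are hypothetical stand-ins, not ZooKeeper classes, and the latch is one generic way to enforce "parent first". It is not the fix applied in ZooKeeper. The toy registry returns `null` where the real `MBeanRegistry.register` trips its assertion.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Hypothetical toy model of the parent-before-child registration requirement.
class RegistrationOrderSketch {

    /** Toy stand-in for a bean registry: maps a bean name to its full path. */
    static final class ToyRegistry {
        private final Map<String, String> pathByBean = new ConcurrentHashMap<>();

        /** Returns the bean's path, or null if its parent is not registered yet. */
        String register(String bean, String parent) {
            String parentPath = "/";
            if (parent != null) {
                parentPath = pathByBean.get(parent);
                if (parentPath == null) {
                    return null; // the real registry asserts at this point
                }
            }
            String path = parentPath.endsWith("/") ? parentPath + bean : parentPath + "/" + bean;
            pathByBean.put(bean, path);
            return path;
        }
    }

    /** Registers the child only after a second thread has registered the parent. */
    static String registerAfterParent(ToyRegistry registry) {
        CountDownLatch parentRegistered = new CountDownLatch(1);
        Thread serverThread = new Thread(() -> {
            registry.register("Server", null);
            parentRegistered.countDown(); // signal: parent is now registered
        });
        serverThread.start();
        try {
            parentRegistered.await(); // block until the parent exists
            String path = registry.register("Connection", "Server");
            serverThread.join();
            return path;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Child arrives before the parent: registration cannot resolve a path.
        System.out.println(new ToyRegistry().register("Connection", "Server")); // prints "null"
        // Child waits for the parent: registration succeeds.
        System.out.println(registerAfterParent(new ToyRegistry())); // prints "/Server/Connection"
    }
}
```

In the reproduction program above, the `Thread.sleep(3)` plays the role of the missing synchronization: nothing forces the client's connection to wait until the server has registered itself.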