[
https://issues.apache.org/jira/browse/ZOOKEEPER-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17141173#comment-17141173
]
Chevaris commented on ZOOKEEPER-3803:
-------------------------------------
I have double-checked the ticket again and I think the problem is in Curator's
code.
Curator's TestingZooKeeperServer starts a ZooKeeperServer in a new thread. The
server code is in the internalRunFromConfig method of Curator's
TestingZooKeeperMain. When the thread started to run the server is interrupted,
the code closes txnLog:
    } finally {
        if (txnLog != null) {
            txnLog.close();
        }
    }
On the other hand, Curator's TestingZooKeeperMain#close calls the ZooKeeper
server's shutdown method.
So there is a race condition: if the finally block mentioned above runs on the
first thread before the server shutdown (invoked by close on another thread),
you get the NPE.
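To make the ordering concrete, here is a minimal sketch of the failing
sequence. It does not use the real Curator or ZooKeeper classes: FakeTxnLog and
CloseShutdownRaceSketch are hypothetical stand-ins, and the join() is only
there to force the "finally runs first" ordering that happens by chance in the
real race.

```java
public class CloseShutdownRaceSketch {

    /** Hypothetical stand-in for a transaction log that drops its state on close. */
    static final class FakeTxnLog {
        private volatile StringBuilder buffer = new StringBuilder("edits");

        void close() {
            buffer = null;
        }

        int fastForward() {
            // Dereferences the field; throws NPE once close() has run.
            return buffer.length();
        }
    }

    /** Returns true if the shutdown path hit a NullPointerException. */
    static boolean shutdownAfterFinallyThrowsNpe() {
        FakeTxnLog txnLog = new FakeTxnLog();

        // "Server" thread: parks until interrupted, then closes the log in finally.
        Thread serverThread = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                txnLog.close();
            }
        });
        serverThread.start();
        serverThread.interrupt();
        try {
            // Force the ordering: the finally block completes before shutdown.
            serverThread.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }

        // "Shutdown" on the calling thread still tries to use the closed log.
        try {
            txnLog.fastForward();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE on shutdown: " + shutdownAfterFinallyThrowsNpe());
    }
}
```

With the ordering forced this way the sketch prints "NPE on shutdown: true".
In the real code the ordering is not forced, which is why the failure only
shows up intermittently.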
By comparison, ZooKeeperServerMain runs all the logic in a single thread: the
server starts, the thread blocks, and when shutdownLatch is released the
shutdown runs in that same thread.
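A rough sketch of that single-thread pattern, again with hypothetical names
(SingleThreadLifecycleSketch and the event strings are mine, not ZooKeeper's):
because start, shutdown and the final close all run on one thread, the close
cannot overtake the shutdown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class SingleThreadLifecycleSketch {

    /** Runs the whole lifecycle on one thread and returns the recorded event order. */
    static List<String> runAndStop() {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch shutdownLatch = new CountDownLatch(1);

        Thread serverThread = new Thread(() -> {
            events.add("start");
            try {
                // Block until someone asks for shutdown.
                shutdownLatch.await();
                // Shutdown runs here, on the same thread that started the server.
                events.add("shutdown");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                // The log is closed only after shutdown has finished.
                events.add("close txnLog");
            }
        });
        serverThread.start();

        // Another thread only releases the latch; it does not run shutdown itself.
        shutdownLatch.countDown();
        try {
            serverThread.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        System.out.println(runAndStop());
    }
}
```

The design point is that the closing thread only signals; every step that
touches the log stays on the server thread.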
> FileTxnSnapLog.fastForwardFromEdits() throws NPE if TestingServer is started
> from another thread
> ------------------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-3803
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3803
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.4.13
> Reporter: Vova Vysotskyi
> Priority: Major
> Time Spent: 1h
> Remaining Estimate: 0h
>
> For the case when {{TestingServer.start()}} and {{TestingServer.close()}}
> methods are running in different threads (but {{TestingServer.close()}} is
> executed after {{TestingServer.start()}} is completed),
> {{FileTxnSnapLog.fastForwardFromEdits()}} throws NPE, since
> {{FileTxnSnapLog.close()}} was already called in
> {{TestingZooKeeperMain.internalRunFromConfig()}} method.
> Such a case may be observed in unit tests when start and close methods are
> called in methods annotated with {{@Before}} and {{@After}} annotations.
> Here is a simple test which helps to reproduce this issue:
> {code:java}
> @Test
> public void testNPE() throws Exception {
>   for (int i = 0; i < 100; i++) {
>     TestingServer testingServer = new TestingServer();
>     Thread thread = new Thread(() -> {
>       try {
>         testingServer.start();
>       } catch (Exception e) {
>         throw new RuntimeException(e);
>       }
>     });
>     thread.start();
>     thread.join();
>     testingServer.close();
>   }
> }
> {code}
> The stack trace is the following:
> {noformat}
> java.lang.NullPointerException
>   at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:269)
>   at org.apache.zookeeper.server.ZKDatabase.fastForwardDataBase(ZKDatabase.java:251)
>   at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:583)
>   at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:546)
>   at org.apache.zookeeper.server.NIOServerCnxnFactory.shutdown(NIOServerCnxnFactory.java:929)
>   at org.apache.curator.test.TestingZooKeeperMain.close(TestingZooKeeperMain.java:178)
>   at org.apache.curator.test.TestingZooKeeperServer.stop(TestingZooKeeperServer.java:118)
>   at org.apache.curator.test.TestingZooKeeperServer.close(TestingZooKeeperServer.java:130)
>   at org.apache.curator.test.TestingServer.close(TestingServer.java:178)
>   at org.apache.drill.exec.coord.zk.TestZookeeperClient.testNPE(TestZookeeperClient.java:109)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at mockit.integration.junit4.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:157)
>   at mockit.integration.junit4.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:71)
>   at mockit.integration.junit4.FakeFrameworkMethod.invokeExplosively(FakeFrameworkMethod.java:29)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>   at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
>   at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
>   at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
> {noformat}
> Looks like this NPE is a regression after ZOOKEEPER-2845, where a class field
> was used for {{txnLog}} instead of a local variable.