[ 
https://issues.apache.org/jira/browse/SOLR-16848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17736246#comment-17736246
 ] 

Paul McArthur commented on SOLR-16848:
--------------------------------------

I've been able to reproduce this failure consistently, and I can confirm that 
your diagnosis is correct.

In our case, we've changed the CoreContainer to load cores synchronously, and 
that caused this test to fail consistently by altering the race condition 
slightly so that the `testing_beforeRegisterInZk` code was much more likely to 
be executed before the JettySolrRunner was fully initialized.

 

I believe your proposed solution will work. I implemented a similar fix for 
that code to wait for `JettySolrRunner.isRunning()` to return true, and that 
seems to have resolved the problem for us.

 

> Flaky DeleteReplicaTest.raceConditionOnDeleteAndRegisterReplica
> ---------------------------------------------------------------
>
>                 Key: SOLR-16848
>                 URL: https://issues.apache.org/jira/browse/SOLR-16848
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Alex Deparvu
>            Priority: Minor
>
> Some stats first:
> - Past 7 days trend:
> {noformat}
> Class: org.apache.solr.cloud.DeleteReplicaTest
> Method: raceConditionOnDeleteAndRegisterReplica
> Failures: 11.22% (44 / 392)
> {noformat}
> - Test failure is caused by a NullPointerException:
> {noformat}
> ERROR (coreZkRegister-772-thread-1-processing-127.0.0.1:40471_solr) 
> [n:127.0.0.1:40471_solr c:raceDeleteReplicaCollection s:shard1 r:core_node4 
> x:raceDeleteReplicaCollection_shard1_replica_n2] o.a.s.c.DeleteReplicaTest 
> Failed to delete replica
>  => java.lang.NullPointerException: Cannot invoke 
> "org.apache.solr.core.CoreContainer.getZkController()" because the return 
> value of "org.apache.solr.embedded.JettySolrRunner.getCoreContainer()" is null
> {noformat}
> a more complete trace 
> {noformat}
> o.a.s.c.DeleteReplicaTest Failed to delete replica
>   2>           => java.lang.NullPointerException
>   2>    at 
> org.apache.solr.cloud.DeleteReplicaTest.lambda$raceConditionOnDeleteAndRegisterReplica$10(DeleteReplicaTest.java:350)
>   2> java.lang.NullPointerException: null
>   2>    at 
> org.apache.solr.cloud.DeleteReplicaTest.lambda$raceConditionOnDeleteAndRegisterReplica$10(DeleteReplicaTest.java:350)
>   2>    at 
> org.apache.solr.core.ZkContainer.lambda$registerInZk$1(ZkContainer.java:211) 
> [main/:?]
>   2>    at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:289)
>   2>    at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
>   2>    at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
>   2>    at java.lang.Thread.run(Thread.java:829) [?:?]
> o.a.s.c.u.ExecutorUtil Uncaught exception java.lang.AssertionError: Failed to 
> delete replica thrown by thread: 
> coreZkRegister-1586-thread-1-processing-127.0.0.1:34497_solr
>   2>           => java.lang.Exception: Submitter stack trace
>   2>  at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.execute(ExecutorUtil.java:256)
> java.lang.Exception: Submitter stack trace
>   2>  at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.execute(ExecutorUtil.java:256)
>   2>  at org.apache.solr.core.ZkContainer.registerInZk(ZkContainer.java:244) 
> ~[main/:?]
>   2>  at 
> org.apache.solr.core.CoreContainer.lambda$loadInternal$12(CoreContainer.java:1025)
>   2>  at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:234)
>   2>  at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
>   2>  at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:289)
>   2>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
>   2>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
>   2>  at java.lang.Thread.run(Thread.java:829) [?:?]
> Caused by: java.lang.Exception: Submitter stack trace
>   2>  at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.execute(ExecutorUtil.java:256)
>   2>  at 
> java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:140)
>  ~[?:?]
>   2>  at 
> com.codahale.metrics.InstrumentedExecutorService.submit(InstrumentedExecutorService.java:122)
>   2>  at 
> org.apache.solr.core.CoreContainer.loadInternal(CoreContainer.java:1009) 
> ~[main/:?]
>   2>  at org.apache.solr.core.CoreContainer.load(CoreContainer.java:753)
>   2>  at 
> org.apache.solr.servlet.CoreContainerProvider.createCoreContainer(CoreContainerProvider.java:411)
>   2>  at 
> org.apache.solr.servlet.CoreContainerProvider.init(CoreContainerProvider.java:230)
>   2>  at 
> org.apache.solr.embedded.JettySolrRunner$1.lifeCycleStarted(JettySolrRunner.java:407)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Reply via email to