[ 
https://issues.apache.org/jira/browse/SOLR-15254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17655076#comment-17655076
 ] 

Uwe Schindler commented on SOLR-15254:
--------------------------------------

Yes, from the first response I figured, too. I was searching for issues about 
this in Lucene, but I forgot to check Solr.

It could have been fixed much earlier, as all other SPIs use the lazy init 
approach, the analysis ones were the only ones left. My fault, sorry!


> Random deadlock during cores initialization
> -------------------------------------------
>
>                 Key: SOLR-15254
>                 URL: https://issues.apache.org/jira/browse/SOLR-15254
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 8.5.1
>         Environment: openjdk 11.0.9.1
>            Reporter: Thomas Mortagne
>            Priority: Major
>
> At XWiki we are using Solr in an embbedded way with 5 cores, and sometimes we 
> end up with what appears to be a race condition leading to a deadlock between 
> several of the cores's configuration loading while loading some required 
> classes.
> Here is an extract of a thread dump (also attached is a more complete version 
> of the thread dump) we get in such a case:
> {noformat}
> "coreLoadExecutor-9-thread-1" #41 prio=5 os_prio=0 cpu=305.06ms 
> elapsed=101.24s tid=0x00007fb5cfc8f800 nid=0x57a in Object.wait()  
> [0x00007fb5717ec000]
>    java.lang.Thread.State: RUNNABLE
>       at java.lang.Class.forName0(java.base@11.0.9.1/Native Method)
>       at java.lang.Class.forName(java.base@11.0.9.1/Class.java:398)
>       at 
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:554)
>       at 
> org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:639)
>       at 
> org.apache.solr.schema.FieldTypePluginLoader$3.create(FieldTypePluginLoader.java:339)
>       at 
> org.apache.solr.schema.FieldTypePluginLoader$3.create(FieldTypePluginLoader.java:332)
>       at 
> org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:152)
>       at 
> org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:356)
>       at 
> org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:90)
>       at 
> org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:48)
>       at 
> org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:152)
>       at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:502)
>       at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:178)
>       at 
> org.apache.solr.schema.ManagedIndexSchema.<init>(ManagedIndexSchema.java:105)
>       at 
> org.apache.solr.schema.ManagedIndexSchemaFactory.create(ManagedIndexSchemaFactory.java:178)
>       at 
> org.apache.solr.schema.ManagedIndexSchemaFactory.create(ManagedIndexSchemaFactory.java:46)
>       at 
> org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:145)
>       at 
> org.apache.solr.core.ConfigSetService.loadConfigSet(ConfigSetService.java:83)
>       at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1302)
>       at 
> org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:802)
>       at 
> org.apache.solr.core.CoreContainer$$Lambda$569/0x0000000840bc7040.call(Unknown
>  Source)
>       at 
> org.glowroot.agent.plugin.executor.CallableWrapper.call(CallableWrapper.java:38)
>       at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202)
>       at 
> java.util.concurrent.FutureTask.run(java.base@11.0.9.1/FutureTask.java:264)
>       at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
>       at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$438/0x0000000840b29440.run(Unknown
>  Source)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.9.1/ThreadPoolExecutor.java:1128)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.9.1/ThreadPoolExecutor.java:628)
>       at java.lang.Thread.run(java.base@11.0.9.1/Thread.java:834)
> "coreLoadExecutor-9-thread-3" #43 prio=5 os_prio=0 cpu=294.37ms 
> elapsed=101.24s tid=0x00007fb5cfcdf800 nid=0x57c in Object.wait()  
> [0x00007fb5715ea000]
>    java.lang.Thread.State: RUNNABLE
>       at java.lang.Class.forName0(java.base@11.0.9.1/Native Method)
>       at java.lang.Class.forName(java.base@11.0.9.1/Class.java:398)
>       at 
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:554)
>       at 
> org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:639)
>       at 
> org.apache.solr.schema.FieldTypePluginLoader$3.create(FieldTypePluginLoader.java:339)
>       at 
> org.apache.solr.schema.FieldTypePluginLoader$3.create(FieldTypePluginLoader.java:332)
>       at 
> org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:152)
>       at 
> org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:356)
>       at 
> org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:90)
>       at 
> org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:48)
>       at 
> org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:152)
>       at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:502)
>       at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:178)
>       at 
> org.apache.solr.schema.ManagedIndexSchema.<init>(ManagedIndexSchema.java:105)
>       at 
> org.apache.solr.schema.ManagedIndexSchemaFactory.create(ManagedIndexSchemaFactory.java:178)
>       at 
> org.apache.solr.schema.ManagedIndexSchemaFactory.create(ManagedIndexSchemaFactory.java:46)
>       at 
> org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:145)
>       at 
> org.apache.solr.core.ConfigSetService.loadConfigSet(ConfigSetService.java:83)
>       at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1302)
>       at 
> org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:802)
>       at 
> org.apache.solr.core.CoreContainer$$Lambda$569/0x0000000840bc7040.call(Unknown
>  Source)
>       at 
> org.glowroot.agent.plugin.executor.CallableWrapper.call(CallableWrapper.java:38)
>       at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202)
>       at 
> java.util.concurrent.FutureTask.run(java.base@11.0.9.1/FutureTask.java:264)
>       at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
>       at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$438/0x0000000840b29440.run(Unknown
>  Source)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.9.1/ThreadPoolExecutor.java:1128)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.9.1/ThreadPoolExecutor.java:628)
>       at java.lang.Thread.run(java.base@11.0.9.1/Thread.java:834)
> "coreLoadExecutor-9-thread-2" #42 prio=5 os_prio=0 cpu=388.81ms 
> elapsed=101.24s tid=0x00007fb5cfcde800 nid=0x57b in Object.wait()  
> [0x00007fb5716eb000]
>    java.lang.Thread.State: RUNNABLE
>       at 
> jdk.internal.misc.Unsafe.ensureClassInitialized0(java.base@11.0.9.1/Native 
> Method)
>       at 
> jdk.internal.misc.Unsafe.ensureClassInitialized(java.base@11.0.9.1/Unsafe.java:1042)
>       at 
> jdk.internal.reflect.UnsafeFieldAccessorFactory.newFieldAccessor(java.base@11.0.9.1/UnsafeFieldAccessorFactory.java:43)
>       at 
> jdk.internal.reflect.ReflectionFactory.newFieldAccessor(java.base@11.0.9.1/ReflectionFactory.java:186)
>       at 
> java.lang.reflect.Field.acquireFieldAccessor(java.base@11.0.9.1/Field.java:1105)
>       at 
> java.lang.reflect.Field.getFieldAccessor(java.base@11.0.9.1/Field.java:1086)
>       at java.lang.reflect.Field.get(java.base@11.0.9.1/Field.java:418)
>       at 
> org.apache.lucene.analysis.util.AbstractAnalysisFactory.lookupSPIName(AbstractAnalysisFactory.java:358)
>       at 
> org.apache.lucene.analysis.util.AnalysisSPILoader.reload(AnalysisSPILoader.java:90)
>       - locked <0x00000000e98e20a0> (a 
> org.apache.lucene.analysis.util.AnalysisSPILoader)
>       at 
> org.apache.lucene.analysis.util.AnalysisSPILoader.<init>(AnalysisSPILoader.java:66)
>       at 
> org.apache.lucene.analysis.util.AnalysisSPILoader.<init>(AnalysisSPILoader.java:52)
>       at 
> org.apache.lucene.analysis.util.TokenFilterFactory.<clinit>(TokenFilterFactory.java:33)
>       at 
> org.apache.solr.core.SolrResourceLoader.reloadLuceneSPI(SolrResourceLoader.java:256)
>       at 
> org.apache.solr.core.SolrResourceLoader.<init>(SolrResourceLoader.java:196)
>       at 
> org.apache.solr.core.ConfigSetService$Standalone.createCoreResourceLoader(ConfigSetService.java:207)
>       at 
> org.apache.solr.core.ConfigSetService.loadConfigSet(ConfigSetService.java:66)
>       at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1302)
>       at 
> org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:802)
>       at 
> org.apache.solr.core.CoreContainer$$Lambda$569/0x0000000840bc7040.call(Unknown
>  Source)
>       at 
> org.glowroot.agent.plugin.executor.CallableWrapper.call(CallableWrapper.java:38)
>       at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202)
>       at 
> java.util.concurrent.FutureTask.run(java.base@11.0.9.1/FutureTask.java:264)
>       at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
>       at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$438/0x0000000840b29440.run(Unknown
>  Source)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.9.1/ThreadPoolExecutor.java:1128)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.9.1/ThreadPoolExecutor.java:628)
>       at java.lang.Thread.run(java.base@11.0.9.1/Thread.java:834)
> "main" #1 prio=5 os_prio=0 cpu=25108.09ms elapsed=133.50s 
> tid=0x00007fb5cc018800 nid=0x541 waiting on condition  [0x00007fb5d2f73000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>       at jdk.internal.misc.Unsafe.park(java.base@11.0.9.1/Native Method)
>       - parking to wait for  <0x00000000ea480000> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at 
> java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.9.1/LockSupport.java:234)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(java.base@11.0.9.1/AbstractQueuedSynchronizer.java:2123)
>       at 
> java.util.concurrent.ThreadPoolExecutor.awaitTermination(java.base@11.0.9.1/ThreadPoolExecutor.java:1454)
>       at 
> com.codahale.metrics.InstrumentedExecutorService.awaitTermination(InstrumentedExecutorService.java:163)
>       at 
> org.apache.solr.common.util.ExecutorUtil.awaitTermination(ExecutorUtil.java:85)
>       at 
> org.apache.solr.common.util.ExecutorUtil.shutdownAndAwaitTermination(ExecutorUtil.java:77)
>       at org.apache.solr.core.CoreContainer.load(CoreContainer.java:843)
> "h2sc-4-thread-1" #38 prio=5 os_prio=0 cpu=0.20ms elapsed=101.89s 
> tid=0x00007fb5cfc16800 nid=0x577 runnable  [0x00007fb571cf0000]
>    java.lang.Thread.State: RUNNABLE
>       at sun.nio.ch.EPoll.wait(java.base@11.0.9.1/Native Method)
>       at 
> sun.nio.ch.EPollSelectorImpl.doSelect(java.base@11.0.9.1/EPollSelectorImpl.java:120)
>       at 
> sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@11.0.9.1/SelectorImpl.java:124)
>       - locked <0x00000000e9819cf0> (a sun.nio.ch.Util$2)
>       - locked <0x00000000e9819c98> (a sun.nio.ch.EPollSelectorImpl)
>       at 
> sun.nio.ch.SelectorImpl.select(java.base@11.0.9.1/SelectorImpl.java:141)
>       at 
> org.eclipse.jetty.io.ManagedSelector.nioSelect(ManagedSelector.java:149)
>       at org.eclipse.jetty.io.ManagedSelector.select(ManagedSelector.java:156)
>       at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer.select(ManagedSelector.java:572)
>       at 
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:509)
>       at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360)
>       at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184)
>       at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
>       at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:135)
>       at 
> org.eclipse.jetty.io.ManagedSelector$$Lambda$437/0x0000000840b28c40.run(Unknown
>  Source)
>       at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
>       at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$438/0x0000000840b29440.run(Unknown
>  Source)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.9.1/ThreadPoolExecutor.java:1128
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.9.1/ThreadPoolExecutor.java:628)
>       at java.lang.Thread.run(java.base@11.0.9.1/Thread.java:834)
> {noformat}
> Notice the coreLoadExecutor-9-thread-* thread which despite their "RUNNABLE" 
> state are actually internally blocked waiting on some unknown object. The 
> main thread is stuck because it's waiting for the cores's initialization to 
> finish.
> I included "h2sc-4-thread-1" to have all the threads doing stuff but it does 
> not seems to have much to do with the deadlock (other than being a Solr thing 
> too).
> It's not very easy to debug since we cannot always reproduce but maybe you 
> guys have a clue for where this issue might be comming from based on those 
> informations.
> When we hit this problem we can "workaround" it by restarting a few times 
> until it pass the Solr init.
> We never had this issue back when we only had 1 or 2 cores but having 5 of 
> those is very recent for us so I cannot really tell if it's some recent Solr 
> regression or an old bug we did not had before because we did not had enough 
> cores to trigger it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Reply via email to