[
https://issues.apache.org/jira/browse/SOLR-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698257#comment-14698257
]
Yonik Seeley commented on SOLR-7836:
------------------------------------
I ran for a while on trunk after removing the extra sync from the normal "add"
case only, not the "addAndDelete" case.
I hit another deadlock.
This looks like the case Mark was looking for before, but with a different
thread holding the writer.
{code}
2> "WRITER5" ID=23 BLOCKED on java.lang.Object@5764facb owned by "WRITER4" ID=22
2> at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:593)
2> - blocked on java.lang.Object@5764facb
2> at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:95)
2> at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
2> at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalCommit(DistributedUpdateProcessor.java:1641)
2> at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1618)
2> at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:161)
2> at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:270)
[...]
2> "WRITER4" ID=22 TIMED_WAITING on java.lang.Object@5a94059a
2> at java.lang.Object.wait(Native Method)
2> - timed waiting on java.lang.Object@5a94059a
2> at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:96)
2> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1588)
2> at org.apache.solr.update.UpdateLog.add(UpdateLog.java:455)
2> - locked org.apache.solr.update.UpdateLog@5911991e
2> at org.apache.solr.update.DirectUpdateHandler2.addAndDelete(DirectUpdateHandler2.java:331)
2> - locked java.lang.Object@5764facb
2> at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:200)
2> at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164)
2> at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
[...]
2> "TEST-TestReloadDeadlock.testReloadDeadlock-seed#[D4E455E167793E1]" ID=12 TIMED_WAITING on java.lang.Object@5a94059a
2> at java.lang.Object.wait(Native Method)
2> - timed waiting on java.lang.Object@5a94059a
2> at org.apache.solr.update.DefaultSolrCoreState.newIndexWriter(DefaultSolrCoreState.java:156)
2> - locked org.apache.solr.update.DefaultSolrCoreState@1b88614d
2> at org.apache.solr.core.SolrCore.reload(SolrCore.java:479)
{code}
Writer5 wants to do a commit:
- calls solrCoreState.getIndexWriter()
- blocks waiting for updateLock
Writer4 wants to do an addAndDelete:
- acquires updateLock
- acquires UpdateLog.this
- calls DefaultSolrCoreState.getIndexWriter and waits forever
Main test thread wants to do a reload:
- calls DefaultSolrCoreState.newIndexWriter and waits forever
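
To make the cycle in the three traces explicit, here is a minimal wait-for-graph sketch. Only the first edge (WRITER5 blocked on a lock owned by WRITER4) is stated directly in the dump. The other two edges are assumptions based on my reading of the analysis above: that getIndexWriter waits while a reload is swapping the writer, and that newIndexWriter waits for the writer reference WRITER5 still holds. The class and method names (WaitForGraph, observed, findCycle) are hypothetical and not part of Solr.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class WaitForGraph {

    // Each entry reads "key waits for something held by value".
    static Map<String, String> observed() {
        Map<String, String> waitsFor = new LinkedHashMap<>();
        // From the dump: WRITER5 is BLOCKED on a lock owned by WRITER4.
        waitsFor.put("WRITER5", "WRITER4");
        // Assumption: getIndexWriter waits while the reload thread swaps the writer.
        waitsFor.put("WRITER4", "reload");
        // Assumption: newIndexWriter waits for the writer that WRITER5 still holds.
        waitsFor.put("reload", "WRITER5");
        return waitsFor;
    }

    // Follows waits-for edges from 'start'; returns the visited threads if the
    // walk comes back to 'start' (a deadlock cycle), otherwise an empty list.
    static List<String> findCycle(Map<String, String> waitsFor, String start) {
        List<String> path = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        String current = start;
        while (current != null && seen.add(current)) {
            path.add(current);
            current = waitsFor.get(current);
        }
        return start.equals(current) ? path : Collections.<String>emptyList();
    }

    public static void main(String[] args) {
        System.out.println(findCycle(observed(), "WRITER5")); // prints [WRITER5, WRITER4, reload]
    }
}
```

If any one of the three edges is removed, findCycle returns an empty list, which is the property a fix would need to establish.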
It feels like this type of deadlock can still be hit on current, unmodified
trunk? Perhaps the right solution was to just pass the IndexWriter down once
you acquire it.
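
A rough sketch of what "pass the IndexWriter down" could look like, using hypothetical stand-in types rather than the real Solr classes (WriterRef and CoreState below are illustrations only, not Solr's RefCounted or DefaultSolrCoreState). The idea is that the writer is acquired exactly once at the top of the update path, and nested steps receive it as a parameter instead of calling getIndexWriter again while updateLock is held. Acquiring before taking updateLock is one possible arrangement, not a claim about the actual patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class PassWriterDown {

    /** Hypothetical stand-in for a ref-counted writer handle. */
    static final class WriterRef implements AutoCloseable {
        private final CoreState owner;
        final List<String> written = new ArrayList<>();

        WriterRef(CoreState owner) { this.owner = owner; }

        @Override
        public void close() { owner.outstanding.decrementAndGet(); }
    }

    /** Hypothetical stand-in for the core state that hands out the writer. */
    static final class CoreState {
        final AtomicInteger acquisitions = new AtomicInteger(); // total getIndexWriter calls
        final AtomicInteger outstanding = new AtomicInteger();  // handles not yet released

        WriterRef getIndexWriter() {
            acquisitions.incrementAndGet();
            outstanding.incrementAndGet();
            return new WriterRef(this);
        }
    }

    private final CoreState state = new CoreState();
    private final Object updateLock = new Object();

    // Nested step: takes the already-acquired writer as a parameter, so it
    // never calls state.getIndexWriter() while updateLock is held.
    private void openNewSearcher(WriterRef writer) {
        writer.written.add("<searcher reopened>");
    }

    void addAndDelete(String doc) {
        try (WriterRef writer = state.getIndexWriter()) { // single acquisition
            synchronized (updateLock) {
                writer.written.add(doc);
                openNewSearcher(writer); // passed down, not re-acquired
            }
        } // handle released here, after updateLock is dropped
    }

    CoreState state() { return state; }
}
```

With this shape, one addAndDelete call produces exactly one getIndexWriter call, so there is no nested acquisition that could block behind a pending reload while updateLock is held.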
> Possible deadlock when closing refcounted index writers.
> --------------------------------------------------------
>
> Key: SOLR-7836
> URL: https://issues.apache.org/jira/browse/SOLR-7836
> Project: Solr
> Issue Type: Bug
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Fix For: Trunk, 5.4
>
> Attachments: SOLR-7836-synch.patch, SOLR-7836.patch, SOLR-7836.patch,
> SOLR-7836.patch, deadlock_3.res.zip, deadlock_5_pass_iw.res.zip, deadlock_test
>
>
> Preliminary patch for what looks like a possible race condition between
> writerFree and pauseWriter in DefaultSolrCoreState.
> Looking for comments and/or why I'm completely missing the boat.