[ 
https://issues.apache.org/jira/browse/SOLR-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698792#comment-14698792
 ] 

Erick Erickson commented on SOLR-7836:
--------------------------------------

Poking a little more: add only opens a new searcher when clearCaches==true, 
and the only place that passes true explicitly is DUH2.addAndDelete, which is 
where all this started. There's also a call in the CDCR code that passes a 
variable in, but I don't think that's really relevant.

It's simple enough to move opening a new searcher up to these two places, so 
I'll give it a try and evaluate. I don't like that solution much since it's 
trappy: a new call to add(cmd, true) that fails to open a new searcher could 
re-introduce the problem that opening the searcher where it's done now is 
designed to prevent. I suppose a big fat warning is in order?
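
To make the trap concrete, here is a toy sketch in plain Java. It is not Solr 
code: ToyUpdateHandler, addOpeningSearcher, addOnly and the searcherStale flag 
are all made up for illustration, and the real add/openSearcher logic is more 
involved. It only shows that once the open moves out of add, every caller 
passing true has to remember to do the open itself.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: every name here is hypothetical, none of this is Solr's API.
class TrappySearcherSketch {

    static class ToyUpdateHandler {
        final List<String> events = new ArrayList<>();
        boolean searcherStale = false;

        // Shape 1: add itself opens a new searcher when clearCaches is true.
        void addOpeningSearcher(String doc, boolean clearCaches) {
            events.add("add:" + doc);
            if (clearCaches) {
                searcherStale = true;
                openSearcher();
            }
        }

        // Shape 2: add never opens a searcher. The caller is responsible,
        // which is the "trappy" part: nothing forces the caller to do it.
        void addOnly(String doc, boolean clearCaches) {
            events.add("add:" + doc);
            if (clearCaches) {
                searcherStale = true;
            }
        }

        void openSearcher() {
            events.add("openSearcher");
            searcherStale = false;
        }
    }

    public static void main(String[] args) {
        ToyUpdateHandler safe = new ToyUpdateHandler();
        safe.addOpeningSearcher("doc1", true);
        System.out.println("add opens searcher, stale=" + safe.searcherStale);

        ToyUpdateHandler careful = new ToyUpdateHandler();
        careful.addOnly("doc1", true);
        careful.openSearcher(); // caller remembered
        System.out.println("caller opens searcher, stale=" + careful.searcherStale);

        ToyUpdateHandler forgetful = new ToyUpdateHandler();
        forgetful.addOnly("doc1", true); // caller forgot the open
        System.out.println("caller forgot, stale=" + forgetful.searcherStale);
    }
}
```

In the toy, only the third caller ends up with a stale searcher, and nothing 
at compile time or run time complains about it. That's the case a warning 
comment would have to cover.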

Let me try it just to see whether it cures things or not. I'm pretty sure it'll 
cure the deadlock problem. First I'll just comment out the openSearcher and see 
if I can blow up the realtime get tests. Then I'll move the open out and see 
whether either the realtime get tests or the new deadlock test fail with the 
reorganized code. Once I've collected that data we can discuss some more. I'll 
probably have something later today.

[~ysee...@gmail.com] Those numbers in the new test were chosen completely 
arbitrarily. I'm guessing that the point of your changes is to drive the 
failure more often without lengthening the time the test takes, so I'll 
incorporate them.

> Possible deadlock when closing refcounted index writers.
> --------------------------------------------------------
>
>                 Key: SOLR-7836
>                 URL: https://issues.apache.org/jira/browse/SOLR-7836
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>             Fix For: Trunk, 5.4
>
>         Attachments: SOLR-7836-synch.patch, SOLR-7836.patch, SOLR-7836.patch, 
> SOLR-7836.patch, deadlock_3.res.zip, deadlock_5_pass_iw.res.zip, deadlock_test
>
>
> Preliminary patch for what looks like a possible race condition between 
> writerFree and pauseWriter in DefaultSolrCoreState.
> Looking for comments and/or why I'm completely missing the boat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
