[ https://issues.apache.org/jira/browse/SOLR-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698792#comment-14698792 ]
Erick Erickson commented on SOLR-7836:
--------------------------------------

Poking a little more: opening a new searcher in add happens only when clearCaches==true, which only happens explicitly in DUH2.addAndDelete, which is where all this started. There's also a call in the CDCR code that passes a variable in, but I don't think that's really relevant.

It's simple enough to move opening a new searcher up to these two places; I'll give it a try to evaluate. I don't like that solution much since it's trappy: a new call to add(cmd, true) that fails to open a new searcher could re-introduce the problem that opening the searcher where it's done now is designed to prevent. I suppose a big fat warning is in order?

Let me try it just to see whether it cures things or not. I'm pretty sure it'll cure the deadlock problem. I'll first try to just comment out the openSearcher and see if I can blow up the real time get tests, then move the open out and see if either the realtime get tests or the new deadlock test fail with the reorganized code. When I collect that data we can discuss some more. Probably have something later today.

[~ysee...@gmail.com] Those numbers in the new test were chosen completely arbitrarily. I'm guessing that the point of your changes is to drive the failure more often without lengthening the time the test takes, so I'll incorporate them.

> Possible deadlock when closing refcounted index writers.
> --------------------------------------------------------
>
>                 Key: SOLR-7836
>                 URL: https://issues.apache.org/jira/browse/SOLR-7836
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>             Fix For: Trunk, 5.4
>
>         Attachments: SOLR-7836-synch.patch, SOLR-7836.patch, SOLR-7836.patch, SOLR-7836.patch, deadlock_3.res.zip, deadlock_5_pass_iw.res.zip, deadlock_test
>
>
> Preliminary patch for what looks like a possible race condition between writerFree and pauseWriter in DefaultSolrCoreState.
> Looking for comments and/or why I'm completely missing the boat.