[ https://issues.apache.org/jira/browse/SOLR-13678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900310#comment-16900310 ]
ASF subversion and git services commented on SOLR-13678:
--------------------------------------------------------

Commit a052fb5436840b45909446668c1137cb3f266c99 in lucene-solr's branch refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a052fb5 ]

SOLR-13678: Harden CollectionPropsTest.testReadWriteCached to work around removeCollectionPropsWatcher() deadlock bug

> ZkStateReader.removeCollectionPropsWatcher can deadlock with concurrent zkCallback thread on props watcher
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13678
>                 URL: https://issues.apache.org/jira/browse/SOLR-13678
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Hoss Man
>            Priority: Major
>         Attachments: collectionpropswatcher-deadlock-jstack.txt
>
>
> While investigating an (unrelated) test bug in CollectionPropsTest I
> discovered a deadlock situation that can occur when calling
> {{ZkStateReader.removeCollectionPropsWatcher()}} if a zkCallback thread tries
> to concurrently fire the watchers set on the collection props.
> {{ZkStateReader.removeCollectionPropsWatcher()}} is itself called when a
> {{CollectionPropsWatcher.onStateChanged()}} impl returns "true" -- meaning
> that IIUC any usage of {{CollectionPropsWatcher}} could potentially result in
> this type of deadlock situation.
> {noformat}
> "TEST-CollectionPropsTest.testReadWriteCached-seed#[D3C6921874D1CFEB]" #15 prio=5 os_prio=0 cpu=567.78ms elapsed=682.12s tid=0x00007fa5e8343800 nid=0x3f61 waiting for monitor entry [0x00007fa62d222000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.solr.common.cloud.ZkStateReader.lambda$removeCollectionPropsWatcher$20(ZkStateReader.java:2001)
>         - waiting to lock <0x00000000e6207500> (a java.util.concurrent.ConcurrentHashMap)
>         at org.apache.solr.common.cloud.ZkStateReader$$Lambda$617/0x00000001006c1840.apply(Unknown Source)
>         at java.util.concurrent.ConcurrentHashMap.compute(java.base@11.0.3/ConcurrentHashMap.java:1932)
>         - locked <0x00000000eb9156b8> (a java.util.concurrent.ConcurrentHashMap$Node)
>         at org.apache.solr.common.cloud.ZkStateReader.removeCollectionPropsWatcher(ZkStateReader.java:1994)
>         at org.apache.solr.cloud.CollectionPropsTest.testReadWriteCached(CollectionPropsTest.java:125)
> ...
> "zkCallback-88-thread-2" #213 prio=5 os_prio=0 cpu=14.06ms elapsed=672.65s tid=0x00007fa6041bf000 nid=0x402f waiting for monitor entry [0x00007fa5b8f39000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at java.util.concurrent.ConcurrentHashMap.compute(java.base@11.0.3/ConcurrentHashMap.java:1923)
>         - waiting to lock <0x00000000eb9156b8> (a java.util.concurrent.ConcurrentHashMap$Node)
>         at org.apache.solr.common.cloud.ZkStateReader$PropsNotification.<init>(ZkStateReader.java:2262)
>         at org.apache.solr.common.cloud.ZkStateReader.notifyPropsWatchers(ZkStateReader.java:2243)
>         at org.apache.solr.common.cloud.ZkStateReader$PropsWatcher.refreshAndWatch(ZkStateReader.java:1458)
>         - locked <0x00000000e6207500> (a java.util.concurrent.ConcurrentHashMap)
>         at org.apache.solr.common.cloud.ZkStateReader$PropsWatcher.process(ZkStateReader.java:1440)
>         at org.apache.solr.common.cloud.SolrZkClient$ProcessWatchWithExecutor.lambda$process$1(SolrZkClient.java:838)
>         at org.apache.solr.common.cloud.SolrZkClient$ProcessWatchWithExecutor$$Lambda$253/0x00000001004a4440.run(Unknown Source)
>         at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.3/Executors.java:515)
>         at java.util.concurrent.FutureTask.run(java.base@11.0.3/FutureTask.java:264)
>         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
>         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$140/0x0000000100308c40.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.3/ThreadPoolExecutor.java:1128)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.3/ThreadPoolExecutor.java:628)
>         at java.lang.Thread.run(java.base@11.0.3/Thread.java:834)
> {noformat}
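
The two stacks above show a lock-order inversion: one thread holds a ConcurrentHashMap$Node monitor (taken inside compute()) and waits for the monitor of a second map, while the other thread holds that map's monitor and waits for the same node. The following is a minimal standalone sketch of that pattern, not the ZkStateReader code itself. The class name and the map names `observers` and `cachedProps` are hypothetical, and the sketch assumes, as the trace suggests for Java 11, that compute() runs its remapping function while holding a bin-node monitor, which is an implementation detail of ConcurrentHashMap.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class PropsWatcherDeadlockSketch {

    /** Counts down and then waits until both threads hold their first lock. */
    private static void rendezvous(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Provokes the inversion and returns how many threads the JVM reports as deadlocked. */
    static int deadlockedThreadCount() {
        // Hypothetical stand-ins for the two maps seen in the thread dump.
        ConcurrentHashMap<String, Integer> observers = new ConcurrentHashMap<>();
        ConcurrentHashMap<String, String> cachedProps = new ConcurrentHashMap<>();
        observers.put("collection1", 1);
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Like the test thread: inside compute() (node monitor held), then wants the map monitor.
        Thread remover = new Thread(() ->
            observers.compute("collection1", (k, v) -> {
                rendezvous(bothHoldFirstLock);
                synchronized (cachedProps) {
                    return null; // would remove the entry if it ever got here
                }
            }), "remover");

        // Like the zkCallback thread: map monitor held, then calls compute() on the same key.
        Thread callback = new Thread(() -> {
            synchronized (cachedProps) {
                rendezvous(bothHoldFirstLock);
                observers.compute("collection1", (k, v) -> v);
            }
        }, "callback");

        // Daemon threads so the JVM can still exit with both threads stuck.
        remover.setDaemon(true);
        callback.setDaemon(true);
        remover.start();
        callback.start();

        // Poll the JVM's monitor-deadlock detector for up to ten seconds.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 200; i++) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```

The latch only makes the interleaving deterministic; in real code the same cycle would depend on timing. In general, a cycle like this goes away if both code paths take the two locks in the same order, or if the synchronized block is not entered from inside the compute() lambda.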