[ 
https://issues.apache.org/jira/browse/SOLR-13678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900323#comment-16900323
 ] 

ASF subversion and git services commented on SOLR-13678:
--------------------------------------------------------

Commit b18041476ff0ad710c2ddf423cc4a9b0edddba4e in lucene-solr's branch 
refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b180414 ]

SOLR-13678: Harden CollectionPropsTest.testReadWriteCached to work around 
removeCollectionPropsWatcher() deadlock bug

(cherry picked from commit a052fb5436840b45909446668c1137cb3f266c99)


> ZkStateReader.removeCollectionPropsWatcher can deadlock with concurrent 
> zkCallback thread on props watcher
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13678
>                 URL: https://issues.apache.org/jira/browse/SOLR-13678
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Hoss Man
>            Priority: Major
>         Attachments: collectionpropswatcher-deadlock-jstack.txt
>
>
> While investigating an (unrelated) test bug in CollectionPropsTest, I 
> discovered a deadlock that can occur when calling 
> {{ZkStateReader.removeCollectionPropsWatcher()}} if a zkCallback thread 
> concurrently tries to fire the watchers set on the collection props.
> {{ZkStateReader.removeCollectionPropsWatcher()}} is itself called when a 
> {{CollectionPropsWatcher.onStateChanged()}} impl returns "true" -- meaning 
> that, IIUC, any usage of {{CollectionPropsWatcher}} could potentially result 
> in this type of deadlock.
> {noformat}
> "TEST-CollectionPropsTest.testReadWriteCached-seed#[D3C6921874D1CFEB]" #15 prio=5 os_prio=0 cpu=567.78ms elapsed=682.12s tid=0x00007fa5e8343800 nid=0x3f61 waiting for monitor entry  [0x00007fa62d222000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.solr.common.cloud.ZkStateReader.lambda$removeCollectionPropsWatcher$20(ZkStateReader.java:2001)
>         - waiting to lock <0x00000000e6207500> (a java.util.concurrent.ConcurrentHashMap)
>         at org.apache.solr.common.cloud.ZkStateReader$$Lambda$617/0x00000001006c1840.apply(Unknown Source)
>         at java.util.concurrent.ConcurrentHashMap.compute(java.base@11.0.3/ConcurrentHashMap.java:1932)
>         - locked <0x00000000eb9156b8> (a java.util.concurrent.ConcurrentHashMap$Node)
>         at org.apache.solr.common.cloud.ZkStateReader.removeCollectionPropsWatcher(ZkStateReader.java:1994)
>         at org.apache.solr.cloud.CollectionPropsTest.testReadWriteCached(CollectionPropsTest.java:125)
> ...
> "zkCallback-88-thread-2" #213 prio=5 os_prio=0 cpu=14.06ms elapsed=672.65s tid=0x00007fa6041bf000 nid=0x402f waiting for monitor entry  [0x00007fa5b8f39000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at java.util.concurrent.ConcurrentHashMap.compute(java.base@11.0.3/ConcurrentHashMap.java:1923)
>         - waiting to lock <0x00000000eb9156b8> (a java.util.concurrent.ConcurrentHashMap$Node)
>         at org.apache.solr.common.cloud.ZkStateReader$PropsNotification.<init>(ZkStateReader.java:2262)
>         at org.apache.solr.common.cloud.ZkStateReader.notifyPropsWatchers(ZkStateReader.java:2243)
>         at org.apache.solr.common.cloud.ZkStateReader$PropsWatcher.refreshAndWatch(ZkStateReader.java:1458)
>         - locked <0x00000000e6207500> (a java.util.concurrent.ConcurrentHashMap)
>         at org.apache.solr.common.cloud.ZkStateReader$PropsWatcher.process(ZkStateReader.java:1440)
>         at org.apache.solr.common.cloud.SolrZkClient$ProcessWatchWithExecutor.lambda$process$1(SolrZkClient.java:838)
>         at org.apache.solr.common.cloud.SolrZkClient$ProcessWatchWithExecutor$$Lambda$253/0x00000001004a4440.run(Unknown Source)
>         at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.3/Executors.java:515)
>         at java.util.concurrent.FutureTask.run(java.base@11.0.3/FutureTask.java:264)
>         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
>         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$140/0x0000000100308c40.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.3/ThreadPoolExecutor.java:1128)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.3/ThreadPoolExecutor.java:628)
>         at java.lang.Thread.run(java.base@11.0.3/Thread.java:834)
> {noformat}
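
The two stacks quoted above read as a lock-order inversion. The test thread holds a ConcurrentHashMap$Node lock inside compute() and waits for the monitor <0x00000000e6207500>, while the zkCallback thread holds that monitor and waits in compute() for the same node. The following is a minimal sketch of that pattern on throwaway objects, not the ZkStateReader code: the class name, the propsMonitor and watchers variables, the "coll" key and the thread names are all hypothetical stand-ins, and the latches exist only to force the interleaving. It asks the JVM whether it considers the "remover" thread deadlocked.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

final class LockOrderInversionSketch {

    /** Forces the inversion and reports whether the JVM sees the "remover" thread as deadlocked. */
    static boolean deadlockDetected() {
        // Hypothetical stand-ins; these are NOT the real ZkStateReader fields.
        final Object propsMonitor = new Object();
        final ConcurrentHashMap<String, Integer> watchers = new ConcurrentHashMap<>();
        watchers.put("coll", 1); // existing entry, so compute() locks its bin node

        final CountDownLatch removerInsideCompute = new CountDownLatch(1);
        final CountDownLatch callbackHoldsMonitor = new CountDownLatch(1);

        // Order 1: node lock (taken by compute) first, then the monitor.
        Thread remover = new Thread(() -> {
            watchers.compute("coll", (k, v) -> {
                removerInsideCompute.countDown();
                awaitQuietly(callbackHoldsMonitor);
                synchronized (propsMonitor) {
                    return null; // would remove the entry if it ever got here
                }
            });
        }, "remover");

        // Order 2: the monitor first, then the node lock (taken by compute).
        Thread callback = new Thread(() -> {
            synchronized (propsMonitor) {
                callbackHoldsMonitor.countDown();
                awaitQuietly(removerInsideCompute);
                watchers.compute("coll", (k, v) -> v);
            }
        }, "callback");

        // Daemon threads, so the stuck pair does not keep the JVM alive.
        remover.setDaemon(true);
        callback.setDaemon(true);
        remover.start();
        callback.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) { // poll for up to roughly five seconds
            long[] ids = threads.findDeadlockedThreads();
            if (ids != null) {
                for (long id : ids) {
                    if (id == remover.getId()) {
                        return true;
                    }
                }
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + deadlockDetected());
    }
}
```

In the sketch, the inversion disappears if both threads take the two locks in the same order, or if the function passed to compute() never blocks on another lock. Whether either approach suits ZkStateReader is not something the quoted report settles.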



