[ https://issues.apache.org/jira/browse/GEODE-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sai Boorlagadda updated GEODE-6517:
-----------------------------------
    Fix Version/s:     (was: 1.10.0)
                   1.9.0

> Race condition exists that a node failed to be shutdown as it is stuck on PRHARedundancyProvider.waitForPersistentBucketRecovery()
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-6517
>                 URL: https://issues.apache.org/jira/browse/GEODE-6517
>             Project: Geode
>          Issue Type: Bug
>          Components: regions
>    Affects Versions: 1.1.0
>            Reporter: Eric Shu
>            Assignee: Eric Shu
>            Priority: Major
>             Fix For: 1.9.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The stack of the hung thread:
> "Shutdown Disconnector1" #93 prio=10 os_prio=0 tid=0x00007f84b8002800 nid=0x6875 waiting on condition [0x00007f844ee31000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x00000000f14f0490> (a java.util.concurrent.CountDownLatch$Sync)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>         at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
>         at org.apache.geode.internal.cache.PRHARedundancyProvider.waitForPersistentBucketRecovery(PRHARedundancyProvider.java:2019)
>         at org.apache.geode.internal.cache.PartitionedRegion.postDestroyRegion(PartitionedRegion.java:7536)
>         at org.apache.geode.internal.cache.LocalRegion.recursiveDestroyRegion(LocalRegion.java:2707)
>         at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6308)
>         at org.apache.geode.internal.cache.LocalRegion.handleCacheClose(LocalRegion.java:7387)
>         at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2281)
>         - locked <0x00000000f0abeb00> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>         at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1593)
>         - locked <0x00000000f0abeb00> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>         at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1255)
>         at org.apache.geode.management.internal.cli.functions.ShutDownFunction.lambda$disconnectInNonDaemonThread$0(ShutDownFunction.java:78)
>         at org.apache.geode.management.internal.cli.functions.ShutDownFunction$$Lambda$94/665093117.run(Unknown Source)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
>
> The race occurs during recoverPersistentBuckets. Between the point where the
> following latch is created and the point where it is nulled out, the shutdown
> thread can get hold of a reference to the latch and then wait forever for a
> countDown that never happens.
>
>     allBucketsRecoveredFromDisk = new CountDownLatch(proxyBucketArray.length);
>     try {
>       if (proxyBucketArray.length > 0) {
>         this.redundancyLogger = new RedundancyLogger(this);
>         Thread loggingThread = new LoggingThread(
>             "RedundancyLogger for region " + this.prRegion.getName(), false,
>             this.redundancyLogger);
>         loggingThread.start();
>       }
>     } catch (RuntimeException e) {
>       allBucketsRecoveredFromDisk = null;
>       throw e;
>     }

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
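
For illustration only, the race described in the quoted issue can be sketched in isolation. The class and method names below (RecoveryLatchSketch, startRecovery, abandonRecovery, waitForRecovery) are hypothetical and are not Geode code, and this is not necessarily how the issue was fixed. The sketch assumes one possible mitigation: the thread that abandons recovery releases any waiters before clearing the field, and the waiter reads the field once and uses a timed await.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the latch handling described in the issue; not Geode code.
public class RecoveryLatchSketch {
  // Volatile so the shutdown thread sees the latest reference written by the recovery thread.
  private volatile CountDownLatch allBucketsRecovered;

  // Recovery begins: one count per bucket that still has to be recovered.
  public void startRecovery(int bucketCount) {
    allBucketsRecovered = new CountDownLatch(bucketCount);
  }

  // Recovery is abandoned: release waiters that already hold the reference,
  // and only then clear the field. Nulling the field alone would strand them.
  public void abandonRecovery() {
    CountDownLatch latch = allBucketsRecovered;
    if (latch != null) {
      while (latch.getCount() > 0) {
        latch.countDown();
      }
    }
    allBucketsRecovered = null;
  }

  // Shutdown path: read the field once, then wait with a bound.
  // Returns true if there is nothing to wait for or the latch reached zero in time.
  public boolean waitForRecovery(long timeout, TimeUnit unit) throws InterruptedException {
    CountDownLatch latch = allBucketsRecovered;
    if (latch == null) {
      return true;
    }
    return latch.await(timeout, unit);
  }
}
```

In this sketch a caller on the shutdown path would invoke waitForRecovery with a bounded timeout and decide what to do when it returns false, rather than calling an unbounded await() on a latch that nobody may ever count down.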