Darrel Schneider created GEODE-10286:
----------------------------------------
Summary: cache close in response to a forced disconnect with
persistent regions may skip some cleanup
Key: GEODE-10286
URL: https://issues.apache.org/jira/browse/GEODE-10286
Project: Geode
Issue Type: Bug
Components: core
Reporter: Darrel Schneider
During a cache close, persistent regions may not clean up as much as they
should. This is because when the PersistenceAdvisor is closed, a CancelException
is not handled, which causes other parts of the close to be skipped. I think the
place to handle it is
DistributedRegion.distributedRegionCleanup(DistributedRegion.java:2564). Here
is an exception showing what it looks like when this happens:
{noformat}
org.apache.geode.distributed.DistributedSystemDisconnectedException: Distribution manager on rs-RunItNow-ZH1504a1i3xlarge-hydra-client-10(dataStoregemfire2_host1_421:421)<ec><v22>:41004 started at Wed Mar 23 17:11:48 PDT 2022: Member isn't responding to heartbeat requests, caused by org.apache.geode.ForcedDisconnectException: Member isn't responding to heartbeat requests
    at org.apache.geode.distributed.internal.ClusterDistributionManager$Stopper.generateCancelledException(ClusterDistributionManager.java:2893)
    at org.apache.geode.distributed.internal.InternalDistributedSystem$Stopper.generateCancelledException(InternalDistributedSystem.java:1177)
    at org.apache.geode.CancelCriterion.checkCancelInProgress(CancelCriterion.java:83)
    at org.apache.geode.distributed.internal.ClusterElderManager.getElderId(ClusterElderManager.java:76)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.getElderId(ClusterDistributionManager.java:2085)
    at org.apache.geode.distributed.internal.locks.DLockService.getElderId(DLockService.java:254)
    at org.apache.geode.distributed.internal.locks.DLockService.notLockGrantorId(DLockService.java:824)
    at org.apache.geode.distributed.internal.locks.DLockService.unlock(DLockService.java:1807)
    at org.apache.geode.internal.cache.persistence.PersistenceAdvisorImpl.releaseTieLock(PersistenceAdvisorImpl.java:1181)
    at org.apache.geode.internal.cache.persistence.PersistenceAdvisorImpl.close(PersistenceAdvisorImpl.java:1158)
    at org.apache.geode.internal.cache.DistributedRegion.distributedRegionCleanup(DistributedRegion.java:2564)
    at org.apache.geode.internal.cache.DistributedRegion.postDestroyRegion(DistributedRegion.java:2657)
    at org.apache.geode.internal.cache.LocalRegion.recursiveDestroyRegion(LocalRegion.java:2732)
    at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6241)
    at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1834)
    at org.apache.geode.internal.cache.LocalRegion.handleCacheClose(LocalRegion.java:7320)
    at org.apache.geode.internal.cache.DistributedRegion.handleCacheClose(DistributedRegion.java:2691)
    at org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2308)
    at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2154)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1538)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2545)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2408)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1254)
    at org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2329)
    at org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1190)
    at org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$uncleanShutdownDS$0(GMSMembership.java:1793)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.apache.geode.ForcedDisconnectException: Member isn't responding to heartbeat requests
    at org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2319)
    ... 3 more
{noformat}
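The handling suggested in the description could look roughly like the sketch below. This is not the actual Geode code: CleanupSketch, Advisor, and the recorded step names are hypothetical, and the nested CancelException is only a stand-in for org.apache.geode.CancelException so the example compiles without Geode on the classpath. The idea is simply that a cancellation thrown while closing the advisor is caught at the cleanup call site, so the remaining cleanup still runs.

```java
import java.util.ArrayList;
import java.util.List;

public class CleanupSketch {
    // Stand-in for org.apache.geode.CancelException (an unchecked exception),
    // declared here only so the sketch is self-contained.
    static class CancelException extends RuntimeException {
        CancelException(String msg) {
            super(msg);
        }
    }

    // Hypothetical stand-in for the persistence advisor being closed.
    interface Advisor {
        void close();
    }

    // Hypothetical cleanup routine: returns the steps it performed, in order.
    static List<String> distributedRegionCleanup(Advisor persistenceAdvisor) {
        List<String> steps = new ArrayList<>();
        try {
            persistenceAdvisor.close();
            steps.add("advisor closed");
        } catch (CancelException e) {
            // The system is already shutting down; note it and keep going
            // instead of letting the exception abort the rest of the close.
            steps.add("advisor close cancelled: " + e.getMessage());
        }
        // Cleanup that was previously skipped when close() threw.
        steps.add("remaining cleanup");
        return steps;
    }

    public static void main(String[] args) {
        Advisor failing = () -> {
            throw new CancelException("forced disconnect");
        };
        System.out.println(distributedRegionCleanup(failing));
    }
}
```

Whether the exception should be logged, swallowed silently, or rethrown after the remaining cleanup completes is a design choice this sketch does not settle.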