Eric Shu created GEODE-5592:
-------------------------------

             Summary: During commit, cleanupTransactionIfNoLongerHost could fail with DistributedSystemDisconnectedException
                 Key: GEODE-5592
                 URL: https://issues.apache.org/jira/browse/GEODE-5592
             Project: Geode
          Issue Type: Bug
          Components: transactions
            Reporter: Eric Shu

The exception stack is as follows:

<ServerConnection on port 21816 Thread 2> tid=0x71] commit caught exception
org.apache.geode.distributed.DistributedSystemDisconnectedException: Distribution manager on 10.32.110.218(bridgep1_host1_11874:11874)<v4>:1025 started at Thu Aug 16 14:29:38 PDT 2018: Membership coordinator 10.32.110.218(locatorp1_host1_11910:11910:locator)<ec><v1>:1024 has declared that a network partition has occurred, caused by org.apache.geode.ForcedDisconnectException: Membership coordinator 10.32.110.218(locatorp1_host1_11910:11910:locator)<ec><v1>:1024 has declared that a network partition has occurred
    at org.apache.geode.distributed.internal.ClusterDistributionManager$Stopper.generateCancelledException(ClusterDistributionManager.java:4518)
    at org.apache.geode.distributed.internal.InternalDistributedSystem$Stopper.generateCancelledException(InternalDistributedSystem.java:963)
    at org.apache.geode.CancelCriterion.checkCancelInProgress(CancelCriterion.java:83)
    at org.apache.geode.internal.cache.locks.TXLockServiceImpl.<init>(TXLockServiceImpl.java:80)
    at org.apache.geode.internal.cache.locks.TXLockService.createDTLS(TXLockService.java:53)
    at org.apache.geode.internal.cache.TXLockRequest.releaseDistributed(TXLockRequest.java:108)
    at org.apache.geode.internal.cache.TXLockRequest.cleanup(TXLockRequest.java:142)
    at org.apache.geode.internal.cache.TXState.cleanup(TXState.java:871)
    at org.apache.geode.internal.cache.TXManagerImpl.cleanupTransactionIfNoLongerHost(TXManagerImpl.java:1045)
    at org.apache.geode.internal.cache.TXManagerImpl.unmasquerade(TXManagerImpl.java:1028)
    at org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:177)
    at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMsg(ServerConnection.java:869)
    at org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:77)
    at org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1217)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$4$1.run(AcceptorImpl.java:645)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.geode.ForcedDisconnectException: Membership coordinator 10.32.110.218(locatorp1_host1_11910:11910:locator)<ec><v1>:1024 has declared that a network partition has occurred
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.forceDisconnect(GMSMembershipManager.java:2534)
    at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1054)
    at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1373)
    at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:1823)
    at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1305)
    at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
    at org.jgroups.JChannel.up(JChannel.java:741)
    at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
    at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
    at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
    at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
    at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
    at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
    at org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:73)
    at org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:72)
    at org.jgroups.protocols.TP.passMessageUp(TP.java:1658)
    at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876)
    at org.jgroups.util.DirectExecutor.execute(DirectExecutor.java:10)
    at org.jgroups.protocols.TP.handleSingleMessage(TP.java:1789)
    at org.jgroups.protocols.TP.receive(TP.java:1714)
    at org.apache.geode.distributed.internal.membership.gms.messenger.Transport.receive(Transport.java:152)
    at org.jgroups.protocols.UDP$PacketReceiver.run(UDP.java:701)
    ... 1 more

This causes the lock held by the thread not to be released properly, and the unreleased lock then blocks the cache close.
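
The failure mode described in this report can be illustrated in isolation: if a cleanup path performs a step that may throw (in the trace, the distributed release reached through TXLockRequest.cleanup) before releasing a locally held lock, the exception skips the release. The following is a minimal sketch using only java.util.concurrent, not Geode code and not the actual fix for this issue. The class CleanupSketch and its methods releaseDistributed, fragileCleanup, and defensiveCleanup are hypothetical names chosen for illustration, and IllegalStateException stands in for DistributedSystemDisconnectedException.

```java
import java.util.concurrent.locks.ReentrantLock;

public class CleanupSketch {

    /** Hypothetical stand-in for a cleanup step that fails once the member is disconnected. */
    static void releaseDistributed(boolean disconnected) {
        if (disconnected) {
            // Stand-in for DistributedSystemDisconnectedException in the report.
            throw new IllegalStateException("distributed system disconnected");
        }
    }

    /** Fragile ordering: the local lock is released only if the distributed step succeeds. */
    static void fragileCleanup(ReentrantLock lock, boolean disconnected) {
        releaseDistributed(disconnected);
        lock.unlock(); // never reached when releaseDistributed throws
    }

    /** Defensive ordering: the local lock is released even if the distributed step throws. */
    static void defensiveCleanup(ReentrantLock lock, boolean disconnected) {
        try {
            releaseDistributed(disconnected);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        ReentrantLock fragile = new ReentrantLock();
        fragile.lock();
        try {
            fragileCleanup(fragile, true);
        } catch (IllegalStateException expected) {
            // The exception propagates, as in the reported stack trace.
        }
        System.out.println("fragile cleanup, still locked: " + fragile.isLocked());

        ReentrantLock defensive = new ReentrantLock();
        defensive.lock();
        try {
            defensiveCleanup(defensive, true);
        } catch (IllegalStateException expected) {
            // The exception still propagates, but the lock has been released.
        }
        System.out.println("defensive cleanup, still locked: " + defensive.isLocked());
    }
}
```

In this sketch the fragile variant leaves the lock held after the exception, while the try/finally variant releases it. Whether that ordering is appropriate inside TXManagerImpl or TXLockRequest is an assumption that would have to be checked against the Geode source.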