Eric Shu created GEODE-5592:
-------------------------------

             Summary: During commit, cleanupTransactionIfNoLongerHost could fail with DistributedSystemDisconnectedException
                 Key: GEODE-5592
                 URL: https://issues.apache.org/jira/browse/GEODE-5592
             Project: Geode
          Issue Type: Bug
          Components: transactions
            Reporter: Eric Shu


The exception stack trace is as follows:
<ServerConnection on port 21816 Thread 2> tid=0x71] commit caught exception
org.apache.geode.distributed.DistributedSystemDisconnectedException: Distribution manager on 10.32.110.218(bridgep1_host1_11874:11874)<v4>:1025 started at Thu Aug 16 14:29:38 PDT 2018: Membership coordinator 10.32.110.218(locatorp1_host1_11910:11910:locator)<ec><v1>:1024 has declared that a network partition has occurred, caused by org.apache.geode.ForcedDisconnectException: Membership coordinator 10.32.110.218(locatorp1_host1_11910:11910:locator)<ec><v1>:1024 has declared that a network partition has occurred
        at org.apache.geode.distributed.internal.ClusterDistributionManager$Stopper.generateCancelledException(ClusterDistributionManager.java:4518)
        at org.apache.geode.distributed.internal.InternalDistributedSystem$Stopper.generateCancelledException(InternalDistributedSystem.java:963)
        at org.apache.geode.CancelCriterion.checkCancelInProgress(CancelCriterion.java:83)
        at org.apache.geode.internal.cache.locks.TXLockServiceImpl.<init>(TXLockServiceImpl.java:80)
        at org.apache.geode.internal.cache.locks.TXLockService.createDTLS(TXLockService.java:53)
        at org.apache.geode.internal.cache.TXLockRequest.releaseDistributed(TXLockRequest.java:108)
        at org.apache.geode.internal.cache.TXLockRequest.cleanup(TXLockRequest.java:142)
        at org.apache.geode.internal.cache.TXState.cleanup(TXState.java:871)
        at org.apache.geode.internal.cache.TXManagerImpl.cleanupTransactionIfNoLongerHost(TXManagerImpl.java:1045)
        at org.apache.geode.internal.cache.TXManagerImpl.unmasquerade(TXManagerImpl.java:1028)
        at org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:177)
        at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMsg(ServerConnection.java:869)
        at org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:77)
        at org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1217)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$4$1.run(AcceptorImpl.java:645)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.geode.ForcedDisconnectException: Membership coordinator 10.32.110.218(locatorp1_host1_11910:11910:locator)<ec><v1>:1024 has declared that a network partition has occurred
        at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.forceDisconnect(GMSMembershipManager.java:2534)
        at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1054)
        at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1373)
        at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:1823)
        at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1305)
        at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
        at org.jgroups.JChannel.up(JChannel.java:741)
        at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
        at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
        at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
        at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
        at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
        at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
        at org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:73)
        at org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:72)
        at org.jgroups.protocols.TP.passMessageUp(TP.java:1658)
        at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876)
        at org.jgroups.util.DirectExecutor.execute(DirectExecutor.java:10)
        at org.jgroups.protocols.TP.handleSingleMessage(TP.java:1789)
        at org.jgroups.protocols.TP.receive(TP.java:1714)
        at org.apache.geode.distributed.internal.membership.gms.messenger.Transport.receive(Transport.java:152)
        at org.jgroups.protocols.UDP$PacketReceiver.run(UDP.java:701)
        ... 1 more

As a result, the lock held by the thread is not properly released, and that 
lock blocks the cache close.
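
The failure mode can be illustrated with a minimal, self-contained sketch. It is not Geode code and not necessarily how the issue was or should be fixed: CleanupSketch, TxLocks, DisconnectedException, releaseDistributed, and releaseLocal are all hypothetical stand-ins. The sketch assumes that a cleanup step first attempts a distributed release that may throw once the member has been disconnected, and it shows how a finally block keeps the thread's own lock from leaking in that case.

```java
import java.util.concurrent.locks.ReentrantLock;

public class CleanupSketch {

    /** Hypothetical stand-in for DistributedSystemDisconnectedException. */
    static class DisconnectedException extends RuntimeException {
        DisconnectedException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for the lock state owned by one transaction. */
    static class TxLocks {
        final ReentrantLock localLock = new ReentrantLock();
        private final boolean disconnected;

        TxLocks(boolean disconnected) {
            this.disconnected = disconnected;
        }

        /** Simulates a distributed release that fails once the member is disconnected. */
        void releaseDistributed() {
            if (disconnected) {
                throw new DisconnectedException("network partition declared");
            }
        }

        /** Releases the lock held by the calling thread, if it holds one. */
        void releaseLocal() {
            if (localLock.isHeldByCurrentThread()) {
                localLock.unlock();
            }
        }

        /** Cleanup that still releases the local lock when the distributed step throws. */
        void cleanup() {
            try {
                releaseDistributed();
            } catch (DisconnectedException e) {
                // Assumption: after a disconnect there is nothing distributed left to release.
            } finally {
                releaseLocal();
            }
        }
    }

    public static void main(String[] args) {
        TxLocks locks = new TxLocks(true); // simulate a disconnected member
        locks.localLock.lock();            // the thread takes its lock during commit
        locks.cleanup();
        System.out.println("local lock still held: " + locks.localLock.isLocked());
    }
}
```

Without the finally block, the exception thrown by releaseDistributed would propagate before releaseLocal runs, leaving the lock held in the way the report describes.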



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
