[ 
https://issues.apache.org/jira/browse/GEODE-5592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Shu resolved GEODE-5592.
-----------------------------
       Resolution: Fixed
    Fix Version/s: 1.7.0

> During commit, cleanupTransactionIfNoLongerHost could fail with 
> DistributedSystemDisconnectedException
> ------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-5592
>                 URL: https://issues.apache.org/jira/browse/GEODE-5592
>             Project: Geode
>          Issue Type: Bug
>          Components: transactions
>            Reporter: Eric Shu
>            Assignee: Eric Shu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.7.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The exception stack is as follows:
> <ServerConnection on port 21816 Thread 2> tid=0x71] commit caught exception
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> Distribution manager on 10.32.110.218(bridgep1_host1_11874:11874)<v4>:1025 
> started at Thu Aug 16 14:29:38 PDT 2018: Membership coordinator 
> 10.32.110.218(locatorp1_host1_11910:11910:locator)<ec><v1>:1024 has declared 
> that a network partition has occurred, caused by 
> org.apache.geode.ForcedDisconnectException: Membership coordinator 
> 10.32.110.218(locatorp1_host1_11910:11910:locator)<ec><v1>:1024 has declared 
> that a network partition has occurred
>         at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$Stopper.generateCancelledException(ClusterDistributionManager.java:4518)
>         at 
> org.apache.geode.distributed.internal.InternalDistributedSystem$Stopper.generateCancelledException(InternalDistributedSystem.java:963)
>         at 
> org.apache.geode.CancelCriterion.checkCancelInProgress(CancelCriterion.java:83)
>         at 
> org.apache.geode.internal.cache.locks.TXLockServiceImpl.<init>(TXLockServiceImpl.java:80)
>         at 
> org.apache.geode.internal.cache.locks.TXLockService.createDTLS(TXLockService.java:53)
>         at 
> org.apache.geode.internal.cache.TXLockRequest.releaseDistributed(TXLockRequest.java:108)
>         at 
> org.apache.geode.internal.cache.TXLockRequest.cleanup(TXLockRequest.java:142)
>         at org.apache.geode.internal.cache.TXState.cleanup(TXState.java:871)
>         at 
> org.apache.geode.internal.cache.TXManagerImpl.cleanupTransactionIfNoLongerHost(TXManagerImpl.java:1045)
>         at 
> org.apache.geode.internal.cache.TXManagerImpl.unmasquerade(TXManagerImpl.java:1028)
>         at 
> org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:177)
>         at 
> org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMsg(ServerConnection.java:869)
>         at 
> org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:77)
>         at 
> org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1217)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at 
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$4$1.run(AcceptorImpl.java:645)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.geode.ForcedDisconnectException: Membership coordinator 
> 10.32.110.218(locatorp1_host1_11910:11910:locator)<ec><v1>:1024 has declared 
> that a network partition has occurred
>         at 
> org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.forceDisconnect(GMSMembershipManager.java:2534)
>         at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1054)
>         at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1373)
>         at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:1823)
>         at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1305)
>         at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
>         at org.jgroups.JChannel.up(JChannel.java:741)
>         at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
>         at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
>         at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
>         at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
>         at 
> org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
>         at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
>         at 
> org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:73)
>         at 
> org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:72)
>         at org.jgroups.protocols.TP.passMessageUp(TP.java:1658)
>         at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876)
>         at org.jgroups.util.DirectExecutor.execute(DirectExecutor.java:10)
>         at org.jgroups.protocols.TP.handleSingleMessage(TP.java:1789)
>         at org.jgroups.protocols.TP.receive(TP.java:1714)
>         at 
> org.apache.geode.distributed.internal.membership.gms.messenger.Transport.receive(Transport.java:152)
>         at org.jgroups.protocols.UDP$PacketReceiver.run(UDP.java:701)
>         ... 1 more
> This causes the lock held by the thread not to be released properly, and 
> the still-held lock then blocks the cache close.
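
For illustration only, here is a minimal sketch of the failure mode described above. It is not the actual Geode fix. The class TxCleanupSketch, the exception DisconnectedException, and the methods releaseDistributed, cleanupUnsafe, and cleanupSafe are hypothetical stand-ins rather than Geode APIs, and a plain ReentrantLock stands in for the lock held by the committing thread. The point it shows is that a cleanup step which can throw on disconnect needs the local release in a finally block.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; none of these names are Geode classes or methods.
class TxCleanupSketch {
    // Stand-in for an exception such as DistributedSystemDisconnectedException.
    static class DisconnectedException extends RuntimeException {
        DisconnectedException(String message) {
            super(message);
        }
    }

    // Stand-in for the lock held by the thread during commit.
    static final ReentrantLock LOCK = new ReentrantLock();

    // Stand-in for the distributed release step, which fails once the member
    // has been forced out of the distributed system.
    static void releaseDistributed(boolean disconnected) {
        if (disconnected) {
            throw new DisconnectedException("network partition declared");
        }
    }

    // Problem shape: the exception skips the unlock, so the lock stays held.
    static void cleanupUnsafe(boolean disconnected) {
        releaseDistributed(disconnected);
        LOCK.unlock();
    }

    // Safer shape: the local lock is released even if the distributed step throws.
    static void cleanupSafe(boolean disconnected) {
        try {
            releaseDistributed(disconnected);
        } finally {
            LOCK.unlock();
        }
    }

    public static void main(String[] args) {
        LOCK.lock();
        try {
            cleanupUnsafe(true);
        } catch (DisconnectedException e) {
            System.out.println("unsafe cleanup failed, lock held: " + LOCK.isLocked());
        }
        try {
            cleanupSafe(true);
        } catch (DisconnectedException e) {
            System.out.println("safe cleanup failed, lock held: " + LOCK.isLocked());
        }
    }
}
```

In the sketch the exception still propagates to the caller; only the lock release is made unconditional. Whether the real fix swallows, logs, or rethrows the exception is not stated in this message.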



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
