[
https://issues.apache.org/jira/browse/GEODE-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17065880#comment-17065880
]
ASF subversion and git services commented on GEODE-7884:
--------------------------------------------------------
Commit 2d2a3f80bd5053749963889c1898df48e9aa0be7 in geode's branch
refs/heads/feature/GEODE-6008b from Bruce Schuchardt
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=2d2a3f8 ]
GEODE-7884: server hangs due to IllegalStateException (#4822)
* GEODE-7884: server hangs due to IllegalStateException
Added cancellation check before scheduling an idle-timeout or
ack-wait-threshold timer task. I had to add a new method to
SystemTimerTask and then noticed there were no tests for SystemTimer, so
I cleaned up that class and added tests.
* adding missing copyright header to new test
* fixing LGTM issues
* reinstating 'continue' when encountering a null timer during a sweep
* addressing Bill's comments
renamed swarm everywhere
made the collection of timers associated with a DistributedSystem into a Set
made timer task variables in Connection volatile
added checks in tasks to cancel themselves if their Connection is closed
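
The cancellation check described in the commit message can be sketched roughly as follows. This is a minimal illustration on top of plain java.util.Timer, not the actual Geode change: the class GuardedSchedule, the CancellableTask type, its isCancelled() method, and scheduleIfLive() are hypothetical names, and catching IllegalStateException as a fallback is an assumption of this sketch rather than something the commit states.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;

public class GuardedSchedule {

  /** Hypothetical task type that remembers whether cancel() has been called. */
  static class CancellableTask extends TimerTask {
    private final AtomicBoolean cancelled = new AtomicBoolean();

    @Override
    public boolean cancel() {
      cancelled.set(true);
      return super.cancel();
    }

    boolean isCancelled() {
      return cancelled.get();
    }

    @Override
    public void run() {
      // Timeout handling would go here; a real task could also cancel
      // itself if the connection it watches has been closed.
    }
  }

  /** Schedules the task unless it is already cancelled; returns true if scheduled. */
  static boolean scheduleIfLive(Timer timer, CancellableTask task, long delayMillis) {
    if (task.isCancelled()) {
      return false; // skip: Timer.schedule would throw for a cancelled task
    }
    try {
      timer.schedule(task, delayMillis);
      return true;
    } catch (IllegalStateException e) {
      // Assumption of this sketch: the task (or timer) may be cancelled between
      // the check and the schedule call, so treat that race as "not scheduled".
      return false;
    }
  }

  public static void main(String[] args) {
    Timer timer = new Timer("sketch-timer", true);
    CancellableTask live = new CancellableTask();
    CancellableTask dead = new CancellableTask();
    dead.cancel();

    System.out.println(scheduleIfLive(timer, live, 60_000L));
    System.out.println(scheduleIfLive(timer, dead, 60_000L));
    timer.cancel();
  }
}
```

With this shape, a sender that finds its timer task already cancelled simply skips scheduling instead of failing partway through the send.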
> server hangs due to IllegalStateException
> -----------------------------------------
>
> Key: GEODE-7884
> URL: https://issues.apache.org/jira/browse/GEODE-7884
> Project: Geode
> Issue Type: Bug
> Components: membership, messaging
> Reporter: Bruce J Schuchardt
> Assignee: Bruce J Schuchardt
> Priority: Major
> Fix For: 1.13.0
>
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> An application hung on a cache operation:
> {noformat}
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000f61617b8> (a java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
> at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
> at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:731)
> at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:802)
> at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:779)
> at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:865)
> at org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
> at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
> at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
> at org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
> at org.apache.geode.internal.cache.DistributedRegion.distributeUpdate(DistributedRegion.java:514)
> at org.apache.geode.internal.cache.DistributedRegion.basicPutPart3(DistributedRegion.java:492)
> at org.apache.geode.internal.cache.map.RegionMapPut.doAfterCompletionActions(RegionMapPut.java:307)
> at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPut(AbstractRegionMapPut.java:185)
> at org.apache.geode.internal.cache.map.AbstractRegionMapPut$$Lambda$243/1152982325.run(Unknown Source)
> at org.apache.geode.internal.cache.map.AbstractRegionMapPut.runWhileLockedForCacheModification(AbstractRegionMapPut.java:119)
> at org.apache.geode.internal.cache.map.RegionMapPut.runWhileLockedForCacheModification(RegionMapPut.java:161)
> at org.apache.geode.internal.cache.map.AbstractRegionMapPut.put(AbstractRegionMapPut.java:169)
> at org.apache.geode.internal.cache.AbstractRegionMap.basicPut(AbstractRegionMap.java:2044)
> at org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5602)
> at org.apache.geode.internal.cache.DistributedRegion.virtualPut(DistributedRegion.java:387)
> at org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5580)
> at org.apache.geode.internal.cache.LocalRegionDataView.putEntry(LocalRegionDataView.java:156)
> at org.apache.geode.internal.cache.LocalRegion.basicPut(LocalRegion.java:5038)
> at org.apache.geode.internal.cache.LocalRegion.validatedPut(LocalRegion.java:1637)
> at org.apache.geode.internal.cache.LocalRegion.put(LocalRegion.java:1624)
> {noformat}
> Logs show this same thread hitting an IllegalStateException when trying to
> send a message:
> {noformat}
> [fatal 2020/03/10 23:13:08.441 PDT <main> tid=0x67] While pushing message <> to recipients: <>
> java.lang.IllegalStateException: Task already scheduled or cancelled
> at java.util.Timer.sched(Timer.java:401)
> at java.util.Timer.schedule(Timer.java:193)
> at org.apache.geode.internal.SystemTimer.schedule(SystemTimer.java:347)
> at org.apache.geode.internal.tcp.Connection.scheduleAckTimeouts(Connection.java:1956)
> at org.apache.geode.internal.tcp.MsgStreamer.reserveConnections(MsgStreamer.java:225)
> at org.apache.geode.internal.tcp.MsgStreamerList.reserveConnections(MsgStreamerList.java:51)
> at org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:303)
> at org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:512)
> at org.apache.geode.distributed.internal.DistributionImpl.directChannelSend(DistributionImpl.java:344)
> at org.apache.geode.distributed.internal.DistributionImpl.send(DistributionImpl.java:289)
> at org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2058)
> at org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:1986)
> at org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2023)
> at org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1083)
> at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:572)
> at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
> at org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
> at org.apache.geode.internal.cache.DistributedRegion.distributeUpdate(DistributedRegion.java:514)
> at org.apache.geode.internal.cache.DistributedRegion.basicPutPart3(DistributedRegion.java:492)
> at org.apache.geode.internal.cache.map.RegionMapPut.doAfterCompletionActions(RegionMapPut.java:307)
> at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPut(AbstractRegionMapPut.java:185)
> at org.apache.geode.internal.cache.map.AbstractRegionMapPut.runWhileLockedForCacheModification(AbstractRegionMapPut.java:119)
> at org.apache.geode.internal.cache.map.RegionMapPut.runWhileLockedForCacheModification(RegionMapPut.java:161)
> at org.apache.geode.internal.cache.map.AbstractRegionMapPut.put(AbstractRegionMapPut.java:169)
> at org.apache.geode.internal.cache.AbstractRegionMap.basicPut(AbstractRegionMap.java:2044)
> at org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5602)
> at org.apache.geode.internal.cache.DistributedRegion.virtualPut(DistributedRegion.java:387)
> at org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5580)
> at org.apache.geode.internal.cache.LocalRegionDataView.putEntry(LocalRegionDataView.java:156)
> at org.apache.geode.internal.cache.LocalRegion.basicPut(LocalRegion.java:5038)
> at org.apache.geode.internal.cache.LocalRegion.validatedPut(LocalRegion.java:1637)
> at org.apache.geode.internal.cache.LocalRegion.put(LocalRegion.java:1624)
> at event.EventTest.updateObject(EventTest.java:755)
> at event.EventTest.updateObject(EventTest.java:733)
> at event.EventTest.doEntryOperations(EventTest.java:368)
> at event.EventTest.HydraTask_doEntryOperations(EventTest.java:249)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at hydra.MethExecutor.execute(MethExecutor.java:173)
> at hydra.MethExecutor.execute(MethExecutor.java:141)
> at hydra.TestTask.execute(TestTask.java:197)
> at hydra.RemoteTestModule$1.run(RemoteTestModule.java:213)
> {noformat}
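
The exception at the top of that trace comes from java.util.Timer itself, which refuses to schedule a TimerTask that is not in its initial state. The standalone sketch below reproduces that behavior using only the JDK; the class name TimerReuseDemo and the method scheduleCancelledTask() are hypothetical, and the code does not involve Geode's SystemTimer or Connection classes.

```java
import java.util.Timer;
import java.util.TimerTask;

public class TimerReuseDemo {

  /** Cancels a fresh task, then tries to schedule it; returns what happened. */
  static String scheduleCancelledTask() {
    Timer timer = new Timer("demo-timer", true);
    TimerTask task = new TimerTask() {
      @Override
      public void run() {
        // never runs in this demo
      }
    };

    // Cancelling before scheduling moves the task out of its initial state.
    task.cancel();

    try {
      timer.schedule(task, 1_000L);
      return "scheduled";
    } catch (IllegalStateException e) {
      // Timer rejects tasks that were already scheduled or cancelled.
      return e.getMessage();
    } finally {
      timer.cancel();
    }
  }

  public static void main(String[] args) {
    // Prints the same message that appears in the log excerpt above.
    System.out.println(scheduleCancelledTask());
  }
}
```

Because the exception is thrown on the sending thread before the message goes out, a caller that still waits for replies afterwards can block, which is consistent with the first trace.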