[
https://issues.apache.org/jira/browse/GEODE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904880#comment-17904880
]
Leon Finker commented on GEODE-10453:
-------------------------------------
It's a bug specific to the compact index logic. After switching to an
asynchronous (non-compact) index, the issue doesn't happen. It also happens on
server cache startup and on index creation over existing data received as the
initial snapshot from another peer. And it's not really possible to work around
when using overflow-to-disk regions, because those do not support non-compact
indexes.
{noformat}
[warn <ThreadsMonitor> tid=55] Thread <77> (0x4d) that was executed at <07 Dec
2024 12:47:28 EST> has been stuck for <994.887 seconds> and number of thread
monitor iteration <17>
Thread Name <Pooled High Priority Message Processor 3> state <RUNNABLE>
Executor Group <PooledExecutorWithDMStats>
Monitored metric <ResourceManagerStats.numThreadsStuck>
Thread stack for "Pooled High Priority Message Processor 3" (0x4d):
java.lang.ThreadState: RUNNABLE
at java.base/java.lang.Throwable.fillInStackTrace(Native Method)
at java.base/java.lang.Throwable.fillInStackTrace(Throwable.java:798)
at java.base/java.lang.Throwable.<init>(Throwable.java:271)
at java.base/java.lang.Exception.<init>(Exception.java:67)
at
java.base/java.lang.RuntimeException.<init>(RuntimeException.java:63)
at
java.base/java.lang.ClassCastException.<init>(ClassCastException.java:57)
at java.base/java.lang.String.compareTo(String.java:140)
at
app//org.apache.geode.cache.query.internal.types.TypeUtils$ComparisonStrategy$4.execute(TypeUtils.java:90)
at
app//org.apache.geode.cache.query.internal.types.TypeUtils.compare(TypeUtils.java:499)
at
app//org.apache.geode.cache.query.internal.index.MemoryIndexStore.getOldKey(MemoryIndexStore.java:275)
at
app//org.apache.geode.cache.query.internal.index.MemoryIndexStore.basicRemoveMapping(MemoryIndexStore.java:399)
at
app//org.apache.geode.cache.query.internal.index.MemoryIndexStore.removeMapping(MemoryIndexStore.java:298)
at
app//org.apache.geode.cache.query.internal.index.CompactRangeIndex.removeMapping(CompactRangeIndex.java:173)
at
app//org.apache.geode.cache.query.internal.index.AbstractIndex.removeIndexMapping(AbstractIndex.java:508)
at
app//org.apache.geode.cache.query.internal.index.IndexManager.removeIndexMapping(IndexManager.java:1156)
at
app//org.apache.geode.cache.query.internal.index.IndexManager.processAction(IndexManager.java:1121)
at
app//org.apache.geode.cache.query.internal.index.IndexManager.updateIndexes(IndexManager.java:982)
at
app//org.apache.geode.cache.query.internal.index.IndexManager.updateIndexes(IndexManager.java:956)
at
app//org.apache.geode.internal.cache.AbstractRegionMap.initialImagePut(AbstractRegionMap.java:836)
at
app//org.apache.geode.internal.cache.InitialImageOperation.processChunk(InitialImageOperation.java:980)
at
app//org.apache.geode.internal.cache.InitialImageOperation$ImageProcessor.process(InitialImageOperation.java:1306)
at
app//org.apache.geode.distributed.internal.ReplyMessage.process(ReplyMessage.java:215)
at
app//org.apache.geode.internal.cache.InitialImageOperation$ImageReplyMessage.process(InitialImageOperation.java:2829)
at
app//org.apache.geode.distributed.internal.ReplyMessage.dmProcess(ReplyMessage.java:198)
at
app//org.apache.geode.distributed.internal.ReplyMessage.process(ReplyMessage.java:191)
at
app//org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:380)
at
app//org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:445)
at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at
app//org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:449)
at
app//org.apache.geode.distributed.internal.ClusterOperationExecutors.doHighPriorityThread(ClusterOperationExecutors.java:407)
at
app//org.apache.geode.distributed.internal.ClusterOperationExecutors$$Lambda$312/0x00000008018c1e50.invoke(Unknown
Source)
at
app//org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120)
at
app//org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$310/0x00000008018c1780.run(Unknown
Source)
at java.base/java.lang.Thread.run(Thread.java:833)
Locked ownable synchronizers:
- None
{noformat}
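As a rough illustration of why the stuck thread sits in fillInStackTrace under String.compareTo: comparing a String key against a key of another type through the raw Comparable interface throws a ClassCastException, and each exception has to capture a stack trace. The sketch below is standalone Java, not Geode code. The class and method names are hypothetical, and it assumes the index holds keys of mixed types, which the trace suggests but does not prove.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch, not Geode code: all names here are hypothetical.
final class MixedKeyScanSketch {

    // Compares every stored key against the probe through raw Comparable
    // and counts how many comparisons threw ClassCastException.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int countFailedComparisons(List<Object> keys, Object probe) {
        int failed = 0;
        for (Object key : keys) {
            try {
                ((Comparable) key).compareTo(probe);
            } catch (ClassCastException e) {
                // Constructing the exception normally runs
                // Throwable.fillInStackTrace, the top frame in the trace above.
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<Object> keys = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            keys.add("key-" + i); // String keys
        }
        Object probe = Integer.valueOf(42); // a key of a different type

        long start = System.nanoTime();
        int failed = countFailedComparisons(keys, probe);
        long micros = (System.nanoTime() - start) / 1_000;

        System.out.println(failed + " comparisons threw ClassCastException in "
            + micros + " us");
    }
}
```

If the keys really are of mixed types, a lookup that scans entries this way pays for one exception per mismatched entry, which would be consistent with the slow indexing reported here.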
> Infinite/slow indexing on reconnect and register interest replay
> ----------------------------------------------------------------
>
> Key: GEODE-10453
> URL: https://issues.apache.org/jira/browse/GEODE-10453
> Project: Geode
> Issue Type: Bug
> Affects Versions: 1.15.1
> Reporter: Leon Finker
> Priority: Major
>
> The cache server was restarted. On reconnect, the client went into an
> infinite/slow indexing loop and had not recovered even after multiple days.
> The thread stack for the thread taking 100% CPU was:
> {code}
> Thread Name <poolTimer-Server-21659> state <BLOCKED>
> Waiting on <org.apache.geode.cache.client.internal.ConnectionImpl@293d172>
> Owned By <queueTimer-Server1> with ID <140>
> Executor Group <ScheduledThreadPoolExecutorWithKeepAlive>
> Monitored metric <ResourceManagerStats.numThreadsStuck>
> Thread stack for "poolTimer-Server-21659" (0x128275):
> java.lang.ThreadState: BLOCKED
> at
> app//org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:283)
> at
> app//org.apache.geode.cache.client.internal.QueueConnectionImpl.execute(QueueConnectionImpl.java:191)
> at
> app//org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:760)
> at
> app//org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnServer(OpExecutorImpl.java:343)
> at
> app//org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:312)
> at
> app//org.apache.geode.cache.client.internal.PoolImpl.executeOn(PoolImpl.java:848)
> at
> app//org.apache.geode.cache.client.internal.PingOp.execute(PingOp.java:40)
> at
> app//org.apache.geode.cache.client.internal.LiveServerPinger$PingTask.run2(LiveServerPinger.java:128)
> at
> app//org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1340)
> at
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
> at
> java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
> at
> app//org.apache.geode.internal.ScheduledThreadPoolExecutorWithKeepAlive$DelegatingScheduledFuture.run(ScheduledThreadPoolExecutorWithKeepAlive.java:285)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> at java.base/java.lang.Thread.run(Thread.java:833)
> Locked ownable synchronizers:
> - None
> Lock owner thread stack for "queueTimer-Server1" (0x6a):
> java.lang.ThreadState: RUNNABLE
> at
> app//org.apache.geode.cache.query.internal.types.TypeUtils$ComparisonStrategy$4.execute(TypeUtils.java:90)
> at
> app//org.apache.geode.cache.query.internal.types.TypeUtils.compare(TypeUtils.java:499)
> at
> app//org.apache.geode.cache.query.internal.index.MemoryIndexStore.getOldKey(MemoryIndexStore.java:275)
> at
> app//org.apache.geode.cache.query.internal.index.MemoryIndexStore.updateMapping(MemoryIndexStore.java:122)
> at
> app//org.apache.geode.cache.query.internal.index.CompactRangeIndex$IMQEvaluator.applyProjection(CompactRangeIndex.java:1563)
> at
> app//org.apache.geode.cache.query.internal.index.CompactRangeIndex$IMQEvaluator.doNestedIterations(CompactRangeIndex.java:1519)
> at
> app//org.apache.geode.cache.query.internal.index.CompactRangeIndex$IMQEvaluator.evaluate(CompactRangeIndex.java:1372)
> at
> app//org.apache.geode.cache.query.internal.index.CompactRangeIndex.addMapping(CompactRangeIndex.java:143)
> at
> app//org.apache.geode.cache.query.internal.index.AbstractIndex.addIndexMapping(AbstractIndex.java:488)
> at
> app//org.apache.geode.cache.query.internal.index.IndexManager.addIndexMapping(IndexManager.java:1143)
> at
> app//org.apache.geode.cache.query.internal.index.IndexManager.processAction(IndexManager.java:1089)
> at
> app//org.apache.geode.cache.query.internal.index.IndexManager.updateIndexes(IndexManager.java:982)
> at
> app//org.apache.geode.cache.query.internal.index.IndexManager.updateIndexes(IndexManager.java:956)
> at
> app//org.apache.geode.internal.cache.AbstractRegionMap.initialImagePut(AbstractRegionMap.java:839)
> at
> app//org.apache.geode.internal.cache.LocalRegion.refreshEntriesFromServerKeys(LocalRegion.java:4348)
> at
> app//org.apache.geode.cache.client.internal.RegisterInterestOp$RegisterInterestOpImpl.processResponse(RegisterInterestOp.java:217)
> at
> app//org.apache.geode.cache.client.internal.RegisterInterestOp$RegisterInterestOpImpl.processResponse(RegisterInterestOp.java:121)
> at
> app//org.apache.geode.cache.client.internal.AbstractOp.attemptReadResponse(AbstractOp.java:209)
> at
> app//org.apache.geode.cache.client.internal.AbstractOp.attempt(AbstractOp.java:394)
> at
> app//org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:284)
> at
> app//org.apache.geode.cache.client.internal.QueueConnectionImpl.execute(QueueConnectionImpl.java:191)
> at
> app//org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:760)
> at
> app//org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:475)
> at
> app//org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:488)
> at
> app//org.apache.geode.cache.client.internal.PoolImpl.executeOn(PoolImpl.java:861)
> at
> app//org.apache.geode.cache.client.internal.RegisterInterestOp.executeOn(RegisterInterestOp.java:113)
> at
> app//org.apache.geode.cache.client.internal.ServerRegionProxy.registerInterestOn(ServerRegionProxy.java:506)
> at
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverSingleKey(QueueManagerImpl.java:1236)
> at
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverSingleRegion(QueueManagerImpl.java:1183)
> at
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverSingleList(QueueManagerImpl.java:1129)
> at
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverInterestList(QueueManagerImpl.java:1250)
> at
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverAllInterestTypes(QueueManagerImpl.java:1264)
> at
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverInterest(QueueManagerImpl.java:1094)
> at
> app//org.apache.geode.cache.client.internal.QueueManagerImpl.recoverPrimary(QueueManagerImpl.java:938)
> at
> app//org.apache.geode.cache.client.internal.QueueManagerImpl$RedundancySatisfierTask.run2(QueueManagerImpl.java:1475)
> at
> app//org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1340)
> at
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
> at
> app//org.apache.geode.cache.query.internal.index.CompactRangeIndex$IMQEvaluator.doNestedIterations(CompactRangeIndex.java:1509)
> {code}
> After an attempt to stop the client and close the cache, the following stack
> trace was logged:
> {code}
> The index is corrupted and marked as invalid.
> org.apache.geode.cache.CacheClosedException: The cache is closed.
> at
> org.apache.geode.internal.cache.GemFireCacheImpl$Stopper.generateCancelledException(GemFireCacheImpl.java:5207)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.CancelCriterion.checkCancelInProgress(CancelCriterion.java:83)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.internal.cache.LocalRegion.checkRegionDestroyed(LocalRegion.java:7382)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.internal.cache.LocalRegion.checkReadiness(LocalRegion.java:2788)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.internal.cache.LocalRegion.values(LocalRegion.java:1970)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.query.internal.QRegion.<init>(QRegion.java:81)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.query.internal.index.DummyQRegion.<init>(DummyQRegion.java:52)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.query.internal.index.CompactRangeIndex$IMQEvaluator.evaluate(CompactRangeIndex.java:1342)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.query.internal.index.CompactRangeIndex.addMapping(CompactRangeIndex.java:143)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.query.internal.index.AbstractIndex.addIndexMapping(AbstractIndex.java:488)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.query.internal.index.IndexManager.addIndexMapping(IndexManager.java:1143)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.query.internal.index.IndexManager.processAction(IndexManager.java:1089)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.query.internal.index.IndexManager.updateIndexes(IndexManager.java:982)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.query.internal.index.IndexManager.updateIndexes(IndexManager.java:956)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.internal.cache.AbstractRegionMap.initialImagePut(AbstractRegionMap.java:839)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.internal.cache.LocalRegion.refreshEntriesFromServerKeys(LocalRegion.java:4348)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.RegisterInterestOp$RegisterInterestOpImpl.processResponse(RegisterInterestOp.java:217)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.RegisterInterestOp$RegisterInterestOpImpl.processResponse(RegisterInterestOp.java:121)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.AbstractOp.attemptReadResponse(AbstractOp.java:209)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.AbstractOp.attempt(AbstractOp.java:394)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:284)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.QueueConnectionImpl.execute(QueueConnectionImpl.java:191)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:760)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:475)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:488)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.PoolImpl.executeOn(PoolImpl.java:861)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.RegisterInterestOp.executeOn(RegisterInterestOp.java:113)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.ServerRegionProxy.registerInterestOn(ServerRegionProxy.java:506)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverSingleKey(QueueManagerImpl.java:1236)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverSingleRegion(QueueManagerImpl.java:1183)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverSingleList(QueueManagerImpl.java:1129)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverInterestList(QueueManagerImpl.java:1250)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverAllInterestTypes(QueueManagerImpl.java:1264)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverInterest(QueueManagerImpl.java:1094)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverPrimary(QueueManagerImpl.java:938)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.QueueManagerImpl$RedundancySatisfierTask.run2(QueueManagerImpl.java:1475)
> ~[geode-core-1.15.1.jar:?]
> at
> org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1340)
> ~[geode-core-1.15.1.jar:?]
> at
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
> ~[?:?]
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> ~[?:?]
> {code}