Kirill Sizov created IGNITE-28543:
-------------------------------------
Summary: VacuumTxStateReplicaRequest times out
Key: IGNITE-28543
URL: https://issues.apache.org/jira/browse/IGNITE-28543
Project: Ignite
Issue Type: Bug
Reporter: Kirill Sizov
We see many failed vacuum attempts with the following stack trace:
{noformat}
2026-01-20 05:25:27:479 +0000 [ERROR][%cac-dpd-cde-gg-aks-dev-1%partition-operations-9][FailureManager] Critical system error detected. Will be handled accordingly to configured handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=CRITICAL_ERROR, failureCtxId=476be4a6-0f5c-451d-b53f-294d309a2369]
org.apache.ignite.internal.failure.StackTraceCapturingException: IGN-CMN-65535 Failed to vacuum tx states from the persistent storage. TraceId:f5b8d83b
    at org.apache.ignite.internal.failure.FailureManager.process(FailureManager.java:191)
    at org.apache.ignite.internal.failure.FailureManager.process(FailureManager.java:168)
    at org.apache.ignite.internal.tx.impl.PersistentTxStateVacuumizer.lambda$vacuumPersistentTxStates$0(PersistentTxStateVacuumizer.java:147)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source)
    at org.apache.ignite.internal.replicator.ReplicaService.lambda$sendToReplicaRaw$1(ReplicaService.java:148)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.util.concurrent.CompletionException: org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: IGN-REP-3 Replication is timed out [replicaGrpId=2297_part_21] TraceId:f5b8d83b
    at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(Unknown Source)
    ... 6 more
{noformat}
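We also see the following thread dump, in which a partition-operations thread is inside VacuumTxStatesCommandSerializer.writeMessage: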
{noformat}
Thread [name="%cac-dpd-cde-gg-aks-dev-1%partition-operations-10", id=290, state=RUNNABLE, blockCnt=2610322, waitCnt=1208642]
    at app//org.apache.ignite.internal.network.direct.stream.DirectByteBufferStreamImplV1.write(DirectByteBufferStreamImplV1.java:2156)
    at app//org.apache.ignite.internal.network.direct.stream.DirectByteBufferStreamImplV1.writeCollection(DirectByteBufferStreamImplV1.java:932)
    at app//org.apache.ignite.internal.network.direct.stream.DirectByteBufferStreamImplV1.writeSet(DirectByteBufferStreamImplV1.java:950)
    at app//org.apache.ignite.internal.network.direct.DirectMessageWriter.writeSet(DirectMessageWriter.java:453)
    at app//org.apache.ignite.internal.tx.message.VacuumTxStatesCommandSerializer.writeMessage(VacuumTxStatesCommandSerializer.java:35)
    at app//org.apache.ignite.internal.tx.message.VacuumTxStatesCommandSerializer.writeMessage(VacuumTxStatesCommandSerializer.java:9)
    at app//org.apache.ignite.internal.network.direct.stream.DirectByteBufferStreamImplV1.writeMessage(DirectByteBufferStreamImplV1.java:856)
    at app//org.apache.ignite.internal.raft.util.OptimizedMarshaller.marshall(OptimizedMarshaller.java:120)
    at app//org.apache.ignite.internal.table.distributed.schema.ThreadLocalPartitionCommandsMarshaller.marshall(ThreadLocalPartitionCommandsMarshaller.java:44)
    at app//org.apache.ignite.internal.raft.client.RaftGroupServiceImpl.lambda$run$44(RaftGroupServiceImpl.java:561)
    at app//org.apache.ignite.internal.raft.client.RaftGroupServiceImpl$$Lambda/0x0000000800cc3ed8.apply(Unknown Source)
    at app//org.apache.ignite.internal.raft.client.RetryContext.<init>(RetryContext.java:116)
    at app//org.apache.ignite.internal.raft.client.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:680)
    at app//org.apache.ignite.internal.raft.client.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:635)
    at app//org.apache.ignite.internal.raft.client.RaftGroupServiceImpl.run(RaftGroupServiceImpl.java:574)
    at app//org.apache.ignite.internal.raft.client.RaftGroupServiceImpl.run(RaftGroupServiceImpl.java:545)
    at app//org.apache.ignite.internal.raft.client.TopologyAwareRaftGroupService.run(TopologyAwareRaftGroupService.java:487)
    at app//org.apache.ignite.internal.raft.ExecutorInclinedRaftCommandRunner.run(ExecutorInclinedRaftCommandRunner.java:36)
    at app//org.apache.ignite.internal.partition.replicator.ReplicationRaftCommandApplicator.applyCommand(ReplicationRaftCommandApplicator.java:89)
    at app//org.apache.ignite.internal.partition.replicator.handlers.VacuumTxStateReplicaRequestHandler.handle(VacuumTxStateReplicaRequestHandler.java:48)
    at app//org.apache.ignite.internal.partition.replicator.ZonePartitionReplicaListener.processZoneReplicaRequest(ZonePartitionReplicaListener.java:305)
    at app//org.apache.ignite.internal.partition.replicator.ZonePartitionReplicaListener.processRequest(ZonePartitionReplicaListener.java:241)
    at app//org.apache.ignite.internal.partition.replicator.ZonePartitionReplicaListener.lambda$invoke$0(ZonePartitionReplicaListener.java:208)
    at app//org.apache.ignite.internal.partition.replicator.ZonePartitionReplicaListener$$Lambda/0x0000000800f73b58.apply(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.thenCompose(Unknown Source)
    at app//org.apache.ignite.internal.partition.replicator.ZonePartitionReplicaListener.invoke(ZonePartitionReplicaListener.java:208)
    at app//org.apache.ignite.internal.replicator.ZonePartitionReplicaImpl.processRequest(ZonePartitionReplicaImpl.java:67)
    at app//org.apache.ignite.internal.replicator.ReplicaManager.handleReplicaRequest(ReplicaManager.java:382)
    at app//org.apache.ignite.internal.replicator.ReplicaManager.lambda$onReplicaMessageReceived$0(ReplicaManager.java:313)
    at app//org.apache.ignite.internal.replicator.ReplicaManager$$Lambda/0x0000000800f774b8.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.runWith(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)

    Locked synchronizers:
        java.util.concurrent.ThreadPoolExecutor$Worker@297a2b4b
{noformat}