[
https://issues.apache.org/jira/browse/IGNITE-25808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Denis Chudov updated IGNITE-25808:
----------------------------------
Description:
If there are hundreds of tables and there is a delay in the placement driver
TopologyTracker for any reason, the placement driver knows nothing about the
topology change and keeps choosing the offline node as a candidate. It is not a
critical issue, but may fill the logs with warnings in just a few seconds:
{code:java}
2025-05-06 10:06:44:672 +0000
[WARNING][%gridgain-2.novalocal%JRaft-FSMCaller-Disruptor-metastorage_group_stripe_0-0][LeaseNegotiator]
Lease was not negotiated due to exception [lease=Lease
[leaseholder=gridgain-5.novalocal,
leaseholderId=3844f6cb-ee97-49bd-8d40-5b96d0a6bc82, accepted=false,
startTime=HybridTimestamp [physical=2025-05-06 10:06:44:602 +0000, logical=114,
composite=114460328237596786], expirationTime=HybridTimestamp
[physical=2025-05-06 10:08:44:602 +0000, logical=0,
composite=114460336101916672], prolongable=true, proposedCandidate=null,
replicationGroupId=590_part_13]]
org.apache.ignite.internal.network.UnresolvableConsistentIdException:
IGN-NETWORK-1 TraceId:57c4ef42-4786-431b-a6ff-a7d79edcfd6a Recipient consistent
ID cannot be resolved: gridgain-5.novalocal
at
org.apache.ignite.internal.network.DefaultMessagingService.invoke(DefaultMessagingService.java:231)
at
org.apache.ignite.internal.network.MessagingService.invoke(MessagingService.java:190)
at
org.apache.ignite.internal.placementdriver.negotiation.LeaseNegotiator.negotiate(LeaseNegotiator.java:63)
at
org.apache.ignite.internal.placementdriver.LeaseUpdater$Updater.lambda$updateLeaseBatchInternal$0(LeaseUpdater.java:590)
at
java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
at
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
at
java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
at
org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$sendWithRetry$49(RaftGroupServiceImpl.java:603)
at
java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
at
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
at
java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
at
org.apache.ignite.internal.network.DefaultMessagingService.onInvokeResponse(DefaultMessagingService.java:575)
at
org.apache.ignite.internal.network.DefaultMessagingService.send0(DefaultMessagingService.java:261)
at
org.apache.ignite.internal.network.DefaultMessagingService.respond(DefaultMessagingService.java:204)
at
org.apache.ignite.internal.network.MessagingService.respond(MessagingService.java:107)
at
org.apache.ignite.raft.jraft.rpc.impl.IgniteRpcServer$NetworkRpcContext.sendResponse(IgniteRpcServer.java:240)
at
org.apache.ignite.raft.jraft.rpc.impl.ActionRequestProcessor.sendResponse(ActionRequestProcessor.java:268)
at
org.apache.ignite.raft.jraft.rpc.impl.ActionRequestProcessor$1.result(ActionRequestProcessor.java:177)
at
org.apache.ignite.internal.raft.server.impl.JraftServerImpl$WriteCommandIterator$1.result(JraftServerImpl.java:995)
at
org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler$ResultCachingClosure.result(MetaStorageWriteHandler.java:453)
at
org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler.handleWriteWithTime(MetaStorageWriteHandler.java:217)
at
org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler.handleNonCachedWriteCommand(MetaStorageWriteHandler.java:153)
at
org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler.handleWriteCommand(MetaStorageWriteHandler.java:123)
at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
at
org.apache.ignite.internal.metastorage.server.raft.MetaStorageListener.onWrite(MetaStorageListener.java:193)
at
org.apache.ignite.internal.raft.server.impl.JraftServerImpl$DelegatingStateMachine.onApply(JraftServerImpl.java:825)
at
org.apache.ignite.raft.jraft.core.FSMCallerImpl.doApplyTasks(FSMCallerImpl.java:570)
at
org.apache.ignite.raft.jraft.core.FSMCallerImpl.doCommitted(FSMCallerImpl.java:536)
at
org.apache.ignite.raft.jraft.core.FSMCallerImpl.runApplyTask(FSMCallerImpl.java:454)
at
org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:123)
at
org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:117)
at
org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:322)
at
org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:279)
at
com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:167)
at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:122)
at java.base/java.lang.Thread.run(Thread.java:840){code}
As a possible solution, we may throttle this log.
was:
If there are hundreds of tables and there is a delay in the placement driver
TopologyTracker for any reason, the placement driver knows nothing about the
topology change and keeps choosing the offline node as a candidate. It is not a
critical issue, but may fill the logs with warnings in just a few seconds:
{code:java}
2025-05-06 10:06:44:672 +0000
[WARNING][%gridgain-2.novalocal%JRaft-FSMCaller-Disruptor-metastorage_group_stripe_0-0][LeaseNegotiator]
Lease was not negotiated due to exception [lease=Lease
[leaseholder=gridgain-5.novalocal,
leaseholderId=3844f6cb-ee97-49bd-8d40-5b96d0a6bc82, accepted=false,
startTime=HybridTimestamp [physical=2025-05-06 10:06:44:602 +0000, logical=114,
composite=114460328237596786], expirationTime=HybridTimestamp
[physical=2025-05-06 10:08:44:602 +0000, logical=0,
composite=114460336101916672], prolongable=true, proposedCandidate=null,
replicationGroupId=590_part_13]]org.apache.ignite.internal.network.UnresolvableConsistentIdException:
IGN-NETWORK-1 TraceId:57c4ef42-4786-431b-a6ff-a7d79edcfd6a Recipient
consistent ID cannot be resolved: gridgain-5.novalocal at
org.apache.ignite.internal.network.DefaultMessagingService.invoke(DefaultMessagingService.java:231)
at
org.apache.ignite.internal.network.MessagingService.invoke(MessagingService.java:190)
at
org.apache.ignite.internal.placementdriver.negotiation.LeaseNegotiator.negotiate(LeaseNegotiator.java:63)
at
org.apache.ignite.internal.placementdriver.LeaseUpdater$Updater.lambda$updateLeaseBatchInternal$0(LeaseUpdater.java:590)
at
java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
at
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
at
java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
at
org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$sendWithRetry$49(RaftGroupServiceImpl.java:603)
at
java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
at
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
at
java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
at
org.apache.ignite.internal.network.DefaultMessagingService.onInvokeResponse(DefaultMessagingService.java:575)
at
org.apache.ignite.internal.network.DefaultMessagingService.send0(DefaultMessagingService.java:261)
at
org.apache.ignite.internal.network.DefaultMessagingService.respond(DefaultMessagingService.java:204)
at
org.apache.ignite.internal.network.MessagingService.respond(MessagingService.java:107)
at
org.apache.ignite.raft.jraft.rpc.impl.IgniteRpcServer$NetworkRpcContext.sendResponse(IgniteRpcServer.java:240)
at
org.apache.ignite.raft.jraft.rpc.impl.ActionRequestProcessor.sendResponse(ActionRequestProcessor.java:268)
at
org.apache.ignite.raft.jraft.rpc.impl.ActionRequestProcessor$1.result(ActionRequestProcessor.java:177)
at
org.apache.ignite.internal.raft.server.impl.JraftServerImpl$WriteCommandIterator$1.result(JraftServerImpl.java:995)
at
org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler$ResultCachingClosure.result(MetaStorageWriteHandler.java:453)
at
org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler.handleWriteWithTime(MetaStorageWriteHandler.java:217)
at
org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler.handleNonCachedWriteCommand(MetaStorageWriteHandler.java:153)
at
org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler.handleWriteCommand(MetaStorageWriteHandler.java:123)
at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133) at
org.apache.ignite.internal.metastorage.server.raft.MetaStorageListener.onWrite(MetaStorageListener.java:193)
at
org.apache.ignite.internal.raft.server.impl.JraftServerImpl$DelegatingStateMachine.onApply(JraftServerImpl.java:825)
at
org.apache.ignite.raft.jraft.core.FSMCallerImpl.doApplyTasks(FSMCallerImpl.java:570)
at
org.apache.ignite.raft.jraft.core.FSMCallerImpl.doCommitted(FSMCallerImpl.java:536)
at
org.apache.ignite.raft.jraft.core.FSMCallerImpl.runApplyTask(FSMCallerImpl.java:454)
at
org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:123)
at
org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:117)
at
org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:322)
at
org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:279)
at
com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:167)
at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:122)
at java.base/java.lang.Thread.run(Thread.java:840) {code}
As a possible solution, we may throttle this log.
> Lease negotiator may flood the log if a candidate left and topology tracker
> not updated
> ---------------------------------------------------------------------------------------
>
> Key: IGNITE-25808
> URL: https://issues.apache.org/jira/browse/IGNITE-25808
> Project: Ignite
> Issue Type: Bug
> Reporter: Denis Chudov
> Priority: Major
> Labels: ignite-3
>
> If there are hundreds of tables and there is a delay in the placement driver
> TopologyTracker for any reason, the placement driver knows nothing about the
> topology change and keeps choosing the offline node as a candidate. It is not
> a critical issue, but may fill the logs with warnings in just a few seconds:
> {code:java}
> 2025-05-06 10:06:44:672 +0000
> [WARNING][%gridgain-2.novalocal%JRaft-FSMCaller-Disruptor-metastorage_group_stripe_0-0][LeaseNegotiator]
> Lease was not negotiated due to exception [lease=Lease
> [leaseholder=gridgain-5.novalocal,
> leaseholderId=3844f6cb-ee97-49bd-8d40-5b96d0a6bc82, accepted=false,
> startTime=HybridTimestamp [physical=2025-05-06 10:06:44:602 +0000,
> logical=114, composite=114460328237596786], expirationTime=HybridTimestamp
> [physical=2025-05-06 10:08:44:602 +0000, logical=0,
> composite=114460336101916672], prolongable=true, proposedCandidate=null,
> replicationGroupId=590_part_13]]
> org.apache.ignite.internal.network.UnresolvableConsistentIdException:
> IGN-NETWORK-1 TraceId:57c4ef42-4786-431b-a6ff-a7d79edcfd6a Recipient
> consistent ID cannot be resolved: gridgain-5.novalocal
> at
> org.apache.ignite.internal.network.DefaultMessagingService.invoke(DefaultMessagingService.java:231)
> at
> org.apache.ignite.internal.network.MessagingService.invoke(MessagingService.java:190)
> at
> org.apache.ignite.internal.placementdriver.negotiation.LeaseNegotiator.negotiate(LeaseNegotiator.java:63)
> at
> org.apache.ignite.internal.placementdriver.LeaseUpdater$Updater.lambda$updateLeaseBatchInternal$0(LeaseUpdater.java:590)
> at
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
> at
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
> at
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
> at
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
> at
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$sendWithRetry$49(RaftGroupServiceImpl.java:603)
> at
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
> at
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
> at
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
> at
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
> at
> org.apache.ignite.internal.network.DefaultMessagingService.onInvokeResponse(DefaultMessagingService.java:575)
> at
> org.apache.ignite.internal.network.DefaultMessagingService.send0(DefaultMessagingService.java:261)
> at
> org.apache.ignite.internal.network.DefaultMessagingService.respond(DefaultMessagingService.java:204)
> at
> org.apache.ignite.internal.network.MessagingService.respond(MessagingService.java:107)
> at
> org.apache.ignite.raft.jraft.rpc.impl.IgniteRpcServer$NetworkRpcContext.sendResponse(IgniteRpcServer.java:240)
> at
> org.apache.ignite.raft.jraft.rpc.impl.ActionRequestProcessor.sendResponse(ActionRequestProcessor.java:268)
> at
> org.apache.ignite.raft.jraft.rpc.impl.ActionRequestProcessor$1.result(ActionRequestProcessor.java:177)
> at
> org.apache.ignite.internal.raft.server.impl.JraftServerImpl$WriteCommandIterator$1.result(JraftServerImpl.java:995)
> at
> org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler$ResultCachingClosure.result(MetaStorageWriteHandler.java:453)
> at
> org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler.handleWriteWithTime(MetaStorageWriteHandler.java:217)
> at
> org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler.handleNonCachedWriteCommand(MetaStorageWriteHandler.java:153)
> at
> org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler.handleWriteCommand(MetaStorageWriteHandler.java:123)
> at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
> at
> org.apache.ignite.internal.metastorage.server.raft.MetaStorageListener.onWrite(MetaStorageListener.java:193)
> at
> org.apache.ignite.internal.raft.server.impl.JraftServerImpl$DelegatingStateMachine.onApply(JraftServerImpl.java:825)
> at
> org.apache.ignite.raft.jraft.core.FSMCallerImpl.doApplyTasks(FSMCallerImpl.java:570)
> at
> org.apache.ignite.raft.jraft.core.FSMCallerImpl.doCommitted(FSMCallerImpl.java:536)
> at
> org.apache.ignite.raft.jraft.core.FSMCallerImpl.runApplyTask(FSMCallerImpl.java:454)
> at
> org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:123)
> at
> org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:117)
> at
> org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:322)
> at
> org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:279)
> at
> com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:167)
> at
> com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:122)
> at java.base/java.lang.Thread.run(Thread.java:840){code}
> As a possible solution, we may throttle this log.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)