[jira] [Updated] (IGNITE-25915) Critical system error in ItIgniteNodeRestartTest on sendWithRetryTimeout
[
https://issues.apache.org/jira/browse/IGNITE-25915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kirill Sizov updated IGNITE-25915:
---
Description:
The tests "ItIgniteNodeRestartTest.testRestartDiffConfig",
"testOneNodeRestartWithGap", and "testCfgGapWithoutData" fail with
"PeerUnavailableException":
{noformat}
[10:58:19]W: [:ignite-runner:integrationTest] org.apache.ignite.internal.failure.StackTraceCapturingException: Unknown error
[10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.failure.FailureManager.process(FailureManager.java:191)
[10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.failure.FailureManager.process(FailureManager.java:168)
[10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.metastorage.server.WatchProcessor.notifyFailureHandlerOnFirstFailureInNotificationChain(WatchProcessor.java:441)
[10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.metastorage.server.WatchProcessor.lambda$enqueue$3(WatchProcessor.java:240)
[10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
[10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
[10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
[10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
[10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:686)
[10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:660)
[10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$scheduleRetry$51(RaftGroupServiceImpl.java:910)
[10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
[10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
[10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.lang.Thread.run(Thread.java:833)
[10:58:19]W: [:ignite-runner:integrationTest] Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException: Send with retry timed out [retryCount = 340, groupId = metastorage_group, traceId = null, request = org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl(org.apache.ignite.internal.metastorage.command.MultiInvokeCommandImpl), originCommand = null, retryReasons = [
  [time=1752595098493, msg=Peer iinrt_tonrwg_0:0 threw PeerUnavailableException; attemptWaitDuration=18, attemptDuration=2, attemptStartTime=2025-07-15T15:58:18,493],
  [time=1752595098513, msg=Peer iinrt_tonrwg_0:0 threw PeerUnavailableException; attemptWaitDuration=18, attemptDuration=2, attemptStartTime=2025-07-15T15:58:18,513],
  [time=1752595098533, msg=Peer iinrt_tonrwg_0:0 threw PeerUnavailableException; attemptWaitDuration=18, attemptDuration=2, attemptStartTime=2025-07-15T15:58:18,533],
  [time=1752595098553, msg=Peer iinrt_tonrwg_0:0 threw PeerUnavailableException; attemptWaitDuration=18, attemptDuration=2, attemptStartTime=2025-07-15T15:58:18,553],
  [time=1752595098573, msg=Peer iinrt_tonrwg_0:0 threw PeerUnavailableException; attemptWaitDuration=18, attemptDuration=2, attemptStartTime=2025-07-15T15:58:18,573],
  [time=1752595098594, msg=Peer iinrt_tonrwg_0:0 threw PeerUnavailableException; attemptWaitDuration=18, attemptDuration=3, attemptStartTime=2025-07-15T15:58:18,594],
  [time=1752595098614, msg=Peer iinrt_tonrwg_0:0 threw PeerUnavailableException; attemptWaitDuration=17, attemptDuration=3, attemptStartTime=2025-07-15T15:58:18,614],
  [time=1752595098634, msg=Peer iinrt_tonrwg_0:0 threw PeerUnavailableException; attemptWaitDuration=17, attemptDuration=3, attemptStartTime=2025-07-15T15:58:18,634],
  [time=1752595098654, msg=Peer iinrt_tonrwg_0:0 threw PeerUnavailableException; attemptWaitDuration=17, attemp
[jira] [Updated] (IGNITE-25915) Critical system error in ItIgniteNodeRestartTest on sendWithRetryTimeout
[
https://issues.apache.org/jira/browse/IGNITE-25915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kirill Sizov updated IGNITE-25915:
---
Summary: Critical system error in ItIgniteNodeRestartTest on
sendWithRetryTimeout (was: Critical system error on sendWithRetryTimeout)
> Critical system error in ItIgniteNodeRestartTest on sendWithRetryTimeout
>
>
> Key: IGNITE-25915
> URL: https://issues.apache.org/jira/browse/IGNITE-25915
> Project: Ignite
> Issue Type: Bug
>Affects Versions: 3.1
>Reporter: Kirill Sizov
>Priority: Major
> Labels: ignite-3
> Attachments: Integration Tests Module Runner 41134.zip,
> _Integration_Tests_Module_Runner_41125.log.zip
>
>
> The tests "ItIgniteNodeRestartTest.testRestartDiffConfig" and
> "testOneNodeRestartWithGap" fail with "PeerUnavailableException":
> {noformat}
> [10:58:19]W: [:ignite-runner:integrationTest] org.apache.ignite.internal.failure.StackTraceCapturingException: Unknown error
> [10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.failure.FailureManager.process(FailureManager.java:191)
> [10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.failure.FailureManager.process(FailureManager.java:168)
> [10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.metastorage.server.WatchProcessor.notifyFailureHandlerOnFirstFailureInNotificationChain(WatchProcessor.java:441)
> [10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.metastorage.server.WatchProcessor.lambda$enqueue$3(WatchProcessor.java:240)
> [10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
> [10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
> [10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
> [10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
> [10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:686)
> [10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:660)
> [10:58:19]W: [:ignite-runner:integrationTest]     at org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$scheduleRetry$51(RaftGroupServiceImpl.java:910)
> [10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
> [10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> [10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
> [10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> [10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> [10:58:19]W: [:ignite-runner:integrationTest]     at java.base/java.lang.Thread.run(Thread.java:833)
> [10:58:19]W: [:ignite-runner:integrationTest] Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException: Send with retry timed out [retryCount = 340, groupId = metastorage_group, traceId = null, request = org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl(org.apache.ignite.internal.metastorage.command.MultiInvokeCommandImpl), originCommand = null, retryReasons = [
>   [time=1752595098493, msg=Peer iinrt_tonrwg_0:0 threw PeerUnavailableException; attemptWaitDuration=18, attemptDuration=2, attemptStartTime=2025-07-15T15:58:18,493],
>   [time=1752595098513, msg=Peer iinrt_tonrwg_0:0 threw PeerUnavailableException; attemptWaitDuration=18, attemptDuration=2, attemptStartTime=2025-07-15T15:58:18,513],
>   [time=1752595098533, msg=Peer iinrt_tonrwg_0:0 threw PeerUnavailableException; attemptWaitDuration=18, attemptDuration=2, attemptStartTime=2025-07-15T15:58:18,533],
>   [time=1752595098553, msg=Peer iinrt_tonrwg_0:0 threw PeerUnavailableException; attemptWaitDuration=18, attemptDuration=2, attemptStartTime=2025-07-15T15:58:18,553],
>   [time=175
