Fedor Malchikov created IGNITE-16423:
-----------------------------------------

Summary: [%my-first-node%JRaft-Common-Executor-1][SnapshotExecutorImpl] Fail to close writer
Key: IGNITE-16423
URL: https://issues.apache.org/jira/browse/IGNITE-16423
Project: Ignite
Issue Type: Bug
Affects Versions: 3.0.0-alpha3
Reporter: Fedor Malchikov
Attachments: my-first-node(sql).log, my-first-node.log

{code:java}
2022-01-28 13:17:41:192 +0300 [ERROR][%my-first-node%JRaft-Common-Executor-1][SnapshotExecutorImpl] Fail to close writer
java.io.IOException
    at org.apache.ignite.raft.jraft.storage.snapshot.local.LocalSnapshotStorage.close(LocalSnapshotStorage.java:242)
    at org.apache.ignite.raft.jraft.storage.snapshot.local.LocalSnapshotWriter.close(LocalSnapshotWriter.java:93)
    at org.apache.ignite.raft.jraft.storage.snapshot.local.LocalSnapshotWriter.close(LocalSnapshotWriter.java:88)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl.onSnapshotSaveDone(SnapshotExecutorImpl.java:387)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.continueRun(SnapshotExecutorImpl.java:135)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.lambda$run$0(SnapshotExecutorImpl.java:131)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
2022-01-28 13:17:41:202 +0300 [ERROR][%my-first-node%JRaft-FSMCaller-Disruptor-_stripe_10-0][StateMachineAdapter] Encountered an error=Status[EIO<1014>: Fail to save snapshot.] on StateMachine org.apache.ignite.internal.raft.server.impl.JraftServerImpl$DelegatingStateMachine, it's highly recommended to implement this method as raft stops working since some error occurs, you should figure out the cause and repair or remove this node.
Error [type=ERROR_TYPE_SNAPSHOT, status=Status[EIO<1014>: Fail to save snapshot.]]
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl.reportError(SnapshotExecutorImpl.java:682)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl.onSnapshotSaveDone(SnapshotExecutorImpl.java:406)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.continueRun(SnapshotExecutorImpl.java:135)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.lambda$run$0(SnapshotExecutorImpl.java:131)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
2022-01-28 13:17:41:203 +0300 [WARNING][%my-first-node%JRaft-FSMCaller-Disruptor-_stripe_10-0][NodeImpl] Node <metastorage_raft_group/127.0.1.1:3344> got error: Error [type=ERROR_TYPE_SNAPSHOT, status=Status[EIO<1014>: Fail to save snapshot.]].
2022-01-28 13:17:41:203 +0300 [WARNING][%my-first-node%JRaft-FSMCaller-Disruptor-_stripe_10-0][FSMCallerImpl] FSMCaller already in error status, ignore new error
Error [type=ERROR_TYPE_SNAPSHOT, status=Status[EIO<1014>: Fail to save snapshot.]]
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl.reportError(SnapshotExecutorImpl.java:682)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl.onSnapshotSaveDone(SnapshotExecutorImpl.java:406)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.continueRun(SnapshotExecutorImpl.java:135)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.lambda$run$0(SnapshotExecutorImpl.java:131)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
2022-01-28 13:17:41:205 +0300 [INFO][%my-first-node%JRaft-FSMCaller-Disruptor-_stripe_10-0][ReplicatorGroupImpl] Fail to find the next candidate.
2022-01-28 13:17:41:205 +0300 [INFO][%my-first-node%JRaft-FSMCaller-Disruptor-_stripe_10-0][StateMachineAdapter] onLeaderStop: status=Status[EBADNODE<10009>: Raft node(leader or candidate) is in error.].
2022-01-28 13:17:51:262 +0300 [ERROR][Thread-72][MetaStorageServiceImpl] Unexpected exception
class org.apache.ignite.lang.IgniteInternalException: java.util.concurrent.TimeoutException
    at org.apache.ignite.internal.metastorage.client.CursorImpl$InnerIterator.hasNext(CursorImpl.java:121)
    at org.apache.ignite.internal.metastorage.client.MetaStorageServiceImpl$WatchProcessor$Watcher.run(MetaStorageServiceImpl.java:476)
Caused by: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException
    at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
    at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
    at org.apache.ignite.internal.metastorage.client.CursorImpl$InnerIterator.hasNext(CursorImpl.java:113)
    ... 1 more
Caused by: java.util.concurrent.TimeoutException
    at org.apache.ignite.raft.jraft.rpc.impl.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:502)
    at org.apache.ignite.raft.jraft.rpc.impl.RaftGroupServiceImpl$1.lambda$accept$2(RaftGroupServiceImpl.java:555)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
{code}

After that, the timeout error repeats every 10 seconds.

The problem reproduces reliably, but the time it takes to appear differs from run to run. The attachments contain logs from two different tests: one with no activity and a one-hour wait, and one taken after SQL table creation.

--
This message was sent by Atlassian Jira (v8.20.1#820001)