[
https://issues.apache.org/jira/browse/RATIS-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sammi Chen updated RATIS-2507:
------------------------------
Description:
When the RaftLog end index is smaller than the last snapshot index during
RaftServer startup, Ratis appends a new entry to the raft log before it throws
the following exception:
{code:java}
2026-04-19 17:28:34,546 ERROR [om43-server-thread1]-org.apache.ratis.server.raftlog.RaftLog: om43@group-A6BF76F23EF4-SegmentedRaftLog: Failed to append (t:11, i:1251979), METADATAENTRY(c:1251977)
java.lang.IllegalStateException: gap between entries 1251979 and 975728
    at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
    at org.apache.ratis.server.raftlog.segmented.LogSegment.append(LogSegment.java:320)
    at org.apache.ratis.server.raftlog.segmented.LogSegment.appendToOpenSegment(LogSegment.java:307)
    at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.appendEntry(SegmentedRaftLogCache.java:556)
    at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.appendEntryImpl(SegmentedRaftLog.java:414)
    at org.apache.ratis.server.raftlog.RaftLogBase.lambda$appendEntry$10(RaftLogBase.java:330)
    at org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:78)
    at org.apache.ratis.server.raftlog.RaftLogBase.appendEntry(RaftLogBase.java:330)
    at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.appendImpl(SegmentedRaftLog.java:456)
    at org.apache.ratis.server.raftlog.RaftLogBase.lambda$append$11(RaftLogBase.java:337)
    at org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:69)
    at org.apache.ratis.server.raftlog.RaftLogBase.append(RaftLogBase.java:337)
    at org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:1532)
    at org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:1396)
    at org.apache.ratis.server.impl.RaftServerProxy.lambda$null$25(RaftServerProxy.java:637)
    at org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:117)
    at org.apache.ratis.server.impl.RaftServerImpl.lambda$executeSubmitServerRequestAsync$11(RaftServerImpl.java:825)
    at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
{code}
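The exception above comes from a contiguity check on append. The following is only a minimal sketch of such a check, assuming it requires each new entry index to be exactly one greater than the current end index. GapCheckSketch and its methods are hypothetical names for illustration, not the Ratis LogSegment implementation. The indices are the ones from the log above.

```java
/** Hypothetical stand-in for a log segment; this is not the Ratis LogSegment class. */
final class GapCheckSketch {
  // Index of the last entry currently held by this (hypothetical) segment.
  private long endIndex;

  GapCheckSketch(long endIndex) {
    this.endIndex = endIndex;
  }

  /**
   * Appends an entry index. Assumption: an entry is accepted only if it
   * directly follows the current end index; otherwise the append is rejected.
   */
  void append(long entryIndex) {
    if (entryIndex != endIndex + 1) {
      throw new IllegalStateException(
          "gap between entries " + entryIndex + " and " + endIndex);
    }
    endIndex = entryIndex;
  }

  long getEndIndex() {
    return endIndex;
  }

  public static void main(String[] args) {
    // Local raft log ends at 975728, while the incoming entry has index 1251979.
    GapCheckSketch segment = new GapCheckSketch(975728L);
    try {
      segment.append(1251979L);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With these values, main prints "gap between entries 1251979 and 975728", which matches the message in the stack trace.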
In a real case, an OM failed to install a snapshot and terminated itself. The
on-disk state machine state (RocksDB) and the raft log files then came from
different sources: the state machine data came from the leader, while the raft
log files were local. As a result, the snapshot index obtained from the state
machine was larger than the raft log end index, which in turn caused another
issue.
See https://issues.apache.org/jira/browse/HDDS-15103 for more information.
was:
When the RaftLog end index is smaller than the last snapshot index during
RaftServer startup, it indicates that the raft log state and the state machine
state are inconsistent.
In this case, it is better to fail the RaftServer than to log a WARN message
and continue, as is currently done.
In a real case, an OM failed to install a snapshot and terminated itself. The
on-disk state machine state (RocksDB) and the raft log files then came from
different sources: the state machine data came from the leader, while the raft
log files were local. As a result, the snapshot index obtained from the state
machine was larger than the raft log end index, which in turn caused another
issue.
See https://issues.apache.org/jira/browse/HDDS-15103 for more information.
> Fix java.lang.IllegalStateException: "gap between entries"
> ----------------------------------------------------------
>
> Key: RATIS-2507
> URL: https://issues.apache.org/jira/browse/RATIS-2507
> Project: Ratis
> Issue Type: Bug
> Reporter: Sammi Chen
> Assignee: Sammi Chen
> Priority: Major
> Time Spent: 1h
> Remaining Estimate: 0h