[ 
https://issues.apache.org/jira/browse/RATIS-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated RATIS-2507:
------------------------------
    Description: 
When the RaftLog end index is smaller than the last snapshot index during 
RaftServer startup, Ratis attempts to append a new entry to the Raft log and 
then throws the following exception:

{code:java}
2026-04-19 17:28:34,546 ERROR [om43-server-thread1]-org.apache.ratis.server.raftlog.RaftLog: om43@group-A6BF76F23EF4-SegmentedRaftLog: Failed to append (t:11, i:1251979), METADATAENTRY(c:1251977)
java.lang.IllegalStateException: gap between entries 1251979 and 975728
  at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
  at org.apache.ratis.server.raftlog.segmented.LogSegment.append(LogSegment.java:320)
  at org.apache.ratis.server.raftlog.segmented.LogSegment.appendToOpenSegment(LogSegment.java:307)
  at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.appendEntry(SegmentedRaftLogCache.java:556)
  at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.appendEntryImpl(SegmentedRaftLog.java:414)
  at org.apache.ratis.server.raftlog.RaftLogBase.lambda$appendEntry$10(RaftLogBase.java:330)
  at org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:78)
  at org.apache.ratis.server.raftlog.RaftLogBase.appendEntry(RaftLogBase.java:330)
  at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.appendImpl(SegmentedRaftLog.java:456)
  at org.apache.ratis.server.raftlog.RaftLogBase.lambda$append$11(RaftLogBase.java:337)
  at org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:69)
  at org.apache.ratis.server.raftlog.RaftLogBase.append(RaftLogBase.java:337)
  at org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:1532)
  at org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:1396)
  at org.apache.ratis.server.impl.RaftServerProxy.lambda$null$25(RaftServerProxy.java:637)
  at org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:117)
  at org.apache.ratis.server.impl.RaftServerImpl.lambda$executeSubmitServerRequestAsync$11(RaftServerImpl.java:825)
  at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
  at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
  at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
  at java.base/java.lang.Thread.run(Thread.java:840)
{code}
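
To illustrate the failure mode, here is a minimal sketch of a contiguity 
check on an append-only log. It is a simplified model, not the Ratis 
implementation: the class GapCheckSketch and its methods are hypothetical, and 
only the two indices and the message text are taken from the stack trace above.

```java
/** Simplified model of an append-only log that rejects index gaps (not Ratis code). */
public class GapCheckSketch {
  // Index of the last entry currently held in the local log.
  private long lastIndex;

  public GapCheckSketch(long lastIndex) {
    this.lastIndex = lastIndex;
  }

  /** Appends an entry only if its index directly follows the current last index. */
  public void append(long index) {
    if (index != lastIndex + 1) {
      // Same message shape as in the reported exception.
      throw new IllegalStateException(
          "gap between entries " + index + " and " + lastIndex);
    }
    lastIndex = index;
  }

  public long getLastIndex() {
    return lastIndex;
  }
}
```

With a local log ending at 975728, appending index 1251979 is rejected in this 
model, while appending 975729 succeeds.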


In a real case, an OM failed to install a snapshot and terminated itself. The 
on-disk state machine state (RocksDB) and the Raft log files then came from 
different sources: the state machine data came from the leader, while the Raft 
log files were local. As a result, the snapshot index obtained from the state 
machine was larger than the Raft log end index, which in turn caused another 
issue. 

See https://issues.apache.org/jira/browse/HDDS-15103 for more information. 

  was:
If the RaftLog end index is smaller than the last snapshot index during 
RaftServer startup, the Raft log state and the state machine state are 
inconsistent. 
In this case, it is better to fail the RaftServer than to log a WARN message 
and continue, as is currently done. 
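
The fail-fast behaviour proposed in this earlier description could look 
roughly like the sketch below. It is an assumption for illustration only: 
StartupIndexCheck and checkConsistent are hypothetical names, not Ratis APIs, 
and the indices in the usage note are made up.

```java
/** Hypothetical startup check: fail instead of warning when the log ends before the snapshot. */
public class StartupIndexCheck {

  /**
   * Throws if the local log ends before the last snapshot index.
   * Both values are assumed to have been read during startup.
   */
  public static void checkConsistent(long logEndIndex, long snapshotIndex) {
    if (logEndIndex < snapshotIndex) {
      throw new IllegalStateException(
          "log end index " + logEndIndex
              + " is smaller than snapshot index " + snapshotIndex);
    }
  }
}
```

For example, checkConsistent(100, 250) throws in this sketch, while 
checkConsistent(300, 250) returns normally.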

In a real case, an OM failed to install a snapshot and terminated itself. The 
on-disk state machine state (RocksDB) and the Raft log files then came from 
different sources: the state machine data came from the leader, while the Raft 
log files were local. As a result, the snapshot index obtained from the state 
machine was larger than the Raft log end index, which in turn caused another 
issue. 

See https://issues.apache.org/jira/browse/HDDS-15103 for more information. 


> Fix java.lang.IllegalStateException: "gap between entries"
> ----------------------------------------------------------
>
>                 Key: RATIS-2507
>                 URL: https://issues.apache.org/jira/browse/RATIS-2507
>             Project: Ratis
>          Issue Type: Bug
>            Reporter: Sammi Chen
>            Assignee: Sammi Chen
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.20.10#820010)