[ https://issues.apache.org/jira/browse/RATIS-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609322#comment-16609322 ]
Mukul Kumar Singh commented on RATIS-304: ----------------------------------------- A summary of the problem a) This problem only happens when the request for Votes is withheld. b) The Leader sends a new request in the new term , however the previous entry still points to the previous term. On the follower, when the entries are validated, it fails. The previous index check fails. {code} private TermIndex getPrevious() { TermIndex previous = raftLog.getTermIndex(follower.getNextIndex() - 1); if (previous == null) { // if previous is null, nextIndex must be equal to the log start // index (otherwise we will install snapshot). Preconditions.assertTrue(follower.getNextIndex() == raftLog.getStartIndex(), "%s: follower's next index %s, local log start index %s", this, follower.getNextIndex(), raftLog.getStartIndex()); SnapshotInfo snapshot = server.getState().getLatestSnapshot(); previous = snapshot == null ? null : snapshot.getTermIndex(); } return previous; } {code} The place holder index returns the correct value, which has the correct index for the new term. {code} placeHolderIndex = raftLog.getNextIndex(); {code} > Leader send incorrect entries to follower after withholding election > -------------------------------------------------------------------- > > Key: RATIS-304 > URL: https://issues.apache.org/jira/browse/RATIS-304 > Project: Ratis > Issue Type: Bug > Components: server > Affects Versions: 0.3.0 > Reporter: Mukul Kumar Singh > Assignee: Mukul Kumar Singh > Priority: Blocker > Labels: ozone > Fix For: 0.3.0 > > > Leader send incorrect entries to follower after withholding election. > The following exception was seen using Ozone, where the leader send incorrect > append entries to follower. > {code} > 2018-08-20 07:54:06,200 INFO org.apache.ratis.server.storage.RaftLogWorker: > Rolling segment:2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-RaftLogWorker index > to:20312 > 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.FollowerState: > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, > lastRpcTime:1182, electionTimeout:990ms > 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.RaftServerImpl: > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from > org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14 > for changeToCandidate > 2018-08-20 07:54:07,801 INFO org.apache.ratis.server.impl.RaftServerImpl: > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from > org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 14 > for changeToFollower > 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.FollowerState: > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, > lastRpcTime:2167, electionTimeout:976ms > 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.RaftServerImpl: > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from > org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14 > for changeToCandidate > 2018-08-20 07:54:21,715 INFO org.apache.ratis.server.impl.RaftServerImpl: > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from > 2bf278ca-2dad-4029-a387-2faeb10adef5_9858 to null at term 14 for ini > tElection > 2018-08-20 07:54:29,151 INFO org.apache.ratis.server.impl.LeaderElection: > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: begin an election in Term 15 > 2018-08-20 07:54:30,735 INFO org.apache.ratis.server.impl.RaftServerImpl: > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from > org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 15 > for changeToFollower > 2018-08-20 07:54:30,740 INFO org.apache.ratis.server.impl.RaftServerImpl: > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from null to > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 at term 15 for app > endEntries > > 2018-08-20 07:54:30,741 INFO org.apache.ratis.server.impl.RaftServerImpl: > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-org.apache.ratis.server.impl.RoleInfo@6b1e0fb8: > Withhold vote from candidate b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 with > term 15. State: leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, term=15, > lastRpcElapsed=0ms > > 2018-08-20 07:54:30,745 INFO org.apache.ratis.server.impl.LeaderElection: > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Election REJECTED; received 1 > response(s) [2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858<-2 > bf278ca-2dad-4029-a387-2faeb10adef5_9858#0:FAIL-t15] and 0 exception(s); > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:t15, > leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, > voted=2e240240-0fac-4f93-8aa8-fa8f > 74bf1810_9858, raftlog=[(t:14, i:20374)], > conf=[b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:172.26.32.231:9858, > 2bf278ca-2dad-4029-a387-2faeb10adef5_9858:172.26.32.230:9858, > 2e240240-0fac-4f93-8aa8-fa8f74bf > 1810_9858:172.26.32.228:9858], old=null > 2018-08-20 07:54:31,227 WARN > org.apache.ratis.grpc.server.RaftServerProtocolService: > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Failed appendEntries > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858->2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858#1 > java.lang.IllegalStateException: Unexpected Index: previous is (t:14, > i:20374) but entries[0].getIndex()=0 > at > org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60) > at > org.apache.ratis.server.impl.RaftServerImpl.validateEntries(RaftServerImpl.java:786) > at > org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:859) > at > org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:824) > at > org.apache.ratis.server.impl.RaftServerProxy.appendEntriesAsync(RaftServerProxy.java:247) > at > org.apache.ratis.grpc.server.RaftServerProtocolService$1.onNext(RaftServerProtocolService.java:76) > at > org.apache.ratis.grpc.server.RaftServerProtocolService$1.onNext(RaftServerProtocolService.java:66) > at > org.apache.ratis.shaded.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248) > at > org.apache.ratis.shaded.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:252) > at > org.apache.ratis.shaded.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:629) > at > org.apache.ratis.shaded.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > org.apache.ratis.shaded.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > 2018-08-20 07:54:33,260 WARN > org.apache.ratis.grpc.server.RaftServerProtocolService: > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Failed appendEntries > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858->2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858#3 > java.lang.IllegalStateException: Unexpected Index: entries[1].getIndex()=0 > but entries[0].getIndex()=0 > {code} > > > {code} > 2018-08-20 11:54:29,174 INFO > org.apache.ratis.grpc.server.RaftServerProtocolService: > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858: appendEntries completed > 2018-08-20 11:54:30,182 INFO org.apache.ratis.server.impl.FollowerState: > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 changes to CANDIDATE, > lastRpcTime:1332, electionTimeout:860ms > 2018-08-20 11:54:30,182 INFO org.apache.ratis.server.impl.RaftServerImpl: > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 changes role from > org.apache.ratis.server.impl.RoleInfo@465f9230 to CANDIDATE at term 14 > for changeToCandidate > 2018-08-20 11:54:30,185 INFO org.apache.ratis.server.impl.RaftServerImpl: > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858: change Leader from > 2bf278ca-2dad-4029-a387-2faeb10adef5_9858 to null at term 14 for initElection > 2018-08-20 11:54:30,199 INFO org.apache.ratis.server.impl.LeaderElection: > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858: begin an election in Term 15 > 2018-08-20 11:54:30,230 INFO org.apache.ratis.server.impl.LeaderElection: > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858: Election PASSED; received 1 > response(s) [b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858<-2bf > 278ca-2dad-4029-a387-2faeb10adef5_9858#0:OK-t15] and 0 exception(s); > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:t15, leader=null, > voted=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, raftlog=[(t:14, i:20488)], > conf=[b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:172.26.32.231:9858, > 2bf278ca-2dad-4029-a387-2faeb10adef5_9858:172.26.32.230:9858, > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:172.26.32.228:9858], old=null > 2018-08-20 11:54:30,230 INFO org.apache.ratis.server.impl.RaftServerImpl: > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 changes role from > org.apache.ratis.server.impl.RoleInfo@465f9230 to LEADER at term 15 for > changeToLeader > 2018-08-20 11:54:30,230 INFO org.apache.ratis.server.impl.RaftServerImpl: > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858: change Leader from null to > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 at term 15 for becomeLeader > 2018-08-20 11:54:30,246 INFO org.apache.ratis.conf.ConfUtils: > raft.server.staging.catchup.gap = 1000 (default) > 2018-08-20 11:54:30,246 INFO org.apache.ratis.conf.ConfUtils: > raft.server.rpc.sleep.time = 25 ms (default) > 2018-08-20 11:54:30,269 INFO org.apache.ratis.conf.ConfUtils: > raft.server.log.appender.buffer.capacity = 134217728 (custom) > 2018-08-20 11:54:30,270 INFO org.apache.ratis.conf.ConfUtils: > raft.server.log.appender.batch.enabled = true (custom) > 2018-08-20 11:54:30,270 INFO org.apache.ratis.conf.ConfUtils: > raft.server.log.appender.snapshot.chunk.size.max = 16MB (=16777216) (default) > 2018-08-20 11:54:30,273 INFO org.apache.ratis.conf.ConfUtils: > raft.grpc.server.leader.outstanding.appends.max = 128 (default) > 2018-08-20 11:54:30,273 INFO org.apache.ratis.conf.ConfUtils: > raft.server.rpc.request.timeout = 3000 ms (custom) > 2018-08-20 11:54:30,274 INFO org.apache.ratis.conf.ConfUtils: > raft.server.log.appender.buffer.capacity = 134217728 (custom) > 2018-08-20 11:54:30,274 INFO org.apache.ratis.conf.ConfUtils: > raft.server.log.appender.batch.enabled = true (custom) > 2018-08-20 11:54:30,274 INFO org.apache.ratis.conf.ConfUtils: > raft.server.log.appender.snapshot.chunk.size.max = 16MB (=16777216) (default) > 2018-08-20 11:54:30,274 INFO org.apache.ratis.conf.ConfUtils: > raft.grpc.server.leader.outstanding.appends.max = 128 (default) > 2018-08-20 11:54:30,274 INFO org.apache.ratis.conf.ConfUtils: > raft.server.rpc.request.timeout = 3000 ms (custom) > 2018-08-20 11:54:30,277 INFO org.apache.ratis.server.storage.RaftLogWorker: > Rolling segment:b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858-RaftLogWorker index > to:20489 > 2018-08-20 11:54:30,733 INFO org.apache.ratis.server.impl.RaftServerImpl: > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858-org.apache.ratis.server.impl.RoleInfo@465f9230: > Withhold vote from candidate 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 with > term 15. State: leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, term=15, > lastRpcElapsed=null > 2018-08-20 11:54:31,230 WARN org.apache.ratis.server.impl.LogAppender: > b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858: Failed appendEntries to > 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:172.26.32.228:9858 > java.io.IOException: java.lang.IllegalStateException: Unexpected Index: > previous is (t:14, i:20374) but entries[0].getIndex()=0 > at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:49) > at > org.apache.ratis.grpc.RaftGrpcUtil.tryUnwrapException(RaftGrpcUtil.java:81) > at > org.apache.ratis.grpc.RaftGrpcUtil.unwrapThrowable(RaftGrpcUtil.java:58) > at org.apache.ratis.grpc.RaftGrpcUtil.warn(RaftGrpcUtil.java:129) > at > org.apache.ratis.grpc.server.GRpcLogAppender$AppendLogResponseHandler.onError(GRpcLogAppender.java:241) > at > org.apache.ratis.shaded.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:419) > at > org.apache.ratis.shaded.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:41) > at > org.apache.ratis.shaded.io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:684) > at > org.apache.ratis.shaded.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:41) > at > org.apache.ratis.shaded.io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:391) > at > org.apache.ratis.shaded.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:475) > at > org.apache.ratis.shaded.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63) > at > org.apache.ratis.shaded.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:557) > at > org.apache.ratis.shaded.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:478) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)