[ 
https://issues.apache.org/jira/browse/RATIS-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609322#comment-16609322
 ] 

Mukul Kumar Singh commented on RATIS-304:
-----------------------------------------

A summary of the problem 

a) This problem only happens when the request for Votes is withheld.

b) The Leader sends a new request in the new term , however the previous entry 
still points to the previous term.
On the follower, when the entries are validated, it fails.

The previous index check fails.
{code}
  private TermIndex getPrevious() {
    TermIndex previous = raftLog.getTermIndex(follower.getNextIndex() - 1);
    if (previous == null) {
      // if previous is null, nextIndex must be equal to the log start
      // index (otherwise we will install snapshot).
      Preconditions.assertTrue(follower.getNextIndex() == 
raftLog.getStartIndex(),
          "%s: follower's next index %s, local log start index %s",
          this, follower.getNextIndex(), raftLog.getStartIndex());
      SnapshotInfo snapshot = server.getState().getLatestSnapshot();
      previous = snapshot == null ? null : snapshot.getTermIndex();
    }
    return previous;
  }
{code}

The place holder index returns the correct value, which has the correct index 
for the new term.
{code}
    placeHolderIndex = raftLog.getNextIndex();
{code}

> Leader send incorrect entries to follower after withholding election
> --------------------------------------------------------------------
>
>                 Key: RATIS-304
>                 URL: https://issues.apache.org/jira/browse/RATIS-304
>             Project: Ratis
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 0.3.0
>            Reporter: Mukul Kumar Singh
>            Assignee: Mukul Kumar Singh
>            Priority: Blocker
>              Labels: ozone
>             Fix For: 0.3.0
>
>
> Leader send incorrect entries to follower after withholding election.
> The following exception was seen using Ozone, where the leader send incorrect 
> append entries to follower.
> {code}
> 2018-08-20 07:54:06,200 INFO org.apache.ratis.server.storage.RaftLogWorker: 
> Rolling segment:2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-RaftLogWorker index 
> to:20312
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.FollowerState: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
> lastRpcTime:1182, electionTimeout:990ms
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
> for changeToCandidate
> 2018-08-20 07:54:07,801 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 14 
> for changeToFollower
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.FollowerState: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
> lastRpcTime:2167, electionTimeout:976ms
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
> for changeToCandidate
> 2018-08-20 07:54:21,715 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858 to null at term 14 for ini
> tElection
> 2018-08-20 07:54:29,151 INFO org.apache.ratis.server.impl.LeaderElection: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: begin an election in Term 15
> 2018-08-20 07:54:30,735 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 15 
> for changeToFollower
> 2018-08-20 07:54:30,740 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from null to 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 at term 15 for app
> endEntries
>  
> 2018-08-20 07:54:30,741 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-org.apache.ratis.server.impl.RoleInfo@6b1e0fb8:
>  Withhold vote from candidate b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 with 
> term 15. State: leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, term=15, 
> lastRpcElapsed=0ms
>  
> 2018-08-20 07:54:30,745 INFO org.apache.ratis.server.impl.LeaderElection: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Election REJECTED; received 1 
> response(s) [2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858<-2
> bf278ca-2dad-4029-a387-2faeb10adef5_9858#0:FAIL-t15] and 0 exception(s); 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:t15, 
> leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, 
> voted=2e240240-0fac-4f93-8aa8-fa8f
> 74bf1810_9858, raftlog=[(t:14, i:20374)], 
> conf=[b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:172.26.32.231:9858, 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858:172.26.32.230:9858, 
> 2e240240-0fac-4f93-8aa8-fa8f74bf
> 1810_9858:172.26.32.228:9858], old=null
> 2018-08-20 07:54:31,227 WARN 
> org.apache.ratis.grpc.server.RaftServerProtocolService: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Failed appendEntries 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858->2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858#1
> java.lang.IllegalStateException: Unexpected Index: previous is (t:14, 
> i:20374) but entries[0].getIndex()=0
>         at 
> org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
>         at 
> org.apache.ratis.server.impl.RaftServerImpl.validateEntries(RaftServerImpl.java:786)
>         at 
> org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:859)
>         at 
> org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:824)
>         at 
> org.apache.ratis.server.impl.RaftServerProxy.appendEntriesAsync(RaftServerProxy.java:247)
>         at 
> org.apache.ratis.grpc.server.RaftServerProtocolService$1.onNext(RaftServerProtocolService.java:76)
>         at 
> org.apache.ratis.grpc.server.RaftServerProtocolService$1.onNext(RaftServerProtocolService.java:66)
>         at 
> org.apache.ratis.shaded.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248)
>         at 
> org.apache.ratis.shaded.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:252)
>         at 
> org.apache.ratis.shaded.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:629)
>         at 
> org.apache.ratis.shaded.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>         at 
> org.apache.ratis.shaded.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> 2018-08-20 07:54:33,260 WARN 
> org.apache.ratis.grpc.server.RaftServerProtocolService: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Failed appendEntries 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858->2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858#3
> java.lang.IllegalStateException: Unexpected Index: entries[1].getIndex()=0 
> but entries[0].getIndex()=0
> {code}
>  
>  
> {code}
> 2018-08-20 11:54:29,174 INFO 
> org.apache.ratis.grpc.server.RaftServerProtocolService: 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858: appendEntries completed
> 2018-08-20 11:54:30,182 INFO org.apache.ratis.server.impl.FollowerState: 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 changes to CANDIDATE, 
> lastRpcTime:1332, electionTimeout:860ms
> 2018-08-20 11:54:30,182 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@465f9230 to CANDIDATE at term 14
> for changeToCandidate
> 2018-08-20 11:54:30,185 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858: change Leader from 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858 to null at term 14 for initElection
> 2018-08-20 11:54:30,199 INFO org.apache.ratis.server.impl.LeaderElection: 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858: begin an election in Term 15
> 2018-08-20 11:54:30,230 INFO org.apache.ratis.server.impl.LeaderElection: 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858: Election PASSED; received 1 
> response(s) [b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858<-2bf
> 278ca-2dad-4029-a387-2faeb10adef5_9858#0:OK-t15] and 0 exception(s); 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:t15, leader=null, 
> voted=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, raftlog=[(t:14, i:20488)],
> conf=[b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:172.26.32.231:9858, 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858:172.26.32.230:9858, 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:172.26.32.228:9858], old=null
> 2018-08-20 11:54:30,230 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@465f9230 to LEADER at term 15 for 
> changeToLeader
> 2018-08-20 11:54:30,230 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858: change Leader from null to 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 at term 15 for becomeLeader
> 2018-08-20 11:54:30,246 INFO org.apache.ratis.conf.ConfUtils: 
> raft.server.staging.catchup.gap = 1000 (default)
> 2018-08-20 11:54:30,246 INFO org.apache.ratis.conf.ConfUtils: 
> raft.server.rpc.sleep.time = 25 ms (default)
> 2018-08-20 11:54:30,269 INFO org.apache.ratis.conf.ConfUtils: 
> raft.server.log.appender.buffer.capacity = 134217728 (custom)
> 2018-08-20 11:54:30,270 INFO org.apache.ratis.conf.ConfUtils: 
> raft.server.log.appender.batch.enabled = true (custom)
> 2018-08-20 11:54:30,270 INFO org.apache.ratis.conf.ConfUtils: 
> raft.server.log.appender.snapshot.chunk.size.max = 16MB (=16777216) (default)
> 2018-08-20 11:54:30,273 INFO org.apache.ratis.conf.ConfUtils: 
> raft.grpc.server.leader.outstanding.appends.max = 128 (default)
> 2018-08-20 11:54:30,273 INFO org.apache.ratis.conf.ConfUtils: 
> raft.server.rpc.request.timeout = 3000 ms (custom)
> 2018-08-20 11:54:30,274 INFO org.apache.ratis.conf.ConfUtils: 
> raft.server.log.appender.buffer.capacity = 134217728 (custom)
> 2018-08-20 11:54:30,274 INFO org.apache.ratis.conf.ConfUtils: 
> raft.server.log.appender.batch.enabled = true (custom)
> 2018-08-20 11:54:30,274 INFO org.apache.ratis.conf.ConfUtils: 
> raft.server.log.appender.snapshot.chunk.size.max = 16MB (=16777216) (default)
> 2018-08-20 11:54:30,274 INFO org.apache.ratis.conf.ConfUtils: 
> raft.grpc.server.leader.outstanding.appends.max = 128 (default)
> 2018-08-20 11:54:30,274 INFO org.apache.ratis.conf.ConfUtils: 
> raft.server.rpc.request.timeout = 3000 ms (custom)
> 2018-08-20 11:54:30,277 INFO org.apache.ratis.server.storage.RaftLogWorker: 
> Rolling segment:b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858-RaftLogWorker index 
> to:20489
> 2018-08-20 11:54:30,733 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858-org.apache.ratis.server.impl.RoleInfo@465f9230:
>  Withhold vote from candidate 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 with 
> term 15. State: leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, term=15, 
> lastRpcElapsed=null
> 2018-08-20 11:54:31,230 WARN org.apache.ratis.server.impl.LogAppender: 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858: Failed appendEntries to 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:172.26.32.228:9858
> java.io.IOException: java.lang.IllegalStateException: Unexpected Index: 
> previous is (t:14, i:20374) but entries[0].getIndex()=0
>         at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:49)
>         at 
> org.apache.ratis.grpc.RaftGrpcUtil.tryUnwrapException(RaftGrpcUtil.java:81)
>         at 
> org.apache.ratis.grpc.RaftGrpcUtil.unwrapThrowable(RaftGrpcUtil.java:58)
>         at org.apache.ratis.grpc.RaftGrpcUtil.warn(RaftGrpcUtil.java:129)
>         at 
> org.apache.ratis.grpc.server.GRpcLogAppender$AppendLogResponseHandler.onError(GRpcLogAppender.java:241)
>         at 
> org.apache.ratis.shaded.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:419)
>         at 
> org.apache.ratis.shaded.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:41)
>         at 
> org.apache.ratis.shaded.io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:684)
>         at 
> org.apache.ratis.shaded.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:41)
>         at 
> org.apache.ratis.shaded.io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:391)
>         at 
> org.apache.ratis.shaded.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:475)
>         at 
> org.apache.ratis.shaded.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63)
>         at 
> org.apache.ratis.shaded.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:557)
>         at 
> org.apache.ratis.shaded.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:478)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to