[ https://issues.apache.org/jira/browse/RATIS-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022804#comment-17022804 ]
Hadoop QA commented on RATIS-794:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 52s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 5s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 39m 2s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 24s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | ratis.logservice.TestLogServiceWithNetty |
|   | ratis.logservice.TestLogServiceWithGrpc |
|   | ratis.grpc.TestServerRestartWithGrpc |
|   | ratis.grpc.TestRaftWithGrpc |
|   | ratis.grpc.TestRaftStateMachineExceptionWithGrpc |
|   | ratis.netty.TestRaftStateMachineExceptionWithNetty |
|   | ratis.netty.TestRaftSnapshotWithNetty |
|   | ratis.grpc.TestRaftAsyncWithGrpc |
|   | ratis.server.simulation.TestRaftSnapshotWithSimulatedRpc |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/ratis:date2020-01-24 |
| JIRA Issue | RATIS-794 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12991719/r794_20200124.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs checkstyle compile |
| uname | Linux 1084d547f26f 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-RATIS-Build/yetus-personality.sh |
| git revision | master / 90cd474 |
| maven | version: Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f) |
| Default Java | 1.8.0_232 |
| unit | https://builds.apache.org/job/PreCommit-RATIS-Build/1218/artifact/out/patch-unit-root.txt |
| Test Results | https://builds.apache.org/job/PreCommit-RATIS-Build/1218/testReport/ |
| Max. process+thread count | 1656 (vs. ulimit of 5000) |
| modules | C: ratis-server ratis-test U: . |
| Console output | https://builds.apache.org/job/PreCommit-RATIS-Build/1218/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Ratis leader should retry append requests based on follower commit info in
> case of intermittent append failures
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: RATIS-794
>                 URL: https://issues.apache.org/jira/browse/RATIS-794
>             Project: Ratis
>          Issue Type: Bug
>          Components: server
>            Reporter: Shashikant Banerjee
>            Assignee: Tsz-wo Sze
>            Priority: Major
>             Fix For: 0.5.0
>
>         Attachments: r794_20200122.patch, r794_20200124.patch
>
>
> During Ozone testing, a leader election was observed in the middle of a
> test, at a point where a follower had caught up to index 313. The new
> leader sent an append request to the follower, which failed with a gRPC
> exception. As a result, the leader reset the connection and restarted
> from the beginning (index 1).
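
The behaviour described above, and the alternative that the issue title suggests, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Ratis implementation or the attached patch: `NextIndexSketch`, `FollowerProgress`, `resetOnFailure` and `onReply` are hypothetical names. It relies on the Raft property that entries up to a follower's commit index are also present in the current leader's log, so the leader can resume at `followerCommit + 1`.

```java
// Hypothetical sketch; none of these names come from the Ratis code base.
final class NextIndexSketch {

  /** Assumed per-follower bookkeeping kept by the leader. */
  static final class FollowerProgress {
    private long nextIndex;

    FollowerProgress(long nextIndex) {
      this.nextIndex = nextIndex;
    }

    long getNextIndex() {
      return nextIndex;
    }

    /** Behaviour seen in the report: a failed append sends the leader back to the start. */
    void resetOnFailure(long startIndex) {
      nextIndex = startIndex;
    }

    /**
     * Assumed alternative: when a reply carries the follower's commit index,
     * skip ahead to the entry after it instead of advancing one index at a time.
     * nextIndex never moves backwards here.
     */
    void onReply(long followerCommit) {
      nextIndex = Math.max(nextIndex, followerCommit + 1);
    }
  }

  public static void main(String[] args) {
    FollowerProgress follower = new FollowerProgress(314);
    follower.resetOnFailure(1);   // append failed, nextIndex: 314 -> 1
    follower.onReply(313);        // reply reports followerCommit:313
    System.out.println(follower.getNextIndex()); // prints 314
  }
}
```

With the numbers from this report, the reply carrying `followerCommit:313` would move `nextIndex` from 1 straight to 314, rather than stepping 1 -> 2 -> 3.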
>
>
> {code:java}
> 2020-01-13 14:56:32,995 INFO org.apache.ratis.server.impl.RaftServerImpl: 0.0.0.0:9858@group-4F125BF42C14: changes role from CANDIDATE to LEADER at term 7 for changeToLeader
> 2020-01-13 14:56:32,995 INFO org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis: Leader change notification received for group: group-4F125BF42C14 with new leaderId: ed90869c-317e-4303-8922-9fa83a3983cb
> 2020-01-13 14:56:33,042 WARN org.apache.ratis.grpc.server.GrpcLogAppender: 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
> 2020-01-13 14:56:33,043 DEBUG org.apache.ratis.util.PeerProxyMap: ed90869c-317e-4303-8922-9fa83a3983cb: reset proxy for b65b0b6c-b0bb-429f-a23d-467c72d4b85c
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.LifeCycle: b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858: RUNNING -> CLOSING
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.LifeCycle: b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858: CLOSING -> CLOSED
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.LifeCycle: b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858: NEW
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.TimeoutScheduler: new ScheduledThreadPoolExecutor
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.PeerProxyMap: ed90869c-317e-4303-8922-9fa83a3983cb: Closing proxy for peer b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858
> 2020-01-13 14:56:33,045 DEBUG org.apache.ratis.util.TimeoutScheduler: schedule a task: timeout 6000ms, sid 1
> 2020-01-13 14:56:33,047 INFO org.apache.ratis.server.impl.FollowerInfo: 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858: nextIndex: updateUnconditionally 314 -> 1
> ---------------------> (sets the next index for the follower back to 1 and restarts from 1)
> 2020-01-13 14:56:35,840 DEBUG org.apache.ratis.grpc.server.GrpcLogAppender: 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858-AppendLogResponseHandler: received the first reply ed90869c-317e-4303-8922-9fa83a3983cb<-b65b0b6c-b0bb-429f-a23d-467c72d4b85c#2:OK,SUCCESS,nextIndex:314,term:5,followerCommit:313, request=AppendEntriesRequest:cid=2,entriesCount=0,lastEntry=null .
> -------------------> (receives the response from the follower indicating that the follower is at 313)
> Although the follower is at 313, the leader keeps sending appendRequests from index 1.
> 2020-01-13 14:56:35,841 DEBUG org.apache.ratis.server.impl.FollowerInfo: 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858: nextIndex: updateIncreasingly 1 -> 2
> 2020-01-13 14:56:35,841 DEBUG org.apache.ratis.util.TimeoutScheduler: schedule a task: timeout 6000ms, sid 7
> 2020-01-13 14:56:35,843 DEBUG org.apache.ratis.server.impl.FollowerInfo: 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858: nextIndex: updateIncreasingly 2 -> 3
> 2020-01-13 14:56:35,843 DEBUG org.apache.ratis.util.TimeoutScheduler: schedule a task: timeout 6000ms, sid 8
> {code}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)