[ https://issues.apache.org/jira/browse/RATIS-601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869386#comment-16869386 ]
Mukul Kumar Singh commented on RATIS-601:
-----------------------------------------

Thanks for the review [~ljain]. Please find my comments inline.

OrderedAsync#sendRequest:242-245 - We can incorporate these lines into the changes as well. I was thinking we should handle NotLeaderException in the raft client reply itself and remove the handling in the exceptionally clause. Can we remove the changes related to NotLeaderException in GrpcClientProtocolClient mentioned in your previous comment?
bq. Raised RATIS-602 as a follow-up for this.

RaftClientImpl#handleIOException:367 - Based on Nicholas's comment, should we remove the condition for TimeoutIOException, so that for a dead leader we retry on a different server?
bq. A dead datanode will throw a SocketTimeoutException or a ClosedChannelException. These exceptions are listed in IOUtils#shouldReconnect.

> Fix NotLeaderException handling
> -------------------------------
>
>                 Key: RATIS-601
>                 URL: https://issues.apache.org/jira/browse/RATIS-601
>             Project: Ratis
>          Issue Type: Bug
>          Components: server
>            Reporter: Mukul Kumar Singh
>            Assignee: Mukul Kumar Singh
>            Priority: Major
>         Attachments: RATIS-601.001.patch
>
>
> There are 3 issues with leader election:
> a) OrderedAsync#sendRequest doesn't handle NotLeaderException.
> b) RaftServerImpl#generateNotLeaderException should not guess the current leader
> when it does not have information about it. Guessing leads to the client retrying
> aggressively, which results in a RetryException.
> c) RaftClient currently changes the leader on AlreadyClosedException and
> TimeoutIOException. These events do not trigger a leader election, and hence the
> leader should not be changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)