[ 
https://issues.apache.org/jira/browse/RATIS-800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155319#comment-17155319
 ] 

runzhiwang edited comment on RATIS-800 at 7/10/20, 10:27 AM:
-------------------------------------------------------------

[~ljain] Thanks for the review.
bq. Balancing the leader in an active ratis ring might be difficult to achieve. 
For a candidate to be elected as leader its term and index should be >= 
follower's term index. Even if we trigger an election it is not guaranteed that 
the datanode will become leader.
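
For reference, the restriction quoted above is the Raft vote check: a server
grants its vote only if the candidate's log is at least as up to date as its
own, comparing the term of the last entry first and the index second. The
sketch below is a minimal illustration of that comparison. It is not Ratis
code, and LogPosition and isAtLeastAsUpToDateAs are hypothetical names.

```java
/** Term and index of the last entry in a server's log (hypothetical type, not a Ratis class). */
final class LogPosition {
  final long term;
  final long index;

  LogPosition(long term, long index) {
    this.term = term;
    this.index = index;
  }

  /** True if a log ending here is at least as up to date as a log ending at other. */
  boolean isAtLeastAsUpToDateAs(LogPosition other) {
    if (term != other.term) {
      return term > other.term;   // the later last term wins outright
    }
    return index >= other.index;  // same last term: the longer log wins
  }
}

final class VoteCheckDemo {
  public static void main(String[] args) {
    LogPosition follower = new LogPosition(3, 10);
    // A candidate lagging by one entry in the same term is refused.
    System.out.println(new LogPosition(3, 9).isAtLeastAsUpToDateAs(follower));  // false
    // A candidate whose last entry has a later term is accepted even with a shorter log.
    System.out.println(new LogPosition(4, 1).isAtLeastAsUpToDateAs(follower));  // true
  }
}
```

This is why triggering an election on a lagging datanode is not enough on its
own: the candidate first has to be caught up.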

We can first focus on balancing the leader. The Raft paper explains this as
follows: if leadership transfer does not complete after about an election
timeout, the prior leader aborts the transfer, remains the leader, and
resumes accepting client requests.

 !image-2020-07-10-18-27-01-890.png! 


was (Author: yjxxtd):
[~ljain] Thanks for the review.
bq. Balancing the leader in an active ratis ring might be difficult to achieve. 
For a candidate to be elected as leader its term and index should be >= 
follower's term index. Even if we trigger an election it is not guaranteed that 
the datanode will become leader.

We can first focus on balancing the leader. The Raft paper explains this as
follows: if leadership transfer does not complete after about an election
timeout, the prior leader aborts the transfer, remains the leader, and
resumes accepting client requests.

bq. To transfer leadership in Raft, the prior leader sends its log entries to the target server, then the target server runs an election without waiting for an election timeout to elapse. The prior leader thus ensures that the target server has all committed entries at the start of its term, and, as in normal elections, the majority voting guarantees the safety properties (such as the Leader Completeness Property) are maintained. The following steps describe the process in more detail:
bq. 1. The prior leader stops accepting new client requests.
bq. 2. The prior leader fully updates the target server’s log to match its own, using the normal log replication mechanism described in Section 3.5.
bq. 3. The prior leader sends a TimeoutNow request to the target server. This request has the same effect as the target server’s election timer firing: the target server starts a new election (incrementing its term and becoming a candidate).
bq. Once the target server receives the TimeoutNow request, it is highly likely to start an election before any other server and become leader in the next term. Its next message to the prior leader will include its new term number, causing the prior leader to step down. At this point, leadership transfer is complete.
bq. It is also possible for the target server to fail; in this case, the cluster must resume client operations. If leadership transfer does not complete after about an election timeout, the prior leader aborts the transfer and resumes accepting client requests. If the prior leader was mistaken and the target server is actually operational, then at worst this mistake will result in an extra election, after which client operations will be restored.
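
The quoted steps could be sketched on the prior leader's side roughly as
below. This is only an illustration under stated assumptions, not the Ratis
implementation: TransferTarget, InMemoryTarget and LeaderTransfer are
hypothetical names, the log is a plain list of strings, and the RPCs are
direct method calls on an in-memory stand-in for the target server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical view of the target server as seen by the prior leader. */
interface TransferTarget {
  int replicatedCount();                     // number of log entries already on the target
  void appendEntries(List<String> entries);  // normal log replication
  void timeoutNow();                         // ask the target to start an election immediately
  boolean hasBecomeLeader();                 // true once the target has won the new term
}

/** In-memory stand-in; an unhealthy target silently drops every request. */
final class InMemoryTarget implements TransferTarget {
  private final List<String> log = new ArrayList<>();
  private final boolean healthy;
  private boolean leader;

  InMemoryTarget(boolean healthy) { this.healthy = healthy; }

  public int replicatedCount() { return log.size(); }
  public void appendEntries(List<String> entries) { if (healthy) log.addAll(entries); }
  public void timeoutNow() { leader = healthy; }
  public boolean hasBecomeLeader() { return leader; }
}

final class LeaderTransfer {
  enum Result { TRANSFERRED, ABORTED }

  private final List<String> log;
  private boolean acceptingRequests = true;

  LeaderTransfer(List<String> log) { this.log = log; }

  boolean isAcceptingRequests() { return acceptingRequests; }

  Result transferTo(TransferTarget target, long electionTimeoutMs) {
    long deadline = System.nanoTime() + electionTimeoutMs * 1_000_000L;
    acceptingRequests = false;                                // step 1: stop accepting requests
    boolean timeoutNowSent = false;
    while (System.nanoTime() < deadline) {
      int have = target.replicatedCount();
      if (have < log.size()) {
        target.appendEntries(log.subList(have, log.size()));  // step 2: catch the target up
      } else if (!timeoutNowSent) {
        target.timeoutNow();                                  // step 3: trigger the election
        timeoutNowSent = true;
      } else if (target.hasBecomeLeader()) {
        return Result.TRANSFERRED;                            // the prior leader steps down here
      }
      try {
        Thread.sleep(1);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;                                                // treat interruption as an abort
      }
    }
    acceptingRequests = true;                                 // abort: remain leader and resume
    return Result.ABORTED;
  }
}

final class LeaderTransferDemo {
  public static void main(String[] args) {
    List<String> log = Arrays.asList("a", "b", "c");
    System.out.println(new LeaderTransfer(log).transferTo(new InMemoryTarget(true), 500));
    System.out.println(new LeaderTransfer(log).transferTo(new InMemoryTarget(false), 20));
  }
}
```

The abort path is what makes a recommended leader safe to try: if the target
never takes over within about one election timeout, the prior leader simply
continues serving requests.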

> Make Ratis consume recommended leader host from the pipeline creator
> --------------------------------------------------------------------
>
>                 Key: RATIS-800
>                 URL: https://issues.apache.org/jira/browse/RATIS-800
>             Project: Ratis
>          Issue Type: Sub-task
>            Reporter: Li Cheng
>            Assignee: runzhiwang
>            Priority: Critical
>         Attachments: image-2020-07-10-18-27-01-890.png
>
>
> Start a Jira for suggested-leader semantics. It would help Ratis performance 
> if Ratis could consume the leader host recommended by an upstream user such 
> as Ozone. The user can choose the leader host based on load balancing and 
> rack awareness. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
