runzhiwang opened a new pull request #1371:
URL: https://github.com/apache/hadoop-ozone/pull/1371


   ## What changes were proposed in this pull request?
   
   **What's the problem?**
   
   With multi-raft enabled, leaders are not evenly distributed across datanodes. 
In my test there are 72 datanodes and each datanode participates in 6 
pipelines, so there are 144 pipelines. As the image shows, the leader counts 
of the 4 datanodes shown are 0, 0, 4, and 2, which is unbalanced. A Ratis 
leader not only accepts client requests but also replicates the log to 2 
followers, while a follower only receives the replicated log from the leader, 
so the leader's load is at least 3 times that of a follower. We therefore need 
to balance the leaders.
   
   
![image](https://user-images.githubusercontent.com/51938049/91788208-3cc6a400-ec3e-11ea-9e22-4dd4d30016df.png)
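
   The numbers in the test setup can be checked with a few lines of arithmetic. The sketch below assumes three-member pipelines (one leader and 2 followers, as described above); the class and method names are hypothetical and only illustrate the calculation.

```java
/** Hypothetical helper for the arithmetic in this description; not part of the patch. */
class LeaderBalanceArithmetic {

  /** Each pipeline occupies one slot on each of its members. */
  static int pipelineCount(int datanodes, int pipelinesPerDatanode, int membersPerPipeline) {
    return datanodes * pipelinesPerDatanode / membersPerPipeline;
  }

  /** With one leader per pipeline, this is the leader count per datanode when balanced. */
  static int balancedLeadersPerDatanode(int pipelines, int datanodes) {
    return pipelines / datanodes;
  }

  public static void main(String[] args) {
    int pipelines = pipelineCount(72, 6, 3);
    System.out.println("pipelines = " + pipelines);
    System.out.println("balanced leaders per datanode = "
        + balancedLeadersPerDatanode(pipelines, 72));
  }
}
```

   With 72 datanodes and 144 pipelines, a balanced cluster has 2 leaders per datanode, so counts such as 0 or 4 indicate imbalance.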
   
   **How to improve?**
   
   With guidance from @szetszwo, 
[RATIS-967](https://issues.apache.org/jira/browse/RATIS-967) not only supports 
priority in leader election, but also lets a lower-priority leader yield 
leadership to a higher-priority peer once the higher-priority peer's log has 
caught up.
   
   So in Ozone:
   1. Assign the suggested leader a higher priority and the 2 followers a 
lower priority, so that leader distribution can be balanced.
   2. Record the suggested leader count in DatanodeDetails. When creating a 
pipeline, choose the datanode with the smallest suggested leader count as the 
suggested leader.
   3. To avoid losing the suggested leader count when SCM restarts, also 
record it in the datanode. When SCM restarts, the datanode reports its 
suggested leader count to SCM.
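
   The first two steps can be sketched roughly as follows. This is a minimal illustration, not the code in this patch: `Node`, `chooseSuggestedLeader`, `createPipeline`, and the priority constants are hypothetical names, and the priority values are placeholders where only the relative order matters.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; these are not the classes touched by this PR. */
class SuggestedLeaderSketch {

  // Illustrative priority values: only their relative order matters here.
  static final int HIGH_PRIORITY = 1;
  static final int LOW_PRIORITY = 0;

  /** Stand-in for a datanode that carries its suggested leader count. */
  static final class Node {
    final String id;
    int suggestedLeaderCount;

    Node(String id) {
      this.id = id;
    }
  }

  /** Picks the member with the smallest suggested leader count; the first one wins ties. */
  static Node chooseSuggestedLeader(List<Node> members) {
    if (members.isEmpty()) {
      throw new IllegalArgumentException("pipeline has no members");
    }
    Node best = members.get(0);
    for (Node n : members) {
      if (n.suggestedLeaderCount < best.suggestedLeaderCount) {
        best = n;
      }
    }
    return best;
  }

  /** Assigns priorities for one new pipeline and increments the chosen leader's count. */
  static Map<String, Integer> createPipeline(List<Node> members) {
    Node leader = chooseSuggestedLeader(members);
    leader.suggestedLeaderCount++;
    Map<String, Integer> priorities = new LinkedHashMap<>();
    for (Node n : members) {
      priorities.put(n.id, n == leader ? HIGH_PRIORITY : LOW_PRIORITY);
    }
    return priorities;
  }

  public static void main(String[] args) {
    List<Node> members =
        Arrays.asList(new Node("dn-a"), new Node("dn-b"), new Node("dn-c"));
    // Creating three pipelines over the same members rotates the suggested leader.
    for (int i = 0; i < 3; i++) {
      System.out.println(createPipeline(members));
    }
  }
}
```

   Because the count is incremented each time a datanode is chosen, repeated pipeline creation over the same members spreads the suggested leader role evenly instead of favoring one node.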
   
   As the following image shows, with 72 datanodes each participating in 6 
pipelines (144 pipelines in total), the leader count of every datanode is 2, 
without exception, so leader distribution is balanced. 
   
   
![image](https://user-images.githubusercontent.com/51938049/91788822-c7f46980-ec3f-11ea-87e1-3d7a5fccf181.png)
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-2922
   
   
   ## How was this patch tested?
   
   Added new unit tests.
   

