Hi, Xiangdong and Xinyu,

The PR https://github.com/apache/iotdb/pull/3797 for JIRA 
https://issues.apache.org/jira/browse/IOTDB-1564 is ready for review.
Please take a look and share any suggestions on the code.

Thanks.

-----Original Message-----
From: Xiangdong Huang <saint...@gmail.com>
Sent: August 25, 2021 12:02
To: dev <dev@iotdb.apache.org>
Subject: Re: Re: Conclusion about JIRA issue[IOTDB-1564]: Make leader failure
detection and election faster

Hi,

The current code is:

```
long electionWait =
    ClusterConstant.getElectionLeastTimeOutMs()
        + Math.abs(random.nextLong() % ClusterConstant.getElectionRandomTimeOutMs());
```

The comment there says that electionLeastTimeOutMs should be at least as long
as a heartbeat.

IMO, these two parameters are enough, and we do not need to add more.

But the default values can be changed:
1. electionLeastTimeOutMs could default to heartbeat * 2 (or something
similar) rather than 2 seconds.
2. electionRandomTimeOutMs could default to 50 ms, or something like
heartbeat / 10?
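
To make the idea concrete, here is a rough sketch of how the two defaults could be derived from the heartbeat interval. The class and method names are hypothetical and not taken from the IoTDB code base, and the factor 2 and the divisor 10 are only the example values mentioned above. It uses ThreadLocalRandom instead of Math.abs(random.nextLong() % ...), which is just one possible way to draw a value in [0, bound).

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not part of IoTDB: derives the election wait
// from the heartbeat interval using the example defaults above.
class ElectionWaitSketch {

  // Assumed default: the least timeout is twice the heartbeat interval.
  static long electionLeastTimeOutMs(long heartbeatIntervalMs) {
    return 2 * heartbeatIntervalMs;
  }

  // Assumed default: the random part is heartbeat / 10, clamped to at
  // least 1 ms so the bound passed to nextLong stays positive.
  static long electionRandomTimeOutMs(long heartbeatIntervalMs) {
    return Math.max(1, heartbeatIntervalMs / 10);
  }

  // Returns a wait time in [least, least + random).
  static long electionWaitMs(long heartbeatIntervalMs) {
    long least = electionLeastTimeOutMs(heartbeatIntervalMs);
    long random = electionRandomTimeOutMs(heartbeatIntervalMs);
    return least + ThreadLocalRandom.current().nextLong(random);
  }

  public static void main(String[] args) {
    long heartbeatIntervalMs = 1000;
    System.out.println("electionWait = " + electionWaitMs(heartbeatIntervalMs) + " ms");
  }
}
```

With a 1000 ms heartbeat this gives a wait somewhere in [2000, 2100) ms; whether heartbeat / 10 or a fixed 50 ms is the better random range is still open.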

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院

Eric Pai <ericpa...@hotmail.com> wrote on Mon, Aug 23, 2021 at 10:18 AM:
>
> Hi, Xiangdong,
>
> So what is your suggestion for the election waiting time? Should we add
> another configuration parameter called election_wait_time_ms, or leave it
> as a shorter hardcoded constant?
>
> From: Eric Pai <ericpa...@hotmail.com>
> Date: Saturday, August 21, 2021, 7:32 PM
> To: "dev@iotdb.apache.org" <dev@iotdb.apache.org>
> Subject: Re: Conclusion about JIRA issue[IOTDB-1564]: Make leader failure
> detection and election faster
>
> Hi, all,
>
> Currently the randomElectionWait time is hardcoded to 3-5 s, which is not
> suitable when heartbeat_interval_ms and election_timeout_ms are set to
> small values.
>
> I plan to change it to [2 * heartbeat_interval_ms, 2 * heartbeat_interval_ms
> + 50 ms).
>
> The 50 ms is taken from the Raft paper: it keeps the probability of split
> votes low and keeps elections fast when split votes do happen.
>
> But I haven’t found any detailed descriptions about the relationship between 
> heartbeat_interval_ms and the least waiting time.
>
> Any good suggestions?
>
> From: 白 渐
> Sent: August 18, 2021 22:14
> To: dev@iotdb.apache.org
> Subject: Conclusion about JIRA issue[IOTDB-1564]: Make leader failure
> detection and election faster
>
> Hi, all,
>
> @Xinyu Tan and I have reached a conclusion on refining the heartbeat- and
> election-related timeout parameters:
>
> JIRA link: https://issues.apache.org/jira/browse/IOTDB-1564
>
> Two parameters are added:
>
> heartbeat_interval_ms (t1): The interval (in ms) between two rounds of
> heartbeat broadcasts from a raft group leader.
>
> election_timeout_ms (t2 and t3): The election timeout of followers and
> candidates. It is used both as the heartbeat expiry time of a follower (t2)
> and as the time a candidate waits for the voting result (t3).
>
>                          t1             t1
> Leader view:   Send HB - - -> Send HB - - -> Send HB
>                                                 t2                                     t3
> Follower view: Receive HB - - -> Receive HB - - - - -> HB expired / Start election - - - - -> Election Timeout
>
> I will work on the following items in due course:
>
> 1.     Coding.
>
> 2.     Proper test cases.
>
> 3.     Docs about new parameters.
>
> Thanks.
>
>
