> * It's possible to update the old configuration first by using client.admin().setConfiguration(), let's say set N=11 first, then start new nodes. However, since 5 < 11/2, the cluster won't be able to elect leader until at least 1 new node join. Yes, you are right. Also, even if one node has joined, the group has to wait for it to catch up with the previous log entries in order to obtain a majority for committing new entries.
> * Or may be we should limit the count when scaling? From N=5 -> N=7 -> N=9 -> N=11. We may start the 6 new nodes as listeners first. Listeners receive log entries but they are not voting members and they won't be counted for majority. When the listeners catch up, we may change them to normal nodes so that they become voting members. Tsz-Wo On Fri, Jul 1, 2022 at 10:23 AM Tsz Wo Sze <[email protected]> wrote: > Hi Riguz, > > > Start 6 new nodes with new configuration N=11, while keeping the > previous nodes running > > This step probably won't work as expected since it will create a new group > but not adding nodes to the original group. We must use the > setConfiguration API to change configuration (add/remove nodes); see > https://github.com/apache/ratis/blob/bd83e7d7fd41540c8bda6bd92a52ac99ccec2076/ratis-client/src/main/java/org/apache/ratis/client/api/AdminApi.java#L35 > > Hope it helps. Thanks a lot for trying Ratis! > > Tsz-Wo > > On Fri, Jul 1, 2022 at 12:30 AM Riguz Lee <[email protected]> wrote: > >> >> >> Hi, >> >> >> I'm testing scaling up/down the raft cluster, but ratis is not working as >> expected in new cluster. My steps are: >> >> >> * Initialize a cluster with 5 nodes, the size and peers of the cluster is >> configured in a configuration file, let's say N=5. The cluster works >> perfectly, raft logs are synchronized across the cluster. >> >> * Start 6 new nodes with new configuration N=11, while keeping the >> previous nodes running >> >> * Recreate the previous nodes with N=11 one by one >> >> >> According the raft paper, raft should be able to handle configuration >> change by design, but after the above steps, what I've found is that: >> >> >> - New nodes not able to join the cluster >> >> - Old nodes still has a size of 5(by >> *client.getGroupManagementApi(peerId).info(groupId)*) >> >> >> So how should I scale the cluster correctly? A few thoughts of mine: >> >> >> * Definitely the old cluster should not be stopped while starting new >> nodes, otherwise new nodes might be able to elect new leader(eg. N=11 with >> 6 new nodes) and raft logs in old nodes will be overriden. >> >> * It's possible to update the old configuration first by using >> client.admin().setConfiguration(), let's say set N=11 first, then start new >> nodes. However, since 5 < 11/2, the cluster won't be able to elect leader >> until at least 1 new node join. >> >> * Or may be we should limit the count when scaling? From N=5 -> N=7 -> >> N=9 -> N=11. >> >> >> Thanks, >> >> Riguz Lee >> >> >> >>
