OneSizeFitsQuorum opened a new pull request, #13221: URL: https://github.com/apache/iotdb/pull/13221
The Leader in the current Ratis module may step down voluntarily without knowing who the new Leader is. In that case the state machine's notifyLeaderChange callback is not triggered. Modules that rely on this callback to learn that the current node is no longer the Leader may therefore delay releasing their resources, which can cause split-brain issues with multiple Leaders.

<img width="1103" alt="image" src="https://github.com/user-attachments/assets/360b8b71-d3bf-4157-84b2-83c3c93deb19">
<img width="901" alt="image" src="https://github.com/user-attachments/assets/5d5d9195-cb7f-4eee-a1b5-6d00dea4734c">

For example, in a 3-node ConfigNode setup, if a symmetric network partition fault is injected on the Leader node, the other two nodes elect a new Leader. However, Leader-only services on the old Leader (such as heartbeat and procedure) are not stopped, so the cluster ends up in a split-brain state that can cause unexpected behavior.

<img width="351" alt="image" src="https://github.com/user-attachments/assets/12c70de0-308c-450f-a334-91b39ca17ae8">

After this PR, Ratis still calls the notifyNotReady function even when the new Leader is unknown, which prevents these split-brain issues.
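To make the failure mode concrete, here is a minimal sketch of the pattern described above. It does not use the real Ratis or IoTDB classes: `LeadershipEvents`, `LeaderServices`, and `StepDownDemo` are hypothetical names, and the two callbacks only borrow the names used in this description. The assumption is that Leader-only services are stopped from whichever callback fires first.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical callback interface. The method names follow the PR description,
// not the actual Ratis or IoTDB signatures.
interface LeadershipEvents {
    // Fired only when the identity of the new Leader is known.
    void notifyLeaderChange(int newLeaderId);

    // Fired on any step-down, even when the new Leader is unknown.
    void notifyNotReady();
}

// Hypothetical stand-in for Leader-only services (heartbeat, procedure, ...).
class LeaderServices implements LeadershipEvents {
    private final int selfId;
    private final AtomicBoolean running = new AtomicBoolean(false);

    LeaderServices(int selfId) {
        this.selfId = selfId;
    }

    void start() {
        running.set(true);
    }

    boolean isRunning() {
        return running.get();
    }

    @Override
    public void notifyLeaderChange(int newLeaderId) {
        // Stop only if some other node has become the Leader.
        if (newLeaderId != selfId) {
            running.set(false);
        }
    }

    @Override
    public void notifyNotReady() {
        // No new Leader id is needed: stepping down is enough to stop.
        running.set(false);
    }
}

class StepDownDemo {
    public static void main(String[] args) {
        LeaderServices services = new LeaderServices(1);
        services.start();

        // Partitioned Leader steps down without learning the new Leader,
        // so notifyLeaderChange never arrives; notifyNotReady still does.
        services.notifyNotReady();
        System.out.println("services running after step-down: " + services.isRunning());
    }
}
```

If only `notifyLeaderChange` existed in this sketch, a partitioned old Leader would keep `running` set to true indefinitely, which corresponds to the split-brain scenario in the ConfigNode example.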
