OneSizeFitsQuorum opened a new pull request, #13221:
URL: https://github.com/apache/iotdb/pull/13221

   In the current Ratis module, the Leader may step down voluntarily without 
knowing who the new Leader is. In that case the state machine's 
notifyLeaderChange callback is not triggered. Modules that rely on this 
callback to detect that the current node is no longer the Leader may then 
delay releasing their resources, potentially causing split-brain issues with 
multiple Leaders.
   
   <img width="1103" alt="image" 
src="https://github.com/user-attachments/assets/360b8b71-d3bf-4157-84b2-83c3c93deb19">
   
   <img width="901" alt="image" 
src="https://github.com/user-attachments/assets/5d5d9195-cb7f-4eee-a1b5-6d00dea4734c">
   
   For example, in a 3-node ConfigNode setup, if a symmetric network partition 
fault is injected into the Leader node, the other two nodes will elect a new 
Leader. However, Leader-only services on the old Leader (such as heartbeat and 
procedure) are not shut down, so two nodes behave as Leader at the same time. 
This split-brain state can cause unexpected behavior.
   
   <img width="351" alt="image" 
src="https://github.com/user-attachments/assets/12c70de0-308c-450f-a334-91b39ca17ae8">
   
   
   After this PR, Ratis still calls the notifyNotReady function when the 
Leader steps down, even if the new Leader is unknown. Dependent modules can 
then release their Leader-only resources promptly, which prevents split-brain 
issues.
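
   The idea can be illustrated with a minimal, self-contained sketch. It is 
not the code in this PR and does not use Ratis or IoTDB APIs: the class 
LeaderServiceGuard and the methods onLeaderChanged and onNotLeader are 
hypothetical names chosen for illustration. It assumes one callback that fires 
only when the new Leader is known, and a second one that fires on every 
step-down, so Leader-only services are stopped in both cases.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names are Ratis or IoTDB APIs.
final class LeaderServiceGuard {

  /** Stand-in for a Leader-only service such as heartbeat or procedure. */
  static final class LeaderOnlyService {
    final String name;
    private boolean running;

    LeaderOnlyService(String name) {
      this.name = name;
    }
  }

  private final int selfId;
  private final List<LeaderOnlyService> services = new ArrayList<>();

  LeaderServiceGuard(int selfId, String... serviceNames) {
    this.selfId = selfId;
    for (String serviceName : serviceNames) {
      services.add(new LeaderOnlyService(serviceName));
    }
  }

  /** Assumed to fire only when the identity of the new Leader is known. */
  synchronized void onLeaderChanged(int newLeaderId) {
    setRunning(newLeaderId == selfId);
  }

  /** Assumed to fire on every step-down, even if no successor is known yet. */
  synchronized void onNotLeader() {
    setRunning(false);
  }

  /** Returns true if at least one Leader-only service is still running. */
  synchronized boolean anyRunning() {
    for (LeaderOnlyService service : services) {
      if (service.running) {
        return true;
      }
    }
    return false;
  }

  private void setRunning(boolean value) {
    for (LeaderOnlyService service : services) {
      service.running = value;
    }
  }

  public static void main(String[] args) {
    LeaderServiceGuard node = new LeaderServiceGuard(1, "heartbeat", "procedure");
    node.onLeaderChanged(1); // node 1 becomes Leader, services start

    // Partition: node 1 steps down without learning who the new Leader is,
    // so onLeaderChanged is never invoked. Only onNotLeader fires.
    node.onNotLeader();
    System.out.println("services still running: " + node.anyRunning());
  }
}
```

   Without the onNotLeader path, the services in this sketch would keep 
running until a later onLeaderChanged call arrived, which is the window in 
which the split-brain described above can occur.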
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
