[jira] [Updated] (IGNITE-27746) MDC. Implement parallel ping of DC2's nodes with the connection recovery.

Vladimir Steshin (Jira) Wed, 04 Feb 2026 10:16:59 -0800


     [ 
https://issues.apache.org/jira/browse/IGNITE-27746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Vladimir Steshin updated IGNITE-27746:
--------------------------------------
    Description: 
Consider:
 * The Multi-DC feature is on.
 * A corner node from DC1 can't send a message to its next node in DC2.
 * DC2 is unavailable.
 * No node of DC1 can connect to any node in DC2.

To prevent sequential node failures in DC1, we need to extend the connection
recovery mechanism. We need to know whether DC2 is completely unavailable. If
so, we switch to DC/brain split but keep the nodes of DC1 online. To achieve
this, we might ping DC2's nodes from the edge node while it performs the normal
connection recovery, under the same connection recovery timeout. If the
recovery fails and no ping to DC2 succeeds, we consider DC1 to work separately
from DC2.
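
The idea above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the actual Ignite discovery code:
the class ParallelDcPing, the Outcome enum, and the recovery and ping callbacks
are hypothetical names introduced for the example. The behaviour when the
recovery fails but some DC2 node still answers a ping (treating only the next
node as unreachable) is also an assumption, since the description covers only
the case where no ping succeeds.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;
import java.util.function.Predicate;

/** Hypothetical sketch: ping DC2 nodes in parallel with the connection recovery. */
final class ParallelDcPing {
    /** Hypothetical outcomes as seen by the edge node of DC1. */
    enum Outcome { RECOVERED, NEXT_NODE_UNREACHABLE, DC_SPLIT }

    /**
     * @param recovery Hypothetical callback: normal connection recovery to the next node.
     * @param dc2Nodes Identifiers of the known DC2 nodes.
     * @param ping Hypothetical callback: returns true if the given DC2 node responds.
     * @param timeout Shared budget (the connection recovery timeout).
     */
    static Outcome recoverOrDetectSplit(BooleanSupplier recovery, List<String> dc2Nodes,
        Predicate<String> ping, Duration timeout) {
        ExecutorService pool = Executors.newCachedThreadPool();

        try {
            // One deadline for both the recovery and all pings.
            long deadline = System.nanoTime() + timeout.toNanos();

            // Start the recovery and the pings at the same time.
            CompletableFuture<Boolean> recoveryFut =
                CompletableFuture.supplyAsync(recovery::getAsBoolean, pool);

            List<CompletableFuture<Boolean>> pingFuts = new ArrayList<>();

            for (String node : dc2Nodes)
                pingFuts.add(CompletableFuture.supplyAsync(() -> ping.test(node), pool));

            if (await(recoveryFut, deadline))
                return Outcome.RECOVERED;

            // Recovery failed or timed out: any successful ping means DC2 is still alive.
            for (CompletableFuture<Boolean> f : pingFuts) {
                if (await(f, deadline))
                    return Outcome.NEXT_NODE_UNREACHABLE; // Assumption, see the note above.
            }

            // Recovery failed and no DC2 node answered: DC1 continues separately.
            return Outcome.DC_SPLIT;
        }
        finally {
            pool.shutdownNow(); // Interrupt whatever is still running.
        }
    }

    /** Waits until the deadline; timeouts, errors and interrupts count as failure. */
    private static boolean await(CompletableFuture<Boolean> fut, long deadlineNanos) {
        try {
            long left = Math.max(0, deadlineNanos - System.nanoTime());

            return fut.get(left, TimeUnit.NANOSECONDS);
        }
        catch (ExecutionException | TimeoutException e) {
            return false;
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            return false;
        }
    }

    public static void main(String[] args) {
        // Neither the recovery nor any ping succeeds, so the result is DC_SPLIT.
        System.out.println(recoverOrDetectSplit(
            () -> false, List.of("dc2-a", "dc2-b"), node -> false, Duration.ofSeconds(1)));
    }
}
```

Because the pings start together with the recovery and share its deadline, the
check should not add to the total time the edge node spends before making a
decision.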

  was:
Consider:
 * The Multi-DC feature is on.
 * A corner node from DC1 can't send a message to its next node in DC2.
 * DC2 is unavailable.
 * No node of DC1 can connect.

To prevent sequential node failures in DC1, we need to extend the connection
recovery mechanism. We need to know whether DC2 is completely unavailable. If
so, we switch to DC/brain split but keep the nodes of DC1 online. To achieve
this, we might ping DC2's nodes from the edge node while it performs the normal
connection recovery, under the same connection recovery timeout. If the
recovery fails and no ping to DC2 succeeds, we consider DC1 to work separately
from DC2.


> MDC. Implement parallel ping of DC2's nodes with the connection recovery.
> -------------------------------------------------------------------------
>
>                 Key: IGNITE-27746
>                 URL: https://issues.apache.org/jira/browse/IGNITE-27746
>             Project: Ignite
>          Issue Type: Sub-task
>            Reporter: Vladimir Steshin
>            Priority: Major
>              Labels: ise
>
> Consider:
>  * The Multi-DC feature is on.
>  * A corner node from DC1 can't send a message to its next node in DC2.
>  * DC2 is unavailable.
>  * No node of DC1 can connect to any node in DC2.
> To prevent sequential node failures in DC1, we need to extend the connection
> recovery mechanism. We need to know whether DC2 is completely unavailable. If
> so, we switch to DC/brain split but keep the nodes of DC1 online. To achieve
> this, we might ping DC2's nodes from the edge node while it performs the
> normal connection recovery, under the same connection recovery timeout. If
> the recovery fails and no ping to DC2 succeeds, we consider DC1 to work
> separately from DC2.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
