BTW, you can try ZooKeeper Discovery; I think it's an easier way to
resolve the split-brain problem:
https://www.gridgain.com/docs/latest/developers-guide/clustering/zookeeper-discovery
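
For reference, a minimal Java sketch of switching a node to ZooKeeper
Discovery might look like the following. It assumes the ignite-zookeeper
module is on the classpath; the class name, connection string, root path,
and timeout values are placeholders for illustration, not recommendations.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.zk.ZookeeperDiscoverySpi;

// Hypothetical class name; starts one Ignite node that uses ZooKeeper Discovery.
public class ZkDiscoveryNode {
    public static void main(String[] args) {
        ZookeeperDiscoverySpi zkSpi = new ZookeeperDiscoverySpi();

        // Placeholder ZooKeeper ensemble address (assumption).
        zkSpi.setZkConnectionString("zk1:2181,zk2:2181,zk3:2181");

        // Placeholder values in milliseconds; tune for your environment.
        zkSpi.setSessionTimeout(30_000);
        zkSpi.setJoinTimeout(10_000);

        // Placeholder znode under which the cluster keeps its discovery data.
        zkSpi.setZkRootPath("/ignite");

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDiscoverySpi(zkSpi);

        Ignite ignite = Ignition.start(cfg);
        System.out.println("Nodes in topology: " + ignite.cluster().nodes().size());
    }
}
```

Nodes that should form one cluster would be started with the same
connection string and root path.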

On Fri, Sep 11, 2020 at 14:13, Michael Cherkasov <michael.cherka...@gmail.com>
wrote:

> Make sure you first stop all nodes in one segment and only then start
> them again; a rolling restart might not fix the cluster segmentation.
>
>
> On Fri, Sep 11, 2020 at 09:08, Denis Magda <dma...@apache.org> wrote:
>
>> Hi Samuel,
>>
>> With the current behavior, the segments will not rejoin automatically.
>> Once the network is recovered from a network partitioning event, you need
>> to restart all the nodes of one of the segments. Those nodes will join the
>> other nodes and the cluster will become fully operational.
>>
>> Let me know if you have any other questions or need guidance with this.
>>
>> -
>> Denis
>>
>>
>> On Fri, Sep 11, 2020 at 7:38 AM Samuel Ueltschi <
>> samuel.uelts...@bsi-software.com> wrote:
>>
>>> Hi
>>>
>>>
>>>
>>> I've been testing Ignite (2.8.1) and its behaviour under network
>>> segmentation.
>>>
>>> According to the docs, Ignite nodes should be able to detect network
>>> segmentation and apply the configured SegmentationPolicy.
>>>
>>>
>>>
>>> However, the segmentation handling didn't trigger as I would have
>>> expected.
>>>
>>> For my tests, I set up three cluster nodes c1, c2 and c3 running in
>>> Docker containers, all competing for a shared IgniteLock instance in a loop.
>>>
>>> Then I used iptables in container c2 to drop all incoming and outgoing
>>> packets on that node.
>>>
>>> After a few seconds I got the following events:
>>>
>>>
>>>
>>> c1:
>>>
>>> - EVT_NODE_FAILED for c2
>>>
>>>
>>>
>>> c2:
>>>
>>> - EVT_NODE_FAILED for c1
>>>
>>> - EVT_NODE_FAILED for c3
>>>
>>>
>>>
>>> c3:
>>>
>>> - EVT_NODE_FAILED for c2
>>>
>>>
>>>
>>> Then I reset the iptables rules, expecting that c2 would rejoin the
>>> cluster and detect segmentation.
>>>
>>> However, this didn't happen; c2 just kept running as a second standalone
>>> cluster instance.
>>>
>>> Only after I restarted c2 did it rejoin the cluster.
>>>
>>>
>>>
>>> Eventually I was able to trigger the EVT_NODE_SEGMENTED event by pausing
>>> the c2 container for 1 minute. After resuming, c2 detected the segmentation
>>> and ran the segmentation policy as expected.
>>>
>>>
>>>
>>> Is this behaviour correct? Shouldn't the Ignite cluster be able to
>>> recover from the first scenario?
>>>
>>> During a network segmentation, no packets would be able to move between
>>> nodes, so the iptables approach should be realistic in my opinion.
>>>
>>>
>>>
>>> Maybe I have some wrong assumptions about network segmentation, so any
>>> feedback would be greatly appreciated.
>>>
>>>
>>>
>>> Cheers Sam
>>>
>>>
>>>
>>> --
>>> Software Engineer
>>> BSI Business Systems Integration AG
>>> Erlachstrasse 16B, CH-3012 Bern
>>> Telefon +41 31 850 12 06
>>>
>>> www.bsi-software.com
>>>
>>>
>>>
>>
