Re: question about node segment and split brain

2020-07-15 Thread Ilya Kasnacheev
Hello!

1. The node is judged as segmented if the GC pause is longer than
failureDetectionTimeout.
2. No, NODE_SEGMENTED is reserved for the case where a node has been
excluded from the cluster, is alone, and is not able to function as a
single-node cluster (that is, it was explicitly removed from the cluster).
3. I've never seen a cluster split into two clusters.
4. Perhaps, but see above.
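
As a rough illustration of point 1, the comparison can be sketched as a toy model in plain Java. This is not Ignite code: the class name, the method names, and the sample pause values are all hypothetical, and the 10000 ms constant assumes the documented default for failureDetectionTimeout, so check the value your own configuration uses.

```java
import java.util.List;

/** Toy model of the rule in point 1. Hypothetical helper, not part of Ignite. */
public class GcPauseCheck {
    /** Assumed default for failureDetectionTimeout, in milliseconds. */
    public static final long DEFAULT_FAILURE_DETECTION_TIMEOUT_MS = 10_000L;

    /** True when a single GC pause is strictly longer than the timeout. */
    public static boolean riskOfSegmentation(long gcPauseMs, long failureDetectionTimeoutMs) {
        return gcPauseMs > failureDetectionTimeoutMs;
    }

    /** Longest pause in a list of observed pauses, or 0 if the list is empty. */
    public static long longestPause(List<Long> pausesMs) {
        return pausesMs.stream().mapToLong(Long::longValue).max().orElse(0L);
    }

    public static void main(String[] args) {
        // Hypothetical pause durations, e.g. collected from GC logs.
        List<Long> observed = List.of(120L, 850L, 12_500L);
        long worst = longestPause(observed);
        System.out.println("worst pause (ms): " + worst);
        System.out.println("segmentation risk: "
            + riskOfSegmentation(worst, DEFAULT_FAILURE_DETECTION_TIMEOUT_MS));
    }
}
```

The check only compares one pause against one timeout; it does not model heartbeats, rejoin, or what SegmentationPolicy does afterwards.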

Regards,
-- 
Ilya Kasnacheev


Thu, Jul 2, 2020 at 13:45, bbweb:

> Hi,
>
> We are considering Ignite for a mission-critical system and are concerned
> about the node segmentation and split-brain problems. I searched the
> current documentation and suggestions but still have the following
> questions. Please help, and thanks in advance:
>
>
>
> 1. For the long GC case: when a node goes through a long GC pause and the
> pause ends, what is the heartbeat check and rejoin sequence for this node,
> and when will the node be judged as segmented?
>
>
>
> 2. It seems that EVT_NODE_SEGMENTED is sometimes reported from the
> segmented node. When is the EVT_NODE_SEGMENTED event fired, and what
> action follows? SegmentationPolicy defines what happens when segmentation
> occurs, and its default value is STOP. Does this mean that even if no
> segmentation plugin is defined, the behavior can still be controlled (for
> example, stop or restart) using SegmentationPolicy for EVT_NODE_SEGMENTED?
>
>
> 3. If a network issue such as a switch outage causes one cluster to split
> into two clusters, will node segmentation be reported and the
> EVT_NODE_SEGMENTED event be fired? Is this related to the number of nodes
> in each part? For example, if only one node is isolated, the event may be
> fired, but if the split-off part has at least two nodes, no event is fired
> and both parts keep providing service, which causes split brain? In our
> test, a short network outage could actually produce even a one-node
> cluster, and this node still provided service with no segmentation event
> reported.
>
>
> 4. Can ZooKeeper resolve all of these issues, including long GC pauses
> and short network outages?
>
>
> 5. As ZooKeeper is intended for big clusters, can a plugin resolve all
> cases for a small cluster? I looked at some implementations, including
> GridGain's paid version. The different plugins check node access, port
> access, and a shared file system, but it seems difficult to handle some
> cases such as GC, because in that case the host is still reachable while
> the Ignite node is unreachable for some time. So what is the suggested
> plugin that can resolve all cases?
>
> Thanks!


question about node segment and split brain

2020-07-02 Thread bbweb
Hi,

We are considering Ignite for a mission-critical system and are concerned about the node segmentation and split-brain problems. I searched the current documentation and suggestions but still have the following questions. Please help, and thanks in advance:

1. For the long GC case: when a node goes through a long GC pause and the pause ends, what is the heartbeat check and rejoin sequence for this node, and when will the node be judged as segmented?

2. It seems that EVT_NODE_SEGMENTED is sometimes reported from the segmented node. When is the EVT_NODE_SEGMENTED event fired, and what action follows? SegmentationPolicy defines what happens when segmentation occurs, and its default value is STOP. Does this mean that even if no segmentation plugin is defined, the behavior can still be controlled (for example, stop or restart) using SegmentationPolicy for EVT_NODE_SEGMENTED?

3. If a network issue such as a switch outage causes one cluster to split into two clusters, will node segmentation be reported and the EVT_NODE_SEGMENTED event be fired? Is this related to the number of nodes in each part? For example, if only one node is isolated, the event may be fired, but if the split-off part has at least two nodes, no event is fired and both parts keep providing service, which causes split brain? In our test, a short network outage could actually produce even a one-node cluster, and this node still provided service with no segmentation event reported.

4. Can ZooKeeper resolve all of these issues, including long GC pauses and short network outages?

5. As ZooKeeper is intended for big clusters, can a plugin resolve all cases for a small cluster? I looked at some implementations, including GridGain's paid version. The different plugins check node access, port access, and a shared file system, but it seems difficult to handle some cases such as GC, because in that case the host is still reachable while the Ignite node is unreachable for some time. So what is the suggested plugin that can resolve all cases?

Thanks!