Re: How long Ignite retries upon NODE_FAILED events

Evgenii Zhuravlev Mon, 02 Jul 2018 10:24:53 -0700

If the cluster has already decided that the node failed, the node will be
stopped when it tries to reconnect to the cluster with the same ID.
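
To make the sequence concrete, here is a small toy model in plain Java. It is
only a sketch of the behaviour described in this thread, not Ignite's
discovery code: the class ToyCluster, its methods, and the JoinResult enum are
all hypothetical names invented for illustration, and the 10,000 ms timeout in
the usage note is just an example value. It also assumes that a node coming
back under a new ID is treated as a fresh node, which the thread does not
state explicitly.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Toy model only: NOT Ignite's discovery implementation. */
class ToyCluster {
    /** Outcome of a join attempt in this toy model. */
    enum JoinResult { ACCEPTED, REJECTED_NODE_MUST_STOP }

    private final long failureDetectionTimeoutMs;

    /** Last time each live node was heard from. */
    private final Map<UUID, Long> lastHeartbeat = new HashMap<>();

    /** IDs the cluster has already declared failed. */
    private final Set<UUID> failed = new HashSet<>();

    ToyCluster(long failureDetectionTimeoutMs) {
        this.failureDetectionTimeoutMs = failureDetectionTimeoutMs;
    }

    /** Records a heartbeat from a node that is still part of the cluster. */
    void heartbeat(UUID nodeId, long nowMs) {
        if (lastHeartbeat.containsKey(nodeId)) {
            lastHeartbeat.put(nodeId, nowMs);
        }
    }

    /** Declares every node failed that has been silent longer than the timeout. */
    void detectFailures(long nowMs) {
        lastHeartbeat.entrySet().removeIf(e -> {
            boolean silentTooLong = nowMs - e.getValue() > failureDetectionTimeoutMs;
            if (silentTooLong) {
                failed.add(e.getKey());
            }
            return silentTooLong;
        });
    }

    /** A node whose ID was already declared failed is told to stop. */
    JoinResult tryJoin(UUID nodeId, long nowMs) {
        if (failed.contains(nodeId)) {
            return JoinResult.REJECTED_NODE_MUST_STOP;
        }
        lastHeartbeat.put(nodeId, nowMs);
        return JoinResult.ACCEPTED;
    }

    boolean isAlive(UUID nodeId) {
        return lastHeartbeat.containsKey(nodeId);
    }
}
```

In this model, with new ToyCluster(10_000), a node that joins at time 0 and
then stays silent is declared failed by detectFailures(12_000). A later
tryJoin with the same ID returns REJECTED_NODE_MUST_STOP, while a join with a
fresh ID is accepted.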


2018-07-02 18:37 GMT+03:00 HEWA WIDANA GAMAGE, SUBASH <
subash.hewawidanagam...@fmr.com>:

> Yes, failureDetectionTimeout determines how long it waits before marking a
> node as failed. But my question is: after such a node failure has happened,
> what happens when that failed node becomes reachable in the network again
> (in less than failureDetectionTimeout)?
>
>
>
> *From:* Evgenii Zhuravlev [mailto:e.zhuravlev...@gmail.com]
> *Sent:* Monday, July 02, 2018 11:05 AM
> *To:* user@ignite.apache.org
> *Subject:* Re: How long Ignite retries upon NODE_FAILED events
>
>
>
> Hi,
>
>
>
> By default, Ignite uses a mechanism that can be configured using
> failureDetectionTimeout:
> https://apacheignite.readme.io/v2.5/docs/tcpip-discovery#section-failure-detection-timeout
>
>
>
> Evgenii
>
>
>
> 2018-07-02 16:40 GMT+03:00 HEWA WIDANA GAMAGE, SUBASH <
> subash.hewawidanagam...@fmr.com>:
>
> Hi team,
>
>
>
> For example, let’s say one of the nodes is not down (the JVM is up), but the
> network is not reachable from/to it. The rest of the nodes will then see
> NODE_FAILED and continue working as normal with a reduced cluster size.
> Suppose the network from/to that failed node becomes normal again after X
> minutes. Then:
>
> - Will the other nodes discover it, or will that node be able to figure it
> out?
>
> - How long can X be at most? Is there a maximum retry count or timeout? (I
> have seen the joinTimeout param in discovery, but that seems applicable only
> at startup, like how long a starting node should pause to let others join.)
>
>
>
