If the cluster has already decided that the node failed, the node will be stopped when it tries to reconnect to the cluster with the same ID.
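
To make the behavior above observable, here is a minimal sketch of a node configuration. It assumes the stop is governed by the node's segmentation policy (`SegmentationPolicy.STOP`, which I believe is the default) and that a 10-second `failureDetectionTimeout` is acceptable; both values are illustrative, not taken from the poster's setup. The class name `SegmentationExample` is hypothetical.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.DiscoveryEvent;
import org.apache.ignite.events.EventType;
import org.apache.ignite.lang.IgnitePredicate;
import org.apache.ignite.plugin.segmentation.SegmentationPolicy;

public class SegmentationExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // How long the cluster waits before marking an unresponsive node as failed (ms).
        // Illustrative value, not a recommendation.
        cfg.setFailureDetectionTimeout(10_000);

        // What this node does once it learns it was dropped from the topology.
        // Assumption: STOP matches the behavior described above.
        cfg.setSegmentationPolicy(SegmentationPolicy.STOP);

        // List the event types we want to listen for explicitly.
        cfg.setIncludeEventTypes(EventType.EVT_NODE_FAILED, EventType.EVT_NODE_SEGMENTED);

        Ignite ignite = Ignition.start(cfg);

        // Log when a remote node is marked failed, or when this node is segmented.
        IgnitePredicate<DiscoveryEvent> lsnr = evt -> {
            System.out.println(evt.name() + ": " + evt.eventNode().id());
            return true; // returning true keeps the listener registered
        };

        ignite.events().localListen(lsnr, EventType.EVT_NODE_FAILED, EventType.EVT_NODE_SEGMENTED);
    }
}
```

With a setup like this, the surviving nodes should log `NODE_FAILED` for the isolated node, and the isolated node should log `NODE_SEGMENTED` before it stops once the network recovers. If stopping is not desirable, `SegmentationPolicy.RESTART_JVM` or `SegmentationPolicy.NOOP` are the other policies to look at.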

2018-07-02 18:37 GMT+03:00 HEWA WIDANA GAMAGE, SUBASH <subash.hewawidanagam...@fmr.com>:

> Yes, failureDetectionTimeout determines the time it waits to mark a node
> failed. But my question is, after such a node failure has happened, what
> happens when that failed node becomes reachable in the network again (less
> than failureDetectionTimeout)?
>
> *From:* Evgenii Zhuravlev [mailto:e.zhuravlev...@gmail.com]
> *Sent:* Monday, July 02, 2018 11:05 AM
> *To:* user@ignite.apache.org
> *Subject:* Re: How long Ignite retries upon NODE_FAILED events
>
> Hi,
>
> by default, Ignite uses a mechanism that can be configured using
> failureDetectionTimeout:
> https://apacheignite.readme.io/v2.5/docs/tcpip-discovery#section-failure-detection-timeout
>
> Evgenii
>
> 2018-07-02 16:40 GMT+03:00 HEWA WIDANA GAMAGE, SUBASH <subash.hewawidanagam...@fmr.com>:
>
> Hi team,
>
> For example, let's say one of the nodes is not down (the JVM is up), but the
> network is not reachable from/to it. Then the rest of the nodes will see
> NODE_FAILED and start working as normal with a reduced cluster size. Suppose
> the network from/to that failed node becomes normal again after X minutes.
> Then:
>
> - Will the other nodes discover it, or will that node be able to figure it
>   out?
>
> - How long can X be at most? Is there a max retry count or timeout? (I have
>   seen the joinTimeout param in discovery, but that seems only applicable to
>   startup, i.e. how long a starting node should wait to join the others.)