Re: Node failure handling semantics

2018-05-07 Thread vkulichenko
Nick, if B is unhealthy, C will not be able to send a heartbeat message to it. After a timeout, C will consider B failed and will connect to A. Along with this connection, it will send a NODE_FAILED message that will go through all the nodes. Once B is back again, it will try to send a message
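
For readers following the thread, the sequence described in this message can be sketched as a toy model. This is only an illustration under stated assumptions, not Ignite's implementation: the names `RING`, `next_alive`, and `detect_and_propagate` are hypothetical, timeouts are not modelled, and the ring direction follows the A <- B <- C <- D <- A notation used earlier in the thread (each node sends to the node on its left).

```python
# Toy model of ring-based failure detection. NOT Ignite's actual code;
# all names here are hypothetical and exist only for illustration.

# Ring A <- B <- C <- D <- A: each node sends to the previous list entry,
# so B sends to A, C sends to B, D sends to C, and A sends to D.
RING = ["A", "B", "C", "D"]


def next_alive(ring, node, failed):
    """Return the nearest node in send direction that is not marked failed."""
    i = ring.index(node)
    for step in range(1, len(ring)):
        candidate = ring[(i - step) % len(ring)]
        if candidate not in failed:
            return candidate
    return None  # no other live node remains


def detect_and_propagate(ring, detector, unhealthy):
    """Model `detector` giving up on `unhealthy` and circulating NODE_FAILED.

    Returns (new_next, path): the node the detector reconnects to, and the
    order in which the remaining nodes receive the NODE_FAILED message.
    """
    failed = {unhealthy}
    new_next = next_alive(ring, detector, failed)
    path = []
    current = new_next
    # The message travels around the ring until it returns to the sender.
    while current is not None and current != detector:
        path.append(current)
        current = next_alive(ring, current, failed)
    return new_next, path


if __name__ == "__main__":
    new_next, path = detect_and_propagate(RING, detector="C", unhealthy="B")
    print(new_next, path)
```

In this model, C skips B, reconnects to A, and the NODE_FAILED notice passes through A and then D before returning to C. What happens once B comes back is not modelled here.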

Re: Node failure handling semantics

2018-05-04 Thread npordash
Thanks, Val. I'd like to make sure I understand this correctly. Let's say we have a ring of nodes A <- B <- C <- D <- A. If B is unhealthy, then C won't see a heartbeat within the configured failure detection time and will then proceed to connect to A. When this happens, how is B's ejection coordina

Re: Node failure handling semantics

2018-04-27 Thread vkulichenko
Nick, here are some comments on your questions: 1. A heartbeat is always sent to the next node in the ring. That's the whole point of the ring architecture vs. peer-to-peer. 2. That's not possible, because each discovery message is sent across the ring, and the ordering of those messages is guarant
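
Point 1 can be illustrated with a small sketch. It assumes the A <- B <- C <- D <- A ring used elsewhere in the thread, and it reads "peer-to-peer" as every node heartbeating every other node, which is an interpretation rather than something stated in the message. The function names are hypothetical.

```python
# Illustration only: who heartbeats whom in a ring, and how many heartbeat
# links that implies compared with an all-to-all scheme (an assumed reading
# of "peer-to-peer"). Function names are hypothetical.

def ring_heartbeat_targets(ring):
    """Map each node to the single neighbour it heartbeats.

    For the ring A <- B <- C <- D <- A, each node sends to the previous
    entry in the list, and the first entry wraps around to the last.
    """
    return {node: ring[(i - 1) % len(ring)] for i, node in enumerate(ring)}


def heartbeat_links(n, all_to_all=False):
    """Count directed heartbeat links for n nodes."""
    return n * (n - 1) if all_to_all else n


if __name__ == "__main__":
    print(ring_heartbeat_targets(["A", "B", "C", "D"]))
    print(heartbeat_links(4), heartbeat_links(4, all_to_all=True))
```

With four nodes, the ring needs four heartbeat links, while the assumed all-to-all scheme would need twelve.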

Node failure handling semantics

2018-04-26 Thread npordash
Hi, I was wondering if there is any additional documentation on how Ignite internally handles node failures? The section in the documentation kind of skims over this too quickly[1]. I specifically have the following inquiries: 1) Does each node in the ring send heartbeats to all other nodes in t