Igniters!

We are working on the proposal described in IEP-14 "Ignite failures handling" [1], and it's time to discuss it with the community (although we should have done this earlier).

The most important question: what should the default behaviour be in case of a failure? There are four possible actions:

1. Restart the JVM process (possible only if the process was started from the ignite.(sh|bat) script);
2. Terminate the JVM;
3. Stop the node (if there is only one node in the process, the process will also be terminated);
4. No operation.

I believe that the node should be stopped by default. But there is a chance that the node will not be stopped correctly.

Maybe we should terminate the JVM process by default. But that will kill all nodes in the JVM process. This is especially bad behaviour when the nodes belong to different Ignite clusters (a real use case).

Maybe we should restart the JVM process by default. This approach has the same problems as the previous one. Additionally, it could lead to continuous restarts and, therefore, continuous exchanges and rebalancing.

It is a difficult choice. Could you please share your thoughts?

[1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-14+Ignite+failures+handling
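
To make the trade-offs between the four actions easier to compare, here is a rough Java sketch. It is an illustration only: the names FailureActionSketch, FailureAction, isApplicable and affectsWholeProcess are hypothetical and are not part of the Ignite API or of IEP-14. It encodes just the two constraints mentioned above: a restart requires the process to have been started from the ignite.(sh|bat) script, and some actions take down every node in the JVM.

```java
/** Hypothetical model of the four failure actions; not an Ignite API. */
class FailureActionSketch {
    /** The four candidate actions from the list above (names are assumptions). */
    enum FailureAction { RESTART_JVM, TERMINATE_JVM, STOP_NODE, NOOP }

    /** A restart is possible only when the process was started from ignite.(sh|bat). */
    static boolean isApplicable(FailureAction action, boolean startedFromScript) {
        return action != FailureAction.RESTART_JVM || startedFromScript;
    }

    /** Returns true if the action brings down every node hosted in the JVM. */
    static boolean affectsWholeProcess(FailureAction action, int nodesInJvm) {
        switch (action) {
            case RESTART_JVM:
            case TERMINATE_JVM:
                // Both act on the whole process, regardless of how many nodes it hosts.
                return true;
            case STOP_NODE:
                // Stopping the only node in the process also terminates the process.
                return nodesInJvm <= 1;
            default:
                // NOOP leaves the process and all its nodes running.
                return false;
        }
    }

    public static void main(String[] args) {
        // Example: two nodes from different clusters share one JVM.
        for (FailureAction a : FailureAction.values())
            System.out.println(a + " affects whole process: " + affectsWholeProcess(a, 2));
    }
}
```

With two nodes in one JVM, only STOP_NODE and NOOP leave the healthy node untouched, which is the main argument for stopping the node by default.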