Thanks, Andrey! I have added a few comments to the IEP-14 page.

D.

On Fri, Mar 16, 2018 at 6:44 AM, Andrey Gura <ag...@apache.org> wrote:

> Hi!
>
> Thank you all for your opinions and ideas!
>
> While reading the thread I came to two important conclusions:
>
> 1. The proposed API should be changed because an enumeration of possible
> actions is a bad idea. A cleaner and simpler design should allow the user
> to provide a failure handler implementation with custom failure handling
> logic if needed.
>
> 2. Several failure handler implementations should be provided out of the
> box in order to provide a simple way of changing the default behaviour
> through configuration. The following implementations should be
> provided:
>
> - NoOpFailureHandler - It's useful for tests and debugging.
> - RestartProcessFailureHandler - A specific implementation that
>   can be used only with ignite.(sh|bat).
> - StopNodeFailureHandler - This implementation will stop the Ignite
>   node in case of a critical error.
> - StopNodeOrHaltFailureHandler(boolean tryStop, long timeout) -
>   The default failure handler will try to stop the node if the tryStop
>   value is true. If the node can't be stopped or the tryStop value is
>   false, then the JVM process will be terminated forcibly
>   (Runtime.halt()). The default value of the tryStop parameter is false.
>   Of course, we should limit the node shutdown time in order to prevent
>   hangs.
>
> As for the default behavior, I agree with those who believe that the most
> suitable default option is process termination (although I had a
> different opinion before), and the strongest argument for this choice is
> the impossibility of reasoning about system state in case of a critical
> error.
> Also, I believe that we can't choose a solution that will suit every
> community member, and the best we can do is provide a simple way
> of changing this behavior.
>
> So, I think, the default behavior discussion should be finished. I'll
> update IEP-14 [1] according to my conclusions above. If you have any
> ideas or thoughts about these conclusions, please feel free to share.
>
> Thanks!
>
> [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-14+Ignite+failures+handling
>
> On Fri, Mar 16, 2018 at 1:07 AM, Dmitriy Setrakyan
> <dsetrak...@apache.org> wrote:
> > On Thu, Mar 15, 2018 at 5:21 AM, Dmitry Pavlov <dpavlov....@gmail.com>
> > wrote:
> >
> >> Hi Dmitriy,
> >>
> >> It seems everyone here agrees that killing the process will give a more
> >> guaranteed result. The problem is that the majority of the community does
> >> not consider this acceptable when Ignite is started as an embedded
> >> lib (e.g. from Java, using Ignition.start()).
> >>
> >> What can help to accept the community's opinion? Let's remember the Apache
> >> principle: "community first".
> >>
> >
> > I am still confused about the problem the majority of the community is
> > trying to solve. If our priority is to keep the cluster in a frozen state,
> > then what is the reason for this task altogether?
> >
> > The priority should be to keep the cluster operational, not frozen. The
> > only solution here is "kill" or "stop+kill". If the community does not
> > accept this option as a default, then I propose to drop this task
> > altogether, because we do not have to do anything to keep the cluster
> > frozen.
> >
> >
> >> If release 2.5 shows us it was impractical, we will change the default to
> >> kill even for the library. What do you think?
> >>
> >
> > See above. I do not see a reason to continue with this task if the end
> > result is identical to what we have today.
> >
> > I want to give the community another chance to speak up and voice their
> > opinions again, having fully understood the context and the problem being
> > solved here.
> >
> > D.
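
For readers following the thread, the handler design Andrey describes could be sketched roughly as below. This is only an illustration under stated assumptions, not the API from IEP-14: the `Node` and `FailureHandler` interfaces, the `onFailure` signature, the millisecond timeout unit, the exit status 1, and the injectable `halt` callback are all hypothetical and exist only so the sketch is self-contained and testable. Only the handler names, the `tryStop` and `timeout` parameters, and the use of `Runtime.halt()` come from the message above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.IntConsumer;

public class FailureHandlerSketch {
    /** Hypothetical stand-in for an Ignite node; only what the sketch needs. */
    public interface Node {
        void stop() throws Exception;
    }

    /** Hypothetical contract: returns true if the handler took the node down. */
    public interface FailureHandler {
        boolean onFailure(Node node, Throwable cause);
    }

    /** Ignores the failure; per the thread, useful for tests and debugging. */
    public static final class NoOpFailureHandler implements FailureHandler {
        @Override public boolean onFailure(Node node, Throwable cause) {
            return false;
        }
    }

    /** Tries a bounded stop if tryStop is set, otherwise (or on failure) halts. */
    public static final class StopNodeOrHaltFailureHandler implements FailureHandler {
        private final boolean tryStop;
        private final long timeoutMs; // assumed unit: milliseconds
        private final IntConsumer halt;

        public StopNodeOrHaltFailureHandler(boolean tryStop, long timeoutMs) {
            this(tryStop, timeoutMs, status -> Runtime.getRuntime().halt(status));
        }

        /** The halt action is injectable (an assumption) so tests need not kill the JVM. */
        public StopNodeOrHaltFailureHandler(boolean tryStop, long timeoutMs, IntConsumer halt) {
            this.tryStop = tryStop;
            this.timeoutMs = timeoutMs;
            this.halt = halt;
        }

        @Override public boolean onFailure(Node node, Throwable cause) {
            if (tryStop && stopWithinTimeout(node))
                return true;
            halt.accept(1); // exit status 1 is an arbitrary choice for this sketch
            return true;
        }

        /** Runs node.stop() on a daemon thread and waits at most timeoutMs. */
        private boolean stopWithinTimeout(Node node) {
            ExecutorService exec = Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "node-stopper");
                t.setDaemon(true);
                return t;
            });
            try {
                Future<Object> f = exec.submit(() -> { node.stop(); return null; });
                f.get(timeoutMs, TimeUnit.MILLISECONDS);
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            } catch (Exception e) { // stop failed or timed out
                return false;
            } finally {
                exec.shutdownNow();
            }
        }
    }

    public static void main(String[] args) {
        Node healthy = () -> System.out.println("node stopped");
        // A recording halt action replaces Runtime.halt() for the demo.
        FailureHandler hnd = new StopNodeOrHaltFailureHandler(true, 1_000,
            status -> System.out.println("would halt with status " + status));
        hnd.onFailure(healthy, new IllegalStateException("critical error"));
    }
}
```

With `tryStop` set to false, which the message above proposes as the default, the sketch skips the stop attempt and goes straight to the halt action. The timeout only matters on the `tryStop == true` path, where it bounds the shutdown so that a hung node cannot block process termination.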