Re: IEP-14: Ignite failures handling (Discussion)

Dmitriy Setrakyan Tue, 13 Mar 2018 04:54:05 -0700

Guys, I do not understand the alternative. If Ignite is frozen and causes
the whole grid to freeze, how can we justify not killing it? Would users
rather have their applications freeze?


I would consider real-life use cases here. Can someone present a real-life
example where keeping a frozen grid node around is better than killing the
JVM?

D.

On Tue, Mar 13, 2018 at 6:16 AM, Alexey Goncharuk <
[email protected]> wrote:

> I also like "kill if standalone, stop if embedded" by default. A user can
> change it to kill for embedded mode, but it will be a controlled, safe
> choice.
>
> 2018-03-13 11:26 GMT+03:00 Vladimir Ozerov <[email protected]>:
>
> > +1 for "kill if standalone, stop if embedded". We should never kill the
> > process of an embedded node because it might be disastrous for the user
> > application.
> >
> > On Tue, Mar 13, 2018 at 10:41 AM, Dmitry Pavlov <[email protected]>
> > wrote:
> >
> > > Denis, Dmitriy, I am not sure I agree here. Consider a close analogue:
> > > the JVM itself and its ExitOnOutOfMemoryError option, which is not
> > > enabled by default.
> > >
> > > If a server node is started from the sh script, killing is OK with me,
> > > as the process is controlled only by Ignite. It is sufficient to add an
> > > option to override the default for the sh script.
> > >
> > > Users interested in this behaviour may also set this option to "kill".
> > >
> > > If a server node is started from Java, it should never kill the whole
> > > process. This mode is not prohibited by the docs: users are allowed to
> > > start several nodes in one process and run their own application logic
> > > in this node.
> > >
> > > Why should we kill running user code? It could be an unpleasant
> > > surprise for the user.
> > >
> > >
> > >
> > > On Tue, Mar 13, 2018 at 8:26, Dmitriy Setrakyan <[email protected]> wrote:
> > >
> > > > On Tue, Mar 13, 2018 at 1:18 AM, Andrey Kornev <[email protected]>
> > > > wrote:
> > > >
> > > > > I believe the only reasonable way to handle a critical system failure
> > > > > (as it is defined in the IEP) is a JVM halt (not a graceful
> > > > > exit/shutdown!). The sooner, the better: the impact is smaller.
> > > > > There’s simply no way to reason about the state of the system in a
> > > > > situation like that; all bets are off. Any other policy would only
> > > > > confuse matters and in all likelihood make things worse.
> > > > >
> > > > > In practice, SREs/Operations would very much rather have a process
> > > > > die a quick, clean death than let it run indefinitely and hope that
> > > > > it’ll somehow recover by itself at some point in the future,
> > > > > potentially degrading the overall system stability and availability
> > > > > all the while.
> > > > >
> > > >
> > > > Completely agree.
> > > >
> > >
> >
>
