Thanks for your input on this.
I'm more than happy for misbehaving things to just fail, though I hope
that we can leave proper logs so people can know something is wrong.
Do you feel the auto-restarting behaviour should still remain a part of
the lifecycle? I'm not entirely sure if it is something the supervisor
should be doing, and the majority of components that fail tend to just
keep failing.
With these simplicity needs in mind though, the starting point that
Brock has attached to the JIRA looks simple and performs most of what
we need. I've been working on something bigger than that, with more
policy settings (a configurable policy restricting the number of
start/stop attempts, timeouts, and forced termination). This probably
isn't necessary, though, if we decide on some "correct" behavior for
components and enforce it in code reviews.
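To make the policy idea concrete, here is a minimal sketch in plain Java.
The class name LifecyclePolicy, its fields, and the run method are all
hypothetical; none of this is taken from Flume or from the patch on the
JIRA. It only illustrates capping the number of attempts, applying a
per-attempt timeout, and interrupting an attempt that overruns.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical policy object; the names here are illustrative only.
final class LifecyclePolicy {
    private final int maxAttempts;     // how many times to try a start/stop action
    private final long timeoutMillis;  // how long to wait for each attempt

    LifecyclePolicy(int maxAttempts, long timeoutMillis) {
        if (maxAttempts < 1 || timeoutMillis < 0) {
            throw new IllegalArgumentException("invalid policy settings");
        }
        this.maxAttempts = maxAttempts;
        this.timeoutMillis = timeoutMillis;
    }

    // Runs the action up to maxAttempts times. Returns true as soon as one
    // attempt returns true; returns false if every attempt fails, throws,
    // or times out, or if the calling thread is interrupted.
    boolean run(Callable<Boolean> action) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<Boolean> result = executor.submit(action);
                try {
                    if (Boolean.TRUE.equals(result.get(timeoutMillis, TimeUnit.MILLISECONDS))) {
                        return true;
                    }
                } catch (TimeoutException e) {
                    // "Forced termination" in Java can only mean an interrupt;
                    // an action that ignores interrupts keeps running.
                    result.cancel(true);
                } catch (ExecutionException e) {
                    // The action threw; count it as a failed attempt.
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return false;
        } finally {
            executor.shutdownNow(); // interrupt anything still running
        }
    }
}
```

Something like new LifecyclePolicy(3, 5000).run(...) would wrap a
component's stop call. Even this small version needs a thread pool and
three failure paths, which is part of why I suspect it isn't worth it.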
On 06/07/2012 03:32 PM, Eric Sammer wrote:
I can try and answer any lingering questions about the existing lifecycle
management code. I knew there were outstanding issues in it (which was the
impetus for that JIRA to move to Guava's service model) but I just never
was able to put in the time. I'm in favor of moving to the Guava
implementation.
More generally, I strongly believe in well defined semantics and simple
contracts. In other words, we should not get into the business of
attempting to deal with byzantine failure. The idea is that the shutdown
handler (in response to SIGINT) should request an orderly shutdown. If the
system is in good working order, it should do so. If, for instance,
something PermGen OOMs or there's incorrect behavior in a LifecycleAware
component (i.e. a component that does not respect the contract), we should
explicitly *not* try and handle that and a forced kill is proper. My vote
is just to avoid insanely complex logic to deal with incorrectly
implemented components; that always leads to insanely complicated code that
doesn't always work when things are correctly implemented. Not to mention,
process stop events suffer from the halting problem[1] anyway...
[1] http://bit.ly/Kz5GMJ
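As a rough illustration of the contract described above, the following
sketch installs a JVM shutdown hook that asks each component to stop
exactly once and does nothing clever beyond that. The Stoppable interface
and the OrderlyShutdown class are hypothetical stand-ins, not Flume's
LifecycleAware API, and stopping in reverse registration order is an
assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a lifecycle-aware component.
interface Stoppable {
    void stop();
}

final class OrderlyShutdown {
    private final List<Stoppable> components = new ArrayList<>();

    void register(Stoppable component) {
        components.add(component);
    }

    // Asks each component to stop once, most recently registered first.
    // There are no retries and no timeouts: a component that throws or
    // hangs here breaks the contract, and the remedy is a forced kill
    // from outside the process.
    void requestStop() {
        for (int i = components.size() - 1; i >= 0; i--) {
            components.get(i).stop();
        }
    }

    // Registers requestStop as a JVM shutdown hook. The JVM runs shutdown
    // hooks on normal exit and on SIGINT or SIGTERM, but not on SIGKILL.
    Thread install() {
        Thread hook = new Thread(this::requestStop, "orderly-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

The point of keeping requestStop this small is that its behaviour is easy
to state: well-behaved components stop in order, and anything else is
left to the operator.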
Thanks for taking this on guys. It's not sexy work, but it's super
important.
On Wed, Jun 6, 2012 at 8:01 PM, Juhani Connolly <[email protected]> wrote:
The biggest barrier to this right now is with the restarting behavior our
current lifecycle model has, which is not part of the Guava lifecycle. It
means if we're to restart services we need to store everything needed to
build a new service when the old one dies, and start that. In essence we're
going to need an outer layer to watch the inner (Guava) layer, which sort of
defeats the purpose.
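A minimal sketch of what that outer layer might look like, written against
a hypothetical OneShotService interface rather than Guava's real Service
class so that it compiles on its own. The supervisor keeps a factory (the
"everything needed to build a new service") and swaps in a fresh instance
when the current one reports failure. All names here are assumptions made
for illustration.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a single-use service: once it has failed,
// the same instance is never started again.
interface OneShotService {
    void start();
    boolean isFailed();
}

final class RestartingSupervisor {
    private final Supplier<OneShotService> factory; // builds a fresh instance
    private final int maxRestarts;                  // cap on replacements
    private OneShotService current;
    private int restarts;

    RestartingSupervisor(Supplier<OneShotService> factory, int maxRestarts) {
        this.factory = factory;
        this.maxRestarts = maxRestarts;
    }

    // Builds and starts the first instance.
    void start() {
        current = factory.get();
        current.start();
    }

    // Meant to be called periodically by the outer layer. Replaces a failed
    // instance with a new one and returns true if a replacement happened.
    // Returns false if the service is healthy, was never started, or the
    // restart limit has been reached.
    boolean check() {
        if (current == null || !current.isFailed() || restarts >= maxRestarts) {
            return false;
        }
        restarts++;
        current = factory.get();
        current.start();
        return true;
    }

    int restartCount() {
        return restarts;
    }
}
```

Even in this reduced form the supervisor duplicates state tracking that
the inner lifecycle already does, which is the concern raised above.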
I'm trying to figure out if there's a way to get around this, or if
switching from a restarting model to Guava's
starting/running/stopping/terminated model is possible (this would
probably require some components to take better care of themselves, as
once they fail they wouldn't auto-restart).
On 06/06/2012 05:48 PM, Hari Shreedharan wrote:
Juhani,
It would be interesting to see how much of an effort it would be to
replace the current system with Guava. It would be nice to see an initial
proof of concept; maybe you can post it on the dev list. I think there
are others who would also have ideas and feel the need to update the
Lifecycle system.
Thanks
Hari