I've been burrowing into the lifecycle code recently and posted https://issues.apache.org/jira/browse/FLUME-1257 about it (more specifically, about the inability to shut down after a component gets stuck in a starting loop).

Right now, to put it bluntly, the shutdown model feels incorrect. There are a lot of places with a loop waiting for a change in state that have an exception handler for InterruptedException, but nothing will ever trigger that interrupt. SIGINT is handled by the shutdown hook, which requests a clean shutdown. If the clean shutdown doesn't succeed, everything just hangs in limbo until you forcibly kill it.

I'd like to start some kind of discussion on how to go about handling this. One argument may be that it's just not a problem, so long as everything starts and stops in a timely manner. Other simple kludges would be setting a limit on the number of times through each loop, or just sending interrupts after a predetermined delay.
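
To make the two kludges concrete, here is a rough sketch of what I have in mind. Everything in it is made up for illustration: the State enum is a stand-in rather than Flume's actual lifecycle type, and waitForState/stopWithInterrupt are hypothetical helpers, not existing methods.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class BoundedWait {
    // Stand-in for a lifecycle state type (hypothetical, for illustration only).
    enum State { IDLE, START, STOP, ERROR }

    /**
     * Kludge 1: bound the wait loop. Polls until the state equals the wanted
     * one or the timeout elapses. Returns true only if the state was reached.
     */
    static boolean waitForState(AtomicReference<State> current, State wanted,
                                long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (current.get() != wanted) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up instead of spinning forever
            }
            Thread.sleep(pollMillis);
        }
        return true;
    }

    /**
     * Kludge 2: give the worker a grace period to finish on its own, then
     * interrupt it so its InterruptedException handler finally gets exercised.
     * graceMillis must be greater than 0, since join(0) waits indefinitely.
     * Returns true if the worker thread has terminated.
     */
    static boolean stopWithInterrupt(Thread worker, long graceMillis)
            throws InterruptedException {
        worker.join(graceMillis);
        if (worker.isAlive()) {
            worker.interrupt();
            worker.join(graceMillis);
        }
        return !worker.isAlive();
    }
}
```

Neither helper fixes a component that swallows the interrupt and keeps looping, so this only helps where the existing handlers actually exit the loop.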

Another possibility is adding more lifecycle states (starting/stopping?) to better communicate what a component is currently doing, so that actions can be aborted as necessary. This would likely require all existing components to report these new states.
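
As a strawman for the extra-states idea, something like the following. The class, the state names, and tryStartOnce() are all hypothetical and are not part of the current code; the point is only that a start loop can notice it is no longer in STARTING and bail out.

```java
import java.util.concurrent.atomic.AtomicReference;

public class AbortableComponent {
    // Hypothetical state set with explicit transitional states.
    enum State { IDLE, STARTING, STARTED, STOPPING, STOPPED }

    private final AtomicReference<State> state = new AtomicReference<>(State.IDLE);

    State getState() {
        return state.get();
    }

    /**
     * Retries startup up to maxAttempts times, but checks before each attempt
     * that nobody has moved the component out of STARTING. Returns true only
     * if the component ended up STARTED.
     */
    boolean start(int maxAttempts) {
        if (!state.compareAndSet(State.IDLE, State.STARTING)) {
            return false; // already starting, started, or stopped
        }
        for (int i = 0; i < maxAttempts; i++) {
            if (state.get() != State.STARTING) {
                return false; // stop() was requested, abort the start loop
            }
            if (tryStartOnce()) {
                // Fails (returns false) if stop() won the race.
                return state.compareAndSet(State.STARTING, State.STARTED);
            }
        }
        // Out of attempts; go back to IDLE unless a stop got in first.
        state.compareAndSet(State.STARTING, State.IDLE);
        return false;
    }

    /** Requests a stop from any state, including the middle of a start. */
    void stop() {
        state.set(State.STOPPING);
        // A real component would release its resources here.
        state.set(State.STOPPED);
    }

    /** Placeholder for one startup attempt; subclasses would override this. */
    protected boolean tryStartOnce() {
        return true;
    }
}
```

The cost is what I mentioned above: every component would have to maintain these transitions itself for a supervisor to be able to rely on them.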

Feedback/opinions would be appreciated.
