Juhani Connolly created FLUME-1257:
--------------------------------------
Summary: Components that fail to start can put flume into a state
which it can't shutdown from
Key: FLUME-1257
URL: https://issues.apache.org/jira/browse/FLUME-1257
Project: Flume
Issue Type: Bug
Reporter: Juhani Connolly
Clean shutdown of a Flume agent does not work when a component fails to start.
One way to reproduce this is to configure a FileChannel without the Hadoop IO
jars on the classpath.
My understanding is that the first Ctrl+C tries to stop the supervisor, which
in turn should take down everything. However,
AbstractFileConfigurationProvider#stop tries to gently stop the local
executor, which is stuck in an endless loop trying to start a channel
(DefaultLogicalNodeManager#startAllComponents). This loop can only be broken
by an interrupt, but none ever reaches it or the higher level (which would
try shutdownNow on the executors).
The interrupts never come because each level relies on something further up
the chain to send them, and nothing up there does.
One solution is to be less merciful in
AbstractFileConfigurationProvider#stop(): give the executor a moderate timeout
to terminate, then call executorService.shutdownNow().
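
A minimal sketch of that idea, using only the standard java.util.concurrent
API. This is not Flume code: the class BoundedShutdownSketch, the method
stopWithTimeout and the timeout values are hypothetical names and numbers
chosen for illustration, and the looping task is only a stand-in for a start-up
loop that retries until it is interrupted.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BoundedShutdownSketch {

    /**
     * Hypothetical helper: ask the executor to stop gently, wait up to the
     * given timeout, then interrupt whatever is still running.
     * Returns true if the executor terminated.
     */
    static boolean stopWithTimeout(ExecutorService executor, long timeout, TimeUnit unit) {
        executor.shutdown(); // gentle: no new tasks, running tasks keep going
        try {
            if (!executor.awaitTermination(timeout, unit)) {
                executor.shutdownNow(); // forceful: interrupts running tasks
                return executor.awaitTermination(timeout, unit);
            }
            return true;
        } catch (InterruptedException e) {
            // The stopping thread itself was interrupted: force the stop
            // and preserve the interrupt status for the caller.
            executor.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // Stand-in for a loop that keeps retrying a failing component start
        // and can only be broken by an interrupt.
        executor.submit(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(50); // pretend to retry the start
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // restore flag, leave loop
                }
            }
        });
        boolean stopped = stopWithTimeout(executor, 500, TimeUnit.MILLISECONDS);
        System.out.println("terminated: " + stopped);
    }
}
```

With shutdown() alone the looping task would never see an interrupt, so
awaitTermination would time out; the shutdownNow() call is what delivers the
interrupt that breaks the loop. This only helps if the task actually responds
to interruption.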