[
https://issues.apache.org/jira/browse/FLUME-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Juhani Connolly updated FLUME-1257:
-----------------------------------
Component/s: Node, Configuration
Affects Version/s: v1.2.0, v1.1.0
> Components that fail to start can put flume into a state which it can't
> shutdown from
> -------------------------------------------------------------------------------------
>
> Key: FLUME-1257
> URL: https://issues.apache.org/jira/browse/FLUME-1257
> Project: Flume
> Issue Type: Bug
> Components: Configuration, Node
> Affects Versions: v1.1.0, v1.2.0
> Reporter: Juhani Connolly
>
> Clean shutdown of a flume agent in which a component fails to start doesn't
> work. One way to confirm this is to use a FileChannel without the hadoop IO
> jars on the classpath.
> My understanding is that the first Ctrl+C tries to stop the
> supervisor, which in turn should take down everything. However,
> AbstractFileConfigurationProvider#stop tries to gently stop the local
> executor, which is stuck in an endless loop trying to start up a
> channel (DefaultLogicalNodeManager#startAllComponents). This loop can only be
> broken by an interrupt, but none ever reaches it or the higher level (which
> would try shutdownNow on the executors).
> The interrupts never come, because each level relies on something
> further up the chain to send them, and nothing up there does.
> One solution is simply to be less merciful in
> AbstractFileConfigurationProvider#stop(): give it a moderate timeout, then
> call executorService.shutdownNow().
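
For illustration, the proposed fix could look roughly like the sketch below.
This is not Flume code: the class name BoundedShutdownSketch, the helper
stopWithTimeout, and the timeout values are made up for the example, and the
sleeping loop in main is only a stand-in for a start-up loop that retries
forever. It uses only the standard java.util.concurrent API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class BoundedShutdownSketch {

    /**
     * Hypothetical helper: asks the executor to stop gently, waits up to the
     * given timeout, then falls back to shutdownNow(), which interrupts the
     * running tasks. Returns true if the executor terminated.
     */
    static boolean stopWithTimeout(ExecutorService executor, long timeout, TimeUnit unit)
            throws InterruptedException {
        // Gentle stop: no new tasks are accepted, running tasks are NOT interrupted.
        executor.shutdown();
        if (executor.awaitTermination(timeout, unit)) {
            return true;
        }
        // Forceful stop: running tasks receive an interrupt.
        executor.shutdownNow();
        return executor.awaitTermination(timeout, unit);
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicBoolean interrupted = new AtomicBoolean(false);

        // Stand-in for a start-up loop that keeps retrying until interrupted.
        executor.execute(() -> {
            while (true) {
                try {
                    Thread.sleep(50); // pretend to retry starting a component
                } catch (InterruptedException e) {
                    interrupted.set(true);
                    return; // the interrupt is the only way out of the loop
                }
            }
        });

        boolean terminated = stopWithTimeout(executor, 500, TimeUnit.MILLISECONDS);
        System.out.println("terminated=" + terminated + ", interrupted=" + interrupted.get());
    }
}
```

With shutdown() alone the looping task would keep running, since shutdown()
never interrupts tasks that are already executing. The fallback to
shutdownNow() is what delivers the interrupt. It only helps if the loop
actually responds to interruption, which the description above says is the
case here.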