[
https://issues.apache.org/jira/browse/FLUME-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264591#comment-13264591
]
Will McQueen commented on FLUME-1162:
-------------------------------------
Thanks for your detailed reply.
>>>>(3) Is it expected that a new component instance is not created at each
>>>>reconfig event, unless that component's
>>>> instance name changes in the config file?
>>No, components are always created. This is because during reconfiguration, we
>>do not know what the old components are.
In the test described below, the same source component object is re-used after
a particular reconfiguration (changing a property value); no new source
component object is constructed.
If I just unix-touch the config file, the reconfig logic kicks in but the
source component's ctor is not called a 2nd time, so the old source instance
is re-used. But in the long example I gave previously, I changed the name of
the component (from r1 to r2) in the config file, and this (as expected) caused
the ctor to be called for the new r2 component.
In this next test, I additionally implemented the Configurable interface:
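import org.apache.flume.Context;
import org.apache.flume.EventDrivenSource;
import org.apache.flume.conf.Configurable;
import org.apache.flume.source.AbstractSource;

public class MyBasicSource extends AbstractSource implements EventDrivenSource,
    Configurable {
  // Explicit ctor so that a breakpoint can be set on it.
  public MyBasicSource() {
  }
  // Receives this component's properties (such as foo) from the config file.
  @Override
  public void configure(Context context) {
  }
}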
I start with a similar config (I just added a property 'foo'):
agent.channels = c1
agent.sources = r1
agent.sinks = k1
#
agent.channels.c1.type = MEMORY
#
agent.sources.r1.channels = c1
agent.sources.r1.type = org.apache.flume.source.custom.MyBasicSource
agent.sources.r1.foo = bar
#
agent.sinks.k1.channel = c1
agent.sinks.k1.type = NULL
Then, after agent startup, I changed the property value from foo=bar to
foo=tar and waited for the next reconfig event. What I see is that the
source's ctor is not called, so the same component instance is re-used. Here
is the trace:
*****MyBasicSource():conf-file-poller-0
*****setName(String):conf-file-poller-0
*****configure(Context):conf-file-poller-0
*****setChannelProcessor(ChannelProcessor):conf-file-poller-0
*****start():lifecycleSupervisor-1-2
<reconfig event occurs>
[missing call to MyBasicSource() ctor here]
*****configure(Context):conf-file-poller-0
*****setChannelProcessor(ChannelProcessor):conf-file-poller-0
*****stop():conf-file-poller-0
*****start():lifecycleSupervisor-1-1
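For reference, trace lines of this shape can be produced with a small helper
like the one below. This is only a minimal sketch, not the actual test code:
Trace, format() and log() are hypothetical names, and the only assumption is
that each entry point prints its own signature followed by the name of the
calling thread.

```java
public final class Trace {
    private Trace() {
    }

    // Builds one trace line: five asterisks, the method signature, then the thread name.
    static String format(String signature, String threadName) {
        return "*****" + signature + ":" + threadName;
    }

    // Prints a trace line for the calling thread and returns it.
    static String log(String signature) {
        String line = format(signature, Thread.currentThread().getName());
        System.out.println(line);
        return line;
    }
}
```

Calling Trace.log("configure(Context)") as the first statement of each
overridden method would then show which thread entered which method.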
As a separate note, I ran another test where I copied the MyBasicSource class
and called it MyBasicSource2. Then I started the agent with the original config
file, then modified the config file to just change the type to MyBasicSource2
(retaining everything else including the component name, r1), and waited for
the next reconfig event. Here is the trace of what happened (I appended the
instance's class name to each line, as output by the test code):
*****MyBasicSource():conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource
*****setName(String):conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource
*****configure(Context):conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource
*****setChannelProcessor(ChannelProcessor):conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource
*****start():lifecycleSupervisor-1-2:org.apache.flume.source.custom.MyBasicSource
<reconfig event, changing type of r1 from MyBasicSource to MyBasicSource2>
*****MyBasicSource2():conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource2
*****setName(String):conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource2
*****configure(Context):conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource2
*****setChannelProcessor(ChannelProcessor):conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource2
*****stop():conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource
*****start():lifecycleSupervisor-1-1:org.apache.flume.source.custom.MyBasicSource2
So in this last trace, stop() on MyBasicSource is called before start() on
MyBasicSource2, which is expected. But I'm not sure why that stop() is called
after MyBasicSource2's ctor, setName(), configure(), and setChannelProcessor()
have been called. Is this expected? Or should the sequence after a reconfig
event look like the following, with stop() called first on all existing
components before any new components are created?
*****stop():conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource
*****MyBasicSource2():conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource2
*****setName(String):conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource2
*****configure(Context):conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource2
*****setChannelProcessor(ChannelProcessor):conf-file-poller-0:org.apache.flume.source.custom.MyBasicSource2
*****start():lifecycleSupervisor-1-1:org.apache.flume.source.custom.MyBasicSource2
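To make the proposed ordering concrete, here is a minimal single-threaded
sketch of a "stop everything first" reconfiguration. It does not use Flume's
classes and is not how Flume implements reconfiguration: Component, Recording
and reconfigure() are hypothetical names, and setName(), configure() and
setChannelProcessor() are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public final class StopFirstReconfig {
    // Hypothetical stand-in for a lifecycle-aware component; not Flume's interface.
    interface Component {
        void start();
        void stop();
    }

    // Records constructor and lifecycle calls so the ordering can be inspected.
    static final class Recording implements Component {
        private final String name;
        private final List<String> events;

        Recording(String name, List<String> events) {
            this.name = name;
            this.events = events;
            events.add(name + "()");
        }

        @Override
        public void start() {
            events.add("start():" + name);
        }

        @Override
        public void stop() {
            events.add("stop():" + name);
        }
    }

    // Stops every running component before any replacement is constructed.
    static List<Component> reconfigure(List<Component> running,
                                       List<Supplier<Component>> factories) {
        for (Component old : running) {
            old.stop();
        }
        List<Component> fresh = new ArrayList<>();
        for (Supplier<Component> factory : factories) {
            fresh.add(factory.get()); // the ctor runs only after all stops
        }
        for (Component component : fresh) {
            component.start();
        }
        return fresh;
    }

    // Replaces a running MyBasicSource with MyBasicSource2 and returns the reconfig events.
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        List<Component> running = new ArrayList<>();
        running.add(new Recording("MyBasicSource", events));
        running.get(0).start();
        events.clear(); // keep only what happens during the reconfig
        List<Supplier<Component>> factories = new ArrayList<>();
        factories.add(() -> new Recording("MyBasicSource2", events));
        reconfigure(running, factories);
        return events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

With this ordering, the old instance is fully stopped before the new instance's
ctor runs, which matches the sequence proposed above.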
Cheers,
Will
> Possible reconfig issue -- setName and setChannel being called before stop
> --------------------------------------------------------------------------
>
> Key: FLUME-1162
> URL: https://issues.apache.org/jira/browse/FLUME-1162
> Project: Flume
> Issue Type: Bug
> Components: Configuration
> Affects Versions: v1.2.0
> Reporter: Will McQueen
> Fix For: v1.2.0
>
>
> During reconfig testing, I found some possible issues. Sending to this list
> to seek clarification.
> (1) Is it expected that the stop() and start() methods of a component should
> be called by a lifecycleSupervisor thread?
> (2) If true for (1), then is it expected that start() and stop() should be
> called by the same lifecycleSupervisor thread (for the same component
> instance)?
> (3) Is it expected that a new component instance is not created at each
> reconfig event, unless that component's instance name changes in the config
> file?
> As an example, I have the following valid config as a bare minimal test
> config:
> agent.channels = c1
> agent.sources = r1
> agent.sinks = k1
> agent.channels.c1.type = MEMORY
> agent.sources.r1.channels = c1
> agent.sources.r1.type = org.apache.flume.source.custom.MyBasicSource
> agent.sinks.k1.channel = c1
> agent.sinks.k1.type = NULL
> The MyBasicSource class looks like this (I created an explicit ctor so that I
> can set a breakpoint on it, and I set breakpoints at all entry points of
> methods in AbstractSource to trace which thread calls which method of the
> source component instance):
> public class MyBasicSource extends AbstractSource implements
> EventDrivenSource {
> public MyBasicSource() {
> }
> }
> After a reconfig event is triggered (I just touched the config file this time
> without modifying it), the source's setChannelProcessor() (with a new
> ChannelProcessor instance) is being called before calling stop(). So it seems
> to me that if stop() does cleanup work (like setting all fields to null,
> including a ChannelProcessor field), then the new ChannelProcessor instance
> (after reconfig) would be lost.
> In a similar test, I repeated the above steps but this time changed the
> instance name in the config from 'r1' to 'r2'. This time upon a reconfig, a
> call to the ctor and to setName() preceded the call to stop(), and then
> after stop() comes start().
> In case that made you dizzy, here's a summary of behavior during a reconfig:
> Format:
> method():thread (where <method> is called by <thread>)
> *****MyBasicSource():conf-file-poller-0
> *****setName(String):conf-file-poller-0
> *****setChannelProcessor(ChannelProcessor):conf-file-poller-0
> *****start():lifecycleSupervisor-1-2
> <conf file reconfig event occurs here.. changed the name of the source
> component instance from r1 to r2>
> *****MyBasicSource():conf-file-poller-0
> *****setName(String):conf-file-poller-0
> *****setChannelProcessor(ChannelProcessor):conf-file-poller-0
> *****stop():conf-file-poller-0
> *****start():lifecycleSupervisor-1-4
> Some things to notice:
> 1) The LifecycleAware methods are called by 3 different threads. start() is
> called by lifecycleSupervisor-1-2, stop() is called by conf-file-poller-0
> (!... expected?), and the start() in the 2nd instance of MyBasicSource is
> called by lifecycleSupervisor-1-4. Although I can understand that different
> instances (r1 and r2) can be called by different lifecycle threads, note that
> even the same instance (r1) had its start() method called by different
> lifecycle threads before and after reconfig. My concern about this relates to
> visibility of field values and happens-before relationships. If one thread
> calls stop() and sets field values to null, while another thread calls
> start() to set field values to some value, then the null write by the first
> thread might become visible to the 2nd thread only sometime after thread 2
> writes the value, which means that the instance could observe a null value
> sometime after the 2nd start() is called (see JCIP, Java Concurrency in
> Practice).
> 2) After the reconfig, setName(..) and setChannelProcessor(..) are being
> called before stop().
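
The visibility concern in point 1) of the quoted description can be sketched
as follows. This is a hedged illustration, not Flume code: GuardedSource and
its methods are hypothetical, and a plain Object stands in for the
ChannelProcessor. Guarding every field access with one lock gives the
happens-before edges between the poller thread and the supervisor thread, but
it does not fix the ordering: if stop() nulls the field after the new
processor has been set, the processor is still lost.

```java
public final class GuardedSource {
    private final Object lock = new Object();
    private Object channelProcessor; // guarded by lock; Object stands in for the real type
    private boolean started;         // guarded by lock

    public void setChannelProcessor(Object processor) {
        synchronized (lock) {
            channelProcessor = processor;
        }
    }

    public void start() {
        synchronized (lock) {
            started = true;
        }
    }

    // Cleanup nulls the field; the lock makes this write visible to later lock holders.
    public void stop() {
        synchronized (lock) {
            started = false;
            channelProcessor = null;
        }
    }

    public boolean isUsable() {
        synchronized (lock) {
            return started && channelProcessor != null;
        }
    }

    // Runs a task on a named thread and waits for it to finish.
    private static void runOn(String threadName, Runnable task) {
        Thread thread = new Thread(task, threadName);
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Replays the re-used-instance order: set the new processor, stop(), then start().
    static boolean replayObservedOrder() {
        GuardedSource source = new GuardedSource();
        runOn("conf-file-poller-0", () -> {
            source.setChannelProcessor(new Object());
            source.stop();
        });
        runOn("lifecycleSupervisor-1-1", source::start);
        return source.isUsable();
    }

    // Same calls, but stop() runs before the new processor is set.
    static boolean replayStopFirst() {
        GuardedSource source = new GuardedSource();
        runOn("conf-file-poller-0", () -> {
            source.stop();
            source.setChannelProcessor(new Object());
        });
        runOn("lifecycleSupervisor-1-1", source::start);
        return source.isUsable();
    }
}
```

In this sketch the join() calls only sequence the demo; in a running agent the
lock is what would make writes from one thread visible to the other.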