[jira] [Created] (FLUME-1162) Possible reconfig issue -- setName and setChannel being called before stop

Will McQueen (JIRA) Sat, 28 Apr 2012 22:54:48 -0700

Will McQueen created FLUME-1162:
-----------------------------------

             Summary: Possible reconfig issue -- setName and setChannel being 
called before stop
                 Key: FLUME-1162
                 URL: https://issues.apache.org/jira/browse/FLUME-1162
             Project: Flume
          Issue Type: Bug
          Components: Configuration
    Affects Versions: v1.2.0
            Reporter: Will McQueen
             Fix For: v1.2.0



During reconfig testing, I found some possible issues. Sending to this list to 
seek clarification.

(1) Is it expected that the stop() and start() methods of a component should be 
called by a lifecycleSupervisor thread?
(2) If true for (1), then is it expected that start() and stop() should be 
called by the same lifecycleSupervisor thread (for the same component instance)?
(3) Is it expected that a new component instance is not created at each 
reconfig event, unless that component's instance name changes in the config 
file?

As an example, I have the following valid config as a bare-minimum test config. 
It has no real sink (k1 is only a NULL sink), but the config is still valid 
because this config could be used to stage/populate the channel:

agent.channels = c1
agent.sources = r1
agent.sinks = k1

agent.channels.c1.type = MEMORY

agent.sources.r1.channels = c1
agent.sources.r1.type = org.apache.flume.source.custom.MyBasicSource

agent.sinks.k1.channel = c1
agent.sinks.k1.type = NULL

The MyBasicSource class looks like this (I created an explicit ctor so that I 
can set a breakpoint on it, and I set breakpoints at all entry points of 
methods in AbstractSource to trace which thread calls which method of the 
source component instance):

public class MyBasicSource extends AbstractSource implements EventDrivenSource {
    public MyBasicSource() {

    }
}

After a reconfig event is triggered (I just touched the config file this time 
without modifying it), the source's setChannelProcessor() (with a new 
ChannelProcessor instance) is being called before stop() is called. So it seems 
to me that if stop() does cleanup work (like setting all fields to null, 
including a ChannelProcessor field), then the new ChannelProcessor instance 
(after reconfig) would be lost.
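
To make the concern concrete, here is a minimal, self-contained sketch. It is 
not Flume code: SketchSource is a hypothetical stand-in for a source, a plain 
Object stands in for ChannelProcessor, and the assumption that stop() nulls its 
fields is exactly the "if" from the paragraph above.

```java
// Hypothetical stand-in for a source component; NOT Flume's AbstractSource.
class SketchSource {
    // A plain Object stands in for Flume's ChannelProcessor.
    private Object channelProcessor;

    void setChannelProcessor(Object channelProcessor) {
        this.channelProcessor = channelProcessor;
    }

    Object getChannelProcessor() {
        return channelProcessor;
    }

    // Assumption: stop() does cleanup by nulling out its fields.
    void stop() {
        this.channelProcessor = null;
    }
}

class ReconfigOrderSketch {
    public static void main(String[] args) {
        SketchSource source = new SketchSource();
        source.setChannelProcessor(new Object()); // initial configuration

        // Reconfig in the observed order: new processor first, then stop().
        source.setChannelProcessor(new Object());
        source.stop();

        // The processor that was set during the reconfig has been discarded.
        System.out.println("processor after reconfig: "
                + source.getChannelProcessor());
    }
}
```

Under that assumption, the processor handed over during the reconfig is gone by 
the time start() would run. Calling stop() before the setters would avoid this.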

In a similar test, I repeated the above steps but this time changed the 
instance name in the config from 'r1' to 'r2'. This time upon a reconfig, a 
call to the ctor and to setName() preceded the call to stop(), and then after 
stop() came start().



In case that made you dizzy, here's a summary of behavior during a reconfig:

Format:
method():thread (where <method> is called by <thread>)

*****MyBasicSource():conf-file-poller-0
*****setName(String):conf-file-poller-0
*****setChannelProcessor(ChannelProcessor):conf-file-poller-0
*****start():lifecycleSupervisor-1-2

<<conf file reconfig event occurs here: changed the name of the source 
component instance from r1 to r2>>

*****MyBasicSource():conf-file-poller-0
*****setName(String):conf-file-poller-0
*****setChannelProcessor(ChannelProcessor):conf-file-poller-0
*****stop():conf-file-poller-0
*****start():lifecycleSupervisor-1-4
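
For anyone who wants to collect a trace like this without a debugger, here is a 
small sketch of a helper that records lines in the same method():thread format. 
ThreadTraceSketch and record() are hypothetical names, not part of Flume; in a 
real test, record() would be called at the top of each overridden method of the 
source.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical tracing helper; NOT part of Flume.
class ThreadTraceSketch {
    // Synchronized list so that several threads can append safely.
    static final List<String> TRACE =
            Collections.synchronizedList(new ArrayList<>());

    // Records "*****<method>:<name of the calling thread>" and returns it.
    static String record(String method) {
        String line = "*****" + method + ":" + Thread.currentThread().getName();
        TRACE.add(line);
        return line;
    }

    public static void main(String[] args) throws InterruptedException {
        // Thread names are copied from the trace above, for illustration only.
        Thread poller = new Thread(() -> {
            record("setName(String)");
            record("stop()");
        }, "conf-file-poller-0");
        poller.start();
        poller.join();

        Thread supervisor =
                new Thread(() -> record("start()"), "lifecycleSupervisor-1-4");
        supervisor.start();
        supervisor.join();

        TRACE.forEach(System.out::println);
    }
}
```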


Some things to notice:

1) The LifecycleAware methods are called by 3 different threads. start() is 
called by lifecycleSupervisor-1-2, stop() is called by conf-file-poller-0 (! Is 
that expected?), and the start() in the 2nd instance of MyBasicSource is called 
by lifecycleSupervisor-1-4. Although I can understand that different instances 
(r1 and r2) can be called by different lifecycle threads, note that even the 
same instance (r1) had its start() method called by different lifecycle threads 
before and after reconfig. My concern about this relates to visibility of field 
values and happens-before relationships. If one thread calls stop() and sets 
fields to null, while another thread calls start() and sets those fields to 
non-null values, then without synchronization the null writes from the first 
thread might become visible to the 2nd thread only after the 2nd thread has 
written its values. That means the new instance could observe a null value 
sometime after the 2nd start() is called. See Java Concurrency in Practice 
(JCIP).

2) After the reconfig, setName(..) and setChannelProcessor(..) are being called 
before stop().
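
One way a component could protect itself against the visibility concern in (1) 
is to guard its lifecycle state with a single lock, since releasing a monitor 
happens-before any later acquisition of the same monitor. The sketch below 
assumes that approach. GuardedSourceSketch is a hypothetical class, not Flume 
code, and a plain Object again stands in for ChannelProcessor.

```java
// Hypothetical component that guards its lifecycle state; NOT Flume code.
class GuardedSourceSketch {
    private final Object lock = new Object();
    private Object channelProcessor; // guarded by lock
    private boolean running;         // guarded by lock

    void setChannelProcessor(Object channelProcessor) {
        synchronized (lock) {
            this.channelProcessor = channelProcessor;
        }
    }

    void start() {
        synchronized (lock) {
            // Fail loudly if stop() wiped the processor after it was set.
            if (channelProcessor == null) {
                throw new IllegalStateException("no channel processor set");
            }
            running = true;
        }
    }

    void stop() {
        synchronized (lock) {
            running = false;
            channelProcessor = null; // assumed cleanup, as discussed above
        }
    }

    boolean isRunning() {
        synchronized (lock) {
            return running;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedSourceSketch source = new GuardedSourceSketch();
        source.setChannelProcessor(new Object());
        source.start();

        // join() is used here only to sequence the demo: stop() first ...
        Thread poller = new Thread(source::stop, "conf-file-poller-0");
        poller.start();
        poller.join();

        // ... then the setter and start() on a different thread.
        Thread supervisor = new Thread(() -> {
            source.setChannelProcessor(new Object());
            source.start();
        }, "lifecycleSupervisor-1-4");
        supervisor.start();
        supervisor.join();

        System.out.println("running: " + source.isRunning());
    }
}
```

The lock only addresses visibility. It does not fix the ordering in (2): if the 
setter still runs before stop(), start() in this sketch throws instead of 
silently running with a null processor.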

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
