Re: [DISCUSS] Should we support hot reconfiguration?

alo alt Tue, 12 Jun 2012 23:12:18 -0700

+1 on this

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF


On Jun 13, 2012, at 8:03 AM, Senthilvel Rangaswamy wrote:

> Thanks for the nice summary Mike.
> 
> How about we add an option to Flume to dictate the behavior?
> 
> auto_reconfigure = on or off
> 
> It has to be a command line option.
> 
> 
> On Tue, Jun 12, 2012 at 10:55 PM, Mike Percy <[email protected]> wrote:
> 
>> Hi Eric and all,
>> It's nice to see input from people using and considering Flume. Three
>> sub-features coming out of this discussion appear to be:
>> 
>> 1. Ability to hot-modify the configuration of a single component while
>> it's running;
>> 2. Ability to add/remove components without affecting other parts of the
>> system;
>> - To me it seems that doing #2 would get us ~80% of the uptime
>> improvement of doing #1 correctly, but would involve ~20% of the complexity.
>> 
>> 3. Ability to trigger a reconfiguration manually, instead of using the
>> current file modification polling system
>> - This looks a little less prone to human error vs. how we reconfigure
>> now. I have a couple of ideas for ways to implement this simply.
>> 
>> - Side note: right now, Flume 1.x "reconfiguration" means that, whenever
>> the poller thread detects a change to the configuration file, we:
>>  a. stop all components
>>  b. configure each component with the latest settings in the file
>>  c. start all components
>> 
>> Best,
>> Mike
>> 
>> 
>> 
>> 
>> On Thursday, June 7, 2012 at 3:18 PM, Eric Sammer wrote:
>> 
>>> Flumers:
>>> 
>>> Flume 0.9.x supported online reconfiguration and the intention was for
>> the 1.x branch to do so as well (it doesn't yet). I wanted to start a
>> discussion around whether people are interested in this kind of
>> functionality or if simply restarting the daemon(s) was sufficient for your
>> deployment. There are two ways of thinking about it:
>>> 
>>> * Support reconfiguration. Agents may have multiple flows passing
>> through them and, ideally, adding new ones shouldn't interrupt existing
>> flows. Agent restarts interrupt collection and, for non-durable channels
>> (i.e. MemoryChannel), data *may* be lost. Reconfiguration will add
>> significant complexity and ultimately does not get around host level
>> maintenance, software upgrades, and the like.
>>> 
>>> * Do not support reconfiguration. Accept the fact that agents may go down
>> eventually, so it should be supported as a first class case. In other
>> words, embrace the idea of failure / maintenance and handle it by
>> recommending topologies of agents that include multiple agents at each tier
>> and simply round-robin / failover where necessary. The only downside is the
>> agent tier closest to the originating source (e.g. a log4j client);
>> restarting that agent means the client application needs to be able to find
>> another agent or buffer (which impacts durability or blocks the
>> application).
>>> 
>>> We can optionally support some subset of online reconfiguration such as
>> only allowing new flows to be introduced or existing flows to be
>> "decommissioned," but not allow alteration of existing flows. Ultimately
>> this feature is a ton of work and adds a ton of complexity so if it's not
>> something folks are clamoring for, we should spend our time worrying about
>> more pressing issues.
>>> 
>>> Thoughts? Comments?
>>> --
>>> Eric Sammer
>>> twitter: esammer
>>> data: www.cloudera.com (http://www.cloudera.com)
>> 
>> 
>> 
>> 
> 
> 
> -- 
> ..Senthil
> 
> "If there's anything more important than my ego around, I want it
> caught and shot now."
>                                                    - Douglas Adams.
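
To make the quoted discussion concrete, here is a rough Java sketch of the stop / configure / start cycle from Mike's side note, combined with Senthil's proposed auto_reconfigure switch and a manual trigger (sub-feature 3). It is only an illustration under stated assumptions: Component, RecordingComponent, FullRestartReconfigurer, onFileChanged and reconfigure are all hypothetical names, not Flume's actual classes or API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a source, channel, or sink. Not Flume's real API.
interface Component {
    void stop();
    void configure(Map<String, String> settings);
    void start();
}

// Test double that records every lifecycle call so the ordering is visible.
class RecordingComponent implements Component {
    private final String name;
    private final List<String> log;

    RecordingComponent(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    public void stop() { log.add("stop " + name); }
    public void configure(Map<String, String> settings) { log.add("configure " + name); }
    public void start() { log.add("start " + name); }
}

// Models the "restart everything" reconfiguration described in the thread.
class FullRestartReconfigurer {
    private final Map<String, Component> components;
    // Assumption: models the proposed auto_reconfigure = on/off option.
    private final boolean autoReconfigure;

    FullRestartReconfigurer(Map<String, Component> components, boolean autoReconfigure) {
        this.components = components;
        this.autoReconfigure = autoReconfigure;
    }

    // Called when a poller notices the file changed.
    // Returns true only if a reconfiguration actually ran.
    boolean onFileChanged(Map<String, Map<String, String>> latest) {
        if (!autoReconfigure) {
            return false; // file changes are ignored until someone triggers manually
        }
        reconfigure(latest);
        return true;
    }

    // Manual trigger: (a) stop all, (b) configure each, (c) start all.
    void reconfigure(Map<String, Map<String, String>> latest) {
        for (Component c : components.values()) {
            c.stop();
        }
        for (Map.Entry<String, Component> e : components.entrySet()) {
            Map<String, String> settings = latest.get(e.getKey());
            e.getValue().configure(settings != null ? settings : Collections.<String, String>emptyMap());
        }
        for (Component c : components.values()) {
            c.start();
        }
    }
}

class ReconfigureDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Map<String, Component> components = new LinkedHashMap<>();
        components.put("source", new RecordingComponent("source", log));
        components.put("sink", new RecordingComponent("sink", log));

        Map<String, Map<String, String>> latest = new LinkedHashMap<>();
        FullRestartReconfigurer r = new FullRestartReconfigurer(components, false);
        r.onFileChanged(latest); // ignored because auto-reconfigure is off
        r.reconfigure(latest);   // explicit manual trigger
        System.out.println(log);
    }
}
```

Note that every component is stopped before any is reconfigured, which is why a change to one flow interrupts all the others. Sub-feature 2 (adding or removing components without touching the rest) would replace the three whole-map loops with a diff between the old and new component sets.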
