+1 on this

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF

On Jun 13, 2012, at 8:03 AM, Senthilvel Rangaswamy wrote:

> Thanks for the nice summary, Mike.
>
> How about if we add an option to Flume to dictate the behavior?
>
> auto_reconfigure = on or off
>
> It has to be a command line option.
>
> On Tue, Jun 12, 2012 at 10:55 PM, Mike Percy <[email protected]> wrote:
>
>> Hi Eric and all,
>> It's nice to see input from people using and considering Flume. Three
>> sub-features coming out of this discussion appear to be:
>>
>> 1. Ability to hot-modify the configuration of a single component while
>>    it's running;
>> 2. Ability to add/remove components without affecting other parts of
>>    the system;
>>    - To me it seems that doing #2 would get us ~80% of the uptime
>>      improvement of doing #1 correctly, but would involve ~20% of the
>>      complexity.
>> 3. Ability to trigger a reconfiguration manually, instead of using the
>>    current file modification polling system
>>    - This looks a little less prone to human error vs. how we
>>      reconfigure now. I have a couple of ideas for ways to implement
>>      this simply.
>>
>> - Side note: right now, Flume 1.x "reconfiguration" means that,
>>   whenever the poller thread detects a change to the configuration
>>   file, we:
>>   a. stop all components
>>   b. configure each component with the latest settings in the file
>>   c. start all components
>>
>> Best,
>> Mike
>>
>> On Thursday, June 7, 2012 at 3:18 PM, Eric Sammer wrote:
>>
>>> Flumers:
>>>
>>> Flume 0.9.x supported online reconfiguration and the intention was
>>> for the 1.x branch to do so as well (it doesn't yet). I wanted to
>>> start a discussion around whether people are interested in this kind
>>> of functionality or if simply restarting the daemon(s) was sufficient
>>> for your deployment. There are two ways of thinking about it:
>>>
>>> * Support reconfiguration. Agents may have multiple flows passing
>>>   through them and, ideally, adding new ones shouldn't interrupt
>>>   existing flows. Agent restarts interrupt collection and, for
>>>   non-durable channels (e.g. MemoryChannel), data *may* be lost.
>>>   Reconfiguration will add significant complexity and ultimately does
>>>   not get around host level maintenance, software upgrades, and the
>>>   like.
>>>
>>> * Do not support reconfiguration. Accept the fact that agents may go
>>>   down eventually, so it should be supported as a first class case.
>>>   In other words, embrace the idea of failure / maintenance and handle
>>>   it by recommending topologies of agents that include multiple agents
>>>   at each tier and simply round-robin / failover where necessary. The
>>>   only downside is the agent tier closest to the originating source
>>>   (e.g. a log4j client); restarting that agent means the client
>>>   application needs to be able to find another agent or buffer (which
>>>   impacts durability or blocks the application).
>>>
>>> We can optionally support some subset of online reconfiguration such
>>> as only allowing new flows to be introduced or existing flows to be
>>> "decommissioned," but not allow alteration of existing flows.
>>> Ultimately this feature is a ton of work and adds a ton of complexity
>>> so if it's not something folks are clamoring for, we should spend our
>>> time worrying about more pressing issues.
>>>
>>> Thoughts? Comments?
>>>
>>> --
>>> Eric Sammer
>>> twitter: esammer
>>> data: www.cloudera.com (http://www.cloudera.com)
>
> --
> ..Senthil
>
> "If there's anything more important than my ego around, I want it
> caught and shot now."
> - Douglas Adams.
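
For anyone following the thread, here is a rough sketch of the stop / configure / start cycle Mike describes in his side note, together with an on/off switch along the lines Senthil proposes. It is plain Python, not Flume code: `Component`, `ConfigPoller`, `parse_properties`, and the `auto_reconfigure` constructor flag are all hypothetical names made up for illustration, and the key=value parsing is a simplification rather than Flume's actual properties handling.

```python
import os


class Component:
    """Hypothetical stand-in for a source, channel, or sink."""

    def __init__(self, name):
        self.name = name
        self.settings = {}
        self.running = False
        self.events = []  # lifecycle calls, in the order they happened

    def stop(self):
        self.running = False
        self.events.append("stop")

    def configure(self, settings):
        self.settings = dict(settings)
        self.events.append("configure")

    def start(self):
        self.running = True
        self.events.append("start")


def parse_properties(path):
    """Read simple key=value lines, skipping blanks and # comments."""
    settings = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                settings[key.strip()] = value.strip()
    return settings


class ConfigPoller:
    """Checks the config file's modification time and reconfigures on change."""

    def __init__(self, path, components, auto_reconfigure=True):
        self.path = path
        self.components = components
        self.auto_reconfigure = auto_reconfigure  # assumed on/off switch
        self.last_mtime = None

    def reconfigure(self):
        """Manual trigger: a. stop all, b. configure all, c. start all."""
        settings = parse_properties(self.path)
        for component in self.components:
            component.stop()
        for component in self.components:
            component.configure(settings)
        for component in self.components:
            component.start()
        self.last_mtime = os.stat(self.path).st_mtime_ns

    def poll_once(self):
        """Return True if a file change triggered a reconfiguration."""
        mtime = os.stat(self.path).st_mtime_ns
        if not self.auto_reconfigure or mtime == self.last_mtime:
            return False
        self.reconfigure()
        return True
```

With `auto_reconfigure=False` the poller never acts on its own, and calling `reconfigure()` directly stands in for the manual trigger in item 3. Note that every component is stopped even if only one setting changed, which is exactly the interruption items 1 and 2 are meant to avoid.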
