I will take a deeper look at how multiple configurations are handled; thanks
for the tip.
And yes, I think dynamically loading the file would be very useful.

-eran



On Tue, Jun 12, 2012 at 9:18 PM, Eric Sammer <[email protected]> wrote:

> Eran:
>
> On Tue, Jun 12, 2012 at 3:00 AM, Eran Kutner <[email protected]> wrote:
>
>> As a long-time Flume user in a large production environment, I'd like to
>> say that the main reason I preferred Flume over Scribe, and the reason I'm
>> avoiding switching to Flume NG, is the dynamic configuration capability.
>> From my experience it is much, much easier to manage the node configurations
>> in a single text file and deploy the whole thing at once to all the
>> servers than to deal with individual configurations on each server. In
>> particular, it makes things much easier in cases where the configuration is
>> NOT identical between the servers.
>> For example, if you have a tier of servers producing events and a tier of
>> collectors, you will probably want to load balance the collectors, so some
>> servers write to certain collectors while others write to other collectors,
>> which means their failover options are also different and you have multiple
>> configurations to deal with. Now imagine you're adding more collector nodes
>> and need to rebalance the whole thing, so some servers will now have to
>> write to other collectors. Managing individual configuration files becomes
>> a nightmare very quickly, while having one configuration source that can
>> be deployed to all the servers keeps things simple.
>> Using Puppet or Chef will not help in this case either.
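>>
>> Just to illustrate what "deploy the whole thing at once" means in practice,
>> it's essentially a one-shot push of the same file to the whole fleet. A
>> rough sketch, with made-up host names, paths, and service name:
>>
>>     # push the single shared config to every node, then restart each agent
>>     for host in web01 web02 coll01 coll02; do
>>         scp flume.conf "$host":/etc/flume/flume.conf
>>         ssh "$host" 'service flume-agent restart'
>>     done
>>
>> Hand-maintaining a different file per box doesn't scale the same way.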
>>
>> Now that I think about it, there might be a good enough middle ground
>> here. If there were one "master" configuration file that could contain
>> configurations for multiple servers, and each server knew its own identity
>> and could load its own configuration out of this file, then it would be
>> enough to deploy this one file to all the servers at once and gain the same
>> behavior as Flume OG without having a real "master" server. I'd still like
>> to see this configuration being auto-loaded when deployed, without having
>> to restart the Flume service. This alternative should be much simpler to
>> develop and give 90% of the important functionality (the other 10% is
>> having one location to see the status of all the nodes).
>>
>
> This last paragraph is exactly what 1.x does: it supports defining
> multiple agents within a config file. When the agent starts, it knows its
> name and only loads the slice of the file that matters to that agent. The
> question is whether we should poll this file for changes and load them
> dynamically. It sounds like that's valuable to you (but correct me if I'm
> wrong). Either way, the ability to define all config in a single file is
> still there. The one thing 1.x doesn't do is get it out to all the machines
> for you (in other words, you should switch to 1.x! :)).
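>
> To make that concrete, here's a minimal sketch of a single file defining
> two agents (the agent names, hosts, and ports are made up; see the 1.x
> user guide for the full property set):
>
>     # agent "web1" runs on the web tier and forwards to a collector
>     web1.sources = s1
>     web1.channels = c1
>     web1.sinks = k1
>     web1.sources.s1.type = exec
>     web1.sources.s1.command = tail -F /var/log/app.log
>     web1.sources.s1.channels = c1
>     web1.channels.c1.type = memory
>     web1.sinks.k1.type = avro
>     web1.sinks.k1.hostname = collector01
>     web1.sinks.k1.port = 4141
>     web1.sinks.k1.channel = c1
>
>     # agent "coll1" runs on the collector tier; same file, different slice
>     coll1.sources = s1
>     coll1.channels = c1
>     coll1.sinks = k1
>     coll1.sources.s1.type = avro
>     coll1.sources.s1.bind = 0.0.0.0
>     coll1.sources.s1.port = 4141
>     coll1.sources.s1.channels = c1
>     coll1.channels.c1.type = memory
>     coll1.sinks.k1.type = logger
>     coll1.sinks.k1.channel = c1
>
> An agent started with "flume-ng agent -n web1 -f flume.conf" reads only
> the web1.* keys; the same file started with "-n coll1" reads only the
> coll1.* keys.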
>
>
>> Hope this makes sense.
>>
>
> Thanks for your input. Definitely makes sense. I'd love to hear your
> thoughts on my comments above.
>
>
>>
>>
>> -eran
>>
>>
>>
>> On Tue, Jun 12, 2012 at 10:16 AM, Jorn Argelo - Ephorus <
>> [email protected]> wrote:
>>
>>> Hi all,
>>>
>>> Not sure if a regular user should participate in this sort of
>>> discussion, but here's my opinion nevertheless ;-)
>>>
>>> I think one of the biggest flaws of OG Flume is that it's too complex
>>> and maybe even over-engineered. For example, the centralized configuration
>>> in the Flume Master sounds good on paper, but in practice it doesn't make
>>> things easier at all (fortunately this was fixed with NG Flume). So IMO,
>>> stick to the KISS principle and keep things down-to-earth. I have no
>>> problem restarting agents, and a HUP construction to reload the
>>> configuration, as Senthilvel suggested, is even better.
>>>
>>> I guess most users are using a config management system like Puppet or
>>> Chef to deploy and configure their agents. If you keep that in mind in
>>> terms of configuration, it makes things a whole lot easier for those users.
>>>
>>> Regards,
>>>
>>> Jorn
>>>
>>> From: Senthilvel Rangaswamy [mailto:[email protected]]
>>> Sent: Saturday, June 9, 2012 1:32 AM
>>> To: [email protected]
>>> Cc: Flume Development
>>> Subject: Re: [DISCUSS] Should we support hot reconfiguration?
>>>
>>> IMHO, online reconfiguration is dangerous. A typical use case for Flume
>>> is to be deployed at the very beginning of the data source, like web
>>> servers, and these are typically present in large numbers. If you push
>>> out a bad config and it gets picked up, it will wreak havoc on the
>>> infrastructure.
>>>
>>> I would like Flume to pick up the new config when it is HUP'ed. This
>>> way it is a controlled deployment, but at the same time not a full
>>> restart.
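>>>
>>> Concretely, the controlled deployment could look something like this,
>>> one host at a time (a rough sketch; HUP reload is only a proposal at
>>> this point, and the host name, paths, and pid file location are made up):
>>>
>>>     # copy the new config into place, then reload in place with SIGHUP
>>>     scp flume.conf agent01:/etc/flume/flume.conf
>>>     ssh agent01 'kill -HUP "$(cat /var/run/flume/flume.pid)"'
>>>
>>> That way a bad config gets caught after the first host instead of
>>> everywhere at once.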
>>>
>>> On Thu, Jun 7, 2012 at 3:18 PM, Eric Sammer <[email protected]>
>>> wrote:
>>>
>>> Flumers:
>>>
>>> Flume 0.9.x supported online reconfiguration, and the intention was for
>>> the 1.x branch to do so as well (it doesn't yet). I wanted to start a
>>> discussion around whether people are interested in this kind of
>>> functionality or whether simply restarting the daemon(s) is sufficient
>>> for your deployment. There are two ways of thinking about it:
>>>
>>> * Support reconfiguration. Agents may have multiple flows passing
>>> through them and, ideally, adding new ones shouldn't interrupt existing
>>> flows. Agent restarts interrupt collection and, for non-durable channels
>>> (e.g. MemoryChannel), data *may* be lost. Reconfiguration will add
>>> significant complexity and ultimately does not get around host-level
>>> maintenance, software upgrades, and the like.
>>>
>>> * Do not support reconfiguration. Accept the fact that agents will go
>>> down eventually, so that should be supported as a first-class case. In
>>> other words, embrace the idea of failure / maintenance and handle it by
>>> recommending topologies that include multiple agents at each tier and
>>> simply round-robin / fail over where necessary. The only downside is the
>>> agent tier closest to the originating source (e.g. a log4j client):
>>> restarting that agent means the client application needs to be able to
>>> find another agent or buffer (which impacts durability or blocks the
>>> application).
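>>>
>>> For what it's worth, the failover topology that option relies on can be
>>> expressed in 1.x-style config along these lines (a sketch using the
>>> failover sink processor; agent, sink, channel, and host names are made
>>> up, and the source/channel definitions are omitted):
>>>
>>>     a1.sinks = k1 k2
>>>     a1.sinkgroups = g1
>>>     a1.sinkgroups.g1.sinks = k1 k2
>>>     a1.sinkgroups.g1.processor.type = failover
>>>     # higher priority wins; k2 takes over if k1's collector goes down
>>>     a1.sinkgroups.g1.processor.priority.k1 = 10
>>>     a1.sinkgroups.g1.processor.priority.k2 = 5
>>>     a1.sinks.k1.type = avro
>>>     a1.sinks.k1.hostname = collector01
>>>     a1.sinks.k1.port = 4141
>>>     a1.sinks.k1.channel = c1
>>>     a1.sinks.k2.type = avro
>>>     a1.sinks.k2.hostname = collector02
>>>     a1.sinks.k2.port = 4141
>>>     a1.sinks.k2.channel = c1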
>>>
>>> We can optionally support some subset of online reconfiguration, such as
>>> only allowing new flows to be introduced or existing flows to be
>>> "decommissioned," but not allowing alteration of existing flows.
>>> Ultimately this feature is a ton of work and adds a ton of complexity,
>>> so if it's not something folks are clamoring for, we should spend our
>>> time worrying about more pressing issues.
>>>
>>> Thoughts? Comments?
>>>
>>> --
>>> Eric Sammer
>>> twitter: esammer
>>> data: www.cloudera.com
>>>
>>>
>>>
>>>
>>> --
>>> ..Senthil
>>>
>>> "If there's anything more important than my ego around, I want it
>>>  caught and shot now."
>>>                                                     - Douglas Adams.
>>>
>>
>>
>
>
> --
> Eric Sammer
> twitter: esammer
> data: www.cloudera.com
>
