Jorn:

On Tue, Jun 12, 2012 at 12:16 AM, Jorn Argelo - Ephorus <
[email protected]> wrote:

> Hi all,
>
> Not sure if a regular user should participate in this sort of discussion
> but here’s my opinion nevertheless ;-)
>

You should *absolutely* share your opinion on this. That's the point of the
thread. To be clear, there is no such thing as a "developer" or
"committer-only" discussion in Apache projects. All development is open and
user feedback is just as much a contribution as a patch!


> I think one of the biggest flaws of OG Flume is that it’s too complex and
> maybe even over-engineered. For example the centralized configuration in
> the Flume Master sounds good on paper but in practice it doesn’t make
> things easier at all (fortunately this was fixed with NG Flume). So IMO
> stick to the KISS principle and keep things down-to-earth. I have no
> problem restarting agents, and a HUP construction to reload the
> configuration as Senthilvel suggested is even better.
>

Just to clarify, when I talk about online reconfiguration, that is
orthogonal to a centralized master that controls configuration. In other
words, if we were to do it, it would be reloading changes to the property
config file, not via ZK and a master (unless someone wanted to work on that
- the plugin interfaces are there - it's just that I'm not interested in
doing that at this time for the KISS reason you mention).

As an aside, Java does not make signal handling easy[1] (at all) because
signals are a platform-specific feature, so reloading specifically in
response to SIGHUP is unlikely. Trivially, we could poll the file for mtime
changes and reload; that's the most likely candidate. Atomicity of changes
could be achieved by writing changes to a temp file and renaming it into
place (which most config management tools support).
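
A minimal sketch of the polling idea above, assuming a single properties
file on a local filesystem. ConfigFilePoller, pollOnce, and
replaceAtomically are hypothetical names for illustration, not Flume APIs.
The writer side shows the temp-file-and-rename step a config management
tool would normally perform for you.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.FileTime;
import java.util.function.Consumer;

/** Hypothetical helper (not part of Flume): reloads when the file's mtime changes. */
class ConfigFilePoller {
    private final Path file;
    private final Consumer<Path> onChange;
    private FileTime lastSeen;

    ConfigFilePoller(Path file, Consumer<Path> onChange) throws IOException {
        this.file = file;
        this.onChange = onChange;
        // Remember the mtime at startup so the first poll does not trigger a reload.
        this.lastSeen = Files.getLastModifiedTime(file);
    }

    /** Call this from a timer; returns true if a reload was triggered. */
    boolean pollOnce() throws IOException {
        FileTime current = Files.getLastModifiedTime(file);
        if (current.equals(lastSeen)) {
            return false;
        }
        lastSeen = current;
        onChange.accept(file);
        return true;
    }

    /**
     * Writer side: write to a temp file in the same directory, then rename
     * it over the target so a reader never sees a half-written file.
     * Whether ATOMIC_MOVE replaces an existing target is implementation
     * specific; it does on POSIX filesystems, where it maps to rename().
     */
    static void replaceAtomically(Path target, String contents) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "conf", ".tmp");
        Files.write(tmp, contents.getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

One caveat with this approach: mtime granularity is coarse on some
filesystems, so two writes within the same tick can go unnoticed. Comparing
a checksum of the contents would be a stricter alternative.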

[1] SIGTERM is a special case in Java, as the JVM supports termination
callbacks (shutdown hooks). That's how we handle that. Technically, there's
a (not so) secret signal handling class (sun.misc.Signal) in the Oracle
JDK, but it's entirely unsupported so depending on it could be dangerous.
I'd rather not make it harder to support non-Linux platforms if someone
wanted to pick up that ball and run with it.
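
For reference, the JVM termination callbacks mentioned in [1] are registered
with Runtime.addShutdownHook, which is part of the standard, supported API.
A rough sketch follows; ShutdownHooks and register are hypothetical names,
and this is not Flume's actual shutdown code. Hooks run on normal exit and
on termination requests such as SIGTERM or Ctrl-C, but not when the process
is killed outright with SIGKILL.

```java
/** Hypothetical helper (not Flume's code): registers cleanup to run at JVM shutdown. */
class ShutdownHooks {
    static Thread register(Runnable cleanup) {
        // The hook is an unstarted thread; the JVM starts it during shutdown.
        Thread hook = new Thread(cleanup, "agent-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

An agent would call something like ShutdownHooks.register(agent::stop) once
at startup, where agent::stop stands in for whatever cleanup is needed.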

> I guess most users are using a config management system like Puppet or
> Chef to deploy and configure their agents. If you keep that in mind in
> terms of configuration it makes things a whole lot easier for those users.
>

Many of the users I've talked to are doing exactly this, I agree. The
question is only if we should support reloading changes to the files (for
now).


> Regards,
>
> Jorn
>
> *From:* Senthilvel Rangaswamy [mailto:[email protected]]
> *Sent:* Saturday, June 9, 2012 1:32
> *To:* [email protected]
> *CC:* Flume Development
> *Subject:* Re: [DISCUSS] Should we support hot reconfiguration?
>
> IMHO, online reconfiguration is dangerous. A typical use case for Flume is
> to be deployed at the very beginning of the data source, like web servers.
> These are typically in large numbers. Say you push out a bad config and
> that gets picked up: it will wreak havoc on the infrastructure.
>
> I would like Flume to pick up the new config when it is HUP'ed. This way,
> it is a controlled deployment, but at the same time not a full restart.
>
> On Thu, Jun 7, 2012 at 3:18 PM, Eric Sammer <[email protected]> wrote:
>
> Flumers:
>
> Flume 0.9.x supported online reconfiguration and the intention was for the
> 1.x branch to do so as well (it doesn't yet). I wanted to start a
> discussion around whether people are interested in this kind of
> functionality or if simply restarting the daemon(s) was sufficient for your
> deployment. There are two ways of thinking about it:
>
> * Support reconfiguration. Agents may have multiple flows passing through
> them and, ideally, adding new ones shouldn't interrupt existing flows.
> Agent restarts interrupt collection and, for non-durable channels (i.e.
> MemoryChannel), data *may* be lost. Reconfiguration will add significant
> complexity and ultimately does not get around host level maintenance,
> software upgrades, and the like.
>
> * Do not support reconfiguration. Accept the fact that agents may go down
> eventually, so it should be supported as a first class case. In other
> words, embrace the idea of failure / maintenance and handle it by
> recommending topologies of agents that include multiple agents at each tier
> and simply round-robin / failover where necessary. The only downside is the
> agent tier closest to the originating source (e.g. a log4j client);
> restarting that agent means the client application needs to be able to find
> another agent or buffer (which impacts durability or blocks the
> application).
>
> We can optionally support some subset of online reconfiguration such as
> only allowing new flows to be introduced or existing flows to be
> "decommissioned," but not allow alteration of existing flows. Ultimately
> this feature is a ton of work and adds a ton of complexity so if it's not
> something folks are clamoring for, we should spend our time worrying about
> more pressing issues.
>
> Thoughts? Comments?
>
> --
> Eric Sammer
> twitter: esammer
> data: www.cloudera.com
>
>
>
>
> --
> ..Senthil
>
> "If there's anything more important than my ego around, I want it
>  caught and shot now."
>                                                     - Douglas Adams.
>



-- 
Eric Sammer
twitter: esammer
data: www.cloudera.com
