Joe, I'm not sure if I follow what you are saying. I think you are saying
that unless we make all configs dynamic it will be confusing. Is that
right? I somewhat agree. We are already kind of in that situation with the
topic configs being dynamic, though.

I think trying to make all configs dynamic all at once will be very hard.
However, some of the hard cases you describe aren't that hard. Currently we
construct all kinds of objects and pass in config values such
as replicaFetchBackoffMs
or replicaFetchMaxBytes as plain numbers. These numbers are immutable, so
you are saying we would have to restart the ReplicaFetcherManager to change
them. But actually the proposal is that we would instead pass the
KafkaConfiguration object into the ReplicaFetcherManager and dynamically
reference the current value as config.current.replicaFetchBackoffMs
whenever it is needed.

This is the same pattern we follow with LogConfig.

This is a big change but we can do it a bit at a time. To make a config
dynamic you just need to move the corresponding system over to not cache
the value and instead reference the KafkaConfiguration instance.
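
To make it concrete, here is a minimal sketch of the pattern in Scala
(class and field names are illustrative, not the actual broker code):

    import java.util.concurrent.atomic.AtomicReference

    // Hypothetical snapshot of broker config values (not the real KafkaConfig).
    case class KafkaConfig(replicaFetchBackoffMs: Int, replicaFetchMaxBytes: Int)

    // Holder that always exposes the latest snapshot via `current`.
    class KafkaConfiguration(initial: KafkaConfig) {
      private val ref = new AtomicReference[KafkaConfig](initial)
      def current: KafkaConfig = ref.get()
      def update(updated: KafkaConfig): Unit = ref.set(updated)
    }

    // Before: the value is cached at construction time, so changing it
    // means rebuilding (i.e. restarting) the manager.
    class StaticReplicaFetcherManager(backoffMs: Int) {
      def backoff(): Unit = Thread.sleep(backoffMs)
    }

    // After: the manager holds the configuration object and re-reads the
    // current value each time it is needed, so updates apply immediately.
    class DynamicReplicaFetcherManager(config: KafkaConfiguration) {
      def backoff(): Unit = Thread.sleep(config.current.replicaFetchBackoffMs)
    }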

The way config changes take effect is not always the same, though. For
example, if you change replicaFetchBackoffMs it takes effect immediately;
if you change a socket buffer, however, it would only take effect for newly
created sockets (there is no way to go back and adjust the old ones).
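
A rough sketch of that second case (assuming the caller reads the buffer
size from config.current at socket creation time):

    import java.net.ServerSocket

    // The buffer size is consulted only when the socket is created, so a
    // dynamic update applies to new sockets while existing sockets keep
    // the buffer they were created with.
    def newServerSocket(currentReceiveBufferBytes: Int): ServerSocket = {
      val socket = new ServerSocket()
      socket.setReceiveBufferSize(currentReceiveBufferBytes)
      socket
    }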

This will handle perhaps 60% of the configs. However, there are some
configs that are very hard to change dynamically and not worth supporting,
for example the compaction buffer in the cleaner or the number of threads
in the thread pool.

I guess the question is: how confusing is it to have the story for each
config be a little different? Even if we add this to ConfigDef and hence
automatically document which configs are dynamic, you will still often make
a change and not really know whether it has taken effect or whether it
requires a restart without consulting the docs.
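
Purely as a hypothetical sketch (the real ConfigDef has no such flag
today), the documentation side could look something like:

    // Hypothetical ConfigDef-style registry with a `dynamic` flag so the
    // generated docs can state which configs take effect without a restart.
    case class ConfigKey(name: String, dynamic: Boolean, note: String)

    val brokerConfigs = Seq(
      ConfigKey("replica.fetch.backoff.ms", dynamic = true,
        "takes effect immediately"),
      ConfigKey("socket.receive.buffer.bytes", dynamic = true,
        "applies only to newly created sockets"),
      ConfigKey("log.cleaner.dedupe.buffer.size", dynamic = false,
        "requires a broker restart")
    )

    // Generate the "dynamic?" column of the config documentation.
    val docs = brokerConfigs.map { k =>
      val kind = if (k.dynamic) "dynamic" else "static"
      f"${k.name}%-35s $kind%-8s ${k.note}"
    }.mkString("\n")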

-Jay

On Mon, May 4, 2015 at 6:31 AM, Joe Stein <joe.st...@stealth.ly> wrote:

> Aditya, when I think about the motivation of "not having to restart brokers
> to change a config", I think about all of the configurations I have seen
> changed in brokers followed by a restart (which is just about all of them).
> What I mean by "stop the world" is when producers and/or consumers will not
> be able to use the broker(s) for a period of time, or when something within
> the broker holds/blocks everything for the changes to take effect and a
> leader election or ISR change is going to occur.
>
> Let's say someone wanted to change replicaFetchMaxBytes
> or replicaFetchBackoffMs dynamically: you would have to stop the
> ReplicaFetcherManager. If you use a watcher, then all brokers will have to
> stop and (hopefully) start the ReplicaFetcherManager at the same time. Or
> let's say someone wanted to change NumNetworkThreads: the entire
> SocketServer on every broker would have to stop and (hopefully) start at
> the same time. I believe most of the configurations fall into this
> category, and using a watcher notification to every broker without some
> control is going to be a problem. If the notification goes only to the
> controller and the controller is able to manage the processing for every
> broker, that might work, but it doesn't solve all the problems to be worked
> on. We would also have to think about what to do for the controller broker
> itself (unless we eventually make the controller not a broker), as well as
> how to deal with changes that could take brokers in and out of the ISR or
> cause leader election. Ideally we could make these changes without
> "stopping the world" (not just having the controller manage a
> broker-by-broker restart), so that brokers that are leaders would still be
> leaders (perhaps the connections for producing / consuming get buffered or
> something) when (if) they come back online.
>
> The thing is that lots of folks want all (or as many as possible) of the
> configurations to be dynamic, and I am concerned that if we don't code for
> the harder cases then only one or two configurations will end up dynamic.
> If the motivation for this KIP is just to make quotas work, that is OK.
>
> The more I think about it, the less sure I am that just labeling certain
> configs as dynamic is going to be helpful, because folks will still have to
> manage updates for all the configurations and restart brokers, and will now
> also have a new burden of understanding dynamic properties. I think we need
> to add solutions that make things easier for folks where we can, without
> adding new items for them to contend with.
>
> Thanks!
>
> ~ Joe Stein
> - - - - - - - - - - - - - - - - -
>
>   http://www.stealth.ly
> - - - - - - - - - - - - - - - - -
>
> On Sun, May 3, 2015 at 8:23 PM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Hey Joe,
> >
> > Can you elaborate on what you mean by a "stop the world" change? In this
> > protocol, we can target notifications to a subset of brokers in the
> > cluster (the controller if we need to). Is the AdminChangeNotification a
> > ZK notification or a request type exposed by each broker?
> >
> > Thanks,
> > Aditya
> >
> > ________________________________________
> > From: Joe Stein [joe.st...@stealth.ly]
> > Sent: Friday, May 01, 2015 5:25 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-21 Configuration Management
> >
> > Hi Aditya, thanks for the write-up and for focusing on this piece.
> >
> > Agreed, we need a way to make broker changes dynamically without
> > rolling restarts.
> >
> > I think, though, that if every broker is getting changes via
> > notifications, it is going to limit which configs can be dynamic.
> >
> > We could never deliver a "stop the world" configuration change, because
> > then it would happen across the entire cluster, on every broker at the
> > same time.
> >
> > Can maybe just the controller get the notification?
> >
> > And we could provide a layer for brokers to work with the controller to
> > do the config change operations at its discretion (so it can stop things
> > if it needs to).
> >
> > The controller gets the notification and sends an AdminChangeNotification
> > to brokers [X .. N]; then the brokers can do their thing, even sending a
> > heartbeat response while a broker takes the few milliseconds it needs, or
> > crashes. We need to go through both scenarios.
> >
> > I am worried that we put this change in as-is, it works for quotas and
> > maybe a few other things, but nothing else becomes dynamic, and we never
> > get far enough toward almost no more rolling restarts.
> >
> > ~ Joe Stein
> > - - - - - - - - - - - - - - - - -
> >
> >   http://www.stealth.ly
> > - - - - - - - - - - - - - - - - -
> >
> > On Thu, Apr 30, 2015 at 8:14 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
> >
> > > >    1. I have deep concerns about managing configuration in ZooKeeper.
> > > >    First, Producers and Consumers shouldn't depend on ZK at all; this
> > > >    seems to add back a dependency we are trying to get away from.
> > >
> > > The KIP probably needs to be clarified here - I don't think Aditya was
> > > referring to client (producer/consumer) configs. These are global
> > > client-id-specific configs that need to be managed centrally.
> > > (Specifically, quota overrides on a per-client basis).
> > >
> > >
> >
>
