Aditya, when I think about the motivation of "not having to restart brokers
to change a config" I think about all of the configurations I have seen
having to get changed in brokers and then restarted (which is just about
all of them). What I mean by "stop the world" is when producers and/or
consumers cannot use the broker(s) for a period of time, or something
within the broker holds/blocks everything while the changes take effect,
and leader election or an ISR change is going to occur.

Let's say someone wanted to change replicaFetchMaxBytes
or replicaFetchBackoffMs dynamically; you would have to stop the
ReplicaFetcherManager. If you use a watcher, then all brokers at the
same time will have to stop and (hopefully) start ReplicaFetcherManager at
the same time. Or let's say someone wanted to change NumNetworkThreads: the
entire SocketServer for every broker at the same time would have to stop
and (hopefully) start. I believe most of the configurations fall into this
category, and using a watcher notification to every broker without some
control is going to be a problem. If the notification just goes to the
controller, and the controller is able to manage the processing for every
broker, that might work, but it doesn't solve all the problems to be worked
on.
We would also have to think about what to do for the controller broker
itself (unless we make the controller not a broker, if that is possible), as
well as how to deal with some of these changes that could take brokers in
and out of the ISR or cause leader election. Ideally we can make these
changes without "stopping the world" (not just a matter of having the
controller manage a broker-by-broker restart), so that brokers that are
leaders would still be leaders (perhaps the connections for producing /
consuming get buffered or something) when (if) they come back online.
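To make the difference concrete, here is a toy sketch of the alternative to a blanket watcher: a controller that applies a dynamic change one broker at a time, touching partition leaders last so clients always have someone to talk to. This is not Kafka code; the function, the names, and the follower-first ordering are all my own assumptions for illustration.

```python
# Hypothetical sketch (not Kafka's actual implementation): a controller
# applying a dynamic config change broker by broker instead of letting a
# ZK watcher fire on every broker simultaneously.

def rolling_config_update(brokers, leaders, apply_change):
    """Apply `apply_change` to each broker, followers first, so that
    partition leaders stay available while their followers restart
    internal components (e.g. a ReplicaFetcherManager)."""
    # Followers first: bouncing their fetchers does not block clients.
    for b in brokers:
        if b not in leaders:
            apply_change(b)
    # Leaders last, one at a time, so at most one leader is ever affected.
    for b in brokers:
        if b in leaders:
            apply_change(b)

applied_order = []
rolling_config_update(
    brokers=["b1", "b2", "b3"],
    leaders={"b2"},
    apply_change=applied_order.append,
)
print(applied_order)  # followers b1 and b3 first, then leader b2
```

A real version would of course also need the health checks discussed above (waiting for ISR membership to recover before moving to the next broker), but even this shape shows why the controller, not a per-broker watcher, has to own the sequencing.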

The thing is that lots of folks want all (or as many as possible) of the
configurations to be dynamic, and I am concerned that if we don't code for
the harder cases then we will only have one or two configurations able to
be dynamic. If the motivation for this KIP is just making quotas work, that
is OK.

The more I think about it, the less sure I am that just labeling certain
configs as dynamic is going to be helpful for folks, because they still
have to manage the updates for all the configurations and restart brokers,
and now they have the new burden of understanding dynamic properties. I
think we need to add solutions that make things easier for folks where we
can, without adding new items for them to contend with.

Thanks!

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Sun, May 3, 2015 at 8:23 PM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Hey Joe,
>
> Can you elaborate on what you mean by a "stop the world" change? In this
> protocol, we can target notifications to a subset of brokers in the cluster
> (controller if we need to). Is the AdminChangeNotification a ZK
> notification or a request type exposed by each broker?
>
> Thanks,
> Aditya
>
> ________________________________________
> From: Joe Stein [joe.st...@stealth.ly]
> Sent: Friday, May 01, 2015 5:25 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-21 Configuration Management
>
> Hi Aditya, thanks for the write up and focusing on this piece.
>
> Agreed we need something that we can do broker changes dynamically without
> rolling restarts.
>
> I think, though, that if every broker is getting changes via notifications,
> it is going to limit which configs can be dynamic.
>
> We could never deliver a "stop the world" configuration change, because
> that would happen on the entire cluster, on every broker, at the same time.
>
> Can maybe just the controller get the notification?
>
> And we provide a layer for brokers to work with the controller to do the
> config change operations at its discretion (so it can stop things if it
> needs to).
>
> The controller gets the notification, sends an AdminChangeNotification to
> brokers [X .. N], and then the brokers can do their thing, even sending a
> heartbeat response while each takes the few milliseconds it needs, or
> crashes. We need to go through both scenarios.
>
> I am worried that we put this change in like this and it works for quotas
> and maybe a few other things, but nothing else becomes dynamic and we
> don't get far enough toward almost no more rolling restarts.
>
> ~ Joe Stein
> - - - - - - - - - - - - - - - - -
>
>   http://www.stealth.ly
> - - - - - - - - - - - - - - - - -
>
> On Thu, Apr 30, 2015 at 8:14 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
>
> > >    1. I have deep concerns about managing configuration in ZooKeeper.
> > >    First, Producers and Consumers shouldn't depend on ZK at all, this
> > seems
> > >    to add back a dependency we are trying to get away from.
> >
> > The KIP probably needs to be clarified here - I don't think Aditya was
> > referring to client (producer/consumer) configs. These are global
> > client-id-specific configs that need to be managed centrally.
> > (Specifically, quota overrides on a per-client basis).
> >
> >
>
