I did initially think that having everything in ZK was better than the 
dichotomy Joel referred to, primarily because all Kafka configs could be managed 
consistently.

I guess the biggest disadvantage of driving broker config primarily from ZK is 
that it requires everyone to manage Kafka configuration separately from other 
services. Several people have separately mentioned integration issues with 
systems like Puppet and Chef. While they may support pluggable logic, it does 
require everyone to write that additional piece of logic specific to Kafka. We 
would also have to implement a group, fabric, and tag hierarchy (as Ashish mentioned), 
plus auditing and ACL management. While the potential consistency is nice, perhaps 
the tradeoff isn't worth it, given that the resulting system isn't much superior 
to pushing out new config files and is also quite disruptive. Since this 
impacts operations teams the most, I also think their input is probably the 
most valuable and should perhaps drive the outcome.

I also think it is fine to treat topic and client configuration separately 
because they are more like metadata than actual service configuration. 

Aditya
________________________________________
From: Joel Koshy [jjkosh...@gmail.com]
Sent: Monday, May 11, 2015 4:54 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

So the general concern here is the dichotomy of configs (which we
already have - i.e., in the form of broker config files vs topic
configs in zookeeper). We (at LinkedIn) had some discussions on this
last week and put this very question to the operations team, whose
opinion is, I think, to a large degree a touchstone for this decision:
"Has the operations team at LinkedIn experienced any pain so far with
managing topic configs in ZooKeeper (while broker configs are
file-based)?" It turns out that ops overwhelmingly favors the current
approach, i.e., keeping service configs file-based and client/topic
configs in ZooKeeper is intuitive and works great. This may be
somewhat counter-intuitive to devs, but this is one of those decisions
for which ops input is very critical - because for all practical
purposes, they are the users in this discussion.

If we continue with this dichotomy and need to support dynamic config
for client/topic configs as well as select service configs then there
will need to be a dichotomy in the config change mechanism as well.
i.e., client/topic configs will change via (say) a ZooKeeper watch and
the service config will change via a config file re-read (on SIGHUP)
after config changes have been pushed out to local files. Is this a
bad thing? Personally, I don't think it is - i.e. I'm in favor of this
approach. What do others think?
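
To make the file re-read half of that concrete, here is a minimal sketch of
the pattern in Python (not Kafka code): an immutable config snapshot that is
swapped out when the process receives SIGHUP, with listeners notified of the
keys that changed. The names ReloadableConfig, on_change and
install_sighup_handler are made up for illustration, and the parser handles
only plain key=value lines rather than the full Java properties format.

```python
import signal
from types import MappingProxyType


class ReloadableConfig:
    """Hypothetical holder for a read-only snapshot of a key=value file."""

    def __init__(self, path):
        self._path = path
        self._listeners = []
        self._snapshot = self._read()

    def _read(self):
        # Simplified parser: skips blank lines, comments and lines without "=".
        values = {}
        with open(self._path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                values[key.strip()] = value.strip()
        return MappingProxyType(values)  # read-only view of the parsed values

    def get(self, key, default=None):
        return self._snapshot.get(key, default)

    def on_change(self, listener):
        # listener(key, old_value, new_value) is called for each changed key.
        self._listeners.append(listener)

    def reload(self):
        old, new = self._snapshot, self._read()
        self._snapshot = new  # one reference assignment swaps the whole config
        for key in sorted(set(old) | set(new)):
            if old.get(key) != new.get(key):
                for listener in self._listeners:
                    listener(key, old.get(key), new.get(key))


def install_sighup_handler(config):
    """Re-read the file whenever the process receives SIGHUP (POSIX only)."""
    signal.signal(signal.SIGHUP, lambda signum, frame: config.reload())
```

With something like this in place, pushing out a new file and sending
`kill -HUP <pid>` would apply it without a restart. The client/topic side
could invoke the same kind of listeners from a ZooKeeper watch instead of a
signal, which is the dichotomy in change mechanisms described above.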

Thanks,

Joel

On Mon, May 11, 2015 at 11:08:44PM +0300, Gwen Shapira wrote:
> What Todd said :)
>
> (I think my ops background is showing...)
>
> On Mon, May 11, 2015 at 10:17 PM, Todd Palino <tpal...@gmail.com> wrote:
>
> > I understand your point here, Jay, but I disagree that we can't have two
> > configuration systems. We have two different types of configuration
> > information. We have configuration that relates to the service itself (the
> > Kafka broker), and we have configuration that relates to the content within
> > the service (topics). I would put the client configuration (quotas) in
> > with the second part, as it is dynamic information. I just don't see a good
> > argument for effectively degrading the configuration for the service
> > because of trying to keep it paired with the configuration of dynamic
> > resources.
> >
> > -Todd
> >
> > On Mon, May 11, 2015 at 11:33 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >
> > > > I totally agree that ZK is not in-and-of-itself a configuration management
> > > > solution and it would be better if we could just keep all our config in
> > > > files. Anyone who has followed the various config discussions over the past
> > > > few years knows I'm the biggest proponent of immutable
> > > > file-driven config.
> > >
> > > > The analogy to "normal unix services" isn't actually quite right though.
> > > > The problem Kafka has is that a number of the configurable entities it
> > > > manages are added dynamically--topics, clients, consumer groups, etc. What
> > > > this actually resembles is not a unix service like HTTPD but a database,
> > > > and databases typically do manage config dynamically for exactly the same
> > > > reason.
> > >
> > > > The last few emails are arguing that files > ZK as a config solution. I
> > > > agree with this, but that isn't really the question, right? The reality is
> > > > that we need to be able to configure dynamically created entities and we
> > > > won't get a satisfactory solution to that using files (e.g. rsync is not an
> > > > acceptable topic creation mechanism). What we are discussing is having a
> > > > single config mechanism or multiple. If we have multiple you need to solve
> > > > the whole config lifecycle problem for both--management, audit, rollback,
> > > > etc.
> > >
> > > Gwen, you were saying we couldn't get rid of the configuration file, not
> > > sure if I understand. Is that because we need to give the URL for ZK?
> > > Wouldn't the same argument work to say that we can't use configuration
> > > files because we have to specify the file path? I think we can just give
> > > the server the same --zookeeper argument we use everywhere else, right?
> > >
> > > -Jay
> > >
> > > On Sun, May 10, 2015 at 11:28 AM, Todd Palino <tpal...@gmail.com> wrote:
> > >
> > > > > I've been watching this discussion for a while, and I have to jump in and
> > > > > side with Gwen here. I see no benefit to putting the configs into Zookeeper
> > > > > entirely, and a lot of downside. The two biggest problems I have with this
> > > > > are:
> > > >
> > > > > 1) Configuration management. OK, so you can write glue for Chef to put
> > > > > configs into Zookeeper. You also need to write glue for Puppet. And
> > > > > Cfengine. And everything else out there. Files are an industry standard
> > > > > practice, they're how just about everyone handles it, and there are reasons
> > > > > for that, not just "it's the way it's always been done".
> > > >
> > > > > 2) Auditing. Configuration files can easily be managed in a source
> > > > > repository system which tracks what changes were made and who made them. It
> > > > > also easily allows for rolling back to a previous version. Zookeeper does
> > > > > not.
> > > >
> > > > > I see absolutely nothing wrong with putting the quota (client) configs and
> > > > > the topic config overrides in Zookeeper, and keeping everything else
> > > > > exactly where it is, in the configuration file. To handle configurations
> > > > > for the broker that can be changed at runtime without a restart, you can
> > > > > use the industry standard practice of catching SIGHUP and rereading the
> > > > > configuration file at that point.
> > > >
> > > > -Todd
> > > >
> > > >
> > > > > On Sun, May 10, 2015 at 4:00 AM, Gwen Shapira <gshap...@cloudera.com> wrote:
> > > >
> > > > > I am still not clear about the benefits of managing configuration in
> > > > > ZooKeeper vs. keeping the local file and adding a "refresh" mechanism
> > > > > (signal, protocol, zookeeper, or other).
> > > > >
> > > > > > Benefits of staying with configuration file:
> > > > > > 1. In line with pretty much any Linux service that exists, so admins have a
> > > > > > lot of related experience.
> > > > > > 2. Much smaller change to our code-base, so easier to patch, review and
> > > > > > test. Lower risk overall.
> > > > > >
> > > > > > Can you walk me through the benefits of using Zookeeper? Especially since it
> > > > > > looks like we can't get rid of the file entirely?
> > > > >
> > > > > Gwen
> > > > >
> > > > > On Thu, May 7, 2015 at 3:33 AM, Jun Rao <j...@confluent.io> wrote:
> > > > >
> > > > > > > One of the Chef users confirmed that Chef integration could still work if
> > > > > > > all configs are moved to ZK. My rough understanding of how Chef works is
> > > > > > > that a user first registers a service host with a Chef server. After that,
> > > > > > > a Chef client will be run on the service host. The user can then push
> > > > > > > config changes intended for a service/host to the Chef server. The server
> > > > > > > is then responsible for pushing the changes to Chef clients. Chef clients
> > > > > > > support pluggable logic. For example, it can generate a config file that
> > > > > > > Kafka broker will take. If we move all configs to ZK, we can customize the
> > > > > > > Chef client to use our config CLI to make the config changes in Kafka. In
> > > > > > > this model, one probably doesn't need to register every broker in Chef for
> > > > > > > the config push. Not sure if Puppet works in a similar way.
> > > > > >
> > > > > > > Also for storing the configs, we probably can't store the broker/global
> > > > > > > level configs in Kafka itself (e.g. in a special topic). The reason is that
> > > > > > > in order to start a broker, we likely need to make some broker level config
> > > > > > > changes (e.g., the default log.dir may not be present, the default port may
> > > > > > > not be available, etc). If we need a broker to be up to make those changes,
> > > > > > > we get into this chicken and egg problem.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > > On Tue, May 5, 2015 at 4:14 PM, Gwen Shapira <gshap...@cloudera.com> wrote:
> > > > > >
> > > > > > > Sorry I missed the call today :)
> > > > > > >
> > > > > > > > I think an additional requirement would be:
> > > > > > > > Make sure that traditional deployment tools (Puppet, Chef, etc) are still
> > > > > > > > capable of managing Kafka configuration.
> > > > > > > >
> > > > > > > > For this reason, I'd like the configuration refresh to be pretty close to
> > > > > > > > what most Linux services are doing to force a reload of configuration.
> > > > > > > > AFAIK, this involves handling HUP signal in the main thread to reload
> > > > > > > > configuration. Then packaging scripts can add something nice like "service
> > > > > > > > kafka reload".
> > > > > > > >
> > > > > > > > (See Apache web server:
> > > > > > > > https://github.com/apache/httpd/blob/trunk/build/rpm/httpd.init#L101)
> > > > > > >
> > > > > > > Gwen
> > > > > > >
> > > > > > >
> > > > > > > > On Tue, May 5, 2015 at 8:54 AM, Joel Koshy <jjkosh...@gmail.com> wrote:
> > > > > > >
> > > > > > > > > Good discussion. Since we will be talking about this at 11am, I wanted
> > > > > > > > > to organize these comments into requirements to see if we are all on
> > > > > > > > > the same page.
> > > > > > > > >
> > > > > > > > > REQUIREMENT 1: Needs to accept dynamic config changes. This needs to
> > > > > > > > > be general enough to work for all configs that we envision may need to
> > > > > > > > > accept changes at runtime, e.g., log (topic), broker, client (quotas),
> > > > > > > > > etc. Possible options include:
> > > > > > > > > - ZooKeeper watcher
> > > > > > > > > - Kafka topic
> > > > > > > > > - Direct RPC to controller (or config coordinator)
> > > > > > > > >
> > > > > > > > > The current KIP is really focused on REQUIREMENT 1 and I think that is
> > > > > > > > > reasonable as long as we don't come up with something that requires
> > > > > > > > > significant re-engineering to support the other requirements.
> > > > > > > > >
> > > > > > > > > REQUIREMENT 2: Provide consistency of configs across brokers (modulo
> > > > > > > > > per-broker overrides) or at least be able to verify consistency. What
> > > > > > > > > this effectively means is that config changes must be seen by all
> > > > > > > > > brokers eventually and we should be able to easily compare the full
> > > > > > > > > config of each broker.
> > > > > > > > >
> > > > > > > > > REQUIREMENT 3: Central config store. Needs to work with plain
> > > > > > > > > file-based configs and other systems (e.g., puppet). Ideally, should
> > > > > > > > > not bring in other dependencies (e.g., a DB). Possible options:
> > > > > > > > > - ZooKeeper
> > > > > > > > > - Kafka topic
> > > > > > > > > - other? E.g. making it pluggable?
> > > > > > > > >
> > > > > > > > > Any other requirements?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Joel
> > > > > > > >
> > > > > > > > > On Tue, May 05, 2015 at 01:38:09AM +0000, Aditya Auradkar wrote:
> > > > > > > > > Hey Neha,
> > > > > > > > >
> > > > > > > > > Thanks for the feedback.
> > > > > > > > > > 1. In my earlier exchange with Jay, I mentioned the broker writing all
> > > > > > > > > > its configs to ZK (while respecting the overrides). Then ZK can be used to
> > > > > > > > > > view all configs.
> > > > > > > > > >
> > > > > > > > > > 2. Need to think about this a bit more. Perhaps we can discuss this
> > > > > > > > > > during the hangout tomorrow?
> > > > > > > > > >
> > > > > > > > > > 3 & 4) I viewed these config changes as mainly administrative
> > > > > > > > > > operations. In that case, it may be reasonable to assume that the ZK port is
> > > > > > > > > > available for communication from the machine where these commands are run.
> > > > > > > > > > Having a ConfigChangeRequest (or similar) is nice to have but having a new
> > > > > > > > > > API and sending requests to controller also changes how we do topic based
> > > > > > > > > > configuration currently. I was hoping to keep this KIP as minimal as
> > > > > > > > > > possible and provide a means to represent and modify client and broker
> > > > > > > > > > based configs in a central place. Are there any concerns if we tackle these
> > > > > > > > > > things in a later KIP?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Aditya
> > > > > > > > > ________________________________________
> > > > > > > > > From: Neha Narkhede [n...@confluent.io]
> > > > > > > > > Sent: Sunday, May 03, 2015 9:48 AM
> > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > Subject: Re: [DISCUSS] KIP-21 Configuration Management
> > > > > > > > >
> > > > > > > > > > Thanks for starting this discussion, Aditya. Few questions/comments
> > > > > > > > > >
> > > > > > > > > > 1. If you change the default values like it's mentioned in the KIP, do you
> > > > > > > > > > also overwrite the local config file as part of updating the default value?
> > > > > > > > > > If not, where does the admin look to find the default values, ZK or local
> > > > > > > > > > Kafka config file? What if a config value is different in both places?
> > > > > > > > >
> > > > > > > > > > 2. I share Gwen's concern around making sure that popular config management
> > > > > > > > > > tools continue to work with this change. Would love to see how each of
> > > > > > > > > > those would work with the proposal in the KIP. I don't know enough about
> > > > > > > > > > each of the tools but seems like in some of the tools, you have to define
> > > > > > > > > > some sort of class with parameter names as config names. How will such
> > > > > > > > > > tools find out about the config values? In Puppet, if this means that each
> > > > > > > > > > Puppet agent has to read it from ZK, this means the ZK port has to be open
> > > > > > > > > > to pretty much every machine in the DC. This is a bummer and a very
> > > > > > > > > > confusing requirement. Not sure if this is really a problem or not (each of
> > > > > > > > > > those tools might behave differently), though pointing out that this is
> > > > > > > > > > something worth paying attention to.
> > > > > > > > >
> > > > > > > > > > 3. The wrapper tools that let users read/change configs should not
> > > > > > > > > > depend on ZK for the reason mentioned above. It's a pain to assume that the
> > > > > > > > > > ZK port is open from any machine that needs to run this tool. Ideally what
> > > > > > > > > > users want is a REST API to the brokers to change or read the config (ala
> > > > > > > > > > Elasticsearch), but in the absence of the REST API, we should think if we
> > > > > > > > > > can write the tool such that it just requires talking to the Kafka broker
> > > > > > > > > > port. This will require a config RPC.
> > > > > > > > >
> > > > > > > > > > 4. Not sure if KIP is the right place to discuss the design of propagating
> > > > > > > > > > the config changes to the brokers, but have you thought about just letting
> > > > > > > > > > the controller oversee the config changes and propagate via RPC to the
> > > > > > > > > > brokers? That way, there is an easier way to express config changes that
> > > > > > > > > > require all brokers to change it for it to be called complete. Maybe this
> > > > > > > > > > is not required, but it is hard to say if we don't discuss the full set of
> > > > > > > > > > configs that need to be dynamic.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Neha
> > > > > > > > >
> > > > > > > > > > On Fri, May 1, 2015 at 12:53 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hey Aditya,
> > > > > > > > > >
> > > > > > > > > > > This is great! A couple of comments:
> > > > > > > > > > >
> > > > > > > > > > > 1. Leaving the file config in place is definitely the least disturbance.
> > > > > > > > > > > But let's really think about getting rid of the files and just have one
> > > > > > > > > > > config mechanism. There is always a tendency to make everything pluggable
> > > > > > > > > > > which so often just leads to two mediocre solutions. Can we do the exercise
> > > > > > > > > > > of trying to consider fully getting rid of file config and seeing what goes
> > > > > > > > > > > wrong?
> > > > > > > > > > >
> > > > > > > > > > > 2. Do we need to model defaults? The current approach is that if you have a
> > > > > > > > > > > global config x it is overridden for a topic xyz by /topics/xyz/x, and I
> > > > > > > > > > > think this could be extended to /brokers/0/x. I think this is simpler. We
> > > > > > > > > > > need to specify the precedence for these overrides, e.g. if you override at
> > > > > > > > > > > the broker and topic level I think the topic level takes precedence.
> > > > > > > > > > >
> > > > > > > > > > > 3. I recommend we have the producer and consumer config just be an override
> > > > > > > > > > > under client.id. The override is by client id and we can have separate
> > > > > > > > > > > properties for controlling quotas for producers and consumers.
> > > > > > > > > > >
> > > > > > > > > > > 4. Some configs can be changed just by updating the reference, others may
> > > > > > > > > > > require some action. An example of this is if you want to disable log
> > > > > > > > > > > compaction (assuming we wanted to make that dynamic) we need to call
> > > > > > > > > > > shutdown() on the cleaner. I think it may be required to register a
> > > > > > > > > > > listener callback that gets called when the config changes.
> > > > > > > > > > >
> > > > > > > > > > > 5. For handling the reference can you explain your plan a bit? Currently we
> > > > > > > > > > > have an immutable KafkaConfig object with a bunch of vals. That or
> > > > > > > > > > > individual values in there get injected all over the code base. I was
> > > > > > > > > > > thinking something like this:
> > > > > > > > > > > a. We retain the KafkaConfig object as an immutable object just as today.
> > > > > > > > > > > b. It is no longer legit to grab values out of that config if they are
> > > > > > > > > > > changeable.
> > > > > > > > > > > c. Instead of making KafkaConfig itself mutable we make KafkaConfiguration
> > > > > > > > > > > which has a single volatile reference to the current KafkaConfig.
> > > > > > > > > > > KafkaConfiguration is what gets passed into various components. So to
> > > > > > > > > > > access a config you do something like config.instance.myValue. When the
> > > > > > > > > > > config changes the config manager updates this reference.
> > > > > > > > > > > d. The KafkaConfiguration is the thing that allows doing the
> > > > > > > > > > > configuration.onChange("my.config", callback)
> > > > > > > > > >
> > > > > > > > > > -Jay
> > > > > > > > > >
> > > > > > > > > > > On Tue, Apr 28, 2015 at 3:57 PM, Aditya Auradkar <aaurad...@linkedin.com.invalid> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hey everyone,
> > > > > > > > > > >
> > > > > > > > > > > > Wrote up a KIP to update topic, client and broker configs dynamically via
> > > > > > > > > > > > Zookeeper.
> > > > > > > > > > > >
> > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
> > > > > > > > > > >
> > > > > > > > > > > Please read and provide feedback.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Aditya
> > > > > > > > > > >
> > > > > > > > > > > > PS: I've intentionally kept this discussion separate from KIP-5 since I'm
> > > > > > > > > > > > not sure if that is actively being worked on and I wanted to start with a
> > > > > > > > > > > > clean slate.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Thanks,
> > > > > > > > > Neha
> > > > > > > >
> > > > > > > > --
> > > > > > > > Joel
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
