I like this approach (obviously).
I am also OK with supporting broker re-read of the config file based on a ZK
watch instead of SIGHUP, if we see this as more consistent with the rest of
our code base.

Either is fine by me as long as brokers keep the file and just do refresh :)
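
For concreteness, here is a rough sketch of what the file re-read could look
like on the broker side, whichever trigger we pick (SIGHUP or a ZK watch).
This is only an illustration of the shape - the reload hook and the use of
sun.misc.Signal are my assumptions, not existing Kafka APIs:

    import java.io.FileInputStream
    import java.util.Properties
    import sun.misc.{Signal, SignalHandler}

    // Hypothetical sketch: re-read the broker properties file on SIGHUP and
    // hand the new snapshot to a hook that applies only the configs we decide
    // to make dynamically changeable.
    object ConfigReloader {
      def install(configPath: String, applyDynamic: Properties => Unit): Unit = {
        Signal.handle(new Signal("HUP"), new SignalHandler {
          override def handle(sig: Signal): Unit = {
            val props = new Properties()
            val in = new FileInputStream(configPath)
            try props.load(in) finally in.close()
            applyDynamic(props)
          }
        })
      }
    }

The same applyDynamic hook could be invoked from a ZooKeeper watch callback
instead, which is why I don't feel strongly about either trigger.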

On Tue, May 12, 2015 at 2:54 AM, Joel Koshy <jjkosh...@gmail.com> wrote:

> So the general concern here is the dichotomy of configs (which we
> already have - i.e., in the form of broker config files vs topic
> configs in zookeeper). We (at LinkedIn) had some discussions on this
> last week and had this very question for the operations team whose
> opinion is I think to a large degree a touchstone for this decision:
> "Has the operations team at LinkedIn experienced any pain so far with
> managing topic configs in ZooKeeper (while broker configs are
> file-based)?" It turns out that ops overwhelmingly favors the current
> approach - i.e., service configs as file-based configs and client/topic
> configs in ZooKeeper is intuitive and works great. This may be
> somewhat counter-intuitive to devs, but this is one of those decisions
> for which ops input is very critical - because for all practical
> purposes, they are the users in this discussion.
>
> If we continue with this dichotomy and need to support dynamic config
> for client/topic configs as well as select service configs then there
> will need to be a dichotomy in the config change mechanism as well.
> i.e., client/topic configs will change via (say) a ZooKeeper watch and
> the service config will change via a config file re-read (on SIGHUP)
> after config changes have been pushed out to local files. Is this a
> bad thing? Personally, I don't think it is - i.e. I'm in favor of this
> approach. What do others think?
>
> Thanks,
>
> Joel
>
> On Mon, May 11, 2015 at 11:08:44PM +0300, Gwen Shapira wrote:
> > What Todd said :)
> >
> > (I think my ops background is showing...)
> >
> > On Mon, May 11, 2015 at 10:17 PM, Todd Palino <tpal...@gmail.com> wrote:
> >
> > > I understand your point here, Jay, but I disagree that we can't have two
> > > configuration systems. We have two different types of configuration
> > > information. We have configuration that relates to the service itself (the
> > > Kafka broker), and we have configuration that relates to the content within
> > > the service (topics). I would put the client configuration (quotas) in with
> > > the second part, as it is dynamic information. I just don't see a good
> > > argument for effectively degrading the configuration for the service
> > > because of trying to keep it paired with the configuration of dynamic
> > > resources.
> > >
> > > -Todd
> > >
> > > On Mon, May 11, 2015 at 11:33 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > >
> > > > I totally agree that ZK is not in-and-of-itself a configuration
> > > > management solution and it would be better if we could just keep all our
> > > > config in files. Anyone who has followed the various config discussions
> > > > over the past few years knows I'm the biggest proponent of immutable
> > > > file-driven config.
> > > >
> > > > The analogy to "normal unix services" isn't actually quite right though.
> > > > The problem Kafka has is that a number of the configurable entities it
> > > > manages are added dynamically--topics, clients, consumer groups, etc.
> > > > What this actually resembles is not a unix service like HTTPD but a
> > > > database, and databases typically do manage config dynamically for
> > > > exactly the same reason.
> > > >
> > > > The last few emails are arguing that files > ZK as a config solution. I
> > > > agree with this, but that isn't really the question, right? The reality
> > > > is that we need to be able to configure dynamically created entities and
> > > > we won't get a satisfactory solution to that using files (e.g. rsync is
> > > > not an acceptable topic creation mechanism). What we are discussing is
> > > > having a single config mechanism or multiple. If we have multiple you
> > > > need to solve the whole config lifecycle problem for both--management,
> > > > audit, rollback, etc.
> > > >
> > > > Gwen, you were saying we couldn't get rid of the configuration file, not
> > > > sure if I understand. Is that because we need to give the URL for ZK?
> > > > Wouldn't the same argument work to say that we can't use configuration
> > > > files because we have to specify the file path? I think we can just give
> > > > the server the same --zookeeper argument we use everywhere else, right?
> > > >
> > > > -Jay
> > > >
> > > > On Sun, May 10, 2015 at 11:28 AM, Todd Palino <tpal...@gmail.com> wrote:
> > > >
> > > > > I've been watching this discussion for a while, and I have to jump in
> > > > > and side with Gwen here. I see no benefit to putting the configs into
> > > > > Zookeeper entirely, and a lot of downside. The two biggest problems I
> > > > > have with this are:
> > > > >
> > > > > 1) Configuration management. OK, so you can write glue for Chef to put
> > > > > configs into Zookeeper. You also need to write glue for Puppet. And
> > > > > Cfengine. And everything else out there. Files are an industry standard
> > > > > practice, they're how just about everyone handles it, and there's
> > > > > reasons for that, not just "it's the way it's always been done".
> > > > >
> > > > > 2) Auditing. Configuration files can easily be managed in a source
> > > > > repository system which tracks what changes were made and who made
> > > > > them. It also easily allows for rolling back to a previous version.
> > > > > Zookeeper does not.
> > > > >
> > > > > I see absolutely nothing wrong with putting the quota (client) configs
> > > > > and the topic config overrides in Zookeeper, and keeping everything
> > > > > else exactly where it is, in the configuration file. To handle
> > > > > configurations for the broker that can be changed at runtime without a
> > > > > restart, you can use the industry standard practice of catching SIGHUP
> > > > > and rereading the configuration file at that point.
> > > > >
> > > > > -Todd
> > > > >
> > > > >
> > > > > On Sun, May 10, 2015 at 4:00 AM, Gwen Shapira <gshap...@cloudera.com> wrote:
> > > > >
> > > > > > I am still not clear about the benefits of managing configuration in
> > > > > > ZooKeeper vs. keeping the local file and adding a "refresh" mechanism
> > > > > > (signal, protocol, zookeeper, or other).
> > > > > >
> > > > > > Benefits of staying with configuration file:
> > > > > > 1. In line with pretty much any Linux service that exists, so admins
> > > > > > have a lot of related experience.
> > > > > > 2. Much smaller change to our code-base, so easier to patch, review
> > > > > > and test. Lower risk overall.
> > > > > >
> > > > > > Can you walk me through the benefits of using Zookeeper? Especially
> > > > > > since it looks like we can't get rid of the file entirely?
> > > > > >
> > > > > > Gwen
> > > > > >
> > > > > > On Thu, May 7, 2015 at 3:33 AM, Jun Rao <j...@confluent.io> wrote:
> > > > > >
> > > > > > > One of the Chef users confirmed that Chef integration could still
> > > > > > > work if all configs are moved to ZK. My rough understanding of how
> > > > > > > Chef works is that a user first registers a service host with a Chef
> > > > > > > server. After that, a Chef client will be run on the service host.
> > > > > > > The user can then push config changes intended for a service/host to
> > > > > > > the Chef server. The server is then responsible for pushing the
> > > > > > > changes to Chef clients. Chef clients support pluggable logic. For
> > > > > > > example, it can generate a config file that the Kafka broker will
> > > > > > > take. If we move all configs to ZK, we can customize the Chef client
> > > > > > > to use our config CLI to make the config changes in Kafka. In this
> > > > > > > model, one probably doesn't need to register every broker in Chef
> > > > > > > for the config push. Not sure if Puppet works in a similar way.
> > > > > > >
> > > > > > > Also for storing the configs, we probably can't store the
> > > > > > > broker/global level configs in Kafka itself (e.g. in a special
> > > > > > > topic). The reason is that in order to start a broker, we likely
> > > > > > > need to make some broker level config changes (e.g., the default
> > > > > > > log.dir may not be present, the default port may not be available,
> > > > > > > etc). If we need a broker to be up to make those changes, we get
> > > > > > > into this chicken and egg problem.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > > On Tue, May 5, 2015 at 4:14 PM, Gwen Shapira <gshap...@cloudera.com> wrote:
> > > > > > >
> > > > > > > > Sorry I missed the call today :)
> > > > > > > >
> > > > > > > > I think an additional requirement would be:
> > > > > > > > Make sure that traditional deployment tools (Puppet, Chef, etc)
> > > > > > > > are still capable of managing Kafka configuration.
> > > > > > > >
> > > > > > > > For this reason, I'd like the configuration refresh to be pretty
> > > > > > > > close to what most Linux services are doing to force a reload of
> > > > > > > > configuration. AFAIK, this involves handling HUP signal in the
> > > > > > > > main thread to reload configuration. Then packaging scripts can
> > > > > > > > add something nice like "service kafka reload".
> > > > > > > >
> > > > > > > > (See Apache web server:
> > > > > > > > https://github.com/apache/httpd/blob/trunk/build/rpm/httpd.init#L101)
> > > > > > > >
> > > > > > > > Gwen
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, May 5, 2015 at 8:54 AM, Joel Koshy <jjkosh...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Good discussion. Since we will be talking about this at 11am, I
> > > > > > > > > wanted to organize these comments into requirements to see if we
> > > > > > > > > are all on the same page.
> > > > > > > > >
> > > > > > > > > REQUIREMENT 1: Needs to accept dynamic config changes. This
> > > > > > > > > needs to be general enough to work for all configs that we
> > > > > > > > > envision may need to accept changes at runtime, e.g., log
> > > > > > > > > (topic), broker, client (quotas), etc. Possible options include:
> > > > > > > > > - ZooKeeper watcher
> > > > > > > > > - Kafka topic
> > > > > > > > > - Direct RPC to controller (or config coordinator)
> > > > > > > > >
> > > > > > > > > The current KIP is really focused on REQUIREMENT 1 and I think
> > > > > > > > > that is reasonable as long as we don't come up with something
> > > > > > > > > that requires significant re-engineering to support the other
> > > > > > > > > requirements.
> > > > > > > > >
> > > > > > > > > REQUIREMENT 2: Provide consistency of configs across brokers
> > > > > > > > > (modulo per-broker overrides) or at least be able to verify
> > > > > > > > > consistency. What this effectively means is that config changes
> > > > > > > > > must be seen by all brokers eventually and we should be able to
> > > > > > > > > easily compare the full config of each broker.
> > > > > > > > >
> > > > > > > > > REQUIREMENT 3: Central config store. Needs to work with plain
> > > > > > > > > file-based configs and other systems (e.g., puppet). Ideally,
> > > > > > > > > should not bring in other dependencies (e.g., a DB). Possible
> > > > > > > > > options:
> > > > > > > > > - ZooKeeper
> > > > > > > > > - Kafka topic
> > > > > > > > > - other? E.g. making it pluggable?
> > > > > > > > >
> > > > > > > > > Any other requirements?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Joel
> > > > > > > > >
> > > > > > > > > On Tue, May 05, 2015 at 01:38:09AM +0000, Aditya Auradkar wrote:
> > > > > > > > > > Hey Neha,
> > > > > > > > > >
> > > > > > > > > > Thanks for the feedback.
> > > > > > > > > > 1. In my earlier exchange with Jay, I mentioned the broker
> > > > > > > > > > writing all its configs to ZK (while respecting the
> > > > > > > > > > overrides). Then ZK can be used to view all configs.
> > > > > > > > > >
> > > > > > > > > > 2. Need to think about this a bit more. Perhaps we can discuss
> > > > > > > > > > this during the hangout tomorrow?
> > > > > > > > > >
> > > > > > > > > > 3 & 4) I viewed these config changes as mainly administrative
> > > > > > > > > > operations. In that case, it may be reasonable to assume that
> > > > > > > > > > the ZK port is available for communication from the machine
> > > > > > > > > > these commands are run on. Having a ConfigChangeRequest (or
> > > > > > > > > > similar) is nice to have, but having a new API and sending
> > > > > > > > > > requests to the controller would also change how we do
> > > > > > > > > > topic-based configuration currently. I was hoping to keep this
> > > > > > > > > > KIP as minimal as possible and provide a means to represent
> > > > > > > > > > and modify client and broker based configs in a central place.
> > > > > > > > > > Are there any concerns if we tackle these things in a later
> > > > > > > > > > KIP?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Aditya
> > > > > > > > > > ________________________________________
> > > > > > > > > > From: Neha Narkhede [n...@confluent.io]
> > > > > > > > > > Sent: Sunday, May 03, 2015 9:48 AM
> > > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > > Subject: Re: [DISCUSS] KIP-21 Configuration Management
> > > > > > > > > >
> > > > > > > > > > Thanks for starting this discussion, Aditya. A few
> > > > > > > > > > questions/comments:
> > > > > > > > > >
> > > > > > > > > > 1. If you change the default values like it's mentioned in the
> > > > > > > > > > KIP, do you also overwrite the local config file as part of
> > > > > > > > > > updating the default value? If not, where does the admin look
> > > > > > > > > > to find the default values, ZK or local Kafka config file?
> > > > > > > > > > What if a config value is different in both places?
> > > > > > > > > >
> > > > > > > > > > 2. I share Gwen's concern around making sure that popular
> > > > > > > > > > config management tools continue to work with this change.
> > > > > > > > > > Would love to see how each of those would work with the
> > > > > > > > > > proposal in the KIP. I don't know enough about each of the
> > > > > > > > > > tools but seems like in some of the tools, you have to define
> > > > > > > > > > some sort of class with parameter names as config names. How
> > > > > > > > > > will such tools find out about the config values? In Puppet,
> > > > > > > > > > if this means that each Puppet agent has to read it from ZK,
> > > > > > > > > > this means the ZK port has to be open to pretty much every
> > > > > > > > > > machine in the DC. This is a bummer and a very confusing
> > > > > > > > > > requirement. Not sure if this is really a problem or not (each
> > > > > > > > > > of those tools might behave differently), though pointing out
> > > > > > > > > > that this is something worth paying attention to.
> > > > > > > > > >
> > > > > > > > > > 3. The wrapper tools that let users read/change configs should
> > > > > > > > > > not depend on ZK for the reason mentioned above. It's a pain
> > > > > > > > > > to assume that the ZK port is open from any machine that needs
> > > > > > > > > > to run this tool. Ideally what users want is a REST API to the
> > > > > > > > > > brokers to change or read the config (ala Elasticsearch), but
> > > > > > > > > > in the absence of the REST API, we should think if we can
> > > > > > > > > > write the tool such that it just requires talking to the Kafka
> > > > > > > > > > broker port. This will require a config RPC.
> > > > > > > > > >
> > > > > > > > > > 4. Not sure if the KIP is the right place to discuss the
> > > > > > > > > > design of propagating the config changes to the brokers, but
> > > > > > > > > > have you thought about just letting the controller oversee the
> > > > > > > > > > config changes and propagate via RPC to the brokers? That way,
> > > > > > > > > > there is an easier way to express config changes that require
> > > > > > > > > > all brokers to change it for it to be called complete. Maybe
> > > > > > > > > > this is not required, but it is hard to say if we don't
> > > > > > > > > > discuss the full set of configs that need to be dynamic.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Neha
> > > > > > > > > >
> > > > > > > > > > On Fri, May 1, 2015 at 12:53 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hey Aditya,
> > > > > > > > > > >
> > > > > > > > > > > This is great! A couple of comments:
> > > > > > > > > > >
> > > > > > > > > > > 1. Leaving the file config in place is definitely the least
> > > > > > > > > > > disturbance. But let's really think about getting rid of the
> > > > > > > > > > > files and just have one config mechanism. There is always a
> > > > > > > > > > > tendency to make everything pluggable which so often just
> > > > > > > > > > > leads to two mediocre solutions. Can we do the exercise of
> > > > > > > > > > > trying to consider fully getting rid of file config and
> > > > > > > > > > > seeing what goes wrong?
> > > > > > > > > > >
> > > > > > > > > > > 2. Do we need to model defaults? The current approach is
> > > > > > > > > > > that if you have a global config x it is overridden for a
> > > > > > > > > > > topic xyz by /topics/xyz/x, and I think this could be
> > > > > > > > > > > extended to /brokers/0/x. I think this is simpler. We need
> > > > > > > > > > > to specify the precedence for these overrides, e.g. if you
> > > > > > > > > > > override at the broker and topic level I think the topic
> > > > > > > > > > > level takes precedence.
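> > > > > > > > > > >
> > > > > > > > > > > To make that precedence concrete, roughly (just an
> > > > > > > > > > > illustrative sketch - the helper name and the map-based
> > > > > > > > > > > lookup are placeholders, not a proposed API):
> > > > > > > > > > >
> > > > > > > > > > >   // Illustration only: topic-level override wins over the
> > > > > > > > > > >   // broker-level override, which wins over the global default.
> > > > > > > > > > >   def effectiveValue(key: String,
> > > > > > > > > > >                      globalDefaults: Map[String, String],
> > > > > > > > > > >                      brokerOverrides: Map[String, String],
> > > > > > > > > > >                      topicOverrides: Map[String, String]): Option[String] =
> > > > > > > > > > >     topicOverrides.get(key)
> > > > > > > > > > >       .orElse(brokerOverrides.get(key))
> > > > > > > > > > >       .orElse(globalDefaults.get(key))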
> > > > > > > > > > >
> > > > > > > > > > > 3. I recommend we have the producer and consumer config
> > > > > > > > > > > just be an override under client.id. The override is by
> > > > > > > > > > > client id and we can have separate properties for
> > > > > > > > > > > controlling quotas for producers and consumers.
> > > > > > > > > > >
> > > > > > > > > > > 4. Some configs can be changed just by updating the
> > > > > > > > > > > reference, others may require some action. An example of
> > > > > > > > > > > this is if you want to disable log compaction (assuming we
> > > > > > > > > > > wanted to make that dynamic) we need to call shutdown() on
> > > > > > > > > > > the cleaner. I think it may be required to register a
> > > > > > > > > > > listener callback that gets called when the config changes.
> > > > > > > > > > >
> > > > > > > > > > > 5. For handling the reference can you explain your plan a
> > > > > > > > > > > bit? Currently we have an immutable KafkaConfig object with
> > > > > > > > > > > a bunch of vals. That or individual values in there get
> > > > > > > > > > > injected all over the code base. I was thinking something
> > > > > > > > > > > like this:
> > > > > > > > > > > a. We retain the KafkaConfig object as an immutable object
> > > > > > > > > > > just as today.
> > > > > > > > > > > b. It is no longer legit to grab values out of that config
> > > > > > > > > > > if they are changeable.
> > > > > > > > > > > c. Instead of making KafkaConfig itself mutable we make
> > > > > > > > > > > KafkaConfiguration which has a single volatile reference to
> > > > > > > > > > > the current KafkaConfig. KafkaConfiguration is what gets
> > > > > > > > > > > passed into various components. So to access a config you
> > > > > > > > > > > do something like config.instance.myValue. When the config
> > > > > > > > > > > changes the config manager updates this reference.
> > > > > > > > > > > d. The KafkaConfiguration is the thing that allows doing the
> > > > > > > > > > > configuration.onChange("my.config", callback)
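> > > > > > > > > > >
> > > > > > > > > > > Purely as a sketch of the shape I have in mind (class and
> > > > > > > > > > > method names are placeholders, not a finished design):
> > > > > > > > > > >
> > > > > > > > > > >   import java.util.concurrent.CopyOnWriteArrayList
> > > > > > > > > > >   import scala.collection.JavaConverters._
> > > > > > > > > > >
> > > > > > > > > > >   // Immutable snapshot (stand-in for the real KafkaConfig).
> > > > > > > > > > >   class KafkaConfig(val props: Map[String, String])
> > > > > > > > > > >
> > > > > > > > > > >   // Holds a volatile reference to the current snapshot;
> > > > > > > > > > >   // components read through it and can register callbacks.
> > > > > > > > > > >   class KafkaConfiguration(initial: KafkaConfig) {
> > > > > > > > > > >     @volatile private var current: KafkaConfig = initial
> > > > > > > > > > >     private val listeners =
> > > > > > > > > > >       new CopyOnWriteArrayList[(String, KafkaConfig => Unit)]()
> > > > > > > > > > >
> > > > > > > > > > >     def instance: KafkaConfig = current
> > > > > > > > > > >
> > > > > > > > > > >     def onChange(key: String, callback: KafkaConfig => Unit): Unit =
> > > > > > > > > > >       listeners.add((key, callback))
> > > > > > > > > > >
> > > > > > > > > > >     // Called by the config manager (e.g. from a ZK watch)
> > > > > > > > > > >     // with a freshly built snapshot.
> > > > > > > > > > >     def update(next: KafkaConfig): Unit = {
> > > > > > > > > > >       val previous = current
> > > > > > > > > > >       current = next
> > > > > > > > > > >       for ((key, cb) <- listeners.asScala
> > > > > > > > > > >            if previous.props.get(key) != next.props.get(key)) cb(next)
> > > > > > > > > > >     }
> > > > > > > > > > >   }
> > > > > > > > > > >
> > > > > > > > > > > With something like that, the log cleaner example in 4 is
> > > > > > > > > > > just an onChange callback on the compaction setting.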
> > > > > > > > > > >
> > > > > > > > > > > -Jay
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Apr 28, 2015 at 3:57 PM, Aditya Auradkar <aaurad...@linkedin.com.invalid> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hey everyone,
> > > > > > > > > > > >
> > > > > > > > > > > > Wrote up a KIP to update topic, client and broker configs
> > > > > > > > > > > > dynamically via Zookeeper.
> > > > > > > > > > > >
> > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
> > > > > > > > > > > >
> > > > > > > > > > > > Please read and provide feedback.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Aditya
> > > > > > > > > > > >
> > > > > > > > > > > > PS: I've intentionally kept this discussion separate from
> > > > > > > > > > > > KIP-5 since I'm not sure if that is actively being worked
> > > > > > > > > > > > on and I wanted to start with a clean slate.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Thanks,
> > > > > > > > > > Neha
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Joel
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
>
>
