I agree with Joel's suggestion to keep the broker's configs in the config file and the client/topic configs in ZK. A few other projects, Apache Solr for one, also do something similar for their configurations.

On Monday, May 11, 2015, Gwen Shapira <gshap...@cloudera.com> wrote:
> I like this approach (obviously).
> I am also OK with supporting broker re-read of config file based on ZK watch instead of SIGHUP, if we see this as more consistent with the rest of our code base.
>
> Either is fine by me as long as brokers keep the file and just do refresh :)
>
> On Tue, May 12, 2015 at 2:54 AM, Joel Koshy <jjkosh...@gmail.com> wrote:
>
> > So the general concern here is the dichotomy of configs (which we already have - i.e., in the form of broker config files vs topic configs in zookeeper). We (at LinkedIn) had some discussions on this last week and had this very question for the operations team whose opinion is I think to a large degree a touchstone for this decision: "Has the operations team at LinkedIn experienced any pain so far with managing topic configs in ZooKeeper (while broker configs are file-based)?" It turns out that ops overwhelmingly favors the current approach. i.e., service configs as file-based configs and client/topic configs in ZooKeeper is intuitive and works great. This may be somewhat counter-intuitive to devs, but this is one of those decisions for which ops input is very critical - because for all practical purposes, they are the users in this discussion.
> >
> > If we continue with this dichotomy and need to support dynamic config for client/topic configs as well as select service configs then there will need to be dichotomy in the config change mechanism as well. i.e., client/topic configs will change via (say) a ZooKeeper watch and the service config will change via a config file re-read (on SIGHUP) after config changes have been pushed out to local files. Is this a bad thing? Personally, I don't think it is - i.e. I'm in favor of this approach. What do others think?
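
The file-based half of the split Joel describes (push the file out, then signal the process to re-read it) could look roughly like the sketch below. It is a minimal illustration in Python, not broker code: `parse_properties`, `FileConfig`, and `install_sighup_reload` are hypothetical names, the parsing handles only plain `key=value` lines, and SIGHUP is available on POSIX systems only.

```python
import signal


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


class FileConfig:
    """Holds the latest snapshot parsed from a local properties file."""

    def __init__(self, path):
        self.path = path
        self.current = {}
        self.reload()

    def reload(self, signum=None, frame=None):
        # Parse the whole file first, then swap the reference in one step,
        # so readers never observe a partially loaded config.
        with open(self.path) as f:
            self.current = parse_properties(f.read())


def install_sighup_reload(cfg):
    """Re-read the file whenever the process receives SIGHUP (POSIX only)."""
    signal.signal(signal.SIGHUP, cfg.reload)
```

With something like this in place, a deployment tool would rewrite the local file and then send the process a HUP signal (for example via `kill -HUP <pid>`); the client/topic side would be driven by a ZooKeeper watch instead and is not shown here.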
> >
> > Thanks,
> >
> > Joel
> >
> > On Mon, May 11, 2015 at 11:08:44PM +0300, Gwen Shapira wrote:
> > > What Todd said :)
> > >
> > > (I think my ops background is showing...)
> > >
> > > On Mon, May 11, 2015 at 10:17 PM, Todd Palino <tpal...@gmail.com> wrote:
> > >
> > > > I understand your point here, Jay, but I disagree that we can't have two configuration systems. We have two different types of configuration information. We have configuration that relates to the service itself (the Kafka broker), and we have configuration that relates to the content within the service (topics). I would put the client configuration (quotas) in with the second part, as it is dynamic information. I just don't see a good argument for effectively degrading the configuration for the service because of trying to keep it paired with the configuration of dynamic resources.
> > > >
> > > > -Todd
> > > >
> > > > On Mon, May 11, 2015 at 11:33 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > > >
> > > > > I totally agree that ZK is not in-and-of-itself a configuration management solution and it would be better if we could just keep all our config in files. Anyone who has followed the various config discussions over the past few years knows I'm the biggest proponent of immutable file-driven config.
> > > > >
> > > > > The analogy to "normal unix services" isn't actually quite right though. The problem Kafka has is that a number of the configurable entities it manages are added dynamically--topics, clients, consumer groups, etc. What this actually resembles is not a unix service like HTTPD but a database, and databases typically do manage config dynamically for exactly the same reason.
> > > > >
> > > > > The last few emails are arguing that files > ZK as a config solution. I agree with this, but that isn't really the question, right? The reality is that we need to be able to configure dynamically created entities and we won't get a satisfactory solution to that using files (e.g. rsync is not an acceptable topic creation mechanism). What we are discussing is having a single config mechanism or multiple. If we have multiple you need to solve the whole config lifecycle problem for both--management, audit, rollback, etc.
> > > > >
> > > > > Gwen, you were saying we couldn't get rid of the configuration file, not sure if I understand. Is that because we need to give the URL for ZK? Wouldn't the same argument work to say that we can't use configuration files because we have to specify the file path? I think we can just give the server the same --zookeeper argument we use everywhere else, right?
> > > > >
> > > > > -Jay
> > > > >
> > > > > On Sun, May 10, 2015 at 11:28 AM, Todd Palino <tpal...@gmail.com> wrote:
> > > > >
> > > > > > I've been watching this discussion for a while, and I have to jump in and side with Gwen here. I see no benefit to putting the configs into Zookeeper entirely, and a lot of downside. The two biggest problems I have with this are:
> > > > > >
> > > > > > 1) Configuration management. OK, so you can write glue for Chef to put configs into Zookeeper. You also need to write glue for Puppet. And Cfengine. And everything else out there. Files are an industry standard practice, they're how just about everyone handles it, and there's reasons for that, not just "it's the way it's always been done".
> > > > > >
> > > > > > 2) Auditing. Configuration files can easily be managed in a source repository system which tracks what changes were made and who made them. It also easily allows for rolling back to a previous version. Zookeeper does not.
> > > > > >
> > > > > > I see absolutely nothing wrong with putting the quota (client) configs and the topic config overrides in Zookeeper, and keeping everything else exactly where it is, in the configuration file. To handle configurations for the broker that can be changed at runtime without a restart, you can use the industry standard practice of catching SIGHUP and rereading the configuration file at that point.
> > > > > >
> > > > > > -Todd
> > > > > >
> > > > > > On Sun, May 10, 2015 at 4:00 AM, Gwen Shapira <gshap...@cloudera.com> wrote:
> > > > > >
> > > > > > > I am still not clear about the benefits of managing configuration in ZooKeeper vs. keeping the local file and adding a "refresh" mechanism (signal, protocol, zookeeper, or other).
> > > > > > >
> > > > > > > Benefits of staying with configuration file:
> > > > > > > 1. In line with pretty much any Linux service that exists, so admins have a lot of related experience.
> > > > > > > 2. Much smaller change to our code-base, so easier to patch, review and test. Lower risk overall.
> > > > > > >
> > > > > > > Can you walk me over the benefits of using Zookeeper? Especially since it looks like we can't get rid of the file entirely?
> > > > > > >
> > > > > > > Gwen
> > > > > > >
> > > > > > > On Thu, May 7, 2015 at 3:33 AM, Jun Rao <j...@confluent.io> wrote:
> > > > > > >
> > > > > > > > One of the Chef users confirmed that Chef integration could still work if all configs are moved to ZK. My rough understanding of how Chef works is that a user first registers a service host with a Chef server. After that, a Chef client will be run on the service host. The user can then push config changes intended for a service/host to the Chef server. The server is then responsible for pushing the changes to Chef clients. Chef clients support pluggable logic. For example, it can generate a config file that Kafka broker will take. If we move all configs to ZK, we can customize the Chef client to use our config CLI to make the config changes in Kafka. In this model, one probably doesn't need to register every broker in Chef for the config push. Not sure if Puppet works in a similar way.
> > > > > > > >
> > > > > > > > Also for storing the configs, we probably can't store the broker/global level configs in Kafka itself (e.g. in a special topic). The reason is that in order to start a broker, we likely need to make some broker level config changes (e.g., the default log.dir may not be present, the default port may not be available, etc). If we need a broker to be up to make those changes, we get into this chicken and egg problem.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > > On Tue, May 5, 2015 at 4:14 PM, Gwen Shapira <gshap...@cloudera.com> wrote:
> > > > > > > >
> > > > > > > > > Sorry I missed the call today :)
> > > > > > > > >
> > > > > > > > > I think an additional requirement would be:
> > > > > > > > > Make sure that traditional deployment tools (Puppet, Chef, etc) are still capable of managing Kafka configuration.
> > > > > > > > >
> > > > > > > > > For this reason, I'd like the configuration refresh to be pretty close to what most Linux services are doing to force a reload of configuration. AFAIK, this involves handling HUP signal in the main thread to reload configuration. Then packaging scripts can add something nice like "service kafka reload".
> > > > > > > > >
> > > > > > > > > (See Apache web server: https://github.com/apache/httpd/blob/trunk/build/rpm/httpd.init#L101)
> > > > > > > > >
> > > > > > > > > Gwen
> > > > > > > > >
> > > > > > > > > On Tue, May 5, 2015 at 8:54 AM, Joel Koshy <jjkosh...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Good discussion. Since we will be talking about this at 11am, I wanted to organize these comments into requirements to see if we are all on the same page.
> > > > > > > > > >
> > > > > > > > > > REQUIREMENT 1: Needs to accept dynamic config changes. This needs to be general enough to work for all configs that we envision may need to accept changes at runtime.
> > > > > > > > > > e.g., log (topic), broker, client (quotas), etc. Possible options include:
> > > > > > > > > > - ZooKeeper watcher
> > > > > > > > > > - Kafka topic
> > > > > > > > > > - Direct RPC to controller (or config coordinator)
> > > > > > > > > >
> > > > > > > > > > The current KIP is really focused on REQUIREMENT 1 and I think that is reasonable as long as we don't come up with something that requires significant re-engineering to support the other requirements.
> > > > > > > > > >
> > > > > > > > > > REQUIREMENT 2: Provide consistency of configs across brokers (modulo per-broker overrides) or at least be able to verify consistency. What this effectively means is that config changes must be seen by all brokers eventually and we should be able to easily compare the full config of each broker.
> > > > > > > > > >
> > > > > > > > > > REQUIREMENT 3: Central config store. Needs to work with plain file-based configs and other systems (e.g., puppet). Ideally, should not bring in other dependencies (e.g., a DB). Possible options:
> > > > > > > > > > - ZooKeeper
> > > > > > > > > > - Kafka topic
> > > > > > > > > > - other? E.g. making it pluggable?
> > > > > > > > > >
> > > > > > > > > > Any other requirements?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Joel
> > > > > > > > > >
> > > > > > > > > > On Tue, May 05, 2015 at 01:38:09AM +0000, Aditya Auradkar wrote:
> > > > > > > > > > > Hey Neha,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the feedback.
> > > > > > > > > > > 1. In my earlier exchange with Jay, I mentioned the broker writing all its configs to ZK (while respecting the overrides). Then ZK can be used to view all configs.
> > > > > > > > > > >
> > > > > > > > > > > 2. Need to think about this a bit more. Perhaps we can discuss this during the hangout tomorrow?
> > > > > > > > > > >
> > > > > > > > > > > 3 & 4) I viewed these config changes as mainly administrative operations. In that case, it may be reasonable to assume that the ZK port is available for communication from the machine these commands are run from. Having a ConfigChangeRequest (or similar) is nice to have but having a new API and sending requests to the controller also changes how we do topic based configuration currently. I was hoping to keep this KIP as minimal as possible and provide a means to represent and modify client and broker based configs in a central place. Are there any concerns if we tackle these things in a later KIP?
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Aditya
> > > > > > > > > > > ________________________________________
> > > > > > > > > > > From: Neha Narkhede [n...@confluent.io]
> > > > > > > > > > > Sent: Sunday, May 03, 2015 9:48 AM
> > > > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > > > Subject: Re: [DISCUSS] KIP-21 Configuration Management
> > > > > > > > > > >
> > > > > > > > > > > Thanks for starting this discussion, Aditya.
> > > > > > > > > > > Few questions/comments
> > > > > > > > > > >
> > > > > > > > > > > 1. If you change the default values like it's mentioned in the KIP, do you also overwrite the local config file as part of updating the default value? If not, where does the admin look to find the default values, ZK or local Kafka config file? What if a config value is different in both places?
> > > > > > > > > > >
> > > > > > > > > > > 2. I share Gwen's concern around making sure that popular config management tools continue to work with this change. Would love to see how each of those would work with the proposal in the KIP. I don't know enough about each of the tools but seems like in some of the tools, you have to define some sort of class with parameter names as config names. How will such tools find out about the config values? In Puppet, if this means that each Puppet agent has to read it from ZK, this means the ZK port has to be open to pretty much every machine in the DC. This is a bummer and a very confusing requirement.
> > > > > > > > > > > Not sure if this is really a problem or not (each of those tools might behave differently), though pointing out that this is something worth paying attention to.
> > > > > > > > > > >
> > > > > > > > > > > 3. The wrapper tools that let users read/change config tools should not depend on ZK for the reason mentioned above. It's a pain to assume that the ZK port is open from any machine that needs to run this tool. Ideally what users want is a REST API to the brokers to change or read the config (ala Elasticsearch), but in the absence of the REST API, we should think if we can write the tool such that it just requires talking to the Kafka broker port. This will require a config RPC.
> > > > > > > > > > >
> > > > > > > > > > > 4. Not sure if KIP is the right place to discuss the design of propagating the config changes to the brokers, but have you thought about just letting the controller oversee the config changes and propagate via RPC to the brokers? That way, there is an easier way to express config changes that require all brokers to change it for it to be called complete.
> > > > > > > > > > > Maybe this is not required, but it is hard to say if we don't discuss the full set of configs that need to be dynamic.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Neha
> > > > > > > > > > >
> > > > > > > > > > > On Fri, May 1, 2015 at 12:53 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hey Aditya,
> > > > > > > > > > > >
> > > > > > > > > > > > This is great! A couple of comments:
> > > > > > > > > > > >
> > > > > > > > > > > > 1. Leaving the file config in place is definitely the least disturbance. But let's really think about getting rid of the files and just have one config mechanism. There is always a tendency to make everything pluggable which so often just leads to two mediocre solutions. Can we do the exercise of trying to consider fully getting rid of file config and seeing what goes wrong?
> > > > > > > > > > > >
> > > > > > > > > > > > 2. Do we need to model defaults? The current approach is that if you have a global config x it is overridden for a topic xyz by /topics/xyz/x, and I think this could be extended to /brokers/0/x. I think this is simpler. We need to specify the precedence for these overrides, e.g. if you override at the broker and topic level I think the topic level takes precedence.
> > > > > > > > > > > >
> > > > > > > > > > > > 3. I recommend we have the producer and consumer config just be an override under client.id. The override is by client id and we can have separate properties for controlling quotas for producers and consumers.
> > > > > > > > > > > >
> > > > > > > > > > > > 4. Some configs can be changed just by updating the reference, others may require some action. An example of this is if you want to disable log compaction (assuming we wanted to make that dynamic) we need to call shutdown() on the cleaner. I think it may be required to register a listener callback that gets called when the config changes.
> > > > > > > > > > > >
> > > > > > > > > > > > 5. For handling the reference can you explain your plan a bit? Currently we have an immutable KafkaConfig object with a bunch of vals. That or individual values in there get injected all over the code base. I was thinking something like this:
> > > > > > > > > > > > a. We retain the KafkaConfig object as an immutable object just as today.
> > > > > > > > > > > > b. It is no longer legit to grab values out of that config if they are changeable.
> > > > > > > > > > > > c. Instead of making KafkaConfig itself mutable we make KafkaConfiguration which has a single volatile reference to the current KafkaConfig. KafkaConfiguration is what gets passed into various components. So to access a config you do something like config.instance.myValue. When the config changes the config manager updates this reference.
> > > > > > > > > > > > d. The KafkaConfiguration is the thing that allows doing the configuration.onChange("my.config", callback)
> > > > > > > > > > > >
> > > > > > > > > > > > -Jay
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Apr 28, 2015 at 3:57 PM, Aditya Auradkar <aaurad...@linkedin.com.invalid> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hey everyone,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Wrote up a KIP to update topic, client and broker configs dynamically via Zookeeper.
> > > > > > > > > > > > >
> > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
> > > > > > > > > > > > >
> > > > > > > > > > > > > Please read and provide feedback.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Aditya
> > > > > > > > > > > > >
> > > > > > > > > > > > > PS: I've intentionally kept this discussion separate from KIP-5 since I'm not sure if that is actively being worked on and I wanted to start with a clean slate.
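
Jay's points 5(a) through 5(d) in the quoted message (an immutable snapshot, a holder with one swappable reference, and change callbacks) could be sketched roughly as follows. This is an illustration in Python under stated assumptions, not Kafka's implementation: `ConfigHolder`, `on_change`, and `update` are hypothetical names loosely mirroring the proposed `KafkaConfiguration`, `onChange`, and the config manager's reference update, and a lock stands in for the JVM's `volatile` on the write path.

```python
import threading
from types import MappingProxyType


class ConfigHolder:
    """Mutable holder around an immutable config snapshot (hypothetical name)."""

    def __init__(self, initial):
        self._lock = threading.Lock()
        self._instance = MappingProxyType(dict(initial))  # read-only view
        self._listeners = {}  # config key -> list of callbacks

    @property
    def instance(self):
        # Readers always get one complete snapshot, never a half-applied update.
        return self._instance

    def on_change(self, key, callback):
        """Register callback(old_value, new_value) for a single config key."""
        with self._lock:
            self._listeners.setdefault(key, []).append(callback)

    def update(self, new_values):
        """Build a new snapshot, swap the reference, then notify listeners."""
        with self._lock:
            old = self._instance
            new = MappingProxyType({**old, **new_values})
            self._instance = new  # the single reference swap
            pending = [
                (cb, old.get(k), new[k])
                for k in new
                if old.get(k) != new[k]
                for cb in self._listeners.get(k, [])
            ]
        # Callbacks run outside the lock so a listener may read the holder.
        for cb, old_value, new_value in pending:
            cb(old_value, new_value)


# Usage: a component reacts when log compaction is switched off (point 4).
holder = ConfigHolder({"log.cleaner.enable": "true"})
events = []
holder.on_change("log.cleaner.enable", lambda old, new: events.append((old, new)))
holder.update({"log.cleaner.enable": "false"})
```

Components would keep a reference to the holder rather than to any one snapshot, which is the reason point 5(b) discourages copying changeable values out of the config object.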
> > > > > > > > > > > --
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Neha
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Joel

--
Ashish