Re: [DISCUSS] KIP-21 Configuration Management
Aditya, in the following we should encode the config properties as a JSON map, to be consistent with the topic config, rather than as comma-separated key-value pairs where the key is the configuration property to change:

{"version": x, "config": {"X1": "Y1", "X2": "Y2", ...}}

Thanks, Jun

On Fri, May 15, 2015 at 1:09 PM, Aditya Auradkar aaurad...@linkedin.com.invalid wrote:

Yes we did. I just overlooked that line; cleaning it up now. Aditya

From: Gwen Shapira [gshap...@cloudera.com] Sent: Friday, May 15, 2015 12:55 PM To: dev@kafka.apache.org Subject: Re: [DISCUSS] KIP-21 Configuration Management

The wiki says there will be 3 paths within config: /config/clients/client_id, /config/topics/topic_name, /config/brokers/broker_id. Didn't we decide that brokers will not be configured dynamically, and that we would instead keep the config in the file?

On Fri, May 15, 2015 at 10:46 PM, Aditya Auradkar aaurad...@linkedin.com.invalid wrote:

Updated the wiki to capture our recent discussions. Please read: https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration Thanks, Aditya

From: Joel Koshy [jjkosh...@gmail.com] Sent: Tuesday, May 12, 2015 1:09 PM To: dev@kafka.apache.org Subject: Re: [DISCUSS] KIP-21 Configuration Management

The lack of audit could be addressed to some degree by an internal __config_changes topic, which can have very long retention. Also, per the hangout summary that Gwen sent out, it appears that we decided against supporting SIGHUP/dynamic configs for the broker. Thanks, Joel

On Tue, May 12, 2015 at 11:05:06AM -0700, Neha Narkhede wrote:

Thanks for chiming in, Todd! Agreed that the lack of audit and rollback is a major downside of moving all configs to ZooKeeper. Being able to configure dynamically created entities in Kafka is required, though. So I think what Todd suggested is a good solution to managing all configs: catching SIGHUP for broker configs and storing dynamic configs in ZK like we do today.
On Tue, May 12, 2015 at 10:30 AM, Jay Kreps jay.kr...@gmail.com wrote:

Hmm, here is how I think we can change the split-brain proposal to make it a bit better:
1. Get rid of broker overrides; those are just done in the config file. This makes the precedence chain a lot clearer (e.g., ZK always overrides the file on a per-entity basis).
2. Get rid of the notion of dynamic configs in ConfigDef and in the broker. All overrides are dynamic and all server configs are static.
3. Create an equivalent of LogConfig for ClientConfig and any future config type we add.
4. Generalize the TopicConfigManager to handle multiple types of overrides.
What we haven't done is try to think through how the pure-ZK approach would work. -Jay

On Mon, May 11, 2015 at 10:53 PM, Ashish Singh asi...@cloudera.com wrote:

I agree with Joel's suggestion of keeping broker configs in the config file and client/topic configs in ZK. A few other projects, Apache Solr for one, do something similar for their configurations.

On Monday, May 11, 2015, Gwen Shapira gshap...@cloudera.com wrote:

I like this approach (obviously). I am also OK with supporting broker re-reads of the config file based on a ZK watch instead of SIGHUP, if we see this as more consistent with the rest of our code base. Either is fine by me, as long as brokers keep the file and just do a refresh :)

On Tue, May 12, 2015 at 2:54 AM, Joel Koshy jjkosh...@gmail.com wrote:

So the general concern here is the dichotomy of configs (which we already have, in the form of broker config files vs. topic configs in ZooKeeper). We (at LinkedIn) had some discussions on this last week and put this very question to the operations team, whose opinion is, I think, to a large degree a touchstone for this decision: has the operations team at LinkedIn experienced any pain so far with managing topic configs in ZooKeeper (while broker configs are file-based)? It turns out that ops overwhelmingly favors the current approach.
That is, service configs as file-based configs and client/topic configs in ZooKeeper is intuitive and works great. This may be somewhat counter-intuitive to devs, but this is one of those decisions for which ops input is very critical, because for all practical purposes they are the users in this discussion. If we continue with this dichotomy and need to support dynamic config for client/topic configs as well as select service configs, then there will need to be a dichotomy in the config-change mechanism as well: client/topic configs will change via (say) a ZooKeeper watch, and the service config will change via a config file re-read (on SIGHUP) after config changes have been pushed out to local files. Is this a bad thing? Personally, I don't think it is; I'm in favor of this approach. What do others think? Thanks, Joel
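Jun's suggestion at the top of this message — storing each entity's overrides in its znode as a JSON map with a version field, instead of comma-separated key-value pairs — could be sketched as follows. The `{version, config}` field names come from his example; the helper names are hypothetical:

```python
import json

def encode_config_znode(overrides, version=1):
    """Serialize per-entity overrides as the {version, config} JSON map Jun describes."""
    return json.dumps({"version": version, "config": dict(overrides)})

def decode_config_znode(payload):
    """Parse a znode payload back into (version, overrides)."""
    data = json.loads(payload)
    return data["version"], data["config"]

# e.g. the payload written under /config/topics/<topic_name>
payload = encode_config_znode({"retention.ms": "86400000"})
version, overrides = decode_config_znode(payload)
```

The version field is what lets the format evolve without breaking existing readers.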
RE: [DISCUSS] KIP-21 Configuration Management
Updated the wiki to capture our recent discussions. Please read: https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration Thanks, Aditya

On Mon, May 11, 2015 at 11:08:44PM +0300, Gwen Shapira wrote:

What Todd said :) (I think my ops background is showing...)

On Mon, May 11, 2015 at 10:17 PM, Todd Palino tpal...@gmail.com wrote:

I understand your point here, Jay, but I disagree that we can't have two configuration systems. We have two different types of configuration information: configuration that relates to the service itself (the Kafka broker), and configuration that relates to the content within the service (topics). I would put the client configuration (quotas) in with the second part, as it is dynamic information. I just don't see a good argument for effectively degrading the configuration
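Jay's first point earlier in the thread — ZK overrides always win over the static file config on a per-entity basis — implies a simple merge. A hypothetical sketch (the config keys below are ordinary Kafka topic properties used for illustration):

```python
def effective_config(file_config, zk_overrides):
    """Static file config supplies defaults; per-entity ZK overrides always win."""
    merged = dict(file_config)
    merged.update(zk_overrides)
    return merged

# Broker-wide defaults from the config file...
base = {"retention.ms": "604800000", "max.message.bytes": "1000000"}
# ...overridden for one topic via its ZK znode.
topic_overrides = {"retention.ms": "86400000"}
cfg = effective_config(base, topic_overrides)
```

The point of getting rid of broker overrides (Jay's item 1) is that this two-level merge is the whole precedence chain.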
Re: [DISCUSS] KIP-21 Configuration Management
The wiki says there will be 3 paths within config: /config/clients/client_id, /config/topics/topic_name, /config/brokers/broker_id. Didn't we decide that brokers will not be configured dynamically, and that we would instead keep the config in the file?
RE: [DISCUSS] KIP-21 Configuration Management
Yes we did. I just overlooked that line; cleaning it up now. Aditya
Re: [DISCUSS] KIP-21 Configuration Management
Sorry I missed the call today :) I think an additional requirement would be: make sure that traditional deployment tools (Puppet, Chef, etc.) are still capable of managing Kafka configuration. For this reason, I'd like the configuration refresh to be pretty close to what most Linux services do to force a reload of configuration. AFAIK, this involves handling the HUP signal in the main thread to reload configuration. Then packaging scripts can add something nice like "service kafka reload". (See the Apache web server: https://github.com/apache/httpd/blob/trunk/build/rpm/httpd.init#L101 ) Gwen

On Tue, May 5, 2015 at 8:54 AM, Joel Koshy jjkosh...@gmail.com wrote:

Good discussion. Since we will be talking about this at 11am, I wanted to organize these comments into requirements to see if we are all on the same page.

REQUIREMENT 1: Needs to accept dynamic config changes. This needs to be general enough to work for all configs that we envision may need to accept changes at runtime, e.g., log (topic), broker, client (quotas), etc. Possible options include:
- ZooKeeper watcher
- Kafka topic
- Direct RPC to the controller (or a config coordinator)
The current KIP is really focused on REQUIREMENT 1, and I think that is reasonable as long as we don't come up with something that requires significant re-engineering to support the other requirements.

REQUIREMENT 2: Provide consistency of configs across brokers (modulo per-broker overrides), or at least be able to verify consistency. What this effectively means is that config changes must be seen by all brokers eventually, and we should be able to easily compare the full config of each broker.

REQUIREMENT 3: Central config store. Needs to work with plain file-based configs and other systems (e.g., Puppet). Ideally, it should not bring in other dependencies (e.g., a DB). Possible options:
- ZooKeeper
- Kafka topic
- other? E.g., making it pluggable?

Any other requirements?
Thanks, Joel

On Tue, May 05, 2015 at 01:38:09AM, Aditya Auradkar wrote:

Hey Neha, thanks for the feedback. 1. In my earlier exchange with Jay, I mentioned the broker writing all its configs to ZK (while respecting the overrides). Then ZK can be used to view all configs. 2. Need to think about this a bit more. Perhaps we can discuss this during the hangout tomorrow? 3 & 4) I viewed these config changes as mainly administrative operations. In that case, it may be reasonable to assume that the ZK port is available for communication from the machines these commands are run on. Having a ConfigChangeRequest (or similar) is nice to have, but adding a new API and sending requests to the controller would also change how we do topic-based configuration currently. I was hoping to keep this KIP as minimal as possible and provide a means to represent and modify client and broker configs in a central place. Are there any concerns if we tackle these things in a later KIP? Thanks, Aditya

From: Neha Narkhede [n...@confluent.io] Sent: Sunday, May 03, 2015 9:48 AM To: dev@kafka.apache.org Subject: Re: [DISCUSS] KIP-21 Configuration Management

Thanks for starting this discussion, Aditya. A few questions/comments: 1. If you change the default values as mentioned in the KIP, do you also overwrite the local config file as part of updating the default value? If not, where does the admin look to find the default values, ZK or the local Kafka config file? What if a config value is different in both places?
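Gwen's reload requirement above — catch SIGHUP in the main process and re-read the config file, so packaging can offer a `service kafka reload` — could be wired up roughly as below. The properties-file format is the usual key=value one; the helper names and path are illustrative, not from the KIP:

```python
import signal

def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config

def reload_config(path, current):
    """Re-read the config file and apply it in place; called from the SIGHUP handler."""
    with open(path) as f:
        current.clear()
        current.update(parse_properties(f.read()))

live_config = {}
# `service kafka reload` would send SIGHUP to this process (hypothetical path).
signal.signal(signal.SIGHUP,
              lambda signum, frame: reload_config("server.properties", live_config))
```

This mirrors what httpd's init script does: the reload action is just a signal, so Puppet/Chef only ever touch the file.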
Re: [DISCUSS] KIP-21 Configuration Management
From: Neha Narkhede [n...@confluent.io] Sent: Sunday, May 03, 2015 9:48 AM To: dev@kafka.apache.org Subject: Re: [DISCUSS] KIP-21 Configuration Management

Thanks for starting this discussion, Aditya. A few questions/comments:

1. If you change the default values as mentioned in the KIP, do you also overwrite the local config file as part of updating the default value? If not, where does the admin look to find the default values, ZK or the local Kafka config file? What if a config value is different in both places?

2. I share Gwen's concern around making sure that popular config management tools continue to work with this change. I would love to see how each of those would work with the proposal in the KIP. I don't know enough about each of the tools, but it seems that in some of them you have to define some sort of class with parameter names as config names. How will such tools find out about the config values? In Puppet, if this means that each Puppet agent has to read them from ZK, then the ZK port has to be open to pretty much every machine in the DC. That is a bummer and a very confusing requirement. I'm not sure whether this is really a problem (each of those tools might behave differently), but I'm pointing out that it is something worth paying attention to.

3. The wrapper tools that let users read/change configs should not depend on ZK, for the reason mentioned above. It's a pain to assume that the ZK port is open from any machine that needs to run the tool. Ideally what users want is a REST API to the brokers to change or read the config (a la Elasticsearch), but in the absence of a REST API, we should think about whether we can write the tool such that it just requires talking to the Kafka broker port. This would require a config RPC.

4. Not sure if the KIP is the right place to discuss the design of propagating the config changes to the brokers, but have you thought about just letting the controller oversee the config changes and propagate them via RPC to the brokers? That way, there is an easier way to express config changes that require all brokers to apply them before being called complete. Maybe this is not required, but it is hard to say if we don't discuss the full set of configs that need to be dynamic.

Thanks, Neha
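Joel's Requirement 2 elsewhere in the thread — being able to verify config consistency across brokers — amounts to diffing each broker's full reported config map against a reference. A hypothetical sketch (helper names and the shape of the broker-config map are assumptions, not part of the KIP):

```python
def config_diff(expected, actual):
    """Return {key: (expected_value, actual_value)} for every property that differs."""
    keys = set(expected) | set(actual)
    return {k: (expected.get(k), actual.get(k))
            for k in keys if expected.get(k) != actual.get(k)}

def inconsistent_brokers(reference, broker_configs):
    """Map broker_id -> diff, for brokers whose full config deviates from the reference."""
    return {bid: d for bid, cfg in broker_configs.items()
            if (d := config_diff(reference, cfg))}

# broker_configs would be each broker's full effective config, however it is exported.
reference = {"num.io.threads": "8", "log.retention.hours": "168"}
broker_configs = {
    1: {"num.io.threads": "8", "log.retention.hours": "168"},
    2: {"num.io.threads": "8", "log.retention.hours": "24"},
}
drift = inconsistent_brokers(reference, broker_configs)
```

Per-broker overrides would simply be folded into each broker's reference before diffing.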
Re: [DISCUSS] KIP-21 Configuration Management
. The server is then responsible for pushing the changes to Chef clients. Chef clients support pluggable logic. For example, it can generate a config file that Kafka broker will take. If we move all configs to ZK, we can customize the Chef client to use our config CLI to make the config changes in Kafka. In this model, one probably doesn't need to register every broker in Chef for the config push. Not sure if Puppet works in a similar way. Also for storing the configs, we probably can't store the broker/global level configs in Kafka itself (e.g. in a special topic). The reason is that in order to start a broker, we likely need to make some broker level config changes (e.g., the default log.dir may not be present, the default port may not be available, etc). If we need a broker to be up to make those changes, we get into this chicken and egg problem. Thanks, Jun On Tue, May 5, 2015 at 4:14 PM, Gwen Shapira gshap...@cloudera.com wrote: Sorry I missed the call today :) I think an additional requirement would be: Make sure that traditional deployment tools (Puppet, Chef, etc) are still capable of managing Kafka configuration. For this reason, I'd like the configuration refresh to be pretty close to what most Linux services are doing to force a reload of configuration. AFAIK, this involves handling HUP signal in the main thread to reload configuration. Then packaging scripts can add something nice like service kafka reload. (See Apache web server: https://github.com/apache/httpd/blob/trunk/build/rpm/httpd.init#L101 ) Gwen On Tue, May 5, 2015 at 8:54 AM, Joel Koshy jjkosh...@gmail.com wrote: Good discussion. Since we will be talking about this at 11am, I wanted to organize these comments into requirements to see if we are all on the same page. REQUIREMENT 1: Needs to accept dynamic config changes. This needs to be general enough to work for all configs that we envision may need to accept changes at runtime. e.g., log (topic), broker, client (quotas), etc.. 
Possible options include:
- ZooKeeper watcher
- Kafka topic
- Direct RPC to the controller (or a config coordinator)

The current KIP is really focused on REQUIREMENT 1, and I think that is reasonable as long as we don't come up with something that requires significant re-engineering to support the other requirements.

REQUIREMENT 2: Provide consistency of configs across brokers (modulo per-broker overrides), or at least be able to verify consistency. What this effectively means is that config changes must be seen by all brokers eventually, and we should be able to easily compare the full config of each broker.

REQUIREMENT 3: Central config store. Needs to work with plain file-based configs and other systems (e.g., Puppet). Ideally, should not bring in other dependencies (e.g., a DB). Possible options:
- ZooKeeper
- Kafka topic
- other? E.g., making it pluggable?

Any other requirements?

Thanks, Joel

On Tue, May 05, 2015 at 01:38:09AM, Aditya Auradkar wrote:

Hey Neha, thanks for the feedback.

1. In my earlier exchange with Jay, I mentioned the broker writing all its configs to ZK (while respecting the overrides). Then ZK can be used to view all configs.
2. Need to think about this a bit more. Perhaps we can discuss this during the hangout tomorrow?
3 & 4) I viewed these config changes as mainly administrative operations. In that case, it may be reasonable to assume that the ZK port is available for communication from the machines these commands are run on. Having a ConfigChangeRequest (or similar) is nice to have, but having a new API and sending requests to the controller would also change how we do topic-based configuration currently. I was hoping to keep this KIP as minimal as possible and provide a means to represent and modify client and broker configs in a central place. Are there any concerns if we tackle these things in a later KIP?

Thanks, Aditya
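Gwen's reload-on-HUP suggestion above can be sketched minimally. This is a hypothetical illustration, not code from the KIP: the property name and the disk loader are stand-ins, and a real broker would re-parse server.properties in the handler.

```python
import os
import signal
import time

# Hypothetical sketch: on SIGHUP (what `service kafka reload` would send),
# re-read the config file and swap in the new values.
current = {"log.retention.hours": "168"}

def load_from_disk():
    # A real broker would re-parse server.properties here; stubbed for illustration.
    return {"log.retention.hours": "72"}

def on_hup(signum, frame):
    global current
    current = load_from_disk()

signal.signal(signal.SIGHUP, on_hup)

os.kill(os.getpid(), signal.SIGHUP)  # simulate `kill -HUP <broker pid>`
time.sleep(0.1)                      # let the interpreter run the Python-level handler
print(current["log.retention.hours"])  # → 72
```

This mirrors the httpd init script Gwen links to, where `reload` sends HUP to the running process rather than restarting it.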
RE: [DISCUSS] KIP-21 Configuration Management
I did initially think having everything in ZK was better than having the dichotomy Joel referred to, primarily because all Kafka configs could then be managed consistently. I guess the biggest disadvantage of driving broker config primarily from ZK is that it requires everyone to manage Kafka configuration separately from other services. Several people have separately mentioned integration issues with systems like Puppet and Chef. While they may support pluggable logic, it does require everyone to write that additional piece of logic specific to Kafka. We would have to implement a group/fabric/tag hierarchy (as Ashish mentioned), auditing, and ACL management. While this potential consistency is nice, perhaps the tradeoff isn't worth it, given that the resulting system isn't much superior to pushing out new config files and is also quite disruptive. Since this impacts operations teams the most, I also think their input is probably the most valuable and should perhaps drive the outcome. I also think it is fine to treat topic and client configuration separately, because they are more like metadata than actual service configuration.

Aditya

From: Joel Koshy [jjkosh...@gmail.com]
Sent: Monday, May 11, 2015 4:54 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

So the general concern here is the dichotomy of configs (which we already have, i.e., in the form of broker config files vs. topic configs in ZooKeeper). We (at LinkedIn) had some discussions on this last week and put this very question to the operations team, whose opinion is, I think, to a large degree a touchstone for this decision: has the operations team at LinkedIn experienced any pain so far with managing topic configs in ZooKeeper (while broker configs are file-based)? It turns out that ops overwhelmingly favors the current approach: service configs as file-based configs and client/topic configs in ZooKeeper is intuitive and works great.
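For concreteness, the topic configs in ZooKeeper being discussed live under per-entity znodes; per Jun's suggestion earlier in the thread, each znode value is a JSON map with a version field. A sketch of what a znode at a path like /config/topics/&lt;topic_name&gt; might hold (the property values here are illustrative, not from the KIP):

```json
{
  "version": 1,
  "config": {
    "retention.ms": "604800000",
    "cleanup.policy": "compact"
  }
}
```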
This may be somewhat counter-intuitive to devs, but this is one of those decisions for which ops input is very critical, because for all practical purposes they are the users in this discussion. If we continue with this dichotomy and need to support dynamic config for client/topic configs as well as select service configs, then there will need to be a dichotomy in the config change mechanism as well: client/topic configs will change via (say) a ZooKeeper watch, and the service config will change via a config file re-read (on SIGHUP) after config changes have been pushed out to local files. Is this a bad thing? Personally, I don't think it is; i.e., I'm in favor of this approach. What do others think?

Thanks, Joel

On Mon, May 11, 2015 at 11:08:44PM +0300, Gwen Shapira wrote:

What Todd said :) (I think my ops background is showing...)

On Mon, May 11, 2015 at 10:17 PM, Todd Palino tpal...@gmail.com wrote:

I understand your point here, Jay, but I disagree that we can't have two configuration systems. We have two different types of configuration information: configuration that relates to the service itself (the Kafka broker), and configuration that relates to the content within the service (topics). I would put the client configuration (quotas) in with the second category, as it is dynamic information. I just don't see a good argument for effectively degrading the configuration of the service in order to keep it paired with the configuration of dynamic resources.

-Todd

On Mon, May 11, 2015 at 11:33 AM, Jay Kreps jay.kr...@gmail.com wrote:

I totally agree that ZK is not in-and-of-itself a configuration management solution, and it would be better if we could just keep all our config in files. Anyone who has followed the various config discussions over the past few years knows I'm the biggest proponent of immutable file-driven config. The analogy to normal unix services isn't actually quite right, though.
The problem Kafka has is that a number of the configurable entities it manages are added dynamically: topics, clients, consumer groups, etc. What this actually resembles is not a unix service like HTTPD but a database, and databases typically do manage config dynamically for exactly the same reason. The last few emails are arguing that files beat ZK as a config solution. I agree with this, but that isn't really the question, right? The reality is that we need to be able to configure dynamically created entities, and we won't get a satisfactory solution to that using files (e.g., rsync is not an acceptable topic creation mechanism). What we are discussing is having a single config mechanism or multiple. If we have multiple, you need to solve the whole config lifecycle problem for both: management, audit, rollback, etc. Gwen, you were saying we couldn't get rid of the configuration file
Re: [DISCUSS] KIP-21 Configuration Management
From: Neha Narkhede [n...@confluent.io]
Sent: Sunday, May 03, 2015 9:48 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Thanks for starting this discussion, Aditya. A few questions/comments:

1. If you change the default values as mentioned in the KIP, do you also overwrite the local config file as part of updating the default value? If not, where does the admin look to find the default values, ZK or the local Kafka config file? What if a config value is different in the two places?

2. I share Gwen's concern around making sure that popular config management tools continue to work with this change. Would love to see how each of those would work with the proposal in the KIP. I don't know enough about each of the tools, but it seems like in some of them you have to define some sort of class with parameter names as config names. How will such tools find out about the config values? In Puppet, if this means that each Puppet agent has to read them from ZK, the ZK port has to be open to pretty much every machine in the DC. This is a bummer and a very confusing requirement. Not sure if this is really a problem or not (each of those tools might behave differently), but it is something worth paying attention to.

3. The wrapper tools that let users read/change configs should not depend on ZK, for the reason mentioned above. It's a pain to assume that the ZK port is open from any machine that needs to run the tool. Ideally what users want is a REST API to the brokers to change or read the config (a la Elasticsearch), but in the absence of a REST API we should think about whether we can write the tool such that it just requires talking to the Kafka broker port. This will require a config RPC.

4. Not sure if the KIP is the right place to discuss the design of propagating the config changes to the brokers, but have you thought about just letting the controller oversee the config changes and propagate them via RPC to the brokers? That way, there is an easier way to express config changes that require all brokers to apply them before the change is called complete. Maybe this is not required, but it is hard to say if we don't discuss the full set of configs that need to be dynamic.

Thanks, Neha
Re: [DISCUSS] KIP-21 Configuration Management
On Fri, May 1, 2015 at 12:53 PM, Jay Kreps jay.kr...@gmail.com wrote:

Hey Aditya, this is great! A couple of comments:

1. Leaving the file config in place is definitely the least disturbance, but let's really think about getting rid of the files and just having one config mechanism. There is always a tendency to make everything pluggable, which so often just leads to two mediocre solutions. Can we do the exercise of trying to consider fully getting rid of file config and seeing what goes wrong?

2. Do we need to model defaults? The current approach is that if you have a global config x, it is overridden for a topic xyz by /topics/xyz/x, and I think this could be extended to /brokers/0/x. I think this is simpler. We need to specify the precedence for these overrides; e.g., if you override at both the broker and topic level, I think the topic level takes precedence.

3. I recommend we have the producer and consumer config just be an override under client.id. The override is by client id, and we can have separate properties for controlling quotas for producers and consumers.

4. Some configs can be changed just by updating the reference; others may require some action. An example of this is that if you want to disable log compaction (assuming we wanted to make that dynamic), we need to call shutdown() on the cleaner. I think it may be required to register a listener callback that gets called when the config changes.

5. For handling the reference, can you explain your plan a bit? Currently we have an immutable KafkaConfig object with a bunch of vals. That object, or individual values from it, gets injected all over the code base. I was thinking something like this: (a) we retain KafkaConfig as an immutable object, just as today; (b) it is no longer legit to grab values out of that config if they are changeable; (c) instead of making KafkaConfig itself mutable, we make a KafkaConfiguration which has a single volatile reference to the current KafkaConfig. KafkaConfiguration is what gets passed into various components, so to access a config you do something like config.instance.myValue. When
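A rough Python rendering of the immutable-snapshot idea in points 4 and 5 above. This is a sketch, not the actual Scala implementation: the class names mirror Jay's description, while the fields and the listener signature are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Callable, List

# Hypothetical sketch: KafkaConfig stays an immutable snapshot, and a
# KafkaConfiguration holder swaps in new snapshots and notifies listeners
# for configs whose change requires an action (e.g. stopping the log cleaner).
@dataclass(frozen=True)
class KafkaConfig:
    log_cleaner_enable: bool = True
    log_retention_hours: int = 168

class KafkaConfiguration:
    def __init__(self, initial: KafkaConfig):
        self.instance = initial          # the single mutable reference
        self._listeners: List[Callable[[KafkaConfig, KafkaConfig], None]] = []

    def register_listener(self, fn: Callable[[KafkaConfig, KafkaConfig], None]):
        self._listeners.append(fn)

    def update(self, **overrides):
        old, new = self.instance, replace(self.instance, **overrides)
        self.instance = new              # components re-read config.instance.<value>
        for fn in self._listeners:
            fn(old, new)                 # e.g. call cleaner.shutdown() if compaction disabled

config = KafkaConfiguration(KafkaConfig())
events = []
config.register_listener(lambda old, new: events.append((old.log_cleaner_enable, new.log_cleaner_enable)))
config.update(log_cleaner_enable=False)
print(config.instance.log_cleaner_enable, events)  # → False [(True, False)]
```

In the JVM version Jay describes, the `instance` field would be `volatile` so that every component reading `config.instance` sees the latest snapshot without locking.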
RE: [DISCUSS] KIP-21 Configuration Management
Theoretically, using just the broker id and ZK connect string, it should be possible for the broker to read all configs from ZooKeeper. Like Ashish said, we should probably take a look and make sure. Additionally, we've spoken about making config changes only through a broker API. However, we also need a way to change properties even if a specific broker, the controller, or the entire cluster is down or unable to accept config change requests for any reason. This implies that we need a mechanism to make config changes by talking to ZooKeeper directly, and that we can't rely solely on the broker/controller API.

Aditya

From: Ashish Singh [asi...@cloudera.com]
Sent: Thursday, May 07, 2015 8:19 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Agreed :). However, the other concerns remain. Do you think just providing ZK info to the broker will be sufficient? I will myself spend some time looking into the existing required configs.

On Thursday, May 7, 2015, Jun Rao j...@confluent.io wrote:

Ashish, 3. This is true. However, using files has the same problem. You can't store the location of the file in the file itself. The location of the file has to be passed out of band into Kafka.

Thanks, Jun

On Wed, May 6, 2015 at 6:34 PM, Ashish Singh asi...@cloudera.com wrote:

Hey Jun, where does the broker get the info about which ZK it needs to talk to?

On Wednesday, May 6, 2015, Jun Rao j...@confluent.io wrote:

Ashish, 3. Just want to clarify: why can't you store ZK connection config in ZK? It is a property for ZK clients, not the ZK server.

Thanks, Jun

On Wed, May 6, 2015 at 5:48 PM, Ashish Singh asi...@cloudera.com wrote:

I too would like to share some concerns that we came up with while discussing the effect that moving configs to ZooKeeper will have.

1. Kafka will start to become a configuration management tool to some degree, and be subject to all the things such tools are commonly asked to do.
Kafka will likely need to re-implement the role/group/service hierarchy that CM uses. Kafka will need some way to conveniently dump its configs so they can be re-imported later, as a backup tool. People will want this to be audited, which means you'd need distinct logins for different people, and user management. You can try to push some of this onto tools like CM, but that is Kafka going out of its way to be difficult to manage, and most projects don't want to do that. Being unique in how configuration is done is strictly a bad thing for both integration and usability. Probably lots of other stuff. Seems like a bad direction.

2. Where would the default config live? If we keep the config files around just for the default config, then I think the config file will be ignored on restart. This creates an obnoxious asymmetry between how you configure Kafka the first time and how you update it: you have to learn two ways of making config changes. If there was a mistake in your original config file, you can't just edit the file and restart; you have to go through the API. Reading configs is also more irritating. This all creates a learning curve for users of Kafka that will make it harder to use than other projects. It is also a backwards-incompatible change.

3. All Kafka configs living in ZK is strictly impossible, since at the very least the ZK connection configs cannot be stored in ZK. So you will have a file where some values are in effect but others are not, which is again confusing. Also, since you are still reading the config file on first start, there are still multiple sources of truth, or at least the appearance of such to the user.

On Wed, May 6, 2015 at 5:33 PM, Jun Rao j...@confluent.io wrote:

One of the Chef users confirmed that Chef integration could still work if all configs are moved to ZK. My rough understanding of how Chef works is that a user first registers a service host with a Chef server.
After that, a Chef client will be run on the service host. The user can then push config changes intended for a service/host to the Chef server. The server is then responsible for pushing the changes to Chef clients. Chef clients support pluggable logic. For example, it can generate a config file that Kafka broker will take. If we move all configs to ZK, we can customize the Chef client to use our config CLI to make the config changes in Kafka. In this model, one probably doesn't need to register every broker in Chef for the config push. Not sure if Puppet works in a similar way. Also for storing the configs, we
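Aditya's point earlier in the thread about making config changes by talking to ZooKeeper directly can be made concrete. A minimal sketch (the function and its names are hypothetical, not part of the KIP): a tool could build the znode path under /config/<entity>/<name> and the {version, config} JSON payload discussed in the thread, leaving the actual write to whatever ZooKeeper client the tool uses:

```python
import json

def build_override(entity_type, entity_name, props, version=1):
    """Build the znode path and JSON payload for a config override.

    Hypothetical helper, for illustration only. Follows the
    /config/<entity>/<name> layout and the {version, config} JSON map
    discussed in the thread (brokers are excluded since the thread
    decided broker configs stay file-based). The actual ZooKeeper
    write is left to the caller's ZK client.
    """
    if entity_type not in ("clients", "topics"):
        raise ValueError("unsupported entity type: %s" % entity_type)
    path = "/config/%s/%s" % (entity_type, entity_name)
    payload = json.dumps({"version": version, "config": props})
    return path, payload
```

For example, build_override("clients", "mirror-maker", {"producer_byte_rate": "1048576"}) yields the path /config/clients/mirror-maker and a JSON blob that a ZK client could write to that znode even when every broker is down.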
Re: [DISCUSS] KIP-21 Configuration Management
On Tue, May 5, 2015 at 8:54 AM, Joel Koshy jjkosh...@gmail.com wrote: Good discussion. Since we will be talking about this at 11am, I wanted to organize these comments into requirements to see if we are all on the same page.

REQUIREMENT 1: Needs to accept dynamic config changes. This needs to be general enough to work for all configs that we envision may need to accept changes at runtime, e.g., log (topic), broker, client (quotas), etc. Possible options include:
- ZooKeeper watcher
- Kafka topic
- Direct RPC to controller (or config coordinator)
The current KIP is really focused on REQUIREMENT 1, and I think that is reasonable as long as we don't come up with something that requires significant re-engineering to support the other requirements.

REQUIREMENT 2: Provide consistency of configs across brokers (modulo per-broker overrides), or at least be able to verify consistency. What this effectively means is that config changes must be seen by all brokers eventually, and we should be able to easily compare the full config of each broker.

REQUIREMENT 3: Central config store. Needs to work with plain file-based configs and other systems (e.g., Puppet). Ideally, should not bring in other dependencies (e.g., a DB). Possible options:
- ZooKeeper
- Kafka topic
- other? E.g. making it pluggable?
Any other requirements? Thanks, Joel

On Tue, May 05, 2015 at 01:38:09AM +, Aditya Auradkar wrote: Hey Neha, Thanks for the feedback. 1. In my earlier exchange with Jay, I mentioned the broker writing all its configs to ZK (while respecting the overrides). Then ZK can be used to view all configs. 2. Need to think about this a bit more. Perhaps we can discuss this during the hangout tomorrow? 3, 4) I viewed these config changes as mainly administrative operations. In that case, it may be reasonable to assume that the ZK port is available for communication from the machines these commands are run on.
Having a ConfigChangeRequest (or similar) is nice to have, but having a new API and sending requests to the controller also changes how we do topic-based configuration currently. I was hoping to keep this KIP as minimal as possible and provide a means to represent and modify client and broker based configs in a central place. Are there any concerns if we tackle these things in a later KIP? Thanks, Aditya

From: Neha Narkhede [n...@confluent.io] Sent: Sunday, May 03, 2015 9:48 AM To: dev@kafka.apache.org Subject: Re: [DISCUSS] KIP-21 Configuration Management

Thanks for starting this discussion, Aditya. A few questions/comments:

1. If you change the default values like it's mentioned in the KIP, do you also overwrite the local config file as part of updating the default value? If not, where does the admin look to find the default values, ZK or the local Kafka config file? What if a config value is different in both places?

2. I share Gwen's concern around making sure that popular config management tools continue to work with this change. Would love to see how each of those would work with the proposal in the KIP. I don't know enough about each of the tools, but it seems like in some of them you have to define some sort of class with parameter names as config names. How will such tools find out about the config values? In Puppet, if this means that each Puppet agent has to read it from ZK, the ZK port has to be open to pretty much every machine in the DC. This is a bummer and a very confusing requirement. Not sure if this is really a problem or not (each of those tools might behave differently), though pointing out that this is something worth paying attention to.

3. The wrapper tools that let users read/change configs should not depend on ZK for the reason mentioned above. It's a pain to assume that the ZK port is open from any machine that needs to run this tool.
Ideally what users want is a REST API to the brokers to change or read the config (a la Elasticsearch), but in the absence of the REST API, we should think about whether we can write the tool such that it just requires talking to the Kafka broker port. This will require a config RPC.

4.
Not sure if KIP is the right place to discuss the design of propagating the config changes to the brokers, but have you thought about just letting the controller oversee the config changes and propagate via RPC to the brokers? That way, there is an easier way to express config changes that require all brokers to change it for it to be called complete. Maybe this is not required, but it is hard to say if we don't discuss the full set of configs that need to be dynamic
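Joel's REQUIREMENT 2 above (verify consistency of configs across brokers, modulo per-broker overrides) is easy to picture as a diff over each broker's full config map. A minimal sketch with illustrative names only, not any actual Kafka API:

```python
def config_diff(broker_configs, per_broker_keys=frozenset()):
    """Return the config keys on which brokers disagree.

    broker_configs maps broker_id -> {config key: value}. Keys listed in
    per_broker_keys are expected to legitimately differ per broker
    (e.g. broker.id, port) and are excluded from the comparison.
    """
    all_keys = set()
    for cfg in broker_configs.values():
        all_keys.update(cfg)
    all_keys -= set(per_broker_keys)
    # A key is inconsistent if the brokers report more than one value for it.
    return {
        key
        for key in all_keys
        if len({cfg.get(key) for cfg in broker_configs.values()}) > 1
    }
```

This presumes each broker can dump its effective config somewhere central (e.g. Aditya's suggestion of brokers writing their full configs to ZK), which is what makes the comparison cheap to run.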
Thanks, Neha

On Fri, May 1, 2015 at 12:53 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey Aditya, This is great! A couple of comments:

1. Leaving the file config in place is definitely the least disturbance. But let's really think about getting rid of the files and just have one config mechanism. There is always a tendency to make everything pluggable, which so often just leads to two mediocre solutions. Can we do
the exercise of trying to consider fully getting rid of file config and seeing what goes wrong?

2. Do we need to model defaults? The current approach is that if you have a global config x it is overridden for a topic xyz by /topics/xyz/x, and I think this could be extended to /brokers/0/x. I think this is simpler. We need to specify the precedence for these overrides, e.g. if you override at the broker and topic level
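Jay's point 2 above amounts to a lookup chain: a per-entity override in ZK wins over the file default, and the open question is how broker-level and topic-level overrides rank against each other. A sketch of one possible ordering (most specific entity wins; the thread has not settled this, so the order here is an assumption for illustration):

```python
def resolve(key, file_defaults, broker_override=None, topic_override=None):
    """Resolve a config value through an override chain.

    Checks the topic-level override map, then the broker-level one, then
    falls back to the static file default. The topic-over-broker ordering
    is an assumption made for illustration, not something the KIP decided.
    """
    for overrides in (topic_override, broker_override):
        if overrides and key in overrides:
            return overrides[key]
    return file_defaults.get(key)
```

Writing the chain down like this is exactly the "specify the precedence" exercise Jay asks for: whatever order is chosen, it should be stated once and applied uniformly.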
One of the Chef users confirmed that Chef integration could still work if all configs are moved to ZK. My rough understanding of how Chef works is that a user first registers a service host with a Chef server. After that, a Chef client will be run on the service host. The user can then push config changes intended for a service/host to the Chef server. The server is then responsible for pushing the changes to Chef clients. Chef clients support pluggable logic. For example, a client can generate a config file that the Kafka broker will take. If we move all configs to ZK, we can customize the Chef client to use our config CLI to make the config changes in Kafka. In this model, one probably doesn't need to register every broker in Chef for the config push. Not sure if Puppet works in a similar way.

Also, for storing the configs, we probably can't store the broker/global level configs in Kafka itself (e.g. in a special topic). The reason is that in order to start a broker, we likely need to make some broker level config changes (e.g., the default log.dir may not be present, the default port may not be available, etc.). If we need a broker to be up to make those changes, we get into this chicken and egg problem. Thanks, Jun

On Tue, May 5, 2015 at 4:14 PM, Gwen Shapira gshap...@cloudera.com wrote: Sorry I missed the call today :) I think an additional requirement would be: Make sure that traditional deployment tools (Puppet, Chef, etc.) are still capable of managing Kafka configuration. For this reason, I'd like the configuration refresh to be pretty close to what most Linux services do to force a reload of configuration. AFAIK, this involves handling the HUP signal in the main thread to reload configuration. Then packaging scripts can add something nice like service kafka reload. (See the Apache web server: https://github.com/apache/httpd/blob/trunk/build/rpm/httpd.init#L101) Gwen
Re: [DISCUSS] KIP-21 Configuration Management
Good discussion. Since we will be talking about this at 11am, I wanted to organize these comments into requirements to see if we are all on the same page.

REQUIREMENT 1: Needs to accept dynamic config changes. This needs to be general enough to work for all configs that we envision may need to accept changes at runtime, e.g., log (topic), broker, client (quotas), etc. Possible options include:
- ZooKeeper watcher
- Kafka topic
- Direct RPC to controller (or config coordinator)
The current KIP is really focused on REQUIREMENT 1, and I think that is reasonable as long as we don't come up with something that requires significant re-engineering to support the other requirements.

REQUIREMENT 2: Provide consistency of configs across brokers (modulo per-broker overrides), or at least be able to verify consistency. What this effectively means is that config changes must be seen by all brokers eventually, and we should be able to easily compare the full config of each broker.

REQUIREMENT 3: Central config store. Needs to work with plain file-based configs and other systems (e.g., Puppet). Ideally, should not bring in other dependencies (e.g., a DB). Possible options:
- ZooKeeper
- Kafka topic
- other? E.g., making it pluggable?

Any other requirements?

Thanks,
Joel
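The ZooKeeper-watcher option listed under REQUIREMENT 1 can be sketched with an in-memory stand-in for ZK. Everything here is illustrative (the ConfigStore class, the path layout, and the synchronous notification are assumptions, not Kafka's actual classes); a real implementation would use the ZooKeeper client and its asynchronous, one-shot watches.

```java
import java.util.*;
import java.util.function.Consumer;

// Toy, in-memory sketch of the "ZooKeeper watcher" option: an override store
// fires a notification when an entity's config path changes, and each broker
// re-reads that entity's overrides and applies them. Names are hypothetical.
final class ConfigStore {
    private final Map<String, Map<String, String>> paths = new HashMap<>(); // e.g. "/config/topics/foo"
    private final List<Consumer<String>> watchers = new ArrayList<>();

    // A broker registers a callback invoked with the changed path.
    void watch(Consumer<String> onPathChanged) {
        watchers.add(onPathChanged);
    }

    // Read the current overrides stored at a path (empty if none).
    Map<String, String> read(String path) {
        return paths.getOrDefault(path, Map.of());
    }

    // Simulates writing a znode; real ZK would fire the watch asynchronously.
    void write(String path, Map<String, String> config) {
        paths.put(path, new HashMap<>(config));
        for (Consumer<String> w : watchers) w.accept(path);
    }
}
```

A broker-side watcher would typically just re-read the path it was notified about and swap the override map it consults on the request path.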
Re: [DISCUSS] KIP-21 Configuration Management
Joel, thanks for summarizing the requirements. It makes sense for the KIP to focus on Req #1, unless supporting other dynamic configs in the future warrants a completely different design. My main concern is going with a design that keeps only quotas in mind and then continuing to shoehorn other dynamic configs into that model even if it doesn't work that well.

> 1. In my earlier exchange with Jay, I mentioned the broker writing all its configs to ZK (while respecting the overrides). Then ZK can be used to view all configs.

My concern with supporting both is that it will be confusing for anyone to know where to look to find the final value of a config, or to be able to tell if a particular broker hasn't picked up a config value. Maybe you have thought about this; I am unclear about the fix you have in mind.

> 2. Need to think about this a bit more. Perhaps we can discuss this during the hangout tomorrow?

This isn't relevant for LI but is important for a lot of users, so we should definitely state in the KIP how those tools would continue to work with this change.

> 3 & 4) I viewed these config changes as mainly administrative operations. In that case, it may be reasonable to assume that the ZK port is available for communication from the machines these commands are run on.

I'm not so sure about this assumption.

> Having a ConfigChangeRequest (or similar) is nice to have, but having a new API and sending requests to the controller also changes how we do topic-based configuration currently. I was hoping to keep this KIP as minimal as possible and provide a means to represent and modify client and broker based configs in a central place. Are there any concerns if we tackle these things in a later KIP?

I don't have concerns about reducing the scope of this KIP as long as we are sure the approach we pick is the right direction for dynamic config and further changes are just incremental, not a redesign.
Re: [DISCUSS] KIP-21 Configuration Management
Aditya, when I think about the motivation of not having to restart brokers to change a config, I think about all of the configurations I have seen having to get changed in brokers and restarted (which is just about all of them). What I mean by "stop the world" is when producers and/or consumers will not be able to use the broker(s) for a period of time, or something within the broker holds/blocks everything for the changes to take effect, and leader election or an ISR change is going to occur.

Let's say someone wanted to change replicaFetchMaxBytes or replicaFetchBackoffMs dynamically: you would have to stop the ReplicaFetcherManager. If you use a watcher, then all brokers will have to stop and (hopefully) start the ReplicaFetcherManager at the same time. Or let's say someone wanted to change numNetworkThreads: the entire SocketServer for every broker would have to stop and (hopefully) start at the same time. I believe most of the configurations fall into this category, and using a watcher notification to every broker without some control is going to be a problem. If the notification just goes to the controller, and the controller is able to manage the processing for every broker, that might work, but it doesn't solve all the problems to be worked on. We would also have to think about what to do for the controller broker itself (unless we make the controller not a broker, where possible), as well as how to deal with changes that could take brokers in and out of the ISR or cause leader election. Ideally we could make these changes without stopping the world (not just a matter of having the controller manage a broker-by-broker restart), so that brokers that are leaders would still be leaders (perhaps the connections for producing/consuming get buffered or something) when (if) they come back online.

The thing is that lots of folks want all (or as many as possible) of the configurations to be dynamic, and I am concerned that if we don't code for the harder cases then we will only have one or two configurations able to be dynamic. If the motivation for this KIP is just so quotas work, that is ok. The more I think about it, though, the less sure I am that just labeling certain configs as dynamic is going to be helpful for folks, because they are still having to manage the updates for all the configurations, restart brokers, and now carry a new burden of understanding dynamic properties. I think we need to add solutions for folks where we can, to make things easier without adding new items for them to contend with.

Thanks!
~ Joe Stein
http://www.stealth.ly

On Sun, May 3, 2015 at 8:23 PM, Aditya Auradkar wrote:

> Hey Joe,
>
> Can you elaborate what you mean by a "stop the world" change? In this protocol, we can target notifications to a subset of brokers in the cluster (the controller if we need to). Is the AdminChangeNotification a ZK notification or a request type exposed by each broker?
>
> Thanks,
> Aditya
>
> From: Joe Stein [joe.st...@stealth.ly]
> Sent: Friday, May 01, 2015 5:25 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-21 Configuration Management
>
> Hi Aditya, thanks for the write-up and for focusing on this piece. Agreed, we need a way to make broker changes dynamically without rolling restarts.
>
> I think, though, that if every broker is getting change notifications, it is going to limit which configs can be dynamic. We could never deliver a "stop the world" configuration change, because it would then happen on every broker in the entire cluster at the same time. Can maybe just the controller get the notification? We could then provide a layer for brokers to work with the controller to do the config change operations at its discretion (so it can stop things if it needs to): the controller gets the notification and sends an AdminChangeNotification to brokers [X .. N]; the brokers can then do their thing, even sending a response for heartbeating while the change takes the few milliseconds it needs, or crashing. We need to go through both scenarios. I am worried that we put this change in like this, it works for quotas and maybe a few other things, but nothing else becomes dynamic and we don't get far enough toward almost no more rolling restarts.
>
> ~ Joe Stein
> http://www.stealth.ly
>
> On Thu, Apr 30, 2015 at 8:14 PM, Joel Koshy wrote:
>
> > > 1. I have deep concerns about managing configuration in ZooKeeper. First, producers and consumers shouldn't depend on ZK at all; this seems to add back a dependency we are trying to get away from.
> >
> > The KIP probably needs to be clarified here - I don't think Aditya was referring to client (producer/consumer) configs. These are global client-id-specific configs that need to be managed centrally (specifically, quota overrides on a per-client basis).
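Joel's closing point — per-client quotas as centrally managed overrides layered on a broker-wide default — is essentially a two-level lookup. A minimal sketch follows; the class and field names are hypothetical illustrations, not Kafka's actual quota code.

```java
import java.util.Map;

// Hypothetical sketch: a per-client quota override (e.g. from /config/clients/<client_id>)
// shadows the cluster-wide default from broker config, the same way topic overrides
// in ZK shadow broker config for log settings today.
final class ClientQuotaResolver {
    private final long defaultBytesPerSec;        // cluster-wide default from broker config
    private final Map<String, Long> overrides;    // per-client-id overrides, centrally managed

    ClientQuotaResolver(long defaultBytesPerSec, Map<String, Long> overrides) {
        this.defaultBytesPerSec = defaultBytesPerSec;
        this.overrides = overrides;
    }

    // Per-client override wins; otherwise fall back to the default.
    long quotaFor(String clientId) {
        return overrides.getOrDefault(clientId, defaultBytesPerSec);
    }
}
```

The same shape generalizes to topic and (hypothetically) broker overrides, which is what makes a single generalized override manager attractive.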
RE: [DISCUSS] KIP-21 Configuration Management
Hey Neha,

Thanks for the feedback.

1. In my earlier exchange with Jay, I mentioned the broker writing all its configs to ZK (while respecting the overrides). Then ZK can be used to view all configs.

2. Need to think about this a bit more. Perhaps we can discuss this during the hangout tomorrow?

3 & 4) I viewed these config changes as mainly administrative operations. In that case, it may be reasonable to assume that the ZK port is available for communication from the machines these commands are run on. Having a ConfigChangeRequest (or similar) is nice to have, but having a new API and sending requests to the controller also changes how we do topic-based configuration currently. I was hoping to keep this KIP as minimal as possible and provide a means to represent and modify client and broker based configs in a central place. Are there any concerns if we tackle these things in a later KIP?

Thanks,
Aditya

From: Neha Narkhede [n...@confluent.io]
Sent: Sunday, May 03, 2015 9:48 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Thanks for starting this discussion, Aditya. A few questions/comments:

1. If you change the default values as mentioned in the KIP, do you also overwrite the local config file as part of updating the default value? If not, where does the admin look to find the default values, ZK or the local Kafka config file? What if a config value is different in the two places?

2. I share Gwen's concern around making sure that popular config management tools continue to work with this change. Would love to see how each of those would work with the proposal in the KIP. I don't know enough about each of the tools, but it seems like in some of them you have to define some sort of class with parameter names as config names. How will such tools find out about the config values? In Puppet, if this means that each Puppet agent has to read it from ZK, then the ZK port has to be open to pretty much every machine in the DC. This is a bummer and a very confusing requirement. Not sure if this is really a problem or not (each of those tools might behave differently), though I'm pointing out that this is something worth paying attention to.

3. The wrapper tools that let users read/change configs should not depend on ZK, for the reason mentioned above. It's a pain to assume that the ZK port is open from any machine that needs to run this tool. Ideally what users want is a REST API to the brokers to change or read the config (a la Elasticsearch), but in the absence of a REST API we should think about whether we can write the tool such that it just requires talking to the Kafka broker port. This will require a config RPC.

4. Not sure if the KIP is the right place to discuss the design of propagating the config changes to the brokers, but have you thought about just letting the controller oversee the config changes and propagate them via RPC to the brokers? That way, there is an easier way to express config changes that require all brokers to apply them before the change is called complete. Maybe this is not required, but it is hard to say if we don't discuss the full set of configs that need to be dynamic.

Thanks,
Neha
RE: [DISCUSS] KIP-21 Configuration Management
1. Essentially, I think removing the option to configure properties via file may be a big change for everyone. Having said that, your points are very valid. I guess we can discuss this a bit more during the KIP hangout.

4. Yes, we will need to make some changes to update the MetricConfig for any metric. I left it out because I felt it wasn't strictly related to the KIP.

Thanks for the nice summary of the implementation breakdown. Basically, the KIP should provide a uniform mechanism to change any type of config dynamically, but the work to actually convert configs can be out of scope.

Thanks,
Aditya

From: Jay Kreps [jay.kr...@gmail.com]
Sent: Monday, May 04, 2015 2:00 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Hey Aditya,

1. I would argue for either staying with what we have or else moving to a better solution, but not doing both. A solution that uses both is going to make it quite complex to figure out what is configured and where it comes from. If you think this is needed, let's try to construct the argument for why. In the workflow you described, think how confusing that will be--the first time the broker starts it uses the file, but after that, if you change the file, nothing happens because it has copied the file into ZK. Instead let's do the research to figure out why people would object to a pure-ZK solution and then see if we can't address all those concerns.

4. I think it is fine to implement this in a second phase, but I think you will need it even to be able to update the MetricConfigs to execute the quota change, right?

Not sure if I follow your question on implementation. I think what you are saying might be doing something like this as a first pass:
a. Generalize the TopicConfigManager to handle any type of config override as described in the doc (broker, topic, client, etc.).
b. Implement the concept of mutable configs in ConfigDef.
c. Add the client-level overrides for quotas and make sure the topic overrides all still work.
d. Not actually do the work to make most of the broker configs mutable, since that will involve passing around the KafkaConfiguration object much more broadly in places where we have static wiring now. This would be done as-needed as we made more configs dynamic.

Is that right? Personally I think that makes sense. The only concern is that we get to a good stopping point so that people aren't left in some half-way state that we end up redoing in the next release. So getting to a final state with the configuration infrastructure is important, but actually making all the configs mutable can be done gradually.

-Jay
RE: [DISCUSS] KIP-21 Configuration Management
Hey Jay,

Thanks for the feedback.

1. We can certainly discuss what it means to remove the file configuration as a thought exercise. However, is this something we want to do for real? IMO, we can remove file configuration by having all configs stored in ZooKeeper. The flow can be:
- Broker starts and reads all the configs from ZK (overrides).
- Apply them on top of the defaults that are hardcoded within the broker. This should simulate file-based config behavior as it exists currently.
- Potentially, we can write back all the merged configs to ZooKeeper (defaults + overrides). This means that the entire config of that broker is in ZK.
Thoughts?

2. Good point. All overridden configs (topic, client level) will have a corresponding broker config that serves as a default. It should be sufficient to change that broker config dynamically, and that effectively means that the default has been changed. The overrides on a per topic/client basis still take precedence, so I don't think we need to model defaults explicitly. Using an example to be sure we are on the same page: let's say we wanted to increase the log retention time for all topics without having to create a separate override for each topic. We could simply change log.retention.time under /brokers/broker_id to the desired value, and that should change the default log retention for everyone (apart from the ones explicitly overridden on a per-topic basis).

3. I thought it was cleaner to simply separate the producer and consumer configs, but I guess if they present the same clientId, they are essentially the same client. I'll follow your suggestion.

4. Interesting that you mention this. I actually thought about having callbacks but left them out of the initial proposal since I wanted to keep it relatively simple. The only configs we can change by checking references are the ones we check frequently while processing requests (or something periodic). I shall incorporate this in the KIP.

5. What you are proposing sounds good. Initially, I was planning to push everything down to KafkaConfig by not having immutable vals within it. Having a wrapper (KafkaConfiguration) like you suggest is probably cleaner.

One implementation detail: there don't appear to be any concerns wrt the client-based config section (and the topic config already exists). Are there any concerns if we keep the implementation of the per-client config piece and the generalization of the code in TopicConfigManager separate from the broker config section? Client configs are an immediate requirement to operationalize quotas (and perhaps can be used to manage authorization as well, for security). The broker-side changes to mark configs dynamic, implement callbacks, etc. can be implemented as a follow-up task, since it will take longer to identify which configs can be made dynamic and to actually do the work to make them so. I think that once we have reasonable agreement on the overall picture, we can implement these things piece by piece.

Thanks,
Aditya

From: Jay Kreps [jay.kr...@gmail.com]
Sent: Friday, May 01, 2015 12:53 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Hey Aditya,

This is great! A couple of comments:

1. Leaving the file config in place is definitely the least disturbance. But let's really think about getting rid of the files and just having one config mechanism. There is always a tendency to make everything pluggable, which so often just leads to two mediocre solutions. Can we do the exercise of trying to consider fully getting rid of file config and seeing what goes wrong?

2. Do we need to model defaults? The current approach is that if you have a global config x, it is overridden for a topic xyz by /topics/xyz/x, and I think this could be extended to /brokers/0/x. I think this is simpler. We need to specify the precedence for these overrides; e.g., if you override at the broker and topic level, I think the topic level takes precedence.

3. I recommend we have the producer and consumer config just be an override under client.id. The override is by client id, and we can have separate properties for controlling quotas for producers and consumers.

4. Some configs can be changed just by updating the reference; others may require some action. An example of this is if you want to disable log compaction (assuming we wanted to make that dynamic): we need to call shutdown() on the cleaner. I think it may be required to register a listener callback that gets called when the config changes.

5. For handling the reference, can you explain your plan a bit? Currently we have an immutable KafkaConfig object with a bunch of vals. That object, or individual values in it, gets injected all over the code base. I was thinking something like this:
a. We retain the KafkaConfig object as an immutable object just as today.
b. It is no longer legit to grab values out of that config if they are changeable.
c.
Instead of making KafkaConfig itself mutable, we make a KafkaConfiguration which has a single volatile reference to the current KafkaConfig. KafkaConfiguration is what gets passed into the various components. So to access a config you do something like config.instance.myValue. When the config changes, the config manager updates this reference.
d. The KafkaConfiguration is the thing that allows doing configuration.onChange("my.config", callback).

-Jay

On Tue, Apr 28, 2015 at 3:57 PM, Aditya Auradkar wrote:

Hey everyone,

Wrote up a KIP to update topic, client and broker configs dynamically via ZooKeeper:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration

Please read and provide feedback.

Thanks,
Aditya

PS: I've intentionally kept this discussion separate from KIP-5 since I'm not sure that is actively being worked on, and I wanted to start with a clean slate.
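Jay's points a–d amount to a small amount of machinery. A rough Java rendering of the idea follows; the broker is actually Scala, and everything here beyond the names KafkaConfig, KafkaConfiguration, and onChange is an assumption for illustration, not the eventual implementation.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.function.BiConsumer;

// (a) The config snapshot stays immutable, just like today's KafkaConfig.
final class KafkaConfig {
    private final Map<String, String> values;
    KafkaConfig(Map<String, String> values) { this.values = Map.copyOf(values); }
    String get(String key) { return values.get(key); }
}

// (c) A mutable holder with a single volatile reference to the current snapshot;
// this is what gets passed into components instead of raw values.
final class KafkaConfiguration {
    private volatile KafkaConfig current;
    private final Map<String, List<BiConsumer<String, String>>> listeners = new ConcurrentHashMap<>();

    KafkaConfiguration(KafkaConfig initial) { this.current = initial; }

    // Components read config.instance().get("..."), never a cached value (point b).
    KafkaConfig instance() { return current; }

    // (d) Register a callback fired with (oldValue, newValue) when a key changes.
    void onChange(String key, BiConsumer<String, String> callback) {
        listeners.computeIfAbsent(key, k -> new CopyOnWriteArrayList<>()).add(callback);
    }

    // Called by the config manager when an update (e.g. a ZK notification) arrives:
    // one volatile write swaps the snapshot, then changed-key listeners fire.
    void update(KafkaConfig next) {
        KafkaConfig prev = current;
        current = next;
        for (Map.Entry<String, List<BiConsumer<String, String>>> e : listeners.entrySet()) {
            String oldV = prev.get(e.getKey());
            String newV = next.get(e.getKey());
            if (!Objects.equals(oldV, newV))
                for (BiConsumer<String, String> cb : e.getValue()) cb.accept(oldV, newV);
        }
    }
}
```

The volatile reference gives readers a consistent snapshot without locking, which is why swapping the whole immutable object is preferable to mutating individual fields in place.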
RE: [DISCUSS] KIP-21 Configuration Management
Hey Joe, Can you elaborate on what you mean by a stop-the-world change? In this protocol, we can target notifications to a subset of brokers in the cluster (the controller if we need to). Is the AdminChangeNotification a ZK notification or a request type exposed by each broker? Thanks, Aditya

From: Joe Stein [joe.st...@stealth.ly]
Sent: Friday, May 01, 2015 5:25 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Hi Aditya, thanks for the write-up and for focusing on this piece. Agreed we need something that lets us make broker changes dynamically without rolling restarts. I think, though, that if every broker is getting changes via notifications it is going to limit which configs can be dynamic. We could never deliver a stop-the-world configuration change, because then it would happen across the entire cluster, on every broker, at the same time. Can maybe just the controller get the notification? And we provide a layer for brokers to work with the controller to do the config change operations at its discretion (so it can stop things if it needs to). The controller gets the notification and sends an AdminChangeNotification to brokers [X .. N]; then the brokers can do their thing, even sending a response for heartbeating while each takes the few milliseconds it needs, or crashes. We need to go through both scenarios. I am worried that we put this change in like this and it works for quotas and maybe a few other things, but nothing else gets dynamic and we don't get far enough toward almost no more rolling restarts. ~ Joe Stein - http://www.stealth.ly

On Thu, Apr 30, 2015 at 8:14 PM, Joel Koshy jjkosh...@gmail.com wrote: 1. I have deep concerns about managing configuration in ZooKeeper. First, Producers and Consumers shouldn't depend on ZK at all; this seems to add back a dependency we are trying to get away from. The KIP probably needs to be clarified here - I don't think Aditya was referring to client (producer/consumer) configs. These are global client-id-specific configs that need to be managed centrally. (Specifically, quota overrides on a per-client basis.)
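The per-client-id quota overrides Joel mentions amount to a centrally managed map consulted with a default fallback. A hypothetical sketch (QuotaOverrides is an invented name for illustration; the actual quota mechanics and ZK plumbing are out of scope here):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical holder for centrally managed per-client-id quota overrides.
// In the KIP these would live under a ZK path such as /config/clients/<client_id>;
// here they are just an in-memory map for illustration.
final class QuotaOverrides {
    private final long defaultBytesPerSec;
    private final Map<String, Long> perClient = new ConcurrentHashMap<>();

    QuotaOverrides(long defaultBytesPerSec) {
        this.defaultBytesPerSec = defaultBytesPerSec;
    }

    // An override for a specific client id takes precedence over the default.
    long quotaFor(String clientId) {
        return perClient.getOrDefault(clientId, defaultBytesPerSec);
    }

    // Called when a config-change notification for this client id arrives.
    void setOverride(String clientId, long bytesPerSec) {
        perClient.put(clientId, bytesPerSec);
    }
}
```

The point of the thread is that this lookup happens inside the broker; producers and consumers never read the override store themselves, so no client-side ZK dependency is introduced.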
Re: [DISCUSS] KIP-21 Configuration Management
Thanks for starting this discussion, Aditya. A few questions/comments:

1. If you change the default values as mentioned in the KIP, do you also overwrite the local config file as part of updating the default value? If not, where does the admin look to find the default values, ZK or the local Kafka config file? What if a config value is different in the two places?

2. I share Gwen's concern around making sure that popular config management tools continue to work with this change. Would love to see how each of those would work with the proposal in the KIP. I don't know enough about each of the tools, but it seems that in some of them you have to define some sort of class with parameter names as config names. How will such tools find out about the config values? In Puppet, if this means that each Puppet agent has to read it from ZK, then the ZK port has to be open to pretty much every machine in the DC. This is a bummer and a very confusing requirement. Not sure if this is really a problem or not (each of those tools might behave differently), though I'm pointing out that this is something worth paying attention to.

3. The wrapper tools that let users read/change configs should not depend on ZK, for the reason mentioned above. It's a pain to assume that the ZK port is open from any machine that needs to run this tool. Ideally what users want is a REST API to the brokers to change or read the config (a la Elasticsearch), but in the absence of a REST API we should think about whether we can write the tool such that it just requires talking to the Kafka broker port. This will require a config RPC.

4. Not sure if the KIP is the right place to discuss the design of propagating the config changes to the brokers, but have you thought about just letting the controller oversee the config changes and propagate them via RPC to the brokers? That way, there is an easier way to express config changes that require all brokers to apply them before the change is called complete. Maybe this is not required, but it is hard to say if we don't discuss the full set of configs that need to be dynamic.

Thanks, Neha
RE: [DISCUSS] KIP-21 Configuration Management
Hey everyone, Thanks for the comments. I'll respond to each one-by-one. In the meantime, can we put this on the agenda for the KIP hangout next week? Thanks, Aditya

From: Neha Narkhede [n...@confluent.io]
Sent: Sunday, May 03, 2015 9:48 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management
RE: [DISCUSS] KIP-21 Configuration Management
Hey Gwen, Thanks for the feedback. As Joel said, these client configs do not introduce a producer/consumer ZK dependency; it is configuration that is needed by the broker. From your comments, I gather that you are more worried about managing broker internal configs via ZooKeeper since we already have a file - so why have two mechanisms? Given that we already manage topic-specific configuration in ZK, it seems a good fit to at least have client configs there, since these config parameters aren't really driven through a file anyway. It also maintains consistency. Even for broker configs, it seems very consistent to have all the overridden configs in one place which is easy to view and change. As you mentioned, users shouldn't ever have to fiddle with ZooKeeper directly; our tooling should provide the ability to view and modify configs on a per-broker basis. I do like your suggestion of reloading config files, but I'm not sure this works easily for everyone. For example, these per-host overrides in config files are often managed by hostname, but what we really want are broker-level overrides, which means they should ideally be tied to a broker id - a Kafka detail. In addition, sometimes the configs pushed to individual hosts aren't the properties files themselves, but rather some company-specific bundle that also contains the Kafka configs. The point I'm trying to make is that in many cases people may not be able to reload configs directly from a file without doing some additional work. As far as propagating configuration changes, perhaps I can clarify this section a bit more. Also, we can do a pass over all the configs in KafkaConfig and have a list of properties that can be converted slowly. Thanks, Aditya

From: Joel Koshy [jjkosh...@gmail.com]
Sent: Thursday, April 30, 2015 5:14 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management
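The precedence discussed in the thread - a per-entity override stored in ZK (e.g. under /config/brokers/<broker_id>) always beating the static value from the broker's properties file - can be pictured as a simple resolver. All names below are illustrative assumptions, not Kafka code:

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical resolver for the precedence chain discussed in the thread:
// a ZK-stored per-broker override always wins over the value read from
// the static server.properties file at startup.
final class OverrideResolver {
    private final Map<String, String> fileDefaults;      // from server.properties
    private final Map<String, String> zkBrokerOverrides; // from ZK, keyed by config name

    OverrideResolver(Map<String, String> fileDefaults, Map<String, String> zkBrokerOverrides) {
        this.fileDefaults = fileDefaults;
        this.zkBrokerOverrides = zkBrokerOverrides;
    }

    // ZK override first, then file default; empty if the key is unknown to both.
    Optional<String> resolve(String key) {
        String v = zkBrokerOverrides.get(key);
        if (v == null) v = fileDefaults.get(key);
        return Optional.ofNullable(v);
    }
}
```

Keying the ZK layer by broker id rather than hostname is exactly what Aditya argues for above: the override follows the broker identity, not whichever host the config-push system happens to target.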