Re: [DISCUSS] KIP-21 Configuration Management

2015-05-15 Thread Jun Rao
Aditya,

In the following, we should encode the config properties as a json map to
be consistent with topic config.

Internally, the znodes are comma-separated key-value pairs where key
represents the configuration property to change.
{version: x, config : {X1=Y1, X2=Y2..}}

Thanks,

Jun
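A sketch of the JSON map encoding Jun is suggesting, assuming the same shape as the existing topic-config znodes. The helper names and the example keys below are invented for illustration, not Kafka code:

```python
import json

# Hypothetical helpers: serialize/deserialize a set of config overrides
# into the JSON map layout described above: {"version": ..., "config": {...}}.
def encode_config_znode(overrides, version=1):
    return json.dumps({"version": version, "config": overrides})

def decode_config_znode(payload):
    data = json.loads(payload)
    return data["version"], data["config"]

# Example payload for a topic-level override (keys are illustrative).
payload = encode_config_znode({"retention.ms": "86400000",
                               "max.message.bytes": "1048576"})
version, config = decode_config_znode(payload)
```

Compared with the comma-separated key=value form, a JSON map round-trips through any JSON library and matches how topic configs are already stored.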

On Fri, May 15, 2015 at 1:09 PM, Aditya Auradkar 
aaurad...@linkedin.com.invalid wrote:

 Yes we did. I just overlooked that line.. cleaning it up now.

 Aditya

 
 From: Gwen Shapira [gshap...@cloudera.com]
 Sent: Friday, May 15, 2015 12:55 PM
 To: dev@kafka.apache.org
 Subject: Re: [DISCUSS] KIP-21 Configuration Management

 The wiki says:
 There will be 3 paths within config
 /config/clients/client_id
 /config/topics/topic_name
 /config/brokers/broker_id
 Didn't we decide that brokers will not be configured dynamically, rather we
 will keep the config in the file?

 On Fri, May 15, 2015 at 10:46 PM, Aditya Auradkar 
 aaurad...@linkedin.com.invalid wrote:

  Updated the wiki to capture our recent discussions. Please read.
 
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
 
  Thanks,
  Aditya
 
  
  From: Joel Koshy [jjkosh...@gmail.com]
  Sent: Tuesday, May 12, 2015 1:09 PM
  To: dev@kafka.apache.org
  Subject: Re: [DISCUSS] KIP-21 Configuration Management
 
  The lack of audit could be addressed to some degree by an internal
  __config_changes topic which can have very long retention. Also, per
  the hangout summary that Gwen sent out it appears that we decided
  against supporting SIGHUP/dynamic configs for the broker.
 
  Thanks,
 
  Joel
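Joel's audit idea can be illustrated without any Kafka client: if every change is an append-only record in a long-retention __config_changes topic, replaying the log recovers both the change history and the current state. Everything below (helper names, record shape) is an invented in-memory stand-in, not Kafka code:

```python
import json
import time

# In-memory stand-in for an append-only audit topic of config changes.
audit_log = []

def record_change(entity, key, value):
    # Each change is one immutable record; nothing is ever overwritten.
    audit_log.append(json.dumps(
        {"ts": time.time(), "entity": entity, "key": key, "value": value}))

def replay(entity):
    """Fold the log in order to recover the entity's current config."""
    config = {}
    for line in audit_log:
        event = json.loads(line)
        if event["entity"] == entity:
            config[event["key"]] = event["value"]
    return config

record_change("topics/page-views", "retention.ms", "86400000")
record_change("topics/page-views", "retention.ms", "172800000")
```

With very long retention, the full log is the audit trail, and replaying a prefix of it gives the config as of any earlier point, which also addresses the rollback concern.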
 
  On Tue, May 12, 2015 at 11:05:06AM -0700, Neha Narkhede wrote:
   Thanks for chiming in, Todd!

   Agree that the lack of audit and rollback is a major downside of moving all
   configs to ZooKeeper. Being able to configure dynamically created entities
   in Kafka is required though. So I think what Todd suggested is a good
   solution to managing all configs - catching SIGHUP for broker configs and
   storing dynamic configs in ZK like we do today.

   On Tue, May 12, 2015 at 10:30 AM, Jay Kreps jay.kr...@gmail.com wrote:

    Hmm, here is how I think we can change the split brain proposal to make it
    a bit better:
    1. Get rid of broker overrides; this is just done in the config file. This
    makes the precedence chain a lot clearer (e.g. zk always overrides file on
    a per-entity basis).
    2. Get rid of the notion of dynamic configs in ConfigDef and in the broker.
    All overrides are dynamic and all server configs are static.
    3. Create an equivalent of LogConfig for ClientConfig and any future config
    type we make.
    4. Generalize the TopicConfigManager to handle multiple types of overrides.

    What we haven't done is try to think through how the pure zk approach would
    work.

    -Jay
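Points 1 and 4 of Jay's list together suggest a single generic override manager: static file configs at the bottom, one dynamic override map per entity type on top, with overrides always winning. A minimal sketch, with all class and method names invented:

```python
# Sketch of "zk always overrides file, per entity": static file configs
# plus one dynamic override map per entity type (topics, clients, ...).
class OverrideManager:
    def __init__(self, static_config):
        self.static = dict(static_config)   # loaded once from the config file
        self.overrides = {}                 # entity_type -> entity_name -> dict

    def apply_override(self, entity_type, name, key, value):
        # Called by the (generalized) config-change watcher.
        self.overrides.setdefault(entity_type, {}) \
                      .setdefault(name, {})[key] = value

    def effective(self, entity_type, name):
        # Dynamic overrides always win over the static file defaults.
        merged = dict(self.static)
        merged.update(self.overrides.get(entity_type, {}).get(name, {}))
        return merged

mgr = OverrideManager({"retention.ms": "604800000"})
mgr.apply_override("topics", "page-views", "retention.ms", "86400000")
```

The same structure handles any future entity type (point 3) without new plumbing: each one is just another key in the overrides map.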
   
   
   
    On Mon, May 11, 2015 at 10:53 PM, Ashish Singh asi...@cloudera.com
    wrote:

     I agree with Joel's suggestion of keeping broker configs in a
     config file and client/topic configs in ZK. A few other projects,
     Apache Solr for one, also do something similar for their configurations.

     On Monday, May 11, 2015, Gwen Shapira gshap...@cloudera.com wrote:

      I like this approach (obviously).
      I am also OK with supporting broker re-read of config file based on ZK
      watch instead of SIGHUP, if we see this as more consistent with the rest
      of our code base.

      Either is fine by me as long as brokers keep the file and just do refresh
      :)

      On Tue, May 12, 2015 at 2:54 AM, Joel Koshy jjkosh...@gmail.com wrote:
 
       So the general concern here is the dichotomy of configs (which we
       already have - i.e., in the form of broker config files vs topic
       configs in zookeeper). We (at LinkedIn) had some discussions on this
       last week and had this very question for the operations team whose
       opinion is I think to a large degree a touchstone for this decision:
       Has the operations team at LinkedIn experienced any pain so far with
       managing topic configs in ZooKeeper (while broker configs are
       file-based)? It turns out that ops overwhelmingly favors the current
       approach. i.e., service configs as file-based configs and client/topic
       configs in ZooKeeper is intuitive and works great. This may be
       somewhat counter-intuitive to devs, but this is one of those decisions
       for which ops input is very critical - because for all practical
       purposes, they are the users in this discussion.

       If we continue with this dichotomy and need to support dynamic config
       for client/topic configs as well as select service configs then there
       will need to be dichotomy in the config change mechanism as well.
       i.e., client/topic configs will change via (say) a ZooKeeper watch and
       the service config will change via a config file re-read (on SIGHUP)
       after config changes have been pushed out to local files. Is this a
       bad thing? Personally, I don't think it is - i.e. I'm in favor of this
       approach. What do others think?

       Thanks,

       Joel

RE: [DISCUSS] KIP-21 Configuration Management

2015-05-15 Thread Aditya Auradkar
Updated the wiki to capture our recent discussions. Please read.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration

Thanks,
Aditya


On Tue, May 12, 2015 at 2:54 AM, Joel Koshy jjkosh...@gmail.com wrote:
   
 So the general concern here is the dichotomy of configs (which we
 already have - i.e., in the form of broker config files vs topic
 configs in zookeeper). We (at LinkedIn) had some discussions on this
 last week and had this very question for the operations team whose
 opinion is I think to a large degree a touchstone for this decision:
 Has the operations team at LinkedIn experienced any pain so far with
 managing topic configs in ZooKeeper (while broker configs are
 file-based)? It turns out that ops overwhelmingly favors the current
 approach. i.e., service configs as file-based configs and
  client/topic
 configs in ZooKeeper is intuitive and works great. This may be
 somewhat counter-intuitive to devs, but this is one of those
  decisions
 for which ops input is very critical - because for all practical
 purposes, they are the users in this discussion.

 If we continue with this dichotomy and need to support dynamic config
 for client/topic configs as well as select service configs then there
 will need to be dichotomy in the config change mechanism as well.
 i.e., client/topic configs will change via (say) a ZooKeeper watch
  and
 the service config will change via a config file re-read (on SIGHUP)
 after config changes have been pushed out to local files. Is this a
 bad thing? Personally, I don't think it is - i.e. I'm in favor of
  this
 approach. What do others think?

 Thanks,

 Joel
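The file-plus-SIGHUP half of the dichotomy Joel describes is easy to sketch: reload a properties-style file into the live config whenever HUP arrives. This is a toy illustration (POSIX-only, invented file format and key names), not broker code:

```python
import os
import signal
import tempfile
import time

# Live config, rebuilt from the local file on every SIGHUP.
CONFIG = {}

def load(path):
    # Minimal properties-file parser: "key=value" lines, '#' comments.
    conf = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#"):
                key, _, value = line.partition("=")
                conf[key.strip()] = value.strip()
    return conf

conf_file = tempfile.NamedTemporaryFile("w", suffix=".properties", delete=False)
conf_file.write("log.retention.hours=168\n")
conf_file.close()

def on_hup(signum, frame):
    # The handler runs in the main thread, as Gwen describes for httpd.
    CONFIG.update(load(conf_file.name))

signal.signal(signal.SIGHUP, on_hup)
CONFIG.update(load(conf_file.name))

# Simulate an operator editing the file and running `kill -HUP <pid>`
# (what a `service kafka reload` wrapper would do).
with open(conf_file.name, "w") as f:
    f.write("log.retention.hours=24\n")
os.kill(os.getpid(), signal.SIGHUP)
time.sleep(0.1)  # let the interpreter deliver the signal to the handler
```

The client/topic half would be a ZooKeeper watch firing the analogous update, so both sides converge on the same in-memory config object.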

  On Mon, May 11, 2015 at 11:08:44PM +0300, Gwen Shapira wrote:
   What Todd said :)

   (I think my ops background is showing...)

   On Mon, May 11, 2015 at 10:17 PM, Todd Palino tpal...@gmail.com wrote:

    I understand your point here, Jay, but I disagree that we can't have two
    configuration systems. We have two different types of configuration
    information. We have configuration that relates to the service itself (the
    Kafka broker), and we have configuration that relates to the content within
    the service (topics). I would put the client configuration (quotas) in with
    the second part, as it is dynamic information. I just don't see a good
    argument for effectively degrading the configuration

Re: [DISCUSS] KIP-21 Configuration Management

2015-05-15 Thread Gwen Shapira
The wiki says:
There will be 3 paths within config
/config/clients/client_id
/config/topics/topic_name
/config/brokers/broker_id
Didn't we decide that brokers will not be configured dynamically, rather we
will keep the config in the file?
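The three paths the wiki lists can be sketched as a tiny helper. The layout follows the wiki text above, but the function itself is hypothetical:

```python
# Hypothetical helper mirroring the three config paths from the wiki:
#   /config/clients/<client_id>
#   /config/topics/<topic_name>
#   /config/brokers/<broker_id>
CONFIG_ROOT = "/config"
ENTITY_TYPES = {"clients", "topics", "brokers"}

def config_path(entity_type, entity_name):
    if entity_type not in ENTITY_TYPES:
        raise ValueError("unknown entity type: %s" % entity_type)
    return "%s/%s/%s" % (CONFIG_ROOT, entity_type, entity_name)

print(config_path("topics", "page-views"))  # /config/topics/page-views
```

(Note the brokers path is exactly the part under discussion here: if broker configs stay file-based, that third entity type would be dropped.)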

On Fri, May 15, 2015 at 10:46 PM, Aditya Auradkar 
aaurad...@linkedin.com.invalid wrote:

 Updated the wiki to capture our recent discussions. Please read.

 https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration

 Thanks,
 Aditya

 

RE: [DISCUSS] KIP-21 Configuration Management

2015-05-15 Thread Aditya Auradkar
Yes we did. I just overlooked that line.. cleaning it up now.

Aditya


From: Gwen Shapira [gshap...@cloudera.com]
Sent: Friday, May 15, 2015 12:55 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

The wiki says:
There will be 3 paths within config
/config/clients/client_id
/config/topics/topic_name
/config/brokers/broker_id
Didn't we decide that brokers will not be configured dynamically, rather we
will keep the config in the file?


Re: [DISCUSS] KIP-21 Configuration Management

2015-05-12 Thread Jay Kreps

Re: [DISCUSS] KIP-21 Configuration Management

2015-05-12 Thread Joel Koshy
...more. Perhaps we can discuss this during the hangout tomorrow?

   3 & 4) I viewed these config changes as mainly administrative
   operations. In that case, it may be reasonable to assume that the ZK port
   is available for communication from the machine these commands are run.
   Having a ConfigChangeRequest (or similar) is nice to have but having a new
   API and sending requests to the controller also changes how we do topic
   based configuration currently. I was hoping to keep this KIP as minimal as
   possible and provide a means to represent and modify client and broker
   based configs in a central place. Are there any concerns if we tackle
   these things in a later KIP?

   Thanks,
   Aditya

   From: Neha Narkhede [n...@confluent.io]
   Sent: Sunday, May 03, 2015 9:48 AM
   To: dev@kafka.apache.org
   Subject: Re: [DISCUSS] KIP-21 Configuration Management

   Thanks for starting this discussion, Aditya. Few questions/comments

   1. If you change the default values like it's mentioned in the KIP, do you
   also overwrite the local config file as part of updating the default value?
   If not, where does the admin look to find the default values, ZK or local
   Kafka config file? What if a config value is different in both places?

   2. I share Gwen's concern around making sure that popular config management
   tools continue to work with this change. Would love to see how each of
   those would work with the proposal in the KIP. I don't know enough about
   each of the tools but seems like in some of the tools, you have to define
   some sort of class with parameter names as config names. How will such
   tools find out about the config values? In Puppet, if this means that each
   Puppet agent has to read it from ZK, this means the ZK port has to be open
   to pretty much every machine in the DC. This is a bummer and a very
   confusing requirement. Not sure if this is really a problem or not (each of
   those tools might behave differently), though pointing out that this is
   something worth paying attention to.

   3. The wrapper tools that let users read/change config tools should not
   depend on ZK for the reason mentioned above. It's a pain to assume that the
   ZK port is open from any machine that needs to run this tool. Ideally what
   users want is a REST API to the brokers to change or read the config (ala
   Elasticsearch), but in the absence of the REST API, we should think if we
   can write the tool such that it just requires talking to the Kafka broker
   port. This will require a config RPC.

   4. Not sure if KIP is the right place to discuss the design of propagating
   the config changes to the brokers, but have you thought about just letting
   the controller oversee the config changes and propagate via RPC to the
   brokers? That way, there is an easier way to express config changes that
   require all brokers to change it for it to be called complete. Maybe this
   is not required, but it is hard to say if we don't discuss the full set of
   configs that need to be dynamic.

   Thanks,
   Neha

   On Fri, May 1, 2015 at 12:53 PM, Jay Kreps jay.kr...@gmail.com wrote:

Re: [DISCUSS] KIP-21 Configuration Management

2015-05-11 Thread Jay Kreps
Sorry I missed the call today :)
   
I think an additional requirement would be:
Make sure that traditional deployment tools (Puppet, Chef, etc) are
  still
capable of managing Kafka configuration.
   
For this reason, I'd like the configuration refresh to be pretty
 close
  to
what most Linux services are doing to force a reload of
 configuration.
AFAIK, this involves handling HUP signal in the main thread to reload
configuration. Then packaging scripts can add something nice like
   service
kafka reload.
   
(See Apache web server:
https://github.com/apache/httpd/blob/trunk/build/rpm/httpd.init#L101
 )
   
Gwen
   
   
On Tue, May 5, 2015 at 8:54 AM, Joel Koshy jjkosh...@gmail.com
  wrote:
   
 Good discussion. Since we will be talking about this at 11am, I
  wanted
 to organize these comments into requirements to see if we are all
 on
 the same page.

 REQUIREMENT 1: Needs to accept dynamic config changes. This needs
 to
 be general enough to work for all configs that we envision may need
  to
 accept changes at runtime. e.g., log (topic), broker, client
  (quotas),
 etc.. possible options include:
 - ZooKeeper watcher
 - Kafka topic
 - Direct RPC to controller (or config coordinator)

 The current KIP is really focused on REQUIREMENT 1 and I think that
  is
 reasonable as long as we don't come up with something that requires
 significant re-engineering to support the other requirements.

 REQUIREMENT 2: Provide consistency of configs across brokers
 (modulo
 per-broker overrides) or at least be able to verify consistency.
  What
 this effectively means is that config changes must be seen by all
 brokers eventually and we should be able to easily compare the full
 config of each broker.

 REQUIREMENT 3: Central config store. Needs to work with plain
 file-based configs and other systems (e.g., puppet). Ideally,
 should
 not bring in other dependencies (e.g., a DB). Possible options:
 - ZooKeeper
 - Kafka topic
 - other? E.g. making it pluggable?

 Any other requirements?

 Thanks,

 Joel
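The verification half of Joel's Requirement 2 might look like a simple diff over each broker's reported full config. This sketch assumes the per-broker configs have already been collected (via whatever mechanism Requirement 1 settles on) as plain dicts:

```python
# Sketch of Requirement 2's "be able to verify consistency": compare each
# broker's full reported config and list the keys whose values disagree.
def config_diff(broker_configs):
    """broker_configs: dict of broker_id -> {key: value}.
    Returns {key: {broker_id: value}} for keys whose values diverge
    (a key missing on some broker shows up as None)."""
    all_keys = set().union(*broker_configs.values())
    divergent = {}
    for key in sorted(all_keys):
        values = {bid: conf.get(key) for bid, conf in broker_configs.items()}
        if len(set(values.values())) > 1:
            divergent[key] = values
    return divergent

diff = config_diff({
    1: {"num.io.threads": "8", "log.dirs": "/data/kafka"},
    2: {"num.io.threads": "4", "log.dirs": "/data/kafka"},
})
```

An empty result means the brokers agree (modulo the per-broker overrides the requirement explicitly exempts, which would be filtered out before the comparison).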

  On Tue, May 05, 2015 at 01:38:09AM +0000, Aditya Auradkar wrote:
   Hey Neha,

   Thanks for the feedback.
   1. In my earlier exchange with Jay, I mentioned the broker writing all
   its configs to ZK (while respecting the overrides). Then ZK can be used
   to view all configs.

   2. Need to think about this a bit more. Perhaps we can discuss this
   during the hangout tomorrow?

   3 & 4) I viewed these config changes as mainly administrative
   operations. In that case, it may be reasonable to assume that the ZK port
   is available for communication from the machine these commands are run.
   Having a ConfigChangeRequest (or similar) is nice to have but having a new
   API and sending requests to the controller also changes how we do topic
   based configuration currently. I was hoping to keep this KIP as minimal as
   possible and provide a means to represent and modify client and broker
   based configs in a central place. Are there any concerns if we tackle
   these things in a later KIP?

   Thanks,
   Aditya
  
  From: Neha Narkhede [n...@confluent.io]
  Sent: Sunday, May 03, 2015 9:48 AM
  To: dev@kafka.apache.org
  Subject: Re: [DISCUSS] KIP-21 Configuration Management
 
  Thanks for starting this discussion, Aditya. Few
 questions/comments
 
  1. If you change the default values like it's mentioned in the
 KIP,
   do
 you
  also overwrite the local config file as part of updating the
  default
 value?
  If not, where does the admin look to find the default values, ZK
 or
local
  Kafka config file? What if a config value is different in both
   places?
 
  2. I share Gwen's concern around making sure that popular config
 management
  tools continue to work with this change. Would love to see how
 each
   of
  those would work with the proposal in the KIP. I don't know
 enough
about
  each of the tools but seems like in some of the tools, you have
 to
define
  some sort of class with parameter names as config names. How will
   such
  tools find out about the config values? In Puppet, if this means
  that
 each
  Puppet agent has to read it from ZK, this means the ZK port has
 to
  be
 open
  to pretty much every machine in the DC. This is a bummer and a
 very
  confusing requirement. Not sure if this is really a problem or
 not
(each
 of
  those tools might behave differently), though pointing out that
  this
   is
  something worth paying attention to.
 
  3. The wrapper tools that let users read/change config tools
 should

Re: [DISCUSS] KIP-21 Configuration Management

2015-05-11 Thread Gwen Shapira
...The server is then responsible for pushing the changes to Chef clients.
 Chef clients support pluggable logic. For example, it can generate a config
 file that Kafka broker will take. If we move all configs to ZK, we can
 customize the Chef client to use our config CLI to make the config changes
 in Kafka. In this model, one probably doesn't need to register every broker
 in Chef for the config push. Not sure if Puppet works in a similar way.

 Also for storing the configs, we probably can't store the broker/global
 level configs in Kafka itself (e.g. in a special topic). The reason is that
 in order to start a broker, we likely need to make some broker level config
 changes (e.g., the default log.dir may not be present, the default port may
 not be available, etc). If we need a broker to be up to make those changes,
 we get into this chicken and egg problem.

 Thanks,

 Jun

On Tue, May 5, 2015 at 4:14 PM, Gwen Shapira gshap...@cloudera.com wrote:

Sorry I missed the call today :)

I think an additional requirement would be: make sure that traditional
deployment tools (Puppet, Chef, etc.) are still capable of managing Kafka
configuration.

For this reason, I'd like the configuration refresh to be pretty close to
what most Linux services are doing to force a reload of configuration.
AFAIK, this involves handling the HUP signal in the main thread to reload
configuration. Then packaging scripts can add something nice like
"service kafka reload".

(See the Apache web server:
https://github.com/apache/httpd/blob/trunk/build/rpm/httpd.init#L101)

Gwen
 
 
On Tue, May 5, 2015 at 8:54 AM, Joel Koshy jjkosh...@gmail.com wrote:
 
Good discussion. Since we will be talking about this at 11am, I wanted to
organize these comments into requirements to see if we are all on the same
page.

REQUIREMENT 1: Needs to accept dynamic config changes. This needs to be
general enough to work for all configs that we envision may need to accept
changes at runtime, e.g., log (topic), broker, client (quotas), etc.
Possible options include:
- ZooKeeper watcher
- Kafka topic
- Direct RPC to controller (or config coordinator)

The current KIP is really focused on REQUIREMENT 1, and I think that is
reasonable as long as we don't come up with something that requires
significant re-engineering to support the other requirements.

REQUIREMENT 2: Provide consistency of configs across brokers (modulo
per-broker overrides), or at least be able to verify consistency. What this
effectively means is that config changes must be seen by all brokers
eventually, and we should be able to easily compare the full config of each
broker.

REQUIREMENT 3: Central config store. Needs to work with plain file-based
configs and other systems (e.g., Puppet). Ideally, should not bring in other
dependencies (e.g., a DB). Possible options:
- ZooKeeper
- Kafka topic
- other? E.g., making it pluggable?

Any other requirements?

Thanks,

Joel
  
On Tue, May 05, 2015 at 01:38:09AM +0000, Aditya Auradkar wrote:
Hey Neha,

Thanks for the feedback.

1. In my earlier exchange with Jay, I mentioned the broker writing all its
configs to ZK (while respecting the overrides). Then ZK can be used to view
all configs.

2. Need to think about this a bit more. Perhaps we can discuss this during
the hangout tomorrow?

3 & 4) I viewed these config changes as mainly administrative operations. In
that case, it may be reasonable to assume that the ZK port is available for
communication from the machine these commands are run on. Having a
ConfigChangeRequest (or similar) is nice to have, but having a new API and
sending requests to the controller also changes how we do topic-based
configuration currently. I was hoping to keep this KIP as minimal as
possible and provide a means to represent and modify client and broker
configs in a central place. Are there any concerns if we tackle these things
in a later KIP?

Thanks,
Aditya
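Aditya's point 1, the broker writing out all its configs "while respecting the overrides", implies a simple layered merge: static file defaults, overridden per broker, overridden per topic (the topic level taking precedence, per the override discussion elsewhere in the thread). A hedged sketch of that merge, with an invented example key:

```python
def effective_config(file_defaults, broker_override=None, topic_override=None):
    """Merge config layers; later updates win, so a topic-level override beats
    a broker-level one, and both beat the static file defaults."""
    conf = dict(file_defaults)
    conf.update(broker_override or {})
    conf.update(topic_override or {})
    return conf
```

The merged dict is what the broker would then publish to ZK for viewing.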


RE: [DISCUSS] KIP-21 Configuration Management

2015-05-11 Thread Aditya Auradkar
I did initially think having everything in ZK was better than the dichotomy
Joel referred to, primarily because all Kafka configs could then be managed
consistently.

I guess the biggest disadvantage of driving broker config primarily from ZK
is that it requires everyone to manage Kafka configuration separately from
other services. Several people have separately mentioned integration issues
with systems like Puppet and Chef. While those tools may support pluggable
logic, it does require everyone to write that additional piece of logic
specifically for Kafka. We would also have to implement a group/fabric/tag
hierarchy (as Ashish mentioned), auditing, and ACL management. While the
potential consistency is nice, perhaps the tradeoff isn't worth it, given
that the resulting system isn't much superior to pushing out new config
files and is also quite disruptive. Since this impacts operations teams the
most, I also think their input is probably the most valuable and should
perhaps drive the outcome.

I also think it is fine to treat topic and client configuration separately 
because they are more like metadata than actual service configuration. 

Aditya

From: Joel Koshy [jjkosh...@gmail.com]
Sent: Monday, May 11, 2015 4:54 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

So the general concern here is the dichotomy of configs (which we
already have - i.e., in the form of broker config files vs topic
configs in zookeeper). We (at LinkedIn) had some discussions on this
last week and had this very question for the operations team whose
opinion is I think to a large degree a touchstone for this decision:
Has the operations team at LinkedIn experienced any pain so far with
managing topic configs in ZooKeeper (while broker configs are
file-based)? It turns out that ops overwhelmingly favors the current
approach: service configs as file-based configs and client/topic
configs in ZooKeeper is intuitive and works great. This may be
somewhat counter-intuitive to devs, but this is one of those decisions
for which ops input is very critical - because for all practical
purposes, they are the users in this discussion.

If we continue with this dichotomy and need to support dynamic config
for client/topic configs as well as select service configs then there
will need to be a dichotomy in the config change mechanism as well.
i.e., client/topic configs will change via (say) a ZooKeeper watch and
the service config will change via a config file re-read (on SIGHUP)
after config changes have been pushed out to local files. Is this a
bad thing? Personally, I don't think it is - i.e. I'm in favor of this
approach. What do others think?

Thanks,

Joel
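If client/topic configs stay in ZooKeeper while service configs stay in files, the ZK side would carry payloads like the JSON map Jun proposed elsewhere in the thread ({"version": x, "config": {...}}). A minimal parser sketch; the exact schema is whatever the KIP settles on, so this shape is an assumption based on Jun's description:

```python
import json

def parse_config_znode(raw_bytes):
    """Parse a /config/topics/<name>-style znode payload of the assumed form
    {"version": 1, "config": {"key": "value", ...}} and return the config map."""
    payload = json.loads(raw_bytes.decode("utf-8"))
    if payload.get("version") != 1:
        raise ValueError("unsupported config znode version: %r" % payload.get("version"))
    return payload["config"]

raw = b'{"version": 1, "config": {"retention.ms": "86400000"}}'
```

A ZooKeeper watch on the znode would re-run this parse on every change notification.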

On Mon, May 11, 2015 at 11:08:44PM +0300, Gwen Shapira wrote:
 What Todd said :)

 (I think my ops background is showing...)

 On Mon, May 11, 2015 at 10:17 PM, Todd Palino tpal...@gmail.com wrote:

  I understand your point here, Jay, but I disagree that we can't have two
  configuration systems. We have two different types of configuration
  information. We have configuration that relates to the service itself (the
  Kafka broker), and we have configuration that relates to the content within
  the service (topics). I would put the client configuration (quotas) in
  with the second part, as it is dynamic information. I just don't see a good
  argument for effectively degrading the configuration for the service
  because of trying to keep it paired with the configuration of dynamic
  resources.
 
  -Todd
 
  On Mon, May 11, 2015 at 11:33 AM, Jay Kreps jay.kr...@gmail.com wrote:
 
   I totally agree that ZK is not in-and-of-itself a configuration
  management
   solution and it would be better if we could just keep all our config in
   files. Anyone who has followed the various config discussions over the
  past
   few years of discussion knows I'm the biggest proponent of immutable
   file-driven config.
  
   The analogy to normal unix services isn't actually quite right though.
   The problem Kafka has is that a number of the configurable entities it
   manages are added dynamically--topics, clients, consumer groups, etc.
  What
   this actually resembles is not a unix service like HTTPD but a database,
   and databases typically do manage config dynamically for exactly the same
   reason.
  
   The last few emails are arguing that files > ZK as a config solution. I
   agree with this, but that isn't really the question, right? The reality is
   that we need to be able to configure dynamically created entities and we
   won't get a satisfactory solution to that using files (e.g. rsync is not
  an
   acceptable topic creation mechanism). What we are discussing is having a
   single config mechanism or multiple. If we have multiple you need to
  solve
   the whole config lifecycle problem for both--management, audit, rollback,
   etc.
  
   Gwen, you were saying we couldn't get rid of the configuration file

Re: [DISCUSS] KIP-21 Configuration Management

2015-05-11 Thread Joel Koshy
Re: [DISCUSS] KIP-21 Configuration Management

2015-05-11 Thread Ashish Singh
Re: [DISCUSS] KIP-21 Configuration Management

2015-05-10 Thread Gwen Shapira
From: Neha Narkhede [n...@confluent.io]
Sent: Sunday, May 03, 2015 9:48 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management
   
Thanks for starting this discussion, Aditya. Few questions/comments:

1. If you change the default values like it's mentioned in the KIP, do you
also overwrite the local config file as part of updating the default value?
If not, where does the admin look to find the default values, ZK or the
local Kafka config file? What if a config value is different in both places?

2. I share Gwen's concern around making sure that popular config management
tools continue to work with this change. Would love to see how each of those
would work with the proposal in the KIP. I don't know enough about each of
the tools, but it seems like in some of them you have to define some sort of
class with parameter names as config names. How will such tools find out
about the config values? In Puppet, if this means that each Puppet agent has
to read it from ZK, the ZK port has to be open to pretty much every machine
in the DC. This is a bummer and a very confusing requirement. Not sure if
this is really a problem or not (each of those tools might behave
differently), though pointing out that this is something worth paying
attention to.

3. The wrapper tools that let users read/change configs should not depend on
ZK for the reason mentioned above. It's a pain to assume that the ZK port is
open from any machine that needs to run this tool. Ideally what users want
is a REST API to the brokers to change or read the config (a la
Elasticsearch), but in the absence of the REST API, we should think about
whether we can write the tool such that it just requires talking to the
Kafka broker port. This will require a config RPC.

4. Not sure if the KIP is the right place to discuss the design of
propagating the config changes to the brokers, but have you thought about
just letting the controller oversee the config changes and propagate them
via RPC to the brokers? That way, there is an easier way to express config
changes that require all brokers to change it for it to be called complete.
Maybe this is not required, but it is hard to say if we don't discuss the
full set of configs that need to be dynamic.

Thanks,
Neha
   
On Fri, May 1, 2015 at 12:53 PM, Jay Kreps jay.kr...@gmail.com wrote:
   
Hey Aditya,

This is great! A couple of comments:

1. Leaving the file config in place is definitely the least disturbance. But
let's really think about getting rid of the files and just having one config
mechanism. There is always a tendency to make everything pluggable, which so
often just leads to two mediocre solutions. Can we do the exercise of trying
to consider fully getting rid of file config and seeing what goes wrong?

2. Do we need to model defaults? The current approach is that if you have a
global config x it is overridden for a topic xyz by /topics/xyz/x, and I
think this could be extended to /brokers/0/x. I think this is simpler. We
need to specify the precedence for these overrides, e.g., if you override at
both the broker and topic level, I think the topic level takes precedence.

3. I recommend we have the producer and consumer config just be an override
under client.id. The override is by client id, and we can have separate
properties for controlling quotas for producers and consumers.

4. Some configs can be changed just by updating the reference; others may
require some action. An example of this is if you want to disable log
compaction (assuming we wanted to make that dynamic): we need to call
shutdown() on the cleaner. I think it may be required to register a listener
callback that gets called when the config changes.

5. For handling the reference, can you explain your plan a bit? Currently we
have an immutable KafkaConfig object with a bunch of vals. That object, or
individual values from it, gets injected all over the code base. I was
thinking something like this:
a. We retain the KafkaConfig object as an immutable object just as today.
b. It is no longer legit to grab values out of that config if they are
changeable.
c. Instead of making KafkaConfig itself mutable, we make KafkaConfiguration,
which has a single volatile reference to the current KafkaConfig.
KafkaConfiguration is what gets passed into various components. So to
access a config you do something like config.instance.myValue. When
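Jay's point 5 can be sketched directly. This is an illustrative Python rendering of the Scala design he describes, not actual Kafka code: the class and field names follow his email, and the attribute name uses underscores instead of Kafka's dotted config keys purely for illustration.

```python
from types import MappingProxyType

class KafkaConfig:
    """Immutable snapshot of config values (point 5a)."""
    def __init__(self, values):
        self._values = MappingProxyType(dict(values))
    def __getattr__(self, name):
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name)

class KafkaConfiguration:
    """Mutable holder with a single reference to the current snapshot (point 5c).
    Components keep the KafkaConfiguration and read config.instance.<value>,
    so swapping the reference updates every reader at once."""
    def __init__(self, initial):
        self.instance = initial
    def update(self, new_values):
        self.instance = KafkaConfig(new_values)  # atomic reference swap

config = KafkaConfiguration(KafkaConfig({"log_retention_hours": 168}))
before = config.instance.log_retention_hours  # components read through .instance
config.update({"log_retention_hours": 72})    # a dynamic config change arrives
after = config.instance.log_retention_hours
```

The point of the indirection is that old KafkaConfig snapshots stay immutable and consistent; a component that needs several related values reads them from one snapshot rather than racing a mutating object.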

Re: [DISCUSS] KIP-21 Configuration Management

2015-05-10 Thread Todd Palino
 and we should be able to easily compare the full
config of each broker.
   
REQUIREMENT 3: Central config store. Needs to work with plain
file-based configs and other systems (e.g., puppet). Ideally, should
not bring in other dependencies (e.g., a DB). Possible options:
- ZooKeeper
- Kafka topic
- other? E.g. making it pluggable?
   
Any other requirements?
   
Thanks,
   
Joel
   
On Tue, May 05, 2015 at 01:38:09AM +0000, Aditya Auradkar wrote:
 Hey Neha,

 Thanks for the feedback.
 1. In my earlier exchange with Jay, I mentioned the broker writing all
 its configs to ZK (while respecting the overrides). Then ZK can be used
 to view all configs.

 2. Need to think about this a bit more. Perhaps we can discuss this
 during the hangout tomorrow?

 3 & 4) I viewed these config changes as mainly administrative
 operations. In that case, it may be reasonable to assume that the ZK
 port is available for communication from the machine these commands are
 run on. Having a ConfigChangeRequest (or similar) is nice to have, but
 having a new API and sending requests to the controller also changes
 how we do topic-based configuration currently. I was hoping to keep
 this KIP as minimal as possible and provide a means to represent and
 modify client and broker configs in a central place. Are there any
 concerns if we tackle these things in a later KIP?

 Thanks,
 Aditya
 ________________________________
 From: Neha Narkhede [n...@confluent.io]
 Sent: Sunday, May 03, 2015 9:48 AM
 To: dev@kafka.apache.org
 Subject: Re: [DISCUSS] KIP-21 Configuration Management

 Thanks for starting this discussion, Aditya. Few questions/comments:

 1. If you change the default values like it's mentioned in the KIP, do
 you also overwrite the local config file as part of updating the
 default value? If not, where does the admin look to find the default
 values, ZK or the local Kafka config file? What if a config value is
 different in both places?

 2. I share Gwen's concern around making sure that popular config
 management tools continue to work with this change. Would love to see
 how each of those would work with the proposal in the KIP. I don't know
 enough about each of the tools, but it seems like in some of them you
 have to define some sort of class with parameter names as config names.
 How will such tools find out about the config values? In Puppet, if
 this means that each Puppet agent has to read it from ZK, the ZK port
 has to be open to pretty much every machine in the DC. This is a bummer
 and a very confusing requirement. Not sure if this is really a problem
 or not (each of those tools might behave differently), though pointing
 out that this is something worth paying attention to.

 3. The wrapper tools that let users read/change configs should not
 depend on ZK for the reason mentioned above. It's a pain to assume that
 the ZK port is open from any machine that needs to run this tool.
 Ideally what users want is a REST API to the brokers to change or read
 the config (a la Elasticsearch), but in the absence of the REST API, we
 should think about whether we can write the tool such that it just
 requires talking to the Kafka broker port. This will require a config
 RPC.

 4. Not sure if a KIP is the right place to discuss the design of
 propagating the config changes to the brokers, but have you thought
 about just letting the controller oversee the config changes and
 propagate them via RPC to the brokers? That way, there is an easier way
 to express config changes that require all brokers to apply them before
 being called complete. Maybe this is not required, but it is hard to
 say if we don't discuss the full set of configs that need to be
 dynamic.

 Thanks,
 Neha

 On Fri, May 1, 2015 at 12:53 PM, Jay Kreps jay.kr...@gmail.com wrote:

  Hey Aditya,

  This is great! A couple of comments:

  1. Leaving the file config in place is definitely the least
  disturbance. But let's really think about getting rid of the files
  and just have one config mechanism. There is always a tendency to
  make everything pluggable, which so often just leads to two mediocre
  solutions. Can we do the exercise of trying to consider fully getting
  rid of file config and seeing what goes wrong?

  2. Do we need to model defaults? The current approach is that if you
  have a global config x it is overridden for a topic xyz by
  /topics/xyz/x, and I think this could be extended to /brokers/0/x. I
  think this is simpler. We need to specify the precedence for these
  overrides, e.g. if you override at the broker and topic level
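Jay's override hierarchy (a global config x shadowed by /brokers/0/x and /topics/xyz/x) can be sketched as a simple layered lookup. The key names and the precedence order (global < broker < topic) below are illustrative assumptions, not the KIP's settled design:

```python
# Hypothetical sketch of the override lookup Jay describes: a global
# default can be shadowed by a per-broker value and then by a
# per-topic value. Names and precedence are assumptions for
# illustration only.

GLOBAL = {"retention.ms": 604800000, "max.message.bytes": 1000012}
BROKER_OVERRIDES = {0: {"max.message.bytes": 2000000}}
TOPIC_OVERRIDES = {"xyz": {"retention.ms": 86400000}}

def effective_config(broker_id, topic):
    """Resolve a config by layering overrides: global < broker < topic."""
    config = dict(GLOBAL)
    config.update(BROKER_OVERRIDES.get(broker_id, {}))
    config.update(TOPIC_OVERRIDES.get(topic, {}))
    return config
```

With this layering, a broker- or topic-scoped znode only needs to store the keys it overrides; everything else falls through to the global defaults.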

RE: [DISCUSS] KIP-21 Configuration Management

2015-05-07 Thread Aditya Auradkar
Theoretically, using just the broker id and zk connect string it should be 
possible for the broker to read all configs from Zookeeper. Like Ashish said, 
we should probably take a look and make sure.

Additionally, we've spoken about making config changes only through a broker 
API. However, we also need a way to change properties even if a specific 
broker, controller or entire cluster is down or unable to accept config change 
requests for any reason. This implies that we need a mechanism to make config 
changes by talking to ZooKeeper directly and that we can't rely solely on the 
broker/controller API.

Aditya
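The znode payload format Jun proposed earlier in the thread ({"version": x, "config": {...}}) is easy to sketch. A tool talking to ZooKeeper directly would write this JSON to a path such as /config/topics/<name>; the helper name below is hypothetical and the actual ZK write is omitted:

```python
import json

def make_config_znode_payload(overrides, version=1):
    """Build the JSON stored in a config znode: a version field plus a
    map of config key -> value, matching the {"version": x, "config":
    {...}} shape discussed in this thread. Writing it to ZooKeeper
    (e.g. at /config/topics/<name>) would be done with a ZK client,
    which is omitted here."""
    return json.dumps({"version": version, "config": dict(overrides)})

payload = make_config_znode_payload({"retention.ms": "86400000"})
parsed = json.loads(payload)
```

Because the payload is versioned, a broker watching the znode can reject or specially handle formats it does not understand.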


From: Ashish Singh [asi...@cloudera.com]
Sent: Thursday, May 07, 2015 8:19 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Agreed :). However, the other concerns remain. Do you think just providing
zk info to the broker will be sufficient? I will myself spend some time
looking into the existing required configs.

On Thursday, May 7, 2015, Jun Rao j...@confluent.io wrote:

 Ashish,

 3. This is true. However, using files has the same problem. You can't store
 the location of the file in the file itself. The location of the file has
 to be passed out of band into Kafka.

 Thanks,

 Jun

 On Wed, May 6, 2015 at 6:34 PM, Ashish Singh asi...@cloudera.com wrote:

  Hey Jun,
 
  Where does the broker get the info, which zk it needs to talk to?
 
  On Wednesday, May 6, 2015, Jun Rao j...@confluent.io wrote:
 
   Ashish,
  
   3. Just want to clarify. Why can't you store ZK connection config in
 ZK?
   This is a property for ZK clients, not ZK server.
  
   Thanks,
  
   Jun
  
   On Wed, May 6, 2015 at 5:48 PM, Ashish Singh asi...@cloudera.com
   wrote:

    I too would like to share some concerns that we came up with while
    discussing the effect that moving configs to ZooKeeper will have.

    1. Kafka will start to become a configuration management tool to
    some degree, and be subject to all the things such tools are
    commonly asked to do. Kafka'll likely need to re-implement the
    role / group / service hierarchy that CM uses. Kafka'll need some
    way to conveniently dump its configs so they can be re-imported
    later, as a backup tool. People will want this to be audited, which
    means you'd need distinct logins for different people, and user
    management. You can try to push some of this stuff onto tools like
    CM, but this is Kafka going out of its way to be difficult to
    manage, and most projects don't want to do that. Being unique in
    how configuration is done is strictly a bad thing for both
    integration and usability. Probably lots of other stuff. Seems like
    a bad direction.

    2. Where would the default config live? If we decide on keeping the
    config files around just for getting the default config, then I
    think on restart, the config file will be ignored. This creates an
    obnoxious asymmetry between how you configure Kafka the first time
    and how you update it. You have to learn 2 ways of making config
    changes. If there was a mistake in your original config file, you
    can't just edit the config file and restart; you have to go through
    the API. Reading configs is also more irritating. This all creates
    a learning curve for users of Kafka that will make it harder to use
    than other projects. This is also a backwards-incompatible change.

    3. All Kafka configs living in ZK is strictly impossible, since at
    the very least ZK connection configs cannot be stored in ZK. So you
    will have a file where some values are in effect but others are
    not, which is again confusing. Also, since you are still reading
    the config file on first start, there are still multiple sources of
    truth, or at least the appearance of such to the user.

    On Wed, May 6, 2015 at 5:33 PM, Jun Rao j...@confluent.io wrote:

     One of the Chef users confirmed that Chef integration could still
     work if all configs are moved to ZK. My rough understanding of how
     Chef works is that a user first registers a service host with a
     Chef server. After that, a Chef client will be run on the service
     host. The user can then push config changes intended for a
     service/host to the Chef server. The server is then responsible
     for pushing the changes to Chef clients. Chef clients support
     pluggable logic. For example, it can generate a config file that
     the Kafka broker will take. If we move all configs to ZK, we can
     customize the Chef client to use our config CLI to make the config
     changes in Kafka. In this model, one probably doesn't need to
     register every broker in Chef for the config push. Not sure if
     Puppet works in a similar way.

     Also for storing the configs, we

Re: [DISCUSS] KIP-21 Configuration Management

2015-05-07 Thread Jun Rao
Re: [DISCUSS] KIP-21 Configuration Management

2015-05-06 Thread Ashish Singh
Re: [DISCUSS] KIP-21 Configuration Management

2015-05-06 Thread Jun Rao
Re: [DISCUSS] KIP-21 Configuration Management

2015-05-06 Thread Ashish Singh
Re: [DISCUSS] KIP-21 Configuration Management

2015-05-06 Thread Ashish Singh
Re: [DISCUSS] KIP-21 Configuration Management

2015-05-06 Thread Jun Rao
One of the Chef users confirmed that Chef integration could still work if
all configs are moved to ZK. My rough understanding of how Chef works is
that a user first registers a service host with a Chef server. After that,
a Chef client will be run on the service host. The user can then push
config changes intended for a service/host to the Chef server. The server
is then responsible for pushing the changes to Chef clients. Chef clients
support pluggable logic. For example, it can generate a config file that
Kafka broker will take. If we move all configs to ZK, we can customize the
Chef client to use our config CLI to make the config changes in Kafka. In
this model, one probably doesn't need to register every broker in Chef for
the config push. Not sure if Puppet works in a similar way.

Also for storing the configs, we probably can't store the broker/global
level configs in Kafka itself (e.g. in a special topic). The reason is that
in order to start a broker, we likely need to make some broker level config
changes (e.g., the default log.dir may not be present, the default port may
not be available, etc). If we need a broker to be up to make those changes,
we get into this chicken and egg problem.

Thanks,

Jun
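Jun's chicken-and-egg point suggests a two-layer bootstrap: startup-critical settings come from the local file, and ZK-sourced overrides are layered on afterwards, never replacing the keys needed to reach ZK in the first place. A minimal sketch, with illustrative key names:

```python
# Sketch of the bootstrap Jun describes: configs needed to start the
# broker must come from the local file; once connected, per-broker
# overrides read from ZK are layered on top. Key names and the set of
# bootstrap-only keys are illustrative assumptions.

FILE_CONFIG = {          # read from the local properties file at startup
    "broker.id": "0",
    "port": "9092",
    "log.dirs": "/tmp/kafka-logs",
    "zookeeper.connect": "localhost:2181",
}

def apply_zk_overrides(file_config, zk_overrides):
    """ZK values win, except for bootstrap-only keys that cannot
    themselves live in ZK (the chicken-and-egg problem)."""
    bootstrap_only = {"zookeeper.connect", "broker.id"}
    merged = dict(file_config)
    for key, value in zk_overrides.items():
        if key not in bootstrap_only:
            merged[key] = value
    return merged

merged = apply_zk_overrides(
    FILE_CONFIG, {"port": "9093", "zookeeper.connect": "other:2181"})
```

Note how the override of zookeeper.connect is deliberately ignored: that key is the out-of-band input that makes reading everything else from ZK possible.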

On Tue, May 5, 2015 at 4:14 PM, Gwen Shapira gshap...@cloudera.com wrote:


Re: [DISCUSS] KIP-21 Configuration Management

2015-05-05 Thread Gwen Shapira
Sorry I missed the call today :)

I think an additional requirement would be:
Make sure that traditional deployment tools (Puppet, Chef, etc) are still
capable of managing Kafka configuration.

For this reason, I'd like the configuration refresh to be pretty close to
what most Linux services are doing to force a reload of configuration.
AFAIK, this involves handling HUP signal in the main thread to reload
configuration. Then packaging scripts can add something nice like service
kafka reload.

(See Apache web server:
https://github.com/apache/httpd/blob/trunk/build/rpm/httpd.init#L101)
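
A minimal sketch of this reload-on-HUP pattern in Python (the config path and
the key=value parsing are assumptions for illustration, not Kafka's actual
mechanism):

```python
import signal
import threading

# Hypothetical location of the broker config file (illustrative only).
CONFIG_PATH = "/etc/kafka/server.properties"

_lock = threading.Lock()
current_config = {}

def load_config(path):
    """Parse a simple key=value properties file into a dict."""
    cfg = {}
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            cfg[key.strip()] = value.strip()
    return cfg

def handle_sighup(signum, frame):
    """On SIGHUP, re-read the file and atomically swap in the new config."""
    global current_config
    new_config = load_config(CONFIG_PATH)
    with _lock:
        current_config = new_config

# Handlers must be registered from the main thread; SIGHUP is Unix-only.
if hasattr(signal, "SIGHUP"):
    signal.signal(signal.SIGHUP, handle_sighup)
```

A packaging script's `service kafka reload` would then just send
`kill -HUP <pid>` to the broker process.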

Gwen


On Tue, May 5, 2015 at 8:54 AM, Joel Koshy jjkosh...@gmail.com wrote:


Re: [DISCUSS] KIP-21 Configuration Management

2015-05-05 Thread Joel Koshy
Good discussion. Since we will be talking about this at 11am, I wanted
to organize these comments into requirements to see if we are all on
the same page.

REQUIREMENT 1: Needs to accept dynamic config changes. This needs to
be general enough to work for all configs that we envision may need to
accept changes at runtime: e.g., log (topic), broker, client (quotas),
etc. Possible options include:
- ZooKeeper watcher
- Kafka topic
- Direct RPC to controller (or config coordinator)

The current KIP is really focused on REQUIREMENT 1 and I think that is
reasonable as long as we don't come up with something that requires
significant re-engineering to support the other requirements.

REQUIREMENT 2: Provide consistency of configs across brokers (modulo
per-broker overrides) or at least be able to verify consistency.  What
this effectively means is that config changes must be seen by all
brokers eventually and we should be able to easily compare the full
config of each broker.

REQUIREMENT 3: Central config store. Needs to work with plain
file-based configs and other systems (e.g., puppet). Ideally, should
not bring in other dependencies (e.g., a DB). Possible options:
- ZooKeeper
- Kafka topic
- other? E.g. making it pluggable?

Any other requirements?

Thanks,

Joel

On Tue, May 05, 2015 at 01:38:09AM +, Aditya Auradkar wrote:

Re: [DISCUSS] KIP-21 Configuration Management

2015-05-05 Thread Neha Narkhede
Joel, thanks for summarizing the requirements. It makes sense for the KIP
to focus on Req #1, unless supporting future dynamic configs warrants a
completely different design. My main concern is going with a design that
keeps only quotas in mind and then continuing to shoehorn other dynamic
configs into that model even if it doesn't work that well.

1. In my earlier exchange with Jay, I mentioned the broker writing all its
 configs to ZK (while respecting the overrides). Then ZK can be used to view
 all configs.


My concern with supporting both is that it will be confusing for anyone to
know where to look to find the final value of a config or be able to tell
if a particular broker hasn't picked up a config value. Maybe you have
thought about this; I am unclear about the fix you have in mind.

 2. Need to think about this a bit more. Perhaps we can discuss this during
 the hangout tomorrow?


This isn't relevant for LI but is important for a lot of users. So we
should definitely state how those tools would continue to work with this
change in the KIP.

3 & 4) I viewed these config changes as mainly administrative operations.
 In that case, it may be reasonable to assume that the ZK port is available
 for communication from the machines these commands are run on.


I'm not so sure about this assumption.


 Having a ConfigChangeRequest (or similar) is nice to have but having a new
 API and sending requests to controller also change how we do topic based
 configuration currently. I was hoping to keep this KIP as minimal as
 possible and provide a means to represent and modify client and broker
 based configs in a central place. Are there any concerns if we tackle these
 things in a later KIP?


I don't have concerns about reducing the scope of this KIP as long as we
are sure the approach we pick is the right direction for dynamic config and
further changes are just incremental, not a redesign.

On Tue, May 5, 2015 at 8:54 AM, Joel Koshy jjkosh...@gmail.com wrote:


Re: [DISCUSS] KIP-21 Configuration Management

2015-05-04 Thread Joe Stein
Aditya, when I think about the motivation of not having to restart brokers
to change a config, I think about all of the configurations I have seen
changed in brokers followed by a restart (which is just about all of
them). What I mean by "stop the world" is when producers and/or consumers
will not be able to use the broker(s) for a period of time, or something
within the broker holds/blocks everything for the changes to take effect,
and leader election or an ISR change is going to occur.

Let's say someone wanted to change replicaFetchMaxBytes
or replicaFetchBackoffMs dynamically: you would have to stop the
ReplicaFetcherManager. If you use a watcher, then all brokers will have to
stop and (hopefully) start ReplicaFetcherManager at the same time. Or let's
say someone wanted to change NumNetworkThreads: the entire SocketServer for
every broker would have to stop and (hopefully) start at the same time. I
believe most of the configurations fall into this category, and using a
watcher notification to every broker without some control is going to be a
problem. If the notification just goes to the controller and the controller
is able to manage the processing for every broker, that might work, but it
doesn't solve all the problems to be worked on. We would also have to think
about what to do for the controller broker itself (unless we eventually
make the controller not a broker), as well as how to deal with changes that
could take brokers in and out of the ISR or cause leader election. Ideally
we can make these changes without stopping the world (not just having the
controller manage a broker-by-broker restart), so that brokers that are
leaders remain leaders (perhaps the connections for producing/consuming get
buffered or something) when (if) they come back online.

The thing is that lots of folks want all (or as many as possible) of the
configurations to be dynamic, and I am concerned that if we don't code for
the harder cases then we will only have one or two configurations able to
be dynamic. If the motivation for this KIP is just to make quotas work,
that is ok.

The more I think about it, the less sure I am that just labeling certain
configs as dynamic is going to be helpful, because folks will still have to
manage updates for all the configurations and broker restarts, plus a new
burden of understanding dynamic properties. I think we need to add
solutions that make things easier for folks without adding new items for
them to contend with.

Thanks!

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Sun, May 3, 2015 at 8:23 PM, Aditya Auradkar 
aaurad...@linkedin.com.invalid wrote:

 Hey Joe,

 Can you elaborate on what you mean by a "stop the world" change? In this
 protocol, we can target notifications to a subset of brokers in the cluster
 (the controller, if we need to). Is the AdminChangeNotification a ZK
 notification or a request type exposed by each broker?

 Thanks,
 Aditya

 
 From: Joe Stein [joe.st...@stealth.ly]
 Sent: Friday, May 01, 2015 5:25 AM
 To: dev@kafka.apache.org
 Subject: Re: [DISCUSS] KIP-21 Configuration Management

 Hi Aditya, thanks for the write up and focusing on this piece.

 Agreed, we need a way to make broker changes dynamically without rolling
 restarts.

 I think, though, that if every broker applies changes as it gets
 notifications, it is going to limit which configs can be dynamic.

 We could never deliver a stop-the-world configuration change, because it
 would then happen to every broker in the entire cluster at the same time.

 Can maybe just the controller get the notification?

 And we provide a layer for brokers to work with the controller to do the
 config change operations at its discretion (so it can stop things if it
 needs to): the controller gets the notification and sends an
 AdminChangeNotification to brokers [X .. N]; then brokers can do their
 thing, even sending a response for heartbeating while a change takes the
 few milliseconds it needs, or crashes. We need to go through both
 scenarios.

 I am worried we put this change in like this and it works for quotas and
 maybe a few other things but nothing else gets dynamic and we don't get far
 enough for almost no more rolling restarts.

 ~ Joe Stein
 - - - - - - - - - - - - - - - - -

   http://www.stealth.ly
 - - - - - - - - - - - - - - - - -

 On Thu, Apr 30, 2015 at 8:14 PM, Joel Koshy jjkosh...@gmail.com wrote:

  1. I have deep concerns about managing configuration in ZooKeeper.
  First, Producers and Consumers shouldn't depend on ZK at all, this
  seems
  to add back a dependency we are trying to get away from.
 
  The KIP probably needs to be clarified here - I don't think Aditya was
  referring to client (producer/consumer) configs. These are global
  client-id-specific configs that need to be managed centrally.
  (Specifically, quota overrides on a per-client basis).
 
 



RE: [DISCUSS] KIP-21 Configuration Management

2015-05-04 Thread Aditya Auradkar
Hey Neha,

Thanks for the feedback.
1. In my earlier exchange with Jay, I mentioned the broker writing all its
configs to ZK (while respecting the overrides). Then ZK can be used to view all
configs.

2. Need to think about this a bit more. Perhaps we can discuss this during the 
hangout tomorrow?

3 & 4) I viewed these config changes as mainly administrative operations. In
that case, it may be reasonable to assume that the ZK port is available for
communication from the machines these commands are run on. Having a
ConfigChangeRequest (or similar) is nice to have, but having a new API and
sending requests to the controller also changes how we do topic-based
configuration currently. I was hoping to keep this KIP as minimal as possible
and provide a means to represent and modify client and broker configs in a
central place. Are there any concerns if we tackle these things in a later KIP?

Thanks,
Aditya

From: Neha Narkhede [n...@confluent.io]
Sent: Sunday, May 03, 2015 9:48 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Thanks for starting this discussion, Aditya. Few questions/comments

1. If you change the default values like it's mentioned in the KIP, do you
also overwrite the local config file as part of updating the default value?
If not, where does the admin look to find the default values, ZK or local
Kafka config file? What if a config value is different in both places?

2. I share Gwen's concern around making sure that popular config management
tools continue to work with this change. Would love to see how each of
those would work with the proposal in the KIP. I don't know enough about
each of the tools but seems like in some of the tools, you have to define
some sort of class with parameter names as config names. How will such
tools find out about the config values? In Puppet, if this means that each
Puppet agent has to read it from ZK, this means the ZK port has to be open
to pretty much every machine in the DC. This is a bummer and a very
confusing requirement. Not sure if this is really a problem or not (each of
those tools might behave differently), though pointing out that this is
something worth paying attention to.

3. The wrapper tools that let users read or change configs should not
depend on ZK, for the reason mentioned above. It's a pain to assume that the
ZK port is open from any machine that needs to run this tool. Ideally what
users want is a REST API to the brokers to change or read the config (a la
Elasticsearch), but in the absence of the REST API, we should think if we
can write the tool such that it just requires talking to the Kafka broker
port. This will require a config RPC.

4. Not sure if KIP is the right place to discuss the design of propagating
the config changes to the brokers, but have you thought about just letting
the controller oversee the config changes and propagate via RPC to the
brokers? That way, there is an easier way to express config changes that
require all brokers to change it for it to be called complete. Maybe this
is not required, but it is hard to say if we don't discuss the full set of
configs that need to be dynamic.

Thanks,
Neha

On Fri, May 1, 2015 at 12:53 PM, Jay Kreps jay.kr...@gmail.com wrote:

 Hey Aditya,

 This is a great! A couple of comments:

 1. Leaving the file config in place is definitely the least disturbance.
 But let's really think about getting rid of the files and just have one
 config mechanism. There is always a tendency to make everything pluggable
 which so often just leads to two mediocre solutions. Can we do the exercise
 of trying to consider fully getting rid of file config and seeing what goes
 wrong?

 2. Do we need to model defaults? The current approach is that if you have a
 global config x it is overridden for a topic xyz by /topics/xyz/x, and I
 think this could be extended to /brokers/0/x. I think this is simpler. We
 need to specify the precedence for these overrides, e.g. if you override at
 the broker and topic level I think the topic level takes precedence.
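
 A sketch of that precedence order (topic override over broker override over
 the global default); the keys and values below are invented for
 illustration:

```python
def resolve_config(key, defaults, broker_overrides, topic_overrides):
    """Return the effective value for `key`: topic > broker > default."""
    for layer in (topic_overrides, broker_overrides, defaults):
        if key in layer:
            return layer[key]
    raise KeyError("unknown config: " + key)

# Illustrative values, mirroring the proposed /topics/xyz/x and /brokers/0/x
# override znodes.
defaults = {"retention.ms": "604800000", "max.message.bytes": "1000012"}
broker_overrides = {"retention.ms": "86400000"}   # from /brokers/0
topic_overrides = {"retention.ms": "3600000"}     # from /topics/xyz
```

 With both overrides present, `resolve_config` returns the topic-level value;
 with no topic override, the broker-level value wins over the default.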

 3. I recommend we have the producer and consumer config just be an override
 under client.id. The override is by client id and we can have separate
 properties for controlling quotas for producers and consumers.

 4. Some configs can be changed just by updating the reference, others may
 require some action. An example of this is if you want to disable log
 compaction (assuming we wanted to make that dynamic) we need to call
 shutdown() on the cleaner. I think it may be required to register a
 listener callback that gets called when the config changes.
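
 Point 4 suggests a listener-callback registry. A sketch of what that could
 look like (all names here are invented, not the eventual Kafka API):

```python
class DynamicConfig:
    """Holds mutable config values and notifies listeners on change."""

    def __init__(self, initial):
        self._values = dict(initial)
        self._listeners = {}  # key -> list of callbacks

    def get(self, key):
        return self._values[key]

    def register_listener(self, key, callback):
        """callback(old_value, new_value) runs whenever `key` changes."""
        self._listeners.setdefault(key, []).append(callback)

    def update(self, key, value):
        old = self._values.get(key)
        self._values[key] = value
        for callback in self._listeners.get(key, []):
            callback(old, value)

# Example: a component reacting to log compaction being disabled
# (a real listener might call cleaner.shutdown() here).
config = DynamicConfig({"log.cleaner.enable": "true"})
events = []
config.register_listener(
    "log.cleaner.enable",
    lambda old, new: events.append((old, new)),
)
config.update("log.cleaner.enable", "false")
```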

 5. For handling the reference can you explain your plan a bit? Currently we
 have an immutable KafkaConfig object with a bunch of vals. That or
 individual values in there get injected all over the code base. I was
 thinking something like this:
 a. We retain the KafkaConfig object as an immutable object just as today.
 b

RE: [DISCUSS] KIP-21 Configuration Management

2015-05-04 Thread Aditya Auradkar
1. Essentially, I think removing the option to configure properties via file 
may be a big change for everyone. Having said that, your points are very valid. 
I guess we can discuss this a bit more during the KIP hangout. 

4. Yes, we will need to make some changes to update the MetricConfig for any 
metric. I left it out because I felt it wasn't strictly related to the KIP. 

Thanks for the nice summary on the implementation breakdown. Basically, the KIP 
should provide a uniform mechanism to change any type of config dynamically but 
the work to actually convert configs can be out of scope.  

Thanks,
Aditya


From: Jay Kreps [jay.kr...@gmail.com]
Sent: Monday, May 04, 2015 2:00 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Hey Aditya,

1. I would argue for either staying with what we have or else moving to a
better solution, but not doing both. A solution that uses both is going to
be quite complex to figure out what is configured and where it comes from.
If you think this is needed let's try to construct the argument for why it
is needed. Like in the workflow you described, think how confusing that will
be--the first time the broker starts it uses the file, but after that,
if you change the file nothing happens because it has copied the file into
ZK. Instead let's do the research to figure out why people would object to
a pure-zk solution and then see if we can't address all those concerns.

4. I think it is fine to implement this in a second phase, but I think you
will need it even to be able to update the MetricConfigs to execute the
quota change, right?

Not sure if I follow your question on implementation. I think what you are
saying might be doing something like this as a first pass
a. Generalize the TopicConfigManager to handle any type of config override
as described in the doc (broker, topic, client, etc)
b. Implement the concept of mutable configs in ConfigDef
c. Add the client-level overrides for quotas and make sure the topic
overrides all still work
d. Not actually do the work to make most of the broker configs mutable
since that will involve passing around the KafkaConfiguration object much
more broadly in places where we have static wiring now. This would be done
as-needed as we made more configs dynamic.

Is that right?

Personally I think that makes sense. The only concern is just that we get
to a good stopping point so that people aren't left in some half-way state
that we end up redoing in the next release. So I think getting to a final
state with the configuration infrastructure is important, but actually
making all the configs mutable can be done gradually.

-Jay

On Mon, May 4, 2015 at 1:31 PM, Aditya Auradkar 
aaurad...@linkedin.com.invalid wrote:

 Hey Jay,

 Thanks for the feedback.

 1. We can certainly discuss what it means to remove the file configuration
 as a thought exercise. However, is this something we want to do for real?
 IMO, we can remove file configuration by having all configs stored in
 zookeeper. The flow can be:
 - Broker starts and reads all the configs from ZK. (overrides)
 - Apply them on top of the defaults that are hardcoded within the broker.
 This should simulate file-based config behavior as it works currently.
 - Potentially, we can write back all the merged configs to zookeeper
 (defaults + overrides). This means that the entire config of that broker is
 in ZK.

 Thoughts?

 2. Good point. All overridden configs (topic, client level) will have a
 corresponding broker config that serves as a default. It should be
 sufficient to change that broker config dynamically and that effectively
 means that the default has been changed. The overrides on a per
 topic/client basis still take precedence. So yeah, I don't think we need to
 model defaults explicitly. Using an example to be sure we are on the same
 page, let's say we wanted to increase the log retention time for all topics
 without having to create a separate override for each topic, we could
 simply change the log.retention.time under /brokers/broker_id to the
 desired value and that should change the default log retention for everyone
 (apart from the explicitly overridden ones on a per-topic basis).

 3. I thought it was cleaner to simply separate the producer and consumer
 configs but I guess if they present the same clientId, they are essentially
 the same client. I'll follow your suggestion.

 4. Interesting that you mention this. I actually thought about having
 callbacks but I left it out of the initial proposal since I wanted to keep
 it relatively simple. The only configs we can change by checking references
 are the ones we check frequently while processing requests (or something
 periodic). I shall incorporate this in the KIP.

 5. What you are proposing sounds good. Initially, I was planning to push
 down everything to KafkaConfig by not having immutable vals within. Having
 a wrapper (KafkaConfiguration

Re: [DISCUSS] KIP-21 Configuration Management

2015-05-04 Thread Jay Kreps
 to a subset of brokers in the
 cluster
  (controller if we need to). Is the AdminChangeNotification a ZK
  notification or a request type exposed by each broker?
 
  Thanks,
  Aditya
 
  
  From: Joe Stein [joe.st...@stealth.ly]
  Sent: Friday, May 01, 2015 5:25 AM
  To: dev@kafka.apache.org
  Subject: Re: [DISCUSS] KIP-21 Configuration Management
 
  Hi Aditya, thanks for the write up and focusing on this piece.
 
  Agreed, we need a way to do broker changes dynamically without
  rolling restarts.
 
  I think, though, that if every broker is getting changes with notifications
  it is going to limit which configs can be dynamic.

  We could never deliver a stop-the-world configuration change, because then
  that would happen on the entire cluster, on every broker at the same time.
 
  Can maybe just the controller get the notification?
 
  And we provide a layer for brokers to work with the controller to do the
  config change operations at its discretion (so it can stop things if it
  needs to).
 
  controller gets notification, sends AdminChangeNotification to broker [X
 ..
  N] then brokers can do their things, even send a response for
 heartbeating
  while it takes the few milliseconds it needs or crashes. We need to go
  through both scenarios.
 
  I am worried that we put this change in like this, it works for quotas and
  maybe a few other things, but nothing else gets dynamic and we don't get
  far enough toward eliminating rolling restarts.
 
  ~ Joe Stein
  - - - - - - - - - - - - - - - - -
 
http://www.stealth.ly
  - - - - - - - - - - - - - - - - -
 
  On Thu, Apr 30, 2015 at 8:14 PM, Joel Koshy jjkosh...@gmail.com wrote:
 
   1. I have deep concerns about managing configuration in ZooKeeper.
   First, Producers and Consumers shouldn't depend on ZK at all, this
   seems
   to add back a dependency we are trying to get away from.
  
   The KIP probably needs to be clarified here - I don't think Aditya was
   referring to client (producer/consumer) configs. These are global
   client-id-specific configs that need to be managed centrally.
   (Specifically, quota overrides on a per-client basis).
  
  
 



RE: [DISCUSS] KIP-21 Configuration Management

2015-05-04 Thread Aditya Auradkar
Hey Jay,

Thanks for the feedback. 

1. We can certainly discuss what it means to remove the file configuration as a 
thought exercise. However, is this something we want to do for real? IMO, we 
can remove file configuration by having all configs stored in zookeeper. The 
flow can be:
- Broker starts and reads all the configs from ZK. (overrides)
- Apply them on top of the defaults that are hardcoded within the broker. This 
should simulate file based config behavior as it is currently.
- Potentially, we can write back all the merged configs to zookeeper (defaults 
+ overrides). This means that the entire config of that broker is in ZK.

Thoughts?
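The merge order in the flow above can be made concrete with a minimal sketch (hypothetical names, not actual Kafka code): the hardcoded broker defaults are applied first, and the overrides read from ZK replace them key by key.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: `defaults` are the values hardcoded in the broker,
// `zkOverrides` are the configs read from ZK at startup; overrides win.
public class ConfigMerge {
    public static Map<String, String> merge(Map<String, String> defaults,
                                            Map<String, String> zkOverrides) {
        Map<String, String> merged = new HashMap<>(defaults);
        merged.putAll(zkOverrides); // ZK overrides replace hardcoded defaults
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> defaults =
                Map.of("log.retention.hours", "168", "num.io.threads", "8");
        Map<String, String> overrides = Map.of("log.retention.hours", "72");

        Map<String, String> merged = merge(defaults, overrides);
        // The merged view is what could be written back to ZK so that the
        // entire effective config of the broker lives there.
        System.out.println(merged.get("log.retention.hours")); // 72
        System.out.println(merged.get("num.io.threads"));      // 8
    }
}
```

The write-back step in the proposal would then persist `merged` rather than just the overrides.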

2. Good point. All overridden configs (topic, client level) will have a 
corresponding broker config that serves as a default. It should be sufficient 
to change that broker config dynamically and that effectively means that the 
default has been changed. The overrides on a per topic/client basis still take 
precedence. So yeah, I don't think we need to model defaults explicitly. Using 
an example to be sure we are on the same page, let's say we wanted to increase 
the log retention time for all topics without having to create a separate 
override for each topic, we could simply change the log.retention.time under 
/brokers/broker_id to the desired value and that should change the default 
log retention for everyone (apart from the explicitly overridden ones on a 
per-topic basis).
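The precedence in this example can be sketched as a simple lookup chain (illustrative code, not the actual implementation): per-topic override first, then the broker-level value under /brokers/broker_id, then the hardcoded default.

```java
import java.util.Map;

// Illustrative lookup chain for the precedence described above:
// topic override > broker-level override > hardcoded default.
public class ConfigResolver {
    public static String resolve(String key,
                                 Map<String, String> topicOverrides,
                                 Map<String, String> brokerOverrides,
                                 Map<String, String> defaults) {
        if (topicOverrides.containsKey(key)) return topicOverrides.get(key);
        if (brokerOverrides.containsKey(key)) return brokerOverrides.get(key);
        return defaults.get(key);
    }

    public static void main(String[] args) {
        Map<String, String> defaults = Map.of("log.retention.hours", "168");
        Map<String, String> broker = Map.of("log.retention.hours", "72");

        // No per-topic override: the broker-level value acts as the new default.
        System.out.println(resolve("log.retention.hours",
                Map.of(), broker, defaults)); // 72
        // An explicitly overridden topic keeps its own value.
        System.out.println(resolve("log.retention.hours",
                Map.of("log.retention.hours", "24"), broker, defaults)); // 24
    }
}
```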

3. I thought it was cleaner to simply separate the producer and consumer 
configs but I guess if they present the same clientId, they are essentially the 
same client. I'll follow your suggestion.

4. Interesting that you mention this. I actually thought about having callbacks 
but I left it out of the initial proposal since I wanted to keep it relatively 
simple. The only configs we can change by checking references are the ones we 
check frequently while processing requests (or something periodic). I shall 
incorporate this in the KIP.

5. What you are proposing sounds good. Initially, I was planning to push down 
everything to KafkaConfig by not having immutable vals within. Having a wrapper 
(KafkaConfiguration) like you suggest is probably cleaner.

One implementation detail: there don't appear to be any concerns wrt the 
client-based config section (and the topic config already exists). Are there 
any concerns if we keep the implementation of the per-client config piece (and 
the generalization of the code in TopicConfigManager) separate from the broker 
config section? Client configs are an immediate requirement to operationalize 
quotas (perhaps they can be used to manage authorization also, for security). 
The broker-side changes to mark configs dynamic, implement callbacks etc. can 
be implemented as a followup task, since it will take longer to identify which 
configs can be made dynamic and actually do the work to make them so. I 
think that once we have reasonable agreement on the overall picture, we can 
implement these things piece by piece.

Thanks,
Aditya


From: Jay Kreps [jay.kr...@gmail.com]
Sent: Friday, May 01, 2015 12:53 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Hey Aditya,

This is great! A couple of comments:

1. Leaving the file config in place is definitely the least disturbance.
But let's really think about getting rid of the files and just have one
config mechanism. There is always a tendency to make everything pluggable
which so often just leads to two mediocre solutions. Can we do the exercise
of trying to consider fully getting rid of file config and seeing what goes
wrong?

2. Do we need to model defaults? The current approach is that if you have a
global config x it is overridden for a topic xyz by /topics/xyz/x, and I
think this could be extended to /brokers/0/x. I think this is simpler. We
need to specify the precedence for these overrides, e.g. if you override at
the broker and topic level I think the topic level takes precedence.

3. I recommend we have the producer and consumer config just be an override
under client.id. The override is by client id and we can have separate
properties for controlling quotas for producers and consumers.

4. Some configs can be changed just by updating the reference, others may
require some action. An example of this is if you want to disable log
compaction (assuming we wanted to make that dynamic) we need to call
shutdown() on the cleaner. I think it may be required to register a
listener callback that gets called when the config changes.

5. For handling the reference can you explain your plan a bit? Currently we
have an immutable KafkaConfig object with a bunch of vals. That or
individual values in there get injected all over the code base. I was
thinking something like this:
a. We retain the KafkaConfig object as an immutable object just as today.
b. It is no longer legit to grab values out of that config if they are
changeable.
c. Instead

Re: [DISCUSS] KIP-21 Configuration Management

2015-05-04 Thread Jay Kreps
 these things piece by piece.

 Thanks,
 Aditya

 
 From: Jay Kreps [jay.kr...@gmail.com]
 Sent: Friday, May 01, 2015 12:53 PM
 To: dev@kafka.apache.org
 Subject: Re: [DISCUSS] KIP-21 Configuration Management

 Hey Aditya,

 This is great! A couple of comments:

 1. Leaving the file config in place is definitely the least disturbance.
 But let's really think about getting rid of the files and just have one
 config mechanism. There is always a tendency to make everything pluggable
 which so often just leads to two mediocre solutions. Can we do the exercise
 of trying to consider fully getting rid of file config and seeing what goes
 wrong?

 2. Do we need to model defaults? The current approach is that if you have a
 global config x it is overridden for a topic xyz by /topics/xyz/x, and I
 think this could be extended to /brokers/0/x. I think this is simpler. We
 need to specify the precedence for these overrides, e.g. if you override at
 the broker and topic level I think the topic level takes precedence.

 3. I recommend we have the producer and consumer config just be an override
 under client.id. The override is by client id and we can have separate
 properties for controlling quotas for producers and consumers.

 4. Some configs can be changed just by updating the reference, others may
 require some action. An example of this is if you want to disable log
 compaction (assuming we wanted to make that dynamic) we need to call
 shutdown() on the cleaner. I think it may be required to register a
 listener callback that gets called when the config changes.

 5. For handling the reference can you explain your plan a bit? Currently we
 have an immutable KafkaConfig object with a bunch of vals. That or
 individual values in there get injected all over the code base. I was
 thinking something like this:
 a. We retain the KafkaConfig object as an immutable object just as today.
 b. It is no longer legit to grab values out of that config if they are
 changeable.
 c. Instead of making KafkaConfig itself mutable we make KafkaConfiguration
 which has a single volatile reference to the current KafkaConfig.
 KafkaConfiguration is what gets passed into various components. So to
 access a config you do something like config.instance.myValue. When the
 config changes the config manager updates this reference.
 d. The KafkaConfiguration is the thing that allows doing the
 configuration.onChange(my.config, callback)

 -Jay

 On Tue, Apr 28, 2015 at 3:57 PM, Aditya Auradkar 
 aaurad...@linkedin.com.invalid wrote:

  Hey everyone,
 
  Wrote up a KIP to update topic, client and broker configs dynamically via
  Zookeeper.
 
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
 
  Please read and provide feedback.
 
  Thanks,
  Aditya
 
  PS: I've intentionally kept this discussion separate from KIP-5 since I'm
  not sure if that is actively being worked on and I wanted to start with a
  clean slate.
 



RE: [DISCUSS] KIP-21 Configuration Management

2015-05-03 Thread Aditya Auradkar
Hey Joe,

Can you elaborate what you mean by a stop the world change? In this protocol, 
we can target notifications to a subset of brokers in the cluster (controller 
if we need to). Is the AdminChangeNotification a ZK notification or a request 
type exposed by each broker? 

Thanks,
Aditya


From: Joe Stein [joe.st...@stealth.ly]
Sent: Friday, May 01, 2015 5:25 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Hi Aditya, thanks for the write up and focusing on this piece.

Agreed, we need a way to do broker changes dynamically without
rolling restarts.

I think, though, that if every broker is getting changes with notifications it
is going to limit which configs can be dynamic.

We could never deliver a stop-the-world configuration change, because then
that would happen on the entire cluster, on every broker at the same time.

Can maybe just the controller get the notification?

And we provide a layer for brokers to work with the controller to do the
config change operations at its discretion (so it can stop things if it needs to).

controller gets notification, sends AdminChangeNotification to broker [X ..
N] then brokers can do their things, even send a response for heartbeating
while it takes the few milliseconds it needs or crashes. We need to go
through both scenarios.

I am worried that we put this change in like this, it works for quotas and
maybe a few other things, but nothing else gets dynamic and we don't get far
enough toward eliminating rolling restarts.

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Thu, Apr 30, 2015 at 8:14 PM, Joel Koshy jjkosh...@gmail.com wrote:

 1. I have deep concerns about managing configuration in ZooKeeper.
 First, Producers and Consumers shouldn't depend on ZK at all, this
 seems
 to add back a dependency we are trying to get away from.

 The KIP probably needs to be clarified here - I don't think Aditya was
 referring to client (producer/consumer) configs. These are global
 client-id-specific configs that need to be managed centrally.
 (Specifically, quota overrides on a per-client basis).




Re: [DISCUSS] KIP-21 Configuration Management

2015-05-03 Thread Neha Narkhede
Thanks for starting this discussion, Aditya. A few questions/comments:

1. If you change the default values like it's mentioned in the KIP, do you
also overwrite the local config file as part of updating the default value?
If not, where does the admin look to find the default values, ZK or local
Kafka config file? What if a config value is different in both places?

2. I share Gwen's concern around making sure that popular config management
tools continue to work with this change. Would love to see how each of
those would work with the proposal in the KIP. I don't know enough about
each of the tools but seems like in some of the tools, you have to define
some sort of class with parameter names as config names. How will such
tools find out about the config values? In Puppet, if this means that each
Puppet agent has to read it from ZK, this means the ZK port has to be open
to pretty much every machine in the DC. This is a bummer and a very
confusing requirement. Not sure if this is really a problem or not (each of
those tools might behave differently), though pointing out that this is
something worth paying attention to.

3. The wrapper tools that let users read/change config tools should not
depend on ZK for the reason mentioned above. It's a pain to assume that the
ZK port is open from any machine that needs to run this tool. Ideally what
users want is a REST API to the brokers to change or read the config (ala
Elasticsearch), but in the absence of the REST API, we should think if we
can write the tool such that it just requires talking to the Kafka broker
port. This will require a config RPC.

4. Not sure if KIP is the right place to discuss the design of propagating
the config changes to the brokers, but have you thought about just letting
the controller oversee the config changes and propagate via RPC to the
brokers? That way, there is an easier way to express config changes that
require all brokers to change it for it to be called complete. Maybe this
is not required, but it is hard to say if we don't discuss the full set of
configs that need to be dynamic.

Thanks,
Neha

On Fri, May 1, 2015 at 12:53 PM, Jay Kreps jay.kr...@gmail.com wrote:

 Hey Aditya,

  This is great! A couple of comments:

 1. Leaving the file config in place is definitely the least disturbance.
 But let's really think about getting rid of the files and just have one
 config mechanism. There is always a tendency to make everything pluggable
 which so often just leads to two mediocre solutions. Can we do the exercise
 of trying to consider fully getting rid of file config and seeing what goes
 wrong?

 2. Do we need to model defaults? The current approach is that if you have a
 global config x it is overridden for a topic xyz by /topics/xyz/x, and I
 think this could be extended to /brokers/0/x. I think this is simpler. We
 need to specify the precedence for these overrides, e.g. if you override at
 the broker and topic level I think the topic level takes precedence.

 3. I recommend we have the producer and consumer config just be an override
 under client.id. The override is by client id and we can have separate
 properties for controlling quotas for producers and consumers.

 4. Some configs can be changed just by updating the reference, others may
 require some action. An example of this is if you want to disable log
 compaction (assuming we wanted to make that dynamic) we need to call
 shutdown() on the cleaner. I think it may be required to register a
 listener callback that gets called when the config changes.

 5. For handling the reference can you explain your plan a bit? Currently we
 have an immutable KafkaConfig object with a bunch of vals. That or
 individual values in there get injected all over the code base. I was
 thinking something like this:
 a. We retain the KafkaConfig object as an immutable object just as today.
  b. It is no longer legit to grab values out of that config if they are
 changeable.
 c. Instead of making KafkaConfig itself mutable we make KafkaConfiguration
 which has a single volatile reference to the current KafkaConfig.
 KafkaConfiguration is what gets passed into various components. So to
 access a config you do something like config.instance.myValue. When the
 config changes the config manager updates this reference.
 d. The KafkaConfiguration is the thing that allows doing the
 configuration.onChange(my.config, callback)

 -Jay

 On Tue, Apr 28, 2015 at 3:57 PM, Aditya Auradkar 
 aaurad...@linkedin.com.invalid wrote:

  Hey everyone,
 
  Wrote up a KIP to update topic, client and broker configs dynamically via
  Zookeeper.
 
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
 
  Please read and provide feedback.
 
  Thanks,
  Aditya
 
  PS: I've intentionally kept this discussion separate from KIP-5 since I'm
  not sure if that is actively being worked on and I wanted to start with a
  clean slate.
 




-- 
Thanks,
Neha


RE: [DISCUSS] KIP-21 Configuration Management

2015-05-03 Thread Aditya Auradkar
Hey everyone,

Thanks for the comments. I'll respond to each one-by-one. In the meantime, can 
we put this on the agenda for the KIP hangout for next week?

Thanks,
Aditya


From: Neha Narkhede [n...@confluent.io]
Sent: Sunday, May 03, 2015 9:48 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Thanks for starting this discussion, Aditya. A few questions/comments:

1. If you change the default values like it's mentioned in the KIP, do you
also overwrite the local config file as part of updating the default value?
If not, where does the admin look to find the default values, ZK or local
Kafka config file? What if a config value is different in both places?

2. I share Gwen's concern around making sure that popular config management
tools continue to work with this change. Would love to see how each of
those would work with the proposal in the KIP. I don't know enough about
each of the tools but seems like in some of the tools, you have to define
some sort of class with parameter names as config names. How will such
tools find out about the config values? In Puppet, if this means that each
Puppet agent has to read it from ZK, this means the ZK port has to be open
to pretty much every machine in the DC. This is a bummer and a very
confusing requirement. Not sure if this is really a problem or not (each of
those tools might behave differently), though pointing out that this is
something worth paying attention to.

3. The wrapper tools that let users read/change config tools should not
depend on ZK for the reason mentioned above. It's a pain to assume that the
ZK port is open from any machine that needs to run this tool. Ideally what
users want is a REST API to the brokers to change or read the config (ala
Elasticsearch), but in the absence of the REST API, we should think if we
can write the tool such that it just requires talking to the Kafka broker
port. This will require a config RPC.

4. Not sure if KIP is the right place to discuss the design of propagating
the config changes to the brokers, but have you thought about just letting
the controller oversee the config changes and propagate via RPC to the
brokers? That way, there is an easier way to express config changes that
require all brokers to change it for it to be called complete. Maybe this
is not required, but it is hard to say if we don't discuss the full set of
configs that need to be dynamic.

Thanks,
Neha

On Fri, May 1, 2015 at 12:53 PM, Jay Kreps jay.kr...@gmail.com wrote:

 Hey Aditya,

  This is great! A couple of comments:

 1. Leaving the file config in place is definitely the least disturbance.
 But let's really think about getting rid of the files and just have one
 config mechanism. There is always a tendency to make everything pluggable
 which so often just leads to two mediocre solutions. Can we do the exercise
 of trying to consider fully getting rid of file config and seeing what goes
 wrong?

 2. Do we need to model defaults? The current approach is that if you have a
 global config x it is overridden for a topic xyz by /topics/xyz/x, and I
 think this could be extended to /brokers/0/x. I think this is simpler. We
 need to specify the precedence for these overrides, e.g. if you override at
 the broker and topic level I think the topic level takes precedence.

 3. I recommend we have the producer and consumer config just be an override
 under client.id. The override is by client id and we can have separate
 properties for controlling quotas for producers and consumers.

 4. Some configs can be changed just by updating the reference, others may
 require some action. An example of this is if you want to disable log
 compaction (assuming we wanted to make that dynamic) we need to call
 shutdown() on the cleaner. I think it may be required to register a
 listener callback that gets called when the config changes.

 5. For handling the reference can you explain your plan a bit? Currently we
 have an immutable KafkaConfig object with a bunch of vals. That or
 individual values in there get injected all over the code base. I was
 thinking something like this:
 a. We retain the KafkaConfig object as an immutable object just as today.
  b. It is no longer legit to grab values out of that config if they are
 changeable.
 c. Instead of making KafkaConfig itself mutable we make KafkaConfiguration
 which has a single volatile reference to the current KafkaConfig.
 KafkaConfiguration is what gets passed into various components. So to
 access a config you do something like config.instance.myValue. When the
 config changes the config manager updates this reference.
 d. The KafkaConfiguration is the thing that allows doing the
 configuration.onChange(my.config, callback)

 -Jay

 On Tue, Apr 28, 2015 at 3:57 PM, Aditya Auradkar 
 aaurad...@linkedin.com.invalid wrote:

  Hey everyone,
 
  Wrote up a KIP to update topic, client and broker configs dynamically via
  Zookeeper

Re: [DISCUSS] KIP-21 Configuration Management

2015-05-02 Thread Jay Kreps
Hey Aditya,

This is great! A couple of comments:

1. Leaving the file config in place is definitely the least disturbance.
But let's really think about getting rid of the files and just have one
config mechanism. There is always a tendency to make everything pluggable
which so often just leads to two mediocre solutions. Can we do the exercise
of trying to consider fully getting rid of file config and seeing what goes
wrong?

2. Do we need to model defaults? The current approach is that if you have a
global config x it is overridden for a topic xyz by /topics/xyz/x, and I
think this could be extended to /brokers/0/x. I think this is simpler. We
need to specify the precedence for these overrides, e.g. if you override at
the broker and topic level I think the topic level takes precedence.

3. I recommend we have the producer and consumer config just be an override
under client.id. The override is by client id and we can have separate
properties for controlling quotas for producers and consumers.

4. Some configs can be changed just by updating the reference, others may
require some action. An example of this is if you want to disable log
compaction (assuming we wanted to make that dynamic) we need to call
shutdown() on the cleaner. I think it may be required to register a
listener callback that gets called when the config changes.

5. For handling the reference can you explain your plan a bit? Currently we
have an immutable KafkaConfig object with a bunch of vals. That or
individual values in there get injected all over the code base. I was
thinking something like this:
a. We retain the KafkaConfig object as an immutable object just as today.
b. It is no longer legit to grab values out of that config if they are
changeable.
c. Instead of making KafkaConfig itself mutable we make KafkaConfiguration
which has a single volatile reference to the current KafkaConfig.
KafkaConfiguration is what gets passed into various components. So to
access a config you do something like config.instance.myValue. When the
config changes the config manager updates this reference.
d. The KafkaConfiguration is the thing that allows doing the
configuration.onChange(my.config, callback)
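Steps (a) through (d) could look roughly like the following sketch (names and signatures are illustrative, not the eventual API): an immutable snapshot map stands in for KafkaConfig, while KafkaConfiguration holds the single volatile reference and the onChange listeners that the config manager triggers on update.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Illustrative sketch of the volatile-reference-plus-callbacks idea above.
public class KafkaConfiguration {
    private final AtomicReference<Map<String, String>> instance;
    private final List<ListenerEntry> listeners = new CopyOnWriteArrayList<>();

    private record ListenerEntry(String key, Consumer<String> callback) {}

    public KafkaConfiguration(Map<String, String> initial) {
        this.instance = new AtomicReference<>(Map.copyOf(initial));
    }

    // Components read through the holder, e.g. config.get("my.config"),
    // instead of grabbing values out of an immutable KafkaConfig once.
    public String get(String key) {
        return instance.get().get(key);
    }

    // configuration.onChange("my.config", callback) from step (d).
    public void onChange(String key, Consumer<String> callback) {
        listeners.add(new ListenerEntry(key, callback));
    }

    // Called by the config manager when the ZK change notification fires:
    // swap the snapshot, then invoke callbacks for keys whose value changed.
    public void update(Map<String, String> next) {
        Map<String, String> prev = instance.getAndSet(Map.copyOf(next));
        for (ListenerEntry l : listeners) {
            String newVal = next.get(l.key());
            if (newVal != null && !newVal.equals(prev.get(l.key())))
                l.callback().accept(newVal);
        }
    }

    public static void main(String[] args) {
        KafkaConfiguration config =
                new KafkaConfiguration(Map.of("log.cleaner.enable", "true"));
        // e.g. the log cleaner could register a callback that calls shutdown().
        config.onChange("log.cleaner.enable",
                v -> System.out.println("cleaner -> " + v));
        config.update(Map.of("log.cleaner.enable", "false"));
        System.out.println(config.get("log.cleaner.enable")); // false
    }
}
```

Configs that only need the new value on the next read just go through `get`; configs that require an action (like stopping the cleaner) register a callback.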

-Jay

On Tue, Apr 28, 2015 at 3:57 PM, Aditya Auradkar 
aaurad...@linkedin.com.invalid wrote:

 Hey everyone,

 Wrote up a KIP to update topic, client and broker configs dynamically via
 Zookeeper.

 https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration

 Please read and provide feedback.

 Thanks,
 Aditya

 PS: I've intentionally kept this discussion separate from KIP-5 since I'm
 not sure if that is actively being worked on and I wanted to start with a
 clean slate.



RE: [DISCUSS] KIP-21 Configuration Management

2015-05-01 Thread Aditya Auradkar
Hey Gwen,

Thanks for the feedback. As Joel said, these client configs do not introduce a 
producer/consumer zk dependency. It is configuration that is needed by the 
broker.

From your comments, I gather that you are more worried about managing broker 
internal configs via Zookeeper since we already have a file. So why have two 
mechanisms? Given that we already manage topic specific configuration in ZK, 
it seems a good fit to at least have client configs there since these config 
parameters aren't really driven through a file anyway. It also maintains 
consistency. 

Even for broker configs, it seems very consistent to have all the overridden 
configs in one place which is easy to view and change. As you mentioned, users 
shouldn't ever have to fiddle with Zookeeper directly; our tooling should 
provide the ability to view and modify configs on a per-broker basis. I do like 
your suggestion of reloading config files but I'm not sure this works easily 
for everyone. For example, often these per-host overrides in config files are 
managed by hostname but what we really want are broker level overrides which 
means that it should ideally be tied to a broker-id which is a Kafka detail. In 
addition, sometimes the configs pushed to individual hosts aren't the 
properties files themselves, but rather some company-specific stuff that also 
contains the Kafka configs. I guess the point I'm trying to make is that people 
may not be able to reload configs directly from file without doing some 
additional work in many cases.

As far as propagating configuration changes, perhaps I can clarify this section 
a bit more. Also, we can also do a pass over all the configs in KafkaConfig and 
have a list of properties that can be converted slowly.

Thanks,
Aditya


From: Joel Koshy [jjkosh...@gmail.com]
Sent: Thursday, April 30, 2015 5:14 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

1. I have deep concerns about managing configuration in ZooKeeper.
First, Producers and Consumers shouldn't depend on ZK at all, this seems
to add back a dependency we are trying to get away from.

The KIP probably needs to be clarified here - I don't think Aditya was
referring to client (producer/consumer) configs. These are global
client-id-specific configs that need to be managed centrally.
(Specifically, quota overrides on a per-client basis).



Re: [DISCUSS] KIP-21 Configuration Management

2015-05-01 Thread Joe Stein
Hi Aditya, thanks for the write up and focusing on this piece.

Agreed, we need a way to do broker changes dynamically without
rolling restarts.

I think, though, that if every broker is getting changes with notifications it
is going to limit which configs can be dynamic.

We could never deliver a stop-the-world configuration change, because then
that would happen on the entire cluster, on every broker at the same time.

Can maybe just the controller get the notification?

And we provide a layer for brokers to work with the controller to do the
config change operations at its discretion (so it can stop things if it needs to).

controller gets notification, sends AdminChangeNotification to broker [X ..
N] then brokers can do their things, even send a response for heartbeating
while it takes the few milliseconds it needs or crashes. We need to go
through both scenarios.
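The flow described here can be sketched as follows (all names hypothetical; the real design would use an AdminChangeNotification request and proper failure handling): only the controller reacts to the ZK notification and fans the change out to the affected brokers, tallying acks so a broker that crashes mid-change is visible.

```java
import java.util.List;

// Hypothetical sketch of controller-mediated config propagation: the
// controller receives the ZK notification, sends the change to each broker,
// and counts acks; a broker that fails to apply (or crashes) never acks.
public class ControllerFanOut {
    public interface Broker {
        boolean applyChange(String key, String value); // ack on success
    }

    public static int propagate(List<Broker> brokers, String key, String value) {
        int acks = 0;
        for (Broker b : brokers) {
            if (b.applyChange(key, value)) acks++;
        }
        return acks;
    }

    public static void main(String[] args) {
        // Two healthy brokers and one that fails to apply the change.
        List<Broker> brokers = List.of(
                (k, v) -> true,
                (k, v) -> true,
                (k, v) -> false);
        System.out.println(propagate(brokers, "log.retention.hours", "72")); // 2
    }
}
```

A controller-mediated layer like this is what would let the cluster stage a change broker by broker instead of applying it everywhere at once.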

I am worried that we put this change in like this, it works for quotas and
maybe a few other things, but nothing else gets dynamic and we don't get far
enough toward eliminating rolling restarts.

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Thu, Apr 30, 2015 at 8:14 PM, Joel Koshy jjkosh...@gmail.com wrote:

 1. I have deep concerns about managing configuration in ZooKeeper.
 First, Producers and Consumers shouldn't depend on ZK at all, this
 seems
 to add back a dependency we are trying to get away from.

 The KIP probably needs to be clarified here - I don't think Aditya was
 referring to client (producer/consumer) configs. These are global
 client-id-specific configs that need to be managed centrally.
 (Specifically, quota overrides on a per-client basis).




Re: [DISCUSS] KIP-21 Configuration Management

2015-04-30 Thread Joel Koshy
1. I have deep concerns about managing configuration in ZooKeeper.
First, Producers and Consumers shouldn't depend on ZK at all, this seems
to add back a dependency we are trying to get away from.

The KIP probably needs to be clarified here - I don't think Aditya was
referring to client (producer/consumer) configs. These are global
client-id-specific configs that need to be managed centrally.
(Specifically, quota overrides on a per-client basis).
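For concreteness, such a per-client quota override could live under a znode like /config/clients/client_id, using the same JSON layout as topic config overrides ({version: x, config: {...}}); the property names below are purely illustrative, not settled KIP-21 names:

```json
{
  "version": 1,
  "config": {
    "producer_byte_rate": "1048576",
    "consumer_byte_rate": "2097152"
  }
}
```

The broker reads this centrally managed override to enforce the quota; the producer and consumer themselves never touch ZK.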