How about a command-line script (bin/kafka-config-init.sh) that loads a file of config values and initializes them in ZooKeeper, with Kafka then reading its configs from ZooKeeper?
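To make the idea concrete, here is a rough sketch of the loader such an init script might wrap. This is Python just for illustration; the `topic:key=value` file format is an invented assumption, while the `/brokers/topics/<topic>/config` znode path comes from Jay's proposal quoted below.

```python
import json

def build_topic_znodes(lines):
    """Turn lines like 'my-topic:segment.size.bytes=1048576' into a map of
    znode path -> JSON payload.

    The 'topic:key=value' input format is made up for illustration; the
    /brokers/topics/<topic>/config layout is from the proposal in this thread.
    """
    per_topic = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        topic, _, kv = line.partition(":")
        key, _, value = kv.partition("=")
        per_topic.setdefault(topic, {})[key] = value
    return {
        "/brokers/topics/%s/config" % topic: json.dumps(cfg, sort_keys=True)
        for topic, cfg in per_topic.items()
    }
```

A real kafka-config-init.sh would then create each of those znodes with a ZooKeeper client, and a kafka-config-update.sh could do a read-modify-write against the same paths, so Chef (or any config management system) only ever touches the documented file format.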
Another script could also have options for doing updates (bin/kafka-config-update.sh). If we provide a write mechanism, then config management systems (we use Chef) can interact nicely with the ZooKeeper updates in a standard way that we document and support. Win win?

On Fri, Jan 18, 2013 at 4:19 PM, Jay Kreps <[email protected]> wrote:

> Yes please, any help very much appreciated.
>
> I am not sure I understand what you are proposing, though. Are you
> saying support both the config file and zk for topic-level configs? I
> hate to do things where the answer is "do both"... I guess I feel that
> although everyone walks away happy, it ends up being a lot of code and
> combinatorial testing. So if there is a different plan that hits all
> requirements I like that better. I am very sensitive to the fact that
> zookeeper is an okay key/value store but a really poor replacement for
> a config management system. It might be worthwhile to try to work out a
> way that meets all needs, if such a thing exists.
>
> Is bouncing brokers for topic overrides a problem for you in your
> environment? If so, how would you fix it?
>
> -Jay
>
> On Fri, Jan 18, 2013 at 7:53 AM, Joe Stein <[email protected]> wrote:
>
> > Can I help out?
> >
> > Also, can we abstract the config call too? We have so much in Chef;
> > it's not that I don't want to call our zookeeper cluster for it, but
> > we don't yet have our topology mapped out in znodes, they are in our
> > own instances of code.
> >
> > It should have both a pull and a push for changes; one thing that's
> > nice with zookeeper is having a watcher.
> >
> > /*
> > Joe Stein, Chief Architect
> > http://www.medialets.com
> > Twitter: @allthingshadoop
> > Mobile: 917-597-9771
> > */
> >
> > On Jan 18, 2013, at 12:09 AM, Jay Kreps <[email protected]> wrote:
> >
> > > Currently kafka broker config is all statically defined in a
> > > properties file with the broker. This mostly works pretty well, but
> > > for per-topic configuration (the flush policy, partition count, etc.)
> > > it is pretty painful to have to bounce the broker every time you make
> > > a config change.
> > >
> > > That led to this proposal:
> > > https://cwiki.apache.org/confluence/display/KAFKA/Dynamic+Topic+Config
> > >
> > > An open question is how topic-default configurations should work.
> > >
> > > Currently each of our topic-level configs is paired with a default.
> > > So you would have something like
> > >     segment.size.bytes
> > > which would be the default, and then you can override this for topics
> > > that need something different using a map:
> > >     segment.size.bytes.per.topic
> > >
> > > The proposal is to move the topic configuration into zookeeper, so
> > > that for a topic "my-topic" we would have a znode
> > >     /brokers/topics/my-topic/config
> > > and the contents of this znode would be the topic configuration,
> > > either as json or properties or whatever.
> > >
> > > There are two ways this config could work:
> > > 1. Defaults resolved at topic creation time: At the time a topic is
> > > created, the user would specify some properties they want for that
> > > topic; any property they didn't specify would take the server
> > > default. ALL these properties would be stored in the znode.
> > > 2. Defaults resolved at config read time: When a topic is created,
> > > the user specifies the particular properties they want, and ONLY the
> > > properties they specify would be stored. At runtime we would merge
> > > these properties with whatever the server defaults currently are.
> > >
> > > This is a somewhat nuanced point, but perhaps important.
> > >
> > > The advantage of the first proposal is that it is simple. If you want
> > > to know the configuration for a particular topic, you go to zookeeper
> > > and look at that topic's config. Mixing the combination of server
> > > config and zookeeper config dynamically makes it a little harder to
> > > figure out what the current state of anything is.
> > >
> > > The disadvantage of the first proposal (and the advantage of the
> > > second) is that making global changes is easier in the second. For
> > > example, if you want to globally lower the retention for all topics,
> > > in proposal one you would have to iterate over all topics and update
> > > each config (this could be done automatically with tooling, but under
> > > the covers the tool would do this). In the second case you would just
> > > update the default value.
> > >
> > > Thoughts? If no one cares, I will just pick whatever seems best.
> > >
> > > -Jay

--
/*
Joe Stein
http://www.linkedin.com/in/charmalloc
Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
*/
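The nuance between Jay's two proposals can be made concrete with a small sketch (Python, with names and default values invented for illustration): under proposal 1 the znode holds a fully resolved config frozen at creation time, while under proposal 2 the effective config is a merge, done at read time, of the stored overrides with whatever the server defaults are now.

```python
# Hypothetical server defaults, for illustration only.
SERVER_DEFAULTS = {"segment.size.bytes": 1073741824, "retention.ms": 604800000}

def resolved_at_creation(overrides, defaults=SERVER_DEFAULTS):
    # Proposal 1: defaults + overrides are ALL stored in the znode when
    # the topic is created. Later changes to server defaults do not
    # affect this topic unless every znode is rewritten.
    stored = dict(defaults)
    stored.update(overrides)
    return stored

def resolved_at_read(stored_overrides, defaults):
    # Proposal 2: ONLY the overrides are stored; they are merged with
    # the *current* server defaults each time the config is read.
    effective = dict(defaults)
    effective.update(stored_overrides)
    return effective
```

This is where the global-change tradeoff shows up: lowering `retention.ms` in the server defaults is immediately picked up by every topic under proposal 2, whereas under proposal 1 a tool would have to iterate over all topic znodes and rewrite each one.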
