Hey Joe,

I think this is proposing several things:

1. A new command line utility. This isn't really fully specified here. There
is sample usage but I actually don't really understand what all the commands
will be. Also, presumably this will replace the existing shell scripts, right?
We obviously don't want to be in a state where we have both...
2. A new set of language agnostic administrative protocols.
3. A new Java API for issuing administrative requests using the protocol. I
don't see any discussion on what this will look like.

It might be easiest to tackle these one at a time, no? If not, we really do
need to get a complete description at each layer, as these are pretty core
public APIs.

-Jay

On Fri, Feb 6, 2015 at 11:18 AM, Joe Stein <joe.st...@stealth.ly> wrote:

> I updated the installation and sample usage for the existing patches on the
> KIP site
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations
>
> There are still a few pending items here.
>
> 1) There was already some discussion about using the Broker that is the
> Controller here https://issues.apache.org/jira/browse/KAFKA-1772 and we
> should elaborate on that more in the thread or agree we are ok with admin
> asking for the controller to talk to and then just sending that broker the
> admin tasks.
>
> 2) I like this idea https://issues.apache.org/jira/browse/KAFKA-1912 but we
> can refactor after KAFKA-1694 is committed, no? I know folks just want to talk
> to the broker that is the controller. It may even become useful to have the
> controller run on a broker that isn't even a topic broker anymore (small
> can of worms I am opening here, but it elaborates on Guozhang's hot spot
> point).
>
> 3) Any more feedback?
>
> - Joe Stein
>
> On Fri, Jan 23, 2015 at 3:15 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > A centralized admin operation protocol would be very useful.
> >
> > One more general comment here is that the controller was originally
> > designed to only talk to other brokers through ControllerChannel, while the
> > broker instance which carries the current controller is agnostic of its
> > existence, and uses KafkaApis to handle general Kafka requests. Having all
> > admin requests redirected to the controller instance will force the broker
> > to be aware of its carried controller, and access its internal data for
> > handling these requests.
> > Plus with the number of clients out of Kafka's control,
> > this may easily cause the controller to be a hot spot in terms of request
> > load.
> >
> > On Thu, Jan 22, 2015 at 10:09 PM, Joe Stein <joe.st...@stealth.ly> wrote:
> >
> > > inline
> > >
> > > On Thu, Jan 22, 2015 at 11:59 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > >
> > > > Hey Joe,
> > > >
> > > > This is great. A few comments on KIP-4
> > > >
> > > > 1. This is much needed functionality, but there are a lot of them, so
> > > > let's really think these protocols through. We really want to end up
> > > > with a set of well thought-out, orthogonal APIs. For this reason I
> > > > think it is really important to think through the end state even if
> > > > that includes APIs we won't implement in the first phase.
> > >
> > > ok
> > >
> > > > 2. Let's please please please wait until we have switched the server
> > > > over to the new Java protocol definitions. If we add umpteen more ad
> > > > hoc Scala objects that is just generating more work for the conversion
> > > > we know we have to do.
> > >
> > > ok :)
> > >
> > > > 3. This proposal introduces a new type of optional parameter. This is
> > > > inconsistent with everything else in the protocol where we use -1 or
> > > > some other marker value. You could argue either way but let's stick
> > > > with that for consistency. For clients that implemented the protocol in
> > > > a better way than our Scala code these basic primitives are hard to
> > > > change.
> > >
> > > yes, less confusing, ok.
> > >
> > > > 4. ClusterMetadata: This seems to duplicate TopicMetadataRequest which
> > > > has brokers, topics, and partitions. I think we should rename that
> > > > request ClusterMetadataRequest (or just MetadataRequest) and include
> > > > the id of the controller. Or are there other things we could add here?
> > >
> > > We could add broker version to it.
> > >
> > > > 5. We have a tendency to try to make a lot of requests that can only go
> > > > to particular nodes. This adds a lot of burden for client
> > > > implementations (it sounds easy but each discovery can fail in many
> > > > parts so it ends up being a full state machine to do right). I think we
> > > > should consider making admin commands and ideally as many of the other
> > > > apis as possible available on all brokers and just redirect to the
> > > > controller on the broker side. Perhaps there would be a general way to
> > > > encapsulate this re-routing behavior.
> > >
> > > If we do that then we should also preserve what we have and do both. The
> > > client can then decide "do I want to go to any broker and proxy" or just
> > > "go to controller and run admin task". Lots of folks have seen
> > > controllers come under distress because of their producers/consumers.
> > > There is a ticket too for controller elect and re-elect
> > > https://issues.apache.org/jira/browse/KAFKA-1778 so you can force it to a
> > > broker that has 0 load.
> > >
> > > > 6. We should probably normalize the key value pairs used for configs
> > > > rather than embedding a new formatting. So two strings rather than one
> > > > with an internal equals sign.
> > >
> > > ok
> > >
> > > > 7. Is the postcondition of these APIs that the command has begun or
> > > > that the command has been completed? It is a lot more usable if the
> > > > command has been completed so you know that if you create a topic and
> > > > then publish to it you won't get an exception about there being no such
> > > > topic.
> > >
> > > We should define that more. There needs to be some more state there, yes.
> > >
> > > We should try to cover https://issues.apache.org/jira/browse/KAFKA-1125
> > > within what we come up with.
> > >
> > > > 8. Describe topic and list topics duplicate a lot of stuff in the
> > > > metadata request. Is there a reason to give back topics marked for
> > > > deletion? I feel like if we just make the post-condition of the delete
> > > > command be that the topic is deleted that will get rid of the need for
> > > > this right? And it will be much more intuitive.
> > >
> > > I will go back and look through it.
> > >
> > > > 9. Should we consider batching these requests? We have generally tried
> > > > to allow multiple operations to be batched. My suspicion is that
> > > > without this we will get a lot of code that does something like
> > > >    for(topic: adminClient.listTopics())
> > > >       adminClient.describeTopic(topic)
> > > > this code will work great when you test on 5 topics but not do as well
> > > > if you have 50k.
> > >
> > > So => Input is a list of topics (or none for all) and a batch response
> > > from the controller (which could be routed through another broker) of the
> > > entire response? We could introduce a Batch keyword to explicitly show
> > > the usage of it.
> > >
> > > > 10. I think we should also discuss how we want to expose a programmatic
> > > > JVM client api for these operations. Currently people rely on
> > > > AdminUtils which is totally sketchy. I think we probably need another
> > > > client under clients/ that exposes administrative functionality. We
> > > > will need this just to properly test the new apis, I suspect. We should
> > > > figure out that API.
> > >
> > > We were talking about that here
> > > https://issues.apache.org/jira/browse/KAFKA-1774 and wrote it in java
> > > https://reviews.apache.org/r/29301/diff/7/?page=4#75 so we could do
> > > something like that, sure.
> > >
> > > > 11. The other information that would be really useful to get would be
> > > > information about partitions--how much data is in the partition, what
> > > > are the segment offsets, what is the log-end offset (i.e. last offset),
> > > > what is the compaction point, etc. I think that done right this would
> > > > be the successor to the very awkward OffsetRequest we have today.
> > >
> > > yes!
> > >
> > > > -Jay
> > > >
> > > > On Wed, Jan 21, 2015 at 10:27 PM, Joe Stein <joe.st...@stealth.ly> wrote:
> > > >
> > > > > Hi, created a KIP
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations
> > > > >
> > > > > JIRA https://issues.apache.org/jira/browse/KAFKA-1694
> > > > >
> > > > > /*******************************************
> > > > > Joe Stein
> > > > > Founder, Principal Consultant
> > > > > Big Data Open Source Security LLC
> > > > > http://www.stealth.ly
> > > > > Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > > > > ********************************************/
> >
> > --
> > -- Guozhang
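
To make the batching concern in point 9 of the quoted review concrete, here is a rough sketch, for illustration only. Everything in it is an assumption: the `AdminClient` interface, the batched `describeTopics` call, `TopicDescription`, and the in-memory `FakeAdminClient` are hypothetical names invented for this example, not anything defined in the KIP or the existing patches. The fake simply counts simulated round trips so the per-topic loop can be compared with a single batched request.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BatchedAdminSketch {

    /** Hypothetical result type; real fields would come from the protocol definition. */
    static final class TopicDescription {
        final String topic;
        final int partitionCount;

        TopicDescription(String topic, int partitionCount) {
            this.topic = topic;
            this.partitionCount = partitionCount;
        }
    }

    /** Hypothetical client interface, not an existing Kafka API. */
    interface AdminClient {
        List<String> listTopics();

        /** One request per topic. */
        TopicDescription describeTopic(String topic);

        /** One request for many topics; an empty list means "all topics". */
        Map<String, TopicDescription> describeTopics(List<String> topics);
    }

    /** In-memory stand-in that counts simulated round trips instead of using the network. */
    static final class FakeAdminClient implements AdminClient {
        private final Map<String, Integer> partitionsByTopic = new LinkedHashMap<>();
        int roundTrips = 0;

        FakeAdminClient(int topicCount) {
            for (int i = 0; i < topicCount; i++) {
                partitionsByTopic.put("topic-" + i, 1 + (i % 8));
            }
        }

        @Override
        public List<String> listTopics() {
            roundTrips++;
            return new ArrayList<>(partitionsByTopic.keySet());
        }

        @Override
        public TopicDescription describeTopic(String topic) {
            roundTrips++;
            Integer partitions = partitionsByTopic.get(topic);
            if (partitions == null) {
                throw new IllegalArgumentException("no such topic: " + topic);
            }
            return new TopicDescription(topic, partitions);
        }

        @Override
        public Map<String, TopicDescription> describeTopics(List<String> topics) {
            roundTrips++; // the whole batch costs a single simulated request
            List<String> wanted =
                topics.isEmpty() ? new ArrayList<>(partitionsByTopic.keySet()) : topics;
            Map<String, TopicDescription> result = new LinkedHashMap<>();
            for (String topic : wanted) {
                Integer partitions = partitionsByTopic.get(topic);
                if (partitions != null) { // unknown topics are skipped in this sketch
                    result.put(topic, new TopicDescription(topic, partitions));
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        // The pattern from the review: list, then describe each topic separately.
        FakeAdminClient perTopic = new FakeAdminClient(50_000);
        for (String topic : perTopic.listTopics()) {
            perTopic.describeTopic(topic);
        }

        // The batched alternative: an empty list asks for every topic at once.
        FakeAdminClient batched = new FakeAdminClient(50_000);
        Map<String, TopicDescription> all = batched.describeTopics(new ArrayList<>());

        System.out.println("per-topic round trips: " + perTopic.roundTrips);
        System.out.println("batched round trips:   " + batched.roundTrips);
        System.out.println("topics described:      " + all.size());
    }
}
```

With 50k topics the loop costs 50,001 simulated round trips (one list plus one describe per topic), while the batched call costs one. How a real batched request should report per-topic errors is left open here; this sketch just omits unknown topics from the result.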