Hey Joe,

I think this is proposing several things:
1. A new command line utility. This isn't really fully specified here.
There is sample usage but I actually don't really understand what all the
commands will be. Also, presumably this will replace the existing shell
scripts, right? We obviously don't want to be in a state where we have
both...
2. A new set of language agnostic administrative protocols.
3. A new Java API for issuing administrative requests using the protocol. I
don't see any discussion on what this will look like.

It might be easiest to tackle these one at a time, no? If not we really do
need to get a complete description at each layer as these are pretty core
public apis.

-Jay

On Fri, Feb 6, 2015 at 11:18 AM, Joe Stein <joe.st...@stealth.ly> wrote:

> I updated the installation and sample usage for the existing patches on the
> KIP site
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations
>
> There are still a few pending items here.
>
> 1) There was already some discussion about using the Broker that is the
> Controller here https://issues.apache.org/jira/browse/KAFKA-1772 and we
> should elaborate on that more in the thread or agree we are ok with admin
> asking for the controller to talk to and then just sending that broker the
> admin tasks.
>
> 2) I like this idea https://issues.apache.org/jira/browse/KAFKA-1912 but we
> can refactor after KAFKA-1694 is committed, no? I know folks just want to talk
> to the broker that is the controller. It may even become useful to have the
> controller run on a broker that isn't even a topic broker anymore (small
> can of worms I am opening here, but it elaborates on Guozhang's hot spot
> point).
>
> 3) Any more feedback?
>
> - Joe Stein
>
> On Fri, Jan 23, 2015 at 3:15 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > A centralized admin operation protocol would be very useful.
> >
> > One more general comment here is that the controller was originally designed
> > to talk only to other brokers through ControllerChannel, while the broker
> > instance that carries the current controller is agnostic of its existence
> > and uses KafkaApis to handle general Kafka requests. Having all admin
> > requests redirected to the controller instance will force the broker to be
> > aware of the controller it carries, and to access its internal data to handle
> > these requests. Plus, with the number of clients outside Kafka's control,
> > this may easily make the controller a hot spot in terms of request load.
> >
> >
> > On Thu, Jan 22, 2015 at 10:09 PM, Joe Stein <joe.st...@stealth.ly> wrote:
> >
> > > inline
> > >
> > > On Thu, Jan 22, 2015 at 11:59 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > >
> > > > Hey Joe,
> > > >
> > > > This is great. A few comments on KIP-4
> > > >
> > > > 1. This is much needed functionality, but there are a lot of these, so
> > > > let's really think these protocols through. We really want to end up with
> > > > a set of well thought-out, orthogonal APIs. For this reason I think it is
> > > > really important to think through the end state even if that includes
> > > > APIs we won't implement in the first phase.
> > > >
> > >
> > > ok
> > >
> > >
> > > >
> > > > 2. Let's please please please wait until we have switched the server over
> > > > to the new Java protocol definitions. If we add umpteen more ad hoc Scala
> > > > objects, that just generates more work for the conversion we know we
> > > > have to do.
> > > >
> > >
> > > ok :)
> > >
> > >
> > > >
> > > > 3. This proposal introduces a new type of optional parameter. This is
> > > > inconsistent with everything else in the protocol, where we use -1 or some
> > > > other marker value. You could argue either way, but let's stick with that
> > > > for consistency. For clients that implemented the protocol in a better way
> > > > than our Scala code, these basic primitives are hard to change.
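To illustrate the marker-value convention from point 3, here is a minimal sketch that encodes an absent integer as -1 on the wire. The class name, the `NOT_SET` constant, and the helper methods are hypothetical and exist only for illustration; they are not part of the Kafka protocol code or of the KIP-4 patches.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: an "optional" int32 field encoded with a marker value
// instead of a separate presence flag or a new optional type.
class MarkerValueSketch {
    // Assumed marker meaning "field not provided".
    static final int NOT_SET = -1;

    // Writes the value, or the marker when the caller passes null.
    static void writeOptionalInt(ByteBuffer buf, Integer value) {
        if (value == null) {
            buf.putInt(NOT_SET);
        } else {
            buf.putInt(value.intValue());
        }
    }

    // Reads the field back, mapping the marker to null.
    static Integer readOptionalInt(ByteBuffer buf) {
        int raw = buf.getInt();
        if (raw == NOT_SET) {
            return null;
        }
        return Integer.valueOf(raw);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        writeOptionalInt(buf, null); // e.g. "use the broker default"
        writeOptionalInt(buf, 3);    // e.g. an explicit value
        buf.flip();
        System.out.println(readOptionalInt(buf));
        System.out.println(readOptionalInt(buf));
    }
}
```

The trade-off is that -1 can then never be a legitimate value for the field, which is one reason the point says you could argue either way.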
> > > >
> > >
> > > yes, less confusing, ok.
> > >
> > >
> > > >
> > > > 4. ClusterMetadata: This seems to duplicate TopicMetadataRequest, which
> > > > has brokers, topics, and partitions. I think we should rename that request
> > > > ClusterMetadataRequest (or just MetadataRequest) and include the id of the
> > > > controller. Or are there other things we could add here?
> > > >
> > >
> > > We could add broker version to it.
> > >
> > >
> > > >
> > > > 5. We have a tendency to try to make a lot of requests that can only go to
> > > > particular nodes. This adds a lot of burden for client implementations (it
> > > > sounds easy, but each discovery can fail in many parts, so it ends up being
> > > > a full state machine to do right). I think we should consider making admin
> > > > commands, and ideally as many of the other APIs as possible, available on
> > > > all brokers and just redirect to the controller on the broker side. Perhaps
> > > > there would be a general way to encapsulate this re-routing behavior.
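A rough sketch of the broker-side re-routing idea in point 5: any broker accepts an admin request and forwards it to the controller when it is not the controller itself. Everything here (`ReroutingSketch`, `Broker`, `handleAdminRequest`, the shared `peers` map, a controller id fixed at construction) is a hypothetical in-memory model, not Kafka code. A real implementation would have to cope with controller changes and network failures.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of "send to any broker, redirect on the broker side".
class ReroutingSketch {
    static class Broker {
        final int id;
        final int controllerId;           // assumed to be known and stable in this sketch
        final Map<Integer, Broker> peers; // shared brokerId -> broker view
        int handledLocally = 0;

        Broker(int id, int controllerId, Map<Integer, Broker> peers) {
            this.id = id;
            this.controllerId = controllerId;
            this.peers = peers;
            peers.put(id, this);
        }

        // Clients may call this on any broker; only the controller executes the request.
        String handleAdminRequest(String request) {
            if (id == controllerId) {
                handledLocally++;
                return "broker " + id + " executed " + request;
            }
            // Not the controller: forward on behalf of the client.
            return peers.get(controllerId).handleAdminRequest(request);
        }
    }

    public static void main(String[] args) {
        Map<Integer, Broker> peers = new HashMap<>();
        new Broker(0, 0, peers); // broker 0 acts as the controller in this sketch
        Broker other = new Broker(1, 0, peers);
        System.out.println(other.handleAdminRequest("create-topic"));
    }
}
```

With this shape the client needs no controller discovery logic, at the cost of an extra hop and of every broker knowing who the controller is.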
> > > >
> > >
> > > If we do that then we should also preserve what we have and do both. The
> > > client can then decide "do I want to go to any broker and proxy" or just
> > > "go to controller and run admin task". Lots of folks have seen controllers
> > > come under distress because of their producers/consumers. There is a ticket
> > > too for controller elect and re-elect,
> > > https://issues.apache.org/jira/browse/KAFKA-1778, so you can force it to a
> > > broker that has 0 load.
> > >
> > >
> > > >
> > > > 6. We should probably normalize the key value pairs used for configs rather
> > > > than embedding a new formatting. So two strings rather than one with an
> > > > internal equals sign.
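A small sketch of the difference point 6 describes. The embedded form makes every client split "key=value" itself, while the normalized form writes the key and the value as two separate strings. The class and method names are hypothetical. The int16 length prefix is an assumption chosen to resemble how strings are commonly framed in binary protocols, not a statement of what KIP-4 specifies.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch comparing "key=value" in one string with two separate strings.
class ConfigPairsSketch {
    // Embedded form: the receiver must parse, and must pick a rule for values containing '='.
    static String[] parseEmbedded(String entry) {
        int idx = entry.indexOf('=');
        if (idx < 0) {
            throw new IllegalArgumentException("missing '=': " + entry);
        }
        return new String[] { entry.substring(0, idx), entry.substring(idx + 1) };
    }

    // Assumed framing: int16 length followed by UTF-8 bytes.
    static void writeString(ByteBuffer buf, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        buf.putShort((short) bytes.length);
        buf.put(bytes);
    }

    static String readString(ByteBuffer buf) {
        int length = buf.getShort();
        byte[] bytes = new byte[length];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Normalized form: two strings, no in-band separator to escape or parse.
    static void writePair(ByteBuffer buf, String key, String value) {
        writeString(buf, key);
        writeString(buf, value);
    }

    static String[] readPair(ByteBuffer buf) {
        String key = readString(buf);
        String value = readString(buf);
        return new String[] { key, value };
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        writePair(buf, "some.config", "a=b"); // the value may contain '=' safely
        buf.flip();
        String[] pair = readPair(buf);
        System.out.println(pair[0] + " -> " + pair[1]);
    }
}
```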
> > > >
> > >
> > > ok
> > >
> > >
> > > >
> > > > 7. Is the postcondition of these APIs that the command has begun or that
> > > > the command has been completed? It is a lot more usable if the command has
> > > > been completed, so you know that if you create a topic and then publish to
> > > > it you won't get an exception about there being no such topic.
> > > >
> > >
> > > We should define that more. There needs to be some more state there, yes.
> > >
> > > We should try to cover https://issues.apache.org/jira/browse/KAFKA-1125
> > > within what we come up with.
> > >
> > >
> > > >
> > > > 8. Describe topic and list topics duplicate a lot of stuff in the metadata
> > > > request. Is there a reason to give back topics marked for deletion? I feel
> > > > like if we just make the post-condition of the delete command be that the
> > > > topic is deleted, that will get rid of the need for this, right? And it will
> > > > be much more intuitive.
> > > >
> > >
> > > I will go back and look through it.
> > >
> > >
> > > >
> > > > 9. Should we consider batching these requests? We have generally tried to
> > > > allow multiple operations to be batched. My suspicion is that without this
> > > > we will get a lot of code that does something like
> > > >    for (String topic : adminClient.listTopics())
> > > >       adminClient.describeTopic(topic);
> > > > This code will work great when you test on 5 topics but not do as well if
> > > > you have 50k.
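To make the cost of the loop above concrete, here is a sketch that counts simulated round trips for per-topic describes versus a single batched describe. `BatchingSketch` and its methods are a hypothetical in-memory stand-in. The `adminClient` in the email is itself only a proposal, so none of these signatures should be read as a real Kafka API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for an admin client that counts requests.
class BatchingSketch {
    private final Map<String, Integer> partitionsByTopic = new LinkedHashMap<>();
    private int roundTrips = 0; // number of simulated requests sent to the cluster

    BatchingSketch(int topicCount) {
        for (int i = 0; i < topicCount; i++) {
            partitionsByTopic.put("topic-" + i, 1 + (i % 8));
        }
    }

    int roundTrips() {
        return roundTrips;
    }

    // One request that returns every topic name.
    List<String> listTopics() {
        roundTrips++;
        return new ArrayList<>(partitionsByTopic.keySet());
    }

    // One request per topic: the pattern the email warns about.
    int describeTopic(String topic) {
        roundTrips++;
        return partitionsByTopic.get(topic);
    }

    // One request for many topics; an empty list is taken to mean "all topics".
    Map<String, Integer> describeTopics(List<String> topics) {
        roundTrips++;
        if (topics.isEmpty()) {
            return new LinkedHashMap<>(partitionsByTopic);
        }
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String topic : topics) {
            Integer partitions = partitionsByTopic.get(topic);
            if (partitions != null) {
                result.put(topic, partitions);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        BatchingSketch perTopic = new BatchingSketch(50_000);
        for (String topic : perTopic.listTopics()) {
            perTopic.describeTopic(topic);
        }
        System.out.println("per-topic round trips: " + perTopic.roundTrips());

        BatchingSketch batched = new BatchingSketch(50_000);
        batched.describeTopics(new ArrayList<String>());
        System.out.println("batched round trips: " + batched.roundTrips());
    }
}
```

In this model the per-topic loop issues one list request plus one describe per topic, while the batched call issues a single request regardless of topic count.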
> > > >
> > >
> > > So => Input is a list of topics (or none for all) and a batch response from
> > > the controller (which could be routed through another broker) of the entire
> > > response? We could introduce a Batch keyword to explicitly show the usage
> > > of it.
> > >
> > >
> > > > 10. I think we should also discuss how we want to expose a programmatic JVM
> > > > client API for these operations. Currently people rely on AdminUtils, which
> > > > is totally sketchy. I think we probably need another client under clients/
> > > > that exposes administrative functionality. We will need this just to
> > > > properly test the new APIs, I suspect. We should figure out that API.
> > > >
> > >
> > > We were talking about that here
> > > https://issues.apache.org/jira/browse/KAFKA-1774 and wrote it in java
> > > https://reviews.apache.org/r/29301/diff/7/?page=4#75 so we could do
> > > something like that, sure.
> > >
> > >
> > > >
> > > > 11. The other information that would be really useful to get would be
> > > > information about partitions--how much data is in the partition, what are
> > > > the segment offsets, what is the log-end offset (i.e. last offset), what is
> > > > the compaction point, etc. I think that done right this would be the
> > > > successor to the very awkward OffsetRequest we have today.
> > > >
> > >
> > > yes!
> > >
> > >
> > > >
> > > > -Jay
> > > >
> > > > On Wed, Jan 21, 2015 at 10:27 PM, Joe Stein <joe.st...@stealth.ly> wrote:
> > > >
> > > > > Hi, created a KIP
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations
> > > > >
> > > > > JIRA https://issues.apache.org/jira/browse/KAFKA-1694
> > > > >
> > > > > /*******************************************
> > > > >  Joe Stein
> > > > >  Founder, Principal Consultant
> > > > >  Big Data Open Source Security LLC
> > > > >  http://www.stealth.ly
> > > > > >  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > > > > ********************************************/
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>
