On Fri, Feb 3, 2017, at 16:57, Dong Lin wrote: > Thanks for the reply, Colin. I have some comments inline.
Hi Dong L., > > In addition, I also have some comments regarding the Future() in response > to your latest email. As Ismael mentioned, we have added > purgeDataBefore() > API in AdminClient. This API returns Future() that allows user to purge > data in either syn or async manner. And we have presented use-case for > both > syn and async usage of this API in the discussion thread of KIP-107. I > think we should at least return a Future() object in this case, right? Hmm. I didn't see any discussion in the KIP-107 [DISCUSS] thread of why a Future<> based API was proposed. It seemed like the discussion focused on other issues (or maybe I missed the an email thread?) > > As you mentioned, we can transform a blocking API into a Futures-based > API > by using a thread pool. But thread pool seems inconvenient as compared to > using future().get() which transform a Future-based API into a blocking > API. Would it be a good reason to return Future() for all those API where > we need both syn and async mode? That's a good point. It's easier to build a sync API on top of Futures than the reverse. > > > On Fri, Feb 3, 2017 at 10:20 AM, Colin McCabe <cmcc...@apache.org> wrote: > > > On Thu, Feb 2, 2017, at 17:54, Dong Lin wrote: > > > Hey Colin, > > > > > > Thanks for the KIP. I have a few comments below: > > > > > > - I share similar view with Ismael that a Future-based API is better. > > > PurgeDataFrom() is an example API that user may want to do it > > > asynchronously even though there is only one request in flight at a time. > > > In the future we may also have some admin operation that allows user to > > > move replica from one broker to another, which also needs to work in both > > > sync and async style. It seems more generic to return Future<?> for any > > > API > > > that requires both mode. > > > > > > - I am not sure if it is the cleanest way to have enum classes > > > CreateTopicsFlags and DeleteTopicsFlags. Are we going to create such > > > class > > > for every future API that requires configuration? It may be more generic > > > to > > > provide Map<String, ?> to any admin API that operates on multiple > > > entries. > > > For example, deleteTopic(Map<String, Boolean>). And it can be Map<String, > > > Properties> for those API that requires multiple configs per entry. And > > > we > > > can provide default value, doc, config name for those API as we do > > > with AbstractConfig. > > > > Thanks for the comments, Dong L. > > > > EnumSet<CreateTopicsFlags>, EnumSet<DeleteTopicsFlags>, and so forth are > > type-safe ways for the user to pass a set of boolean flags to the > > function. It is basically a replacement for having an api like > > createTopics(...., boolean nonblocking, boolean validateOnly). It is > > preferrable to having the boolean flags because it's immediately clear > > when reading the code what createTopics(..., > > CreateTopicsFlags.VALIDATE_ONLY) means, whereas it is not immediately > > clear what createTopics(..., false, true) means. It also prevents > > having lots of function overloads over time, which becomes confusing for > > users. The EnumSet is not intended as a replacement for all possible > > future arguments we might add, but just an easy way to add more boolean > > arguments later without adding messy function overloads or type > > signatures with many booleans. > > > > Map<String, ?> is not type-safe, and I don't think we should use it in > > our APIs. Instead, we should just add function overloads if necessary. > > > > I agree that using EnumSet is type safe. > > > > > > > > > > - I not sure if "Try" is very intuitive to Java developer. Is there any > > > counterpart of scala's "Try" in java > > > > Unfortunately, there is no equivalent to Try in the standard Java > > library. That is the first place I checked, and I spent a while > > searching. The closest is probably Java 8's Optional. However, > > Optional just allows us to express Some(thing) or None, so callers would > > not be able to determine what the error was. > > > > > We actually have quite a few > > > existing > > > classes in Kafka that address the same problem, such as > > > ProduceRequestResult, LogAppendResult etc. Maybe we can follow the same > > > conversion and use *Result as this class name. > > > > Hmm. ProduceRequestResult and LogAppendResult just store an exception > > alongside the data, and make the caller responsible for checking whether > > the exception is null (or None) before looking at the data. I don't > > think this is a good solution, because if the user forgets to do this, > > they will interpret whatever is in the data fields as valid, even when > > it is not. ProduceRequestResult and LogAppendResult are also internal > > classes, not user-facing APIs, so we did not spend time thinking about > > how to make them easy to use for end-users. > > > > I am not actually suggesting to use LogAppendResult directly. I am > wondering if it would be more intuitive for developer to name this class > something like OperationResult instead of Try. The OperationResult can > store whatever we want to store with current "Try" class you proposed in > the KIP. It is just my personal opinion that "Try" doesn't directly tell > me > what the class actually does. But I am not too worried about it if you > think "Try" is better. I guess part of the reason I used "Try" is that scala.util.Try [ http://www.scala-lang.org/api/2.9.3/scala/util/Try.html ] is very well-known to most Scala developers and fills the same function. I'm open to other names... "Result" might be better than "OperationResult", since it's shorter. > > > > > > > > > > - How are we going to allow user to specify timeout for blocking APIs > > > such > > > as deleteTopic? Is this configured per AdminClient, or should it be > > > specified in the API parameter? > > > > Right now, it is specified by the configuration of the AdminClient. > > > > User probably wants different client-side timeout for different API, > right? > For example, user may want to block for a long time for topic creation > API > because he needs topic to be created before doing the next thing. But the > user may want to call purgeDataBefore() API periodically and > automatically > without waiting for purge operation to complete. So if we don't provide > Future() to user, we should probably allow user to specify client-side > timeout in the API. Keep in mind "async on the client" (futures) is different than "async on the server" (setting very short or 0 timeouts in the RPC). We could have a future-based API and still have long server timeouts. It just means that the futures will wait longer before returning failure or success. I agree that it would be nice to be able to set per-operation timeouts, though. cheers, Colin > > > > > > > > > > - Are we going to have this class initiate its own thread, as we do with > > > Sender class, to send/receive requests? If yes, it will be useful to have > > > have a class that extends AbstractConfig and specifies config and their > > > default values. > > > > Yes, I agree. I will add this to the KIP. > > > > best, > > Colin > > > > > > > > Thanks, > > > Dong > > > > > > > > > > > > On Thu, Feb 2, 2017 at 3:02 PM, Ismael Juma <ism...@juma.me.uk> wrote: > > > > > > > Hi Colin, > > > > > > > > Thanks for the KIP, great to make progress on this. I have some initial > > > > comments, will post more later: > > > > > > > > 1. We have KafkaProducer that implements the Producer interface and > > > > KafkaConsumer that implements the Consumer interface. Maybe we could > > have > > > > KafkaAdminClient that implements the AdminClient interface? Or maybe > > just > > > > KafkaAdmin. Not sure, some ideas for consideration. Also, I don't > > think we > > > > should worry about a name clash with the internal AdminClient written > > in > > > > Scala. That will go away soon enough and choosing a good name for the > > > > public class is what matters. > > > > > > > > 2. We should include the proposed package name in the KIP > > > > (probably org.apache.kafka.clients.admin?). > > > > > > > > 3. It would be good to list the supported configs. > > > > > > > > 4. KIP-107, which passed the vote, specifies the introduction of a > > method > > > > to AdminClient with the following signature. We should figure out how > > it > > > > would look given this proposal. > > > > > > > > Future<Map<TopicPartition, PurgeDataResult>> > > > > purgeDataBefore(Map<TopicPartition, Long> offsetForPartition) > > > > > > > > 5. I am not sure about rejecting the Futures-based API. I think I would > > > > prefer that, personally. Grant had an interesting idea of not exposing > > the > > > > batch methods in the AdminClient to start with to keep it simple and > > > > relying on a Future based API to make it easy for users to run things > > > > concurrently. This is consistent with the producer and Java 8 makes > > things > > > > a lot nicer with CompletableFuture (similar to Scala Futures). I will > > think > > > > more about this and other details of the proposal and send a follow-up. > > > > > > > > Ismael > > > > > > > > On Thu, Feb 2, 2017 at 6:54 PM, Colin McCabe <cmcc...@apache.org> > > wrote: > > > > > > > > > Hi all, > > > > > > > > > > I wrote up a Kafka improvement proposal for adding an > > > > > AdministrativeClient interface. This is a continuation of the work > > on > > > > > KIP-4 towards centralized administrative operations. Please check > > out > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-117% > > > > 3A+Add+a+public+ > > > > > AdministrativeClient+API+for+Kafka+admin+operations > > > > > > > > > > regards, > > > > > Colin > > > > > > > > > > >