Hi! > On 12 Aug 2016, at 20:22, Jun Rao <j...@confluent.io> wrote: > > Hi, Andrey, > > Yes, I agree that it's more work for the client to do the round-robin logic > since it has to be stateful. However, that applies to both the consumer > client and the replica fetch thread. I just feel that it's weird to use one > strategy in the replica fetch thread and another in the consumer client. An > alternative is to just let the leader broker shuffle the partitions before > filling up the bytes. This alleviates the need for the clients to maintain > additional states. It's not as deterministic as doing round-robin in the > client, but is probably good enough in practice. >
Oh, I got the point. You are talking about high-level consumer client API (i.e. Consumer.java). Yes, it makes sense to use the same strategy as in ReplicaFetcher thread there. > For 2), I am not sure if we want to set the default limit of fetch response > in the consumer to be 0. Today, sending a response of more than 2GB will > cause buffer overflow. How about defaulting it to 50BM? This should be > enough for the common case. For the default fetch response limit in the > replica fetcher, I'd recommend 10MB since there could be multiple replica > fetcher threads to different brokers. > Ok, I’ll put this numbers in PR - we can discuss them later. > Thanks, > > Jun > Andrey. > > On Fri, Aug 12, 2016 at 9:19 AM, Andrey L. Neporada < > anepor...@yandex-team.ru> wrote: > >> Hi! >> >>> On 12 Aug 2016, at 18:32, Jun Rao <j...@confluent.io> wrote: >>> >>> Hi, Andrey, >>> >>> Why shouldn't the client library do reordering? It seems that if >>> ReplicaFetcher thread does round-robin, the consumer client should do >> that >>> too? >>> >> >> IMHO the client library is not a good place to implement such logic. >> For example, if we want to put round-robin logic in client lib, it will >> become stageful (since it should remember first empty partition received >> from server). It should also remember some kind of context to distinguish >> one end-user (or partition set) from another, etc. >> >> Personally, I'd expect from client library to submit requests as is. >> Anyway, I think it is difficult to come up with reordering logic that will >> be good for all clients. >> >> So, I propose: >> >> 1) preserve per-partition limit in fetch request (as discussed) >> 2) make default limit of fetch response to be 0 (which means no limit at >> all). So, by default, new fetch request should behave exactly like old one >> 3) implement round-robin logic in ReplicaFetcherThread and use non-zero >> default fetch response limit there. >> 4) document that all end-users who want to use FetchRequest with actual >> response limit to implement some kind of round-robin to ensure fairness (if >> they care about it) >> >> >>> Thanks, >>> >>> Jun >>> >> >> Thanks, >> Andrey. >> >> >>> On Fri, Aug 12, 2016 at 3:56 AM, Andrey L. Neporada < >>> anepor...@yandex-team.ru> wrote: >>> >>>> Hi! >>>> >>>>> On 12 Aug 2016, at 04:29, Jun Rao <j...@confluent.io> wrote: >>>>> >>>>> Hi, Andrey, >>>>> >>>>> One potential benefit of keeping the per partition limit is for Kafka >>>>> stream. When reading messages from different partitions, KStream >> prefers >>>> to >>>>> read from partitions with smaller timestamps first and only advances >> the >>>>> KStream timestamp when it sees at least one message from every >> partition. >>>>> Being able to fill up multiple partitions in a single fetch response >> can >>>>> help KStream advance the timestamp quicker when there is backlog from >> the >>>>> input. So, it's probably better if we just add a new response limit >> while >>>>> keeping the per partition limit. >>>>> >>>> >>>> Yes, this makes sense for me. >>>> >>>>> Also, for fairness across partitions, you mentioned "The solution is to >>>>> start fetching from first empty partition in round-robin fashion or to >>>>> perform random shuffle of partitions.". It seems the former is more >>>>> deterministic. Did you use that in your implementation and should we >>>>> recommend that for non-java clients as well? >>>>> >>>> >>>> In my initial implementation I just did random shuffling on server side. >>>> Now I plan to use deterministic approach - do round-robin in >>>> ReplicaFetcher thread - I believe the client library itself shouldn’t do >>>> any form of reordering. >>>> >>>> But we should definitely recommend some form of shuffling if client >> wants >>>> to limit response size. >>>> Not sure which shuffling method is better. >>>> >>>> >>>>> Thanks, >>>>> >>>>> Jun >>>>> >>>> >>>> Thanks, >>>> Andrey. >>>> >>>> >> >>