Re: [DISCUSSION] Partition Selection and Coordination By Brokers for Producers

Jun Rao Mon, 01 Jun 2015 17:51:13 -0700

Bhavesh,

I am not sure if load balancing based on the consumption rate (1.b) makes
sense. Each consumer typically consumes all partitions from a topic. So, as
long as the data in each partition is balanced, the consumption rate will
be balanced too. Selecting a partition based on the size of each partition
could be useful, but I am not sure if it's going to be significantly better
than just having the clients pick a random partition. Also, implementing
this on the broker side has downsides. First, having the broker forward each
produce request increases the network traffic on the broker. Second, this
likely will make the broker code more complicated since we probably have to
put every forwarded produce request in a purgatory. Third, we currently
don't maintain the size of each partition on every broker.
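
To illustrate the point about random selection, here is a minimal simulation
sketch. The message count, partition count, and all function names are
hypothetical and not part of any Kafka client; it only compares how evenly
messages land when each one is sent to a random partition versus strict
round robin.

```python
import random
from collections import Counter


def simulate_random(num_messages, num_partitions, seed=0):
    """Count messages per partition when each message picks a random partition."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same counts
    return Counter(rng.randrange(num_partitions) for _ in range(num_messages))


def simulate_round_robin(num_messages, num_partitions):
    """Count messages per partition when partitions are visited in turn."""
    return Counter(i % num_partitions for i in range(num_messages))


def spread(counts, num_partitions):
    """Difference between the fullest and the emptiest partition."""
    per_partition = [counts.get(p, 0) for p in range(num_partitions)]
    return max(per_partition) - min(per_partition)


if __name__ == "__main__":
    n, parts = 100_000, 8  # assumed workload, for illustration only
    print("random spread:     ", spread(simulate_random(n, parts), parts))
    print("round-robin spread:", spread(simulate_round_robin(n, parts), parts))
```

With many messages of similar size, the random spread should stay small
relative to the per-partition mean, which is why a size-aware choice may not
buy much over a random one.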


Given these, I think your best bet is probably to just fix those non-Java
clients to send data in a round-robin fashion.
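
As a rough sketch of what such a client-side fix could look like, the class
below cycles through partition ids for non-keyed messages. The class and
method names are hypothetical and do not correspond to the API of any
particular Kafka client library; a real client would also need to obtain the
partition count from topic metadata.

```python
import itertools
import threading


class RoundRobinPartitioner:
    """Hands out partition ids 0..N-1 in turn for messages without a key."""

    def __init__(self, num_partitions):
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        self._num_partitions = num_partitions
        self._counter = itertools.count()
        # The lock keeps the sequence consistent if several threads share
        # one partitioner instance.
        self._lock = threading.Lock()

    def next_partition(self):
        """Return the partition id the next non-keyed message should go to."""
        with self._lock:
            return next(self._counter) % self._num_partitions


if __name__ == "__main__":
    partitioner = RoundRobinPartitioner(num_partitions=3)
    print([partitioner.next_partition() for _ in range(7)])
```

Keyed messages are deliberately left out here, since they are normally mapped
to a partition by hashing the key rather than by rotation.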

Thanks,

Jun

On Fri, May 29, 2015 at 1:22 PM, Bhavesh Mistry <mistry.p.bhav...@gmail.com>
wrote:

> Hi Kafka Dev Team,
>
> I would appreciate your feedback on moving producer partition selection
> from the producer to the broker.  Also, please let me know the correct
> process for collecting feedback from the Kafka dev team and/or community.
>
> Thanks,
>
> Bhavesh
>
> On Tue, May 26, 2015 at 11:54 AM, Bhavesh Mistry <mistry.p.bhav...@gmail.com>
> wrote:
>
> > Hi Kafka Dev Team,
> >
> > I am sorry, I am new to the discussion and KIP process, so I had
> > commented on another email voting thread.  Please let me know the correct
> > process for collecting feedback and starting a discussion with the Kafka
> > dev group.
> >
> > Here is original message:
> >
> > I have had experience with both the producer and the consumer side.  I
> > have a different use case for this partition selection strategy.
> >
> >
> >
> > Problem:
> >
> >
> > We have a heterogeneous environment of producers (by that I mean we have
> > Node.js, Python, new Java, and old Scala-based producers writing to the
> > same topic).  I have seen that not all producers employ a round-robin
> > strategy for non-keyed messages like the new producer does.  Hence, data
> > ingestion into partitions is non-uniform, which delays overall message
> > processing.
> >
> > How can we achieve a uniform distribution/message injection rate across
> > all partitions?
> >
> >
> >
> > Proposed Solution:
> >
> >
> > Let the broker cluster, with more intelligence, decide the next partition
> > of a topic to send data to, rather than the producer itself.
> >
> > 1)   When a producer sends data to the brokers, the ProduceResponse in the
> > Kafka wire protocol would carry a hint telling the client which partition
> > to send to next, based on the following logic (which could be customizable):
> >
> > a.     Based on the overall data injection rate for the topic and the
> > current producer's injection rate
> >
> > b.     Ability to rank partitions based on consumption rate (advanced use
> > case, as there may be many consumers, so a weighted average etc.)
> >
> >
> >
> > Ultimately, brokers will coordinate among thousands of producers and
> > divert traffic based on data injection rate (out-of-the-box feature) and
> > consumption rate (pluggable interface implementation on the brokers' side).
> > The goal here is to attain uniformity and/or lower delivery rate to
> > consumers.  This is similar to consumer coordination moving to the brokers:
> > the producer-side partition selection would also move to the brokers.
> > This will benefit both Java and non-Java clients.
> >
> >
> >
> > Please let me know your feedback on this subject.  I am sure lots of you
> > run Kafka in an enterprise environment where you may have different types
> > of producers for the same topic (e.g. logging clients in JavaScript, PHP,
> > Java, and Python sending to a log topic).  I would really appreciate your
> > feedback on this.
> >
> >
> >
> >
> >
> > Thanks,
> >
> >
> > Bhavesh
> >
>
