Hi Colin,

Thanks for the reply. Actually, the RoundRobinPartitioner won't give an equal
distribution when working with multiple producers, because one producer does
not know about the others. If you consider that producers are producing
messages at random times, in the worst-case scenario all producers can be in
sync and a single partition could receive as many messages at once as there
are producers.
It's easy to reproduce this.
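
A minimal simulation sketch of the scenario above, not real Kafka client
code. It assumes each producer keeps a private round-robin counter starting
at partition 0 and sends at random times; the names simulate and
send_probability are made up for illustration.

```python
import random
from collections import Counter


def simulate(num_producers, num_partitions, num_ticks, send_probability, seed=0):
    """Simulate independent round-robin producers over discrete time ticks.

    Returns (total messages per partition, largest number of messages that
    landed on a single partition within one tick).
    """
    rng = random.Random(seed)
    # Each producer has its own counter and cannot see the others.
    next_partition = [0] * num_producers
    totals = Counter()
    worst_burst = 0
    for _ in range(num_ticks):
        tick_counts = Counter()
        for producer in range(num_producers):
            # A producer sends in this tick with the given probability.
            if rng.random() < send_probability:
                partition = next_partition[producer]
                next_partition[producer] = (partition + 1) % num_partitions
                tick_counts[partition] += 1
                totals[partition] += 1
        if tick_counts:
            worst_burst = max(worst_burst, max(tick_counts.values()))
    return totals, worst_burst


if __name__ == "__main__":
    # Worst case: all producers are in sync and send in the same tick.
    print(simulate(num_producers=4, num_partitions=8,
                   num_ticks=1, send_probability=1.0))
    # Random send times: inspect the skew for a given seed.
    print(simulate(num_producers=4, num_partitions=8,
                   num_ticks=1000, send_probability=0.3))
```

In the synced case every producer picks partition 0 in the same tick, so that
partition receives one message per producer while the others receive none.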

I have asked this question on the users mailing list too (and on Slack and on
Stack Overflow).

Kafka currently has no means of doing a round robin across multiple
producers or on the broker side.

This means there is currently NO GUARANTEE of equal distribution across
partitions, as the partition is chosen by the producer.

The result is unbalanced consumption when working with consumer groups,
and the options are: creating a custom shared partitioner, relying on Kafka's
random partitioning, or introducing a middleman between topics (all of them
having big cons).
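
To make the "custom shared partitioner" option concrete, here is a rough
in-process sketch, assuming all producers can reach one shared counter. The
class SharedRoundRobin and the function run_producers are hypothetical names
for illustration and are not part of Kafka; threads stand in for producers.

```python
import threading
from collections import Counter


class SharedRoundRobin:
    """Hypothetical counter that every producer consults before sending."""

    def __init__(self, num_partitions):
        self._num_partitions = num_partitions
        self._next = 0
        self._lock = threading.Lock()

    def next_partition(self):
        # The lock makes "read and advance" atomic across producers.
        with self._lock:
            partition = self._next
            self._next = (self._next + 1) % self._num_partitions
            return partition


def run_producers(num_producers, messages_each, num_partitions):
    """Run producers as threads and count messages per partition."""
    shared = SharedRoundRobin(num_partitions)
    counts = Counter()
    counts_lock = threading.Lock()

    def produce():
        for _ in range(messages_each):
            partition = shared.next_partition()
            with counts_lock:
                counts[partition] += 1

    threads = [threading.Thread(target=produce) for _ in range(num_producers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counts


if __name__ == "__main__":
    print(run_producers(num_producers=4, messages_each=25, num_partitions=8))
```

With one shared counter the per-partition totals differ by at most one
message. Across separate processes or hosts the counter would have to live in
some external service, which adds a round trip per message and a shared point
of failure.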

I thought of asking here to see whether this is a topic that could concern
other developers (and maybe understand whether this could be a KIP
discussion).

Maybe I'm missing something... I would like to know.

According to my interpretation of the code (I just read through some
classes), there is currently no way to do partition balancing on the
broker: the producer sends messages directly to partition leaders, so the
partition currently needs to be chosen on the producer.

I understand that in order to perform round robin across partitions of a
topic when working with multiple producers, some development needs to be
done. Am I right?


Thanks


On Fri, Jun 12, 2020, 10:57 PM Colin McCabe <cmcc...@apache.org> wrote:

> Hi Vinicius,
>
> This question seems like a better fit for the user mailing list rather
> than the developer mailing list.
>
> Anyway, if I understand correctly, you are asking if the producer can
> choose to assign partitions in a round-robin fashion rather than based on
> the key.  The answer is, you can, by using RoundRobinPartitioner. (again,
> if I'm understanding the question correctly).
>
> best,
> Colin
>
> On Tue, Jun 9, 2020, at 00:48, Vinicius Scheidegger wrote:
> > Anyone?
> >
> > On Fri, Jun 5, 2020 at 2:42 PM Vinicius Scheidegger <
> > vinicius.scheideg...@gmail.com> wrote:
> >
> > > Does anyone know how could I perform a load balance to distribute
> > > equally the messages to all consumers within the same consumer group
> > > having multiple producers?
> > >
> > > Is this a conceptual flaw on Kafka, wasn't it thought for equal
> > > distribution with multiple producers or am I missing something?
> > > I've asked on Stack Overflow, on Kafka users mailing group, here (on
> > > Kafka Devs) and on Slack - and still have no definitive answer
> > > (actually most of the time I got no answer at all)
> > >
> > > Would something like this even be possible in the way Kafka is
> > > currently designed?
> > > How does proposing for a KIP work?
> > >
> > > Thanks,
> > >
> > >
> > >
> > > On Thu, May 28, 2020, 3:44 PM Vinicius Scheidegger <
> > > vinicius.scheideg...@gmail.com> wrote:
> > >
> > >> Hi,
> > >>
> > >> I'm trying to understand a little bit more about how Kafka works.
> > >> I have a design with multiple producers writing to a single topic and
> > >> multiple consumers in a single Consumer Group consuming message from
> > >> this topic.
> > >>
> > >> My idea is to distribute the messages from all producers equally. From
> > >> reading the documentation I understood that the partition is always
> > >> selected by the producer. Is that correct?
> > >>
> > >> I'd also like to know if there is an out of the box option to assign
> > >> the partition via a round robin *on the broker side *to guarantee
> > >> equal distribution of the load - if possible to each consumer, but if
> > >> not possible, at least to each partition.
> > >>
> > >> If my understanding is correct, it looks like in a multiple producer
> > >> scenario there is lack of support from Kafka regarding load balancing
> > >> and customers have to either stick to the hash of the key (random
> > >> distribution, although it would guarantee same key goes to the same
> > >> partition) or they have to create their own logic on the producer
> > >> side (i.e. by sharing memory)
> > >>
> > >> Am I missing something?
> > >>
> > >> Thank you,
> > >>
> > >> Vinicius Scheidegger
> > >>
> > >
> >
>
