[ https://issues.apache.org/jira/browse/KAFKA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias J. Sax reopened KAFKA-8812:
------------------------------------
No need to close the ticket as long as the KIP has not been declined :)
> Rebalance Producers - yes, I mean it ;-)
> ----------------------------------------
>
> Key: KAFKA-8812
> URL: https://issues.apache.org/jira/browse/KAFKA-8812
> Project: Kafka
> Issue Type: Improvement
> Components: core
> Affects Versions: 2.3.0
> Reporter: Werner Daehn
> Priority: Major
>
> Please bear with me. At first this idea may sound stupid, but it has its
> merits.
>
> How do you build a distributed producer at the moment? You use Kafka Connect,
> which in turn requires a cluster that decides which instance produces which
> partitions.
> On the consumer side it is different. There, Kafka itself does the data
> distribution. If you have 10 Kafka partitions and 10 consumers in the same
> consumer group, each will get data from one partition. With 5 consumers, each
> will get data from two partitions. And if only a single consumer is active,
> it gets all the data. Everything is managed by Kafka; all you have to do is
> start as many consumers as you want.
>
> I'd like to suggest something similar for producers. A producer would
> tell Kafka that its source has 10 partitions. The Kafka server then responds
> with the list of partitions this instance is responsible for. If it is
> the only producer, the response would be all 10 partitions. When a second
> instance starts up, the first instance would be told to produce data for
> partitions 1-5 and the new one for partitions 6-10. If a producer fails to
> send a keep-alive packet, a rebalance happens: the remaining active producers
> are told to take on more load, and the dead producer gets an error when it
> tries to send data again.
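
To make the proposed behaviour concrete, here is a minimal in-memory sketch of
the coordination logic described above. It is an illustration only and not an
existing Kafka API: the names ProducerGroup, join, heartbeat, expire and
check_send are hypothetical, the contiguous range split is an assumption taken
from the "1-5 / 6-10" example, and time is passed in explicitly to keep the
sketch deterministic.

```python
def assign_partitions(num_partitions, members):
    """Split source partitions 1..num_partitions into contiguous ranges."""
    members = sorted(members)
    if not members:
        return {}
    base, extra = divmod(num_partitions, len(members))
    assignment, start = {}, 1
    for i, member in enumerate(members):
        # The first `extra` members take one additional partition each.
        count = base + (1 if i < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


class ProducerGroup:
    """Hypothetical server-side view of one group of producers."""

    def __init__(self, num_partitions, session_timeout):
        self.num_partitions = num_partitions
        self.session_timeout = session_timeout
        self.last_seen = {}    # member id -> time of last keep-alive
        self.assignment = {}   # member id -> list of source partitions

    def _rebalance(self):
        self.assignment = assign_partitions(self.num_partitions, self.last_seen)

    def join(self, member, now):
        """Register a producer and return the partitions it should produce."""
        self.last_seen[member] = now
        self._rebalance()
        return self.assignment[member]

    def heartbeat(self, member, now):
        if member not in self.last_seen:
            raise RuntimeError(f"{member} is not part of the group")
        self.last_seen[member] = now

    def expire(self, now):
        """Drop producers whose keep-alive is overdue, then rebalance."""
        dead = [m for m, t in self.last_seen.items()
                if now - t > self.session_timeout]
        for member in dead:
            del self.last_seen[member]
        if dead:
            self._rebalance()
        return dead

    def check_send(self, member, partition):
        """Reject data from a producer that no longer owns the partition."""
        if partition not in self.assignment.get(member, []):
            raise PermissionError(f"{member} does not own partition {partition}")


group = ProducerGroup(num_partitions=10, session_timeout=30)
group.join("p1", now=0)      # only producer: partitions 1..10
group.join("p2", now=0)      # p1 now owns 1..5, p2 owns 6..10
group.heartbeat("p1", now=40)
group.expire(now=50)         # p2 missed its keep-alive; p1 owns 1..10 again
```

After the last call, check_send("p2", 6) raises, which corresponds to the dead
producer getting an error when it tries to send data again.
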
> On restart, the producer rebalance of course also has to send the point from
> which to resume producing data. It would be best if this were a
> user-generated pointer rather than the topic offset. Then it could be, e.g.,
> the database system change number, a database transaction ID, or something
> similar.
>
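
The restart part of the proposal quoted above could be sketched roughly as
follows. This is only an illustration under assumptions, not an existing Kafka
API: SourcePositionStore, commit and starting_points are hypothetical names,
and the pointer is treated as an opaque, user-defined value (for example a
system change number) stored per source partition.

```python
class SourcePositionStore:
    """Hypothetical store of user-defined restart pointers per source partition."""

    def __init__(self):
        self._positions = {}  # source partition -> opaque user pointer

    def commit(self, partition, pointer):
        """Record how far the source partition has been produced."""
        self._positions[partition] = pointer

    def starting_points(self, partitions):
        """Return the pointer to resume from for each assigned partition.

        None means nothing was produced yet, so the producer starts from
        the beginning of that source partition.
        """
        return {p: self._positions.get(p) for p in partitions}


store = SourcePositionStore()
store.commit(1, "SCN:1000")   # e.g. a database system change number
store.commit(2, "TX:42")      # e.g. a database transaction ID

# After a rebalance, a producer that was handed partitions 1-3 would
# receive these starting points together with its assignment.
resume_from = store.starting_points([1, 2, 3])
```

Keeping the pointer opaque means the server never has to interpret it; only
the producer knows how to map it back to a position in its source.
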
--
This message was sent by Atlassian Jira
(v8.3.2#803003)