Werner Daehn created KAFKA-8812:
-----------------------------------

             Summary: Rebalance Producers - yes, I mean it ;-)
                 Key: KAFKA-8812
                 URL: https://issues.apache.org/jira/browse/KAFKA-8812
             Project: Kafka
          Issue Type: Improvement
          Components: core
    Affects Versions: 2.3.0
            Reporter: Werner Daehn


Please bear with me. Initially this thought sounds stupid, but it has its merits.

 

How do you build a distributed producer at the moment? You use Kafka Connect, 
which in turn requires a cluster that decides which instance produces which 
partitions.

On the consumer side it is different. There, Kafka itself handles the data 
distribution. If you have 10 Kafka partitions and 10 consumers, each gets 
data for one partition. With 5 consumers, each gets data from two 
partitions. And if only a single consumer is active, it gets all the data. 
Everything is managed by Kafka; all you have to do is start as many consumers 
as you want.

 

I'd like to suggest something similar for producers. A producer would tell 
Kafka that its source has 10 partitions. The Kafka server then responds with 
the list of partitions this instance is responsible for. If it is the only 
producer, the response would be all 10 partitions. If a second instance 
starts up, the first instance would be told to produce data for partitions 
1-5 and the new one for partitions 6-10. If a producer fails to respond with 
an alive packet, a rebalance happens: the active producer is told to take on 
more load, and the dead producer gets an error when it tries to send data 
again.
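
To make the idea concrete, here is a minimal sketch of the bookkeeping the 
server would have to do. Nothing in it is an existing Kafka API: the class 
ProducerGroupCoordinator, its methods, the session timeout and the 1-based 
partition numbering are all assumptions made for illustration only.

```python
class ProducerGroupCoordinator:
    """Hypothetical server-side bookkeeping for the proposal; not a Kafka API."""

    def __init__(self, source_partitions, session_timeout):
        self.source_partitions = source_partitions  # e.g. 10
        self.session_timeout = session_timeout      # seconds without an alive packet
        self.last_seen = {}    # producer id -> time of last alive packet
        self.assignment = {}   # producer id -> list of source partitions

    def _rebalance(self):
        # Split partitions 1..N into contiguous ranges over the live producers.
        members = sorted(self.last_seen)
        self.assignment = {}
        if not members:
            return
        base, extra = divmod(self.source_partitions, len(members))
        start = 1
        for i, member in enumerate(members):
            count = base + (1 if i < extra else 0)
            self.assignment[member] = list(range(start, start + count))
            start += count

    def join(self, producer_id, now):
        # A new producer registers; every live producer gets a new assignment.
        self.last_seen[producer_id] = now
        self._rebalance()
        return self.assignment[producer_id]

    def heartbeat(self, producer_id, now):
        # The "alive packet" from the description above.
        if producer_id not in self.last_seen:
            raise RuntimeError(f"{producer_id} is no longer a group member")
        self.last_seen[producer_id] = now

    def expire(self, now):
        # Drop producers whose alive packet is overdue, then rebalance.
        dead = [p for p, t in self.last_seen.items()
                if now - t > self.session_timeout]
        for p in dead:
            del self.last_seen[p]
        if dead:
            self._rebalance()
        return dead

    def check_send(self, producer_id, partition):
        # A dead or reassigned producer gets an error when it sends again.
        if partition not in self.assignment.get(producer_id, []):
            raise PermissionError(
                f"{producer_id} does not own source partition {partition}")


coordinator = ProducerGroupCoordinator(source_partitions=10, session_timeout=30)
print(coordinator.join("p1", now=0))   # only producer: all 10 partitions
print(coordinator.join("p2", now=1))   # second producer: upper half
print(coordinator.assignment["p1"])    # first producer: lower half
```

In this sketch the first producer only learns about its smaller assignment by 
reading it back; a real protocol would have to push the change to it, 
presumably the way consumer group rebalances are signalled today.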

For restarts, the producer rebalance of course also has to send the starting 
point from which to resume producing data. It would be best if this were a 
user-generated pointer and not the topic offset. Then it could be, e.g., the 
database system change number, a database transaction ID or something similar.
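
A rough sketch of what such a pointer store could look like follows. The 
class RestartPointers, its method names and the "scn:..." values are 
hypothetical; the pointer is treated as an opaque value that only the 
producer knows how to interpret.

```python
class RestartPointers:
    """Hypothetical store of user-defined restart pointers per source partition."""

    def __init__(self):
        self._pointers = {}  # source partition -> opaque pointer set by the producer

    def commit(self, partition, pointer):
        # The producer records how far it got, e.g. a database system change number.
        self._pointers[partition] = pointer

    def starting_points(self, partitions):
        # Sent along with a rebalance; None means "no pointer yet, start from scratch".
        return {p: self._pointers.get(p) for p in partitions}


store = RestartPointers()
store.commit(6, "scn:48211")             # made-up value for illustration
print(store.starting_points([6, 7]))     # what the new owner of 6 and 7 would receive
```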

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
