tsturzl commented on issue #12224:
URL: https://github.com/apache/pulsar/issues/12224#issuecomment-1608379774

   @Jason918 I believe the author of this ticket was describing Pulsar 
through their understanding of Kafka. I've been looking for something similar 
to this in Pulsar and have not been able to find it, nor a workaround. 
Pulsar key shared subscriptions work similarly to Kafka, but whereas Kafka 
balances the assignment of partitions to consumers, a Pulsar key shared 
subscription just divides the hash key space among all consumers on that 
subscription. The key difference here is that Pulsar is not assigning 
partitions to consumers, but rather assigning each of them a range of the key 
space for which they'll receive messages. In this context "rebalancing" might 
be the wrong word, but ultimately what is being talked about is letting 
consumers know when their hash ranges have changed.
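
   To make the key space idea concrete, here is a minimal sketch of how a 
hash slot space could be divided among consumers. It is illustrative only: the 
class `KeySpaceSketch`, the even split, the slot count of 65536, and the use of 
`String.hashCode()` are all assumptions for this example, not the broker's 
actual assignment algorithm.

```java
import java.util.ArrayList;
import java.util.List;

class KeySpaceSketch {
    // Assumed size of the hash slot space; illustrative only.
    static final int RANGE_SIZE = 65536;

    /** Inclusive slot range owned by one consumer. */
    static final class Range {
        final int start;
        final int end;
        Range(int start, int end) { this.start = start; this.end = end; }
        boolean contains(int slot) { return slot >= start && slot <= end; }
        @Override public String toString() { return "[" + start + ", " + end + "]"; }
    }

    /** Evenly divides the slot space among the given number of consumers. */
    static List<Range> split(int consumers) {
        List<Range> ranges = new ArrayList<>();
        for (int i = 0; i < consumers; i++) {
            int start = (int) ((long) RANGE_SIZE * i / consumers);
            int end = (int) ((long) RANGE_SIZE * (i + 1) / consumers) - 1;
            ranges.add(new Range(start, end));
        }
        return ranges;
    }

    /** Maps a key to a slot; String.hashCode() stands in for the real hash. */
    static int slotFor(String key) {
        return Math.floorMod(key.hashCode(), RANGE_SIZE);
    }

    /** Index of the consumer whose range contains the key's slot. */
    static int ownerOf(String key, int consumers) {
        int slot = slotFor(key);
        List<Range> ranges = split(consumers);
        for (int i = 0; i < ranges.size(); i++) {
            if (ranges.get(i).contains(slot)) return i;
        }
        throw new IllegalStateException("no range covers slot " + slot);
    }

    public static void main(String[] args) {
        System.out.println("2 consumers: " + split(2));
        System.out.println("3 consumers: " + split(3));
    }
}
```

   Going from two consumers to three changes every range boundary in this 
sketch, which is exactly the kind of change a consumer currently has no way to 
observe.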
   
   This is common in distributed data processing workflows where the incoming 
data tells you which other topics to consume so you can combine the latest 
from each source. So I might have a key shared subscription to feed out work 
to a pool of consumers, and based on the data they see they might consume 
other subscriptions to retrieve dependent data. If another consumer joins, it 
will take over part of an existing consumer's key space, so that existing 
consumer needs to know to stop consuming those other topics: they are no 
longer needed, because the key shared subscription will no longer send it the 
corresponding keys. In Kafka you handle this issue by notifying the consumer 
that a rebalance occurred. That means that the client can evaluate what it 
should stop consuming.
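
   As a rough sketch of what "evaluate what it should stop consuming" could 
look like on the client side, the class below tracks which dependent topic was 
opened for each message key and releases the ones whose key no longer falls in 
the consumer's range. Everything here is hypothetical: `DependentSubscriptions`, 
its methods, the slot count, and the `String.hashCode()` stand-in hash are 
assumptions for illustration, and the new range would have to come from a 
notification that Pulsar does not currently provide.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class DependentSubscriptions {
    // Assumed size of the hash slot space; illustrative only.
    static final int RANGE_SIZE = 65536;

    // Message key -> name of the dependent topic opened because of that key.
    private final Map<String, String> topicsByKey = new HashMap<>();

    /** Maps a key to a slot; String.hashCode() stands in for the real hash. */
    static int slotFor(String key) {
        return Math.floorMod(key.hashCode(), RANGE_SIZE);
    }

    /** Records that a dependent topic is being consumed on behalf of a key. */
    void track(String key, String dependentTopic) {
        topicsByKey.put(key, dependentTopic);
    }

    int size() {
        return topicsByKey.size();
    }

    /**
     * Drops every tracked key whose slot lies outside the inclusive range
     * [start, end] and returns the dependent topics that should be closed.
     */
    List<String> retainOwned(int start, int end) {
        List<String> released = new ArrayList<>();
        Iterator<Map.Entry<String, String>> it = topicsByKey.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            int slot = slotFor(entry.getKey());
            if (slot < start || slot > end) {
                released.add(entry.getValue());
                it.remove();
            }
        }
        return released;
    }

    public static void main(String[] args) {
        DependentSubscriptions deps = new DependentSubscriptions();
        deps.track("a", "topic-a");
        deps.track("zz", "topic-zz");
        // Pretend the consumer's range shrank to slots 0..1000.
        System.out.println("close: " + deps.retainOwned(0, 1000));
    }
}
```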
   
   `ConsumerEventListener` might be a good place to expand into for providing 
this feature, but as it stands this event listener does not provide 
functionality similar to the mentioned `ConsumerRebalanceListener` in Kafka. 
Pulsar may not do partition reassignment, but it also doesn't provide any 
comparable way to handle changes in a key shared consumer's key space, making 
it difficult to effectively implement combine-latest processing strategies, 
and even simple things like invalidating local caches built from previously 
seen data from that key shared subscription.
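
   For illustration, this is roughly the shape of callback I have in mind. 
None of these names exist in the Pulsar client: `KeyRangeListener`, 
`onRangeRevoked`, and `notifyChange` are hypothetical, and the slot count and 
`String.hashCode()` hash are stand-ins. The sketch only shows how a revoked 
range could be derived from an old and a new contiguous range and used to 
evict a local cache.

```java
import java.util.HashMap;
import java.util.Map;

class KeyRangeChangeSketch {
    // Assumed size of the hash slot space; illustrative only.
    static final int RANGE_SIZE = 65536;

    /** Hypothetical callback; not part of the Pulsar client API. */
    interface KeyRangeListener {
        void onRangeRevoked(int start, int end);
    }

    /** Maps a key to a slot; String.hashCode() stands in for the real hash. */
    static int slotFor(String key) {
        return Math.floorMod(key.hashCode(), RANGE_SIZE);
    }

    /**
     * Calls the listener once for each part of the old inclusive range
     * [oldStart, oldEnd] that is not covered by [newStart, newEnd].
     */
    static void notifyChange(int oldStart, int oldEnd,
                             int newStart, int newEnd,
                             KeyRangeListener listener) {
        if (newStart > oldEnd || newEnd < oldStart) {
            // No overlap: the whole old range was taken away.
            listener.onRangeRevoked(oldStart, oldEnd);
            return;
        }
        if (oldStart < newStart) {
            listener.onRangeRevoked(oldStart, newStart - 1);
        }
        if (newEnd < oldEnd) {
            listener.onRangeRevoked(newEnd + 1, oldEnd);
        }
    }

    /** Local cache that evicts entries whose key hashes into a revoked range. */
    static final class LocalCache implements KeyRangeListener {
        final Map<String, String> entries = new HashMap<>();

        @Override
        public void onRangeRevoked(int start, int end) {
            entries.keySet().removeIf(key -> {
                int slot = slotFor(key);
                return slot >= start && slot <= end;
            });
        }
    }

    public static void main(String[] args) {
        LocalCache cache = new LocalCache();
        cache.entries.put("a", "cached value for a");
        cache.entries.put("zz", "cached value for zz");
        // The consumer's range shrinks from the full space to slots 0..1000.
        notifyChange(0, RANGE_SIZE - 1, 0, 1000, cache);
        System.out.println("still cached: " + cache.entries.keySet());
    }
}
```

   Kafka's `ConsumerRebalanceListener` offers `onPartitionsRevoked` and 
`onPartitionsAssigned` for the partition case; the sketch borrows only the 
revoked half of that idea and applies it to hash ranges.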


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

