Re: [DISCUSS] KIP-932: Queues for Kafka

Adam Warski Tue, 30 May 2023 08:53:37 -0700

Thanks for the explanation!

So effectively, a share group is subscribed to each partition, but the data is 
not pushed to the consumer; it is only sent on demand. And when demand is 
signalled, a batch of messages is sent?
Hence it would be up to the consumer to prefetch a sufficient number of batches 
to ensure that it is never "bored"?
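
To check my understanding, here is a minimal Python sketch of that model. It is
a toy simulation, not the KIP-932 protocol or any Kafka client API: every name
in it (SharePartition, ShareGroup, PrefetchingConsumer, lock_timeout,
low_watermark) is made up for illustration. It assumes that every member can
fetch from all partitions, that records are only handed out when a consumer
asks, and that a fetched record becomes available to other consumers again if
it is not acknowledged within a timeout.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SharePartition:
    """Toy view of one partition as seen by a share group (hypothetical)."""
    records: list
    lock_timeout: float = 30.0                       # assumed timeout, in seconds
    next_offset: int = 0                             # first offset never handed out
    in_flight: dict = field(default_factory=dict)    # offset -> lock deadline
    released: deque = field(default_factory=deque)   # offsets awaiting redelivery

    def acquire(self, max_records, now):
        # Records whose lock has expired become available to any consumer again.
        expired = sorted(o for o, deadline in self.in_flight.items() if deadline <= now)
        for offset in expired:
            del self.in_flight[offset]
            self.released.append(offset)
        batch = []
        while len(batch) < max_records:
            if self.released:
                offset = self.released.popleft()
            elif self.next_offset < len(self.records):
                offset = self.next_offset
                self.next_offset += 1
            else:
                break
            self.in_flight[offset] = now + self.lock_timeout
            batch.append((offset, self.records[offset]))
        return batch

    def acknowledge(self, offset):
        self.in_flight.pop(offset, None)


class ShareGroup:
    """Every member sees all partitions; nothing is pushed, only fetched."""

    def __init__(self, partitions):
        self.partitions = partitions  # dict: partition name -> SharePartition

    def fetch(self, max_records, now):
        batch = []
        for name, partition in self.partitions.items():
            for offset, value in partition.acquire(max_records - len(batch), now):
                batch.append((name, offset, value))
            if len(batch) == max_records:
                break
        return batch


class PrefetchingConsumer:
    """Asks for a new batch whenever its local buffer runs low."""

    def __init__(self, group, batch_size, low_watermark):
        self.group = group
        self.batch_size = batch_size
        self.low_watermark = low_watermark
        self.buffer = deque()

    def poll(self, now):
        if len(self.buffer) <= self.low_watermark:
            self.buffer.extend(self.group.fetch(self.batch_size, now))
        return self.buffer.popleft() if self.buffer else None

    def ack(self, record):
        name, offset, _ = record
        self.group.partitions[name].acknowledge(offset)
```

In this sketch the lock deadline starts when a record is fetched, not when
processing begins, so a consumer that prefetches too aggressively could see
its buffered records time out and be handed to someone else. If the real
design behaves similarly, the prefetch depth would have to be balanced against
the timeout.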


Adam

> On 30 May 2023, at 15:25, Andrew Schofield <[email protected]> wrote:
> 
> Hi Adam,
> Thanks for your question.
> 
> With a share group, each fetch is able to grab available records from any
> partition. So, it alleviates the “head-of-line” blocking problem where a slow
> consumer gets in the way. There’s no actual stealing from a slow consumer, but
> it can be overtaken and must complete its processing within the timeout.
> 
> The way I see this working is that when a consumer joins a share group, it
> receives a set of assigned share-partitions. To start with, every consumer
> will be assigned all partitions. We can be smarter than that, but I think
> that’s really a question of writing a smarter assignor just as has occurred
> over the years with consumer groups.
> 
> Only a small proportion of Kafka workloads are super high throughput. Share
> groups would struggle with those I’m sure. Share groups do not diminish the
> value of consumer groups for streaming. They just give another option for
> situations where a different style of consumption is more appropriate.
> 
> Thanks,
> Andrew
> 
>> On 29 May 2023, at 17:18, Adam Warski <[email protected]> wrote:
>> 
>> Hello,
>> 
>> thank you for the proposal! A very interesting read.
>> 
>> I do have one question, though. When you subscribe to a topic using consumer 
>> groups, it might happen that one consumer has processed all messages from 
>> its partitions, while another one still has a lot of work to do (this might 
>> be due to unbalanced partitioning, long processing times etc.). In a 
>> message-queue approach, it would be great to solve this problem - so that a 
>> consumer that is free can steal work from other consumers. Is this somehow 
>> covered by share groups?
>> 
>> Maybe this is planned as "further work", as indicated here:
>> 
>> "
>> It manages the topic-partition assignments for the share-group members. An 
>> initial, trivial implementation would be to give each member the list of all 
>> topic-partitions which matches its subscriptions and then use the pull-based 
>> protocol to fetch records from all partitions. A more sophisticated 
>> implementation could use topic-partition load and lag metrics to distribute 
>> partitions among the consumers as a kind of autonomous, self-balancing 
>> partition assignment, steering more consumers to busier partitions, for 
>> example. Alternatively, a push-based fetching scheme could be used. Protocol 
>> details will follow later.
>> "
>> 
>> but I’m not sure if I understand this correctly. A fully-connected graph 
>> seems like a lot of connections, and I’m not sure if this would play well 
>> with streaming.
>> 
>> This also seems to be one of the central problems - a key differentiator 
>> between share and consumer groups (the other one being persisting state of 
>> messages). And maybe the exact way we’d want to approach this would, to a 
>> certain degree, dictate the design of the queueing system?
>> 
>> Best,
>> Adam Warski
>> 
>> On 2023/05/15 11:55:14 Andrew Schofield wrote:
>>> Hi,
>>> I would like to start a discussion thread on KIP-932: Queues for Kafka. 
>>> This KIP proposes an alternative to consumer groups to enable cooperative 
>>> consumption by consumers without partition assignment. You end up with 
>>> queue semantics on top of regular Kafka topics, with per-message 
>>> acknowledgement and automatic handling of messages which repeatedly fail to 
>>> be processed.
>>> 
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-932%3A+Queues+for+Kafka
>>> 
>>> Please take a look and let me know what you think.
>>> 
>>> Thanks.
>>> Andrew
>> 
> 

-- 
Adam Warski

https://www.softwaremill.com
https://twitter.com/adamwarski
