Re: Scaling up kafka consumers
The kafka-fast connector handles this differently from the standard Kafka client (which allows at most one consumer per partition within a consumer group): it breaks offsets into consumable ranges, so one partition can be read by multiple consumers, with each consumer receiving its own distinct offset range.

See: https://github.com/gerritjvv/kafka-fast

That said, there are two usage scenarios: lots of topics (10 or more), where you'd want 1, 2 or 4 partitions each, or fewer topics, where you'd want more partitions. If yours is the latter, you'd want more than a single partition; 4-6 are better numbers (numbers are from my experience with 5 nodes).

On 24 Feb 2017 17:30, "Jakub Stransky" wrote:

> Hello everyone,
>
> I was reading/checking kafka documentation regarding point-2-point and
> publish subscribe communications patterns in kafka and I am wondering how
> to scale up consumer side in point to point scenario when consuming from
> single kafka topic.
>
> Let say I have a single topic with single partition and I have one node
> where the kafka consumer is running. If I want to scale up my service I add
> another node - which has the same configuration as the first one (topic,
> partition and consumer group id). Those two nodes start competing for
> messages from kafka topic.
>
> What I am not sure in this scenario and is actually subject of my question
> is "*Whether they do get each node unique messages or there is still
> possibility that some messages will be consumed by both nodes etc*".
> Because I can see scenarios that both nodes are started at the same time -
> they gets the same topic offset from zookeeper and started consuming
> messages from that offset. OR am I thinking in a wrong direction?
>
> Thanks
> Jakub
Re: Scaling up kafka consumers
Hi,

If you have two consumers in your consumer group but only one partition in the topic, then only one consumer will do any work. It's not the case that "those two nodes start competing for messages": one node will read from the partition, and the other will have nothing to do.

So to scale up by adding more consumers to your consumer group, you'll need more partitions in your topic.

Ian.
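To make the point about idle consumers concrete, here is a minimal sketch in Python. It is a toy round-robin model of "each partition is owned by exactly one consumer in the group", not Kafka's actual assignor code; the function name assign_partitions and the node names are made up for illustration.

```python
from typing import Dict, List


def assign_partitions(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Toy model: hand each partition to exactly one consumer, round-robin.

    This only illustrates the ownership rule; it is not how any particular
    Kafka assignment strategy is implemented.
    """
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    if not consumers:
        return assignment
    ordered = sorted(consumers)
    for i, partition in enumerate(sorted(partitions)):
        owner = ordered[i % len(ordered)]
        assignment[owner].append(partition)
    return assignment


# One partition, two nodes: the second node owns nothing and stays idle.
one_partition = assign_partitions([0], ["node-a", "node-b"])

# Four partitions, two nodes: both nodes have work to do.
four_partitions = assign_partitions([0, 1, 2, 3], ["node-a", "node-b"])

print(one_partition)
print(four_partitions)
```

With a single partition the second node ends up with an empty list, which is the "nothing to do" case described above; with four partitions the work is split between both nodes.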
Re: Scaling up kafka consumers
A single partition can be consumed by at most one consumer within a consumer group. Consumers compete to take ownership of a partition, so to gain parallelism you need to add more partitions.

There is a library that allows multiple consumers to consume from a single partition (https://github.com/gerritjvv/kafka-fast), but I've never used it.
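For readers wondering what "breaking offsets into consumable ranges" could look like, here is a rough Python sketch of the general idea only. It is not kafka-fast's API, and how that library actually coordinates ranges between consumers is not described in this thread; split_offsets, hand_out, the chunk size and the consumer names are all hypothetical.

```python
from typing import Dict, List, Tuple

OffsetRange = Tuple[int, int]  # half-open interval [start, end)


def split_offsets(start: int, end: int, chunk: int) -> List[OffsetRange]:
    """Split the offsets [start, end) of one partition into ranges of at most `chunk` offsets."""
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    return [(lo, min(lo + chunk, end)) for lo in range(start, end, chunk)]


def hand_out(ranges: List[OffsetRange], workers: List[str]) -> Dict[str, List[OffsetRange]]:
    """Give each range to exactly one worker (round-robin), so no offset is read twice."""
    work: Dict[str, List[OffsetRange]] = {w: [] for w in workers}
    for i, offset_range in enumerate(ranges):
        work[workers[i % len(workers)]].append(offset_range)
    return work


# Offsets 0..9 of a single partition, split into ranges of up to 4 offsets.
ranges = split_offsets(0, 10, 4)
work = hand_out(ranges, ["consumer-1", "consumer-2"])

print(ranges)
print(work)
```

Because every range goes to exactly one worker and the ranges do not overlap, two consumers can share one partition without either of them seeing the same message, which is the property the original question was asking about.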