I did that, but I am getting confusing results. For example:
I have created 4 Kafka consumer threads for doing data analytics; these
threads just wait for Kafka messages to be consumed. I have provided the key
when I produce, which means that all the messages should go to one single
partition (ref:
http://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
-- "On the consumer side, Kafka always gives a single partition's data to one
consumer thread.").

If you look at my application logs below, from my 4 consumer application
threads calling consume(): shouldn't all messages for a particular ID be
consumed by one consumer application thread?

[2016-10-08 23:37:07.498] AxThreadId 23516 -> ID: 4495 offset: 74 [ID date: 2016-09-28 20:07:32.000]
[2016-10-08 23:37:07.498] AxThreadId 2208  -> ID: 4496 offset: 80 [ID date: 2016-09-28 20:07:39.000]
[2016-10-08 23:37:07.498] AxThreadId 2208  -> ID: 4495 offset: 77 [ID date: 2016-09-28 20:07:35.000]
[2016-10-08 23:37:07.498] AxThreadId 23516 -> ID: 4495 offset: 76 [ID date: 2016-09-28 20:07:34.000]
[2016-10-08 23:37:07.498] AxThreadId 9540  -> ID: 4495 offset: 75 [ID date: 2016-09-28 20:07:33.000]
[2016-10-08 23:37:07.499] AxThreadId 23516 -> ID: 4495 offset: 78 [ID date: 2016-09-28 20:07:36.000]
[2016-10-08 23:37:07.499] AxThreadId 2208  -> ID: 4495 offset: 79 [ID date: 2016-09-28 20:07:37.000]
[2016-10-08 23:37:07.499] AxThreadId 9540  -> ID: 4495 offset: 80 [ID date: 2016-09-28 20:07:38.000]
[2016-10-08 23:37:07.500] AxThreadId 23516 -> ID: 4495 offset: 81 [ID date: 2016-09-28 20:07:39.000]

On Sun, Oct 9, 2016 at 1:31 PM, Hans Jespersen <h...@confluent.io> wrote:

> Then publish with the user ID as the key and all messages for the same key
> will be guaranteed to go to the same partition and therefore be in order
> for whichever consumer gets that partition.
>
> //h...@confluent.io
>
> -------- Original message --------
> From: Abhit Kalsotra <abhit...@gmail.com>
> Date: 10/9/16 12:39 AM (GMT-08:00)
> To: users@kafka.apache.org
> Subject: Re: Regarding Kafka
>
> What about the order in which messages get received if I don't mention the
> partition?
>
> Let's say I have user ID 4456 and I have to do some analytics at the Kafka
> consumer end; if messages are not consumed in the order I sent them, then
> my analytics will go haywire.
>
> Abhi
>
> On Sun, Oct 9, 2016 at 12:50 PM, Hans Jespersen <h...@confluent.io> wrote:
>
> > You don't even have to do that, because the default partitioner will
> > spread the data you publish to the topic over the available partitions
> > for you. Just try it out to see: publish multiple messages to the topic
> > without using keys, and without specifying a partition, and observe that
> > they are automatically distributed over the available partitions.
> >
> > //h...@confluent.io
> >
> > -------- Original message --------
> > From: Abhit Kalsotra <abhit...@gmail.com>
> > Date: 10/8/16 11:19 PM (GMT-08:00)
> > To: users@kafka.apache.org
> > Subject: Re: Regarding Kafka
> >
> > Hans
> >
> > Thanks for the response. Yes, you could say I am treating topics like
> > partitions, because my current logic for producing to a respective topic
> > goes something like this:
> >
> >     RdKafka::ErrorCode resp =
> >         m_kafkaProducer->produce(m_kafkaTopic[whichTopic],
> >                                  partition,
> >                                  RdKafka::Producer::RK_MSG_COPY,
> >                                  ptr,
> >                                  size,
> >                                  &partitionKey,
> >                                  NULL);
> >
> > where partitionKey is a unique number or userID. What I currently do is
> > compute partitionKey % 10 and, whatever the remainder is, I dump the
> > message to the respective topic.
> > But as per your suggestion, let me create close to 40-50 partitions for
> > a single topic, and when I am producing do something like this:
> >
> >     RdKafka::ErrorCode resp =
> >         m_kafkaProducer->produce(m_kafkaTopic,
> >                                  partition % 50,
> >                                  RdKafka::Producer::RK_MSG_COPY,
> >                                  ptr,
> >                                  size,
> >                                  &partitionKey,
> >                                  NULL);
> >
> > Abhi
> >
> > On Sun, Oct 9, 2016 at 10:13 AM, Hans Jespersen <h...@confluent.io> wrote:
> >
> > > Why do you have 10 topics? It seems like you are treating topics like
> > > partitions, and it's unclear why you don't just have 1 topic with 10,
> > > 20, or even 30 partitions. Ordering is only guaranteed at a partition
> > > level.
> > >
> > > In general, if you want to capacity-plan for partitions, you benchmark
> > > a single partition and then divide your peak estimated throughput by
> > > the results of the single-partition benchmark.
> > >
> > > If you expect peak throughput to increase over time, then double your
> > > partition count to allow room to grow the number of consumers without
> > > having to repartition.
> > >
> > > Sizing can be a bit more tricky if you are using keys, but it doesn't
> > > sound like you are, if today you are publishing to topics the way you
> > > describe.
> > >
> > > -hans
> > >
> > > > On Oct 8, 2016, at 9:01 PM, Abhit Kalsotra <abhit...@gmail.com> wrote:
> > > >
> > > > Guys, any views?
> > > >
> > > > Abhi
> > > >
> > > > > On Sat, Oct 8, 2016 at 4:28 PM, Abhit Kalsotra <abhit...@gmail.com> wrote:
> > > > >
> > > > > Hello
> > > > >
> > > > > I am using the librdkafka C++ library for my application.
> > > > >
> > > > > *My Kafka cluster setup*
> > > > > 2 Kafka ZooKeeper instances running on 2 different machines
> > > > > 7 Kafka brokers: 4 running on one machine and 3 running on the other
> > > > > 10 topics in total, each with a partition count of 3 and a
> > > > > replication factor of 3.
> > > > >
> > > > > Now in my case I need to be very specific about *message order*
> > > > > when I am consuming the messages. I know that if all the messages
> > > > > get produced to the same partition, they are always consumed in
> > > > > the same order.
> > > > >
> > > > > I need expert opinions on the ideal partition count I should use
> > > > > without affecting performance (I am looking for close to 100,000
> > > > > messages per second).
> > > > > The topics are numbered 0 to 9, and when I am producing messages I
> > > > > do something like uniqueUserId % 10, and then point to the
> > > > > respective topic (0, 1, 2, etc.).
> > > > >
> > > > > Abhi
> > > > >
> > > > > --
> > > > > If you can't succeed, call it version 1.0