Hi Raja,

Just to be clear, are you suggesting that p1.10 is being read before p1.1?
If that's the case, can you use the console consumer that comes packaged with Kafka and verify the ordering based on timestamps?

Thanks,
Dev

On Tue, Jun 7, 2016 at 5:31 PM, Raja.Aravapalli <[email protected]> wrote:

>
> Thanks a lot Devendra Tagare for the response.
>
> What you said is very clear and understandable. But, wondering, I am NOT
> getting that partition level order!! My operator is processing the records
> in jumbled order rather than in sequence!
> And, I am saying this because I am generating timestamps upon tuple
> receipt and emitting that timestamp to my destination, which is clearly
> showing the records are arriving at the operator in a shuffled order.
>
> I get records at millisecond-level differences!! Will that be a problem?
>
>
> Regards,
> Raja.
>
> From: Devendra Tagare <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Tuesday, June 7, 2016 at 7:12 PM
> To: "[email protected]" <[email protected]>
> Subject: Re: kafka input is processing records in a jumbled order
>
> Hi Raja,
>
> When you apply the ONE_TO_MANY partitioning scheme, one instance of the
> operator consumes from many partitions of a Kafka topic.
>
> When you look at the consumed data, all the events coming from a given
> partition would be ordered, but there are no ordering guarantees across
> partitions since Kafka does not guarantee that.
>
> E.g.: if 3 partitions of a topic, p1, p2, p3, having 10 messages each are
> connected to one physical partition of the KafkaInputOperator, then the
> ordering guarantee of p1.1 to p1.10 is honored, i.e. message 10 of p1 will
> be consumed only after messages 1 through 9 are consumed, but the operator
> could consume messages in an order like p1.1,p2.1,p1.2,p1.3,p3.1,p2.2.....
> which still follows the guarantees per partition.
>
> Thanks,
> Dev
>
> On Tue, Jun 7, 2016 at 5:00 PM, Raja.Aravapalli <
> [email protected]> wrote:
>
>>
>> Thanks for the response Thomas.
>>
>> My quick doubt is..
>>
>> I have around 30 partitions of a Kafka topic, and all of them have messages
>> ordered at partition level.
>>
>> So, when I consume those messages using a single consumer [with ONE_TO_MANY
>> strategy set], still the ordering doesn’t work?
>>
>>
>> My messages in the topic are guaranteed to be ordered at partition level.
>>
>> Thanks a lot in advance for your response.
>>
>>
>> Regards,
>> Raja.
>>
>> From: Thomas Weise <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Tuesday, June 7, 2016 at 5:52 PM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: kafka input is processing records in a jumbled order
>>
>> Raja,
>>
>> Are you expecting ordering across multiple Kafka partitions?
>>
>> All messages from a given Kafka partition are received by the same
>> consumer and thus will be ordered. However, when messages come from
>> multiple partitions there is no such guarantee.
>>
>> Thomas
>>
>>
>> On Tue, Jun 7, 2016 at 3:34 PM, Raja.Aravapalli <
>> [email protected]> wrote:
>>
>>>
>>> Hi
>>>
>>> I have built a DAG that reads from Kafka and, in the next operators,
>>> does a lookup to an HBase table and updates the HBase table based on some
>>> business logic.
>>>
>>> Sometimes my operator, which does the HBase lookup and update in the same
>>> operator (custom written), is processing the records it receives from Kafka
>>> in a jumbled order, which is causing many records to be ignored from
>>> processing!!
>>>
>>> I am not using any parallel partitions/instances, and with
>>> KafkaInputOperator I am using only partition strategy ONE_TO_MANY.
>>>
>>> I am very new to Apex. I expected Apex would guarantee the ordering.
>>>
>>> Can someone pls share your knowledge on the issue…?
>>>
>>>
>>> Thanks a lot in advance…
>>>
>>>
>>> Regards,
>>> Raja.
>>>
>>
>>
>
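
To make the per-partition guarantee discussed in this thread concrete, here is a minimal Python sketch (not Apex or Kafka code) that checks a consumed sequence of (partition, offset) pairs. The function name `per_partition_in_order` and the tuple layout are assumptions made only for illustration; the sample sequence is the interleaving p1.1,p2.1,p1.2,p1.3,p3.1,p2.2 from the example in the thread.

```python
def per_partition_in_order(consumed):
    """Return True if offsets increase strictly within each partition.

    `consumed` is a list of (partition, offset) pairs in the order the
    operator saw them. This layout is hypothetical, for illustration only.
    """
    last_seen = {}  # partition -> highest offset seen so far
    for partition, offset in consumed:
        if partition in last_seen and offset <= last_seen[partition]:
            return False  # an earlier message arrived after a later one
        last_seen[partition] = offset
    return True


# Interleaving from the example in the thread.
consumed = [("p1", 1), ("p2", 1), ("p1", 2), ("p1", 3), ("p3", 1), ("p2", 2)]

# Offsets alone, ignoring which partition they came from.
offsets_only = [offset for _, offset in consumed]

print(per_partition_in_order(consumed))      # order within each partition
print(offsets_only == sorted(offsets_only))  # order across all partitions
```

If a check like the first one fails on real data (for example p1.10 showing up before p1.1), the per-partition guarantee itself is being violated; if only the second one fails, the stream is merely interleaved across partitions, which is the expected behaviour with ONE_TO_MANY.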
