Re: Processing time series data in order

2016-12-29 Thread Ewen Cheslack-Postava
The best you can do to ensure ordering today is to set:
    acks = all
    retries = Integer.MAX_VALUE
    max.block.ms = Long.MAX_VALUE
    max.in.flight.requests.per.connection = 1
This ensures there's only one outstanding produce request (batch of messages) at a time, and it will be retried indefinitely on retria
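For reference, the four settings listed in this message can be written down as a plain dict. This is only a sketch: the key names are the producer config names quoted above, `INT_MAX` and `LONG_MAX` stand in for the Java constants, and `ordering_violations` is a hypothetical helper. How any particular client library accepts these settings is not shown.

```python
# Sketch only: the settings from the message as a plain dict, plus a small
# hypothetical helper that reports which of them a given config deviates from.
INT_MAX = 2**31 - 1   # value of Java's Integer.MAX_VALUE
LONG_MAX = 2**63 - 1  # value of Java's Long.MAX_VALUE

ordered_producer_config = {
    "acks": "all",
    "retries": INT_MAX,
    "max.block.ms": LONG_MAX,
    "max.in.flight.requests.per.connection": 1,
}


def ordering_violations(config):
    """Return the setting names whose values differ from the ones above."""
    return [
        name
        for name, expected in ordered_producer_config.items()
        if config.get(name) != expected
    ]


# Example: a config that allows several in-flight requests is flagged.
relaxed = dict(ordered_producer_config)
relaxed["max.in.flight.requests.per.connection"] = 5
print(ordering_violations(relaxed))
```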

Re: Processing time series data in order

2016-12-28 Thread Ali Akhtar
This will only ensure the order of delivery though, not the actual order of the events, right? I.e., if the producer sends A, then B, but B arrives before A due to network lag or any other reason, then B will be returned before A even if they both went to the same partition. Am I correct about t

Re: Processing time series data in order

2016-12-27 Thread Tauzell, Dave
If you specify a key with each message then all messages with the same key get sent to the same partition.
> On Dec 26, 2016, at 23:32, Ali Akhtar wrote:
>
> How would I route the messages to a specific partition?
>
>> On 27 Dec 2016 10:25 a.m., "Asaf Mesika" wrote:
>>
>> There is a much easier
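To illustrate the key-based routing described in this reply, here is a minimal sketch of a deterministic key-to-partition mapping. It is not Kafka's partitioner: `partition_for` is a hypothetical helper, CRC32 is a stand-in hash chosen only because it is in the standard library, and the keys and partition count are made up. The point is just that equal keys always land in the same partition, so their relative send order is kept within that partition.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index; equal keys always get the same index.

    CRC32 is a stand-in hash for illustration, not the hash Kafka uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical events: (key, sequence number), in the order they are sent.
events = [("sensor-a", 1), ("sensor-b", 1), ("sensor-a", 2), ("sensor-b", 2)]

# Append each event to the partition chosen by its key.
partitions = {}
for key, seq in events:
    partitions.setdefault(partition_for(key, 4), []).append((key, seq))

for index, log in sorted(partitions.items()):
    print(index, log)
```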

Re: Processing time series data in order

2016-12-26 Thread Ali Akhtar
How would I route the messages to a specific partition?
On 27 Dec 2016 10:25 a.m., "Asaf Mesika" wrote:
> There is a much easier approach: you can route all messages of a given Id
> to a specific partition. Since each partition has a single writer you get
> the ordering you wish for. Of course

Re: Processing time series data in order

2016-12-26 Thread Asaf Mesika
There is a much easier approach: you can route all messages of a given Id to a specific partition. Since each partition has a single writer you get the ordering you wish for. Of course this won't work if your updates occur on different hosts. Also maybe Kafka streams can help shard the based on it

Re: Processing time series data in order

2016-12-21 Thread Jesse Hodges
Depending on the expected max out-of-order window, why not order them in memory? Then you don't need to reread from Cassandra; in case of a problem you can reread data from Kafka.
-Jesse
> On Dec 21, 2016, at 7:24 PM, Ali Akhtar wrote:
>
> - I'm receiving a batch of messages to a Kafka topi
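One way to picture the in-memory ordering suggested in this reply is a small reorder buffer built on a min-heap. This is a sketch under assumptions, not something from the thread: `ReorderBuffer`, the `window` parameter, and the sample timestamps are all hypothetical, and an event that arrives later than the window allows would still come out of order. Memory use grows with the number of events inside the window.

```python
import heapq


class ReorderBuffer:
    """Hold events until they are older than the newest timestamp seen minus `window`."""

    def __init__(self, window):
        self.window = window
        self._heap = []        # min-heap of (timestamp, arrival index, payload)
        self._max_seen = None  # newest timestamp observed so far
        self._count = 0        # arrival index; breaks ties between equal timestamps

    def push(self, timestamp, payload):
        """Add one event and return the events that are now safe to emit, oldest first."""
        if self._max_seen is None or timestamp > self._max_seen:
            self._max_seen = timestamp
        heapq.heappush(self._heap, (timestamp, self._count, payload))
        self._count += 1
        ready = []
        while self._heap and self._heap[0][0] <= self._max_seen - self.window:
            ts, _, item = heapq.heappop(self._heap)
            ready.append((ts, item))
        return ready

    def flush(self):
        """Emit everything still buffered, oldest first (e.g. at end of batch)."""
        ready = []
        while self._heap:
            ts, _, item = heapq.heappop(self._heap)
            ready.append((ts, item))
        return ready


# Hypothetical arrivals: timestamps are slightly out of order.
arrivals = [(1, "a"), (3, "c"), (2, "b"), (6, "d"), (5, "e")]
buffer = ReorderBuffer(window=2)
ordered = []
for ts, payload in arrivals:
    ordered.extend(buffer.push(ts, payload))
ordered.extend(buffer.flush())
print(ordered)
```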

Re: Processing time series data in order

2016-12-21 Thread Ali Akhtar
The batch size can be large, so in-memory ordering isn't an option, unfortunately.
On Thu, Dec 22, 2016 at 7:09 AM, Jesse Hodges wrote:
> Depending on the expected max out of order window, why not order them in
> memory? Then you don't need to reread from Cassandra, in case of a problem
> you ca

Processing time series data in order

2016-12-21 Thread Ali Akhtar
- I'm receiving a batch of messages to a Kafka topic. Each message has a timestamp, however the messages can arrive / get processed out of order. I.e. event 1's timestamp could've been a few seconds before event 2, and event 2 could still get processed before event 1.
- I know the number of messag