Re: Maintaining message ordering using KafkaSpout/Bolt

2016-06-05 Thread Kanagha
Hi Matthias, Thanks a lot for the clarifications. I agree that using fieldsGrouping would be sufficient to maintain ordering per key. If messages are keyed and stored in Kafka, fieldsGrouping would be sufficient. Only if Kafka uses round-robin partitioning in the absence of a key, then custom code

Re: Maintaining message ordering using KafkaSpout/Bolt

2016-06-05 Thread Matthias J. Sax
> Does the parallelism_hint set when a KafkaSpout is added to a topology,
> need to match the number of partitions in a topic?

No.

On 06/05/2016 11:26 AM, Matthias J. Sax wrote:
> Hi Kanagha,
>
> For reading, KafkaSpout's internally used KafkaConsumer ensures that
> data is received in-order

Re: Maintaining message ordering using KafkaSpout/Bolt

2016-06-05 Thread Matthias J. Sax
One addition: Actually, I doubt that you need the order of a partition to be preserved; you usually want to preserve the order per key -- and a partition contains multiple keys. Thus, in Storm it is also sufficient to preserve the order per key (and not per partition). Thus, you can
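
The per-key argument above can be illustrated with a small simulation. This is only a sketch in plain Python, not Storm code: `route`, `fields_group`, and `per_key_order_preserved` are hypothetical helpers, and the CRC32 hash is an assumption standing in for whatever hashing Storm applies internally. In a real topology the corresponding call would be along the lines of `fieldsGrouping(componentId, new Fields("key"))` on the bolt declarer, with the field name `"key"` assumed here.

```python
import zlib
from collections import defaultdict


def route(key: str, num_tasks: int) -> int:
    # Deterministic hash, so the same key always maps to the same task index.
    # (Assumption: CRC32 is a stand-in, not Storm's actual hash function.)
    return zlib.crc32(key.encode("utf-8")) % num_tasks


def fields_group(tuples, num_tasks):
    """Distribute (key, seq) tuples to tasks, keeping arrival order per task."""
    tasks = defaultdict(list)
    for key, seq in tuples:
        tasks[route(key, num_tasks)].append((key, seq))
    return tasks


def per_key_order_preserved(tasks):
    """Check that each task sees strictly increasing sequence numbers per key."""
    for received in tasks.values():
        last_seen = {}
        for key, seq in received:
            if seq <= last_seen.get(key, 0):
                return False
            last_seen[key] = seq
    return True


# An interleaved stream: keys "a", "b", "c", each with increasing sequence numbers.
stream = [("a", 1), ("b", 1), ("a", 2), ("c", 1), ("b", 2), ("a", 3), ("c", 2)]
tasks = fields_group(stream, num_tasks=2)
```

Because every tuple of a given key lands on the same task, and each task appends in arrival order, the order per key survives even though the task as a whole sees a mix of keys.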

Re: Maintaining message ordering using KafkaSpout/Bolt

2016-06-05 Thread Matthias J. Sax
Hi Kanagha, For reading, KafkaSpout's internally used KafkaConsumer ensures that data is received in-order per partition. Because the spout might read multiple partitions and emits only a single (logical) output stream, data from multiple partitions interleave within this output stream (the
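
The interleaving described above can be sketched as a toy simulation in plain Python. It does not use the Kafka or Storm APIs: `interleave` and `offsets_for` are hypothetical helpers, and the random choice of the next partition is an assumption standing in for whatever fetch order the spout actually follows.

```python
import random


def interleave(partitions, seed=0):
    """Merge several per-partition offset lists into one output stream.

    The next partition is picked at random (an assumption for illustration),
    but within a partition the offsets are always emitted in order.
    """
    rng = random.Random(seed)
    cursors = {p: 0 for p in partitions}
    out = []
    while True:
        pending = [p for p in partitions if cursors[p] < len(partitions[p])]
        if not pending:
            break
        p = rng.choice(pending)
        out.append((p, partitions[p][cursors[p]]))
        cursors[p] += 1
    return out


def offsets_for(merged, partition):
    """Extract the offsets of one partition from the merged stream."""
    return [offset for part, offset in merged if part == partition]


# Two partitions with offsets 0..3 and 0..2.
partitions = {0: [0, 1, 2, 3], 1: [0, 1, 2]}
merged = interleave(partitions)
```

Filtering the merged stream by partition gives back each partition's original offset sequence, while the position of one partition's tuples relative to the other's depends on the interleaving.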

Maintaining message ordering using KafkaSpout/Bolt

2016-06-04 Thread Kanagha
Hi, I'm looking at the documentation for using KafkaSpout/KafkaBolt: https://github.com/apache/storm/tree/master/external/storm-kafka How is ordering guaranteed while reading messages from Kafka using KafkaSpout? Does the parallelism_hint set when a KafkaSpout is added to a topology need to