Re: Spark kafka integration issues

2016-09-14 Thread Cody Koeninger
Yeah, an updated version of that blog post is available at https://github.com/koeninger/kafka-exactly-once On Wed, Sep 14, 2016 at 11:35 AM, Mukesh Jha wrote: > Thanks for the reply Cody. > > I found the below article on the same, very helpful. Thanks for the details, >

Re: Spark kafka integration issues

2016-09-14 Thread Mukesh Jha
Thanks for the reply Cody. I found the below article on the same, very helpful. Thanks for the details, much appreciated. http://blog.cloudera.com/blog/2015/03/exactly-once-spark-streaming-from-apache-kafka/ On Tue, Sep 13, 2016 at 8:14 PM, Cody Koeninger wrote: > 1. see

Re: Spark kafka integration issues

2016-09-13 Thread Cody Koeninger
1. See http://spark.apache.org/docs/latest/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers and look for HasOffsetRanges. If you really want the info per-message rather than per-partition, createRDD has an overload that takes a messageHandler from MessageAndMetadata to
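For readers following the thread, here is a rough sketch of the two options mentioned in this reply, assuming the spark-streaming-kafka-0-8 artifact and a batch (non-streaming) job. The broker address, the topic name "events", the offset range 0 to 100, and the object name are placeholders invented for illustration, not values from the thread. The first read returns (key, value) pairs and exposes per-partition offsets by casting the RDD to HasOffsetRanges; the second uses the createRDD overload with a messageHandler to keep topic, partition, and offset for each message.

```scala
import kafka.common.TopicAndPartition
import kafka.message.MessageAndMetadata
import kafka.serializer.StringDecoder
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.kafka.{Broker, HasOffsetRanges, KafkaUtils, OffsetRange}

// Hypothetical job name; submit with spark-submit so the master is supplied externally.
object KafkaBatchReadSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("kafka-batch-read-sketch"))

    // Placeholder broker list; replace with your own brokers.
    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")

    // Placeholder range: topic "events", partition 0, offsets [0, 100).
    val offsetRanges = Array(OffsetRange("events", 0, 0L, 100L))

    // Option 1: plain (key, value) pairs; offsets are available per partition.
    val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
      sc, kafkaParams, offsetRanges)
    rdd.asInstanceOf[HasOffsetRanges].offsetRanges.foreach { r =>
      println(s"${r.topic} ${r.partition} ${r.fromOffset} ${r.untilOffset}")
    }

    // Option 2: per-message metadata via a messageHandler.
    // An empty leaders map means leaders are looked up on the driver.
    val withMeta = KafkaUtils.createRDD[
      String, String, StringDecoder, StringDecoder, (String, Int, Long, String)](
      sc,
      kafkaParams,
      offsetRanges,
      Map.empty[TopicAndPartition, Broker],
      (mmd: MessageAndMetadata[String, String]) =>
        (mmd.topic, mmd.partition, mmd.offset, mmd.message()))
    withMeta.take(5).foreach(println)

    sc.stop()
  }
}
```

The messageHandler variant costs a little more per record, so the per-partition offsetRanges are usually enough unless each message really needs its own offset attached.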

Spark kafka integration issues

2016-09-13 Thread Mukesh Jha
Hello fellow sparkers, I'm using Spark to consume messages from Kafka in a non-streaming fashion. I'm using spark-streaming-kafka-0-8_2.10 & Spark v2.0 to do this. I have a few questions; please get back to me if you have any pointers. 1) Is there any way to get the