Re: Spark kafka integration issues
Yeah, an updated version of that blog post is available at
https://github.com/koeninger/kafka-exactly-once

On Wed, Sep 14, 2016 at 11:35 AM, Mukesh Jha wrote:
> Thanks for the reply Cody.
>
> I found the below article on the same, very helpful. Thanks for the
> details, much appreciated.
>
> http://blog.cloudera.com/blog/2015/03/exactly-once-spark-streaming-from-apache-kafka/
>
> On Tue, Sep 13, 2016 at 8:14 PM, Cody Koeninger wrote:
>>
>> 1. see
>> http://spark.apache.org/docs/latest/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers
>> and look for HasOffsetRanges. If you really want the info per-message
>> rather than per-partition, createRDD has an overload that takes a
>> messageHandler from MessageAndMetadata to whatever you need.
>>
>> 2. createRDD takes type parameters for the key and value decoders, so
>> specify them there.
>>
>> 3. You can use spark-streaming-kafka-0-8 against 0.9 or 0.10 brokers.
>> There is a spark-streaming-kafka-0-10 package with additional features
>> that only works on brokers 0.10 or higher. A pull request documenting
>> it has been merged, but not deployed.
>>
>> On Tue, Sep 13, 2016 at 6:46 PM, Mukesh Jha wrote:
>> > Hello fellow sparkers,
>> >
>> > I'm using spark to consume messages from kafka in a non-streaming
>> > fashion. I'm using spark-streaming-kafka-0-8_2.10 & spark v2.0 to do
>> > the same.
>> >
>> > I have a few queries on this; please get back if you have any clues.
>> >
>> > 1) Is there any way to get the topic, partition & offset information
>> > for each item from the KafkaRDD? I'm using
>> > KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder]
>> > to create my kafka RDD.
>> > 2) How do I pass my custom Decoder instead of using the String or
>> > Byte decoder? Are there any examples for the same?
>> > 3) Is there a newer version to consume from kafka-0.10 & kafka-0.9
>> > clusters?
>> >
>> > --
>> > Thanks & Regards,
>> >
>> > Mukesh Jha
>
> --
> Thanks & Regards,
>
> Mukesh Jha
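For question 1, the per-partition route works because with createRDD you hand the offset ranges in yourself, so topic/partition/offset info is known up front. A minimal sketch against the spark-streaming-kafka-0-8 API; the broker address, topic name, and offsets below are placeholders, not values from this thread:

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.kafka.{KafkaUtils, OffsetRange}

val sc = new SparkContext(new SparkConf().setAppName("kafka-batch-read"))

// Placeholder broker list; point this at your own cluster.
val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

// You pass the offset ranges in explicitly, so topic / partition /
// offset are already known per partition of the resulting RDD.
val offsetRanges = Array(
  OffsetRange("my-topic", 0, fromOffset = 0L, untilOffset = 100L))

val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
  sc, kafkaParams, offsetRanges)
```

Each partition of `rdd` corresponds to one OffsetRange, which is what the HasOffsetRanges pattern in the linked repo exploits.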
Re: Spark kafka integration issues
Thanks for the reply Cody.

I found the below article on the same, very helpful. Thanks for the
details, much appreciated.

http://blog.cloudera.com/blog/2015/03/exactly-once-spark-streaming-from-apache-kafka/

On Tue, Sep 13, 2016 at 8:14 PM, Cody Koeninger wrote:
> 1. see
> http://spark.apache.org/docs/latest/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers
> and look for HasOffsetRanges. If you really want the info per-message
> rather than per-partition, createRDD has an overload that takes a
> messageHandler from MessageAndMetadata to whatever you need.
>
> 2. createRDD takes type parameters for the key and value decoders, so
> specify them there.
>
> 3. You can use spark-streaming-kafka-0-8 against 0.9 or 0.10 brokers.
> There is a spark-streaming-kafka-0-10 package with additional features
> that only works on brokers 0.10 or higher. A pull request documenting
> it has been merged, but not deployed.
>
> On Tue, Sep 13, 2016 at 6:46 PM, Mukesh Jha wrote:
> > Hello fellow sparkers,
> >
> > I'm using spark to consume messages from kafka in a non-streaming
> > fashion. I'm using spark-streaming-kafka-0-8_2.10 & spark v2.0 to do
> > the same.
> >
> > I have a few queries on this; please get back if you have any clues.
> >
> > 1) Is there any way to get the topic, partition & offset information
> > for each item from the KafkaRDD? I'm using
> > KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder] to
> > create my kafka RDD.
> > 2) How do I pass my custom Decoder instead of using the String or
> > Byte decoder? Are there any examples for the same?
> > 3) Is there a newer version to consume from kafka-0.10 & kafka-0.9
> > clusters?
> >
> > --
> > Thanks & Regards,
> >
> > Mukesh Jha

--
Thanks & Regards,

*Mukesh Jha*
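For point 2 above (passing a custom Decoder via the type parameters): in the 0.8 API a decoder implements kafka.serializer.Decoder[T] and is constructed reflectively, so it needs a constructor taking VerifiableProperties. A sketch, where `MyEvent` and the "id|payload" wire format are hypothetical stand-ins:

```scala
import kafka.serializer.Decoder
import kafka.utils.VerifiableProperties

// Hypothetical payload type, for illustration only.
case class MyEvent(id: String, payload: String)

// Decoders are instantiated reflectively, so the VerifiableProperties
// constructor parameter must be present even if unused.
class MyEventDecoder(props: VerifiableProperties = null)
    extends Decoder[MyEvent] {
  override def fromBytes(bytes: Array[Byte]): MyEvent = {
    val s = new String(bytes, "UTF-8")
    // Hypothetical wire format: "id|payload"
    val Array(id, payload) = s.split('|')
    MyEvent(id, payload)
  }
}

// Then plug it in via the type parameters, e.g.:
// KafkaUtils.createRDD[String, MyEvent, StringDecoder, MyEventDecoder](
//   sc, kafkaParams, offsetRanges)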
Re: Spark kafka integration issues
1. see
http://spark.apache.org/docs/latest/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers
and look for HasOffsetRanges. If you really want the info per-message
rather than per-partition, createRDD has an overload that takes a
messageHandler from MessageAndMetadata to whatever you need.

2. createRDD takes type parameters for the key and value decoders, so
specify them there.

3. You can use spark-streaming-kafka-0-8 against 0.9 or 0.10 brokers.
There is a spark-streaming-kafka-0-10 package with additional features
that only works on brokers 0.10 or higher. A pull request documenting it
has been merged, but not deployed.

On Tue, Sep 13, 2016 at 6:46 PM, Mukesh Jha wrote:
> Hello fellow sparkers,
>
> I'm using spark to consume messages from kafka in a non-streaming
> fashion. I'm using spark-streaming-kafka-0-8_2.10 & spark v2.0 to do
> the same.
>
> I have a few queries on this; please get back if you have any clues.
>
> 1) Is there any way to get the topic, partition & offset information
> for each item from the KafkaRDD? I'm using
> KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder] to
> create my kafka RDD.
> 2) How do I pass my custom Decoder instead of using the String or Byte
> decoder? Are there any examples for the same?
> 3) Is there a newer version to consume from kafka-0.10 & kafka-0.9
> clusters?
>
> --
> Thanks & Regards,
>
> Mukesh Jha
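The messageHandler overload in point 1 can be sketched as follows; it gives per-message metadata by mapping each MessageAndMetadata to whatever result type you choose. Broker address, topic, and offsets here are placeholders, and `sc` is assumed to be an existing SparkContext:

```scala
import kafka.common.TopicAndPartition
import kafka.message.MessageAndMetadata
import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.{Broker, KafkaUtils, OffsetRange}

val kafkaParams = Map("metadata.broker.list" -> "broker1:9092") // placeholder
val offsetRanges = Array(OffsetRange("my-topic", 0, 0L, 100L))  // placeholder

// The fifth type parameter is the messageHandler's result type; an
// empty leaders map lets Spark look the partition leaders up itself.
val rdd = KafkaUtils.createRDD[
    String, String, StringDecoder, StringDecoder, (String, Int, Long, String)](
  sc, kafkaParams, offsetRanges, Map.empty[TopicAndPartition, Broker],
  (mmd: MessageAndMetadata[String, String]) =>
    (mmd.topic, mmd.partition, mmd.offset, mmd.message))
```

The resulting RDD then carries (topic, partition, offset, value) for every record rather than only per-partition ranges.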
Spark kafka integration issues
Hello fellow sparkers,

I'm using spark to consume messages from kafka in a non-streaming
fashion. I'm using spark-streaming-kafka-0-8_2.10 & spark v2.0 to do the
same.

I have a few queries on this; please get back if you have any clues.

1) Is there any way to get the topic, partition & offset information for
each item from the KafkaRDD? I'm using *KafkaUtils.createRDD[String,
String, StringDecoder, StringDecoder]* to create my kafka RDD.
2) How do I pass my custom Decoder instead of using the String or Byte
decoder? Are there any examples for the same?
3) Is there a newer version to consume from kafka-0.10 & kafka-0.9
clusters?

--
Thanks & Regards,

*Mukesh Jha*