Re: How to save ReceiverInputDStream to Hadoop using saveAsNewAPIHadoopFile

Sean Owen Thu, 09 Oct 2014 22:17:51 -0700

Your stream does not contain pairs, since you ".map(_._2)" (BTW on an RDD
that can just be ".values"). "Hadoop files" are written through an
OutputFormat, and those store key-value pairs. That's why the method only
appears for RDD[(K,V)] (and likewise for DStream[(K,V)]).
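
To illustrate, here is a minimal sketch of keeping the stream as pairs before
saving. It swaps in NullWritable/Text with TextOutputFormat rather than the
Void/Group/ExampleOutputFormat from the question, since the thread does not
show how the Kafka messages become Group values. The object SavePairStream,
the save helper, and its parameters are hypothetical names for illustration.
On a DStream the pair method is saveAsNewAPIHadoopFiles (plural), which takes
a prefix and a suffix instead of a single path.

```scala
import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
// On Spark 1.x this import brings the pair-DStream methods into scope.
import org.apache.spark.streaming.StreamingContext._
import org.apache.spark.streaming.dstream.DStream

object SavePairStream {
  // `kafkaStream` stands in for the (key, message) stream that
  // KafkaUtils.createStream returns; `prefix` is the output path prefix.
  def save(kafkaStream: DStream[(String, String)], prefix: String): Unit = {
    // Map to a new pair instead of dropping the key with .map(_._2),
    // so the result is still a DStream[(K, V)].
    val pairs: DStream[(NullWritable, Text)] =
      kafkaStream.map { case (_, message) => (NullWritable.get(), new Text(message)) }

    // Available only because `pairs` is a stream of key-value tuples.
    pairs.saveAsNewAPIHadoopFiles(
      prefix, "txt",
      classOf[NullWritable], classOf[Text],
      classOf[TextOutputFormat[NullWritable, Text]])
  }
}
```

For an output format that ignores the key, the same shape should apply with a
null or placeholder key in the first slot of each tuple.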


On Fri, Oct 10, 2014 at 3:50 AM, Buntu Dev <buntu...@gmail.com> wrote:
> Thanks Sean, but I'm importing org.apache.spark.streaming.StreamingContext._
>
> Here are the spark imports:
>
> import org.apache.spark.streaming._
> import org.apache.spark.streaming.StreamingContext._
> import org.apache.spark.streaming.kafka._
> import org.apache.spark.SparkConf
>
> ....
>
>     val stream = KafkaUtils.createStream(ssc, zkQuorum, group, topicpMap).map(_._2)
>     stream.saveAsNewAPIHadoopFile(destination, classOf[Void], classOf[Group],
>       classOf[ExampleOutputFormat], conf)
>
> ....
>
> Anything else I might be missing?
