Mahesh,

- One direction could be: create a Parquet schema, then convert and save the records to HDFS.
- This example might help:
https://github.com/massie/spark-parquet-example/blob/master/src/main/scala/com/zenfractal/SparkParquetExample.scala
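Roughly, the foreachRDD body could look like the untested sketch below. Event, parseEvent, and the output path are placeholders you'd replace with your own types and locations; it assumes Spark SQL's createSchemaRDD implicit and SchemaRDD.saveAsParquetFile from 1.0:

import org.apache.spark.rdd.RDD

// Hypothetical event type -- swap in whatever fields your Kafka payload carries.
case class Event(id: String, timestamp: Long)

// Placeholder: decode your own wire format here.
def parseEvent(bytes: Array[Byte]): Event = ???

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD // implicit RDD[Product] -> SchemaRDD

ds.foreachRDD((rdd: RDD[Array[Byte]]) => {
  // Deserialize the raw bytes into case-class records, then write the
  // batch out as Parquet, one directory per batch so nothing is overwritten.
  val events: RDD[Event] = rdd.map(parseEvent)
  events.saveAsParquetFile("hdfs:///events/batch-" + System.currentTimeMillis)
})

Writing each micro-batch to its own directory keeps the writes append-only; if lots of small files become a problem you can compact them in a later job.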
Cheers
<k/>

On Tue, Jun 17, 2014 at 12:52 PM, maheshtwc <
mahesh.padmanab...@twc-contractor.com> wrote:

> Hello,
>
> Is there an easy way to convert RDDs within a DStream into Parquet records?
> Here is some incomplete pseudo code:
>
> // Create streaming context
> val ssc = new StreamingContext(...)
>
> // Obtain a DStream of events
> val ds = KafkaUtils.createStream(...)
>
> // Get the Spark context to get to the SQL context
> val sc = ds.context.sparkContext
>
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>
> // For each RDD
> ds.foreachRDD((rdd: RDD[Array[Byte]]) => {
>   // What do I do next?
> })
>
> Thanks,
> Mahesh
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-streaming-RDDs-to-Parquet-records-tp7762.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.