Task Serialization Error on DataFrame.foreachPartition

2015-06-20 Thread Nishant Patel
Hi, I am loading data from a Hive table into HBase after doing some manipulation, and I am getting a 'Task not serializable' error. My code is below. public class HiveToHbaseLoader implements Serializable { public static void main(String[] args) throws Exception { String
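
[Reply] The usual cause is that the closure passed to foreachPartition captures something non-serializable, typically an HBase connection, table handle, or Configuration held as a field of the enclosing class. The standard fix is to create those objects inside foreachPartition so they live only on the executor and are never shipped with the task. A minimal sketch of that pattern in Scala, assuming Spark 1.x with an HBase 1.x client; the Hive query, table name, and column family here are hypothetical placeholders, not taken from your code:

    import org.apache.spark.SparkContext
    import org.apache.spark.sql.hive.HiveContext
    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
    import org.apache.hadoop.hbase.util.Bytes

    val sc = new SparkContext()
    val hiveContext = new HiveContext(sc)

    // Hypothetical source table and columns; substitute your own query.
    val df = hiveContext.sql("SELECT rowkey, payload FROM hl7_source")

    df.foreachPartition { rows =>
      // Create the connection on the executor, inside the partition,
      // so nothing non-serializable is captured by the task closure.
      val conf = HBaseConfiguration.create()
      val connection = ConnectionFactory.createConnection(conf)
      val table = connection.getTable(TableName.valueOf("hl7_data")) // hypothetical table
      try {
        rows.foreach { row =>
          val put = new Put(Bytes.toBytes(row.getString(0)))
          // "cf" and "payload" are hypothetical column family / qualifier names.
          put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("payload"),
            Bytes.toBytes(row.getString(1)))
          table.put(put)
        }
      } finally {
        table.close()
        connection.close()
      }
    }

If your loader class holds the connection or Configuration as an instance field, moving it into the closure as above (or marking the field @transient and rebuilding it per partition) should make the error go away.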

Spark SQL - Not able to create schema RDD for nested Directory for specific directory names

2015-02-05 Thread Nishant Patel
Hi, I am seeing strange behavior. When I create a schema RDD for a nested directory, it sometimes works and sometimes does not. My question is: are nested directories supported or not? My code is as below. val fileLocation = "hdfs://localhost:9000/apps/hive/warehouse/hl7" val parquetRDD =
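
[Reply] In the Spark 1.2-era Parquet reader, pointing parquetFile at a directory of subdirectories is unreliable; a workaround often suggested on this list is to address the files one level down with a glob so the reader sees Parquet files rather than directories. A minimal sketch under that assumption, using the path from your post (master, app name, and temp-table name are placeholders):

    import org.apache.spark.SparkContext
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext("local[*]", "NestedParquet") // hypothetical master/app name
    val sqlContext = new SQLContext(sc)

    // The trailing glob selects the contents of the subdirectories under hl7
    // instead of the top-level directory itself.
    val fileLocation = "hdfs://localhost:9000/apps/hive/warehouse/hl7/*"
    val parquetRDD = sqlContext.parquetFile(fileLocation)
    parquetRDD.registerTempTable("hl7") // hypothetical table name
    parquetRDD.printSchema()

Whether it works without the glob can also depend on the directory names: Hive-style partition directories (name=value) are treated specially, which may explain why some directory layouts load and others do not.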

java.io.NotSerializableException: org.apache.spark.streaming.StreamingContext

2015-01-23 Thread Nishant Patel
Below is the code I have written; I am getting a NotSerializableException. How can I handle this scenario? kafkaStream.foreachRDD(rdd => { println() rdd.foreachPartition(partitionOfRecords => { partitionOfRecords.foreach(record => { //Write for CSV.
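
[Reply] When the exception names org.apache.spark.streaming.StreamingContext, the foreachPartition closure is (usually indirectly, via the enclosing class or object) dragging the ssc reference into the task. Keep everything the closure touches local to it, and open the CSV writer on the executor inside foreachPartition. A minimal sketch of that structure, assuming Spark 1.x with the receiver-based Kafka stream; the ZooKeeper address, group id, topic map, and output path are hypothetical:

    import java.io.PrintWriter
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val conf = new SparkConf().setAppName("KafkaToCsv") // hypothetical app name
    val ssc = new StreamingContext(conf, Seconds(10))

    // Hypothetical Kafka settings: zkQuorum, groupId, topic -> thread count.
    val kafkaStream = KafkaUtils.createStream(
      ssc, "localhost:2181", "csv-writer-group", Map("events" -> 1))

    kafkaStream.foreachRDD { rdd =>
      println()
      rdd.foreachPartition { partitionOfRecords =>
        // Open the writer on the executor; do not reference ssc or any
        // field of an enclosing class from inside this closure.
        val writer = new PrintWriter(
          s"/tmp/out-${java.util.UUID.randomUUID}.csv") // hypothetical output path
        try {
          partitionOfRecords.foreach { record =>
            writer.println(s"${record._1},${record._2}") // write key,value as CSV
          }
        } finally {
          writer.close()
        }
      }
    }

    ssc.start()
    ssc.awaitTermination()

If this logic lives inside a class that also holds the StreamingContext, either mark that field @transient or move the foreachRDD body into a standalone function or object so the outer instance is not serialized with the closure.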