I have installed the SparkR package from the Spark distribution into the R
library. I can call the following command and it seems to work properly:
library(SparkR)
However, when I try to get the Spark context using the following code,
sc <- sparkR.init(master = "local")
It fails after some time with the
Hi, I have a Hive insert into query which creates new Hive partitions. I have
two Hive partitions named server and date. Now I execute the insert into query
using the following code and try to save it:
DataFrame dframe = hiveContext.sql("insert into summary1
partition(server='a1',date='2015-05-22')
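For reference, a static-partition insert like this generally needs a SELECT (or VALUES) clause supplying the non-partition columns. A minimal sketch of the complete form, assuming a hypothetical staging table and column names:

```sql
-- Hypothetical complete form; summary1's real columns and the source table are assumed
INSERT INTO TABLE summary1 PARTITION (server = 'a1', date = '2015-05-22')
SELECT col1, col2 FROM staging_table;
```

The partition columns are fixed by the PARTITION clause here, so the SELECT list should contain only the remaining (non-partition) columns of summary1.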
Hi, I have a couple of Spark jobs which process thousands of files every
day. File sizes may vary from MBs to GBs. After the job finishes I usually save
using the following code:
finalJavaRDD.saveAsParquetFile("/path/in/hdfs"); // OR
dataFrame.write.format("orc").save("/path/in/hdfs") // storing as an ORC file
Hi, I have to fire a few insert into queries which use Hive partitions. I have
two Hive partitions named server and date. Now I execute the insert into
queries using hiveContext as shown below; the query works fine:
hiveContext.sql("insert into summary1
partition(server='a1',date='2015-05-22') select * from
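When many (server, date) combinations are involved, firing one query per partition gets tedious; Hive's dynamic partitioning can route rows to partitions automatically. A sketch, assuming hypothetical source table and column names:

```sql
-- Hypothetical dynamic-partition variant; table and column names are assumed
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
INSERT INTO TABLE summary1 PARTITION (server, date)
SELECT col1, col2, server, date FROM staging_table;
```

With dynamic partitioning, the partition columns come last in the SELECT list and their values determine which partition each row lands in.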