Amit,

sqlContext <- sparkRSQL.init(sc)

peopleDF <- read.df(sqlContext, "hdfs://master:9000/sears/example.csv")

Have you restarted the R session in RStudio between these two lines?
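For reference, a minimal sketch of re-creating the contexts and reading the CSV from a fresh R session, assuming the spark-csv package resolves on your cluster (the `source` and `header` arguments here are illustrative, not taken from your snippet):

sc <- sparkR.init(master = "spark://master:7077",
                  sparkPackages = "com.databricks:spark-csv_2.10:1.2.0")
sqlContext <- sparkRSQL.init(sc)

# Name the data source explicitly; otherwise read.df falls back to
# spark.sql.sources.default (parquet), which is the conf key in your error.
peopleDF <- read.df(sqlContext, "hdfs://master:9000/sears/example.csv",
                    source = "com.databricks.spark.csv", header = "true")
head(peopleDF)

The key point is that sqlContext holds a reference to a Java object in the old backend; once the R session (or the JVM) restarts, that reference is stale and every context has to be re-created before read.df will work.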

From: Amit Behera [mailto:amit.bd...@gmail.com]
Sent: Thursday, October 8, 2015 5:59 PM
To: user@spark.apache.org
Subject: How can I read a file from HDFS in SparkR from RStudio

Hi All,
I am very new to SparkR.
I am able to run the sample code from the example given at this link:
http://www.r-bloggers.com/installing-and-starting-sparkr-locally-on-windows-os-and-rstudio/
Then I tried to read a file from HDFS in RStudio, but I am unable to read it.
Below is my code.

Sys.setenv(SPARK_HOME="/home/affine/spark")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"),"R","lib"),.libPaths()))

library(SparkR)

library(rJava)

Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.2.0" "sparkr-shell"')

sc <- sparkR.init(master = "spark://master:7077", sparkPackages = "com.databricks:spark-csv_2.1:1.2.0")

sqlContext <- sparkRSQL.init(sc)

peopleDF <- read.df(sqlContext, "hdfs://master:9000/sears/example.csv")
Error:
Error in callJMethod(sqlContext, "getConf", "spark.sql.sources.default",  :
  Invalid jobj 1. If SparkR was restarted, Spark operations need to be re-executed.
Please tell me where I am going wrong.
Thanks,
Amit.
