[ https://issues.apache.org/jira/browse/SPARK-8596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604053#comment-14604053 ]
Vincent Warmerdam edited comment on SPARK-8596 at 6/27/15 8:04 AM:
-------------------------------------------------------------------
I'm writing a small tutorial to get up to speed with RStudio on AWS. It works. The main issue is that spark-ec2 currently installs an old version of R (3.1), while many packages such as ggplot2 require a newer version (3.2). I'm going to share the tutorial with the RStudio folks soon.

My approach is to run `spark/bin/start-all.sh` on the master node and then run the following commands in RStudio on the master node:

    .libPaths(c(.libPaths(), '/root/spark/R/lib'))
    Sys.setenv(SPARK_HOME = '/root/spark')
    Sys.setenv(PATH = paste(Sys.getenv("PATH"), '/root/spark/bin', sep = ':'))
    library(SparkR)
    sc <- sparkR.init('<SPARK MASTER ADR>')
    sqlContext <- sparkRSQL.init(sc)

This works on my end, and I've been able to use the DataFrame API with a JSON blob on S3 through this sqlContext. The main issue on my end is that because of the old R version I can't install visualisation/knitr packages. The Spark DataFrame works like a charm in the GUI though.
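As a rough sketch of the S3/JSON step described above: with the sqlContext initialised, reading a JSON blob from S3 into a Spark DataFrame in SparkR (as of Spark 1.4) can look like the following. The bucket path and the `age` column are hypothetical placeholders, and this assumes the cluster already has S3 credentials configured:

    # Hypothetical S3 path and column name; assumes S3 credentials are set up.
    people <- jsonFile(sqlContext, "s3n://my-bucket/people.json")
    printSchema(people)                       # inspect the inferred schema
    head(filter(people, people$age > 21))     # DataFrame operations run on the cluster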
> Install and configure RStudio server on Spark EC2
> -------------------------------------------------
>
>                 Key: SPARK-8596
>                 URL: https://issues.apache.org/jira/browse/SPARK-8596
>             Project: Spark
>          Issue Type: Improvement
>          Components: EC2, SparkR
>            Reporter: Shivaram Venkataraman
>
> This will make it convenient for R users to use SparkR from their browsers

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)