Re: Spark 1.4.0 - Using SparkR on EC2 Instance

2015-07-09 Thread RedOakMark
That’s correct.  We were setting up a Spark EC2 cluster from the command line, 
then installing RStudio Server, logging into that through the web interface and 
attempting to initialize the cluster within RStudio.  

We have made some progress on this outside of the thread - I will see what I 
can compile and share as a potential walkthrough. 



 On Jul 8, 2015, at 9:25 PM, BenPorter [via Apache Spark User List] 
 ml-node+s1001560n23732...@n3.nabble.com wrote:
 
 RedOakMark - just to make sure I understand what you did.  You ran the EC2 
 script on a local machine to spin up the cluster, but then did not try to run 
 anything in R/RStudio from your local machine.  Instead you installed RStudio 
 on the driver and ran it as a local cluster from that driver.  Is that 
 correct?  Otherwise, you make no reference to the master/EC2 server in this 
 code, so I have to assume that means you were running this directly from the 
 master. 
 
 Thanks, 
 Ben 
 




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-4-0-Using-SparkR-on-EC2-Instance-tp23506p23742.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Spark 1.4.0 - Using SparkR on EC2 Instance

2015-06-27 Thread RedOakMark
For anyone monitoring the thread, I was able to successfully install and run
a small Spark cluster and model using this method:

First, make sure that the username used to log in to RStudio Server is the
same one that was used to install Spark on the EC2 instance.  Thanks to
Shivaram for his help here.

Log in to RStudio and ensure that these references are used - set the library
location to the folder where Spark is installed.  In my case,
~/home/rstudio/spark.

# This line loads SparkR (the R package) from the installed directory
library(SparkR, lib.loc = "./spark/R/lib")

The edits to this line were important, so that Spark knew where the install
folder was located when initializing the cluster.

# Initialize the Spark local cluster in R, as 'sc'
sc <- sparkR.init(master = "local[2]", appName = "SparkR", sparkHome = "./spark")

From here, we ran a basic model using Spark from RStudio, which completed
successfully.
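For readers following along, here is a minimal sketch of what a basic run
like this could look like in SparkR 1.4. The dataset (R's built-in
`faithful`) and the operations are illustrative choices of mine, not taken
from the original post; it assumes `sc` was created with sparkR.init() as
shown above.

# Create a SQL context, convert a local R data frame to a Spark DataFrame,
# and run a simple aggregation on it.
sqlContext <- sparkRSQL.init(sc)

# 'faithful' is a built-in R dataset; any local data frame works here.
df <- createDataFrame(sqlContext, faithful)

# Basic DataFrame operations: inspect the schema, then count rows per
# value of the 'waiting' column.
printSchema(df)
head(summarize(groupBy(df, df$waiting), count = n(df$waiting)))

# Shut down the backend when finished.
sparkR.stop()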




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-4-0-Using-SparkR-on-EC2-Instance-tp23506p23514.html

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Spark 1.4.0 - Using SparkR on EC2 Instance

2015-06-26 Thread RedOakMark
Good morning,

I am having a bit of trouble finalizing the installation and use of the
newest Spark release, 1.4.0, deployed to an Amazon EC2 instance with
RStudio running on top of it.

Using these instructions
(http://spark.apache.org/docs/latest/ec2-scripts.html) we can fire up an
EC2 instance (which we have been successful doing - we have gotten the
cluster to launch from the command line without an issue).  Then, I
installed RStudio Server on the same EC2 instance (the master) and
successfully logged into it (using the test/test user) through the web
browser.
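For reference, the launch step above uses the spark-ec2 script shipped in
the Spark distribution. A sketch of the commands follows; the key pair name,
identity file, and cluster name are placeholders of mine, not values from
this thread, and the commands require live AWS credentials to actually run.

```shell
# From the root of the Spark 1.4.0 distribution on a local machine.
# Launch a small cluster; mykey / mykey.pem / sparkr-test are placeholders.
./ec2/spark-ec2 \
  --key-pair=mykey \
  --identity-file=mykey.pem \
  --slaves=2 \
  launch sparkr-test

# Later, log in to the master node the same way:
./ec2/spark-ec2 --key-pair=mykey --identity-file=mykey.pem login sparkr-test
```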

This is where I get stuck - within RStudio, when I try to find the folder
where SparkR was installed so that I can load the SparkR library and
initialize a SparkContext, I either get permission errors on the folders or
cannot locate the folder in which the library is sitting.
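One likely cause, consistent with the username advice given elsewhere in
this thread: the spark-ec2 scripts install Spark under /root, which the
RStudio login user cannot read. A sketch of one possible workaround, run as
root on the master - the user name `rstudio` and the paths are assumptions
about a typical setup, not commands taken from the thread.

```shell
# Run as root on the EC2 master node.
# Give the RStudio user its own copy of the Spark installation so that
# library() and sparkR.init() can read it without permission errors.
cp -r /root/spark /home/rstudio/spark
chown -R rstudio:rstudio /home/rstudio/spark
```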

Has anyone successfully launched and utilized SparkR 1.4.0 in this way, with
RStudio Server running on top of the master instance?  Are we on the right
track, or should we manually launch a cluster and attempt to connect to it
from another instance running R?

Thank you in advance!

Mark



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-4-0-Using-SparkR-on-EC2-Instance-tp23506.html

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org