spark-ec2 script does not install necessary files to launch Spark

2015-11-06 Thread Emaasit
Hello,
I followed the instructions for launching Spark 1.5.1 on AWS EC2, but the
script is not installing all the folders/files required to initialize Spark.
Since the log message is long, I have created a gist here:
https://gist.github.com/Emaasit/696145959bbbd989bfe1

Please help. I have been at this for more than 6 hours now with no
success.



-
Daniel Emaasit, 
Ph.D. Research Assistant
Transportation Research Center (TRC)
University of Nevada, Las Vegas
Las Vegas, NV 89154-4015
Cell: 615-649-2489
www.danielemaasit.com 
--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark-ec2-script-doest-not-install-necessary-files-to-launch-spark-tp25311.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: includePackage() deprecated?

2015-06-04 Thread Daniel Emaasit
Got it. Please ignore my similar question in the GitHub comments.

On Thu, Jun 4, 2015 at 11:48 AM, Shivaram Venkataraman 
shiva...@eecs.berkeley.edu wrote:

 Yeah - we don't have support for running UDFs on DataFrames yet. There is
 an open issue to track this:
 https://issues.apache.org/jira/browse/SPARK-6817

 Thanks
 Shivaram

 On Thu, Jun 4, 2015 at 3:10 AM, Daniel Emaasit daniel.emaa...@gmail.com
 wrote:

 Hello Shivaram,
 Was the includePackage() function deprecated in SparkR 1.4.0?
 I don't see it in the documentation. If it was, does that mean that we
 can use R packages on Spark DataFrames the usual way we do for local R
 data frames?

 Daniel



Re: DataFrames coming in SparkR in Apache Spark 1.4.0

2015-06-03 Thread Emaasit
You can build Spark from the 1.4 release branch yourself:
https://github.com/apache/spark/tree/branch-1.4
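
A rough shell sketch of what building from that branch could look like. The Maven profiles are the ones used elsewhere in this thread archive; the helper names run and build_spark_branch, the DRY_RUN switch, and the --depth 1 shallow clone are assumptions added for illustration, not part of Spark.

```shell
#!/bin/sh
# Sketch: fetch a Spark release branch and build it with Maven.
# Set DRY_RUN=1 to print the commands instead of running them.

# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Clone the given branch (default: branch-1.4) and build it.
build_spark_branch() {
    branch=${1:-branch-1.4}
    dest=${2:-spark-$branch}
    run git clone --branch "$branch" --depth 1 \
        https://github.com/apache/spark.git "$dest" || return 1
    # Maven has to be started from the directory that holds pom.xml.
    run cd "$dest" || return 1
    run mvn -Psparkr -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 \
        -DskipTests clean package
}
```

Running `DRY_RUN=1; build_spark_branch` prints the three commands without touching the network, which is a cheap way to check them before a long build.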



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/DataFrames-coming-in-SparkR-in-Apache-Spark-1-4-0-tp23116p23131.html



Error: Building Spark 1.4.0 from Github-1.4 release branch

2015-06-03 Thread Emaasit
 cannot find the path specified) - [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
C:\Program Files\Apache Software Foundation\spark-branch-1.4>



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Error-Building-Spark-1-4-0-from-Github-1-4-release-branch-tp23132.html



Re: Spark 1.4.0 build Error on Windows

2015-06-03 Thread Daniel Emaasit
-shared-archive-resources\META-INF\NOTICE (The system cannot find the path specified) - [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
C:\Program Files\Apache Software Foundation\spark-branch-1.4>

On Tue, Jun 2, 2015 at 7:17 PM, Shivaram Venkataraman 
shivaram.venkatara...@gmail.com wrote:

 No worries - also cc'ing user@spark.apache.org might get faster responses!

 Shivaram

 On Tue, Jun 2, 2015 at 6:05 PM, Daniel Emaasit daniel.emaa...@gmail.com
 wrote:

 Oops, my bad. I was building from the wrong directory.


 On Tue, Jun 2, 2015 at 5:57 PM, Daniel Emaasit daniel.emaa...@gmail.com
 wrote:

 Hello Shivaram,
 While I was able to build Spark 1.3.0, I am getting errors building
 Spark 1.4.0. I was trying to build from the 1.4 branch at
 https://github.com/apache/spark/tree/branch-1.4
 Here is the log file.

 C:\Program Files\Apache Software Foundation\spark-branch-1.4>cd build

 C:\Program Files\Apache Software Foundation\spark-branch-1.4\build>ls
 mvn  sbt  sbt-launch-lib.bash

 C:\Program Files\Apache Software Foundation\spark-branch-1.4\build>mvn -Psparkr -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
 [INFO] Scanning for projects...
 [INFO]
 [INFO] BUILD FAILURE
 [INFO]
 [INFO] Total time: 0.469 s
 [INFO] Finished at: 2015-06-02T17:47:28-07:00
 [INFO] Final Memory: 4M/121M
 [INFO]
 [WARNING] The requested profile sparkr could not be activated because it does not exist.
 [WARNING] The requested profile yarn could not be activated because it does not exist.
 [WARNING] The requested profile hadoop-2.4 could not be activated because it does not exist.
 [ERROR] The goal you specified requires a project to execute but there is no POM in this directory (C:\Program Files\Apache Software Foundation\spark-branch-1.4\build). Please verify you invoked Maven from the correct directory. - [Help 1]
 [ERROR]
 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR]
 [ERROR] For more information about the errors and possible solutions, please read the following articles:
 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MissingProjectException
 C:\Program Files\Apache Software Foundation\spark-branch-1.4\build>
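
The MissingProjectException in the log above comes from starting Maven in the build subdirectory, which has no pom.xml. A minimal shell sketch of a guard against that, assuming a POSIX shell (on Windows, something like Git Bash or Cygwin). The helper name find_project_root is hypothetical and not part of Spark or Maven; it walks up from a starting directory until it finds a pom.xml.

```shell
#!/bin/sh
# Print the nearest directory at or above the start directory
# (default: current directory) that contains a pom.xml.
# Returns 1 if none is found. Hypothetical helper, not a Maven feature.
find_project_root() {
    dir=$(cd "${1:-.}" && pwd) || return 1
    while :; do
        if [ -f "$dir/pom.xml" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        # Stop once the filesystem root has been checked.
        if [ "$dir" = "/" ]; then
            return 1
        fi
        dir=$(dirname "$dir")
    done
}
```

One possible use is `cd "$(find_project_root)" && mvn -DskipTests clean package`, so that Maven always starts next to the POM.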



DataFrames coming in SparkR in Apache Spark 1.4.0

2015-06-02 Thread Emaasit
For the impatient R user, here is a link to get started working with DataFrames
using SparkR:
http://people.apache.org/~pwendell/spark-nightly/spark-1.4-docs/latest/sparkr.html

Happy coding,
Daniel



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/DataFrames-coming-in-SparkR-in-Apache-Spark-1-4-0-tp23116.html



Re: IDE for sparkR

2015-06-02 Thread Emaasit
RStudio is the best IDE for running SparkR.
Instructions can be found here:
https://github.com/apache/spark/tree/branch-1.4/R
You will need to set some environment variables, as described below.

*Using SparkR from RStudio*

If you wish to use SparkR from RStudio or other R frontends you will need to
set some environment variables which point SparkR to your Spark
installation. For example:

# Set this to where Spark is installed
Sys.setenv(SPARK_HOME="/Users/shivaram/spark")
# This line loads SparkR from the installed directory
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sc <- sparkR.init(master="local")
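
The R snippet above only works if SPARK_HOME points at a Spark tree that contains an R/lib directory. A small shell sketch for checking that before starting RStudio; check_sparkr_home is a hypothetical helper, and the R/lib layout is taken from the file.path() call in the snippet rather than verified against any particular Spark release.

```shell
#!/bin/sh
# Print the SparkR library directory under the given Spark home
# (default: $SPARK_HOME), or fail with a message on stderr.
# Hypothetical helper for illustration only.
check_sparkr_home() {
    home=${1:-${SPARK_HOME:-}}
    if [ -z "$home" ]; then
        echo "SPARK_HOME is not set" >&2
        return 1
    fi
    if [ ! -d "$home/R/lib" ]; then
        echo "no R/lib directory under $home" >&2
        return 1
    fi
    printf '%s\n' "$home/R/lib"
}
```

For example, `check_sparkr_home /Users/shivaram/spark` would print the directory that the .libPaths() call adds, or explain why it cannot.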



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/IDE-for-sparkR-tp4764p23115.html



Book: Data Analysis with SparkR

2014-11-21 Thread Emaasit
Is there a book on SparkR for the absolute, terrified beginner?
I use R for my daily analysis and I am interested in a detailed guide to
using SparkR for data analytics, such as a book or online tutorials. If there
are any, please point me to them.

Thanks,
Daniel



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Book-Data-Analysis-with-SparkR-tp19529.html