Spark-submit not running

2014-08-28 Thread Hingorani, Vineet
The file compiles properly, but when I try to run the jar file using spark-submit, it gives some errors. I am running Spark locally and have downloaded the pre-built version of Spark labeled "For Hadoop 2 (HDP2, CDH5)". I don't know if it is a dependency problem but I don't want to have

RE: Spark-submit not running

2014-08-28 Thread Hingorani, Vineet
[mailto:so...@cloudera.com] Sent: Thursday, 28 August 2014 13:49 To: Hingorani, Vineet Cc: user@spark.apache.org Subject: Re: Spark-submit not running You need to set HADOOP_HOME. Is Spark officially supposed to work on Windows or not at this stage? I know the build doesn't quite yet. On Thu, Aug

RE: Spark-submit not running

2014-08-28 Thread Hingorani, Vineet
: Thursday, 28 August 2014 15:16 To: Guru Medasani Cc: Hingorani, Vineet; user@spark.apache.org Subject: Re: Spark-submit not running Yes, but I think at the moment there is still a dependency on Hadoop even when not using it. See https://issues.apache.org/jira/browse/SPARK-2356 On Thu, Aug 28, 2014 at 2

RE: Spark-submit not running

2014-08-28 Thread Hingorani, Vineet
) Compilation failed Vineet -Original Message- From: Sean Owen [mailto:so...@cloudera.com] Sent: Thursday, 28 August 2014 16:30 To: Hingorani, Vineet Cc: user@spark.apache.org Subject: Re: Spark-submit not running You should set this as early as possible in your program, before other

Example File not running

2014-08-27 Thread Hingorani, Vineet
Hello all, I am able to use Spark in the shell but I am not able to run a Spark file. I am using sbt and the jar is created, but even the SimpleApp class example given on the site http://spark.apache.org/docs/latest/quick-start.html is not running. I installed a prebuilt version of Spark and

RE: Example File not running

2014-08-27 Thread Hingorani, Vineet
What value should I set for that environment variable? I want to run the scripts locally on my machine and do not have any Hadoop installed. Thank you From: Akhil Das [mailto:ak...@sigmoidanalytics.com] Sent: Wednesday, 27 August 2014 12:54 To: Hingorani, Vineet Cc: user@spark.apache.org

RE: Example File not running

2014-08-27 Thread Hingorani, Vineet
...@sigmoidanalytics.com] Sent: Wednesday, 27 August 2014 13:35 To: Hingorani, Vineet Cc: user@spark.apache.org Subject: Re: Example File not running It should point to your Hadoop installation directory (like C:\hadoop\). Since you don't have Hadoop installed, what is the code that you are running

Example file not running

2014-08-27 Thread Hingorani, Vineet
Hello all, I am able to use Spark in the shell but I am not able to run a Spark file. I am using sbt and the jar is created, but even the SimpleApp class example given on the site http://spark.apache.org/docs/latest/quick-start.html is not running. I installed a prebuilt version of Spark and

RE: Example File not running

2014-08-27 Thread Hingorani, Vineet
: Wednesday, 27 August 2014 16:01 To: Hingorani, Vineet Cc: user@spark.apache.org Subject: Re: Example File not running You can install Hadoop 2 by reading this doc https://wiki.apache.org/hadoop/Hadoop2OnWindows Once you are done with it, you can set the environment variable HADOOP_HOME
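
The replies in this thread and in "Spark-submit not running" come down to making a Hadoop home directory visible to the JVM. A minimal Scala sketch of a pre-flight check follows. It rests on assumptions that are not in the snippets above: that Hadoop also honours the JVM system property `hadoop.home.dir` in addition to `HADOOP_HOME`, and that on Windows the file it looks for is `bin\winutils.exe` (the problem tracked in SPARK-2356). The object and method names are hypothetical.

```scala
import java.io.File

// Hypothetical helper, not part of Spark or Hadoop.
object HadoopHomeCheck {
  // Pick the first non-blank candidate: the JVM property first, then the
  // environment variable (assumed lookup order, see the lead-in).
  def resolve(prop: Option[String], env: Option[String]): Option[String] =
    Seq(prop, env).flatten.map(_.trim).find(_.nonEmpty)

  // Location where the Windows helper binary is assumed to live.
  def winutilsPath(home: String): File =
    new File(new File(home, "bin"), "winutils.exe")

  def main(args: Array[String]): Unit =
    resolve(sys.props.get("hadoop.home.dir"), sys.env.get("HADOOP_HOME")) match {
      case Some(home) =>
        println(s"Hadoop home: $home, winutils.exe found: ${winutilsPath(home).isFile}")
      case None =>
        println("Neither hadoop.home.dir nor HADOOP_HOME is set")
    }
}
```

Running `HadoopHomeCheck.main` before submitting a job shows whether the variable is visible to the JVM at all, which is a cheap first thing to rule out.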

Manipulating columns in CSV file or Transpose of Array[Array[String]] RDD

2014-08-25 Thread Hingorani, Vineet
Hello all, Could someone help me with the manipulation of CSV file data? I have semicolon-separated CSV data including doubles and strings. I want to calculate the maximum/average of a column. When I read the file using sc.textFile("test.csv").map(_.split(";")), each field is read as a string.
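
One way to approach the question above is to convert a single column to `Double` and skip anything that does not parse (header rows, empty fields, text columns). The following is only a sketch in plain Scala so that it runs without a Spark installation; the object and function names are hypothetical, and the column index and sample rows in the usage note are made up for illustration.

```scala
import scala.util.Try

// Hypothetical helper for semicolon-separated lines.
object CsvColumnStats {
  // Split on ';' and keep trailing empty fields (limit = -1).
  def splitLine(line: String): Array[String] = line.split(";", -1)

  // Field at index `col` as a Double, or None if it is missing or not numeric.
  def numericField(fields: Array[String], col: Int): Option[Double] =
    if (col >= 0 && col < fields.length) Try(fields(col).trim.toDouble).toOption
    else None

  // (maximum, mean) of a column, or None when no row has a numeric value there.
  def maxAndMean(lines: Seq[String], col: Int): Option[(Double, Double)] = {
    val values = lines.flatMap(l => numericField(splitLine(l), col))
    if (values.isEmpty) None else Some((values.max, values.sum / values.size))
  }
}
```

For example, `CsvColumnStats.maxAndMean(Seq("name;price", "a;1.5", "b;2.5"), 1)` skips the header and yields the maximum and mean of the second column. The same two functions should carry over to an RDD, along the lines of `sc.textFile("test.csv").map(CsvColumnStats.splitLine).flatMap(f => CsvColumnStats.numericField(f, 1))`, after which the resulting `RDD[Double]` can be reduced with Spark's own numeric operations; that part is untested here and may need `import org.apache.spark.SparkContext._` on older Spark versions.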

RE: Manipulating columns in CSV file or Transpose of Array[Array[String]] RDD

2014-08-25 Thread Hingorani, Vineet
...@paxata.com] Sent: Monday, 25 August 2014 18:34 To: Hingorani, Vineet Cc: user@spark.apache.org Subject: Re: Manipulating columns in CSV file or Transpose of Array[Array[String]] RDD Do you want to do this on one column or all numeric columns? On Mon, Aug 25, 2014 at 7:09 AM, Hingorani, Vineet

Manipulating/Analyzing CSV files in Spark on local machine

2014-08-22 Thread Hingorani, Vineet
Hello all, I am new to Spark and I want to analyze a CSV file using Spark on my local machine. The CSV file contains an airline database and I want to get a few descriptive statistics (e.g. maximum of one column, mean, standard deviation in a column, etc.) for my file. I am reading the file using