Re: Submitting Spark Applications using Spark Submit

2015-06-22 Thread Andrew Or
. make local changes 4. build/mvn package -DskipTests [...] (no need to clean again here) 5. bin/spark-submit --master spark://[...] --class your.main.class your.jar No need to pass in extra --driver-java-options or --driver-extra-classpath as others have suggested. When using spark-submit
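
A minimal sketch of the rebuild-and-submit cycle described above; the master URL, main class, and jar name below are placeholders:

    # rebuild after local changes (no clean needed, per the steps above)
    build/mvn package -DskipTests
    # submit against the standalone master
    bin/spark-submit --master spark://master-host:7077 \
      --class com.example.MyApp myapp.jar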

Re: Submitting Spark Applications using Spark Submit

2015-06-20 Thread Raghav Shankar
[...] (no need to clean again here) 5. bin/spark-submit --master spark://[...] --class your.main.class your.jar No need to pass in extra --driver-java-options or --driver-extra-classpath as others have suggested. When using spark-submit, the main jar comes from assembly/target/scala_2.10, which

Re: What files/folders/jars spark-submit script depend on ?

2015-06-19 Thread Andrew Or
Hi Elkhan, Spark submit depends on several things: the launcher jar (1.3.0+ only), the spark-core jar, and the spark-yarn jar (in your case). Why do you want to put it in HDFS though? AFAIK you can't execute scripts directly from HDFS; you need to copy them to a local file system first. I don't
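
A sketch of the copy-to-local step this reply suggests, using a hypothetical HDFS path:

    # scripts cannot be executed directly from HDFS; fetch a local copy first
    hadoop fs -get hdfs:///apps/spark/submit-job.sh /tmp/submit-job.sh
    chmod +x /tmp/submit-job.sh && /tmp/submit-job.sh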

Re: Submitting Spark Applications using Spark Submit

2015-06-19 Thread Andrew Or
) 5. bin/spark-submit --master spark://[...] --class your.main.class your.jar No need to pass in extra --driver-java-options or --driver-extra-classpath as others have suggested. When using spark-submit, the main jar comes from assembly/target/scala_2.10, which is prepared through mvn package. You

Re: Submitting Spark Applications using Spark Submit

2015-06-19 Thread Andrew Or
changes 4. build/mvn package -DskipTests [...] (no need to clean again here) 5. bin/spark-submit --master spark://[...] --class your.main.class your.jar No need to pass in extra --driver-java-options or --driver-extra-classpath as others have suggested. When using spark-submit, the main jar

Re: Submitting Spark Applications using Spark Submit

2015-06-19 Thread Raghav Shankar
/mvn package -DskipTests [...] (no need to clean again here) 5. bin/spark-submit --master spark://[...] --class your.main.class your.jar No need to pass in extra --driver-java-options or --driver-extra-classpath as others have suggested. When using spark-submit, the main jar comes from assembly

Re: What files/folders/jars spark-submit script depend on ?

2015-06-19 Thread Elkhan Dadashov
Thanks Andrew. We cannot include Spark in our Java project due to dependency issues. Spark will not be exposed to clients. What we want to do is put the Spark tarball (in the worst case) into HDFS, so that our Java app, which runs in local mode, can launch the spark-submit script with user Python files

What files/folders/jars spark-submit script depend on ?

2015-06-19 Thread Elkhan Dadashov
Hi all, If I want to ship the spark-submit script to HDFS, and then call it from its HDFS location to start a Spark job, which other files/folders/jars need to be transferred into HDFS with the spark-submit script? Due to some dependency issues, we cannot include Spark in our Java application, so instead we

Re: Submitting Spark Applications using Spark Submit

2015-06-18 Thread lovelylavs
.n3.nabble.com/Submitting-Spark-Applications-using-Spark-Submit-tp23352p23395.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Is there programmatic way running Spark job on Yarn cluster without using spark-submit script ?

2015-06-18 Thread Corey Nolet
This is not an independent programmatic way of running a Spark job on a Yarn cluster. The example I created simply demonstrates how to wire up the classpath so that spark-submit can be called programmatically. For my use case, I wanted to hold open a connection so I could send tasks to the executors

Re: Submitting Spark Applications using Spark Submit

2015-06-18 Thread maxdml
You can specify the jars of your application to be included with spark-submit with the /--jars/ switch. Otherwise, are you sure that your newly compiled spark jar assembly is in assembly/target/scala-2.10/? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com
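
A sketch of the /--jars/ usage mentioned in this reply; the jar and class names are placeholders:

    bin/spark-submit --master spark://master-host:7077 \
      --jars dep1.jar,dep2.jar \
      --class com.example.MyApp myapp.jar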

Is there programmatic way running Spark job on Yarn cluster without using spark-submit script ?

2015-06-17 Thread Elkhan Dadashov
Hi all, Is there any way of running a Spark job programmatically on a Yarn cluster without using the spark-submit script? I cannot include Spark jars in my Java application (due to dependency conflicts and other reasons), so I'll be shipping the Spark assembly uber jar (spark-assembly-1.3.1-hadoop2.3.0.jar

Re: Is there programmatic way running Spark job on Yarn cluster without using spark-submit script ?

2015-06-17 Thread Elkhan Dadashov
elkhan8...@gmail.com wrote: Hi all, Is there any way of running a Spark job programmatically on a Yarn cluster without using the spark-submit script? I cannot include Spark jars in my Java application (due to dependency conflicts and other reasons), so I'll be shipping the Spark assembly uber jar

Re: Is there programmatic way running Spark job on Yarn cluster without using spark-submit script ?

2015-06-17 Thread Guru Medasani
Hi Elkhan, There are couple of ways to do this. 1) Spark-jobserver is a popular web server that is used to submit spark jobs. https://github.com/spark-jobserver/spark-jobserver https://github.com/spark-jobserver/spark-jobserver 2) Spark-submit script sets the classpath for the job. Bypassing
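
A sketch of option 1, following spark-jobserver's documented REST endpoints; the host, app name, and job class are placeholders and the exact routes may differ by version:

    # upload the application jar under an app name
    curl --data-binary @myapp.jar localhost:8090/jars/myapp
    # run a job class from the uploaded jar
    curl -d "" 'localhost:8090/jobs?appName=myapp&classPath=com.example.MyJob'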

Re: Is there programmatic way running Spark job on Yarn cluster without using spark-submit script ?

2015-06-17 Thread Corey Nolet
cluster without using the spark-submit script? I cannot include Spark jars in my Java application (due to dependency conflicts and other reasons), so I'll be shipping the Spark assembly uber jar (spark-assembly-1.3.1-hadoop2.3.0.jar) to the Yarn cluster, and then execute the job (Python or Java) on Yarn-cluster

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Yanbo Liang
If you run Spark on YARN, the simplest way is to replace the $SPARK_HOME/lib/spark-.jar with your own version of the Spark jar file and run your application. The spark-submit script will upload this jar to the YARN cluster automatically and then you can run your application as usual. It does not care about
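
A sketch of the jar-swap approach described here; the assembly jar names are placeholders and depend on your build:

    # back up the stock assembly and drop in the custom build
    mv $SPARK_HOME/lib/spark-assembly-*.jar /tmp/
    cp my-custom-spark-assembly.jar $SPARK_HOME/lib/
    # on YARN, spark-submit uploads the replaced jar automatically
    bin/spark-submit --master yarn-cluster --class com.example.MyApp myapp.jar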

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Raghav Shankar
To clarify, I am using the Spark standalone cluster. On Tuesday, June 16, 2015, Yanbo Liang yblia...@gmail.com wrote: If you run Spark on YARN, the simplest way is to replace the $SPARK_HOME/lib/spark-.jar with your own version of the Spark jar file and run your application. The spark-submit

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Will Briggs
to submit a spark application using the command line. I used the spark-submit command for doing so. I initially set up my Spark application in Eclipse and have been making changes there. I recently obtained my own version of the Spark source code and added a new method to RDD.scala. I created a new

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Raghav Shankar
I made the change so that I could implement top() using treeReduce(). A member on here suggested I make the change in RDD.scala to accomplish that. Also, this is for a research project, and not for commercial use. So, any advice on how I can get the spark submit to use my custom built jars

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Will Briggs
If this is research-only, and you don't want to have to worry about updating the jars installed by default on the cluster, you can add your custom Spark jar using the spark.driver.extraLibraryPath configuration property when running spark-submit, and then use the experimental
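
A sketch of this suggestion with placeholder paths; the follow-up below names spark.driver.userClassPathFirst as the experimental option, and note that for prepending a jar the usual property is spark.driver.extraClassPath (extraLibraryPath targets native libraries):

    bin/spark-submit \
      --conf spark.driver.extraClassPath=/path/to/custom-spark.jar \
      --conf spark.driver.userClassPathFirst=true \
      --class com.example.MyApp myapp.jar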

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Raghav Shankar
The documentation says spark.driver.userClassPathFirst can only be used in cluster mode. Does this mean I have to set the --deploy-mode option for spark-submit to cluster? Or can I still use the default client? My understanding is that even in the default deploy mode, spark still uses

Submitting Spark Applications using Spark Submit

2015-06-16 Thread raggy
I am trying to submit a spark application using the command line. I used the spark-submit command for doing so. I initially set up my Spark application in Eclipse and have been making changes there. I recently obtained my own version of the Spark source code and added a new method to RDD.scala

Re: how to use a properties file from a url in spark-submit

2015-06-12 Thread Gary Ogden
Turns out one of the other developers wrapped the jobs in a script and did a cd to another folder in the script before executing spark-submit. On 12 June 2015 at 14:20, Matthew Jones mle...@gmail.com wrote: Hmm either spark-submit isn't picking up the relative path or Chronos is not setting your

Re: [Spark 1.4.0]How to set driver's system property using spark-submit options?

2015-06-12 Thread Peng Cheng
://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-4-0-How-to-set-driver-s-system-property-using-spark-submit-options-tp23298.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: [Spark 1.4.0]How to set driver's system property using spark-submit options?

2015-06-12 Thread Ted Yu
This is the SPARK JIRA which introduced the warning: [SPARK-7037] [CORE] Inconsistent behavior for non-spark config properties in spark-shell and spark-submit On Fri, Jun 12, 2015 at 4:34 PM, Peng Cheng rhw...@gmail.com wrote: Hi Andrew, Thanks a lot! Indeed, it doesn't start with spark

Re: [Spark 1.4.0]How to set driver's system property using spark-submit options?

2015-06-12 Thread Peng Cheng
2015 at 19:39, Ted Yu yuzhih...@gmail.com wrote: This is the SPARK JIRA which introduced the warning: [SPARK-7037] [CORE] Inconsistent behavior for non-spark config properties in spark-shell and spark-submit On Fri, Jun 12, 2015 at 4:34 PM, Peng Cheng rhw...@gmail.com wrote: Hi Andrew

Re: how to use a properties file from a url in spark-submit

2015-06-12 Thread Gary Ogden
That's a great idea. I did what you suggested and added the url to the props file in the uri of the json. The properties file now shows up in the sandbox. But when it goes to run spark-submit with --properties-file props.properties it fails to find it: Exception in thread main

Re: how to use a properties file from a url in spark-submit

2015-06-12 Thread Matthew Jones
Hmm either spark-submit isn't picking up the relative path or Chronos is not setting your working directory to your sandbox. Try using cd $MESOS_SANDBOX spark-submit --properties-file props.properties On Fri, Jun 12, 2015 at 12:32 PM Gary Ogden gog...@gmail.com wrote: That's a great idea. I
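
The suggested wrapper as a one-line sketch (main class and jar are placeholders; $MESOS_SANDBOX is assumed to be set by the Mesos executor):

    # run from the sandbox so the relative properties path resolves
    cd $MESOS_SANDBOX && spark-submit --properties-file props.properties \
      --class com.example.MyApp myapp.jar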

[Spark 1.4.0]How to set driver's system property using spark-submit options?

2015-06-12 Thread Peng Cheng
set driver's system property in 1.4.0? Is there a reason it was removed without a deprecation warning? Thanks a lot for your advice. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-4-0-How-to-set-driver-s-system-property-using-spark-submit-options

Re: [Spark 1.4.0]How to set driver's system property using spark-submit options?

2015-06-12 Thread Andrew Or
set driver's system property in 1.4.0? Is there a reason it was removed without a deprecation warning? Thanks a lot for your advice. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-4-0-How-to-set-driver-s-system-property-using-spark-submit-options

how to use a properties file from a url in spark-submit

2015-06-11 Thread Gary Ogden
I have a properties file that is hosted at a URL. I would like to be able to use the URL in the --properties-file parameter when submitting a job to Mesos using spark-submit via Chronos. I would rather do this than use a file on the local server. This doesn't seem to work though when submitting

Re: how to use a properties file from a url in spark-submit

2015-06-11 Thread Marcelo Vanzin
That's not supported. You could use wget / curl to download the file to a temp location before running spark-submit, though. On Thu, Jun 11, 2015 at 12:48 PM, Gary Ogden gog...@gmail.com wrote: I have a properties file that is hosted at a url. I would like to be able to use the url
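
A sketch of the download-then-submit workaround; the URL, class, and jar are placeholders:

    # --properties-file expects a local path, so fetch the file first
    curl -o /tmp/props.properties http://config-host/props.properties
    spark-submit --properties-file /tmp/props.properties \
      --class com.example.MyApp myapp.jar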

Re: spark-submit does not use hive-site.xml

2015-06-10 Thread Cheng Lian
, in $SPARK_HOME/conf directory, I tried to share its config with spark. When I start spark-shell, it gives me a default sqlContext, and I can use that to access my Hive's tables with no problem. But once I submit a similar query via Spark application through 'spark-submit', it does not see

Re: ClassNotDefException when using spark-submit with multiple jars and files located on HDFS

2015-06-10 Thread Akhil Das
standalone mode. Any ideas? Thanks Dong Lei *From:* Akhil Das [mailto:ak...@sigmoidanalytics.com] *Sent:* Tuesday, June 9, 2015 4:46 PM *To:* Dong Lei *Cc:* user@spark.apache.org *Subject:* Re: ClassNotDefException when using spark-submit with multiple jars and files located

Re: spark-submit does not use hive-site.xml

2015-06-10 Thread James Pirz
tables with no problem. But once I submit a similar query via Spark application through 'spark-submit', it does not see the tables and it seems it does not pick hive-site.xml which is under conf directory in Spark's home. I tried to use '--files' argument with spark-submit to pass hive-site.xml

Re: spark-submit does not use hive-site.xml

2015-06-10 Thread Cheng Lian
submit a similar query via Spark application through 'spark-submit', it does not see the tables and it seems it does not pick hive-site.xml which is under conf directory in Spark's home. I tried to use '--files' argument with spark-submit to pass hive-site.xml' to the workers
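
A sketch of the two approaches mentioned in this thread, with placeholder paths:

    # option 1: make hive-site.xml visible via Spark's conf directory
    cp /etc/hive/conf/hive-site.xml $SPARK_HOME/conf/
    # option 2: ship it explicitly with the application
    spark-submit --files /etc/hive/conf/hive-site.xml \
      --class com.example.MyApp myapp.jar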

Re: ClassNotDefException when using spark-submit with multiple jars and files located on HDFS

2015-06-09 Thread Akhil Das
...@sigmoidanalytics.com] *Sent:* Tuesday, June 9, 2015 3:24 PM *To:* Dong Lei *Cc:* user@spark.apache.org *Subject:* Re: ClassNotDefException when using spark-submit with multiple jars and files located on HDFS Once you submit the application, you can check in the driver UI (running on port 4040

RE: ClassNotDefException when using spark-submit with multiple jars and files located on HDFS

2015-06-09 Thread Dong Lei
3:24 PM To: Dong Lei Cc: user@spark.apache.org Subject: Re: ClassNotDefException when using spark-submit with multiple jars and files located on HDFS Once you submit the application, you can check in the driver UI (running on port 4040) Environment Tab to see whether those jars you added got

Re: ClassNotDefException when using spark-submit with multiple jars and files located on HDFS

2015-06-09 Thread Akhil Das
by putting the jar with the class in it at the top of your classpath. Thanks Best Regards On Tue, Jun 9, 2015 at 9:05 AM, Dong Lei dong...@microsoft.com wrote: Hi, spark-users: I'm using spark-submit to submit multiple jars and files (all in HDFS) to run a job, with the following command

Re: Can a Spark App run with spark-submit write pdf files to HDFS

2015-06-09 Thread nsalian
. Moreover, is there a specific need to use Spark in this case? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-a-Spark-App-run-with-spark-submit-write-pdf-files-to-HDFS-tp23233p23237.html Sent from the Apache Spark User List mailing list archive

Can a Spark App run with spark-submit write pdf files to HDFS

2015-06-09 Thread Richard Catlin
I would like to write pdf files using pdfbox to HDFS from my Spark application. Can this be done? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-a-Spark-App-run-with-spark-submit-write-pdf-files-to-HDFS-tp23233.html Sent from the Apache Spark User

Re: spark-submit working differently than pyspark when trying to find external jars

2015-06-09 Thread Walt Schlender
I figured it out, in case anyone else has this problem in the future: spark-submit --driver-class-path lib/postgresql-9.4-1201.jdbc4.jar --packages com.databricks:spark-csv_2.10:1.0.3 path/to/my/script.py What I found is that you MUST put the path to your script at the end of the spark-submit

Re: Can a Spark App run with spark-submit write pdf files to HDFS

2015-06-09 Thread William Briggs
.nabble.com/Can-a-Spark-App-run-with-spark-submit-write-pdf-files-to-HDFS-tp23233.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

spark-submit does not use hive-site.xml

2015-06-09 Thread James Pirz
config with spark. When I start spark-shell, it gives me a default sqlContext, and I can use that to access my Hive's tables with no problem. But once I submit a similar query via Spark application through 'spark-submit', it does not see the tables and it seems it does not pick hive-site.xml which

RE: ClassNotDefException when using spark-submit with multiple jars and files located on HDFS

2015-06-09 Thread Dong Lei
Dong Lei From: Akhil Das [mailto:ak...@sigmoidanalytics.com] Sent: Tuesday, June 9, 2015 3:24 PM To: Dong Lei Cc: user@spark.apache.org Subject: Re: ClassNotDefException when using spark-submit with multiple jars and files located

Re: ClassNotDefException when using spark-submit with multiple jars and files located on HDFS

2015-06-09 Thread Jörn Franke
: ClassNotDefException when using spark-submit with multiple jars and files located on HDFS You can put a Thread.sleep(10) in the code to have the UI available for quite some time. (Put it just before starting any of your transformations) Or you can enable the spark history server https

RE: ClassNotDefException when using spark-submit with multiple jars and files located on HDFS

2015-06-09 Thread Dong Lei
: ClassNotDefException when using spark-submit with multiple jars and files located on HDFS I am not sure they work with HDFS paths. You may want to look at the source code. Alternatively you can create a fat jar containing all jars (let your build tool set up META-INF correctly). This always works

ClassNotDefException when using spark-submit with multiple jars and files located on HDFS

2015-06-08 Thread Dong Lei
Hi, spark-users: I'm using spark-submit to submit multiple jars and files (all in HDFS) to run a job, with the following command: Spark-submit --class myClass --master spark://localhost:7077/ --deploy-mode cluster --jars hdfs://localhost/1.jar, hdfs://localhost/2.jar --files hdfs

run spark submit on cloudera cluster

2015-06-03 Thread Pa Rö
Hi, I want to run my Spark app on a cluster; I am using the Cloudera Live single-node VM. How must I build the job for the spark-submit script? And must I upload spark-submit to HDFS? Best regards, Paul

Re: yarn-cluster spark-submit process not dying

2015-05-28 Thread Corey Nolet
Thanks Sandy- I was digging through the code in the deploy.yarn.Client and literally found that property right before I saw your reply. I'm on 1.2.x right now which doesn't have the property. I guess I need to update sooner rather than later. On Thu, May 28, 2015 at 3:56 PM, Sandy Ryza

yarn-cluster spark-submit process not dying

2015-05-28 Thread Corey Nolet
I am submitting jobs to my yarn cluster via the yarn-cluster mode and I'm noticing the jvm that fires up to allocate the resources, etc... is not going away after the application master and executors have been allocated. Instead, it just sits there printing 1 second status updates to the console.

Re: yarn-cluster spark-submit process not dying

2015-05-28 Thread Sandy Ryza
Hi Corey, As of this PR https://github.com/apache/spark/pull/5297/files, this can be controlled with spark.yarn.submit.waitAppCompletion. -Sandy On Thu, May 28, 2015 at 11:48 AM, Corey Nolet cjno...@gmail.com wrote: I am submitting jobs to my yarn cluster via the yarn-cluster mode and I'm
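
A sketch of using the property named in this reply (master, class, and jar are placeholders; as noted elsewhere in the thread, the property is not available in 1.2.x):

    # let the spark-submit process exit once the app is submitted to YARN
    spark-submit --master yarn-cluster \
      --conf spark.yarn.submit.waitAppCompletion=false \
      --class com.example.MyApp myapp.jar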

Driver ID from spark-submit

2015-04-27 Thread Rares Vernica
Hello, I am trying to use the default Spark cluster manager in a production environment. I will be submitting jobs with spark-submit. I wonder if the following is possible: 1. Get the Driver ID from spark-submit. We will use this ID to keep track of the job and kill it if necessary. 2. Whether

need info on Spark submit on yarn-cluster mode

2015-04-08 Thread sachin Singh
Hi, I observed that we have installed only one cluster, and when submitting the job as yarn-cluster I am getting the below error; is the cause that the installation is only one cluster? Please correct me; if this is not the cause, then why am I not able to run in cluster mode? The spark-submit command is - spark-submit

Re: need info on Spark submit on yarn-cluster mode

2015-04-08 Thread Steve Loughran
below error; is the cause that the installation is only one cluster? Please correct me; if this is not the cause, then why am I not able to run in cluster mode? The spark-submit command is - spark-submit --jars some dependent jars... --master yarn --class com.java.jobs.sparkAggregation mytest-1.0.0.jar

RE: EC2 spark-submit --executor-memory

2015-04-08 Thread java8964
If you are using the Spark Standalone deployment, make sure you set WORKER_MEMORY over 20G, and that you do have 20G of physical memory. Yong Date: Tue, 7 Apr 2015 20:58:42 -0700 From: li...@adobe.com To: user@spark.apache.org Subject: EC2 spark-submit --executor-memory Dear Spark team, I'm

Re: Can't run spark-submit with an application jar on a Mesos cluster

2015-03-31 Thread hbogert
why it is failing. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-t-run-spark-submit-with-an-application-jar-on-a-Mesos-cluster-tp22277p22319.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Re: Can't run spark-submit with an application jar on a Mesos cluster

2015-03-31 Thread seglo
: http://apache-spark-user-list.1001560.n3.nabble.com/Can-t-run-spark-submit-with-an-application-jar-on-a-Mesos-cluster-tp22277p22331.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Spark-submit not working when application jar is in hdfs

2015-03-30 Thread nsalian
Client mode would not support HDFS jar extraction. I tried this: sudo -u hdfs spark-submit --class org.apache.spark.examples.SparkPi --deploy-mode cluster --master yarn hdfs:///user/spark/spark-examples-1.2.0-cdh5.3.2-hadoop2.5.0-cdh5.3.2.jar 10 And it worked. -- View this message in context

Re: Can't run spark-submit with an application jar on a Mesos cluster

2015-03-29 Thread seglo
-606645514-5050-2744-0037 There's nothing in mesos-slave.ERROR for this framework ID. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-t-run-spark-submit-with-an-application-jar-on-a-Mesos-cluster-tp22277p22282.html Sent from the Apache Spark User List mailing

Re: Can't run spark-submit with an application jar on a Mesos cluster

2015-03-29 Thread Timothy Chen
of this question where I try to submit the application by referring to it on HDFS is very similar to the recent question Spark-submit not working when application jar is in hdfs http://apache-spark-user-list.1001560.n3.nabble.com/Spark-submit-not-working-when-application-jar-is-in-hdfs-td21840.html

Can't run spark-submit with an application jar on a Mesos cluster

2015-03-29 Thread seglo
-submit --class org.apache.spark.examples.SparkPi --master mesos://10.173.40.36:5050 ~/spark-1.3.0-bin-hadoop2.4/lib/spark-examples-1.3.0-hadoop2.4.0.jar 100 And the output: jclouds@development-5159-d9:~/learning-spark$ ~/spark-1.3.0-bin-hadoop2.4/bin/spark-submit --class

Re: Can't run spark-submit with an application jar on a Mesos cluster

2015-03-29 Thread seglo
The latter part of this question where I try to submit the application by referring to it on HDFS is very similar to the recent question Spark-submit not working when application jar is in hdfs http://apache-spark-user-list.1001560.n3.nabble.com/Spark-submit-not-working-when-application-jar

Re: Can't run spark-submit with an application jar on a Mesos cluster

2015-03-29 Thread hbogert
/20150329-232522-84118794-5050-18181-/executors/5/runs/latest/stderr -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-t-run-spark-submit-with-an-application-jar-on-a-Mesos-cluster-tp22277p22280.html Sent from the Apache Spark User List mailing list archive

Re: Spark-submit not working when application jar is in hdfs

2015-03-29 Thread dilm
Made it work by using yarn-cluster as master instead of local. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-submit-not-working-when-application-jar-is-in-hdfs-tp21840p22281.html Sent from the Apache Spark User List mailing list archive

Re: Spark-submit not working when application jar is in hdfs

2015-03-28 Thread rrussell25
Hi, did you resolve this issue or just work around it by keeping your application jar local? Running into the same issue with 1.3. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-submit-not-working-when-application-jar-is-in-hdfs-tp21840p22272.html

Re: Spark-submit not working when application jar is in hdfs

2015-03-28 Thread Ted Yu
a spark application using bin/spark-submit. When I reference my application jar inside my local filesystem, it works. However, when I copied my application jar to a directory in hdfs, I get the following exception: Warning: Skip remote jar hdfs://localhost:9000/user/hdfs/jars/simple-project-1.0

Re: What is best way to run spark job in yarn-cluster mode from java program(servlet container) and NOT using spark-submit command.

2015-03-26 Thread Noorul Islam K M
Sandy Ryza sandy.r...@cloudera.com writes: Creating a SparkContext and setting master as yarn-cluster unfortunately will not work. SPARK-4924 added APIs for doing this in Spark, but won't be included until 1.4. -Sandy Did you look into something like [1]? With that you can make rest API

Re: What is best way to run spark job in yarn-cluster mode from java program(servlet container) and NOT using spark-submit command.

2015-03-26 Thread Sandy Ryza
Creating a SparkContext and setting master as yarn-cluster unfortunately will not work. SPARK-4924 added APIs for doing this in Spark, but won't be included until 1.4. -Sandy On Tue, Mar 17, 2015 at 3:19 AM, Akhil Das ak...@sigmoidanalytics.com wrote: Create SparkContext set master as

Re: netlib-java cannot load native lib in Windows when using spark-submit

2015-03-23 Thread Xi Shen
option ? spark-submit --driver-library-path /opt/hadoop/lib/native ... Cheers On Sat, Mar 21, 2015 at 4:58 PM, Xi Shen davidshe...@gmail.com wrote: Hi, I use the *OpenBLAS* DLL, and have configured my application to work in IDE. When I start my Spark application from IntelliJ IDE, I can see

Re: netlib-java cannot load native lib in Windows when using spark-submit

2015-03-22 Thread Xi Shen
implementation from: com.github.fommil.netlib.NativeRefBLAS From the Spark UI, I can see: spark.driver.extraLibrary c:\openblas Thanks, David On Sun, Mar 22, 2015 at 11:45 AM Ted Yu yuzhih...@gmail.com wrote: Can you try the --driver-library-path option ? spark-submit --driver-library-path

Re: netlib-java cannot load native lib in Windows when using spark-submit

2015-03-22 Thread Ted Yu
: spark.driver.extraLibrary c:\openblas Thanks, David On Sun, Mar 22, 2015 at 11:45 AM Ted Yu yuzhih...@gmail.com wrote: Can you try the --driver-library-path option ? spark-submit --driver-library-path /opt/hadoop/lib/native ... Cheers On Sat, Mar 21, 2015 at 4:58 PM, Xi Shen

Re: netlib-java cannot load native lib in Windows when using spark-submit

2015-03-22 Thread Burak Yavuz
...@gmail.com wrote: Can you try the --driver-library-path option ? spark-submit --driver-library-path /opt/hadoop/lib/native ... Cheers On Sat, Mar 21, 2015 at 4:58 PM, Xi Shen davidshe...@gmail.com wrote: Hi, I use the *OpenBLAS* DLL, and have configured my application to work in IDE. When I

netlib-java cannot load native lib in Windows when using spark-submit

2015-03-21 Thread Xi Shen
Hi, I use the *OpenBLAS* DLL, and have configured my application to work in IDE. When I start my Spark application from IntelliJ IDE, I can see in the log that the native lib is loaded successfully. But if I use *spark-submit* to start my application, the native lib still cannot be loaded. I saw

Re: netlib-java cannot load native lib in Windows when using spark-submit

2015-03-21 Thread Ted Yu
Can you try the --driver-library-path option ? spark-submit --driver-library-path /opt/hadoop/lib/native ... Cheers On Sat, Mar 21, 2015 at 4:58 PM, Xi Shen davidshe...@gmail.com wrote: Hi, I use the *OpenBLAS* DLL, and have configured my application to work in IDE. When I start my Spark

Re: Spark-submit and multiple files

2015-03-20 Thread Guillaume Charhon
: $ bin/spark-submit --py-files work.py main.py On Tue, Mar 17, 2015 at 3:29 AM, poiuytrez guilla...@databerries.com wrote: Hello guys, I am having a hard time understanding how spark-submit behaves with multiple files. I have created two code snippets. Each code snippet is composed

What is the jvm size when start spark-submit through local mode

2015-03-20 Thread Shuai Zheng
Hi, I am curious: when I start a Spark program in local mode, which parameter is used to decide the JVM memory size for the executor? In theory it should be: --executor-memory 20G. But I remember that local mode only starts the Spark executor in the same process as the driver, so it should be:
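
For reference, a sketch reflecting how local mode behaves: the executor runs inside the driver JVM, so the driver memory setting is the one that applies (class and jar are placeholders):

    # local mode: executor lives in the driver JVM, so size the driver heap
    spark-submit --master local[*] --driver-memory 20G \
      --class com.example.MyApp myapp.jar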

Re: Spark-submit and multiple files

2015-03-20 Thread Davies Liu
the error log. On Thu, Mar 19, 2015 at 8:03 PM, Davies Liu dav...@databricks.com wrote: You could submit additional Python source via --py-files , for example: $ bin/spark-submit --py-files work.py main.py On Tue, Mar 17, 2015 at 3:29 AM, poiuytrez guilla...@databerries.com wrote: Hello guys

Re: Error when using multiple python files spark-submit

2015-03-20 Thread Guillaume Charhon
I see. I will try the other way around. On Thu, Mar 19, 2015 at 8:06 PM, Davies Liu dav...@databricks.com wrote: the options of spark-submit should come before main.py, or they will become the options of main.py, so it should be: ../hadoop/spark-install/bin/spark-submit --py-files

Re: Spark-submit and multiple files

2015-03-20 Thread Petar Zecevic
I tried your program in yarn-client mode and it worked with no exception. This is the command I used: spark-submit --master yarn-client --py-files work.py main.py (Spark 1.2.1) On 20.3.2015. 9:47, Guillaume Charhon wrote: Hi Davies, I am already using --py-files. The system does use

Re: Error when using multiple python files spark-submit

2015-03-19 Thread Davies Liu
the options of spark-submit should come before main.py, or they will become the options of main.py, so it should be: ../hadoop/spark-install/bin/spark-submit --py-files /home/poiuytrez/naive.py,/home/poiuytrez/processing.py,/home/poiuytrez/settings.py --master spark://spark-m:7077 main.py
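
The corrected ordering from this reply, reflowed for readability (paths as given in the thread):

    # all spark-submit options must come before the primary script;
    # anything after main.py is passed to main.py as its own arguments
    ../hadoop/spark-install/bin/spark-submit \
      --py-files /home/poiuytrez/naive.py,/home/poiuytrez/processing.py,/home/poiuytrez/settings.py \
      --master spark://spark-m:7077 \
      main.py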

Re: Spark-submit and multiple files

2015-03-19 Thread Davies Liu
You could submit additional Python source via --py-files, for example: $ bin/spark-submit --py-files work.py main.py On Tue, Mar 17, 2015 at 3:29 AM, poiuytrez guilla...@databerries.com wrote: Hello guys, I am having a hard time understanding how spark-submit behaves with multiple files. I

Re: What is best way to run spark job in yarn-cluster mode from java program(servlet container) and NOT using spark-submit command.

2015-03-17 Thread Akhil Das
Create a SparkContext, set master as yarn-cluster, then run it as a standalone program? Thanks Best Regards On Tue, Mar 17, 2015 at 1:27 AM, rrussell25 rrussel...@gmail.com wrote: Hi, were you ever able to determine a satisfactory approach for this problem? I have a similar situation and would

Spark-submit and multiple files

2015-03-17 Thread poiuytrez
Hello guys, I am having a hard time understanding how spark-submit behaves with multiple files. I have created two code snippets. Each code snippet is composed of a main.py and a work.py. The code works if I paste work.py and then main.py into a pyspark shell. However, neither snippet works when using

Error when using multiple python files spark-submit

2015-03-16 Thread poiuytrez
I have a spark app which is composed of multiple files. When I launch Spark using: ../hadoop/spark-install/bin/spark-submit main.py --py-files /home/poiuytrez/naive.py,/home/poiuytrez/processing.py,/home/poiuytrez/settings.py --master spark://spark-m:7077 I am getting an error: 15

Re: What is best way to run spark job in yarn-cluster mode from java program(servlet container) and NOT using spark-submit command.

2015-03-16 Thread rrussell25
Hi, were you ever able to determine a satisfactory approach for this problem? I have a similar situation and would prefer to execute the job directly from java code within my jms listener and/or servlet container. -- View this message in context:

Re: Temp directory used by spark-submit

2015-03-11 Thread Akhil Das
, I notice that when I run spark-submit, a temporary directory containing all the jars and resource files is created under /tmp (for example, /tmp/spark-fd1b77fc-50f4-4b1c-a122-5cf36969407c). Sometimes this directory gets cleanup after the job, but sometimes it doesn't, which fills up my root

Spark UI and running spark-submit with --master yarn

2015-03-02 Thread Anupama Joshi
the Spark UI but I cannot see my job running. When I run spark-submit with --master local[*], I see the Spark UI, my job, everything (that's great). Do I need to change some settings to see the UI? Thanks -AJ

Re: Spark UI and running spark-submit with --master yarn

2015-03-02 Thread Marcelo Vanzin
That's the RM's RPC port, not the web UI port. (See Ted's e-mail - normally web UI is on 8088.) On Mon, Mar 2, 2015 at 4:14 PM, Anupama Joshi anupama.jo...@gmail.com wrote: Hi Marcelo, Thanks for the quick reply. I have a EMR cluster and I am running the spark-submit on the master node

Re: Spark UI and running spark-submit with --master yarn

2015-03-02 Thread Ted Yu
Default RM Web UI port is 8088 (configurable through yarn.resourcemanager.webapp.address) Cheers On Mon, Mar 2, 2015 at 4:14 PM, Anupama Joshi anupama.jo...@gmail.com wrote: Hi Marcelo, Thanks for the quick reply. I have a EMR cluster and I am running the spark-submit on the master node

Re: Spark UI and running spark-submit with --master yarn

2015-03-02 Thread Marcelo Vanzin
What are you calling masternode? In yarn-cluster mode, the driver is running somewhere in your cluster, not on the machine where you run spark-submit. The easiest way to get to the Spark UI when using Yarn is to use the Yarn RM's web UI. That will give you a link to the application's UI

Re: Spark UI and running spark-submit with --master yarn

2015-03-02 Thread Anupama Joshi
Hi Marcelo, Thanks for the quick reply. I have an EMR cluster and I am running spark-submit on the master node in the cluster. When I start spark-submit, I see 15/03/02 23:48:33 INFO client.RMProxy: Connecting to ResourceManager at /172.31.43.254:9022 But if I try that URL or the use

Re: Spark UI and running spark-submit with --master yarn

2015-03-02 Thread Marcelo Vanzin
. (See Ted's e-mail - normally web UI is on 8088.) On Mon, Mar 2, 2015 at 4:14 PM, Anupama Joshi anupama.jo...@gmail.com wrote: Hi Marcelo, Thanks for the quick reply. I have a EMR cluster and I am running the spark-submit on the master node in the cluster. When I start the spark-submit

Re: What joda-time dependency does spark submit use/need?

2015-03-02 Thread Su She
option. It would be helpful to know if the error I am getting is because of spark-submit or the Java app. Thank you! Exception in thread main java.lang.NoClassDefFoundError: org/joda/time/format/DateTimeFormat at com.amazonaws.auth.AWS4Signer.<clinit>(AWS4Signer.java:44

What joda-time dependency does spark submit use/need?

2015-02-27 Thread Su She
Hello Everyone, I'm having some issues launching (non-Spark) applications via the spark-submit command. The common error I am getting is copied below. I am able to submit a Spark Streaming/Kafka Spark application, but can't start a DynamoDB Java app. The common error is related to joda-time. 1) I

Re: What joda-time dependency does spark submit use/need?

2015-02-27 Thread Todd Nist
You can specify these jars (joda-time-2.7.jar, joda-convert-1.7.jar) either as part of your build and assembly or via the --jars option to spark-submit. HTH. On Fri, Feb 27, 2015 at 2:48 PM, Su She suhsheka...@gmail.com wrote: Hello Everyone, I'm having some issues launching (non-spark
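
A sketch of the --jars variant of this suggestion; the main class and application jar are placeholders:

    # ship the joda jars alongside the application (versions from this reply)
    spark-submit --jars joda-time-2.7.jar,joda-convert-1.7.jar \
      --class com.example.MyApp myapp.jar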

Spark-submit not working when application jar is in hdfs

2015-02-26 Thread dilm
I'm trying to run a spark application using bin/spark-submit. When I reference my application jar inside my local filesystem, it works. However, when I copied my application jar to a directory in hdfs, I get the following exception: Warning: Skip remote jar hdfs://localhost:9000/user/hdfs/jars

What is best way to run spark job in yarn-cluster mode from java program(servlet container) and NOT using spark-submit command.

2015-02-25 Thread kshekhram
I do a Spark upgrade if I use this method. 2. I can use the *spark-submit* command-line shell to submit my jobs. But to trigger it from my web application I need to use either the Java ProcessBuilder API or some package built on Java ProcessBuilder. This has 2 issues. First, it doesn't sound like a clean way

Re: Can't I mix non-Spark properties into a .properties file and pass it to spark-submit via --properties-file?

2015-02-18 Thread Emre Sevinc
properties file into the über JAR anymore, I can modify the file on the web server and when I submit my application via spark-submit, passing the URL of the properties file, the driver program reads the contents of that file for once, retrieves the values of the keys and continues. PS: I've opted

Re: Can't I mix non-Spark properties into a .properties file and pass it to spark-submit via --properties-file?

2015-02-17 Thread Charles Feduke
spark-submit ... --conf spark.driver.extraJavaOptions=-DpropertiesFile=/home/emre/data/myModule.properties But when I try to retrieve the value of propertiesFile via System.err.println(propertiesFile : + System.getProperty(propertiesFile)); I get NULL: propertiesFile : null
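
The invocation under discussion, reflowed as a sketch (main class and jar are placeholders; quoting keeps the embedded property intact):

    # pass a JVM system property to the driver via extraJavaOptions
    spark-submit \
      --conf "spark.driver.extraJavaOptions=-DpropertiesFile=/home/emre/data/myModule.properties" \
      --class com.example.MyApp myapp.jar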
