Re: Submitting Spark Applications using Spark Submit

2015-06-22 Thread Andrew Or
Did you restart your master / workers? On the master node, run `sbin/stop-all.sh` followed by `sbin/start-all.sh`. 2015-06-20 17:59 GMT-07:00 Raghav Shankar raghav0110...@gmail.com: Hey Andrew, I tried the following approach: I modified my Spark build on my local machine. I downloaded
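A minimal sketch of that restart, run from $SPARK_HOME on the master node (the sbin scripts reach every worker listed in conf/slaves):

    sbin/stop-all.sh    # stop the master and all registered workers
    sbin/start-all.sh   # start them again, picking up the new jars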

Re: Submitting Spark Applications using Spark Submit

2015-06-20 Thread Raghav Shankar
Hey Andrew, I tried the following approach: I modified my Spark build on my local machine. I downloaded the Spark 1.4.0 source code and then made a change to ResultTask.scala (a simple change to see if it worked: I added a print statement). Now, I built Spark using mvn

Re: Submitting Spark Applications using Spark Submit

2015-06-19 Thread Andrew Or
Hi Raghav, If you want to make changes to Spark and run your application with it, you may follow these steps. 1. git clone git@github.com:apache/spark 2. cd spark; build/mvn clean package -DskipTests [...] 3. make local changes 4. build/mvn package -DskipTests [...] (no need to clean again
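As a concrete sketch of that loop (the application class, jar path, and master URL below are placeholders):

    git clone git@github.com:apache/spark
    cd spark
    build/mvn clean package -DskipTests      # first build: clean + full package
    # ...edit the sources locally...
    build/mvn package -DskipTests            # rebuild without cleaning
    bin/spark-submit --master spark://master-host:7077 \
      --class com.example.MyApp /path/to/my-app.jar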

Re: Submitting Spark Applications using Spark Submit

2015-06-19 Thread Andrew Or
Hi Raghav, I'm assuming you're using standalone mode. When using the Spark EC2 scripts you need to make sure that every machine has the most up-to-date jars. Once you have built on one of the nodes, you must *rsync* the Spark directory to the rest of the nodes (see /root/spark-ec2/copy-dir). That
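A sketch of that sync step on a spark-ec2 cluster, assuming the rebuilt Spark lives in /root/spark on the master:

    # push the directory to every slave known to the EC2 scripts
    /root/spark-ec2/copy-dir /root/spark
    # or equivalently by hand, once per worker:
    # rsync -avz --delete /root/spark/ root@worker-host:/root/spark/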

Re: Submitting Spark Applications using Spark Submit

2015-06-19 Thread Raghav Shankar
Thanks Andrew! Is this all I have to do when using the spark ec2 script to set up a Spark cluster? It seems to be getting an assembly jar that is not from my project (perhaps from a Maven repo). Is there a way to make the ec2 script use the assembly jar that I created? Thanks, Raghav On Friday,

Re: Submitting Spark Applications using Spark Submit

2015-06-18 Thread lovelylavs
Hi, To include the jar files as part of the jar you would like to use, you should create an uber jar. Please refer to the following: https://maven.apache.org/plugins/maven-shade-plugin/examples/includes-excludes.html
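Once the shade plugin is wired into the pom per the link above, the build-and-submit flow might look like this (artifact and class names are hypothetical):

    mvn clean package                  # the shade plugin attaches the uber jar during package
    bin/spark-submit --class com.example.MyApp \
      target/my-app-1.0-shaded.jar     # shaded jar name depends on the pom configuration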

Re: Submitting Spark Applications using Spark Submit

2015-06-18 Thread maxdml
You can specify the jars of your application to be included with spark-submit using the --jars switch. Otherwise, are you sure that your newly compiled Spark assembly jar is in assembly/target/scala-2.10/?
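For example, with placeholder paths:

    bin/spark-submit --class com.example.MyApp \
      --jars /path/to/dep1.jar,/path/to/dep2.jar \
      /path/to/my-app.jar
    # --jars takes a comma-separated list; the jars are shipped to the driver and executors

    # to check for the locally built assembly mentioned above:
    ls assembly/target/scala-2.10/spark-assembly-*.jar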

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Yanbo Liang
If you run Spark on YARN, the simplest way is to replace the $SPARK_HOME/lib/spark-.jar with your own version of the Spark jar file and run your application. The spark-submit script will upload this jar to the YARN cluster automatically and then you can run your application as usual. It does not care about
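A sketch of that jar swap, assuming a Spark 1.x layout where the assembly sits under $SPARK_HOME/lib (exact jar names vary by version and Hadoop profile, so wildcards are used here):

    # back up the stock assembly, then drop in the custom build
    mv $SPARK_HOME/lib/spark-assembly-*.jar /tmp/
    cp assembly/target/scala-2.10/spark-assembly-*.jar $SPARK_HOME/lib/
    bin/spark-submit --master yarn-client --class com.example.MyApp /path/to/my-app.jar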

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Raghav Shankar
To clarify, I am using the Spark standalone cluster. On Tuesday, June 16, 2015, Yanbo Liang yblia...@gmail.com wrote: If you run Spark on YARN, the simplest way is to replace the $SPARK_HOME/lib/spark-.jar with your own version of the Spark jar file and run your application. The spark-submit

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Will Briggs
In general, you should avoid making direct changes to the Spark source code. If you are using Scala, you can seamlessly blend your own methods on top of the base RDDs using implicit conversions. Regards, Will On June 16, 2015, at 7:53 PM, raggy raghav0110...@gmail.com wrote: I am trying to

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Raghav Shankar
I made the change so that I could implement top() using treeReduce(). A member on here suggested I make the change in RDD.scala to accomplish that. Also, this is for a research project, and not for commercial use. So, any advice on how I can get spark-submit to use my custom-built jars

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Will Briggs
If this is research-only, and you don't want to have to worry about updating the jars installed by default on the cluster, you can add your custom Spark jar using the spark.driver.extraLibraryPath configuration property when running spark-submit, and then use the experimental
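A minimal sketch of that invocation, with placeholder paths and class names (spark.driver.extraClassPath is shown alongside the property named above, since extraLibraryPath covers native libraries while extraClassPath is the usual property for jars):

    bin/spark-submit --class com.example.MyApp \
      --conf spark.driver.extraClassPath=/path/to/custom-spark.jar \
      --conf spark.driver.userClassPathFirst=true \
      /path/to/my-app.jar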

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Raghav Shankar
The documentation says spark.driver.userClassPathFirst can only be used in cluster mode. Does this mean I have to set the --deploy-mode option for spark-submit to cluster? Or can I still use the default client mode? My understanding is that even in the default deploy mode, Spark still uses the
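For reference, the two modes on a standalone master look like this (host and jar are placeholders; client is the default when --deploy-mode is omitted):

    # client mode: the driver runs inside the spark-submit process
    bin/spark-submit --master spark://master-host:7077 --deploy-mode client \
      --class com.example.MyApp /path/to/my-app.jar
    # cluster mode: the driver is launched on one of the worker nodes
    bin/spark-submit --master spark://master-host:7077 --deploy-mode cluster \
      --class com.example.MyApp /path/to/my-app.jar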