Hi Elkhan,

There are a couple of ways to do this.
1) Spark-jobserver is a popular web server used to submit Spark jobs: https://github.com/spark-jobserver/spark-jobserver

2) The spark-submit script sets up the classpath for the job. Bypassing spark-submit means you have to manage some of that work in your program itself. Here is a link with some discussion around how to handle this scenario: http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/What-dependencies-to-submit-Spark-jobs-programmatically-not-via/td-p/24721

Guru Medasani
gdm...@gmail.com

> On Jun 17, 2015, at 6:01 PM, Elkhan Dadashov <elkhan8...@gmail.com> wrote:
>
> This is not an independent, programmatic way of running a Spark job on a YARN cluster.
>
> That example demonstrates running in yarn-client mode, and it also depends on Jetty. Users writing Spark programs do not want to depend on that.
>
> I found the SparkLauncher class introduced in Spark 1.4 (https://github.com/apache/spark/tree/master/launcher), which allows running Spark jobs programmatically.
>
> SparkLauncher exists in the Java and Scala APIs, but I could not find it in the Python API.
>
> I have not tried it yet, but it seems promising.
>
> Example:
>
> import org.apache.spark.launcher.SparkLauncher;
>
> public class MyLauncher {
>
>     public static void main(String[] args) throws Exception {
>         Process spark = new SparkLauncher()
>                 .setAppResource("/my/app.jar")
>                 .setMainClass("my.spark.app.Main")
>                 .setMaster("local")
>                 .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
>                 .launch();
>         spark.waitFor();
>     }
> }
>
>
> On Wed, Jun 17, 2015 at 5:51 PM, Corey Nolet <cjno...@gmail.com> wrote:
> An example of being able to do this is provided in the Spark Jetty Server project [1].
>
> [1] https://github.com/calrissian/spark-jetty-server
>
> On Wed, Jun 17, 2015 at 8:29 PM, Elkhan Dadashov <elkhan8...@gmail.com> wrote:
> Hi all,
>
> Is there any way to run a Spark job programmatically on a YARN cluster without using the spark-submit script?
>
> I cannot include the Spark jars in my Java application (due to dependency conflicts and other reasons), so I'll be shipping the Spark assembly uber jar (spark-assembly-1.3.1-hadoop2.3.0.jar) to the YARN cluster and then executing the job (Python or Java) in yarn-cluster mode.
>
> So is there any way to run a Spark job implemented in a Python file/Java class without calling it through the spark-submit script?
>
> Thanks.
>
>
> --
> Best regards,
> Elkhan Dadashov
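
For completeness, here is a minimal sketch of pointing that SparkLauncher example at a YARN cluster instead of local mode. This assumes Spark 1.4+; /opt/spark and /etc/hadoop/conf are placeholder paths for the Spark distribution and the cluster's Hadoop configuration:

import java.util.HashMap;
import java.util.Map;

import org.apache.spark.launcher.SparkLauncher;

public class MyYarnLauncher {

    public static void main(String[] args) throws Exception {
        // SparkLauncher spawns spark-submit under the hood, so it needs to
        // know where Spark lives and where the cluster's Hadoop config is.
        Map<String, String> env = new HashMap<String, String>();
        env.put("HADOOP_CONF_DIR", "/etc/hadoop/conf"); // placeholder path

        Process spark = new SparkLauncher(env)
                .setSparkHome("/opt/spark")             // placeholder path
                .setAppResource("/my/app.jar")          // same jar as in the example above
                .setMainClass("my.spark.app.Main")
                .setMaster("yarn-cluster")              // Spark 1.4-era master string
                .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
                .launch();

        // Block until the child spark-submit process exits.
        int exitCode = spark.waitFor();
        System.out.println("spark-submit exited with code " + exitCode);
    }
}

One caveat: SparkLauncher does not bypass spark-submit entirely; launch() starts spark-submit as a child process. What it gives you is programmatic control from your own JVM, and since the launcher module is a small, dependency-light library, your application should not need to pull the full Spark assembly onto its classpath.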