This is not an independent, programmatic way of running a Spark job on a Yarn
cluster.

That example demonstrates running in *Yarn-client* mode, and it also depends
on Jetty. Users writing Spark programs do not want to depend on that.

I found the SparkLauncher class introduced in Spark 1.4 (
https://github.com/apache/spark/tree/master/launcher), which allows launching
Spark jobs programmatically.

SparkLauncher exists in the Java and Scala APIs, but I could not find it in
the Python API.

I have not tried it yet, but it seems promising.

Example:

import org.apache.spark.launcher.SparkLauncher;

public class MyLauncher {

  public static void main(String[] args) throws Exception {
    Process spark = new SparkLauncher()
      .setAppResource("/my/app.jar")
      .setMainClass("my.spark.app.Main")
      .setMaster("local")
      .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
      .launch();
    spark.waitFor();  // wait for the launched job to finish
  }
}
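
For the Yarn-cluster case from the original question, I would guess the same
launcher can be pointed at YARN instead of local mode. A rough, untested
sketch (the Spark home path, jar path, class name and app details below are
placeholders, not something I have verified myself):

import org.apache.spark.launcher.SparkLauncher;

public class YarnLauncherSketch {

  public static void main(String[] args) throws Exception {
    // Untested sketch: all paths and names below are placeholders.
    Process spark = new SparkLauncher()
      .setSparkHome("/path/to/spark")              // Spark distribution on the launching machine
      .setAppResource("/my/app.jar")               // for a Python job this could be a .py file instead
      .setMainClass("my.spark.app.Main")           // not needed for Python apps
      .setMaster("yarn-cluster")                   // run the driver inside the YARN cluster
      .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
      .launch();                                   // starts the job as a child process
    spark.waitFor();                               // block until the job exits
  }
}

Again, I have not run this, so treat the master string and the Python note as
assumptions to check against the 1.4 docs.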



On Wed, Jun 17, 2015 at 5:51 PM, Corey Nolet <cjno...@gmail.com> wrote:

> An example of being able to do this is provided in the Spark Jetty Server
> project [1]
>
> [1] https://github.com/calrissian/spark-jetty-server
>
> On Wed, Jun 17, 2015 at 8:29 PM, Elkhan Dadashov <elkhan8...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> Is there any way to run a Spark job programmatically on a Yarn cluster
>> without using the spark-submit script?
>>
>> I cannot include Spark jars in my Java application (due to dependency
>> conflicts and other reasons), so I'll be shipping the Spark assembly uber jar
>> (spark-assembly-1.3.1-hadoop2.3.0.jar) to the Yarn cluster and then executing
>> the job (Python or Java) in Yarn-cluster mode.
>>
>> So is there any way to run a Spark job implemented in a Python file/Java
>> class without calling it through the spark-submit script?
>>
>> Thanks.
>>
>>
>>
>


-- 

Best regards,
Elkhan Dadashov
