My application source file, *SimpleDistributedApp.scala*, is as follows:
__________________________________________________________________
import org.apache.spark.{SparkConf, SparkContext}

object SimpleDistributedApp {
  def main(args: Array[String]): Unit = {
    val filepath = "hdfs://hadoop-1.certus.com:54310/user/root/samples/data"
    val conf = new SparkConf()
      .setMaster("spark://hadoop-1.certus.com:7077")
      .setAppName("SimpleDistributedApp")
      .setSparkHome("/home/xt/soft/spark-0.9.0-incubating-bin-hadoop1")
      .setJars(Array("target/scala-2.10/simple-distributed-app_2.10-1.0.jar"))
      .set("spark.executor.memory", "1g")
    val sc = new SparkContext(conf)
    // Read the HDFS file with a minimum of 3 partitions
    val text = sc.textFile(filepath, 3)
    val numOfHello = text.filter(line => line.contains("hello")).count()
    println("number of lines containing 'hello' is " + numOfHello)
    println("done")
  }
}
______________________________________________________________________
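The filter-and-count step is plain Scala, so it can be sanity-checked without a cluster; a minimal sketch, using a hypothetical in-memory `Seq` of lines in place of the RDD:

```scala
// Same filter/count logic as the Spark job, applied to an in-memory
// collection so the result can be verified locally.
object FilterCountSketch {
  def main(args: Array[String]): Unit = {
    val lines = Seq("hello world", "no match here", "another hello line")
    // count the lines containing the substring "hello"
    val numOfHello = lines.count(line => line.contains("hello"))
    println("number of lines containing 'hello' is " + numOfHello) // prints 2
  }
}
```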
The corresponding sbt file, *$SPARK_HOME/simple.sbt*, is as follows:
_________________________________________________________________
name := "Simple Distributed App"
version := "1.0"
scalaVersion := "2.10.3"
libraryDependencies += "org.apache.spark" %% "spark-core" % "0.9.0-incubating"
resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
_________________________________________________________________
I built the application into
*$SPARK_HOME/target/scala-2.10/simple-distributed-app_2.10-1.0.jar*, using
the command
SPARK_HADOOP_VERSION=1.2.1 sbt/sbt package
I ran it using the command "sbt/sbt run" and it finished running
successfully.
But I'm not sure about the correct and general way to submit and run a job
on a Spark cluster. To be specific: after building a job into a JAR file,
say *simpleApp.jar*, where should I put it, and how should I submit it to
the Spark cluster?
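(For context: I'm aware that Spark releases from 1.0 onward ship a dedicated `bin/spark-submit` script as the general submission mechanism. A sketch of what that would look like, assuming the JAR path and standalone master URL from above, is:

```shell
bin/spark-submit \
  --class SimpleDistributedApp \
  --master spark://hadoop-1.certus.com:7077 \
  --executor-memory 1g \
  target/scala-2.10/simple-distributed-app_2.10-1.0.jar
```

But I'd still like to know the recommended approach on 0.9.x, and where the JAR should live relative to the cluster nodes.)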