In a Hadoop cluster, the following command is the general way to submit a
job:
bin/hadoop jar <job-jar> <arguments>
Is there such a general way to submit a job into a Spark cluster?
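(For reference, a hedged sketch: the 0.9.0-incubating release discussed here has no single submission script, but Spark 1.0 and later ship bin/spark-submit, which plays the same role as bin/hadoop jar. The master URL, class name, and jar path below are taken from this thread's example and assume a 1.0+ installation.)

```shell
# Sketch only: spark-submit ships with Spark 1.0 and later,
# not with the 0.9.0-incubating release discussed in this thread.
bin/spark-submit \
  --class SimpleDistributedApp \
  --master spark://hadoop-1.certus.com:7077 \
  --executor-memory 1g \
  target/scala-2.10/simple-distributed-app_2.10-1.0.jar
```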
Besides, my job finished successfully, and the Spark Web UI shows this
application's state as *FINISHED*, but each executor's state as *KILLED*. I
can see that the application produced the expected result, so why is each
executor's state reported as *KILLED*?
Completed Applications

  ID:              app-20140220173957-0001 <http://hadoop-1.certus.com:8080/app?appId=app-20140220173957-0001>
  Name:            SimpleDistributedApp <http://hadoop-1.certus.com:4040/>
  Cores:           12
  Memory per Node: 1024.0 MB
  Submitted Time:  2014/02/20 17:39:57
  User:            root
  State:           FINISHED
  Duration:        13 s
Executor Summary

  ExecutorID: 2
  Worker:     worker-20140220162542-hadoop-2.certus.com-49805 <http://hadoop-2.certus.com:8081/>
  Cores:      4
  Memory:     1024
  State:      KILLED
  Logs:       stdout <http://hadoop-2.certus.com:8081/logPage?appId=app-20140220173957-0001&executorId=2&logType=stdout>
              stderr <http://hadoop-2.certus.com:8081/logPage?appId=app-20140220173957-0001&executorId=2&logType=stderr>

  ExecutorID: 1
  Worker:     worker-20140220162542-hadoop-4.certus.com-40528 <http://hadoop-4.certus.com:8081/>
  Cores:      4
  Memory:     1024
  State:      KILLED
  Logs:       stdout <http://hadoop-4.certus.com:8081/logPage?appId=app-20140220173957-0001&executorId=1&logType=stdout>
              stderr <http://hadoop-4.certus.com:8081/logPage?appId=app-20140220173957-0001&executorId=1&logType=stderr>

  ExecutorID: 0
  Worker:     worker-20140220162542-hadoop-3.certus.com-47386 <http://hadoop-3.certus.com:8081/>
  Cores:      4
  Memory:     1024
  State:      KILLED
  Logs:       stdout <http://hadoop-3.certus.com:8081/logPage?appId=app-20140220173957-0001&executorId=0&logType=stdout>
              stderr <http://hadoop-3.certus.com:8081/logPage?appId=app-20140220173957-0001&executorId=0&logType=stderr>
Thanks
Tao
2014-02-21 0:00 GMT+08:00 Mayur Rustagi <[email protected]>:
> You are specifying the Spark master in the jar:
> .setMaster("spark://hadoop-1.certus.com:7077")
> so "sbt run" deploys the jar to that cluster's master and runs it.
> Regards
> Mayur
>
> Mayur Rustagi
> Ph: +919632149971
> http://www.sigmoidanalytics.com
> https://twitter.com/mayur_rustagi
>
>
>
> On Thu, Feb 20, 2014 at 7:22 AM, Nan Zhu <[email protected]> wrote:
>
>> I'm not sure I understand your question correctly.
>>
>> Do you mean you didn't see the application information in the Spark Web UI
>> even though it generated the expected results?
>>
>> Best,
>>
>> --
>> Nan Zhu
>>
>> On Thursday, February 20, 2014 at 10:13 AM, Tao Xiao wrote:
>>
>> My application source file, *SimpleDistributedApp.scala*, is as
>> follows:
>>
>> __________________________________________________________________
>> import org.apache.spark.{SparkConf, SparkContext}
>>
>> object SimpleDistributedApp {
>>   def main(args: Array[String]) = {
>>     val filepath = "hdfs://hadoop-1.certus.com:54310/user/root/samples/data"
>>
>>     val conf = new SparkConf()
>>       .setMaster("spark://hadoop-1.certus.com:7077")
>>       .setAppName("SimpleDistributedApp")
>>       .setSparkHome("/home/xt/soft/spark-0.9.0-incubating-bin-hadoop1")
>>       .setJars(Array("target/scala-2.10/simple-distributed-app_2.10-1.0.jar"))
>>       .set("spark.executor.memory", "1g")
>>
>>     val sc = new SparkContext(conf)
>>     val text = sc.textFile(filepath, 3)
>>
>>     val numOfHello = text.filter(line => line.contains("hello")).count()
>>
>>     println("number of lines containing 'hello' is " + numOfHello)
>>     println("down")
>>   }
>> }
>> ______________________________________________________________________
>>
>>
>>
>> The corresponding sbt file, *$SPARK_HOME/simple.sbt*, is as follows:
>> _________________________________________________________________
>>
>> name := "Simple Distributed App"
>>
>> version := "1.0"
>>
>> scalaVersion := "2.10.3"
>>
>> libraryDependencies += "org.apache.spark" %% "spark-core" %
>> "0.9.0-incubating"
>>
>> resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
>> _________________________________________________________________
>>
>>
>> I built the application into
>> *$SPARK_HOME/target/scala-2.10/simple-distributed-app_2.10-1.0.jar*,
>> using the command
>> SPARK_HADOOP_VERSION=1.2.1 sbt/sbt package
>>
>> I ran it using the command "sbt/sbt run" and it finished running
>> successfully.
>>
>> But I'm not sure about the correct and general way to submit and run a
>> job in a Spark cluster. To be specific, after having built a job into a JAR
>> file, say *simpleApp.jar*, where should I put it and how should I submit
>> it to the Spark cluster?