Thanks Akhil, that will help a lot!

It turned out that spark-jobserver does not work in "development mode", but
it does work if you deploy a server (it looks like the dependencies are not
right when running jobserver from sbt).

On Thu, Jan 1, 2015 at 5:22 AM, Akhil Das wrote:

> Hi Fernando,
> Here's a simple log parser/analyser written in Scala (you can run it
> without spark-shell/submit).
> Basically, to run a Spark job without spark-submit or the shell you need a
> build file, which will pull in all the dependencies, and the main program,
> in which you specify your cluster details while creating the
> SparkContext.
> Thanks
> Best Regards
> On Wed, Dec 31, 2014 at 10:54 PM, Fernando O. wrote:
>> Before jumping into a sea of dependencies and bash files:
>> Does anyone have an example of how to run a spark job without using
>> spark-submit or shell?
>> On Tue, Dec 30, 2014 at 3:23 PM, Fernando O. wrote:
>>> Hi all,
>>>     I'm investigating spark for a new project and I'm trying to use
>>> spark-jobserver because... I need to reuse and share RDDs and from what I
>>> read in the forum that's the "standard" :D
>>> Turns out that spark-jobserver doesn't seem to work on yarn, or at least
>>> it does not on 1.1.1
>>> My config is Spark 1.1.1 (moving to 1.2.0 soon) and Hadoop 2.6 (which
>>> seems compatible with 2.4 from Spark's point of view... at least I was
>>> able to run spark-submit and shell tasks in both yarn-client and
>>> yarn-cluster modes).
>>> Going back to my original point: I made some changes in spark-jobserver
>>> and now I can submit a job, but I get:
>>> ....
>>> [2014-12-30 18:20:19,769] INFO  e.spark.deploy.yarn.Client []
>>> [akka://JobServer/user/context-supervisor/f983d86e-spark.jobserver.WordCountExample]
>>> - Max mem capabililty of a single resource in this cluster 15000
>>> [2014-12-30 18:20:19,770] INFO  e.spark.deploy.yarn.Client []
>>> [akka://JobServer/user/context-supervisor/f983d86e-spark.jobserver.WordCountExample]
>>> - Preparing Local resources
>>> [2014-12-30 18:20:20,041] INFO  e.spark.deploy.yarn.Client []
>>> [akka://JobServer/user/context-supervisor/f983d86e-spark.jobserver.WordCountExample]
>>> - Prepared Local resources Map(__spark__.jar -> resource { scheme: "file"
>>> port: -1 file:
>>> "/home/ec2-user/.ivy2/cache/org.apache.spark/spark-yarn_2.10/jars/spark-yarn_2.10-1.1.1.jar"
>>> } size: 343226 timestamp: 1416429031000 type: FILE visibility: PRIVATE)
>>> [...]
>>> [2014-12-30 18:20:20,139] INFO  e.spark.deploy.yarn.Client []
>>> [akka://JobServer/user/context-supervisor/f983d86e-spark.jobserver.WordCountExample]
>>> - Yarn AM launch context:
>>> [2014-12-30 18:20:20,140] INFO  e.spark.deploy.yarn.Client []
>>> [akka://JobServer/user/context-supervisor/f983d86e-spark.jobserver.WordCountExample]
>>> -   class:   org.apache.spark.deploy.yarn.ExecutorLauncher
>>> [2014-12-30 18:20:20,140] INFO  e.spark.deploy.yarn.Client []
>>> [akka://JobServer/user/context-supervisor/f983d86e-spark.jobserver.WordCountExample]
>>> -   env:     Map(CLASSPATH ->
>>> $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/share/hadoop/common/*:$HADOOP_COMMON_HOME/share/hadoop/common/lib/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*:$HADOOP_YARN_HOME/share/hadoop/yarn/*:$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*:$PWD/__app__.jar:$PWD/*,
>>> .sparkStaging/application_1419963137232_0001/,
>>> 1416429031000, SPARK_YARN_CACHE_FILES ->
>>> file:/home/ec2-user/.ivy2/cache/org.apache.spark/spark-yarn_2.10/jars/spark-yarn_2.10-1.1.1.jar#__spark__.jar)
>>> [...]
>>> [2014-12-30 18:03:04,474] INFO  YarnClientSchedulerBackend []
>>> [akka://JobServer/user/context-supervisor/ebac0153-spark.jobserver.WordCountExample]
>>> - Application report from ASM:
>>>  appMasterRpcPort: -1
>>>  appStartTime: 1419962580444
>>>  yarnAppState: FAILED
>>> [2014-12-30 18:03:04,475] ERROR .jobserver.JobManagerActor []
>>> [akka://JobServer/user/context-supervisor/ebac0153-spark.jobserver.WordCountExample]
>>> - Failed to create context ebac0153-spark.jobserver.WordCountExample,
>>> shutting down actor
>>> org.apache.spark.SparkException: Yarn application already ended,might be
>>> killed or not able to launch application master.
>>> at
>>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApp(YarnClientSchedulerBackend.scala:117)
>>> at
>>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:93)
>>> In the Hadoop console I can see the detailed issue:
>>> Diagnostics: File
>>> file:/home/ec2-user/.ivy2/cache/org.apache.spark/spark-yarn_2.10/jars/spark-yarn_2.10-1.1.1.jar
>>> does not exist
>>> File
>>> file:/home/ec2-user/.ivy2/cache/org.apache.spark/spark-yarn_2.10/jars/spark-yarn_2.10-1.1.1.jar
>>> does not exist
>>> Now... it seems like Spark is actually trying to use, on the other
>>> nodes, a file that I used for launching the task.
>>> Can anyone point me in the right direction as to where that might be
>>> set?
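
One way to narrow this down is to look at the scheme of the resource URIs in the log above: `SPARK_YARN_CACHE_FILES` lists `__spark__.jar` with a `file:` URI, and a `file:` path is resolved on whichever node reads it, so a jar that exists only in the launching machine's Ivy cache will not be found elsewhere. The following is a minimal sketch in Python (standard library only) that flags such entries. The function name `node_local_resources` and the set of schemes treated as node-local are assumptions made for illustration, not part of Spark or spark-jobserver.

```python
from urllib.parse import urlparse

# Schemes treated as node-local here; this set is an assumption for illustration.
NODE_LOCAL_SCHEMES = {"", "file"}


def node_local_resources(cache_files):
    """Return the URIs in a comma-separated list that point at a node-local path."""
    flagged = []
    for entry in cache_files.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Drop the "#__spark__.jar" fragment, which is only the link name.
        uri = entry.split("#", 1)[0]
        if urlparse(uri).scheme in NODE_LOCAL_SCHEMES:
            flagged.append(uri)
    return flagged


# Value of SPARK_YARN_CACHE_FILES copied from the log above.
cache_files = (
    "file:/home/ec2-user/.ivy2/cache/org.apache.spark/spark-yarn_2.10/"
    "jars/spark-yarn_2.10-1.1.1.jar#__spark__.jar"
)

for uri in node_local_resources(cache_files):
    print("node-local resource:", uri)
```

If an entry is flagged, the Spark jar location is worth checking. Spark 1.1 documents a `spark.yarn.jar` setting for pointing at a jar in a shared location such as HDFS, though the thread does not establish that this is the cause here.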
