I have a Mesos cluster that runs Marathon.

I am using Marathon to launch a long-running Spark Streaming job that
consumes a Kafka input stream.

With one worker node in the cluster, I can successfully launch the driver
job in Marathon, which in turn launches a task in Mesos via Spark (Spark is
running in coarse-grained mode), and that task consumes from Kafka just fine.

When I add nodes to the cluster and start the driver job, the first Spark
Mesos task runs fine for a few minutes, then exits (exit code 1). The tasks
that are subsequently launched just sit there doing nothing, with this being
their last log output:


14/03/26 19:06:47 INFO slf4j.Slf4jLogger: Slf4jLogger started
14/03/26 19:06:47 INFO Remoting: Starting remoting
14/03/26 19:06:47 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor@ip-xxx.ec2.internal:34488]
14/03/26 19:06:47 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkExecutor@xxxx.ec2.internal:34488]
14/03/26 19:06:47 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://spark@xxxx.ec2.internal:35332/user/CoarseGrainedScheduler
14/03/26 19:06:47 INFO executor.CoarseGrainedExecutorBackend: Successfully registered with driver
14/03/26 19:06:48 INFO slf4j.Slf4jLogger: Slf4jLogger started
14/03/26 19:06:48 INFO Remoting: Starting remoting
14/03/26 19:06:48 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@xxxx.ec2.internal:59070]
14/03/26 19:06:48 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@xxxx.ec2.internal:59070]
14/03/26 19:06:48 INFO spark.SparkEnv: Connecting to BlockManagerMaster: akka.tcp://spark@xxxx.ec2.internal:35332/user/BlockManagerMaster
14/03/26 19:06:48 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20140326190648-fa77
14/03/26 19:06:48 INFO storage.MemoryStore: MemoryStore started with capacity 294.4 MB.
14/03/26 19:06:48 INFO network.ConnectionManager: Bound socket to port 55018 with id = ConnectionManagerId(xxxx.ec2.internal,55018)
14/03/26 19:06:48 INFO storage.BlockManagerMaster: Trying to register BlockManager
14/03/26 19:06:48 INFO storage.BlockManagerMaster: Registered BlockManager
14/03/26 19:06:48 INFO spark.SparkEnv: Connecting to MapOutputTracker: akka.tcp://spark@xxxx.ec2.internal:35332/user/MapOutputTracker
14/03/26 19:06:48 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-15f43b7b-6f7c-48dd-8bd8-5663a00fd314
14/03/26 19:06:48 INFO spark.HttpServer: Starting HTTP Server
14/03/26 19:06:48 INFO server.Server: jetty-7.x.y-SNAPSHOT
14/03/26 19:06:48 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:45286



Any pointers on how to un-stick this?






--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Kafka-Mesos-Marathon-strangeness-tp3285.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
