I run Spark on Mesos either by launching spark-submit in a Docker
container through Marathon or from one of the nodes in the Mesos cluster.
I am on Mesos 0.21. I have tried both Spark 1.3.1 and 1.2.1, built
against Hadoop 2.4 and above.
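
For the Marathon route, the app definition looks roughly like this (a
minimal sketch only; the image name, master address, jar path, and
resource sizes are placeholders for your own values):

  {
    "id": "spark-driver",
    "cmd": "/opt/spark/bin/spark-submit --master mesos://10.0.0.1:5050 --class com.example.Main /opt/app/app.jar",
    "cpus": 1,
    "mem": 1024,
    "instances": 1,
    "container": {
      "type": "DOCKER",
      "docker": {
        "image": "example/spark-driver:latest",
        "network": "HOST"
      }
    }
  }

Host networking matters here, since the executors have to be able to
reach the driver on the address it advertises.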

Some details on the configuration: I made sure that Spark uses IP
addresses for all communication by
defining spark.driver.host, SPARK_PUBLIC_DNS, SPARK_LOCAL_IP, and SPARK_LOCAL_HOSTNAME
in the right places.
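
Concretely, something along these lines (a sketch only; every address,
path, and URI below is a placeholder, and the Mesos-specific entries are
the ones Stephen mentions below):

  # spark-env.sh on each node (addresses and paths are placeholders)
  export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
  export SPARK_LOCAL_IP=10.253.1.117
  export SPARK_PUBLIC_DNS=10.253.1.117
  export SPARK_LOCAL_HOSTNAME=10.253.1.117

  # spark-defaults.conf, or pass these via --conf on spark-submit
  spark.driver.host   10.253.1.117
  spark.master        mesos://10.0.0.1:5050
  spark.executor.uri  hdfs:///spark/spark-1.3.1-bin-hadoop2.6.tgz

Pinning everything to IP addresses avoids executors failing to resolve
the driver's hostname, which in my experience is one common cause of the
lost-executor errors below.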

Hope this helps.

Yang.

On Fri, Apr 24, 2015 at 5:15 PM, Stephen Carman <scar...@coldlight.com>
wrote:

> So I can’t for the life of me get something even simple working for
> Spark on Mesos.
>
> I installed a 3-master, 3-slave Mesos cluster, which is all configured,
> but I can’t even get the Spark shell to work properly.
>
> I get errors like this:
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in stage 0.0 failed 4 times, most recent failure: Lost task 5.3 in stage 0.0 (TID 23, 10.253.1.117): ExecutorLostFailure (executor 20150424-104711-1375862026-5050-20113-S1 lost)
> Driver stacktrace:
>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1204)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1193)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>         at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1192)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>         at scala.Option.foreach(Option.scala:236)
>         at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
>         at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>
> I tried both Mesos 0.21 and 0.22, and they both produce the same error…
>
> My version of Spark is 1.3.1 with Hadoop 2.6; I just downloaded the
> pre-built package from the site. Or is that wrong, and do I have to build it myself?
>
> I have MESOS_NATIVE_JAVA_LIBRARY, the Spark executor URI, and the Mesos
> master set in my spark-env.sh; to the best of my abilities they seem correct.
>
> Does anyone have any insight into this at all? I’m running this on Red Hat
> 7 with 8 CPU cores and 14 GB of RAM per slave, so 24 cores total and 42 GB of
> RAM total.
>
> Anyone have any idea at all what is going on here?
>
> Thanks,
> Steve