Hi Tim,

I tested the scenario again with the settings below:

[dcos@agent spark-2.0.2-bin-hadoop2.7]$ cat conf/spark-defaults.conf
spark.deploy.recoveryMode  ZOOKEEPER
spark.deploy.zookeeper.url 192.168.111.53:2181
spark.deploy.zookeeper.dir /spark
spark.executor.memory 512M
spark.mesos.principal agent-dev-1


However, the case still failed: after the master restarted, the Spark framework did not 
re-register.
From the Spark framework log, it appears that the following method in MesosClusterScheduler 
was never called:
override def reregistered(driver: SchedulerDriver, masterInfo: MasterInfo): Unit
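
(For context on when this callback can fire: `reregistered` is only invoked if the scheduler driver reconnects to the new leading master while that master still remembers the framework, which at the Mesos level is governed by the `failover_timeout` field of the registered `FrameworkInfo`. A sketch in protobuf text form; the values are illustrative, not what Spark sets:)

```protobuf
# FrameworkInfo (mesos.proto) -- fields relevant to surviving a master restart.
# Values here are illustrative; by default failover_timeout is 0, in which
# case the master forgets the framework as soon as it disconnects.
name: "Spark Framework"
failover_timeout: 604800.0  # seconds the master retains a disconnected framework
checkpoint: true            # persist framework state on the agents
```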

Did I miss something? Any advice?



Thanks,

Jared, (韦煜)
Software developer
Interested in open source software, big data, Linux


________________________________
From: Timothy Chen <tnac...@gmail.com>
Sent: Friday, March 31, 2017 5:13 AM
To: Yu Wei
Cc: us...@spark.apache.org; dev
Subject: Re: [Spark on mesos] Spark framework not re-registered and lost after 
mesos master restarted

I think failover isn't enabled on the regular Spark job framework, since we assume 
jobs are more ephemeral.

It could be a good setting to add to the Spark framework to enable failover.
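
A toy sketch of that semantics (hypothetical names, not the real Mesos API): with a failover timeout of 0, the master drops the framework the moment it disconnects, so there is nothing left to re-register when the master comes back.

```java
// Hypothetical model, not the real Mesos API: after a master restart, a
// framework can re-register only if its failover timeout had not elapsed
// while it was disconnected.
public class FailoverSketch {
    // failoverTimeout: seconds the master retains a disconnected framework.
    static boolean canReregister(double failoverTimeout, double downtimeSeconds) {
        return downtimeSeconds <= failoverTimeout;
    }

    public static void main(String[] args) {
        // Default timeout of 0: even a 5-second master outage loses the framework.
        System.out.println(canReregister(0.0, 5.0));      // false
        // A week-long failover window easily covers the same outage.
        System.out.println(canReregister(604800.0, 5.0)); // true
    }
}
```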

Tim

On Mar 30, 2017, at 10:18 AM, Yu Wei <yu20...@hotmail.com> wrote:


Hi guys,

I encountered a problem with Spark on Mesos.

I set up a Mesos cluster and launched a Spark framework on it successfully.

Then the Mesos master was killed and restarted.

However, the Spark framework wasn't re-registered the way a Mesos agent re-registers. I 
also couldn't find any error logs.

And the MesosClusterDispatcher is still running.


I suspect this is a Spark framework issue.

What's your opinion?



Thanks,

Jared, (韦煜)
Software developer
Interested in open source software, big data, Linux
