Re: Spark master shuts down when one of zookeeper dies

2023-11-07 Thread Mich Talebzadeh
Hi,

Spark standalone mode does not use or rely on ZooKeeper by default. The
Spark master and workers communicate directly with each other without using
ZooKeeper. However, it appears that in your case you are relying on
ZooKeeper to provide high availability for your standalone cluster. By
configuring Spark to use ZooKeeper for leader election, you can ensure that
there is always a Spark master running, even if one of the ZooKeeper
servers goes down.

To use ZooKeeper for high availability in Spark standalone mode, you need
to configure the following properties:

spark.deploy.recoveryMode: Set to ZOOKEEPER to enable high availability
spark.deploy.zookeeper.url: The ZooKeeper cluster URL
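
For illustration, here is a minimal sketch of how these two properties might be set in conf/spark-env.sh on every master host. The hostnames zk1, zk2 and zk3 are placeholders for your own ensemble, and spark.deploy.zookeeper.dir is optional (shown here with its default value, /spark).

```shell
# conf/spark-env.sh (sketch; zk1..zk3 are placeholder hostnames, not from the thread)
ZK_URL="zk1:2181,zk2:2181,zk3:2181"

# recoveryMode=ZOOKEEPER turns on ZooKeeper-based leader election and recovery;
# zookeeper.dir is the znode under which Spark keeps its recovery state.
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=${ZK_URL} -Dspark.deploy.zookeeper.dir=/spark"
export SPARK_DAEMON_JAVA_OPTS
```

Every master needs the same ZooKeeper URL and directory, otherwise the masters will not see each other as candidates for the same election.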

The Spark master shuts down when a ZooKeeper instance goes down because
it loses its leadership. Spark uses ZooKeeper to elect a single active
master among the masters registered with the ensemble. When the ZooKeeper
server that the active master is connected to goes down, it appears that
the master's leadership is revoked and the master process exits. One of
the standby masters is then elected as the new leader.

The original master never comes back up because its process has exited
and nothing in Spark restarts it automatically. The recovery state is
stored in ZooKeeper, not in the master process, so it is not lost. You
need to start the master again, either manually or through a process
supervisor, after which it rejoins the cluster as a standby.

To keep the cluster available, run multiple Spark masters in high
availability mode, so that at least one standby master is up at all times.
When the active master exits, a standby master takes over and continues to
serve applications. As stated, every master needs the
spark.deploy.recoveryMode property set to ZOOKEEPER and the
spark.deploy.zookeeper.url property pointing to your ZooKeeper cluster.
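
As a rough sketch of what this looks like in practice: start a master on each master host, then give workers and applications a master URL that lists all of them. The hostnames master1 and master2 and the port 7077 are assumptions for illustration only.

```shell
# On each master host, with the same ZooKeeper settings in conf/spark-env.sh:
#   "$SPARK_HOME/sbin/start-master.sh"

# Build a master URL that lists every master (placeholder hostnames).
MASTERS="master1:7077,master2:7077"
MASTER_URL="spark://${MASTERS}"
echo "$MASTER_URL"

# Workers and applications are then pointed at the full list, for example:
#   "$SPARK_HOME/sbin/start-worker.sh" "$MASTER_URL"
#   "$SPARK_HOME/bin/spark-submit" --master "$MASTER_URL" your_app.py
```

With the full list in the URL, workers and applications register with whichever master is currently the leader and can find the new leader after a failover.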

HTH,

Mich Talebzadeh,
Distinguished Technologist, Solutions Architect & Engineer
London
United Kingdom

Mich Talebzadeh (Ph.D.) | LinkedIn


https://en.everybodywiki.com/Mich_Talebzadeh



Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Mon, 6 Nov 2023 at 15:19, Kaustubh Ghode  wrote:

> I am using spark-3.4.1 I have a setup with three ZooKeeper servers, Spark
> master shuts down when a Zookeeper instance is down a new master is elected
> as leader and the cluster is up. But the original master that was down
> never comes up. can you please help me with this issue?
>
> Stackoverflow link:- https://stackoverflow.com/questions/77431515
>
> Thanks,
> Kaustubh
>


Spark master shuts down when one of zookeeper dies

2023-11-06 Thread Kaustubh Ghode
I am using Spark 3.4.1 with a setup of three ZooKeeper servers. The Spark
master shuts down when a ZooKeeper instance goes down; a new master is
elected as leader and the cluster stays up. But the original master that
went down never comes back up. Can you please help me with this issue?

Stack Overflow link: https://stackoverflow.com/questions/77431515

Thanks,
Kaustubh


Re: Spark master shuts down when one of zookeeper dies

2016-06-30 Thread Ted Yu
Looking at Master.scala, I don't see code that would bring the master back
up automatically.
You can probably implement a monitoring tool so that you get an alert when
the master goes down.

e.g.
http://stackoverflow.com/questions/12896998/how-to-set-up-alerts-on-ganglia
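
A very small watchdog along these lines could also be sketched in shell. This is only an illustration, not something taken from Spark itself: the function names and the /opt/spark default for SPARK_HOME are hypothetical, and it assumes pgrep is available on the master host.

```shell
# Hypothetical watchdog sketch: restart the standalone master if it is not running.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"   # assumed install location
MASTER_CLASS="org.apache.spark.deploy.master.Master"

# Succeeds if a JVM running the standalone Master class is found.
master_is_running() {
  pgrep -f "$MASTER_CLASS" > /dev/null 2>&1
}

# Starts the master with the script shipped in the Spark distribution.
start_master() {
  "$SPARK_HOME/sbin/start-master.sh"
}

# Reports the status and restarts the master if it is down.
check_master() {
  if master_is_running; then
    echo "master is running"
  else
    echo "master is down, restarting" >&2
    start_master
  fi
}
```

Calling check_master from cron on each master host would bring a stopped master back as a standby. The message on stderr is the place to hook in an alert.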

More experienced users may have better suggestions.

On Thu, Jun 30, 2016 at 2:09 AM, vimal dinakaran 
wrote:

> Hi Ted,
>  Thanks for the pointers. I had a three node zookeeper setup . Now the
> master alone dies when  a zookeeper instance is down and a new master is
> elected as leader and the cluster is up.
> But the master that was down , never comes up.
>
> Is this the expected ? Is there a way to get alert when a master is down ?
> How to make sure that there is atleast one back up master is up always ?
>
> Thanks
> Vimal
>
>
>
>
> On Tue, Jun 28, 2016 at 7:24 PM, Ted Yu  wrote:
>
>> Please see some blog w.r.t. the number of nodes in the quorum:
>>
>>
>> http://stackoverflow.com/questions/13022244/zookeeper-reliability-three-versus-five-nodes
>>
>> http://www.ibm.com/developerworks/library/bd-zookeeper/
>>   the paragraph starting with 'A quorum is represented by a strict
>> majority of nodes'
>>
>> FYI
>>
>> On Tue, Jun 28, 2016 at 5:52 AM, vimal dinakaran 
>> wrote:
>>
>>> I am using zookeeper for providing HA for spark cluster.  We have two
>>> nodes zookeeper cluster.
>>>
>>> When one of the zookeeper dies then the entire spark cluster goes down .
>>>
>>> Is this expected behaviour ?
>>> Am I missing something in config ?
>>>
>>> Spark version - 1.6.1.
>>> Zookeeper version - 3.4.6
>>> // spark-env.sh
>>> SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER
>>> -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181"
>>>
>>> Below is the log from spark master:
>>> ZooKeeperLeaderElectionAgent: We have lost leadership
>>> 16/06/27 09:39:30 ERROR Master: Leadership has been revoked -- master
>>> shutting down.
>>>
>>> Thanks
>>> Vimal
>>>
>>>
>>>
>>>
>>
>


Re: Spark master shuts down when one of zookeeper dies

2016-06-30 Thread vimal dinakaran
Hi Ted,
Thanks for the pointers. I have a three-node ZooKeeper setup now. Only the
master dies when a ZooKeeper instance is down; a new master is elected as
leader and the cluster stays up.
But the master that went down never comes back up.

Is this expected? Is there a way to get an alert when a master is down?
How can I make sure that at least one backup master is always up?

Thanks
Vimal




On Tue, Jun 28, 2016 at 7:24 PM, Ted Yu  wrote:

> Please see some blog w.r.t. the number of nodes in the quorum:
>
>
> http://stackoverflow.com/questions/13022244/zookeeper-reliability-three-versus-five-nodes
>
> http://www.ibm.com/developerworks/library/bd-zookeeper/
>   the paragraph starting with 'A quorum is represented by a strict
> majority of nodes'
>
> FYI
>
> On Tue, Jun 28, 2016 at 5:52 AM, vimal dinakaran 
> wrote:
>
>> I am using zookeeper for providing HA for spark cluster.  We have two
>> nodes zookeeper cluster.
>>
>> When one of the zookeeper dies then the entire spark cluster goes down .
>>
>> Is this expected behaviour ?
>> Am I missing something in config ?
>>
>> Spark version - 1.6.1.
>> Zookeeper version - 3.4.6
>> // spark-env.sh
>> SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER
>> -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181"
>>
>> Below is the log from spark master:
>> ZooKeeperLeaderElectionAgent: We have lost leadership
>> 16/06/27 09:39:30 ERROR Master: Leadership has been revoked -- master
>> shutting down.
>>
>> Thanks
>> Vimal
>>
>>
>>
>>
>


Re: Spark master shuts down when one of zookeeper dies

2016-06-28 Thread Ted Yu
Please see these posts regarding the number of nodes in the quorum:

http://stackoverflow.com/questions/13022244/zookeeper-reliability-three-versus-five-nodes

http://www.ibm.com/developerworks/library/bd-zookeeper/
  the paragraph starting with 'A quorum is represented by a strict majority
of nodes'

FYI
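
To make the strict-majority rule concrete, here is a small arithmetic sketch (the function names are made up for illustration). A quorum is floor(n/2) + 1 servers, so the ensemble tolerates n minus that many failures.

```shell
# Smallest strict majority of an n-server ensemble.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# How many servers can fail while a quorum is still available.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 2 3 5; do
  echo "ensemble=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

For a two-node ensemble the quorum is 2, so losing either node loses the quorum. A three-node ensemble tolerates one failure and a five-node ensemble tolerates two.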

On Tue, Jun 28, 2016 at 5:52 AM, vimal dinakaran 
wrote:

> I am using zookeeper for providing HA for spark cluster.  We have two
> nodes zookeeper cluster.
>
> When one of the zookeeper dies then the entire spark cluster goes down .
>
> Is this expected behaviour ?
> Am I missing something in config ?
>
> Spark version - 1.6.1.
> Zookeeper version - 3.4.6
> // spark-env.sh
> SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER
> -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181"
>
> Below is the log from spark master:
> ZooKeeperLeaderElectionAgent: We have lost leadership
> 16/06/27 09:39:30 ERROR Master: Leadership has been revoked -- master
> shutting down.
>
> Thanks
> Vimal
>
>
>
>


Spark master shuts down when one of zookeeper dies

2016-06-28 Thread vimal dinakaran
I am using ZooKeeper to provide HA for a Spark cluster. We have a two-node
ZooKeeper cluster.

When one of the ZooKeeper nodes dies, the entire Spark cluster goes down.

Is this expected behaviour?
Am I missing something in the config?

Spark version - 1.6.1.
Zookeeper version - 3.4.6
// spark-env.sh
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER
-Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181"

Below is the log from spark master:
ZooKeeperLeaderElectionAgent: We have lost leadership
16/06/27 09:39:30 ERROR Master: Leadership has been revoked -- master
shutting down.

Thanks
Vimal