chendihao created HBASE-10345:
---------------------------------
Summary: HMaster should not serve when disconnected with ZooKeeper
Key: HBASE-10345
URL: https://issues.apache.org/jira/browse/HBASE-10345
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.94.3
Reporter: chendihao
As noted in HBASE-9468 ("Previous active master can still serves RPC request when it
is trying recovering expired zk session"), the master can fail fast on session expiry
so that two active masters never exist at the same time. But the same problem can
occur before the session expires. When the active master receives a Disconnected
event, it cannot be sure that it will be able to communicate with ZooKeeper again.
It also cannot know whether a backup master has become the new active master until
it receives the Expired event, which may never arrive, because a client only learns
of the expiry after it reconnects. During this period of uncertainty about which
master is active, the current active master should stop serving requests (for
example, by turning off the RpcServer).
Here is the relevant passage from "ZooKeeper: Distributed Process Coordination", p. 101:
{quote}
If the developer is not careful, the old leader will continue to act as a
leader and may take actions that conflict with those of the new leader. For
this reason, when a process receives a Disconnected event, the process should
suspend actions taken as a leader until it reconnects. Normally this reconnect
happens very quickly.
{quote}
So it is just as necessary to handle the Disconnected event as it is to handle the Expired event.
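
To make the proposal concrete, here is a minimal sketch of the event handling described above. It is not HMaster code: MasterServingGate, SessionState, and onSessionEvent are hypothetical names, and the enum is only a stand-in for the states ZooKeeper reports through Watcher.Event.KeeperState, so that the example compiles without the ZooKeeper jar. In a real master, the "serving" flag would presumably gate the RpcServer.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: gates request serving on the ZooKeeper session state.
class MasterServingGate {

    // Stand-in (assumption) for org.apache.zookeeper.Watcher.Event.KeeperState;
    // only the three states discussed in this issue are modeled.
    public enum SessionState { SYNC_CONNECTED, DISCONNECTED, EXPIRED }

    private final AtomicBoolean serving = new AtomicBoolean(false);
    private volatile boolean aborted = false;

    // Would be called from the watcher callback for each session event.
    public void onSessionEvent(SessionState state) {
        switch (state) {
            case SYNC_CONNECTED:
                // The session survived, so the master's ephemeral node still
                // exists. A real master might re-verify its znode here.
                if (!aborted) {
                    serving.set(true);
                }
                break;
            case DISCONNECTED:
                // We cannot tell whether a backup master has taken over:
                // suspend leader actions until we reconnect.
                serving.set(false);
                break;
            case EXPIRED:
                // The session is gone; fail fast and never resume serving.
                aborted = true;
                serving.set(false);
                break;
        }
    }

    public boolean isServing() { return serving.get(); }

    public boolean isAborted() { return aborted; }

    public static void main(String[] args) {
        MasterServingGate gate = new MasterServingGate();
        gate.onSessionEvent(SessionState.SYNC_CONNECTED);
        System.out.println("after connect: serving=" + gate.isServing());
        gate.onSessionEvent(SessionState.DISCONNECTED);
        System.out.println("after disconnect: serving=" + gate.isServing());
        gate.onSessionEvent(SessionState.SYNC_CONNECTED);
        System.out.println("after reconnect: serving=" + gate.isServing());
    }
}
```

The point of the sketch is that Disconnected and Expired both stop serving, but only Disconnected is reversible: a later SyncConnected resumes serving, while Expired marks the master as aborted for good.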
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)