[ https://issues.apache.org/jira/browse/HBASE-10283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chendihao reassigned HBASE-10283:
---------------------------------

    Assignee: chendihao

> Client can't connect with all the running zk servers in MiniZooKeeperCluster
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-10283
>                 URL: https://issues.apache.org/jira/browse/HBASE-10283
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.3
>            Reporter: chendihao
>            Assignee: chendihao
>
> Per HBASE-3052, multiple zk servers can run together in the minicluster. The
> problem is that the client can only connect to the first zk server, and if
> you kill the first one, it fails to access the cluster even though the other
> zk servers are still serving.
> It's easy to repro. First, call `TEST_UTIL.startMiniZKCluster(3)`. Second,
> call `killCurrentActiveZooKeeperServer` in MiniZooKeeperCluster. Then, when
> you construct the zk client, it can't connect to the zk cluster at all.
> Here is a short log for reference.
> {noformat}
> 2014-01-03 12:06:58,625 INFO [main] zookeeper.MiniZooKeeperCluster(194): Started MiniZK Cluster and connect 1 ZK server on client port: 55227
> ......
> 2014-01-03 12:06:59,134 INFO [main] zookeeper.MiniZooKeeperCluster(264): Kill the current active ZK servers in the cluster on client port: 55227
> 2014-01-03 12:06:59,134 INFO [main] zookeeper.MiniZooKeeperCluster(272): Activate a backup zk server in the cluster on client port: 55228
> 2014-01-03 12:06:59,366 INFO [main-EventThread] zookeeper.ZooKeeper(434): Initiating client connection, connectString=localhost:55227 sessionTimeout=3000 watcher=com.xiaomi.infra.timestamp.TimestampWatcher@a383118
> (then it throws exceptions......)
> {noformat}
> The log is misleading because it always shows "Started MiniZK Cluster
> and connect 1 ZK server" when there are actually three zk servers.
> Looking deeper, we find that the client is still trying to connect to the
> dead zk server's port.
> When I print out the zkQuorum it uses, only the first zk server's host:port
> is there, and it does not change whether or not you kill the server. The
> reason lies in ZKConfig, which converts HBase settings into zk's.
> MiniZooKeeperCluster creates three servers with the same host name,
> "localhost", and different ports. But HBase itself forces the same port for
> every zk server, and ZKConfig ignores the other two servers because they
> have the same host name.
> MiniZooKeeperCluster will not work properly until we fix this. The bug went
> unnoticed because the unit tests never check whether HBase still works after
> the active or backup zk servers are killed.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
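
The quorum problem described in the issue can be illustrated with a small, self-contained sketch. This is not the actual ZKConfig code: the class `QuorumSketch` and both helper methods are hypothetical, and the third port (55229) is made up, since the log only shows 55227 and 55228. The first helper mimics the reported behaviour (one shared client port, servers collapsed by host name); the second shows one possible alternative that keeps a host:port pair for every server.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class QuorumSketch {

    // Hypothetical model of the reported behaviour: every server is assumed
    // to share one client port, and duplicate host names are collapsed.
    static String quorumWithSharedPort(List<String> hosts, int clientPort) {
        Set<String> uniqueHosts = new LinkedHashSet<>(hosts);
        List<String> parts = new ArrayList<>();
        for (String host : uniqueHosts) {
            parts.add(host + ":" + clientPort);
        }
        return String.join(",", parts);
    }

    // Hypothetical alternative: keep one host:port entry per server, so
    // servers on the same host with different ports all stay in the quorum.
    static String quorumWithPerServerPorts(String host, List<Integer> ports) {
        List<String> parts = new ArrayList<>();
        for (int port : ports) {
            parts.add(host + ":" + port);
        }
        return String.join(",", parts);
    }

    public static void main(String[] args) {
        // Three mini-cluster servers, all on "localhost".
        List<String> hosts = List.of("localhost", "localhost", "localhost");
        // 55227 and 55228 come from the log; 55229 is an assumed value.
        List<Integer> ports = List.of(55227, 55228, 55229);

        // Only the first server survives: localhost:55227
        System.out.println(quorumWithSharedPort(hosts, ports.get(0)));

        // All three servers are listed:
        // localhost:55227,localhost:55228,localhost:55229
        System.out.println(quorumWithPerServerPorts("localhost", ports));
    }
}
```

With the first form, killing the server on port 55227 leaves the client with no live address to try, which matches the `connectString=localhost:55227` line in the log. With the second form, the client would still have the remaining servers in its connect string.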