[ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127252#comment-13127252 ]
Hudson commented on HBASE-4568:
-------------------------------

Integrated in HBase-0.92 #64 (See [https://builds.apache.org/job/HBase-0.92/64/])
    HBASE-4568 Make zk dump jsp response faster

nspiegelberg :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/RetryCounter.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/branches/0.92/src/main/resources/hbase-webapps/master/zk.jsp


> Make zk dump jsp response more quickly
> --------------------------------------
>
>                 Key: HBASE-4568
>                 URL: https://issues.apache.org/jira/browse/HBASE-4568
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>             Fix For: 0.92.0, 0.94.0
>
>         Attachments: HBASE-4568.patch
>
>
> 1) For each zk dump, hbase currently creates a new zk client instance every
> time.
> This is quite slow when any machine in the quorum is dead, because it will
> connect to each machine in the zk quorum again.
> <code>
> HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
> Configuration conf = master.getConfiguration();
> HBaseAdmin hbadmin = new HBaseAdmin(conf);
> HConnection connection = hbadmin.getConnection();
> ZooKeeperWatcher watcher = connection.getZooKeeperWatcher();
> </code>
> So we can simplify this:
> <code>
> HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
> ZooKeeperWatcher watcher = master.getZooKeeperWatcher();
> </code>
> 2) Also, when hbase calls getServerStats() for each machine in the zk quorum,
> it hard-codes the default timeout as 1 min.
> It would be nice to make this configurable and set it to a low timeout.
> When hbase tries to connect to each machine in the zk quorum, it will create
> the socket, and then set the socket timeout, and read it with this timeout.
> This means hbase first creates a socket and connects to the zk server with a
> timeout of 0, which can take a long time, because a timeout of zero is
> interpreted as an infinite timeout. The connection then blocks until it is
> established or an error occurs.
> 3) The recoverable zookeeper should use real exponential backoff when there
> is a connection loss exception, which will give hbase a much longer time
> window to recover from zk machine failures.
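
Items 2 and 3 of the issue description can be illustrated with a small sketch. This is not the code from HBASE-4568.patch: the class name ZkDumpSketch, both method names, and all parameter names are hypothetical, and the doubling-with-cap policy is only an assumption about what "real exponential backoff" could look like. The only library calls used are the standard java.net.Socket ones.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for illustration only; not part of HBase.
final class ZkDumpSketch {

    private ZkDumpSketch() {}

    /**
     * Opens a socket with a bounded connect and a bounded read timeout.
     * A connect timeout of 0 means "wait forever", so non-positive values are rejected.
     */
    static Socket connectWithTimeout(String host, int port, int timeoutMs) throws IOException {
        if (timeoutMs <= 0) {
            throw new IllegalArgumentException("timeoutMs must be positive, got " + timeoutMs);
        }
        Socket socket = new Socket(); // unconnected socket, nothing blocks yet
        try {
            // Bounded connect: fails with SocketTimeoutException instead of hanging.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            // Bounded reads on the established connection.
            socket.setSoTimeout(timeoutMs);
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    /**
     * Sleep time before retry number attempt (0-based): baseMs * 2^attempt, capped at maxMs.
     */
    static long backoffMillis(int attempt, long baseMs, long maxMs) {
        if (attempt < 0 || baseMs <= 0 || maxMs < baseMs) {
            throw new IllegalArgumentException("invalid backoff parameters");
        }
        long delay = baseMs;
        for (int i = 0; i < attempt && delay < maxMs; i++) {
            // Double the delay, clamping to maxMs without overflowing long.
            delay = (delay > maxMs / 2) ? maxMs : delay * 2;
        }
        return delay;
    }
}
```

With a base of 1000 ms and a cap of 60000 ms, the delays for attempts 0, 1, 2, 3 are 1000, 2000, 4000, 8000 ms, and attempt 6 and later stay at the 60000 ms cap.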