Hari Krishna Dara created HBASE-15992:
-----------------------------------------
             Summary: Preserve original KeeperException when converted to external exceptions
                 Key: HBASE-15992
                 URL: https://issues.apache.org/jira/browse/HBASE-15992
             Project: HBase
          Issue Type: Brainstorming
          Components: hbase
    Affects Versions: 0.98.14
            Reporter: Hari Krishna Dara
            Priority: Minor


During an investigation in which we were seeing unexpected {{NoServerForRegionException}} errors, the root cause turned out to be a {{KeeperException}} that got lost and so resulted in a misleading top level indication. The underlying exception with partial stacktrace is this:

{noformat}
org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /hbase/meta-region-server
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:123)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1289)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:359)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:684)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.blockUntilAvailable(ZKUtil.java:2032)
        at org.apache.hadoop.hbase.zookeeper.MetaRegionTracker.blockUntilAvailable(MetaRegionTracker.java:203)
        at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:58)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateMeta(HConnectionManager.java:1209)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1175)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1301)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1178)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1135)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionLocation(HConnectionManager.java:976)
{noformat}

Here is some additional information:

* The exception first gets caught [here|https://github.com/apache/hbase/blob/rel/0.98.14/hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java#L366]
* It gets logged and rethrown from [here|https://github.com/apache/hbase/blob/rel/0.98.14/hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java#L279]
* It gets caught again, logged and rethrown [here|https://github.com/apache/hbase/blob/rel/0.98.14/hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java#L693]
* This finally gets caught and rethrown as InterruptedException [here|https://github.com/apache/hbase/blob/rel/0.98.14/hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java#L2037]

When thrown as {{InterruptedException}}, the cause is lost, so [the code catching it|https://github.com/apache/hbase/blob/rel/0.98.14/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ZooKeeperRegistry.java#L65] can't (and currently doesn't) determine the cause. Perhaps the exception should be preserved and passed on to [the caller|https://github.com/apache/hbase/blob/rel/0.98.14/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java#L1312] such that it is available when finally the {{NoServerForRegionException}} is thrown [here|https://github.com/apache/hbase/blob/rel/0.98.14/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java#L1281].

Alternatively, a more meaningful exception could also be thrown instead of a misleading {{NoServerForRegionException}}, especially in cases where the failure indicates a more permanent condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
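
To illustrate the suggestion of preserving the original exception, here is a minimal, self-contained sketch of one way a cause could be carried through a conversion to {{InterruptedException}}. It is not the HBase code: {{StandInKeeperException}}, {{getData}}, {{blockUntilAvailable}} and {{rootCause}} are hypothetical stand-ins written so the example compiles without ZooKeeper or HBase on the classpath, and {{IllegalStateException}} merely plays the role of the top-level exception. Since {{InterruptedException}} has no constructor that accepts a cause, the sketch attaches it with {{Throwable.initCause}}.

```java
public class CausePreservingExample {

    /** Hypothetical stand-in for org.apache.zookeeper.KeeperException. */
    static class StandInKeeperException extends Exception {
        StandInKeeperException(String message) {
            super(message);
        }
    }

    /** Simulates a ZooKeeper read that fails, as in the report above. */
    static byte[] getData(String znode) throws StandInKeeperException {
        throw new StandInKeeperException("KeeperErrorCode = AuthFailed for " + znode);
    }

    /**
     * Converts the failure to InterruptedException but keeps the original
     * exception attached as the cause instead of dropping it.
     */
    static byte[] blockUntilAvailable(String znode) throws InterruptedException {
        try {
            return getData(znode);
        } catch (StandInKeeperException e) {
            InterruptedException ie = new InterruptedException("Failed to read " + znode);
            ie.initCause(e); // InterruptedException has no (String, Throwable) constructor
            throw ie;
        }
    }

    /** Walks the cause chain and returns the innermost exception. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        try {
            blockUntilAvailable("/hbase/meta-region-server");
        } catch (InterruptedException e) {
            // Hypothetical top-level exception that chains the lower-level failure.
            IllegalStateException top = new IllegalStateException("Unable to find region", e);
            // The original failure is still reachable from the top-level exception.
            System.out.println(rootCause(top).getMessage());
        }
    }
}
```

With the cause chained at each conversion, a caller that eventually throws a top-level exception can either attach the chain or inspect the root cause to decide whether a different, more meaningful exception type is appropriate.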