[ https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Enis Soztutar resolved HBASE-10785.
-----------------------------------
    Resolution: Fixed
    Hadoop Flags: Reviewed

Committed this to the hbase-10070 branch. We would like to merge the branch back to trunk sooner rather than later, so this will get into trunk as well.

> Metas own location should be cached
> -----------------------------------
>
>                 Key: HBASE-10785
>                 URL: https://issues.apache.org/jira/browse/HBASE-10785
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: hbase-10070
>
>         Attachments: hbase-10785_v1.patch, hbase-10785_v2.patch, hbase-10785_v3.patch
>
>
> With the ROOT table gone, we no longer cache the location of the meta table (in MetaCache) in 0.96+. I've checked the 0.94 code, and there we cache meta, but not root.
> However, not caching meta's own location means that we are doing a zookeeper request every time we want to look up a region's location from meta. This means that there is a significant spike in zk requests whenever a region server goes down.
> This affects trunk, 0.98 and 0.96 as well as the hbase-10070 branch. I discovered the issue in hbase-10070 because the integration test (HBASE-10572) results in 150K requests to zk in 10 min.
> A thread dump from one of the runs has 100+ client threads in this stack trace:
> {code}
> "TimeBoundedMultiThreadedReaderThread_20" prio=10 tid=0x00007f852c2f2000 nid=0x57b6 in Object.wait() [0x00007f85059e7000]
>    java.lang.Thread.State: WAITING (on object monitor)
> 	at java.lang.Object.wait(Native Method)
> 	at java.lang.Object.wait(Object.java:503)
> 	at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
> 	- locked <0x00000000ea71aa78> (a org.apache.zookeeper.ClientCnxn$Packet)
> 	at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1149)
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:684)
> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.blockUntilAvailable(ZKUtil.java:1853)
> 	at org.apache.hadoop.hbase.zookeeper.MetaRegionTracker.blockUntilAvailable(MetaRegionTracker.java:186)
> 	at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:60)
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1126)
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1112)
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1220)
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1129)
> 	at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:321)
> 	at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.call(RpcRetryingCallerWithReadReplicas.java:257)
> 	- locked <0x00000000e9bcf238> (a org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas)
> 	at org.apache.hadoop.hbase.client.HTable.get(HTable.java:818)
> 	at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.queryKey(MultiThreadedReader.java:288)
> 	at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.readKey(MultiThreadedReader.java:249)
> 	at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.runReader(MultiThreadedReader.java:192)
> 	at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.run(MultiThreadedReader.java:150)
> {code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
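
The caching idea in the issue description can be illustrated with a minimal sketch. This is not the committed patch and not HBase's MetaCache: the class name `CachedMetaLocation`, the `Supplier<String>` standing in for the ZooKeeper read, and the host names are all assumptions made for illustration. It only shows the general pattern of consulting the registry on a cache miss and dropping the entry when the cached server turns out to be stale.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical sketch of caching meta's location; not HBase code. */
final class CachedMetaLocation {
    // Stands in for the expensive registry (ZooKeeper) read.
    private final Supplier<String> registryLookup;
    private final AtomicReference<String> cached = new AtomicReference<>();

    CachedMetaLocation(Supplier<String> registryLookup) {
        this.registryLookup = registryLookup;
    }

    /**
     * Returns the cached location, calling the registry only on a miss.
     * Concurrent misses may each perform a lookup; that is tolerated here.
     */
    String get() {
        String location = cached.get();
        if (location == null) {
            location = registryLookup.get();
            cached.set(location);
        }
        return location;
    }

    /**
     * Clears the cache if it still holds the location that failed.
     * The comparison is by reference, so pass the value returned by get().
     */
    void invalidate(String staleLocation) {
        cached.compareAndSet(staleLocation, null);
    }

    public static void main(String[] args) {
        int[] lookups = {0};
        CachedMetaLocation meta = new CachedMetaLocation(() -> {
            lookups[0]++;
            return "rs-" + lookups[0] + ".example.org:16020";
        });
        for (int i = 0; i < 1000; i++) {
            meta.get();                    // only the first call hits the registry
        }
        System.out.println(lookups[0]);    // prints 1
        meta.invalidate(meta.get());       // e.g. after the cached server stops responding
        meta.get();                        // miss: one more registry read
        System.out.println(lookups[0]);    // prints 2
    }
}
```

With this pattern, the number of registry reads tracks the number of invalidations rather than the number of region lookups, which is the effect the issue asks for.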