[jira] [Commented] (HBASE-10785) Metas own location should be cached

2015-08-20 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705488#comment-14705488
 ] 

Jerry He commented on HBASE-10785:
--

Was there any reason this major improvement didn't go into 0.98?

> Metas own location should be cached
> ---
>
> Key: HBASE-10785
> URL: https://issues.apache.org/jira/browse/HBASE-10785
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.99.0, hbase-10070
>
> Attachments: 
> 0034-HBASE-10785-Metas-own-location-should-be-cached.patch, 
> hbase-10785_v1.patch, hbase-10785_v2.patch, hbase-10785_v3.patch
>
>
> With the ROOT table gone, we no longer cache the location of the meta table
> (in MetaCache) in 0.96+. I've checked the 0.94 code, and there we cache meta,
> but not root.
> However, not caching meta's own location means that we make a ZooKeeper
> request every time we want to look up a region's location from meta. This
> means that there is a significant spike in zk requests whenever a region
> server goes down.
> This affects trunk, 0.98 and 0.96 as well as the hbase-10070 branch. I
> discovered the issue in hbase-10070 because the integration test
> (HBASE-10572) results in 150K requests to zk in 10 min.
> A thread dump from one of the runs has 100+ client threads in this stack
> trace:
>   {code}
>   "TimeBoundedMultiThreadedReaderThread_20" prio=10 tid=0x7f852c2f2000 nid=0x57b6 in Object.wait() [0x7f85059e7000]
>      java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:503)
>   at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
>   - locked <0xea71aa78> (a org.apache.zookeeper.ClientCnxn$Packet)
>   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1149)
>   at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
>   at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:684)
>   at org.apache.hadoop.hbase.zookeeper.ZKUtil.blockUntilAvailable(ZKUtil.java:1853)
>   at org.apache.hadoop.hbase.zookeeper.MetaRegionTracker.blockUntilAvailable(MetaRegionTracker.java:186)
>   at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:60)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1126)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1112)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1220)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1129)
>   at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:321)
>   at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.call(RpcRetryingCallerWithReadReplicas.java:257)
>   - locked <0xe9bcf238> (a org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas)
>   at org.apache.hadoop.hbase.client.HTable.get(HTable.java:818)
>   at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.queryKey(MultiThreadedReader.java:288)
>   at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.readKey(MultiThreadedReader.java:249)
>   at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.runReader(MultiThreadedReader.java:192)
>   at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.run(MultiThreadedReader.java:150)
>   {code}
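
The idea in the description can be illustrated with a minimal sketch. This is not the attached patch and not the HBase client API: `CachedMetaLocator`, `CountingRegistry`, and the host string are hypothetical names made up for illustration, and the `Supplier` merely stands in for the ZooKeeper read seen in the stack trace. It only shows how remembering meta's location turns one registry read per lookup into one read per invalidation.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical stand-in for the client's meta lookup; not the HBase API. */
final class CachedMetaLocator {
  // Expensive authoritative source (a ZooKeeper read in the real client).
  private final Supplier<String> registry;
  // Last known meta location, or null when nothing is cached.
  private final AtomicReference<String> cached = new AtomicReference<>();

  CachedMetaLocator(Supplier<String> registry) {
    this.registry = registry;
  }

  /** Returns the cached location when allowed; otherwise asks the registry and caches the answer. */
  String locateMeta(boolean useCache) {
    if (useCache) {
      String location = cached.get();
      if (location != null) {
        return location;
      }
    }
    // Concurrent callers may each reach this point once; that is still
    // far fewer registry reads than one per region lookup.
    String fresh = registry.get();
    cached.set(fresh);
    return fresh;
  }

  /** Drops the cached entry, for example after a request to that server fails. */
  void invalidate() {
    cached.set(null);
  }
}

/** Hypothetical registry that counts how often it is consulted. */
final class CountingRegistry implements Supplier<String> {
  final AtomicInteger lookups = new AtomicInteger();

  @Override
  public String get() {
    lookups.incrementAndGet();
    return "rs1.example.org:16020"; // made-up server name
  }
}
```

With this shape, a hundred calls to `locateMeta(true)` consult the registry once, and a further read happens only after `invalidate()` or an explicit `locateMeta(false)`.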



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10785) Metas own location should be cached

2015-08-20 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705506#comment-14705506
 ] 

Enis Soztutar commented on HBASE-10785:
---

bq. Was there any reason this major improvement didn't go into 0.98?
We thought that it might be risky for 0.98 at that time. I think there was a 
follow up needed. We have now been running with this patch for quite a while, 
and our internal 0.98 has this patch running as well (with the safeguard 
config, though).



[jira] [Commented] (HBASE-10785) Metas own location should be cached

2015-08-20 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705528#comment-14705528
 ] 

Jerry He commented on HBASE-10785:
--

bq. I think there was a follow up needed.
Detail on this?

We have people reporting this problem on 0.98.
The patch will not apply cleanly because of the refactoring work between 0.98 
and version 1, but it should still be a simple backport.
Let me open a backport JIRA and see what Andrew thinks?



[jira] [Commented] (HBASE-10785) Metas own location should be cached

2015-08-20 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705704#comment-14705704
 ] 

Enis Soztutar commented on HBASE-10785:
---

bq. Detail on this?
I've checked; it seems the follow-up was related to meta region replicas, not this. 
bq. Let me open a backport JIRA and see what Andrew thinks?
Sounds good. 




[jira] [Commented] (HBASE-10785) Metas own location should be cached

2015-08-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712087#comment-14712087
 ] 

Hudson commented on HBASE-10785:


SUCCESS: Integrated in HBase-0.98 #1099 (See 
[https://builds.apache.org/job/HBase-0.98/1099/])
HBASE-14275 Backport to 0.98 HBASE-10785 Metas own location should be cached 
(jerryjch: rev c0abda2e110772c58b5520789ffd53f287b399cf)
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java




[jira] [Commented] (HBASE-10785) Metas own location should be cached

2015-08-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712174#comment-14712174
 ] 

Hudson commented on HBASE-10785:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1053 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1053/])
HBASE-14275 Backport to 0.98 HBASE-10785 Metas own location should be cached 
(jerryjch: rev c0abda2e110772c58b5520789ffd53f287b399cf)
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java




[jira] [Commented] (HBASE-10785) Metas own location should be cached

2014-03-24 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945754#comment-13945754
 ] 

Devaraj Das commented on HBASE-10785:
-

Looks good. Some nits:
1. We don't need the config (once we have tested this on real clusters)
2. The tableName argument in locateRegion is always going to be META. Might as 
well not pass it and just declare it within the method.

--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-10785) Metas own location should be cached

2014-03-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945771#comment-13945771
 ] 

stack commented on HBASE-10785:
---

This should be final?

+private boolean cacheMetaLocationEnabled; // whether to cache meta's own location

Can fix on commit.

Is there code duplication in locateMeta? If so, does there have to be? (No 
biggie, just asking.)

Else, nice find.

> Metas own location should be cached
> ---
>
> Key: HBASE-10785
> URL: https://issues.apache.org/jira/browse/HBASE-10785
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-10785_v1.patch, hbase-10785_v2.patch
>
>
> With ROOT table gone, we no longer cache the location of the meta table (in 
> MetaCache) in 96+. I've checked 94 code, and there we cache meta, but not 
> root.
> However, not caching the metas own location means that we are doing a 
> zookeeper request every time we want to look up a regions location from meta. 
> This means that there is a significant spike in zk requests whenever a region 
> server goes down. 
> This affects trunk,0.98 and 0.96 as well as hbase-10070 branch. I've 
> discovered the issue in hbase-10070 because of the integration test 
> (HBASE-10572) results in 150K requests to zk in 10min. 
> A thread dump from one of the runs have 100+ threads from client in this 
> stack trace: 
>   {code}
>   "TimeBoundedMultiThreadedReaderThread_20" prio=10 tid=0x7f852c2f2000 nid=0x57b6 in Object.wait() [0x7f85059e7000]
>  java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:503)
>   at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
>   - locked <0xea71aa78> (a org.apache.zookeeper.ClientCnxn$Packet)
>   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1149)
>   at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
>   at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:684)
>   at org.apache.hadoop.hbase.zookeeper.ZKUtil.blockUntilAvailable(ZKUtil.java:1853)
>   at org.apache.hadoop.hbase.zookeeper.MetaRegionTracker.blockUntilAvailable(MetaRegionTracker.java:186)
>   at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:60)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1126)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1112)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1220)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1129)
>   at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:321)
>   at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.call(RpcRetryingCallerWithReadReplicas.java:257)
>   - locked <0xe9bcf238> (a org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas)
>   at org.apache.hadoop.hbase.client.HTable.get(HTable.java:818)
>   at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.queryKey(MultiThreadedReader.java:288)
>   at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.readKey(MultiThreadedReader.java:249)
>   at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.runReader(MultiThreadedReader.java:192)
>   at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.run(MultiThreadedReader.java:150)
>   {code}
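
The stack trace above shows a region lookup going through ZooKeeperRegistry.getMetaRegionLocation down to a ZooKeeper getData call. As a rough illustration of why caching the meta location removes that per-lookup request, here is a minimal sketch. It is not the attached patch: the class CachedMetaLocatorSketch, its methods, and the host name are all hypothetical, and the Supplier merely stands in for the ZooKeeper read.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch, not HBase code: remembers the meta region location so
// that repeated lookups do not each go back to the registry (ZooKeeper).
class CachedMetaLocatorSketch {
    // Stands in for the ZooKeeper read; an assumption for illustration.
    private final Supplier<String> registry;
    private final AtomicReference<String> cached = new AtomicReference<>();

    CachedMetaLocatorSketch(Supplier<String> registry) {
        this.registry = registry;
    }

    // Returns the cached location when allowed; otherwise asks the registry
    // and stores the answer for later callers.
    String locateMeta(boolean useCache) {
        if (useCache) {
            String hit = cached.get();
            if (hit != null) {
                return hit;
            }
        }
        String fresh = registry.get();
        cached.set(fresh);
        return fresh;
    }

    // Meant to be called when a request to the cached server fails.
    void invalidate() {
        cached.set(null);
    }

    public static void main(String[] args) {
        AtomicInteger registryReads = new AtomicInteger();
        CachedMetaLocatorSketch locator = new CachedMetaLocatorSketch(() -> {
            registryReads.incrementAndGet();
            return "rs1.example.org:16020"; // made-up server name
        });
        for (int i = 0; i < 1000; i++) {
            locator.locateMeta(true);
        }
        System.out.println("registry reads: " + registryReads.get());
    }
}
```

In this sketch concurrent first-time callers may each read the registry once before the value is stored; that is a deliberate simplification, since the point is only that steady-state lookups stop reaching the registry.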



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-10785) Metas own location should be cached

2014-03-25 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946871#comment-13946871
 ] 

Nicolas Liochon commented on HBASE-10785:
-

Nice catch

{code}
+// If we are not supposed to be using the cache, delete any existing cached location
+// so it won't interfere.
+metaCache.clearCache(tableName, metaCacheKey, replicaId);
+  }
{code}
I know it's a copy-paste, but I don't think we should do that. A second try 
often goes w/o the cache just to be sure, and since the default for a second 
try is nearly always to bypass the cache, whatever the reason, clearing the 
cached location on that path trashes the cache for all the other callers.

I think this patch should work, as in the recent past we had a cache for meta. 
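
To illustrate the concern about clearing the cache on the bypass path, here is a minimal sketch of the alternative: a caller that bypasses the cache refreshes the shared entry instead of deleting it first. This is only an illustration under assumptions, not the HBase MetaCache; SharedLocationCacheSketch, locate, isCached, and the lookup function are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch, not the HBase MetaCache: a shared location cache where
// a caller that bypasses the cache refreshes the entry instead of deleting it.
class SharedLocationCacheSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Stands in for the real lookup; assumed to return a non-null location.
    private final Function<String, String> lookup;

    SharedLocationCacheSketch(Function<String, String> lookup) {
        this.lookup = lookup;
    }

    String locate(String key, boolean useCache) {
        if (useCache) {
            String hit = cache.get(key);
            if (hit != null) {
                return hit;
            }
        }
        // Bypass path: deliberately no cache.remove(key) before the lookup.
        // Other callers keep using the existing entry until it is replaced.
        String fresh = lookup.apply(key);
        cache.put(key, fresh);
        return fresh;
    }

    boolean isCached(String key) {
        return cache.containsKey(key);
    }

    public static void main(String[] args) {
        AtomicInteger lookups = new AtomicInteger();
        SharedLocationCacheSketch locations = new SharedLocationCacheSketch(key -> {
            lookups.incrementAndGet();
            return "server-for-" + key;
        });
        locations.locate("t1", true);   // first call populates the cache
        locations.locate("t1", false);  // a retry bypasses the cache
        locations.locate("t1", true);   // another caller still gets a hit
        System.out.println("lookups: " + lookups.get()
            + ", still cached: " + locations.isCached("t1"));
    }
}
```

The design choice is that a cache bypass expresses doubt on the part of one caller only; whether the entry is actually stale is decided by the fresh lookup result, which overwrites it.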

> Metas own location should be cached
> ---
>
> Key: HBASE-10785
> URL: https://issues.apache.org/jira/browse/HBASE-10785
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-10785_v1.patch, hbase-10785_v2.patch


[jira] [Commented] (HBASE-10785) Metas own location should be cached

2014-04-09 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964079#comment-13964079
 ] 

Nicolas Liochon commented on HBASE-10785:
-

Hi [~enis], do you plan to commit this one? It seems important. Note that in 
HBASE-10018, I removed the "delete any existing cached location so it won't 
interfere." :-)

> Metas own location should be cached
> ---
>
> Key: HBASE-10785
> URL: https://issues.apache.org/jira/browse/HBASE-10785
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-10785_v1.patch, hbase-10785_v2.patch


[jira] [Commented] (HBASE-10785) Metas own location should be cached

2014-04-09 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964111#comment-13964111
 ] 

Enis Soztutar commented on HBASE-10785:
---

Indeed I was looking at this. I need to address the comments real quick. 
We have been running a lot of tests with this, so maybe Devaraj is right that 
we can remove the conf parameter. 

> Metas own location should be cached
> ---
>
> Key: HBASE-10785
> URL: https://issues.apache.org/jira/browse/HBASE-10785
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-10785_v1.patch, hbase-10785_v2.patch


[jira] [Commented] (HBASE-10785) Metas own location should be cached

2014-04-09 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964114#comment-13964114
 ] 

Nicolas Liochon commented on HBASE-10785:
-

Yes, I agree as well; we should be better off without the parameter.

> Metas own location should be cached
> ---
>
> Key: HBASE-10785
> URL: https://issues.apache.org/jira/browse/HBASE-10785
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-10785_v1.patch, hbase-10785_v2.patch


[jira] [Commented] (HBASE-10785) Metas own location should be cached

2014-04-10 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13965409#comment-13965409
 ] 

Enis Soztutar commented on HBASE-10785:
---

I think I've got +1s for earlier versions. The v3 does not add anything new. 
I'll commit this tomorrow unless there are objections. 

> Metas own location should be cached
> ---
>
> Key: HBASE-10785
> URL: https://issues.apache.org/jira/browse/HBASE-10785
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: hbase-10070
>
> Attachments: hbase-10785_v1.patch, hbase-10785_v2.patch, 
> hbase-10785_v3.patch


[jira] [Commented] (HBASE-10785) Metas own location should be cached

2014-06-27 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046634#comment-14046634
 ] 

Enis Soztutar commented on HBASE-10785:
---

Attaching the rebased patch for master that was committed.

> Metas own location should be cached
> ---
>
> Key: HBASE-10785
> URL: https://issues.apache.org/jira/browse/HBASE-10785
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: hbase-10070
>
> Attachments: 
> 0034-HBASE-10785-Metas-own-location-should-be-cached.patch, 
> hbase-10785_v1.patch, hbase-10785_v2.patch, hbase-10785_v3.patch


[jira] [Commented] (HBASE-10785) Metas own location should be cached

2014-06-27 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046682#comment-14046682
 ] 

Enis Soztutar commented on HBASE-10785:
---

Committed to master as part of hbase-10070 branch merge

> Metas own location should be cached
> ---
>
> Key: HBASE-10785
> URL: https://issues.apache.org/jira/browse/HBASE-10785
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.99.0, hbase-10070
>
> Attachments: 
> 0034-HBASE-10785-Metas-own-location-should-be-cached.patch, 
> hbase-10785_v1.patch, hbase-10785_v2.patch, hbase-10785_v3.patch


[jira] [Commented] (HBASE-10785) Metas own location should be cached

2014-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046759#comment-14046759
 ] 

Hudson commented on HBASE-10785:


FAILURE: Integrated in HBase-TRUNK #5245 (See [https://builds.apache.org/job/HBase-TRUNK/5245/])
HBASE-10785 Metas own location should be cached (enis: rev 48ffa4d5e615c78f6db8f6c2beddd93460887642)
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java
HBASE-11332 Fix for metas location cache from HBASE-10785 (enis: rev 14a09e79bdb9b925ce9548b45fc4e26caadaec07)
* hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java


> Metas own location should be cached
> ---
>
> Key: HBASE-10785
> URL: https://issues.apache.org/jira/browse/HBASE-10785
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.99.0, hbase-10070
>
> Attachments: 
> 0034-HBASE-10785-Metas-own-location-should-be-cached.patch, 
> hbase-10785_v1.patch, hbase-10785_v2.patch, hbase-10785_v3.patch