[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2015-07-31 Thread Dave Latham (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648808#comment-14648808
 ] 

Dave Latham commented on HBASE-12219:
-

Sorry I missed this JIRA at the time, but I have a couple of concerns if I'm 
understanding this change correctly.  I'd like to check to see if I am.  It 
looks like the effect is to back out the directory modtime caching and instead 
have the master just maintain a persistent in memory cache.

Prior to this change it was safe to have any process on the cluster use 
FSTableDescriptors to read or write table descriptors.  Updates would be atomic 
and consistent, immediately available to all readers.  However, the cost of 
that was an HDFS NN operation on every table descriptor read to prove it had 
not changed.  Since in practice it appears that only the master process ever 
updates an existing table descriptor, it should be safe to have the master skip 
the directory modtime checks and proactively update its cached copy.  (hbck can 
also create table descriptors for orphaned tables, but hopefully those don't 
happen to tables the master has already cached.)

This change makes the master descriptor reads faster but imposes the constraint 
that only the active master should update table descriptors - any other writers 
would cause the master cache to become stale.  It also means that no other 
process should use the cache the same way, as the master could change the data 
and leave that process's cache stale.  Assuming this is the case, I think we'd 
be better served by reflecting that in the FSTableDescriptors API and javadoc.  
For example, most constructors and usages now default to keeping a persistent 
cache as well as allowing updates, which sets a bad example for new uses.  
There are also no warnings in the javadoc about the new contract.  Possibly 
better would be to make the default constructor read only with persistent 
caching disabled, and then add another constructor for the master allowing 
both writes and persistent caching.
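To make the proposed contract concrete, here is a minimal sketch, assuming a hypothetical class named DescriptorStoreSketch that stands in for FSTableDescriptors (it is not the real HBase class, and the in-memory map only stands in for HDFS). The default constructor is read only with no persistent cache; a separate factory, here called forActiveMaster, is the only way to get an instance that writes and caches.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for FSTableDescriptors; names and types are assumptions. */
class DescriptorStoreSketch {
    private final Map<String, String> backingStore; // stands in for descriptor files on HDFS
    private final boolean readOnly;
    private final boolean persistentCache;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    /** Safe default: read only, every read goes to the backing store. */
    DescriptorStoreSketch(Map<String, String> backingStore) {
        this(backingStore, true, false);
    }

    private DescriptorStoreSketch(Map<String, String> backingStore,
                                  boolean readOnly, boolean persistentCache) {
        this.backingStore = backingStore;
        this.readOnly = readOnly;
        this.persistentCache = persistentCache;
    }

    /** Only the single writer (the active master) should use this factory. */
    static DescriptorStoreSketch forActiveMaster(Map<String, String> backingStore) {
        return new DescriptorStoreSketch(backingStore, false, true);
    }

    Optional<String> get(String table) {
        if (persistentCache) {
            String cached = cache.get(table);
            if (cached != null) {
                return Optional.of(cached); // trusted because this instance is the only writer
            }
        }
        String fresh = backingStore.get(table);
        if (fresh != null && persistentCache) {
            cache.put(table, fresh);
        }
        return Optional.ofNullable(fresh);
    }

    void update(String table, String descriptor) {
        if (readOnly) {
            throw new UnsupportedOperationException(
                "read-only instance; only the active master may write descriptors");
        }
        backingStore.put(table, descriptor);
        if (persistentCache) {
            cache.put(table, descriptor); // proactively refresh the cached copy
        }
    }
}
```

With this shape, a new caller that picks the default constructor cannot write and cannot hold a stale copy, so the unsafe combination has to be asked for explicitly.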

This change also seems to remove all table descriptor caching from the region 
servers (the old directory modtime caching is gone and the new caching is 
disabled for region servers).  Thanks to HBASE-8778 reloading from the FS each 
time is cheaper than it used to be, but this change still increases the cost 
from 1 NN operation (check directory modtime) to 2 NN + 3 DN operations (find 
current file, get its block locations, open block, read, close block).  This 
slows things down a bit again for mass assignments/balances on huge tables.  It 
seems better for the region servers to retain the directory modtime caching, 
but simply skip the modtime check when running inside the master.
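The suggestion above can be sketched as follows. This is only an illustration under assumptions: DescriptorSourceSketch and ModtimeCacheSketch are hypothetical names, modTime() stands in for the single directory modtime lookup, and load() stands in for the full reload from the filesystem. Region servers would construct the cache with trustCache set to false; the master would pass true.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical view of the filesystem; not an HBase or Hadoop interface. */
interface DescriptorSourceSketch {
    long modTime(String table); // cheap check (the "1 NN operation" case)
    String load(String table);  // full reload (the "2 NN + 3 DN operations" case)
}

/** Modtime-validated cache that can skip validation when running in the master. */
class ModtimeCacheSketch {
    private static final class Entry {
        final long modTime;
        final String descriptor;

        Entry(long modTime, String descriptor) {
            this.modTime = modTime;
            this.descriptor = descriptor;
        }
    }

    private final DescriptorSourceSketch source;
    private final boolean trustCache; // true only where this process is the sole writer
    private final Map<String, Entry> cache = new HashMap<>();

    ModtimeCacheSketch(DescriptorSourceSketch source, boolean trustCache) {
        this.source = source;
        this.trustCache = trustCache;
    }

    synchronized String get(String table) {
        Entry cached = cache.get(table);
        if (cached != null && trustCache) {
            return cached.descriptor; // master path: no filesystem call at all
        }
        long current = source.modTime(table);
        if (cached != null && cached.modTime == current) {
            return cached.descriptor; // region server path: one cheap check, no reload
        }
        // Simplification: a real implementation must handle a write landing
        // between the modtime check and the load.
        String descriptor = source.load(table);
        cache.put(table, new Entry(current, descriptor));
        return descriptor;
    }

    /** Called by the sole writer after it persists a new descriptor. */
    synchronized void put(String table, long modTime, String descriptor) {
        cache.put(table, new Entry(modTime, descriptor));
    }
}
```

In this sketch an unchanged descriptor costs a region server one modTime() call per read, and the full load() only happens when the modtime actually moves.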

Does that understanding of this change sound correct - or did I botch it?  
Sorry I missed it at the time.  If that sounds right, a follow up JIRA may be 
good, and if I see our table assignments slower from this and no one else gets 
to it I can try to put up the changes.

 Cache more efficiently getAll() and get() in FSTableDescriptors
 ---

 Key: HBASE-12219
 URL: https://issues.apache.org/jira/browse/HBASE-12219
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.24, 0.99.1, 0.98.6.1
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
  Labels: scalability
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12219-0.94.patch, HBASE-12219-0.98.patch, 
 HBASE-12219-0.98.v1.addendum.patch, HBASE-12219-0.98.v1.patch, 
 HBASE-12219-0.99.addendum.patch, HBASE-12219-0.99.patch, 
 HBASE-12219-0.99.v1.patch, HBASE-12219-v1.patch, HBASE-12219-v1.patch, 
 HBASE-12219.v0.txt, HBASE-12219.v2.patch, HBASE-12219.v3.patch, list.png


 Currently, table descriptors and tables are cached once they are accessed for 
 the first time. Subsequent calls to the master only require a trip to HDFS to 
 look up the modified time in order to reload the table descriptors if 
 modified. However, in clusters with a large number of tables or concurrent 
 clients, this can be too aggressive for HDFS and the master, causing 
 contention when processing other requests. A simple solution is to have a 
 TTL-based cache for FSTableDescriptors#getAll() and 
 FSTableDescriptors#TableDescriptorAndModtime() that allows the master to 
 process calls to listtables() or getTableDescriptor() faster, without 
 causing contention and without having to perform a trip to HDFS for every 
 call.
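
 For illustration only, a TTL-based cache of the kind the description proposes 
 could look like the sketch below. TtlCacheSketch, its injected clock and its 
 loader function are all hypothetical names, and nothing here is taken from the 
 attached patches; the comments on this issue indicate the committed change 
 ended up as a persistent in-memory cache in the master instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical TTL cache: entries older than ttlMillis are reloaded on access. */
class TtlCacheSketch<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAt;

        Entry(V value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final long ttlMillis;
    private final LongSupplier clock;   // injected so expiry can be tested without sleeping
    private final Function<K, V> loader; // stands in for the trip to HDFS
    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();

    TtlCacheSketch(long ttlMillis, LongSupplier clock, Function<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.loader = loader;
    }

    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now - entry.loadedAt >= ttlMillis) {
            // Missing or expired: reload. Concurrent callers may both reload,
            // which this sketch accepts for simplicity.
            entry = new Entry<>(loader.apply(key), now);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

 The trade-off is that a reader can see a descriptor that is up to one TTL 
 out of date, in exchange for skipping the filesystem on most calls.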



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-04 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196744#comment-14196744
 ] 

Andrew Purtell commented on HBASE-12219:


I'm planning to roll the 0.98.8 RC0 this Friday. We'll need a fix for truncate 
issues on 0.98 branch before then to avoid a revert of this change. Thanks!



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-04 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14197055#comment-14197055
 ] 

Esteban Gutierrez commented on HBASE-12219:
---

[~apurtell] the addendum for 0.98 should solve the original problem, can you 
give it a try?



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14197081#comment-14197081
 ] 

stack commented on HBASE-12219:
---

I applied HBASE-12219-0.99.v1.patch.  Let's see if branch-1 stays stable.  If 
so, will apply 0.98.



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-04 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14197109#comment-14197109
 ] 

Andrew Purtell commented on HBASE-12219:


I applied the addendum patch to 0.98. A quick test with the minicluster and 
shell truncate command looks ok. TestAdmin truncate tests pass. Please apply 
the addendum to 0.98 whenever ready [~stack]



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14197366#comment-14197366
 ] 

Hudson commented on HBASE-12219:


FAILURE: Integrated in HBase-0.98 #653 (See 
[https://builds.apache.org/job/HBase-0.98/653/])
HBASE-12219 Cache more efficiently getAll() and get() in FSTableDescriptors 
(addendum) (stack: rev a4b800c8241a64bcdc3c83f6700bbc9eb3e798bc)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/TruncateTableHandler.java




[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14197398#comment-14197398
 ] 

Hudson commented on HBASE-12219:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #622 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/622/])
HBASE-12219 Cache more efficiently getAll() and get() in FSTableDescriptors 
(addendum) (stack: rev a4b800c8241a64bcdc3c83f6700bbc9eb3e798bc)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/TruncateTableHandler.java




[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195662#comment-14195662
 ] 

Andrew Purtell commented on HBASE-12219:


Looks like truncation isn't working now in 0.98. I was complaining on the wrong 
issue before, see 
https://issues.apache.org/jira/browse/HBASE-12142?focusedCommentId=14195604&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14195604
 for log detail and steps to reproduce.



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-02 Thread Dima Spivak (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193727#comment-14193727
 ] 

Dima Spivak commented on HBASE-12219:
-

I disabled forked process timeouts and reran TestAdmin before [~stack]'s 
commit, which revealed this:
{code}
testTruncateTablePreservingSplits(org.apache.hadoop.hbase.client.TestAdmin)  Time elapsed: 242.311 sec   ERROR!
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=7, exceptions:
Sat Nov 01 22:54:33 PDT 2014, null, java.net.SocketTimeoutException: callTimeout=6, callDuration=291372: row '' on table 'testTruncateTablePreservingSplits

    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:261)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:199)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:56)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:196)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287)
    at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
    at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
    at org.apache.hadoop.hbase.client.ClientScanner.init(ClientScanner.java:134)
    at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:789)
    at org.apache.hadoop.hbase.HBaseTestingUtility.countRows(HBaseTestingUtility.java:1894)
    at org.apache.hadoop.hbase.client.TestAdmin.testTruncateTable(TestAdmin.java:393)
    at org.apache.hadoop.hbase.client.TestAdmin.testTruncateTablePreservingSplits(TestAdmin.java:369)
Caused by: java.net.SocketTimeoutException: callTimeout=6, callDuration=291372: row '' on table 'testTruncateTablePreservingSplits
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:155)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:294)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:275)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:299)
    at org.apache.hadoop.hbase.client.ScannerCallable.prepare(ScannerCallable.java:136)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:121)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:294)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:275)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in hbase:meta for region testTruncateTablePreservingSplits,,1414907431375.c73f9cd53b251842620244ea108213d5. containing row 
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1227)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1093)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.relocateRegion(ConnectionManager.java:1064)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:288)
    at org.apache.hadoop.hbase.client.ScannerCallable.prepare(ScannerCallable.java:136)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:121)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:294)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:275)
at 

[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-02 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194081#comment-14194081
 ] 

stack commented on HBASE-12219:
---

Good digging [~dimaspivak] How you disable?  Setting in pom?  You ran locally 
or up on Apache.  Should we have a config which disables forking when we have a 
zombie amok?



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-02 Thread Dima Spivak (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194086#comment-14194086
 ] 

Dima Spivak commented on HBASE-12219:
-

I ran on my internal rig (don't have an Apache account) and added 
{{-Dsurefire.timeout=0}} to the mvn command which set the process timeout to 
unlimited. Can definitely have a Jenkins parameter in the builds job that does 
that when we notice that the build has gone red because of zombies.



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-02 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194092#comment-14194092
 ] 

Andrew Purtell commented on HBASE-12219:


The timeout setting above is a nice tip. When hunting zombies locally you can 
also set the first part and second part fork modes to always so each test 
runs in its own VM, then loop the unit test suite, watch for stragglers, then 
jstack. Works well because you can be sure every stack in the dump is relevant 
for the hung test.  



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193614#comment-14193614
 ] 

Hudson commented on HBASE-12219:


SUCCESS: Integrated in HBase-1.0 #406 (See 
[https://builds.apache.org/job/HBase-1.0/406/])
HBASE-12219 Cache more efficiently getAll() and get() in FSTableDescriptors; 
REVERTgit log! branch-1 patch AND addendum (stack: rev 
0aca51e89cd0fe69d9cd57648949df5c5b506c53)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/TableDescriptors.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java


 Cache more efficiently getAll() and get() in FSTableDescriptors
 ---

 Key: HBASE-12219
 URL: https://issues.apache.org/jira/browse/HBASE-12219
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.24, 0.99.1, 0.98.6.1
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
  Labels: scalability
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12219-0.98.patch, HBASE-12219-0.98.v1.patch, 
 HBASE-12219-0.99.addendum.patch, HBASE-12219-0.99.patch, 
 HBASE-12219-v1.patch, HBASE-12219-v1.patch, HBASE-12219.v0.txt, 
 HBASE-12219.v2.patch, HBASE-12219.v3.patch, list.png


 Currently table descriptors and tables are cached once they are accessed for 
 the first time. Subsequent calls to the master only require a trip to HDFS to 
 look up the modified time in order to reload the table descriptors if 
 modified. However, in clusters with a large number of tables or concurrent 
 clients this can be too aggressive for HDFS and the master, causing 
 contention when processing other requests. A simple solution is to have a 
 TTL-based cache for FSTableDescriptors#getAll() and 
 FSTableDescriptors#TableDescriptorAndModtime() that allows the master to 
 process calls to listtables() or getTableDescriptor() faster, without 
 causing contention and without a trip to HDFS on every call.
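
 The TTL idea in the description can be sketched roughly as follows. This is 
 only an illustration, not the code from any of the attached patches: the 
 class TtlCache, its constructor parameters, and the injected clock and loader 
 are all hypothetical names, and the loader stands in for whatever reads a 
 descriptor from HDFS.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical TTL cache sketch; not the FSTableDescriptors implementation. */
final class TtlCache<K, V> {
  /** A cached value plus the time (in millis) at which it was loaded. */
  private static final class Entry<V> {
    final V value;
    final long loadedAtMillis;

    Entry(V value, long loadedAtMillis) {
      this.value = value;
      this.loadedAtMillis = loadedAtMillis;
    }
  }

  private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
  private final long ttlMillis;
  private final LongSupplier clock;    // assumption: injected so tests can control time
  private final Function<K, V> loader; // assumption: stands in for the read from HDFS

  TtlCache(long ttlMillis, LongSupplier clock, Function<K, V> loader) {
    this.ttlMillis = ttlMillis;
    this.clock = clock;
    this.loader = loader;
  }

  /** Returns the cached value, reloading it only when it is missing or older than the TTL. */
  V get(K key) {
    long now = clock.getAsLong();
    Entry<V> entry = entries.get(key);
    if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
      // Two threads may both reload here; the last write wins, which only costs an extra load.
      entry = new Entry<>(loader.apply(key), now);
      entries.put(key, entry);
    }
    return entry.value;
  }

  /** Drops a cached value so that the next get() reloads it. */
  void invalidate(K key) {
    entries.remove(key);
  }
}
```

 With a cache like this, readers can see a value that is up to one TTL old, 
 so a writer that wants its change visible immediately would need to call 
 something like invalidate() after updating the underlying data.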



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193669#comment-14193669
 ] 

stack commented on HBASE-12219:
---

Builds on branch-1 are blue again after backing this out.  I think this is the 
zombie maker.  Leaving open till we figure out why.



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-11-01 Thread Dima Spivak (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193693#comment-14193693
 ] 

Dima Spivak commented on HBASE-12219:
-

Using [~manukranthk]'s awesome findHangingTests script, it looks like the set 
of runs that were red all had org.apache.hadoop.hbase.client.TestAdmin hang, 
which caused the Surefire-forked process to time out after 15 minutes and fail 
the Maven build.
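
One way to spot a hanging test in a build console log is to look for test 
classes that were started but never reported a result. The sketch below 
assumes Surefire-style lines of the form "Running <class>" and 
"Tests run: ... - in <class>"; it is not the findHangingTests script itself, 
and the class and method names are made up for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: report test classes that started but never finished. */
final class HangingTests {
  private static final String RUNNING = "Running ";
  private static final String FINISHED_MARKER = " - in ";

  static Set<String> findHanging(List<String> consoleLines) {
    Set<String> started = new LinkedHashSet<>();
    for (String raw : consoleLines) {
      String line = raw.trim();
      if (line.startsWith(RUNNING)) {
        // Assumed format: "Running org.example.TestFoo"
        started.add(line.substring(RUNNING.length()));
      } else if (line.startsWith("Tests run:")) {
        // Assumed format: "Tests run: 3, ..., Time elapsed: 1.2 sec - in org.example.TestFoo"
        int idx = line.lastIndexOf(FINISHED_MARKER);
        if (idx >= 0) {
          started.remove(line.substring(idx + FINISHED_MARKER.length()));
        }
      }
    }
    // Whatever is left was started but produced no result line.
    return started;
  }
}
```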



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191614#comment-14191614
 ] 

Hadoop QA commented on HBASE-12219:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12678443/HBASE-12219-0.99.patch
  against trunk revision .
  ATTACHMENT ID: 12678443

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11539//console

This message is automatically generated.



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-31 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191648#comment-14191648
 ] 

Esteban Gutierrez commented on HBASE-12219:
---

Cancelled the 0.99 patch for now; it was consistent but had some formatting issues.



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192091#comment-14192091
 ] 

Hadoop QA commented on HBASE-12219:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12678498/HBASE-12219-0.99.patch
  against trunk revision .
  ATTACHMENT ID: 12678498

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11545//console

This message is automatically generated.



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-31 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192219#comment-14192219
 ] 

stack commented on HBASE-12219:
---

Applied addendum. Thanks [~esteban]



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192340#comment-14192340
 ] 

Hudson commented on HBASE-12219:


FAILURE: Integrated in HBase-1.0 #400 (See 
[https://builds.apache.org/job/HBase-1.0/400/])
HBASE-12219 Cache more efficiently getAll() and get() in FSTableDescriptors 
(stack: rev 233fb8bf1880c6297419fe62e6891195771e2f42)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/TableDescriptors.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java
HBASE-12219 Cache more efficiently getAll() and get() in FSTableDescriptors 
(addendum) (stack: rev 1f18d706a8da51641776c33a594391e69003da3a)
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java




[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192404#comment-14192404
 ] 

Andrew Purtell commented on HBASE-12219:


0.98 patch doesn't apply cleanly:
{noformat}
patching file hbase-server/src/main/java/org/apache/hadoop/hbase/TableDescriptors.java
patching file hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Hunk #1 succeeded at 362 (offset 3 lines).
Hunk #2 succeeded at 489 (offset 3 lines).
Hunk #3 succeeded at 811 with fuzz 1 (offset 3 lines).
patching file hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java
patching file hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Hunk #4 FAILED at 1309.
1 out of 10 hunks FAILED -- saving rejects to file hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java.rej
patching file hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
Hunk #3 FAILED at 88.
Hunk #4 FAILED at 120.
Hunk #5 FAILED at 130.
Hunk #6 FAILED at 147.
Hunk #7 succeeded at 197 (offset 7 lines).
Hunk #8 succeeded at 270 (offset 7 lines).
Hunk #9 succeeded at 289 (offset 7 lines).
Hunk #10 succeeded at 307 (offset 7 lines).
Hunk #11 succeeded at 321 (offset 7 lines).
Hunk #12 succeeded at 337 (offset 7 lines).
Hunk #13 succeeded at 356 (offset 7 lines).
Hunk #14 succeeded at 396 (offset 7 lines).
Hunk #15 succeeded at 421 (offset 7 lines).
Hunk #16 succeeded at 465 (offset 7 lines).
Hunk #17 succeeded at 473 (offset 7 lines).
Hunk #18 FAILED at 481.
Hunk #19 succeeded at 541 (offset 7 lines).
Hunk #20 succeeded at 581 (offset 7 lines).
Hunk #21 succeeded at 596 (offset 7 lines).
Hunk #22 succeeded at 622 (offset 7 lines).
Hunk #23 succeeded at 648 (offset 7 lines).
Hunk #24 succeeded at 686 (offset 7 lines).
Hunk #25 succeeded at 712 (offset 7 lines).
Hunk #26 succeeded at 720 (offset 7 lines).
Hunk #27 succeeded at 752 (offset 7 lines).
5 out of 27 hunks FAILED -- saving rejects to file hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java.rej
patching file hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
patching file hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java
Hunk #3 succeeded at 395 (offset 1 line).
{noformat}




[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-31 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192599#comment-14192599
 ] 

Esteban Gutierrez commented on HBASE-12219:
---

I see, I made the patch in a branch that was 29 commits behind. Let me fix that 
for you.



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192929#comment-14192929
 ] 

Hudson commented on HBASE-12219:


FAILURE: Integrated in HBase-0.98 #645 (See 
[https://builds.apache.org/job/HBase-0.98/645/])
HBASE-12219 Cache more efficiently getAll() and get() in FSTableDescriptors 
(apurtell: rev a21352f4de3e2ddb6ae23ce97d3570647d28a705)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/TableDescriptors.java




[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192949#comment-14192949
 ] 

Hudson commented on HBASE-12219:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #614 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/614/])
HBASE-12219 Cache more efficiently getAll() and get() in FSTableDescriptors 
(apurtell: rev a21352f4de3e2ddb6ae23ce97d3570647d28a705)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/TableDescriptors.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java




[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-30 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190725#comment-14190725
 ] 

Esteban Gutierrez commented on HBASE-12219:
---

Tests should pass once HBASE-12380 is committed.




[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191019#comment-14191019
 ] 

Hadoop QA commented on HBASE-12219:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12678315/HBASE-12219-v1.patch
  against trunk revision .
  ATTACHMENT ID: 12678315

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
3776 checkstyle errors (more than the trunk's current 3774 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+  tds.put(this.metaTableDescritor.getNameAsString(), new TableDescriptor(metaTableDescritor, TableState.State.ENABLED));
+  public static TableDescriptor getTableDescriptorFromFs(FileSystem fs, Path tableDir, boolean rewritePb)
+FSTableDescriptors htds = new FSTableDescriptorsTest(UTIL.getConfiguration(), fs, rootdir, false, false);
+FSTableDescriptors nonchtds = new FSTableDescriptorsTest(UTIL.getConfiguration(), fs, rootdir, false, false);

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/checkstyle-aggregate.html

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11525//console

This message is automatically generated.
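
For reference, a check like the lineLengths item in the report above could be 
approximated as below. This is a guess at the behaviour, not the actual 
pre-commit script: the class name is hypothetical, and the choice to skip 
"+++" file headers and to not count the leading "+" toward the limit is an 
assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a "lines longer than N" check over a unified diff. */
final class LineLengthCheck {
  static List<String> tooLongAddedLines(List<String> diffLines, int maxLength) {
    List<String> offenders = new ArrayList<>();
    for (String line : diffLines) {
      // Added lines start with '+'; "+++" marks a file header, not added content.
      boolean added = line.startsWith("+") && !line.startsWith("+++");
      // Assumption: the leading '+' is diff syntax and does not count toward the limit.
      if (added && line.length() - 1 > maxLength) {
        offenders.add(line);
      }
    }
    return offenders;
  }
}
```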


[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191031#comment-14191031
 ] 

Hadoop QA commented on HBASE-12219:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12678319/HBASE-12219.v2.patch
  against trunk revision .
  ATTACHMENT ID: 12678319

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.util.TestFSTableDescriptors

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11526//console

This message is automatically generated.

 Cache more efficiently getAll() and get() in FSTableDescriptors
 ---

 Key: HBASE-12219
 URL: https://issues.apache.org/jira/browse/HBASE-12219
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.24, 0.99.1, 0.98.6.1
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
  Labels: scalability
 Attachments: HBASE-12219-v1.patch, HBASE-12219-v1.patch, 
 HBASE-12219.v0.txt, HBASE-12219.v2.patch, list.png


 Currently table descriptors and tables are cached once they are accessed for 
 the first time. Subsequent calls to the master only require a trip to HDFS to 
 look up the modified time in order to reload the table descriptors if 
 modified. However, in clusters with a large number of tables or concurrent 
 clients, this can be too aggressive for HDFS and the master, causing 
 contention when processing other requests. A simple solution is to have a 
 TTL-based cache for FSTableDescriptors#getAll() and 
 FSTableDescriptors#TableDescriptorAndModtime() that allows the master to 
 process those calls faster, without causing contention and without having to 
 perform a trip to HDFS for every call to listtables() or getTableDescriptor().
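
 The TTL-based cache proposed in the description could look roughly like the 
 sketch below. This is only an illustration, not the code in any of the 
 attached patches: the class TtlDescriptorCache, its loader function (standing 
 in for the HDFS read) and the injectable clock are all hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical TTL cache; NOT the FSTableDescriptors implementation. */
class TtlDescriptorCache<V> {
  /** A cached value together with the time it was loaded. */
  private static final class Entry<V> {
    final V value;
    final long loadedAtMillis;

    Entry(V value, long loadedAtMillis) {
      this.value = value;
      this.loadedAtMillis = loadedAtMillis;
    }
  }

  private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
  private final long ttlMillis;
  private final Function<String, V> loader; // assumption: stands in for the HDFS read
  private final LongSupplier clock;         // assumption: injectable so time can be controlled

  TtlDescriptorCache(long ttlMillis, Function<String, V> loader, LongSupplier clock) {
    this.ttlMillis = ttlMillis;
    this.loader = loader;
    this.clock = clock;
  }

  /** Returns the cached value, reloading only when the entry is missing or older than the TTL. */
  V get(String tableName) {
    long now = clock.getAsLong();
    Entry<V> entry = entries.get(tableName);
    if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
      entry = new Entry<>(loader.apply(tableName), now);
      entries.put(tableName, entry);
    }
    return entry.value;
  }
}
```

 With a cache like this, a changed descriptor can stay invisible to readers for 
 up to one TTL interval, which is the price paid for skipping the modtime 
 lookup on every call.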



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-30 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191080#comment-14191080
 ] 

Esteban Gutierrez commented on HBASE-12219:
---

v2 failed after turning off the cache in the FSTableDescriptors 
constructor. I think it should be fine to use v1 instead; will upload a new patch.



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191195#comment-14191195
 ] 

Hadoop QA commented on HBASE-12219:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12678360/HBASE-12219.v3.patch
  against trunk revision .
  ATTACHMENT ID: 12678360

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.coprocessor.TestCoprocessorHConnection

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):
   at org.apache.hadoop.hdfs.TestDecommission.testDecommission(TestDecommission.java:574)
   at org.apache.hadoop.hdfs.TestDecommission.testDecommissionFederation(TestDecommission.java:422)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//console

This message is automatically generated.


[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-30 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191273#comment-14191273
 ] 

Esteban Gutierrez commented on HBASE-12219:
---

Test failure is not related; see HBASE-11819.



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-30 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191314#comment-14191314
 ] 

stack commented on HBASE-12219:
---

Applied to master.  Do you want to make a 0.99 patch, [~esteban]? (It didn't 
cherry-pick nicely... lots 'off'.)



[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191383#comment-14191383
 ] 

Hudson commented on HBASE-12219:


FAILURE: Integrated in HBase-TRUNK #5728 (See 
[https://builds.apache.org/job/HBase-TRUNK/5728/])
HBASE-12219 Cache more efficiently getAll() and get() in FSTableDescriptors 
(stack: rev ba7344f5d166e8f3df18258be13240993af1c8d4)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/TableDescriptors.java




[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188259#comment-14188259
 ] 

Hadoop QA commented on HBASE-12219:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677859/HBASE-12219-v1.patch
  against trunk revision .
  ATTACHMENT ID: 12677859

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
3792 checkstyle errors (more than the trunk's current 3790 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+  tds.put(this.metaTableDescritor.getNameAsString(), new TableDescriptor(metaTableDescritor, TableState.State.ENABLED));
+  public static TableDescriptor getTableDescriptorFromFs(FileSystem fs, Path tableDir, boolean rewritePb)
+FSTableDescriptors htds = new FSTableDescriptorsTest(UTIL.getConfiguration(), fs, rootdir, false, false);
+FSTableDescriptors nonchtds = new FSTableDescriptorsTest(UTIL.getConfiguration(), fs, rootdir, false, false);

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.regionserver.TestRegionServerNoMaster

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//artifact/patchprocess/checkstyle-aggregate.html

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11506//console

This message is automatically generated.


[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-17 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14175232#comment-14175232
 ] 

Esteban Gutierrez commented on HBASE-12219:
---

Changed the summary to reflect that this no longer requires configuring a TTL to 
refresh the cached entries. Also added a new property to load the HTDs at 
master startup (enabled by default): 
{{hbase.master.preload.tabledescriptors}}
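
A rough sketch of the idea, assuming a single writer that updates its cached 
copy on every write and can optionally preload all entries at startup. This is 
not the patch itself: WriteThroughDescriptorCache, the in-memory store map 
(standing in for the descriptors on HDFS) and the storeReads counter are 
hypothetical names used only for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical single-writer cache; NOT the FSTableDescriptors implementation. */
class WriteThroughDescriptorCache {
  private final Map<String, String> store; // assumption: stands in for descriptors on HDFS
  private final Map<String, String> cache = new ConcurrentHashMap<>();
  private int storeReads = 0;              // counts simulated trips to the store

  WriteThroughDescriptorCache(Map<String, String> store, boolean preload) {
    this.store = store;
    if (preload) {
      // Same idea as hbase.master.preload.tabledescriptors: load everything once at startup.
      cache.putAll(store);
      storeReads++;
    }
  }

  /** Serves from memory when possible; goes to the store only on a cache miss. */
  String get(String table) {
    String cached = cache.get(table);
    if (cached != null) {
      return cached;
    }
    storeReads++;
    String loaded = store.get(table);
    if (loaded != null) {
      cache.put(table, loaded);
    }
    return loaded;
  }

  /** Writes to the store and proactively refreshes the cached copy, so no TTL is needed. */
  void update(String table, String descriptor) {
    store.put(table, descriptor);
    cache.put(table, descriptor);
  }

  int storeReads() {
    return storeReads;
  }
}
```

The sketch only stays consistent while every write goes through update(); a 
writer that modifies the store directly would leave the cached copy stale.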




[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors

2014-10-17 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14175268#comment-14175268
 ] 

Sean Busbey commented on HBASE-12219:
-

Can you update the patch to be based on master? It looks like the same NN 
reading code is in the cache there.
