[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock
[ https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484734#comment-13484734 ] ramkrishna.s.vasudevan commented on HBASE-5898: --- We have recently hit this issue. My main concern here: is it only lock contention that is happening? But here, in our case, the scan itself did not happen for almost 10 minutes. The thread dump clearly shows what was found in this JIRA. Consider double-checked locking for block cache lock Key: HBASE-5898 URL: https://issues.apache.org/jira/browse/HBASE-5898 Project: HBase Issue Type: Improvement Components: Performance Affects Versions: 0.94.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.94.3, 0.96.0 Attachments: 5898-TestBlocksRead.txt, HBASE-5898-0.patch, HBASE-5898-1.patch, hbase-5898.txt Running a workload with a high query rate against a dataset that fits in cache, I saw a lot of CPU being used in IdLock.getLockEntry, called by HFileReaderV2.readBlock. Even though they were all cache hits, a lot of CPU was being wasted on lock management. I wrote a quick patch to switch to double-checked locking and it improved throughput substantially for this workload. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
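For readers following along, a minimal sketch of the double-checked pattern Todd describes (this is illustrative only, not the actual HBASE-5898 patch; the class and loader names are hypothetical, and `synchronized (key.intern())` is a simplification standing in for HBase's per-id IdLock):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Double-checked locking around a block cache: probe the cache lock-free
// first, and only take the per-key lock on a miss; re-check inside the
// lock so racing readers still load the block at most once.
class BlockCacheDemo {
    final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger loads = new AtomicInteger(); // counts expensive reads

    String getBlock(String key) {
        String block = cache.get(key);      // 1st check: no lock on a cache hit
        if (block != null) {
            return block;
        }
        synchronized (key.intern()) {       // stand-in for IdLock.getLockEntry(key)
            block = cache.get(key);         // 2nd check: a racing thread may have loaded it
            if (block == null) {
                block = loadFromHFile(key); // expensive read, done once per key
                cache.put(key, block);
            }
            return block;
        }
    }

    // Hypothetical loader standing in for HFileReaderV2.readBlock.
    String loadFromHFile(String key) {
        loads.incrementAndGet();
        return "data:" + key;
    }
}
```

The point of the pattern is the fast path: once a block is cached, readers never touch the lock, which is where the CPU was being burned in the workload above.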
[jira] [Commented] (HBASE-6651) Thread safety of HTablePool is doubtful
[ https://issues.apache.org/jira/browse/HBASE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484735#comment-13484735 ] Hadoop QA commented on HBASE-6651: --
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12550927/HBASE-6651-V5.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 11 new or modified tests.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 95 warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3154//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3154//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3154//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3154//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3154//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3154//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3154//console
This message is automatically generated.
Thread safety of HTablePool is doubtful --- Key: HBASE-6651 URL: https://issues.apache.org/jira/browse/HBASE-6651 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.94.1 Reporter: Hiroshi Ikeda Priority: Minor Attachments: HBASE-6651.patch, HBASE-6651-V2.patch, HBASE-6651-V3.patch, HBASE-6651-V4.patch, HBASE-6651-V5.patch, sample.zip, sample.zip, sharedmap_for_hbaseclient.zip Some operations in HTablePool access PoolMap multiple times without any explicit synchronization. For example, HTablePool.closeTablePool() calls PoolMap.values() and then calls PoolMap.remove(). If other threads add new instances to the pool in the middle of these calls, the newly added instances might be dropped. (HTablePool.closeTablePool() also has another problem: calling it from multiple threads causes an HTable to be accessed by multiple threads.) Moreover, PoolMap is not thread safe for the same reason. For example, PoolMap.put() calls ConcurrentMap.get() and then calls ConcurrentMap.put(). If other threads add a new instance to the concurrent map in the middle of these calls, the new instance might be dropped. The implementations of Pool have the same problems. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
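The get-then-put pattern Hiroshi describes loses updates because the two calls are atomic individually but not as a sequence, even on a ConcurrentMap. A sketch of the bug and the standard atomic fix (class and method names here are illustrative, not PoolMap's actual code):

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Check-then-act on a ConcurrentMap: the buggy shape and the safe shape.
class PoolMapDemo {
    final ConcurrentMap<String, List<String>> pool = new ConcurrentHashMap<>();

    // Buggy: two threads can both see null for the same key, both create a
    // bucket, and put() lets the second overwrite the first, silently
    // dropping whatever the first thread added to its bucket.
    void putRacy(String key, String value) {
        List<String> values = pool.get(key);
        if (values == null) {
            values = new CopyOnWriteArrayList<>();
            pool.put(key, values); // may clobber another thread's bucket
        }
        values.add(value);
    }

    // Safe: putIfAbsent makes "create the bucket" atomic, so every thread
    // ends up adding to the single winning bucket.
    void putSafe(String key, String value) {
        List<String> values = pool.get(key);
        if (values == null) {
            List<String> fresh = new CopyOnWriteArrayList<>();
            List<String> prior = pool.putIfAbsent(key, fresh);
            values = (prior != null) ? prior : fresh;
        }
        values.add(value);
    }
}
```

The same putIfAbsent (or a compute-style loop) applies to each of the check-then-act sequences called out in the issue description.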
[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484738#comment-13484738 ] stack commented on HBASE-6410: -- I started some review here https://reviews.apache.org/r/7747/ Elliott Move RegionServer Metrics to metrics2 - Key: HBASE-6410 URL: https://issues.apache.org/jira/browse/HBASE-6410 Project: HBase Issue Type: Sub-task Components: metrics Affects Versions: 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Blocker Attachments: HBASE-6410-1.patch, HBASE-6410-2.patch, HBASE-6410-3.patch, HBASE-6410-4.patch, HBASE-6410-5.patch, HBASE-6410.patch Move RegionServer Metrics to metrics2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock
[ https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484739#comment-13484739 ] stack commented on HBASE-5898: -- [~ram_krish] How you mean Ram? It was stuck where in particular? Was it a bunch of threads getting same block? What did the thread dump look like? There is some issue in here around the wait/notify it seems as implemented. The double-checked is probably better anyways but could the issue come back just less frequently after this patch goes in? Consider double-checked locking for block cache lock Key: HBASE-5898 URL: https://issues.apache.org/jira/browse/HBASE-5898 Project: HBase Issue Type: Improvement Components: Performance Affects Versions: 0.94.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.94.3, 0.96.0 Attachments: 5898-TestBlocksRead.txt, HBASE-5898-0.patch, HBASE-5898-1.patch, hbase-5898.txt Running a workload with a high query rate against a dataset that fits in cache, I saw a lot of CPU being used in IdLock.getLockEntry, being called by HFileReaderV2.readBlock. Even though it was all cache hits, it was wasting a lot of CPU doing lock management here. I wrote a quick patch to switch to a double-checked locking and it improved throughput substantially for this workload. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484742#comment-13484742 ] Hudson commented on HBASE-6852: --- Integrated in HBase-0.94 #556 (See [https://builds.apache.org/job/HBase-0.94/556/]) HBASE-6852 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields (Cheng Hao and LarsH) (Revision 1402392) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields Key: HBASE-6852 URL: https://issues.apache.org/jira/browse/HBASE-6852 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.0 Reporter: Cheng Hao Assignee: Cheng Hao Priority: Minor Labels: performance Fix For: 0.94.3 Attachments: 6852-0.94.txt, metrics_hotspots.png, onhitcache-trunk.patch The SchemaMetrics.updateOnCacheHit costs too much while I am doing the full table scanning. 
Here are the top 5 hotspots within the regionserver while full scanning a table (sorry for the rough formatting):
CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 500
samples    %         image name   symbol name
---
98447    13.4324   14033.jo     void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
  98447   100.000   14033.jo     void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
---
45814     6.2510   14033.jo     int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
  45814   100.000   14033.jo     int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
---
43523     5.9384   14033.jo     boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
  43523   100.000   14033.jo     boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
---
42548     5.8054   14033.jo     int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
  42548   100.000   14033.jo     int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
---
40572     5.5358   14033.jo     int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
  40572   100.000   14033.jo     int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]
-- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7008) Set scanner caching to a better default
[ https://issues.apache.org/jira/browse/HBASE-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484757#comment-13484757 ] Hudson commented on HBASE-7008: --- Integrated in HBase-TRUNK #3488 (See [https://builds.apache.org/job/HBase-TRUNK/3488/]) Add link to Lars's graphs in hbase-7008 where he plays w/ nagles and different data sizes (Revision 1402399) Result = FAILURE Set scanner caching to a better default --- Key: HBASE-7008 URL: https://issues.apache.org/jira/browse/HBASE-7008 Project: HBase Issue Type: Bug Components: Client Reporter: liang xie Assignee: liang xie Fix For: 0.94.3, 0.96.0 Attachments: 7008-0.94.txt, 7008-0.94-v2.txt, 7008-v3.txt, 7008-v4.txt, HBASE-7008.patch, HBASE-7008-v2.patch per http://search-hadoop.com/m/qaRu9iM2f02/Set+scanner+caching+to+a+better+default%253Fsubj=Set+scanner+caching+to+a+better+default+ let's set to 100 by default -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7008) Set scanner caching to a better default
[ https://issues.apache.org/jira/browse/HBASE-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484766#comment-13484766 ] nkeywal commented on HBASE-7008: I think #1 is better than #3, because with #1 it will be used by everybody, hence it will become production-proven for most use cases. It's also better for people new to HBase. Just my 2 cents :-) Set scanner caching to a better default --- Key: HBASE-7008 URL: https://issues.apache.org/jira/browse/HBASE-7008 Project: HBase Issue Type: Bug Components: Client Reporter: liang xie Assignee: liang xie Fix For: 0.94.3, 0.96.0 Attachments: 7008-0.94.txt, 7008-0.94-v2.txt, 7008-v3.txt, 7008-v4.txt, HBASE-7008.patch, HBASE-7008-v2.patch per http://search-hadoop.com/m/qaRu9iM2f02/Set+scanner+caching+to+a+better+default%253Fsubj=Set+scanner+caching+to+a+better+default+ let's set to 100 by default -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7008) Set scanner caching to a better default
[ https://issues.apache.org/jira/browse/HBASE-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484773#comment-13484773 ] liang xie commented on HBASE-7008: -- +1 for #1; the default setting should be user-friendly for most end users and newcomers :) We should not always expect end users to tune per the docs. In other words, if we leave all those things unchanged, there will still be many users complaining about why HBase performance is so bad, and only experts will be able to figure out the root cause. Set scanner caching to a better default --- Key: HBASE-7008 URL: https://issues.apache.org/jira/browse/HBASE-7008 Project: HBase Issue Type: Bug Components: Client Reporter: liang xie Assignee: liang xie Fix For: 0.94.3, 0.96.0 Attachments: 7008-0.94.txt, 7008-0.94-v2.txt, 7008-v3.txt, 7008-v4.txt, HBASE-7008.patch, HBASE-7008-v2.patch per http://search-hadoop.com/m/qaRu9iM2f02/Set+scanner+caching+to+a+better+default%253Fsubj=Set+scanner+caching+to+a+better+default+ let's set to 100 by default -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484783#comment-13484783 ] Anoop Sam John commented on HBASE-5416: --- I got a chance to go through this and the discussion around it. @Max Clearly it is a good idea; the improvement in your scenario will be huge. The concerns about the change are worth considering, I guess, as it is a very critical path. I have one idea for you to solve the problem without a 2-phase RPC. How about the below way? E.g. I have one table with 2 CFs (cf1, cf2) and an SCVF condition on cf1 (cf1:c1=v1).
1. Create a Scan from the client side with only cf1 specified and with the filter:
{code}
SingleColumnValueFilter filter = new SingleColumnValueFilter(cf1, c1, CompareOp.EQUAL, v1);
Scan scan = new Scan();
scan.setFilter(filter);
scan.addFamily(cf1);
for (Result result : ht.getScanner(scan)) {
  // deal with result
}
{code}
2. Implement a RegionObserver CP and implement the postScannerNext() hook. This hook executes within the server. In the hook, for every rowkey the scan selects, create a Get request with the remaining CFs specified and add those KVs also to the Result:
{code}
public boolean postScannerNext(ObserverContext<RegionCoprocessorEnvironment> e, InternalScanner s,
    List<Result> results, int limit, boolean hasMore) throws IOException {
  // Next call happens on one region from the HRS
  HRegion region = e.getEnvironment().getRegion();
  List<Result> finalResults = new ArrayList<Result>(results.size());
  for (Result result : results) {
    // Every result corresponds to one row. Assume there is no batching being used
    byte[] row = result.getRow();
    Get get = new Get(row);
    get.addFamily(cf2); // cf1 is already fetched
    Result result2 = region.get(get, null);
    List<KeyValue> finalKVs = new ArrayList<KeyValue>();
    finalKVs.addAll(result.list());
    finalKVs.addAll(result2.list());
    finalResults.add(new Result(finalKVs));
  }
  // replace the results with the new finalResults
  results.clear();
  results.addAll(finalResults);
  return hasMore;
}
{code}
This hook is at the HRS level and runs after the Result object preparation. Right now we don't have any other hook during the scanner next() calls further down that would let us deal with the KVs list, so we need to recreate the Result, which is a somewhat ugly way of coding... This way it should be possible to fetch the data you want. Maybe not as optimal as the internal change, but still far better than the 2 RPC calls... With CPs we can achieve many things.
Improve performance of scans with some kind of filters. --- Key: HBASE-5416 URL: https://issues.apache.org/jira/browse/HBASE-5416 Project: HBase Issue Type: Improvement Components: Filters, Performance, regionserver Affects Versions: 0.90.4 Reporter: Max Lapan Assignee: Max Lapan Fix For: 0.96.0 Attachments: 5416-Filtered_scans_v6.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch When a scan is performed, the whole row is loaded into the result list, and after that the filter (if one exists) is applied to decide whether the row is needed. But when the scan is performed over several CFs and the filter checks only data from a subset of them, the data from the CFs not checked by the filter is not needed at the filter stage; it is needed only once we have decided to include the current row. In that case we can significantly reduce the amount of IO performed by a scan by loading only the values actually checked by the filter. For example, we have two CFs: flags and snap. Flags is quite small (a bunch of megabytes) and is used to filter large entries from snap. Snap is very large (10s of GB) and it is quite costly to scan it. If we need only rows with some flag specified, we use SingleColumnValueFilter to limit the result to only a small subset of the region. But the current implementation loads both CFs to perform the scan, when only a small subset is needed. The attached patch adds one routine to the Filter interface to allow a filter to specify which CFs are needed for its operation. In HRegion, we separate all scanners into two groups: those needed by the filter and the rest (joined). When a new row is considered, only the needed data is loaded and the filter applied, and only if the filter accepts the row is the rest of the data loaded. On our data, this speeds up such scans 30-50 times. It also gives us a way to better normalize the data into separate columns by optimizing the scans performed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly,
[jira] [Updated] (HBASE-6651) Thread safety of HTablePool is doubtful
[ https://issues.apache.org/jira/browse/HBASE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hiroshi Ikeda updated HBASE-6651: - Attachment: HBASE-6651-V6.patch Patch v6 from the review board. Sorry for the frequent updates. Thread safety of HTablePool is doubtful --- Key: HBASE-6651 URL: https://issues.apache.org/jira/browse/HBASE-6651 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.94.1 Reporter: Hiroshi Ikeda Priority: Minor Attachments: HBASE-6651.patch, HBASE-6651-V2.patch, HBASE-6651-V3.patch, HBASE-6651-V4.patch, HBASE-6651-V5.patch, HBASE-6651-V6.patch, sample.zip, sample.zip, sharedmap_for_hbaseclient.zip Some operations in HTablePool access PoolMap multiple times without any explicit synchronization. For example, HTablePool.closeTablePool() calls PoolMap.values() and then calls PoolMap.remove(). If other threads add new instances to the pool in the middle of these calls, the newly added instances might be dropped. (HTablePool.closeTablePool() also has another problem: calling it from multiple threads causes an HTable to be accessed by multiple threads.) Moreover, PoolMap is not thread safe for the same reason. For example, PoolMap.put() calls ConcurrentMap.get() and then calls ConcurrentMap.put(). If other threads add a new instance to the concurrent map in the middle of these calls, the new instance might be dropped. The implementations of Pool have the same problems. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7051) Read/Updates (increment,checkAndPut) should properly read MVCC
[ https://issues.apache.org/jira/browse/HBASE-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484803#comment-13484803 ] ramkrishna.s.vasudevan commented on HBASE-7051: --- Yes, what Gregory says is possible. I tried to reproduce the scenario he mentioned, and it happens. If thread B is just about to wait for the MVCC to complete, and in the meantime thread A does the check operation, the check is going to succeed. Then the MVCC completes and the put that is part of checkAndPut completes, thus overwriting what thread B has written. Read/Updates (increment,checkAndPut) should properly read MVCC -- Key: HBASE-7051 URL: https://issues.apache.org/jira/browse/HBASE-7051 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Gregory Chanan See, for example: {code} // TODO: Use MVCC to make this set of increments atomic to reads {code} Here's an example of what can happen (it would probably be good to write up a test case for each read/update): Concurrent update via increment and put. The put grabs the row lock first and updates the memstore, but releases the row lock before the MVCC is advanced. Then the increment grabs the row lock and reads right away, reading the old value and incrementing based on that. There are a few options here: 1) Wait for the MVCC to advance for read/updates: the downside is that you have to wait for updates on other rows. 2) Have an MVCC per row (table configuration): this avoids the unnecessary contention of 1). 3) Transform the read/updates to write-only with rollup on read, e.g. an increment would just have the number of values to increment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
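The interleaving described above can be sketched as a toy model (plain Java, not HBase code; every name here is hypothetical): a put sits in the "memstore" with a write number above the MVCC read point, so a concurrent checkAndPut reading at the old read point misses it, passes its check, and overwrites it.

```java
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the check-then-act MVCC race: writes carry a write number,
// and readers only see entries at or below the completed read point.
class MvccRaceDemo {
    // row -> {value, writeNumber}
    final ConcurrentHashMap<String, long[]> memstore = new ConcurrentHashMap<>();
    long readPoint = 0; // highest fully-completed write number
    long nextWrite = 0;

    // Thread B's put: lands in the memstore, but readPoint is NOT advanced
    // yet (mirrors releasing the row lock before the MVCC advances).
    long beginPut(String row, long v) {
        long wn = ++nextWrite;
        memstore.put(row, new long[]{v, wn});
        return wn; // caller later calls completePut(wn)
    }

    void completePut(long wn) {
        readPoint = Math.max(readPoint, wn);
    }

    // Thread A's checkAndPut: only "sees" values at or below readPoint.
    boolean checkAndPut(String row, long expected, long newVal) {
        long[] cur = memstore.get(row);
        boolean visible = cur != null && cur[1] <= readPoint;
        long seen = visible ? cur[0] : -1; // -1 == "no visible value"
        if (seen != expected) {
            return false;
        }
        memstore.put(row, new long[]{newVal, ++nextWrite});
        completePut(nextWrite);
        return true;
    }
}
```

Running B's beginPut and then A's checkAndPut before B's completePut shows the lost update: A's check passes against the stale (invisible) state and clobbers B's value, which is exactly the sequence Ram reproduced.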
[jira] [Commented] (HBASE-7051) Read/Updates (increment,checkAndPut) should properly read MVCC
[ https://issues.apache.org/jira/browse/HBASE-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484806#comment-13484806 ] ramkrishna.s.vasudevan commented on HBASE-7051: --- Also, this applies to other versions too, right? Not only trunk? Read/Updates (increment,checkAndPut) should properly read MVCC -- Key: HBASE-7051 URL: https://issues.apache.org/jira/browse/HBASE-7051 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Gregory Chanan See, for example: {code} // TODO: Use MVCC to make this set of increments atomic to reads {code} Here's an example of what can happen (it would probably be good to write up a test case for each read/update): Concurrent update via increment and put. The put grabs the row lock first and updates the memstore, but releases the row lock before the MVCC is advanced. Then the increment grabs the row lock and reads right away, reading the old value and incrementing based on that. There are a few options here: 1) Wait for the MVCC to advance for read/updates: the downside is that you have to wait for updates on other rows. 2) Have an MVCC per row (table configuration): this avoids the unnecessary contention of 1). 3) Transform the read/updates to write-only with rollup on read, e.g. an increment would just have the number of values to increment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6651) Thread safety of HTablePool is doubtful
[ https://issues.apache.org/jira/browse/HBASE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484812#comment-13484812 ] Hadoop QA commented on HBASE-6651: --
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12550936/HBASE-6651-V6.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 15 new or modified tests.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 95 warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3155//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3155//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3155//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3155//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3155//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3155//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3155//console
This message is automatically generated.
Thread safety of HTablePool is doubtful --- Key: HBASE-6651 URL: https://issues.apache.org/jira/browse/HBASE-6651 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.94.1 Reporter: Hiroshi Ikeda Priority: Minor Attachments: HBASE-6651.patch, HBASE-6651-V2.patch, HBASE-6651-V3.patch, HBASE-6651-V4.patch, HBASE-6651-V5.patch, HBASE-6651-V6.patch, sample.zip, sample.zip, sharedmap_for_hbaseclient.zip Some operations in HTablePool access PoolMap multiple times without any explicit synchronization. For example, HTablePool.closeTablePool() calls PoolMap.values() and then calls PoolMap.remove(). If other threads add new instances to the pool in the middle of these calls, the newly added instances might be dropped. (HTablePool.closeTablePool() also has another problem: calling it from multiple threads causes an HTable to be accessed by multiple threads.) Moreover, PoolMap is not thread safe for the same reason. For example, PoolMap.put() calls ConcurrentMap.get() and then calls ConcurrentMap.put(). If other threads add a new instance to the concurrent map in the middle of these calls, the new instance might be dropped. The implementations of Pool have the same problems. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6499) StoreScanner's QueryMatcher not reset on store update
[ https://issues.apache.org/jira/browse/HBASE-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484829#comment-13484829 ] Anoop Sam John commented on HBASE-6499: --- @Max Is this issue the same as HBASE-6900, which is already fixed in 0.94 and trunk? Please take a look. StoreScanner's QueryMatcher not reset on store update - Key: HBASE-6499 URL: https://issues.apache.org/jira/browse/HBASE-6499 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Reporter: Max Lapan Assignee: Max Lapan Attachments: StoreScanner_not_reset_matcher.patch When the underlying store changes (due to compaction, bulk load, etc.), we destroy the current KeyValueHeap and recreate it via a checkReseek call. Besides heap recreation, this resets the underlying QueryMatcher instance. The problem is that checkReseek is not called by seek() and reseek(), only by next(). If someone calls seek() just after the store changed, they get wrong scanner results, and a call to reseek may end up with an NPE. AFAIK, the current codebase doesn't call seek and reseek, but it is quite possible in the future. Personally, I spent lots of time finding the source of wrong scanner results in HBASE-5416. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6499) StoreScanner's QueryMatcher not reset on store update
[ https://issues.apache.org/jira/browse/HBASE-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484835#comment-13484835 ] Max Lapan commented on HBASE-6499: -- Yes, this bug is related to HBASE-6900, but this patch also fixes the seek() case. StoreScanner's QueryMatcher not reset on store update - Key: HBASE-6499 URL: https://issues.apache.org/jira/browse/HBASE-6499 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Reporter: Max Lapan Assignee: Max Lapan Attachments: StoreScanner_not_reset_matcher.patch When the underlying store changes (due to compaction, bulk load, etc.), we destroy the current KeyValueHeap and recreate it via a checkReseek call. Besides heap recreation, this resets the underlying QueryMatcher instance. The problem is that checkReseek is not called by seek() and reseek(), only by next(). If someone calls seek() just after the store changed, they get wrong scanner results, and a call to reseek may end up with an NPE. AFAIK, the current codebase doesn't call seek and reseek, but it is quite possible in the future. Personally, I spent lots of time finding the source of wrong scanner results in HBASE-5416. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484842#comment-13484842 ] Max Lapan commented on HBASE-5416: -- Yes, I think CP will work, thanks. The sad thing is that we use 0.90.6 (CDH) version of HBase, which don't have CPs. In fact, we use this patch on our production system without major issues and quite happy with it. But I don't think it's a good idea to include it in trunk, when much better approach exists. Improve performance of scans with some kind of filters. --- Key: HBASE-5416 URL: https://issues.apache.org/jira/browse/HBASE-5416 Project: HBase Issue Type: Improvement Components: Filters, Performance, regionserver Affects Versions: 0.90.4 Reporter: Max Lapan Assignee: Max Lapan Fix For: 0.96.0 Attachments: 5416-Filtered_scans_v6.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch When the scan is performed, whole row is loaded into result list, after that filter (if exists) is applied to detect that row is needed. But when scan is performed on several CFs and filter checks only data from the subset of these CFs, data from CFs, not checked by a filter is not needed on a filter stage. Only when we decided to include current row. And in such case we can significantly reduce amount of IO performed by a scan, by loading only values, actually checked by a filter. For example, we have two CFs: flags and snap. Flags is quite small (bunch of megabytes) and is used to filter large entries from snap. Snap is very large (10s of GB) and it is quite costly to scan it. If we needed only rows with some flag specified, we use SingleColumnValueFilter to limit result to only small subset of region. But current implementation is loading both CFs to perform scan, when only small subset is needed. 
The attached patch adds one routine to the Filter interface that lets a filter specify which CFs it needs for its operation. In HRegion, we separate all scanners into two groups: those needed by the filter and the rest (joined). When a new row is considered, only the needed data is loaded and the filter applied; only if the filter accepts the row is the rest of the data loaded. On our data, this speeds up such scans 30-50 times. It also gives us a way to better normalize the data into separate columns by optimizing the scans performed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
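The two-phase scan described above can be modeled outside HBase. A minimal standalone sketch in plain Java (hypothetical row/CF maps standing in for real scanners; this is not the patch's code): rows carry a small "flags" CF and a large "snap" CF, the filter only reads "flags", and "snap" is fetched only for rows the filter has already accepted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Standalone model of the two-phase ("joined") scan: the expensive CF is
// loaded only after the filter accepts a row based on the cheap CF.
public class JoinedScanSketch {
    public static int heavyLoads; // counts reads of the large CF

    // Stand-in for the costly read of the big column family.
    static Map<String, String> loadSnap(String rowKey) {
        heavyLoads++;
        return Map.of("snap:data", "payload-" + rowKey);
    }

    public static List<Map<String, String>> scan(Map<String, String> flagsByRow,
                                                 Predicate<String> flagFilter) {
        heavyLoads = 0; // reset per scan, for illustration only
        List<Map<String, String>> results = new ArrayList<>();
        for (Map.Entry<String, String> e : flagsByRow.entrySet()) {
            if (!flagFilter.test(e.getValue())) continue; // phase 1: filter CF only
            Map<String, String> row = new HashMap<>(loadSnap(e.getKey())); // phase 2
            row.put("flags:flag", e.getValue());
            results.add(row);
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, String> flags = new LinkedHashMap<>();
        flags.put("row1", "keep");
        flags.put("row2", "skip");
        flags.put("row3", "keep");
        List<Map<String, String>> out = scan(flags, "keep"::equals);
        System.out.println(out.size() + " rows, " + heavyLoads + " heavy loads");
    }
}
```

With the pre-patch behavior both CFs would be read for all three rows; here the expensive load happens only for the two accepted rows, which is where the reported 30-50x comes from when the filtered CF is tiny relative to the rest.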
[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock
[ https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484856#comment-13484856 ] ramkrishna.s.vasudevan commented on HBASE-5898: --- I can attach some parts of the thread dump. bq. Was it a bunch of threads getting same block? Yes. bq. The double-checked is probably better anyways but could the issue come back just less frequently after this patch goes in? I am not sure. We tried restarting the client twice and the problem still persisted; only after we restarted the RS did it go away. This pattern repeats many times. We took 3 thread dumps in a span of 2 mins:
{code}
IPC Server handler 42 on 60020 daemon prio=10 tid=0x7f2f38f1a000 nid=0x6c4d in Object.wait() [0x7f2f33e4f000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:485)
    at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:77)
    - locked 0x0006cc2a7178 (a org.apache.hadoop.hbase.util.IdLock$Entry)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:290)
    at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:213)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:455)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:493)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:242)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:167)
    at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:299)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:244)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:523)
    - locked 0x00069a665420 (a org.apache.hadoop.hbase.regionserver.StoreScanner)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:399)
    - locked 0x00069a665420 (a org.apache.hadoop.hbase.regionserver.StoreScanner)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3424)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3379)
    - locked 0x00069a7da458 (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3396)
    - locked 0x00069a7da458 (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2411)
    at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
{code}
We could also see that sometimes releaseLockEntry was in progress, but across the 3 thread dumps this appeared only once.
{code}
IPC Server handler 18 on 60020 daemon prio=10 tid=0x7f2f38ee9800 nid=0x6c35 runnable [0x7f2f35667000]
   java.lang.Thread.State: RUNNABLE
    at java.lang.Object.notify(Native Method)
    at org.apache.hadoop.hbase.util.IdLock.releaseLockEntry(IdLock.java:108)
    - locked 0x0006cc2a7178 (a org.apache.hadoop.hbase.util.IdLock$Entry)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:352)
    at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:213)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:455)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:493)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:242)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:167)
    at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:299)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:244)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:523)
    - locked 0x00069a89d678 (a org.apache.hadoop.hbase.regionserver.StoreScanner)
    at
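The contention in the dumps above comes from every reader taking the per-block IdLock even when the block is already cached. A minimal standalone sketch of the double-checked pattern this JIRA proposes (plain Java, with a lock-stripe array standing in for IdLock; this is not the actual HFileReaderV2 code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Double-checked block read: probe the cache lock-free first; take the
// per-block lock only on a miss, and re-check inside the lock so that
// concurrent readers of a hot cached block never serialize on the lock.
public class DoubleCheckedBlockCache {
    private final Map<Long, byte[]> cache = new ConcurrentHashMap<>();
    private final Object[] lockStripes = new Object[64]; // stand-in for IdLock
    { for (int i = 0; i < lockStripes.length; i++) lockStripes[i] = new Object(); }

    public int diskReads; // how many times we fell through to the slow path

    public byte[] readBlock(long offset) {
        byte[] block = cache.get(offset);       // first check: no lock
        if (block != null) return block;        // cache hit: fast path
        Object lock = lockStripes[(int) (offset % lockStripes.length)];
        synchronized (lock) {                   // miss: take the per-block lock
            block = cache.get(offset);          // second check: another thread
            if (block != null) return block;    // may have loaded it meanwhile
            block = readFromDisk(offset);
            cache.put(offset, block);
            return block;
        }
    }

    private byte[] readFromDisk(long offset) {
        diskReads++;                            // stands in for the HDFS read
        return new byte[] { (byte) offset };
    }
}
```

The second check inside the lock is what keeps the pattern correct: two threads that miss simultaneously still perform only one load, while all later readers of that block never block at all.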
[jira] [Commented] (HBASE-7008) Set scanner caching to a better default
[ https://issues.apache.org/jira/browse/HBASE-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484870#comment-13484870 ] Hudson commented on HBASE-7008: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #239 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/239/]) Add link to Lars's graphs in hbase-7008 where he plays w/ nagles and different data sizes (Revision 1402399) Result = FAILURE Set scanner caching to a better default --- Key: HBASE-7008 URL: https://issues.apache.org/jira/browse/HBASE-7008 Project: HBase Issue Type: Bug Components: Client Reporter: liang xie Assignee: liang xie Fix For: 0.94.3, 0.96.0 Attachments: 7008-0.94.txt, 7008-0.94-v2.txt, 7008-v3.txt, 7008-v4.txt, HBASE-7008.patch, HBASE-7008-v2.patch per http://search-hadoop.com/m/qaRu9iM2f02/Set+scanner+caching+to+a+better+default%253Fsubj=Set+scanner+caching+to+a+better+default+ let's set to 100 by default -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
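The caching value discussed in this thread is an ordinary client configuration knob; for clusters still running with an older default, a sketch of overriding it cluster-wide in hbase-site.xml (property name as used by the 0.94-era client):

```xml
<property>
  <name>hbase.client.scanner.caching</name>
  <value>100</value>
</property>
```

Individual scans can still override this with Scan.setCaching(int); larger values reduce round trips per scan at the cost of client memory and RPC timeout risk on slow filters.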
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484919#comment-13484919 ] Ted Yu commented on HBASE-6852: --- There were 10 test failures in build 556 which might be related to this JIRA. SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields Key: HBASE-6852 URL: https://issues.apache.org/jira/browse/HBASE-6852 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.0 Reporter: Cheng Hao Assignee: Cheng Hao Priority: Minor Labels: performance Fix For: 0.94.3 Attachments: 6852-0.94.txt, metrics_hotspots.png, onhitcache-trunk.patch SchemaMetrics.updateOnCacheHit costs too much while I am doing a full table scan. Here are the top 5 hotspots within the regionserver while full scanning a table:
CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask), count 500
samples  %        image name  symbol name
98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
98447    100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
45814    6.2510   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
45814    100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
43523    5.9384   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
43523    100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
42548    5.8054   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
42548    100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
40572    5.5358   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
40572    100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
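The top hotspot above is a shared counter bumped on every cache hit from every handler thread. A common remedy for this class of problem (sketched here; not necessarily what the committed patch does) is to replace a single contended atomic counter with a striped one such as java.util.concurrent.atomic.LongAdder, which keeps per-thread cells and only pays an aggregation cost when the metric is read:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Two ways to count cache hits from many handler threads. AtomicLong makes
// every hit CAS on the same cache line; LongAdder spreads updates across
// per-thread cells and sums them only when the metric is actually read.
public class CacheHitCounters {
    public static final AtomicLong contended = new AtomicLong();
    public static final LongAdder striped = new LongAdder();

    public static void onCacheHit() {
        contended.incrementAndGet(); // hot path: one shared CAS per hit
        striped.increment();         // hot path: usually an uncontended cell
    }

    public static long stripedTotal() {
        return striped.sum();        // cold path: aggregate the cells
    }
}
```

Both counters always agree on the total; the difference only shows up as cache-line contention when many threads increment concurrently, which is exactly the full-scan cache-hit path profiled above.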
[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484961#comment-13484961 ] Elliott Clark commented on HBASE-6410: -- [~stack] The review board was posted a while ago (https://reviews.apache.org/r/7616/), but we can go with the one you posted since it has comments. Move RegionServer Metrics to metrics2 - Key: HBASE-6410 URL: https://issues.apache.org/jira/browse/HBASE-6410 Project: HBase Issue Type: Sub-task Components: metrics Affects Versions: 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Blocker Attachments: HBASE-6410-1.patch, HBASE-6410-2.patch, HBASE-6410-3.patch, HBASE-6410-4.patch, HBASE-6410-5.patch, HBASE-6410.patch Move RegionServer Metrics to metrics2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484976#comment-13484976 ] Lars Hofhansl commented on HBASE-6852: -- will check it out SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields Key: HBASE-6852 URL: https://issues.apache.org/jira/browse/HBASE-6852 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.0 Reporter: Cheng Hao Assignee: Cheng Hao Priority: Minor Labels: performance Fix For: 0.94.3 Attachments: 6852-0.94.txt, metrics_hotspots.png, onhitcache-trunk.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6651) Improve thread safety of HTablePool
[ https://issues.apache.org/jira/browse/HBASE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6651: -- Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Summary: Improve thread safety of HTablePool (was: Thread safety of HTablePool is doubtful) Improve thread safety of HTablePool --- Key: HBASE-6651 URL: https://issues.apache.org/jira/browse/HBASE-6651 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.94.1 Reporter: Hiroshi Ikeda Priority: Minor Fix For: 0.96.0 Attachments: HBASE-6651.patch, HBASE-6651-V2.patch, HBASE-6651-V3.patch, HBASE-6651-V4.patch, HBASE-6651-V5.patch, HBASE-6651-V6.patch, sample.zip, sample.zip, sharedmap_for_hbaseclient.zip There are some operations in HTablePool that access the PoolMap multiple times without any explicit synchronization. For example, HTablePool.closeTablePool() calls PoolMap.values() and then calls PoolMap.remove(). If other threads add new instances to the pool in the middle of these calls, the newly added instances might be dropped. (HTablePool.closeTablePool() also has another problem: calling it from multiple threads causes HTable to be accessed by multiple threads.) Moreover, PoolMap is not thread safe for the same reason. For example, PoolMap.put() calls ConcurrentMap.get() and then calls ConcurrentMap.put(). If another thread adds a new instance to the concurrent map in the middle of these calls, the new instance might be dropped. The implementations of Pool have the same problems. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
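The get-then-put race described above can be sketched with plain java.util.concurrent types (simplified; this is not the actual HTablePool/PoolMap code, and the class and method names are invented for illustration):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Model of the PoolMap race: check-then-act across two separate calls to a
// concurrent map is not atomic, even though each individual call is.
public class PoolMapSketch {
    private final ConcurrentMap<String, Queue<String>> pools = new ConcurrentHashMap<>();

    // Race-prone: two threads can both see null, and the second put()
    // silently replaces the pool (and any contents) the first one created.
    public Queue<String> getPoolRacy(String table) {
        Queue<String> pool = pools.get(table);
        if (pool == null) {
            pool = new ConcurrentLinkedQueue<>();
            pools.put(table, pool); // may overwrite another thread's pool
        }
        return pool;
    }

    // Safe: a single atomic map operation, so every caller that asks for the
    // same table is guaranteed to get the same pool instance.
    public Queue<String> getPoolAtomic(String table) {
        return pools.computeIfAbsent(table, t -> new ConcurrentLinkedQueue<>());
    }
}
```

computeIfAbsent (or putIfAbsent on older JDKs) is the standard fix for this pattern: the decision and the insertion happen under the map's own synchronization, so no interleaving can drop an instance.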
[jira] [Commented] (HBASE-7053) port blockcache configurability (part of HBASE-6312, and HBASE-7033) to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485048#comment-13485048 ] Sergey Shelukhin commented on HBASE-7053: - Thanks! port blockcache configurability (part of HBASE-6312, and HBASE-7033) to 0.94 - Key: HBASE-7053 URL: https://issues.apache.org/jira/browse/HBASE-7053 Project: HBase Issue Type: Task Affects Versions: 0.94.2 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.94.3 Attachments: HBASE-7053.patch, HBASE-7053-v2-squashed.patch Add an option to get the improvement from 6312 w/o changing the defaults in stable release. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485049#comment-13485049 ] Elliott Clark commented on HBASE-6410: -- Just got done discussing things with stack. I'll have a decent sized change to the patch in a little bit. Move RegionServer Metrics to metrics2 - Key: HBASE-6410 URL: https://issues.apache.org/jira/browse/HBASE-6410 Project: HBase Issue Type: Sub-task Components: metrics Affects Versions: 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Blocker Attachments: HBASE-6410-1.patch, HBASE-6410-2.patch, HBASE-6410-3.patch, HBASE-6410-4.patch, HBASE-6410-5.patch, HBASE-6410.patch Move RegionServer Metrics to metrics2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7039) Port HBASE-5914 Bulk assign regions in the process of ServerShutdownHandler (and bugfix part of HBASE-6012) to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485051#comment-13485051 ] Sergey Shelukhin commented on HBASE-7039: - can someone please review? thanks Port HBASE-5914 Bulk assign regions in the process of ServerShutdownHandler (and bugfix part of HBASE-6012) to 0.94 --- Key: HBASE-7039 URL: https://issues.apache.org/jira/browse/HBASE-7039 Project: HBase Issue Type: Task Affects Versions: 0.94.2 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-7039-squashed.patch This is a major feature, so please -1 if you think it's too dangerous to port. However, it's also a perf improvement for recovery. The 2nd thing that HBASE-6012 addresses cannot be included without a breaking interface change (HRegionInterface openRegions doesn't return region states which are relied upon by the trunk code that is using protocol buffers API); or a non-breaking interface change with version-checking hackery to take advantage of it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6223) Document hbck improvements: HBASE-6173, HBASE-5360
[ https://issues.apache.org/jira/browse/HBASE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6223: --- Attachment: trunk-6223_v3.patch Thanks for review. I updated the patch a little bit. Now the sentence is short and easy to parse. :) Document hbck improvements: HBASE-6173, HBASE-5360 --- Key: HBASE-6223 URL: https://issues.apache.org/jira/browse/HBASE-6223 Project: HBase Issue Type: Task Components: documentation, hbck Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: trunk-6223.patch, trunk-6223_v2.patch, trunk-6223_v3.patch We had a couple hbck improvements recently: HBASE-6173 and HBASE-5360. We should document them. Especially, for HBASE-5360, it's something one normally doesn't do. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7051) Read/Updates (increment,checkAndPut) should properly read MVCC
[ https://issues.apache.org/jira/browse/HBASE-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485055#comment-13485055 ] Lars Hofhansl commented on HBASE-7051: -- I see. In HRegion.doMiniBatchPut we release the row lock(s) before we roll the read point (completeMemstoreInsert). So for checkAndPut it is not enough to hold the row lock, since an earlier Put could have released the lock but not yet rolled the read point... Is that what you guys are saying? I agree that is a problem, and a side effect of releasing the row lock early to avoid holding it while the WAL is sync'ed. Read/Updates (increment, checkAndPut) should properly read MVCC -- Key: HBASE-7051 URL: https://issues.apache.org/jira/browse/HBASE-7051 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Gregory Chanan See, for example: {code} // TODO: Use MVCC to make this set of increments atomic to reads {code} Here is an example of what can happen (it would probably be good to write up a test case for each read/update): a concurrent update via increment and put. The put grabs the row lock first and updates the memstore, but releases the row lock before the MVCC is advanced. Then the increment grabs the row lock and reads right away, reading the old value and incrementing based on that. There are a few options here: 1) Wait for the MVCC to advance for read/updates: the downside is that you have to wait for updates on other rows. 2) Have an MVCC per row (table configuration): this avoids the unnecessary contention of 1). 3) Transform the read/updates to write-only with rollup on read, e.g. an increment would just record the number of values to increment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
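The put/increment interleaving described above can be modeled in a few lines. This is a toy sketch of option 1 (wait for the MVCC read point before a read/update), not HBase code; the field and method names are invented for illustration:

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy MVCC model: a put bumps the memstore and releases the row lock before
// the read point advances. A read/update that consults the memstore in that
// window sees a value no reader is supposed to see yet; option 1 is to wait
// until every completed write has become visible before reading.
public class MvccSketch {
    volatile long memstoreValue = 0;                 // latest written value
    final AtomicLong latestWrite = new AtomicLong(); // newest write number
    final AtomicLong readPoint = new AtomicLong();   // newest visible write

    public void put(long v) {
        long writeNumber = latestWrite.incrementAndGet();
        memstoreValue = v;            // memstore changes first...
        // ...the row lock is released around here in doMiniBatchPut...
        readPoint.set(writeNumber);   // ...and only then does the read point roll
    }

    public long readVisible() {
        // Option 1: block until the read point catches up with all writes,
        // so an increment never bases itself on a half-published put.
        while (readPoint.get() < latestWrite.get()) Thread.yield();
        return memstoreValue;
    }
}
```

The race window is the gap between the memstore write and the readPoint update; holding the row lock does not close it, which is exactly the checkAndPut problem Lars describes.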
[jira] [Reopened] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl reopened HBASE-6852: -- Yes, these test failures are definitely related. I am going to revert the patch, until we can fix all the tests. SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields Key: HBASE-6852 URL: https://issues.apache.org/jira/browse/HBASE-6852 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.0 Reporter: Cheng Hao Assignee: Cheng Hao Priority: Minor Labels: performance Fix For: 0.94.3 Attachments: 6852-0.94.txt, metrics_hotspots.png, onhitcache-trunk.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485062#comment-13485062 ] Lars Hofhansl commented on HBASE-6852: -- These are the failing tests (in case we do not get to this before jenkins removes the old run):
org.apache.hadoop.hbase.io.hfile.TestScannerSelectionUsingTTL.testScannerSelection[2]
org.apache.hadoop.hbase.io.hfile.TestScannerSelectionUsingTTL.testScannerSelection[3]
org.apache.hadoop.hbase.io.hfile.TestScannerSelectionUsingTTL.testScannerSelection[4]
org.apache.hadoop.hbase.io.hfile.TestScannerSelectionUsingTTL.testScannerSelection[5]
org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScanned
org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose
org.apache.hadoop.hbase.regionserver.TestStoreFile.testBloomFilter
org.apache.hadoop.hbase.regionserver.TestStoreFile.testDeleteFamilyBloomFilter
org.apache.hadoop.hbase.regionserver.TestStoreFile.testBloomTypes
org.apache.hadoop.hbase.regionserver.TestStoreFile.testBloomEdgeCases
SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields Key: HBASE-6852 URL: https://issues.apache.org/jira/browse/HBASE-6852 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.0 Reporter: Cheng Hao Assignee: Cheng Hao Priority: Minor Labels: performance Fix For: 0.94.3 Attachments: 6852-0.94.txt, metrics_hotspots.png, onhitcache-trunk.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485064#comment-13485064 ] Lars Hofhansl commented on HBASE-6852: -- Reverted for now. I think I know what is happening (the metrics are just not flushed right away), but I have no time to look into this. SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields Key: HBASE-6852 URL: https://issues.apache.org/jira/browse/HBASE-6852 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.0 Reporter: Cheng Hao Assignee: Cheng Hao Priority: Minor Labels: performance Fix For: 0.94.3 Attachments: 6852-0.94.txt, metrics_hotspots.png, onhitcache-trunk.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-7057) Store Server Load in a Table
Elliott Clark created HBASE-7057: Summary: Store Server Load in a Table Key: HBASE-7057 URL: https://issues.apache.org/jira/browse/HBASE-7057 Project: HBase Issue Type: Improvement Components: metrics, UI Affects Versions: 0.96.0 Reporter: Elliott Clark Fix For: 0.98.0 Currently the server heartbeat sends over server load and region loads. This is used to display and store metrics about a region server. It is also used to remember the sequence id of flushes. This should be moved into an HBase table.
* Allow the last sequence id to persist over a master restart.
* That would allow the balancer to have a more complete picture of what's happened in the past.
* Allow tools to be created to monitor hbase using hbase.
* Simplify/remove the heartbeat.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7057) Store Server Load in a Table
[ https://issues.apache.org/jira/browse/HBASE-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485075#comment-13485075 ] stack commented on HBASE-7057: -- Chatting w/ Elliott more on this:
+ sequence id should probably be in .META.
+ having some metrics history would be sweet. We could let it TTL out after an hour so we didn't keep too much around (or after an hour do the TSDB rollups if wanted)
+ A TSDB-like query page for looking at the cluster over time would be sweet.
+ We should be careful to ensure we do not repeat the mistakes of the distant past when we used to keep a region history table into which we noted region transitions; the problem there was our trying to record events like close when, say, the cluster was going down, and the table we were trying to write to would be down -- the writer could hang.
This seems like a small thing to do that could improve usability a lot.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7057) Store Server Load in a Table
[ https://issues.apache.org/jira/browse/HBASE-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-7057: - Priority: Critical (was: Major) Tags: noob Labels: noob (was: ) Upping priority since it seems like a small change w/ a big payoff. Making it noob for now in case someone wants to pick it up in the meantime.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6223) Document hbck improvements: HBASE-6173, HBASE-5360
[ https://issues.apache.org/jira/browse/HBASE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485089#comment-13485089 ] Hadoop QA commented on HBASE-6223: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12550986/trunk-6223_v3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 85 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3156//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3156//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3156//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3156//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3156//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3156//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3156//console This message is automatically generated. Document hbck improvements: HBASE-6173, HBASE-5360 --- Key: HBASE-6223 URL: https://issues.apache.org/jira/browse/HBASE-6223 Project: HBase Issue Type: Task Components: documentation, hbck Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: trunk-6223.patch, trunk-6223_v2.patch, trunk-6223_v3.patch We had a couple hbck improvements recently: HBASE-6173 and HBASE-5360. We should document them. Especially, for HBASE-5360, it's something one normally doesn't do. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485092#comment-13485092 ] Jimmy Xiang commented on HBASE-6060: I think we can let it go for 0.94 since timeout monitor can handle it and there is no better way to fix it, because the region state in 0.94 is not so reliable. For 0.96, this one is not covered yet. It still relies on timeout monitor. Let me cook up a patch for 0.96 now. Regions's in OPENING state from failed regionservers takes a long time to recover - Key: HBASE-6060 URL: https://issues.apache.org/jira/browse/HBASE-6060 Project: HBase Issue Type: Bug Components: master, regionserver Reporter: Enis Soztutar Assignee: rajeshbabu Fix For: 0.92.3, 0.94.3, 0.96.0 Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch we have seen a pattern in tests, that the regions are stuck in OPENING state for a very long time when the region server who is opening the region fails. My understanding of the process: - master calls rs to open the region. If rs is offline, a new plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), HMaster.assign() - RegionServer, starts opening a region, changes the state in znode. But that znode is not ephemeral. (see ZkAssign) - Rs transitions zk node from OFFLINE to OPENING. 
See OpenRegionHandler.process() - rs then opens the region, and changes the znode from OPENING to OPENED - when rs is killed between the OPENING and OPENED states, zk shows the OPENING state, and the master just waits for rs to change the region state; but since rs is down, that won't happen. - There is an AssignmentManager.TimeoutMonitor, which guards against exactly these kinds of conditions. It periodically checks (every 10 sec by default) the regions in transition to see whether they have timed out (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 min, which explains what you and I are seeing. - ServerShutdownHandler in Master does not reassign regions in the OPENING state, although it handles other states. Lowering that threshold in the configuration is one option, but I still think we can do better. Will investigate more. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
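The TimeoutMonitor behavior described above can be sketched as a periodic check over the regions-in-transition map; this is a toy model with made-up names, not the real AssignmentManager code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified model of the timeout-monitor check: each region in transition
// records when it entered its current state, and a periodic chore flags any
// region whose transition has outlived the configured timeout so the master
// can force a reassignment instead of waiting on a dead regionserver.
public class TimeoutMonitorSketch {
    // mirrors hbase.master.assignment.timeoutmonitor.timeout (30 min default)
    static final long TIMEOUT_MS = 30L * 60 * 1000;

    // region name -> wall-clock time the OPENING/PENDING_OPEN state began
    final Map<String, Long> regionsInTransition = new ConcurrentHashMap<>();

    /** True if the region has been in transition longer than the timeout. */
    boolean timedOut(String region, long now) {
        Long since = regionsInTransition.get(region);
        return since != null && now - since > TIMEOUT_MS;
    }

    public static void main(String[] args) {
        TimeoutMonitorSketch monitor = new TimeoutMonitorSketch();
        long now = System.currentTimeMillis();
        monitor.regionsInTransition.put("fresh", now - 1_000);
        monitor.regionsInTransition.put("stuck", now - 31L * 60 * 1000);
        System.out.println(monitor.timedOut("fresh", now)); // false
        System.out.println(monitor.timedOut("stuck", now)); // true
    }
}
```

With the default 30-minute timeout a region orphaned between OPENING and OPENED sits stuck for the whole interval, which is exactly the long recovery the issue title describes.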
[jira] [Commented] (HBASE-6223) Document hbck improvements: HBASE-6173, HBASE-5360
[ https://issues.apache.org/jira/browse/HBASE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485093#comment-13485093 ] stack commented on HBASE-6223: -- +1 Reads much better.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7051) Read/Updates (increment,checkAndPut) should properly read MVCC
[ https://issues.apache.org/jira/browse/HBASE-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485104#comment-13485104 ] Gregory Chanan commented on HBASE-7051: --- Yes, that's what I'm saying Lars. What do you think about the options I give in the description? You mentioned holding the rowlock longer, which would also work, but (presumably) slow down all operations, not just checkAndPuts/increments/appends. Read/Updates (increment,checkAndPut) should properly read MVCC -- Key: HBASE-7051 URL: https://issues.apache.org/jira/browse/HBASE-7051 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Gregory Chanan See, for example: {code} // TODO: Use MVCC to make this set of increments atomic to reads {code} Here's an example of what can happen (would probably be good to write up a test case for each read/update): Concurrent update via increment and put. The put grabs the row lock first and updates the memstore, but releases the row lock before the MVCC is advanced. Then, the increment grabs the row lock and reads right away, reading the old value and incrementing based on that. There are a few options here:
1) Waiting for the MVCC to advance for read/updates: the downside is that you have to wait for updates on other rows.
2) Have an MVCC per-row (table configuration): this avoids the unnecessary contention of 1)
3) Transform the read/updates to write-only with rollup on read. E.g. an increment would just have the number of values to increment.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7051) Read/Updates (increment,checkAndPut) should properly read MVCC
[ https://issues.apache.org/jira/browse/HBASE-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485126#comment-13485126 ] Lars Hofhansl commented on HBASE-7051: -- Holding the rowlock longer would require us to hold the lock while we sync the WAL (because MVCC visibility must come after we sync the WAL). The nuclear option would be to wait for all MVCC transactions to finish: {code} MultiVersionConsistencyControl.WriteEntry w = mvcc.beginMemstoreInsert(); mvcc.advanceMemstore(w); mvcc.waitForRead(w); {code} This will wait for all prior transactions to finish. Even then there might still be a race.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
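The "nuclear option" above can be modeled with a miniature MVCC: a writer claims a write point, completing the write advances the read point, and waitForRead blocks until every earlier write is visible. This is a deliberately simplified sketch (write points are assumed to complete in order), not HBase's actual MultiVersionConsistencyControl:

```java
// Toy MVCC illustrating why an increment that calls waitForRead before
// reading can no longer observe a half-published put: the put's write point
// must be folded into the read point before the wait returns.
public class MvccSketch {
    private long nextWritePoint = 0;
    private long readPoint = 0; // writes <= readPoint are visible to readers

    /** A writer (put/increment) claims the next write point. */
    public synchronized long beginWrite() {
        return ++nextWritePoint;
    }

    /** Called after the memstore update and WAL sync; publishes the write. */
    public synchronized void completeWrite(long writePoint) {
        // simplification: assumes write points complete in order
        if (writePoint == readPoint + 1) {
            readPoint = writePoint;
            notifyAll();
        }
    }

    /** Blocks until every write up to writePoint is visible. */
    public synchronized void waitForRead(long writePoint) throws InterruptedException {
        while (readPoint < writePoint) {
            wait();
        }
    }

    public synchronized long readPoint() {
        return readPoint;
    }

    public static void main(String[] args) throws InterruptedException {
        MvccSketch mvcc = new MvccSketch();
        long put = mvcc.beginWrite();       // the concurrent put
        long increment = mvcc.beginWrite(); // the increment's own write
        mvcc.completeWrite(put);            // put becomes visible
        mvcc.completeWrite(increment);
        mvcc.waitForRead(increment);        // increment reads only after this
        System.out.println(mvcc.readPoint()); // 2
    }
}
```

The downside Gregory raises is visible in the model: waitForRead stalls on *every* earlier write point, including writes to unrelated rows.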
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485127#comment-13485127 ] Hudson commented on HBASE-6852: --- Integrated in HBase-0.94 #557 (See [https://builds.apache.org/job/HBase-0.94/557/]) HBASE-6852 REVERT due to test failures. (Revision 1402588) Result = FAILURE larsh : Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7051) Read/Updates (increment,checkAndPut) should properly read MVCC
[ https://issues.apache.org/jira/browse/HBASE-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485128#comment-13485128 ] Gregory Chanan commented on HBASE-7051: --- Lars: that's option 1. The downside is that you have to wait for writes to complete on other rows, which is unnecessary. I don't think there is a race in there, but I could be wrong. Option #2 is to use an MVCC per row. Is that just infeasible? We could make it a table option, like "faster read/updates". Option #3 is to make the read/updates write-only. That's probably more difficult.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7008) Set scanner caching to a better default
[ https://issues.apache.org/jira/browse/HBASE-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485146#comment-13485146 ] stack commented on HBASE-7008: -- Let's do #1 then. What do you think, [~lhofhansl]? Set scanner caching to a better default --- Key: HBASE-7008 URL: https://issues.apache.org/jira/browse/HBASE-7008 Project: HBase Issue Type: Bug Components: Client Reporter: liang xie Assignee: liang xie Fix For: 0.94.3, 0.96.0 Attachments: 7008-0.94.txt, 7008-0.94-v2.txt, 7008-v3.txt, 7008-v4.txt, HBASE-7008.patch, HBASE-7008-v2.patch per http://search-hadoop.com/m/qaRu9iM2f02/Set+scanner+caching+to+a+better+default%253Fsubj=Set+scanner+caching+to+a+better+default+ let's set to 100 by default -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock
[ https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485155#comment-13485155 ] stack commented on HBASE-5898: -- I've seen that thread dump before! [~eclark] has a program to try and repro the above and hopefully we can add some instrumentation and get clues on why the above happens. Consider double-checked locking for block cache lock Key: HBASE-5898 URL: https://issues.apache.org/jira/browse/HBASE-5898 Project: HBase Issue Type: Improvement Components: Performance Affects Versions: 0.94.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.94.3, 0.96.0 Attachments: 5898-TestBlocksRead.txt, HBASE-5898-0.patch, HBASE-5898-1.patch, hbase-5898.txt Running a workload with a high query rate against a dataset that fits in cache, I saw a lot of CPU being used in IdLock.getLockEntry, being called by HFileReaderV2.readBlock. Even though it was all cache hits, it was wasting a lot of CPU doing lock management here. I wrote a quick patch to switch to a double-checked locking and it improved throughput substantially for this workload. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
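For readers following along, the double-checked pattern Todd applied to the read path looks roughly like the sketch below. The class and method names are made up for illustration, and a ConcurrentHashMap stands in for both the block cache and the per-key IdLock; the point is only the shape of the fast path:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Double-checked read path for a cache: probe the cache with no per-block
// lock at all (the common, all-cache-hits case); only on a miss take the
// per-key lock, then re-check the cache under the lock so concurrent
// readers never load the same block twice.
public class DoubleCheckedCache {
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    private final Map<String, Object> keyLocks = new ConcurrentHashMap<>();
    private final AtomicInteger loads = new AtomicInteger();

    public byte[] readBlock(String key) {
        byte[] block = cache.get(key);   // first check: no lock management
        if (block != null) {
            return block;                // cache hit pays nothing for locking
        }
        Object lock = keyLocks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {            // per-key lock, playing the IdLock role
            block = cache.get(key);      // second check, now under the lock
            if (block == null) {
                block = loadFromDisk(key);
                cache.put(key, block);
            }
            return block;
        }
    }

    public int loads() {
        return loads.get();
    }

    private byte[] loadFromDisk(String key) {
        loads.incrementAndGet();         // stands in for the expensive read
        return key.getBytes();
    }

    public static void main(String[] args) {
        DoubleCheckedCache reader = new DoubleCheckedCache();
        reader.readBlock("block-1");
        reader.readBlock("block-1");     // second read takes the lock-free path
        System.out.println(reader.loads()); // 1
    }
}
```

The lock-management CPU Todd measured in IdLock.getLockEntry disappears from the hit path because the first check never acquires anything.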
[jira] [Commented] (HBASE-7008) Set scanner caching to a better default
[ https://issues.apache.org/jira/browse/HBASE-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485157#comment-13485157 ] Lars Hofhansl commented on HBASE-7008: -- +1 on #1
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
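To see why the default matters: with scanner caching set to 1, each client next() is a round trip to the regionserver, so a scan of N rows costs roughly ceil(N / caching) RPCs. A toy calculation (the row counts are illustrative, not from the JIRA):

```java
// Back-of-envelope RPC math for the scanner-caching default change.
public class ScanRpcMath {
    /** Round trips needed to pull `rows` rows at the given caching value. */
    static long rpcs(long rows, int caching) {
        return (rows + caching - 1) / caching; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println(rpcs(1_000_000, 1));   // 1000000
        System.out.println(rpcs(1_000_000, 100)); // 10000
    }
}
```

Raising the default from 1 to 100 cuts the round trips for a large scan by two orders of magnitude, at the cost of buffering up to 100 rows per fetch on the client.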
[jira] [Commented] (HBASE-7039) Port HBASE-5914 Bulk assign regions in the process of ServerShutdownHandler (and bugfix part of HBASE-6012) to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485161#comment-13485161 ] Sergey Shelukhin commented on HBASE-7039: - as per Ted's request attaching test logs Port HBASE-5914 Bulk assign regions in the process of ServerShutdownHandler (and bugfix part of HBASE-6012) to 0.94 --- Key: HBASE-7039 URL: https://issues.apache.org/jira/browse/HBASE-7039 Project: HBase Issue Type: Task Affects Versions: 0.94.2 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-7039-squashed.patch This is a major feature, so please -1 if you think it's too dangerous to port. However, it's also a perf improvement for recovery. The 2nd thing that HBASE-6012 addresses cannot be included without a breaking interface change (HRegionInterface openRegions doesn't return region states which are relied upon by the trunk code that is using protocol buffers API); or a non-breaking interface change with version-checking hackery to take advantage of it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7039) Port HBASE-5914 Bulk assign regions in the process of ServerShutdownHandler (and bugfix part of HBASE-6012) to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-7039: Status: Patch Available (was: Open)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7039) Port HBASE-5914 Bulk assign regions in the process of ServerShutdownHandler (and bugfix part of HBASE-6012) to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-7039: Attachment: test.log test-rerun.log test-am.log One seemingly unrelated test failed; passed on rerun. The AM test separately passes too.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7039) Port HBASE-5914 Bulk assign regions in the process of ServerShutdownHandler (and bugfix part of HBASE-6012) to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485167#comment-13485167 ] Hadoop QA commented on HBASE-7039: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551014/test.log against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3157//console This message is automatically generated.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6965) Generic MXBean Utility class to support all JDK vendors
[ https://issues.apache.org/jira/browse/HBASE-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485170#comment-13485170 ] stack commented on HBASE-6965: -- [~kumarr] OSMXBean is a bad name for the class. It is not an MBean. I suggest that it be called JVM. We can make a new JIRA to rename it if you are in agreement (I can do it). Also, why not use the existing http://hadoop.apache.org/docs/r0.20.0/api/org/apache/hadoop/util/Shell.ShellCommandExecutor.html instead of doing your own shell execution and managing processes in the code added here? Thanks for the list of classes and OSes tested on. Generic MXBean Utility class to support all JDK vendors --- Key: HBASE-6965 URL: https://issues.apache.org/jira/browse/HBASE-6965 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.94.1 Reporter: Kumar Ravi Assignee: Kumar Ravi Labels: patch Fix For: 0.96.0, 0.94.4 Attachments: HBASE-6965.patch, OSMXBean_HBASE-6965-0.94.patch This issue is related to JIRA https://issues.apache.org/jira/browse/HBASE-6945. This issue is opened to propose the use of a newly created generic org.apache.hadoop.hbase.util.OSMXBean class that can be used by other classes. JIRA HBASE-6945 contains a patch for the class org.apache.hadoop.hbase.ResourceChecker that uses OSMXBean. With the inclusion of this new class, HBase can be built and become functional with JDKs and JREs other than what is provided by Oracle. This class uses reflection to determine the JVM vendor (Sun, IBM) and the platform (Linux or Windows), and contains other methods that return the OS properties: 1. the number of open file descriptors; 2. the maximum number of file descriptors. This class compiles without any problems with IBM JDK 7, OpenJDK 6 as well as Oracle JDK 6. JUnit tests (runDevTests category) completed without any failures or errors when tested on all three JDKs. The builds and tests were attempted on branch hbase-0.94, revision 1396305.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
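The reflective trick such a vendor-neutral utility can use looks like the sketch below. This illustrates the technique, not the OSMXBean source: instead of importing com.sun.management.UnixOperatingSystemMXBean directly (the import that breaks compilation on non-Sun JDKs, per HBASE-6945), the interface is looked up by name at runtime:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.lang.reflect.Method;

// Vendor-neutral probe for the open-file-descriptor count. Compiles on any
// JDK because the com.sun class is only referenced by name, and degrades
// gracefully (-1) where that class or method does not exist.
public class OpenFdProbe {
    /** Open file descriptor count, or -1 where the JVM can't report it. */
    public static long openFileDescriptorCount() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        try {
            Class<?> unixMxBean =
                Class.forName("com.sun.management.UnixOperatingSystemMXBean");
            if (!unixMxBean.isInstance(os)) {
                return -1; // e.g. Windows
            }
            Method getter = unixMxBean.getMethod("getOpenFileDescriptorCount");
            return (Long) getter.invoke(os);
        } catch (ReflectiveOperationException e) {
            return -1; // class or method absent on this vendor's JDK
        }
    }

    public static void main(String[] args) {
        System.out.println("open file descriptors: " + openFileDescriptorCount());
    }
}
```

Invoking through the public interface (rather than the JVM's internal implementation class) keeps the call accessible without any setAccessible tricks.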
[jira] [Commented] (HBASE-6945) Compilation errors when using non-Sun JDKs to build HBase-0.94
[ https://issues.apache.org/jira/browse/HBASE-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485171#comment-13485171 ] stack commented on HBASE-6945: -- [~kumarr] You seem to be making a method with an empty body with this patch. Is that intentional? I also commented over in HBASE-6965. Compilation errors when using non-Sun JDKs to build HBase-0.94 -- Key: HBASE-6945 URL: https://issues.apache.org/jira/browse/HBASE-6945 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.94.1 Environment: RHEL 6.3, IBM Java 7 Reporter: Kumar Ravi Assignee: Kumar Ravi Labels: patch Fix For: 0.94.4 Attachments: ResourceCheckerJUnitListener_HBASE_6945-trunk.patch When using IBM Java 7 to build HBase-0.94.1, the following compilation error is seen.
[ERROR] COMPILATION ERROR :
[ERROR] /home/hadoop/hbase-0.94/src/test/java/org/apache/hadoop/hbase/ResourceChecker.java:[23,25] error: package com.sun.management does not exist
[ERROR] /home/hadoop/hbase-0.94/src/test/java/org/apache/hadoop/hbase/ResourceChecker.java:[46,25] error: cannot find symbol
[ERROR] symbol: class UnixOperatingSystemMXBean location: class ResourceAnalyzer
/home/hadoop/hbase-0.94/src/test/java/org/apache/hadoop/hbase/ResourceChecker.java:[75,29] error: cannot find symbol
[ERROR] symbol: class UnixOperatingSystemMXBean location: class ResourceAnalyzer
/home/hadoop/hbase-0.94/src/test/java/org/apache/hadoop/hbase/ResourceChecker.java:[76,23] error: cannot find symbol
[INFO] 4 errors
[INFO] BUILD FAILURE
I have a patch available which should work for all JDKs including Sun. I am in the process of testing this patch. Preliminary tests indicate the build is working fine with this patch. I will post this patch when I am done testing. -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
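The failure above comes from a compile-time dependency on the Sun-only class com.sun.management.UnixOperatingSystemMXBean. One JDK-neutral approach is to look the capability up reflectively at runtime; the sketch below is only an illustration of that idea, not necessarily what the attached patch does:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.lang.reflect.Method;

public class FdCounter {
    // Returns the open file descriptor count, or -1 when the current JVM's
    // OperatingSystemMXBean does not expose getOpenFileDescriptorCount()
    // (e.g. on JDKs without the com.sun.management extension, or on Windows).
    public static long openFdCount() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        try {
            Method m = os.getClass().getMethod("getOpenFileDescriptorCount");
            m.setAccessible(true); // the implementing class is not public
            return (Long) m.invoke(os);
        } catch (Exception e) {
            return -1L; // capability not available; caller treats as "unknown"
        }
    }
}
```

Since there is no import of com.sun.management, this compiles on any JDK; on JVMs that do provide the bean, the reflective call returns the real descriptor count.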
[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6060: --- Attachment: trunk-6060.patch

Regions's in OPENING state from failed regionservers takes a long time to recover
Key: HBASE-6060
URL: https://issues.apache.org/jira/browse/HBASE-6060
Project: HBase
Issue Type: Bug
Components: master, regionserver
Reporter: Enis Soztutar
Assignee: rajeshbabu
Fix For: 0.92.3, 0.94.3, 0.96.0
Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, trunk-6060.patch

we have seen a pattern in tests, that the regions are stuck in OPENING state for a very long time when the region server who is opening the region fails. My understanding of the process:
- master calls rs to open the region. If rs is offline, a new plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), HMaster.assign()
- RegionServer, starts opening a region, changes the state in znode. But that znode is not ephemeral. (see ZkAssign)
- Rs transitions zk node from OFFLINE to OPENING. See OpenRegionHandler.process()
- rs then opens the region, and changes znode from OPENING to OPENED
- when rs is killed between OPENING and OPENED states, then zk shows OPENING state, and the master just waits for rs to change the region state, but since rs is down, that wont happen.
- There is a AssignmentManager.TimeoutMonitor, which does exactly guard against these kind of conditions. It periodically checks (every 10 sec by default) the regions in transition to see whether they timedout (hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min, which explains what you and I are seeing.
- ServerShutdownHandler in Master does not reassign regions in OPENING state, although it handles other states.
Lowering that threshold from the configuration is one option, but still I think we can do better. Will investigate more.
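The TimeoutMonitor behavior described above can be sketched as follows; names and structure here are illustrative only, not the actual AssignmentManager code. A map of region to transition-start time is scanned on the monitor's period, and any region past the configured timeout is flagged for reassignment:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of a regions-in-transition timeout monitor;
// the real AssignmentManager.TimeoutMonitor is considerably more involved.
public class RitTimeoutMonitor {
    // cf. hbase.master.assignment.timeoutmonitor.timeout (default 30 min)
    private final long timeoutMs;
    private final Map<String, Long> transitionStart = new ConcurrentHashMap<>();

    public RitTimeoutMonitor(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    public void regionStartedTransition(String region, long nowMs) {
        transitionStart.put(region, nowMs);
    }

    public void regionFinishedTransition(String region) {
        transitionStart.remove(region);
    }

    // Called on the monitor's period (10 s by default): collect regions
    // whose transition has outlived the timeout so they can be reassigned.
    public List<String> timedOutRegions(long nowMs) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : transitionStart.entrySet()) {
            if (nowMs - e.getValue() > timeoutMs) {
                out.add(e.getKey());
            }
        }
        return out;
    }
}
```

With a 30-minute timeout, a region orphaned between OPENING and OPENED sits in this map for the full half hour before it shows up in timedOutRegions(), which is exactly the delay the report describes.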
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485173#comment-13485173 ] Jimmy Xiang commented on HBASE-6060: I uploaded a simple patch for 0.96: trunk-6060.patch. Could you please review?
[jira] [Commented] (HBASE-6707) TEST org.apache.hadoop.hbase.backup.example.TestZooKeeperTableArchiveClient.testMultipleTables flaps
[ https://issues.apache.org/jira/browse/HBASE-6707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485180#comment-13485180 ] stack commented on HBASE-6707: -- [~jesse_yates] So what to do here now Jesse to resolve the issue?

TEST org.apache.hadoop.hbase.backup.example.TestZooKeeperTableArchiveClient.testMultipleTables flaps
Key: HBASE-6707
URL: https://issues.apache.org/jira/browse/HBASE-6707
Project: HBase
Issue Type: Bug
Components: test
Reporter: Sameer Vaishampayan
Assignee: Jesse Yates
Priority: Critical
Fix For: 0.96.0
Attachments: 6707-v4-addendum.txt, hbase-6707-v0.patch, hbase-6707-v1.patch, hbase-6707-v2.patch, hbase-6707-v3.patch, hbase-6707-v4-addendum.patch, hbase-6707-v4.patch, testZooKeeperTableArchiveClient-output.txt

https://builds.apache.org/job/HBase-TRUNK/3293/
Error Message: Archived HFiles (hdfs://localhost:59986/user/jenkins/hbase/.archive/otherTable/01ced3b55d7220a9c460273a4a57b198/fam) should have gotten deleted, but didn't, remaining files: [hdfs://localhost:59986/user/jenkins/hbase/.archive/otherTable/01ced3b55d7220a9c460273a4a57b198/fam/fc872572a1f5443eb55b6e2567cfeb1c]
Stacktrace:
{code}
java.lang.AssertionError: Archived HFiles (hdfs://localhost:59986/user/jenkins/hbase/.archive/otherTable/01ced3b55d7220a9c460273a4a57b198/fam) should have gotten deleted, but didn't, remaining files: [hdfs://localhost:59986/user/jenkins/hbase/.archive/otherTable/01ced3b55d7220a9c460273a4a57b198/fam/fc872572a1f5443eb55b6e2567cfeb1c]
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertNull(Assert.java:551)
at org.apache.hadoop.hbase.backup.example.TestZooKeeperTableArchiveClient.testMultipleTables(TestZooKeeperTableArchiveClient.java:291)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
{code}
[jira] [Updated] (HBASE-6223) Document hbck improvements: HBASE-6173, HBASE-5360
[ https://issues.apache.org/jira/browse/HBASE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6223: --- Resolution: Fixed Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available)

Document hbck improvements: HBASE-6173, HBASE-5360
Key: HBASE-6223
URL: https://issues.apache.org/jira/browse/HBASE-6223
Project: HBase
Issue Type: Task
Components: documentation, hbck
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.96.0
Attachments: trunk-6223.patch, trunk-6223_v2.patch, trunk-6223_v3.patch

We had a couple of hbck improvements recently: HBASE-6173 and HBASE-5360. We should document them, especially HBASE-5360, since it's something one normally doesn't do.
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485188#comment-13485188 ] stack commented on HBASE-6060: -- How does this patch address this issue Jimmy? Way back I was trying to harden who owns the region in an earlier incarnation of this patch. We had gray areas where region could be PENDING_OPEN and it was unclear if region was owned by the master or owned by the regionserver on failure, though repair was different depending on who had region control. Fixing region states is for another JIRA?
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485189#comment-13485189 ] Ted Yu commented on HBASE-6060: ---
{code}
-this.stamp.set(System.currentTimeMillis());
+setTimestamp(System.currentTimeMillis());
{code}
EnvironmentEdgeManager should be used. It would be nice to include test(s) from Rajesh's patch(es).
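The EnvironmentEdgeManager suggestion refers to HBase's injectable-clock pattern: code asks a pluggable "edge" for the current time rather than calling System.currentTimeMillis() directly, so tests can substitute a deterministic clock. A minimal sketch of the idea (simplified, not the real HBase classes):

```java
// Minimal sketch of the injectable-clock pattern behind
// EnvironmentEdgeManager (simplified; not the actual HBase implementation).
interface EnvironmentEdge {
    long currentTime();
}

class DefaultEnvironmentEdge implements EnvironmentEdge {
    @Override
    public long currentTime() {
        // Production behavior: delegate to the wall clock.
        return System.currentTimeMillis();
    }
}

final class EdgeManager {
    private static volatile EnvironmentEdge edge = new DefaultEnvironmentEdge();

    static long currentTime() {
        return edge.currentTime();
    }

    // Test hook: replace the wall clock with a controlled one.
    static void injectEdge(EnvironmentEdge e) {
        edge = e;
    }
}
```

With this in place, a timestamp field is set via EdgeManager.currentTime() instead of System.currentTimeMillis(), which is the substitution the review comment asks for.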
[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485190#comment-13485190 ] Ted Yu commented on HBASE-6410: --- Would be nice if Stack's questions on review board are answered.

Move RegionServer Metrics to metrics2
Key: HBASE-6410
URL: https://issues.apache.org/jira/browse/HBASE-6410
Project: HBase
Issue Type: Sub-task
Components: metrics
Affects Versions: 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Blocker
Attachments: HBASE-6410-1.patch, HBASE-6410-2.patch, HBASE-6410-3.patch, HBASE-6410-4.patch, HBASE-6410-5.patch, HBASE-6410.patch
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485206#comment-13485206 ] Hadoop QA commented on HBASE-6060: --
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551016/trunk-6060.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 85 warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3158//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3158//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3158//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3158//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3158//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3158//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3158//console
This message is automatically generated.
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485209#comment-13485209 ] Jimmy Xiang commented on HBASE-6060: @Stack, here is my understanding of the problem. Master calls a rs to open a region. Now, in master memory, the region is in pending_open state with this rs' server name. Now the rs dies. When SSH starts, it goes to meta to find all the regions on this rs, minus those regions already in transition, then assigns the remaining regions. If the pending_open region (it could be opening too, depending on timing) was on this region server before, SSH will take care of it. Otherwise, if it was on a different region server, SSH will not pick it up. In this patch, I just time out the region transition so that the TimeoutMonitor can change the state and re-assign it, instead of waiting for a long time (now, 20 minutes by default). I'd like to make sure the region states in master memory are reliable. Otherwise, they are of not much use. So I think master always has region control. In 0.96, I think region states are very reliable now. Of course, there could be bugs I am not aware of yet. @Ted, good point. I will include the test. For EnvironmentEdgeManager, I will leave it to another jira.
[jira] [Commented] (HBASE-6223) Document hbck improvements: HBASE-6173, HBASE-5360
[ https://issues.apache.org/jira/browse/HBASE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485224#comment-13485224 ] Hudson commented on HBASE-6223: --- Integrated in HBase-TRUNK #3489 (See [https://builds.apache.org/job/HBase-TRUNK/3489/]) HBASE-6223 Document hbck improvements: HBASE-6173, HBASE-5360 (Revision 1402650) Result = FAILURE jxiang : Files : * /hbase/trunk/src/docbkx/book.xml
[jira] [Commented] (HBASE-5360) [uberhbck] Add options for how to handle offline split parents.
[ https://issues.apache.org/jira/browse/HBASE-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485225#comment-13485225 ] Hudson commented on HBASE-5360: --- Integrated in HBase-TRUNK #3489 (See [https://builds.apache.org/job/HBase-TRUNK/3489/]) HBASE-6223 Document hbck improvements: HBASE-6173, HBASE-5360 (Revision 1402650) Result = FAILURE jxiang : Files : * /hbase/trunk/src/docbkx/book.xml

[uberhbck] Add options for how to handle offline split parents.
Key: HBASE-5360
URL: https://issues.apache.org/jira/browse/HBASE-5360
Project: HBase
Issue Type: Improvement
Components: hbck
Affects Versions: 0.90.7, 0.92.1, 0.94.0
Reporter: Jonathan Hsieh
Assignee: Jimmy Xiang
Fix For: 0.90.7, 0.92.2, 0.94.1, 0.96.0
Attachments: 5360-0.90-hbase.patch, 5360-0.92-hbase.patch, 5360-0.94-hbase.patch, 5360_hbase_v4.patch, hbase-5360.path

In a recent case, we attempted to repair a cluster that suffered from HBASE-4238 and had about 6-7 generations of leftover split data. The hbck repair options in a development version of HBASE-5128 treat HDFS as ground truth but didn't check the SPLIT and OFFLINE flags found only in meta. The net effect was that it essentially attempted to merge many regions back into their eldest generation's parent's range. More safeguards to prevent mega-merges are being added in HBASE-5128. This issue would automate the handling of the mega-merge, avoiding cases such as lingering grandparents. The strategy here would be to add more checks against .META., and perform part of the catalog janitor's responsibilities for lingering grandparents. This would potentially include options to sideline regions, delete grandparent regions, set a minimum size for sidelining, and mechanisms for cleaning .META.. Note: there already exists a mechanism to reload these regions -- the bulk load mechanism in LoadIncrementalHFiles can be used to re-add grandparents (automatically splitting them if necessary) to HBase.
[jira] [Commented] (HBASE-6173) hbck check specified tables only
[ https://issues.apache.org/jira/browse/HBASE-6173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485223#comment-13485223 ] Hudson commented on HBASE-6173: --- Integrated in HBase-TRUNK #3489 (See [https://builds.apache.org/job/HBase-TRUNK/3489/]) HBASE-6223 Document hbck improvements: HBASE-6173, HBASE-5360 (Revision 1402650) Result = FAILURE jxiang : Files : * /hbase/trunk/src/docbkx/book.xml

hbck check specified tables only
Key: HBASE-6173
URL: https://issues.apache.org/jira/browse/HBASE-6173
Project: HBase
Issue Type: Improvement
Components: hbck
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.90.7, 0.92.2, 0.94.1, 0.96.0
Attachments: 6173-hbase-0.90.patch, 6173-hbase-0.90_v2.patch, 6173-hbase-0.92.patch, 6173-hbase-0.94.patch, 6173-hbase-v2.patch, hbase-6173.patch

Currently hbck can fix specified tables, so that we can fix one table at a time. However, it doesn't check the health of the specified tables only; it still checks the health of the whole system. If tables are specified, we can check the health of those tables only.
[jira] [Updated] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk
[ https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-7055: Attachment: HBASE-6371-squashed.patch
Except for some trivial parts, the port was mostly manual. The updateConfiguration code is not ported except as far as it's necessary for the tests (similar code is already being added elsewhere). mvn test appears to pass against latest trunk.

port HBASE-6371 tier-based compaction from 0.89-fb to trunk
Key: HBASE-7055
URL: https://issues.apache.org/jira/browse/HBASE-7055
Project: HBase
Issue Type: Task
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Attachments: HBASE-6371-squashed.patch

There's divergence in the code :( See HBASE-6371 for details.
[jira] [Updated] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk
[ https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-7055: Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-6371) [89-fb] Tier based compaction
[ https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485251#comment-13485251 ] Sergey Shelukhin commented on HBASE-6371: - trunk patch v1 is in HBASE-7055

[89-fb] Tier based compaction
Key: HBASE-6371
URL: https://issues.apache.org/jira/browse/HBASE-6371
Project: HBase
Issue Type: Improvement
Reporter: Akashnil
Assignee: Liyin Tang
Labels: noob
Attachments: HBASE-6371-089fb-commit.patch

Currently, the compaction selection is not very flexible and is not sensitive to the hotness of the data. Very old data is likely to be accessed less, and very recent data is likely to be in the block cache. Both of these considerations make it inefficient to compact these files as aggressively as other files. In some use-cases, the access-pattern is particularly obvious even though there is no way to control the compaction algorithm in those cases. In the new compaction selection algorithm, we plan to divide the candidate files into different levels according to oldness of the data that is present in those files. For each level, parameters like compaction ratio, minimum number of store-files in each compaction may be different. Number of levels, time-ranges, and parameters for each level will be configurable online on a per-column family basis.
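The leveling idea in the description can be illustrated with a small sketch: bucket candidate files by data age against configurable tier boundaries, so each tier can then apply its own compaction ratio and minimum file count. Names and boundaries below are made up for illustration; this is not the 0.89-fb implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative tiering by age: a file lands in the first tier whose
// upper age boundary exceeds the file's age; files older than every
// boundary fall into the last (coldest) tier.
public class TierBucketer {
    // boundaries are upper age limits in ms, ascending, e.g. {hour, day, week}
    public static int tierOf(long ageMs, long[] boundaries) {
        for (int i = 0; i < boundaries.length; i++) {
            if (ageMs < boundaries[i]) {
                return i;
            }
        }
        return boundaries.length;
    }

    // Group file ages into boundaries.length + 1 tiers; a per-tier
    // compaction policy would then run over each bucket independently.
    public static List<List<Long>> bucket(long[] fileAgesMs, long[] boundaries) {
        List<List<Long>> tiers = new ArrayList<>();
        for (int i = 0; i <= boundaries.length; i++) {
            tiers.add(new ArrayList<Long>());
        }
        for (long age : fileAgesMs) {
            tiers.get(tierOf(age, boundaries)).add(age);
        }
        return tiers;
    }
}
```

Keeping hot (young) files in their own tier means the aggressive ratios can be reserved for data that is no longer in the block cache, which is the motivation the description gives.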
[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk
[ https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485256#comment-13485256 ] Ted Yu commented on HBASE-7055: --- Compaction related tests pass.
{code}
+ * Copyright 2012 The Apache Software Foundation
{code}
The above line is no longer needed in the license header.
{code}
+public class CompactionConfiguration {
+
+  static final Log LOG = LogFactory.getLog(CompactionManager.class);
{code}
Class names mismatch. Please add annotations for audience and stability. Do this for the other new classes as well.
{code}
+
+public class TestTierCompactSelection extends TestDefaultCompactSelection {
{code}
Please add a test category. port HBASE-6371 tier-based compaction from 0.89-fb to trunk --- Key: HBASE-7055 URL: https://issues.apache.org/jira/browse/HBASE-7055 Project: HBase Issue Type: Task Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-6371-squashed.patch There's divergence in the code :( See HBASE-6371 for details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7024) TableMapReduceUtil.initTableMapperJob unnecessarily limits the types of outputKeyClass and outputValueClass
[ https://issues.apache.org/jira/browse/HBASE-7024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-7024: - Resolution: Fixed Fix Version/s: 0.96.0 Hadoop Flags: Incompatible change, Reviewed Status: Resolved (was: Patch Available) Thanks Dave for the patch. Committed to trunk. Aligns w/ our purging Writables. Thanks too for looking over the Hadoop way to see what it does. TableMapReduceUtil.initTableMapperJob unnecessarily limits the types of outputKeyClass and outputValueClass --- Key: HBASE-7024 URL: https://issues.apache.org/jira/browse/HBASE-7024 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Dave Beech Priority: Minor Fix For: 0.96.0 Attachments: HBASE-7024.patch The various initTableMapperJob methods in TableMapReduceUtil take outputKeyClass and outputValueClass parameters which need to extend WritableComparable and Writable respectively. Because of this, it is not convenient to use an alternative serialization like Avro. (I wanted to set these parameters to AvroKey and AvroValue). The methods in the MapReduce API to set map output key and value types do not impose this restriction, so is there a reason to do it here? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT
[ https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485262#comment-13485262 ] stack commented on HBASE-6070: -- [~tychang] Would you mind making a new issue to remove the dead code? Thank you. AM.nodeDeleted and SSH races creating problems for regions under SPLIT -- Key: HBASE-6070 URL: https://issues.apache.org/jira/browse/HBASE-6070 Project: HBase Issue Type: Bug Affects Versions: 0.92.1, 0.94.0 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.92.2, 0.94.1, 0.96.0 Attachments: HBASE-6070_0.92_1.patch, HBASE-6070_0.92.patch, HBASE-6070_0.94_1.patch, HBASE-6070_0.94.patch, HBASE-6070_trunk_1.patch, HBASE-6070_trunk.patch We tried to address the problems in Master restart and RS restart while a SPLIT region is in progress as part of HBASE-5806. While doing some more testing we found there is still one race condition:
- Split has just started and the znode is in RS_SPLIT state.
- RS goes down.
- First callback for SSH comes.
- As part of the fix for HBASE-5806, SSH knows that some region is in RIT.
- But now the nodeDeleted event comes for the SPLIT node and there we try to delete the RIT.
- After this we try to see in the SSH whether any node is in RIT. As we don't find the region in RIT, the region is never assigned.
When we fixed HBASE-5806, step 6 happened first and then step 5, so we missed it. Now we found it. Will come up with a patch shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush
[ https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485263#comment-13485263 ] Sergey Shelukhin commented on HBASE-6466: - Hi. Is there consensus on the approach? I think multiple threads make sense; as far as I see there's no bottleneck (e.g. several parallel writers to the same spindle or such). It's as if we are starting the requisite number of flushes in an async manner one after another... Enable multi-thread for memstore flush -- Key: HBASE-6466 URL: https://issues.apache.org/jira/browse/HBASE-6466 Project: HBase Issue Type: Improvement Reporter: chunhui shen Assignee: chunhui shen Attachments: HBASE-6466.patch, HBASE-6466v2.patch, HBASE-6466v3.patch If the KVs are large or the HLog is closed under high-pressure writing, we found the memstore is often above the high water mark and blocks writes. So should we enable multi-threaded memstore flush? Some performance test data for reference:
1. Test environment: random writing; upper memstore limit 5.6GB; lower memstore limit 4.8GB; 400 regions per regionserver; row len = 50 bytes, value len = 1024 bytes; 5 regionservers, 300 IPC handlers per regionserver; 5 clients, 50 writer threads per client.
2. Test results:
- one cacheFlush handler: tps 7.8k/s per regionserver, flush 10.1MB/s per regionserver, with many aboveGlobalMemstoreLimit blocks appearing
- two cacheFlush handlers: tps 10.7k/s per regionserver, flush 12.46MB/s per regionserver
- two cacheFlush handlers with 200 thread handlers per client: tps 16.1k/s per regionserver, flush 18.6MB/s per regionserver
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
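The proposal above, multiple cacheFlush handlers draining flush requests in parallel, can be sketched as a fixed pool of handler threads over a shared queue. All names below are hypothetical stand-ins, not HBase's actual MemStoreFlusher code.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class FlushPool {
    private final BlockingQueue<Runnable> flushQueue = new LinkedBlockingQueue<>();
    private final ExecutorService handlers;

    FlushPool(int handlerCount) {
        // N handler threads, analogous to N cacheFlush handlers in the tests.
        handlers = Executors.newFixedThreadPool(handlerCount);
        for (int i = 0; i < handlerCount; i++) {
            handlers.submit(() -> {
                try {
                    while (!Thread.currentThread().isInterrupted()) {
                        // Each handler takes the next queued flush and runs it;
                        // with several handlers, flushes proceed in parallel.
                        flushQueue.take().run();
                    }
                } catch (InterruptedException ignored) {
                    // pool is shutting down
                }
            });
        }
    }

    // A region whose memstore is over its limit enqueues a flush request here.
    void requestFlush(Runnable regionFlush) {
        flushQueue.add(regionFlush);
    }

    void shutdown() {
        handlers.shutdownNow();
    }
}
```

With one handler the queue drains serially, which is the blocking behavior the test data shows; adding handlers raises the aggregate flush rate until some other resource becomes the bottleneck.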
[jira] [Commented] (HBASE-3925) Make Shell's -d and debug cmd behave the same
[ https://issues.apache.org/jira/browse/HBASE-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485267#comment-13485267 ] stack commented on HBASE-3925: -- [~xieliang007] Mind attaching console output that demonstrates this patch does what it claims? Patch looks good. Make Shell's -d and debug cmd behave the same - Key: HBASE-3925 URL: https://issues.apache.org/jira/browse/HBASE-3925 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.90.3, 0.90.7, 0.92.2, 0.94.3, 0.96.0, 0.98.0 Reporter: Lars George Assignee: liang xie Priority: Trivial Labels: patch Attachments: HBASE-3925.patch, HBASE-3925-v2.txt The -d option switches log4j to DEBUG and leaves the backtrace level at the default. When using the supplied debug command we only switch the backtrace, but I would think this should also set the log4j levels:
{noformat}
# Debugging method
def debug
  if @shell.debug
    @shell.debug = false
    conf.back_trace_limit = 0
  else
    @shell.debug = true
    conf.back_trace_limit = 100
  end
  debug?
end
{noformat}
could be something like
{noformat}
# Debugging method
def debug
  if @shell.debug
    @shell.debug = false
    conf.back_trace_limit = 0
    log_level = org.apache.log4j.Level::ERROR
  else
    @shell.debug = true
    conf.back_trace_limit = 100
    log_level = org.apache.log4j.Level::DEBUG
  end
  org.apache.log4j.Logger.getLogger("org.apache.zookeeper").setLevel(log_level)
  org.apache.log4j.Logger.getLogger("org.apache.hadoop.hbase").setLevel(log_level)
  debug?
end
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6793) Make hbase-examples module
[ https://issues.apache.org/jira/browse/HBASE-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485276#comment-13485276 ] stack commented on HBASE-6793: -- [~sershe] I added comments over in review. Does the patch here address Jesse's comments? If so, I'll commit. Make hbase-examples module -- Key: HBASE-6793 URL: https://issues.apache.org/jira/browse/HBASE-6793 Project: HBase Issue Type: Improvement Affects Versions: 0.96.0 Reporter: Enis Soztutar Assignee: Sergey Shelukhin Labels: noob Attachments: HBASE-6793.patch, HBASE-6793-v2.patch, HBASE-6793-v3-thrift-0.9.0.patch There are some examples under /examples/, which are not compiled as a part of the build. We can move them to an hbase-examples module. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk
[ https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485279#comment-13485279 ] Hadoop QA commented on HBASE-7055: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551034/HBASE-6371-squashed.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 13 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 83 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//console This message is automatically generated. port HBASE-6371 tier-based compaction from 0.89-fb to trunk --- Key: HBASE-7055 URL: https://issues.apache.org/jira/browse/HBASE-7055 Project: HBase Issue Type: Task Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-6371-squashed.patch There's divergence in the code :( See HBASE-6371 for details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6894) Adding metadata to a table in the shell is both arcane and painful
[ https://issues.apache.org/jira/browse/HBASE-6894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485284#comment-13485284 ] stack commented on HBASE-6894: -- [~sershe] Pardon my not getting to this earlier (I think it an important patch). Patch looks good. I need to try it. Does the help need to change to match your changes? Say, the help around 'create' and for 'alter'? Would you mind listing in the release notes for this issue how the user's view on shell operations has changed? Once that is there I will have something to test against. Thanks for doing this Sergey. Adding metadata to a table in the shell is both arcane and painful -- Key: HBASE-6894 URL: https://issues.apache.org/jira/browse/HBASE-6894 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.96.0 Reporter: stack Assignee: Sergey Shelukhin Labels: noob Attachments: HBASE-6894.patch, HBASE-6894.patch, HBASE-6894.patch, HBASE-6894-v2.patch In production we have hundreds of tables w/ whack names like 'aliaserv', 'ashish_bulk', 'age_gender_topics', etc. It be grand if you could look in master UI and see stuff like owner, eng group responsible, miscellaneous description, etc. Now, HTD has support for this; each carries a dictionary. What's a PITA though is adding attributes to the dictionary. Here is what seems to work on trunk (though I do not trust it is doing the right thing):
{code}
hbase> create 'SOME_TABLENAME', {NAME => 'd', VERSION => 1, COMPRESSION => 'LZO'}
hbase> # Here is how I added metadata
hbase> disable 'SOME_TABLENAME'
hbase> alter 'SOME_TABLENAME', METHOD => 'table_att', OWNER => 'SOMEON', CONFIG => {'ENVIRONMENT' => 'BLAH BLAH', 'SIZING' => 'The size should be between 0-10K most of the time with new URLs coming in and getting removed as they are processed unless the pipeline has fallen behind', 'MISCELLANEOUS' => 'Holds the list of URLs waiting to be processed in the parked page detection analyzer in ingestion pipeline.'}
... describe... enable...
{code}
The above doesn't work in 0.94. Complains about the CONFIG, the keyword we are using for the HTD dictionary. It works in 0.96 though I'd have to poke around some more to ensure it is doing the right thing. But this METHOD => 'table_att' stuff is really ugly; can we fix it? And I can't add table attributes on table create seemingly. A little bit of thought and a bit of ruby could clean this all up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6968) Several HBase write perf improvement
[ https://issues.apache.org/jira/browse/HBASE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6968: - Affects Version/s: 0.90.6 0.92.2 0.94.2 Several HBase write perf improvement Key: HBASE-6968 URL: https://issues.apache.org/jira/browse/HBASE-6968 Project: HBase Issue Type: Improvement Affects Versions: 0.90.6, 0.92.2, 0.94.2 Reporter: Liyin Tang Here are two recent HBase write performance improvements:
1) Avoid creating an HBaseConfiguration object for each HLog. Every time an HBaseConfiguration object is created, it parses the XML configuration files from disk, which is not a cheap operation. In HLog.java:
orig:
{code:title=HLog.java}
newWriter = createWriter(fs, newPath, HBaseConfiguration.create(conf));
{code}
new:
{code}
newWriter = createWriter(fs, newPath, conf);
{code}
2) Change two hotspot synchronized functions to the double-checked locking pattern, which removes the synchronization overhead in the common case.
orig:
{code:title=HBaseRpcMetrics.java}
public synchronized void inc(String name, int amt) {
  MetricsTimeVaryingRate m = get(name);
  if (m == null) {
    m = create(name);
  }
  m.inc(amt);
}
{code}
new:
{code}
public void inc(String name, int amt) {
  MetricsTimeVaryingRate m = get(name);
  if (m == null) {
    synchronized (this) {
      if ((m = get(name)) == null) {
        m = create(name);
      }
    }
  }
  m.inc(amt);
}
{code}
orig:
{code:title=MemStoreFlusher.java}
public synchronized void reclaimMemStoreMemory() {
  if (this.server.getGlobalMemstoreSize().get() >= globalMemStoreLimit) {
    flushSomeRegions();
  }
}
{code}
new:
{code}
public void reclaimMemStoreMemory() {
  if (this.server.getGlobalMemstoreSize().get() >= globalMemStoreLimit) {
    flushSomeRegions();
  }
}

private synchronized void flushSomeRegions() {
  if (this.server.getGlobalMemstoreSize().get() < globalMemStoreLimit) {
    return; // double check the global memstore size inside of the synchronized block
  }
  ...
}
{code}
-- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7024) TableMapReduceUtil.initTableMapperJob unnecessarily limits the types of outputKeyClass and outputValueClass
[ https://issues.apache.org/jira/browse/HBASE-7024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485287#comment-13485287 ] Hudson commented on HBASE-7024: --- Integrated in HBase-TRUNK #3490 (See [https://builds.apache.org/job/HBase-TRUNK/3490/]) HBASE-7024 TableMapReduceUtil.initTableMapperJob unnecessarily limits the types of outputKeyClass and outputValueClass (Revision 1402710) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapred/TableMapReduceUtil.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java TableMapReduceUtil.initTableMapperJob unnecessarily limits the types of outputKeyClass and outputValueClass --- Key: HBASE-7024 URL: https://issues.apache.org/jira/browse/HBASE-7024 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Dave Beech Priority: Minor Fix For: 0.96.0 Attachments: HBASE-7024.patch The various initTableMapperJob methods in TableMapReduceUtil take outputKeyClass and outputValueClass parameters which need to extend WritableComparable and Writable respectively. Because of this, it is not convenient to use an alternative serialization like Avro. (I wanted to set these parameters to AvroKey and AvroValue). The methods in the MapReduce API to set map output key and value types do not impose this restriction, so is there a reason to do it here? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk
[ https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485289#comment-13485289 ] Sergey Shelukhin commented on HBASE-7055: - looks like a real failure, I'll take a look port HBASE-6371 tier-based compaction from 0.89-fb to trunk --- Key: HBASE-7055 URL: https://issues.apache.org/jira/browse/HBASE-7055 Project: HBase Issue Type: Task Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-6371-squashed.patch There's divergence in the code :( See HBASE-6371 for details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6775) Use ZK.multi when available for HBASE-6710 0.92/0.94 compatibility fix
[ https://issues.apache.org/jira/browse/HBASE-6775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485294#comment-13485294 ] Gregory Chanan commented on HBASE-6775: --- I'm not sure we will require ZK 3.4+ in 0.94. Perhaps we could put this behind a configuration parameter (hbase.zookeeper.serverSupportsMulti or hbase.zookeeper.useMulti), which would give the user the option of using multi if they know their cluster supports it. I also contemplated trying to figure this out programmatically (calling srvr via send4LetterWord on the zookeeper quorum), but I've been told this is a bad idea. Trying to get more information on that approach. Use ZK.multi when available for HBASE-6710 0.92/0.94 compatibility fix -- Key: HBASE-6775 URL: https://issues.apache.org/jira/browse/HBASE-6775 Project: HBase Issue Type: Improvement Components: Zookeeper Affects Versions: 0.94.2 Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Minor Fix For: 0.94.4 HBASE-6710 fixed a 0.92/0.94 compatibility issue by writing two znodes in different formats. If a ZK failure occurs between the writing of the two znodes, strange behavior can result. This issue is a reminder to change these two ZK writes to use ZK.multi when we require ZK 3.4+. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
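A minimal sketch of the configuration gate suggested above, assuming the proposed hbase.zookeeper.useMulti key; the method names and the Runnable stand-ins for the two znode writes are illustrative, not HBase's or ZooKeeper's actual API.

```java
import java.util.Map;

public class ZkWriteGate {
    // Default to false: the operator must opt in, since only ZK 3.4+
    // servers support the multi operation.
    static boolean useMulti(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault("hbase.zookeeper.useMulti", "false"));
    }

    // Apply the two znode writes either as one atomic batch (multi) or one
    // by one (the pre-3.4 behavior, which leaves a window where a ZK failure
    // between the writes can leave the two znodes inconsistent).
    static void writeBothZnodes(Map<String, String> conf,
                                Runnable atomicBatch,
                                Runnable firstWrite, Runnable secondWrite) {
        if (useMulti(conf)) {
            atomicBatch.run();
        } else {
            firstWrite.run();
            secondWrite.run();
        }
    }
}
```

The design choice here matches the comment: an explicit flag avoids the fragile server-probing approach (the srvr four-letter word) while still letting users with 3.4+ clusters get atomicity.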
[jira] [Commented] (HBASE-6793) Make hbase-examples module
[ https://issues.apache.org/jira/browse/HBASE-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485295#comment-13485295 ] Sergey Shelukhin commented on HBASE-6793: - [~stack] Not the 2nd iteration of comments, no. I will get to it probably on Monday. Thanks! Make hbase-examples module -- Key: HBASE-6793 URL: https://issues.apache.org/jira/browse/HBASE-6793 Project: HBase Issue Type: Improvement Affects Versions: 0.96.0 Reporter: Enis Soztutar Assignee: Sergey Shelukhin Labels: noob Attachments: HBASE-6793.patch, HBASE-6793-v2.patch, HBASE-6793-v3-thrift-0.9.0.patch There are some examples under /examples/, which are not compiled as a part of the build. We can move them to an hbase-examples module. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6173) hbck check specified tables only
[ https://issues.apache.org/jira/browse/HBASE-6173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485296#comment-13485296 ] Hudson commented on HBASE-6173: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #240 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/240/]) HBASE-6223 Document hbck improvements: HBASE-6173, HBASE-5360 (Revision 1402650) Result = FAILURE jxiang : Files : * /hbase/trunk/src/docbkx/book.xml hbck check specified tables only Key: HBASE-6173 URL: https://issues.apache.org/jira/browse/HBASE-6173 Project: HBase Issue Type: Improvement Components: hbck Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.90.7, 0.92.2, 0.94.1, 0.96.0 Attachments: 6173-hbase-0.90.patch, 6173-hbase-0.90_v2.patch, 6173-hbase-0.92.patch, 6173-hbase-0.94.patch, 6173-hbase-v2.patch, hbase-6173.patch Currently hbck can fix specified tables so that we can fix one table each time. However, it doesn't check the health of the specified tables only. It still checks the health of the whole system. If tables are specified, we can check the health of these tables only. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6223) Document hbck improvements: HBASE-6173, HBASE-5360
[ https://issues.apache.org/jira/browse/HBASE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485298#comment-13485298 ] Hudson commented on HBASE-6223: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #240 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/240/]) HBASE-6223 Document hbck improvements: HBASE-6173, HBASE-5360 (Revision 1402650) Result = FAILURE jxiang : Files : * /hbase/trunk/src/docbkx/book.xml Document hbck improvements: HBASE-6173, HBASE-5360 --- Key: HBASE-6223 URL: https://issues.apache.org/jira/browse/HBASE-6223 Project: HBase Issue Type: Task Components: documentation, hbck Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: trunk-6223.patch, trunk-6223_v2.patch, trunk-6223_v3.patch We had a couple hbck improvements recently: HBASE-6173 and HBASE-5360. We should document them. Especially, for HBASE-5360, it's something one normally doesn't do. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7024) TableMapReduceUtil.initTableMapperJob unnecessarily limits the types of outputKeyClass and outputValueClass
[ https://issues.apache.org/jira/browse/HBASE-7024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485297#comment-13485297 ] Hudson commented on HBASE-7024: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #240 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/240/]) HBASE-7024 TableMapReduceUtil.initTableMapperJob unnecessarily limits the types of outputKeyClass and outputValueClass (Revision 1402710) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapred/TableMapReduceUtil.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java TableMapReduceUtil.initTableMapperJob unnecessarily limits the types of outputKeyClass and outputValueClass --- Key: HBASE-7024 URL: https://issues.apache.org/jira/browse/HBASE-7024 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Dave Beech Priority: Minor Fix For: 0.96.0 Attachments: HBASE-7024.patch The various initTableMapperJob methods in TableMapReduceUtil take outputKeyClass and outputValueClass parameters which need to extend WritableComparable and Writable respectively. Because of this, it is not convenient to use an alternative serialization like Avro. (I wanted to set these parameters to AvroKey and AvroValue). The methods in the MapReduce API to set map output key and value types do not impose this restriction, so is there a reason to do it here? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5360) [uberhbck] Add options for how to handle offline split parents.
[ https://issues.apache.org/jira/browse/HBASE-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485299#comment-13485299 ] Hudson commented on HBASE-5360: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #240 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/240/]) HBASE-6223 Document hbck improvements: HBASE-6173, HBASE-5360 (Revision 1402650) Result = FAILURE jxiang : Files : * /hbase/trunk/src/docbkx/book.xml [uberhbck] Add options for how to handle offline split parents. Key: HBASE-5360 URL: https://issues.apache.org/jira/browse/HBASE-5360 Project: HBase Issue Type: Improvement Components: hbck Affects Versions: 0.90.7, 0.92.1, 0.94.0 Reporter: Jonathan Hsieh Assignee: Jimmy Xiang Fix For: 0.90.7, 0.92.2, 0.94.1, 0.96.0 Attachments: 5360-0.90-hbase.patch, 5360-0.92-hbase.patch, 5360-0.94-hbase.patch, 5360_hbase_v4.patch, hbase-5360.path In a recent case, we attempted to repair a cluster that suffered from HBASE-4238 and had about 6-7 generations of leftover split data. The hbck repair options in a development version of HBASE-5128 treat HDFS as ground truth but didn't check the SPLIT and OFFLINE flags only found in meta. The net effect was that it essentially attempted to merge many regions back into the eldest generation's parent's range. More safeguards to prevent mega-merges are being added in HBASE-5128. This issue would automate the handling of the mega-merge, avoiding cases such as lingering grandparents. The strategy here would be to add more checks against .META., and perform part of the catalog janitor's responsibilities for lingering grandparents. This would potentially include options to sideline regions, delete grandparent regions, a minimum size for sidelining, and mechanisms for cleaning .META.. Note: There already exists a mechanism to reload these regions -- the bulk load mechanisms in LoadIncrementalHFiles can be used to re-add grandparents (automatically splitting them if necessary) to HBase. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-6410: - Attachment: HBASE-6410-6.patch * Moved Metrics classes out of metrics sub packages so there are fewer public accessors needed. * Cleaned up comments * Removed table level roll ups. * Cleaned up JMX Bean naming. * Added a region metrics test. Move RegionServer Metrics to metrics2 - Key: HBASE-6410 URL: https://issues.apache.org/jira/browse/HBASE-6410 Project: HBase Issue Type: Sub-task Components: metrics Affects Versions: 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Blocker Attachments: HBASE-6410-1.patch, HBASE-6410-2.patch, HBASE-6410-3.patch, HBASE-6410-4.patch, HBASE-6410-5.patch, HBASE-6410-6.patch, HBASE-6410.patch Move RegionServer Metrics to metrics2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk
[ https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485311#comment-13485311 ] Sergey Shelukhin commented on HBASE-7055: - I know what the issue is: the "don't exclude bulk" setting is gone from trunk, so I will remove it from the patch. port HBASE-6371 tier-based compaction from 0.89-fb to trunk --- Key: HBASE-7055 URL: https://issues.apache.org/jira/browse/HBASE-7055 Project: HBase Issue Type: Task Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-6371-squashed.patch There's divergence in the code :( See HBASE-6371 for details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock
[ https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485313#comment-13485313 ] Lars Hofhansl commented on HBASE-5898: -- Sounds like a missed notify or a deadlock. Although looking at the code I do not see how that can happen. The use of notify (vs. notifyAll) seems correct in IdLock since all waiting threads wait for the same condition and only one thread will be able to proceed. @Ram: Which version of HBase? Consider double-checked locking for block cache lock Key: HBASE-5898 URL: https://issues.apache.org/jira/browse/HBASE-5898 Project: HBase Issue Type: Improvement Components: Performance Affects Versions: 0.94.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.94.3, 0.96.0 Attachments: 5898-TestBlocksRead.txt, HBASE-5898-0.patch, HBASE-5898-1.patch, hbase-5898.txt Running a workload with a high query rate against a dataset that fits in cache, I saw a lot of CPU being used in IdLock.getLockEntry, being called by HFileReaderV2.readBlock. Even though it was all cache hits, it was wasting a lot of CPU doing lock management here. I wrote a quick patch to switch to a double-checked locking and it improved throughput substantially for this workload. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
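The double-checked pattern at issue here, check the cache first, take the per-id lock only on a miss, then re-check under the lock, can be sketched like this. Class and method names are illustrative stand-ins, not the actual HFileReaderV2/IdLock code.

```java
import java.util.concurrent.ConcurrentHashMap;

public class DoubleCheckedCache<K, V> {
    interface Loader<A, B> { B load(A key); }

    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    // One lock object per key; the real IdLock also reclaims entries.
    private final ConcurrentHashMap<K, Object> idLocks = new ConcurrentHashMap<>();

    V get(K key, Loader<K, V> loader) {
        // First check: a cache hit skips lock management entirely,
        // which is where the CPU savings in this issue came from.
        V v = cache.get(key);
        if (v != null) {
            return v;
        }
        Object lock = idLocks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            // Second check: another thread may have loaded the block
            // while we were waiting for the lock.
            v = cache.get(key);
            if (v == null) {
                v = loader.load(key); // the expensive read from disk
                cache.put(key, v);
            }
        }
        return v;
    }
}
```

On a pure cache-hit workload like the one Todd describes, every read takes only the first, lock-free path; the per-block lock is touched only on misses.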
[jira] [Commented] (HBASE-7051) Read/Updates (increment,checkAndPut) should properly read MVCC
[ https://issues.apache.org/jira/browse/HBASE-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485316#comment-13485316 ] Lars Hofhansl commented on HBASE-7051: -- Can you explain #3 more? In the end this was really broken by the work to release the lock before the WAL is sync'ed (and since, for that to be correct, MVCC needs to be rolled forward after the WAL is sync'ed, which means after the lock was released). Read/Updates (increment,checkAndPut) should properly read MVCC -- Key: HBASE-7051 URL: https://issues.apache.org/jira/browse/HBASE-7051 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Gregory Chanan See, for example: {code} // TODO: Use MVCC to make this set of increments atomic to reads {code} Here's an example of what can happen (it would probably be good to write up a test case for each read/update): Concurrent update via increment and put. The put grabs the row lock first and updates the memstore, but releases the row lock before the MVCC is advanced. Then, the increment grabs the row lock and reads right away, reading the old value and incrementing based on that. There are a few options here: 1) Waiting for the MVCC to advance for read/updates: the downside is that you have to wait for updates on other rows. 2) Have an MVCC per-row (table configuration): this avoids the unnecessary contention of 1) 3) Transform the read/updates to write-only with rollup on read, e.g. an increment would just have the number of values to increment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
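The stale-read race in the description can be modeled in a few lines. The class and method names below are illustrative, not HBase's actual MVCC API: writes land in the memstore immediately, but readers only see entries whose write number is at or below the current read point.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of MVCC visibility: a put that has updated the memstore but whose
// write number has not been rolled into the read point yet is invisible, so a
// read/update (e.g. an increment) that reads right away sees the old value.
public class MvccSketch {
    private static final class Entry {
        final long writeNum;
        final long value;
        Entry(long writeNum, long value) { this.writeNum = writeNum; this.value = value; }
    }

    private final List<Entry> memstore = new ArrayList<>();
    private long readPoint = 0;

    public void write(long writeNum, long value) {
        memstore.add(new Entry(writeNum, value));
    }

    public void advanceReadPoint(long writeNum) {
        readPoint = Math.max(readPoint, writeNum);
    }

    // Latest value visible at the current read point (0 if none).
    public long read() {
        long latest = 0;
        for (Entry e : memstore) {
            if (e.writeNum <= readPoint) {
                latest = e.value;
            }
        }
        return latest;
    }
}
```

A put with write number 2 that has reached the memstore stays invisible until advanceReadPoint(2) runs; if the put releases the row lock before that, a concurrent increment can take the lock and read() the older value, which is exactly the race described above.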
[jira] [Updated] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk
[ https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-7055: Attachment: HBASE-6371-v2-squashed.patch The failed test now passes locally. Also addressed the code review feedback, except for stability annotations - these are internal interfaces, so I wonder whether stability annotations are necessary? They're not used on similar classes. port HBASE-6371 tier-based compaction from 0.89-fb to trunk --- Key: HBASE-7055 URL: https://issues.apache.org/jira/browse/HBASE-7055 Project: HBase Issue Type: Task Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch There's divergence in the code :( See HBASE-6371 for details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485322#comment-13485322 ] Hadoop QA commented on HBASE-6410: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551044/HBASE-6410-6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 157 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 87 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 11 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.io.hfile.TestHFileBlock org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3160//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3160//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3160//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3160//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3160//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3160//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3160//console This message is automatically generated. Move RegionServer Metrics to metrics2 - Key: HBASE-6410 URL: https://issues.apache.org/jira/browse/HBASE-6410 Project: HBase Issue Type: Sub-task Components: metrics Affects Versions: 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Blocker Attachments: HBASE-6410-1.patch, HBASE-6410-2.patch, HBASE-6410-3.patch, HBASE-6410-4.patch, HBASE-6410-5.patch, HBASE-6410-6.patch, HBASE-6410.patch Move RegionServer Metrics to metrics2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk
[ https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485325#comment-13485325 ] Ted Yu commented on HBASE-7055: --- The new classes are public, so they should have audience annotation. If the audience is Private, stability can be omitted. port HBASE-6371 tier-based compaction from 0.89-fb to trunk --- Key: HBASE-7055 URL: https://issues.apache.org/jira/browse/HBASE-7055 Project: HBase Issue Type: Task Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch There's divergence in the code :( See HBASE-6371 for details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7008) Set scanner caching to a better default
[ https://issues.apache.org/jira/browse/HBASE-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-7008: - Attachment: 7008-0.94-v3.txt 0.94 patch I plan to commit. Enables tcpnodelay and sets scanner caching default to 100. Set scanner caching to a better default --- Key: HBASE-7008 URL: https://issues.apache.org/jira/browse/HBASE-7008 Project: HBase Issue Type: Bug Components: Client Reporter: liang xie Assignee: liang xie Fix For: 0.94.3, 0.96.0 Attachments: 7008-0.94.txt, 7008-0.94-v2.txt, 7008-0.94-v3.txt, 7008-trunk-v5.txt, 7008-v3.txt, 7008-v4.txt, HBASE-7008.patch, HBASE-7008-v2.patch per http://search-hadoop.com/m/qaRu9iM2f02/Set+scanner+caching+to+a+better+default%253Fsubj=Set+scanner+caching+to+a+better+default+ let's set to 100 by default -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
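For reference, the two defaults this patch changes could be expressed (or overridden) in hbase-site.xml roughly as below; the property names are assumptions based on 0.94-era configuration, not taken from the patch itself.

```xml
<!-- Illustrative hbase-site.xml fragment; property names assumed, not from the patch. -->
<property>
  <name>hbase.client.scanner.caching</name>
  <value>100</value> <!-- rows fetched per scanner RPC -->
</property>
<property>
  <name>hbase.ipc.client.tcpnodelay</name>
  <value>true</value> <!-- disable Nagle's algorithm on client RPC sockets -->
</property>
```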
[jira] [Updated] (HBASE-7008) Set scanner caching to a better default
[ https://issues.apache.org/jira/browse/HBASE-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-7008: - Attachment: 7008-trunk-v5.txt Trunk patch Set scanner caching to a better default --- Key: HBASE-7008 URL: https://issues.apache.org/jira/browse/HBASE-7008 Project: HBase Issue Type: Bug Components: Client Reporter: liang xie Assignee: liang xie Fix For: 0.94.3, 0.96.0 Attachments: 7008-0.94.txt, 7008-0.94-v2.txt, 7008-0.94-v3.txt, 7008-trunk-v5.txt, 7008-v3.txt, 7008-v4.txt, HBASE-7008.patch, HBASE-7008-v2.patch per http://search-hadoop.com/m/qaRu9iM2f02/Set+scanner+caching+to+a+better+default%253Fsubj=Set+scanner+caching+to+a+better+default+ let's set to 100 by default -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7008) Set scanner caching to a better default
[ https://issues.apache.org/jira/browse/HBASE-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-7008: - Status: Patch Available (was: Open) Set scanner caching to a better default --- Key: HBASE-7008 URL: https://issues.apache.org/jira/browse/HBASE-7008 Project: HBase Issue Type: Bug Components: Client Reporter: liang xie Assignee: liang xie Fix For: 0.94.3, 0.96.0 Attachments: 7008-0.94.txt, 7008-0.94-v2.txt, 7008-0.94-v3.txt, 7008-trunk-v5.txt, 7008-v3.txt, 7008-v4.txt, HBASE-7008.patch, HBASE-7008-v2.patch per http://search-hadoop.com/m/qaRu9iM2f02/Set+scanner+caching+to+a+better+default%253Fsubj=Set+scanner+caching+to+a+better+default+ let's set to 100 by default -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock
[ https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485328#comment-13485328 ] Lars Hofhansl commented on HBASE-5898: -- Clearly this can happen when HDFS has a problem. One thread tries to load the block, and if that is delayed due to HDFS all other threads need to queue up and wait. Consider double-checked locking for block cache lock Key: HBASE-5898 URL: https://issues.apache.org/jira/browse/HBASE-5898 Project: HBase Issue Type: Improvement Components: Performance Affects Versions: 0.94.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.94.3, 0.96.0 Attachments: 5898-TestBlocksRead.txt, HBASE-5898-0.patch, HBASE-5898-1.patch, hbase-5898.txt Running a workload with a high query rate against a dataset that fits in cache, I saw a lot of CPU being used in IdLock.getLockEntry, being called by HFileReaderV2.readBlock. Even though it was all cache hits, it was wasting a lot of CPU doing lock management here. I wrote a quick patch to switch to a double-checked locking and it improved throughput substantially for this workload. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk
[ https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485332#comment-13485332 ] Hadoop QA commented on HBASE-7055: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551047/HBASE-6371-v2-squashed.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 13 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 83 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.catalog.TestMetaReaderEditor Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//console This message is automatically generated. port HBASE-6371 tier-based compaction from 0.89-fb to trunk --- Key: HBASE-7055 URL: https://issues.apache.org/jira/browse/HBASE-7055 Project: HBase Issue Type: Task Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch There's divergence in the code :( See HBASE-6371 for details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7008) Set scanner caching to a better default
[ https://issues.apache.org/jira/browse/HBASE-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485334#comment-13485334 ] Hadoop QA commented on HBASE-7008: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551050/7008-trunk-v5.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 85 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.client.TestScannerTimeout Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3162//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3162//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3162//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3162//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3162//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3162//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3162//console This message is automatically generated. Set scanner caching to a better default --- Key: HBASE-7008 URL: https://issues.apache.org/jira/browse/HBASE-7008 Project: HBase Issue Type: Bug Components: Client Reporter: liang xie Assignee: liang xie Fix For: 0.94.3, 0.96.0 Attachments: 7008-0.94.txt, 7008-0.94-v2.txt, 7008-0.94-v3.txt, 7008-trunk-v5.txt, 7008-v3.txt, 7008-v4.txt, HBASE-7008.patch, HBASE-7008-v2.patch per http://search-hadoop.com/m/qaRu9iM2f02/Set+scanner+caching+to+a+better+default%253Fsubj=Set+scanner+caching+to+a+better+default+ let's set to 100 by default -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7008) Set scanner caching to a better default
[ https://issues.apache.org/jira/browse/HBASE-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-7008: - Attachment: 7008-trunk-v6.txt Trunk version that fixes TestScannerTimeout Set scanner caching to a better default --- Key: HBASE-7008 URL: https://issues.apache.org/jira/browse/HBASE-7008 Project: HBase Issue Type: Bug Components: Client Reporter: liang xie Assignee: liang xie Fix For: 0.94.3, 0.96.0 Attachments: 7008-0.94.txt, 7008-0.94-v2.txt, 7008-0.94-v3.txt, 7008-trunk-v5.txt, 7008-trunk-v6.txt, 7008-v3.txt, 7008-v4.txt, HBASE-7008.patch, HBASE-7008-v2.patch per http://search-hadoop.com/m/qaRu9iM2f02/Set+scanner+caching+to+a+better+default%253Fsubj=Set+scanner+caching+to+a+better+default+ let's set to 100 by default -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6846) BitComparator bug - ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HBASE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6846: - Description: The HBase 0.94.1 BitComparator introduced a bug in the method compareTo:
{code}
@Override
public int compareTo(byte[] value, int offset, int length) {
  if (length != this.value.length) {
    return 1;
  }
  int b = 0;
  // Iterating backwards is faster because we can quit after one non-zero byte.
  for (int i = value.length - 1; i >= 0 && b == 0; i--) {
    switch (bitOperator) {
      case AND: b = (this.value[i] & value[i + offset]) & 0xff; break;
      case OR:  b = (this.value[i] | value[i + offset]) & 0xff; break;
      case XOR: b = (this.value[i] ^ value[i + offset]) & 0xff; break;
    }
  }
  return b == 0 ? 1 : 0;
}
{code}
I've encountered this problem when using a BitComparator with a configured this.value.length=8, and in the HBase table there were KeyValues with keyValue.getBuffer().length=207911 bytes. In this case:
{code}
for (int i = 207910; i >= 0 && b == 0; i--) {
  switch (bitOperator) {
    case AND: b = (this.value[207910] ... // == ArrayIndexOutOfBoundsException
      break;
{code}
That loop should use:
{code}
for (int i = length - 1; i >= 0 && b == 0; i--) {  // (or this.value.length)
{code}
Should I provide a patch for correcting the problem? was: (the same description, without {code} formatting) BitComparator bug - ArrayIndexOutOfBoundsException -- Key: HBASE-6846 URL: https://issues.apache.org/jira/browse/HBASE-6846 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.1 Environment: HBase 0.94.1 + Hadoop 2.0.0-cdh4.0.1 Reporter: Lucian George Iordache Attachments: HBASE-6846.patch (issue description as above) -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
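The fix proposed in this issue can be checked in isolation. The sketch below is a simplified standalone comparator (AND only, without HBase's bitOperator switch) whose loop is bounded by the comparator's own value length, so a long KeyValue buffer cannot trigger the out-of-bounds access:

```java
// Simplified, standalone version of the corrected loop: iterate over the
// comparator's own value length, never past offset + length of the buffer.
public class BitAndComparator {
    private final byte[] value;

    public BitAndComparator(byte[] value) {
        this.value = value;
    }

    // Returns 0 on a bitwise-AND match, 1 otherwise (HBase comparator convention).
    public int compareTo(byte[] buf, int offset, int length) {
        if (length != this.value.length) {
            return 1;
        }
        int b = 0;
        // Iterate backwards, quitting after the first non-zero byte.
        for (int i = this.value.length - 1; i >= 0 && b == 0; i--) {
            b = (this.value[i] & buf[i + offset]) & 0xff;
        }
        return b == 0 ? 1 : 0;
    }
}
```

With the buggy bound (buf.length - 1 instead of this.value.length - 1), comparing against a 1000-byte buffer would index this.value[999] and throw ArrayIndexOutOfBoundsException.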
[jira] [Updated] (HBASE-6846) BitComparator bug - ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HBASE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6846: - Attachment: 6846-trunk.txt +1 on patch (passing 0 here as length is fine, as the loop will not be entered in that case anyway). Here's a patch that applies to trunk. BitComparator bug - ArrayIndexOutOfBoundsException -- Key: HBASE-6846 URL: https://issues.apache.org/jira/browse/HBASE-6846 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.1 Environment: HBase 0.94.1 + Hadoop 2.0.0-cdh4.0.1 Reporter: Lucian George Iordache Fix For: 0.94.3, 0.96.0 Attachments: 6846-trunk.txt, HBASE-6846.patch The HBase 0.94.1 BitComparator introduced a bug in the method compareTo:
{code}
@Override
public int compareTo(byte[] value, int offset, int length) {
  if (length != this.value.length) {
    return 1;
  }
  int b = 0;
  // Iterating backwards is faster because we can quit after one non-zero byte.
  for (int i = value.length - 1; i >= 0 && b == 0; i--) {
    switch (bitOperator) {
      case AND: b = (this.value[i] & value[i + offset]) & 0xff; break;
      case OR:  b = (this.value[i] | value[i + offset]) & 0xff; break;
      case XOR: b = (this.value[i] ^ value[i + offset]) & 0xff; break;
    }
  }
  return b == 0 ? 1 : 0;
}
{code}
I've encountered this problem when using a BitComparator with a configured this.value.length=8, and in the HBase table there were KeyValues with keyValue.getBuffer().length=207911 bytes. In this case:
{code}
for (int i = 207910; i >= 0 && b == 0; i--) {
  switch (bitOperator) {
    case AND: b = (this.value[207910] ... // == ArrayIndexOutOfBoundsException
      break;
{code}
That loop should use:
{code}
for (int i = length - 1; i >= 0 && b == 0; i--) {  // (or this.value.length)
{code}
Should I provide a patch for correcting the problem? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6846) BitComparator bug - ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HBASE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6846: - Fix Version/s: 0.96.0 0.94.3 Assignee: Lars Hofhansl BitComparator bug - ArrayIndexOutOfBoundsException -- Key: HBASE-6846 URL: https://issues.apache.org/jira/browse/HBASE-6846 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.1 Environment: HBase 0.94.1 + Hadoop 2.0.0-cdh4.0.1 Reporter: Lucian George Iordache Assignee: Lars Hofhansl Fix For: 0.94.3, 0.96.0 Attachments: 6846-trunk.txt, HBASE-6846.patch The HBase 0.94.1 BitComparator introduced a bug in the method compareTo:
{code}
@Override
public int compareTo(byte[] value, int offset, int length) {
  if (length != this.value.length) {
    return 1;
  }
  int b = 0;
  // Iterating backwards is faster because we can quit after one non-zero byte.
  for (int i = value.length - 1; i >= 0 && b == 0; i--) {
    switch (bitOperator) {
      case AND: b = (this.value[i] & value[i + offset]) & 0xff; break;
      case OR:  b = (this.value[i] | value[i + offset]) & 0xff; break;
      case XOR: b = (this.value[i] ^ value[i + offset]) & 0xff; break;
    }
  }
  return b == 0 ? 1 : 0;
}
{code}
I've encountered this problem when using a BitComparator with a configured this.value.length=8, and in the HBase table there were KeyValues with keyValue.getBuffer().length=207911 bytes. In this case:
{code}
for (int i = 207910; i >= 0 && b == 0; i--) {
  switch (bitOperator) {
    case AND: b = (this.value[207910] ... // == ArrayIndexOutOfBoundsException
      break;
{code}
That loop should use:
{code}
for (int i = length - 1; i >= 0 && b == 0; i--) {  // (or this.value.length)
{code}
Should I provide a patch for correcting the problem? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock
[ https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485339#comment-13485339 ] Lars Hofhansl commented on HBASE-5898: -- I think we should commit this fix (after checking out the TestStoreFile failure) and investigate Ram's issue in a different JIRA; these look like two different scenarios. Consider double-checked locking for block cache lock Key: HBASE-5898 URL: https://issues.apache.org/jira/browse/HBASE-5898 Project: HBase Issue Type: Improvement Components: Performance Affects Versions: 0.94.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.94.3, 0.96.0 Attachments: 5898-TestBlocksRead.txt, HBASE-5898-0.patch, HBASE-5898-1.patch, hbase-5898.txt Running a workload with a high query rate against a dataset that fits in cache, I saw a lot of CPU being used in IdLock.getLockEntry, being called by HFileReaderV2.readBlock. Even though it was all cache hits, it was wasting a lot of CPU doing lock management here. I wrote a quick patch to switch to a double-checked locking and it improved throughput substantially for this workload. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6846) BitComparator bug - ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HBASE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485340#comment-13485340 ] stack commented on HBASE-6846: -- That is a silly bug (I probably did it!). +1 on committing. BitComparator bug - ArrayIndexOutOfBoundsException -- Key: HBASE-6846 URL: https://issues.apache.org/jira/browse/HBASE-6846 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.1 Environment: HBase 0.94.1 + Hadoop 2.0.0-cdh4.0.1 Reporter: Lucian George Iordache Assignee: Lars Hofhansl Fix For: 0.94.3, 0.96.0 Attachments: 6846-trunk.txt, HBASE-6846.patch The HBase 0.94.1 BitComparator introduced a bug in the method compareTo:
{code}
@Override
public int compareTo(byte[] value, int offset, int length) {
  if (length != this.value.length) {
    return 1;
  }
  int b = 0;
  // Iterating backwards is faster because we can quit after one non-zero byte.
  for (int i = value.length - 1; i >= 0 && b == 0; i--) {
    switch (bitOperator) {
      case AND: b = (this.value[i] & value[i + offset]) & 0xff; break;
      case OR:  b = (this.value[i] | value[i + offset]) & 0xff; break;
      case XOR: b = (this.value[i] ^ value[i + offset]) & 0xff; break;
    }
  }
  return b == 0 ? 1 : 0;
}
{code}
I've encountered this problem when using a BitComparator with a configured this.value.length=8, and in the HBase table there were KeyValues with keyValue.getBuffer().length=207911 bytes. In this case:
{code}
for (int i = 207910; i >= 0 && b == 0; i--) {
  switch (bitOperator) {
    case AND: b = (this.value[207910] ... // == ArrayIndexOutOfBoundsException
      break;
{code}
That loop should use:
{code}
for (int i = length - 1; i >= 0 && b == 0; i--) {  // (or this.value.length)
{code}
Should I provide a patch for correcting the problem? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira