[jira] [Commented] (HBASE-4064) Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long...
[ https://issues.apache.org/jira/browse/HBASE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070331#comment-13070331 ] gaojinchao commented on HBASE-4064:
---

The master may also crash, because the pool shutdown is asynchronous. The master log shows:

2011-07-22 13:33:27,806 INFO org.apache.hadoop.hbase.master.handler.EnableTableHandler: Table has 2156 regions of which 2156 are online.
2011-07-22 13:34:28,646 INFO org.apache.hadoop.hbase.master.handler.EnableTableHandler: Table has 2156 regions of which 982 are online.
2011-07-22 13:34:31,079 WARN org.apache.hadoop.hbase.master.AssignmentManager: gjc:xxx ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229.
2011-07-22 13:34:31,080 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:6-0x31502ef4f0 Creating (or updating) unassigned node for c9b1c97ac6c00033ceb1890e45e66229 with OFFLINE state
2011-07-22 13:34:31,104 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229. state=OFFLINE, ts=1311312871080
2011-07-22 13:34:31,121 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229. so generated a random one; hri=ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229., src=, dest=C4C2.site,60020,1311310281335; 3 (online=3, exclude=null) available servers
2011-07-22 13:34:31,121 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229. to C4C2.site,60020,1311310281335
2011-07-22 13:34:31,122 WARN org.apache.hadoop.hbase.master.AssignmentManager: gjc:xxx ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229.
2011-07-22 13:34:31,123 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected state trying to OFFLINE; ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229. state=PENDING_OPEN, ts=1311312871121
java.lang.IllegalStateException
    at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1081)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1036)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:864)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:844)
    at java.lang.Thread.run(Thread.java:662)
2011-07-22 13:34:31,125 INFO org.apache.hadoop.hbase.master.HMaster: Aborting

Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long...
----------------------------------------------------------------------------------------------------------------------

Key: HBASE-4064
URL: https://issues.apache.org/jira/browse/HBASE-4064
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.90.3
Reporter: Jieshan Bean
Fix For: 0.90.5
Attachments: HBASE-4064-v1.patch, HBASE-4064_branch90V2.patch, HBASE-4064_branch90V2.patch, disableflow.png

1. If there is a stale RegionState object with PENDING_CLOSE in regionsInTransition (the RegionState was left behind by some exception and should have been removed; that is why I call it a stale object), but the region is not currently assigned anywhere, TimeoutMonitor falls into an endless loop:

2011-06-27 10:32:21,326 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301
2011-06-27 10:32:21,326 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
2011-06-27 10:32:21,438 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining)
2011-06-27 10:32:21,441 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is not currently assigned anywhere
2011-06-27 10:32:31,207 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301
2011-06-27 10:32:31,207 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
2011-06-27 10:32:31,215 DEBUG
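For illustration, here is a simplified model of why the TimeoutMonitor loops forever on such a stale entry (assumed names and structure; this is not the actual AssignmentManager code):

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TimeoutLoopSketch {
  static final class RegionState {
    final String state; final long ts;
    RegionState(String state, long ts) { this.state = state; this.ts = ts; }
  }

  final Map<String, RegionState> regionsInTransition = new ConcurrentHashMap<>();
  final Map<String, String> assignments = new ConcurrentHashMap<>(); // region -> server

  void timeoutMonitorChore(long now, long timeoutMs) {
    for (Map.Entry<String, RegionState> e : regionsInTransition.entrySet()) {
      RegionState rs = e.getValue();
      if ("PENDING_CLOSE".equals(rs.state) && now - rs.ts > timeoutMs) {
        unassign(e.getKey()); // "running forced unassign again on region=..."
      }
    }
  }

  void unassign(String region) {
    String server = assignments.get(region);
    if (server == null) {
      // "Attempted to unassign region ... but it is not currently assigned
      // anywhere" -- the stale PENDING_CLOSE entry is NOT removed here, so
      // the next chore run times out on the same region again, forever.
      return;
    }
    // ... otherwise send the close RPC and update regionsInTransition ...
  }
}
{code}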
[jira] [Updated] (HBASE-4120) isolation and allocation
[ https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Jia updated HBASE-4120:
---
Attachment: System Structure.jpg

the relationship between groups and table priority

isolation and allocation
------------------------

Key: HBASE-4120
URL: https://issues.apache.org/jira/browse/HBASE-4120
Project: HBase
Issue Type: New Feature
Components: master, regionserver
Affects Versions: 0.90.2
Reporter: Liu Jia
Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, HBase_isolation_and_allocation_user_guide.pdf, Performance_of_Table_priority.pdf, System Structure.jpg

The HBase isolation and allocation tool is designed to help users manage cluster resources among different applications and tables. When a large HBase cluster hosts many applications, lots of problems follow. At Taobao there is a cluster where many departments test the performance of their HBase-based applications. On this cluster of 12 servers, only one application could run exclusively at a time, and all other applications had to wait until the previous test finished. After we added the allocation management function to the cluster, applications can share the cluster and run concurrently. And if a test engineer wants to make sure there is no interference, he/she can move the other tables out of the group. Within a group we use table priority to allocate resources: when the system is busy, we can make sure high-priority tables are not affected by lower-priority tables. Different groups can have different region server configurations; groups optimized for reading can have a large block cache, and groups optimized for writing can have a large memstore. Tables and region servers can be moved easily between groups, and after changing the configuration, a group can be restarted alone instead of restarting the whole cluster. Git entry: https://github.com/ICT-Ope/HBase_allocation . We hope our work is helpful.
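For illustration only, the per-group tuning described above could be expressed with standard HBase settings such as the following hbase-site.xml fragments (the values are made up; the grouping mechanism itself comes from the attached tool and is not shown):

{code:xml}
<!-- Read-optimized group: enlarge the block cache (fraction of heap). -->
<property>
  <name>hfile.block.cache.size</name>
  <value>0.4</value>
</property>

<!-- Write-optimized group: give memstores a larger share of the heap. -->
<property>
  <name>hbase.regionserver.global.memstore.upperLimit</name>
  <value>0.45</value>
</property>
<property>
  <name>hbase.regionserver.global.memstore.lowerLimit</name>
  <value>0.4</value>
</property>
{code}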
[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region; the new active HM re-assigned it, but the RS warned 'already online on this server'.
[ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070342#comment-13070342 ] fulin wang commented on HBASE-4124:
---

I can't find where getRegionsInTransitionInRS().add() is called, so I do not understand why this function was added. About the 'already online on this server' error, I think the region should be closed or reassigned. I am trying to make a patch.

ZK restarted while assigning a region; the new active HM re-assigned it, but the RS warned 'already online on this server'.
----------------------------------------------------------------------------------------------------------------------------

Key: HBASE-4124
URL: https://issues.apache.org/jira/browse/HBASE-4124
Project: HBase
Issue Type: Bug
Components: master
Reporter: fulin wang
Attachments: log.txt
Original Estimate: 0.4h
Remaining Estimate: 0.4h

ZK restarted while assigning a region; the new active HM re-assigned it, but the RS warned 'already online on this server'.
Issue: The RS fails because of 'already online on this server' and returns; the HM does not receive the message and reports 'Regions in transition timed out'.
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070349#comment-13070349 ] dhruba borthakur commented on HBASE-4132:
---

stack: the LogCleaner API allows archived logs to be deleted according to a configurable policy. One can set hbase.master.logcleaner.plugins to set up one's own policy. In that sense, it is already pluggable. Moreover, this is done by the master, whereas the WALObserver interface is in the RegionServer. Given the above, do you think that this patch needs to touch LogCleaner at all? If so, what is your proposal?

Andrew: the WALObserver API additions should follow the same practice of providing before (pre) and after (post) hooks as everywhere else. In that sense, it already has logRollRequested and logRolled. Similarly, I added logArchiveStart and logArchiveCompleted. The remaining one is visitLogEntryBeforeWrite. Are you suggesting that we add a visitLogEntryAfterWrite as well?

Extend the WALObserver API to accommodate log archival
------------------------------------------------------

Key: HBASE-4132
URL: https://issues.apache.org/jira/browse/HBASE-4132
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Fix For: 0.92.0
Attachments: walArchive.txt

The WALObserver interface exposes the log roll events. It would be nice to extend it to accommodate log archival events as well.
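For context, a policy of the kind described above is a class wired in via hbase.master.logcleaner.plugins. A minimal sketch, assuming the 0.90-era LogCleanerDelegate interface with a single isLogDeletable(Path) callback and timestamp-suffixed HLog file names (both are assumptions about that era's API, not guarantees):

{code}
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.master.LogCleanerDelegate;

/** Sketch: keep archived HLogs for 7 days, then allow the master to delete them. */
public class KeepSevenDaysLogCleaner extends Configured implements LogCleanerDelegate {
  private static final long TTL_MS = 7L * 24 * 60 * 60 * 1000;

  @Override
  public boolean isLogDeletable(Path filePath) {
    // HLog file names of this era end with a timestamp suffix; parse it and
    // only allow deletion once the log is older than the TTL.
    String name = filePath.getName();
    try {
      long ts = Long.parseLong(name.substring(name.lastIndexOf('.') + 1));
      return System.currentTimeMillis() - ts > TTL_MS;
    } catch (NumberFormatException e) {
      return false; // unknown name format: be conservative, keep the file
    }
  }
}
{code}

The class would then be named in hbase.master.logcleaner.plugins so the master picks it up.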
[jira] [Updated] (HBASE-4134) The total number of regions was more than the actual region count while balancing after the hbck fix.
[ https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] feng xu updated HBASE-4134:
---
Description:

1. I found the problem (some regions were multiply assigned) while running hbck to check the cluster's health. Here's the result:
{noformat}
ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020
ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020
ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is listed in META on region server 158-1-91-103:20020 but is multiply assigned to region servers 158-1-91-103:20020, 158-1-91-105:20020
Summary:
-ROOT- is okay. Number of regions: 1 Deployed on: 158-1-91-105:20020
.META. is okay. Number of regions: 1 Deployed on: 158-1-91-103:20020
test1 is okay. Number of regions: 25297 Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020
14829 inconsistencies detected.
Status: INCONSISTENT
{noformat}
2. Then I tried to use hbck -fix to fix the problems, and everything seemed OK. But I found that the total number of regions (35029) was more than the actual region count (25299) while balancing after the fix. Here's the related log snippet:
{noformat}
2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=25299 average=8433.0 mostloaded=8433
2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=35029 average=11676.333 mostloaded=11677 leastloaded=11676
{noformat}
3. I tracked one region's behavior during this time, taking the region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. as an example:
(1) It was assigned to 158-1-91-101 at first.
(2) HBCK sent a closing request to the RegionServer, and the RegionServer closed the region silently without notifying the HMaster.
(3) As far as the HMaster knew, the region was still carried by RS 158-1-91-103.
(4) HBCK then triggered a new assignment. The region was assigned again, but the old assignment information still remained in the sets AM#regions and AM#servers. That's why the reported region count became larger than the actual number.
{noformat}
Line 178967: 2011-07-22 02:47:51,247 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/52782c0241a598b3e37ca8729da0 (region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., server=HBCKServerName, state=M_ZK_REGION_OFFLINE)
Line 178968: 2011-07-22 02:47:51,247 INFO org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, region=52782c0241a598b3e37ca8729da0
Line 178969: 2011-07-22 02:47:51,248 INFO org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering assignment of region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0.
Line 178970: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) available servers
Line 178971: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 158-1-91-101,20020,1311231878544
Line 178983: 2011-07-22 02:47:51,285 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544, region=52782c0241a598b3e37ca8729da0
Line 179001: 2011-07-22 02:47:51,318 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=158-1-91-101,20020,1311231878544, region=52782c0241a598b3e37ca8729da0
Line 179002: 2011-07-22 02:47:51,319 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED event for 52782c0241a598b3e37ca8729da0; deleting unassigned node
Line 179003: 2011-07-22 02:47:51,319 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:2-0x1314ac5addb0042-0x1314ac5addb0042 Deleting existing unassigned node for 52782c0241a598b3e37ca8729da0 that is in expected state RS_ZK_REGION_OPENED
Line 179007: 2011-07-22 02:47:51,326 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:2-0x1314ac5addb0042-0x1314ac5addb0042 Successfully
{noformat}
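To make step (4) concrete, here is a simplified model of how re-assigning without clearing the old bookkeeping can inflate the region count the balancer sees (an illustration only, not the actual AssignmentManager code):

{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DoubleCountSketch {
  // AM#regions: region -> server currently hosting it
  final Map<String, String> regions = new HashMap<>();
  // AM#servers: server -> set of regions it hosts
  final Map<String, Set<String>> servers = new HashMap<>();

  void assign(String region, String server) {
    // The bug being modeled: the old entry is not removed first, so after an
    // HBCK-triggered re-assignment the region appears under two servers.
    regions.put(region, server);
    servers.computeIfAbsent(server, s -> new HashSet<>()).add(region);
  }

  int balancerRegionCount() {
    // The balancer sums per-server region sets, so a doubly-listed region is
    // counted twice: regions=35029 reported while only 25299 really exist.
    int total = 0;
    for (Set<String> rs : servers.values()) total += rs.size();
    return total;
  }

  public static void main(String[] args) {
    DoubleCountSketch am = new DoubleCountSketch();
    am.assign("test1,282187,...", "158-1-91-103");
    // HBCK closes the region on 103 silently, then triggers a new assignment:
    am.assign("test1,282187,...", "158-1-91-101");
    System.out.println(am.balancerRegionCount()); // prints 2 for one region
  }
}
{code}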
[jira] [Updated] (HBASE-4137) Implementation of Distributed Hash Table(DHT) for lookup service
[ https://issues.apache.org/jira/browse/HBASE-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vamshi updated HBASE-4137:
---
Description: To implement a scalable data structure, i.e. a Distributed Hash Table (DHT), in HBase. To perform fast lookups in HBase we can take the help of a DHT, a scalable data structure. (was: To implement a scalable data structure i.e Distributed hash table in the HBase. )
Summary: Implementation of Distributed Hash Table(DHT) for lookup service (was: Implementation of Distributed Hash Table(DHT) )

Implementation of Distributed Hash Table(DHT) for lookup service
-----------------------------------------------------------------

Key: HBASE-4137
URL: https://issues.apache.org/jira/browse/HBASE-4137
Project: HBase
Issue Type: Improvement
Components: performance
Affects Versions: 0.90.3
Reporter: vamshi

To implement a scalable data structure, i.e. a Distributed Hash Table (DHT), in HBase. To perform fast lookups in HBase we can take the help of a DHT, a scalable data structure.
[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster
[ https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070371#comment-13070371 ] vamshi commented on HBASE-1938:
---

Hi stack, how can we perform lookup/scanning in HBase? Can we use a Distributed Hash Table (DHT) for that? I want to implement a scalable data structure, i.e. a DHT, in HBase. How can I proceed? Please help me. Thank you.

Make in-memory table scanning faster
------------------------------------

Key: HBASE-1938
URL: https://issues.apache.org/jira/browse/HBASE-1938
Project: HBase
Issue Type: Improvement
Components: performance
Reporter: stack
Assignee: stack
Priority: Blocker
Attachments: MemStoreScanPerformance.java, MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch

This issue is about profiling hbase to see if I can make hbase scans run faster when all is up in memory. Talking to some users, they are seeing about 1/4 million rows a second. It should be able to go faster than this (scanning an array of objects, they can do about 4-5x this).
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070374#comment-13070374 ] ramkrishna.s.vasudevan commented on HBASE-3845:
---

Ted, I tried using 'this.cacheFlushLock.isHeldByCurrentThread()'. The problem is that HLog.append() may be called by another thread, whereas HRegion.internalFlushCache() is called by the memstore flusher thread. So if we check this.cacheFlushLock.isHeldByCurrentThread() there, it returns false. So, as per your suggestion, I have inlined the isFlushInProgress check into wal.startCacheFlush() and wal.abortCacheFlush() and am still going with the AtomicBoolean. Is that fine, Ted? I am planning to upload the patch with these changes.

data loss because lastSeqWritten can miss memstore edits
--------------------------------------------------------

Key: HBASE-3845
URL: https://issues.apache.org/jira/browse/HBASE-3845
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
Fix For: 0.90.5
Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, HBASE-3845_4.patch, HBASE-3845__trunk.patch

(I don't have a test case to prove this yet, but I have run it by Dhruba and Kannan internally and wanted to put this up for some feedback.)
In this discussion let us assume that the region has only one column family; that way I can use region/memstore interchangeably.
After a memstore flush it is possible for lastSeqWritten to have a log-sequence-id for a region that is not the earliest log-sequence-id for that region's memstore.
HLog.append() does a putIfAbsent into lastSeqWritten. This is to ensure that we only keep track of the earliest log-sequence-number that is present in the memstore. Every time the memstore is flushed we remove the region's entry from lastSeqWritten and wait for the next append to populate this entry again. This is where the problem happens.
step 1: flusher.prepare() snapshots the memstore under HRegion.updatesLock.writeLock().
step 2: as soon as the updatesLock.writeLock() is released, new entries will be added into the memstore.
step 3: wal.completeCacheFlush() is called. This method removes the region's entry from lastSeqWritten.
step 4: the next append will create a new entry for the region in lastSeqWritten. But this will be the log seq id of the current append. All the edits that were added in step 2 are missing.
==
As a temporary measure, instead of removing the region's entry in step 3, I will replace it with the log-seq-id of the region-flush-event.
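To make the four steps concrete, here is a compact model of the race (simplified names and structure; this is not the actual HLog code):

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

class LastSeqWrittenSketch {
  final AtomicLong logSeqNum = new AtomicLong();
  // region -> OLDEST log seq id whose edit is still only in the memstore
  final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();

  long append(String region) {
    long seq = logSeqNum.incrementAndGet();
    // putIfAbsent keeps the EARLIEST un-flushed seq id for the region
    lastSeqWritten.putIfAbsent(region, seq);
    return seq;
  }

  void completeCacheFlush(String region) {
    // Step 3: removes the entry unconditionally. Any edits appended after the
    // snapshot (step 2) but before this call lose their "earliest seq id"
    // marker; step 4's append re-registers a LATER seq id, so log replay
    // after a crash can skip those step-2 edits, i.e. data loss.
    lastSeqWritten.remove(region);
  }
}
{code}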
[jira] [Created] (HBASE-4138) If zookeeper.znode.parent is not specified explicitly in client code then the HTable object loops continuously, waiting for the root region by using /hbase as the base node.
If zookeeper.znode.parent is not specified explicitly in client code then the HTable object loops continuously, waiting for the root region by using /hbase as the base node.
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Key: HBASE-4138
URL: https://issues.apache.org/jira/browse/HBASE-4138
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.90.3
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.90.4

Change the zookeeper.znode.parent property (the default is /hbase), but do not specify this change in the client code. Now use an HTable object: the HTable is not able to find the root region and keeps looping continuously. See the stack trace:

Object.wait(long) line: not available [native method]
RootRegionTracker(ZooKeeperNodeTracker).blockUntilAvailable(long) line: 122
RootRegionTracker.waitRootRegionLocation(long) line: 73
HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[], boolean) line: 578
HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[]) line: 558
HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[], byte[], byte[], boolean, Object) line: 687
HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[], boolean) line: 589
HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[]) line: 558
HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[], byte[], byte[], boolean, Object) line: 687
HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[], boolean) line: 593
HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[]) line: 558
HTable.init(Configuration, byte[]) line: 171
HTable.init(Configuration, String) line: 145
HBaseTest.test() line: 45
[jira] [Commented] (HBASE-2827) HBase Client doesn't handle master failover
[ https://issues.apache.org/jira/browse/HBASE-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070383#comment-13070383 ] vamshi commented on HBASE-2827:
---

Hi Jonathan, maybe this question is irrelevant in this place, but please let me know whether we can implement distributed hashing in HBase for fast lookup/scanning purposes. I want to implement a scalable data structure, i.e. a DHT, in HBase; how can I proceed? Thank you.

HBase Client doesn't handle master failover
-------------------------------------------

Key: HBASE-2827
URL: https://issues.apache.org/jira/browse/HBASE-2827
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.90.0
Reporter: Nicolas Spiegelberg
Assignee: Jonathan Gray

A client on our beta tier was stuck in this exception loop when we issued a new HMaster after the old one died:

Exception while trying to connect hBase java.lang.reflect.UndeclaredThrowableException
    at $Proxy1.getClusterStatus(Unknown Source)
    at org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:912)
    at org.apache.hadoop.hbase.client.HTable.getCurrentNrHRS(HTable.java:170)
    at org.apache.hadoop.hbase.client.HTable.init(HTable.java:143)
    ...
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.net.SocketTimeoutException: 2 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.18.34.212:6]
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:406)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:309)
    at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:856)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:724)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:252)
    ... 20 more
12:52:55,863 [pool-4-thread-5182] INFO PersistentUtil:153 - Retry after 1 second...

Looking at the client code, the HConnectionManager does not watch ZK for NodeDeleted/NodeCreated events on /hbase/master.
[jira] [Commented] (HBASE-2645) HLog writer can do 1-2 sync operations after lease has been recovered for split process.
[ https://issues.apache.org/jira/browse/HBASE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070384#comment-13070384 ] vamshi commented on HBASE-2645:
---

Hi Todd, maybe this question is irrelevant in this place, but please let me know whether we can implement distributed hashing in HBase for fast lookup/scanning purposes. I want to implement a scalable data structure, i.e. a DHT, in HBase; how can I proceed? Thank you.

HLog writer can do 1-2 sync operations after lease has been recovered for split process.
-----------------------------------------------------------------------------------------

Key: HBASE-2645
URL: https://issues.apache.org/jira/browse/HBASE-2645
Project: HBase
Issue Type: Bug
Components: filters
Affects Versions: 0.90.4
Reporter: Cosmin Lehene
Assignee: Todd Lipcon
Priority: Blocker
Fix For: 0.94.0

TestHLogSplit.testLogCannotBeWrittenOnceParsed is failing. This test starts a thread that writes one edit to the log, syncs and counts. During this, an HLog.splitLog operation is started. splitLog recovers the log lease before reading the log, so that the original regionserver cannot wake up and write after the split process has started.
The test compares the number of edits reported by the split process and by the writer thread. The writer thread (called zombie in the test) should report a count <= that of splitLog (sync() might raise after the last edit gets written, in which case the edit won't get counted by the zombie thread). However it appears that the zombie counts 1-2 more edits. So it looks like it can sync without a lease. This might be an hdfs-0.20 related issue.
[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster
[ https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070458#comment-13070458 ] nkeywal commented on HBASE-1938:
---

Hello Stack, ... accesses and perhaps to make it go faster. I will have a look at it; I see as well in this test and in the global profiling that a lot of time is spent on it:

scanner.next();

There are two iterators in the class (kvsetIt and snapshotIt), and getLowest compares the two to return the lowest. However, in this test one of the lists is empty, so its value is null, and hence the real comparison on byte[] is not executed.

On this subject, there is a possible optimisation of the function peek, which otherwise repeats the comparison: if peek is called multiple times, or if we often have peek() followed by next(), we can save the redundant comparisons. To me, it makes sense to precalculate the value returned by peek and reuse it in next().

The profiling (method: sampling, java inlining deactivated) says something interesting:

Name; total time spent
org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.next() 100%
org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.getNext(Iterator) 88%
org.apache.hadoop.hbase.regionserver.KeyValueSkipListSet$MapEntryIterator.next() 44%
java.util.concurrent.ConcurrentSkipListMap$SubMap$SubMapEntryIterator.next() 36%
org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.getThreadReadPoint() 26%
java.lang.ThreadLocal.get() 21%
org.apache.hadoop.hbase.regionserver.KeyValueSkipListSet$MapEntryIterator.hasNext() 8%
org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.getLowest() 7%
java.util.concurrent.ConcurrentSkipListMap$SubMap$SubMapIter.hasNext() 3%
org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.getLower(KeyValue, KeyValue) 3%
java.lang.Long.longValue() 2%

So we're spending 26% of the time on this:
org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.getThreadReadPoint() 26%
And in this getThreadReadPoint(), the actual time is spent in:
java.lang.ThreadLocal.get() 21%

It's a TLS, so we can expect a system call to get the thread id. It would be great to save this call in next(). There is at least an improvement for the case when one of the lists is done: don't read getThreadReadPoint() at all. That would not change the behaviour, but would already be interesting (maybe 10% in this test). Another option is to share the getThreadReadPoint() value between the two iterators, i.e. read the value in the next() function and pass it as a parameter to getNext(). In fact, as this value seems to be thread-local, I don't see how it could change during the execution of next(). What do you think?

Last question on this: what is the use case in which getThreadReadPoint() changes during a scan (i.e. between next() calls)? Most of the public methods (except reseek) are synchronized; does that imply the scanner can be shared between threads?

At the end, it seems that there are 3 possible things to do (see the sketch after this list):
1) Replacement of "KeyValue lowest = getLowest();"
2) theNext precalculation for peek() and next()
3) Depending on your feedback, one of the options above on getThreadReadPoint().

This should give a 5 to 15% increase in performance; not a problem-solving change, but it could justify a first patch. I can do it (with the hbase indenting :-)
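A sketch of the theNext precalculation from item 2 above (a simplified stand-in, not the actual MemStoreScanner code; a generic Comparable bound stands in for KeyValue and its comparator):

{code}
import java.util.Iterator;

/** Merge two sorted iterators, caching the lowest element so peek() is free. */
class CachingLowestScanner<KV extends Comparable<KV>> {
  private final Iterator<KV> kvsetIt;
  private final Iterator<KV> snapshotIt;
  private KV kvsetNext, snapshotNext;
  private KV theNext; // precomputed once, shared by peek() and next()

  CachingLowestScanner(Iterator<KV> kvsetIt, Iterator<KV> snapshotIt) {
    this.kvsetIt = kvsetIt;
    this.snapshotIt = snapshotIt;
    kvsetNext = kvsetIt.hasNext() ? kvsetIt.next() : null;
    snapshotNext = snapshotIt.hasNext() ? snapshotIt.next() : null;
    theNext = getLowest();
  }

  /** No comparison here: repeated peek() calls cost nothing. */
  KV peek() { return theNext; }

  KV next() {
    KV ret = theNext;
    if (ret == null) return null;
    // advance whichever iterator supplied the returned value
    if (ret == kvsetNext) {
      kvsetNext = kvsetIt.hasNext() ? kvsetIt.next() : null;
    } else {
      snapshotNext = snapshotIt.hasNext() ? snapshotIt.next() : null;
    }
    theNext = getLowest(); // the single comparison per step
    return ret;
  }

  private KV getLowest() {
    if (kvsetNext == null) return snapshotNext;   // empty-list fast path:
    if (snapshotNext == null) return kvsetNext;   // no byte[] comparison
    return kvsetNext.compareTo(snapshotNext) <= 0 ? kvsetNext : snapshotNext;
  }
}
{code}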
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070470#comment-13070470 ] Ted Yu commented on HBASE-3845:
---

That is fine.
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845:
---
Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845:
---
Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845:
---
Attachment: HBASE-3845_5.patch
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845:
---
Attachment: HBASE-3845_trunk_2.patch
[jira] [Commented] (HBASE-4134) The total number of regions was more than the actual region count after the hbck fix
[ https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070498#comment-13070498 ] Ted Yu commented on HBASE-4134:
---

https://issues.apache.org/jira/browse/HBASE-4053 is in 0.90.4 RC1.
Do you want to try out RC1 to see if the situation of double counting has improved?

The total number of regions was more than the actual region count after the hbck fix
-------------------------------------------------------------------------------------

Key: HBASE-4134
URL: https://issues.apache.org/jira/browse/HBASE-4134
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.3
Reporter: feng xu
Fix For: 0.90.4
[jira] [Commented] (HBASE-4138) If zookeeper.znode.parent is not specified explicitly in client code then the HTable object loops continuously, waiting for the root region by using /hbase as the base node.
[ https://issues.apache.org/jira/browse/HBASE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070520#comment-13070520 ] ramkrishna.s.vasudevan commented on HBASE-4138:
---

1. I tried to identify the problem in HBASE-4138 and ended up with the following analysis.
The HMaster creates the base node, along with the unassigned node, RS node and table node, based on the zookeeper.znode.parent property. Currently, when we use HTable(), getConnection() creates a new connection if this value is not configured. Two points to note here:
1) The HTable documentation clearly tells us to use the same Configuration object. But what if that is not done, and in particular someone forgets to set this base node property? It may even be that I have configured the property in my RS instance but not in the master instance.
2) Was the reuse of the getConnection() logic across all levels intended?
The major problem lies in HConnectionManager.setupZookeeperTrackers(), which tries to create the base nodes again. What I feel here is that this should not be done: only the master should have the right to create them, otherwise there is a high possibility that multiple base nodes get created. Currently, as the client creates the node once again with the default value '/hbase', the client keeps waiting indefinitely to learn the root location.
What happens in the Admin case: the same thing happens, but in HBaseAdmin() we call the connection.getMaster() API, which throws an exception: 'ZooKeeper available but no active master location found'.
So we should prevent the Admin or HTable (in general any client, even an RS) from creating the base nodes, and whatever is created by the master should be used by the clients.
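A minimal client-side sketch of the workaround implied by the report: carry the same zookeeper.znode.parent in the client's Configuration ('/hbase-custom' and the table name 't1' are made-up values for illustration):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class ZnodeParentClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Must match the value the cluster was started with; if this is omitted
    // the client falls back to the default /hbase and blocks forever in
    // ZooKeeperNodeTracker.blockUntilAvailable() waiting for the root region.
    conf.set("zookeeper.znode.parent", "/hbase-custom");
    HTable table = new HTable(conf, "t1"); // HTable(Configuration, String)
    // ... use the table ...
    table.close();
  }
}
{code}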
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070542#comment-13070542 ] Prakash Khemani commented on HBASE-3845:
---

In the patch that is deployed internally we have implemented a different approach. We remove the region's entry in startCacheFlush() and save it (as opposed to the current behavior of removing the entry in completeCacheFlush()). If the flush aborts then we restore the saved entry.

The approach taken in the latest patch in this jira might also be OK. I have a few comments.

{noformat}
  this.lastSeqWritten.remove(encodedRegionName);
+ Long seqWhileFlush = this.seqWrittenWhileFlush.get(encodedRegionName);
+ if (null != seqWhileFlush) {
+   this.lastSeqWritten.putIfAbsent(encodedRegionName, seqWhileFlush);
+   this.seqWrittenWhileFlush.remove(encodedRegionName);
+ }
{noformat}

The seqWrittenWhileFlush.get() and the subsequent .remove() can be replaced by a single .remove():

{code}
Long seqWhileFlush = this.seqWrittenWhileFlush.remove(encodedRegionName);
if (null != seqWhileFlush) {
  lSW.put(encodedRegionName, seqWhileFlush);
} else {
  lSW.remove(encodedRegionName);
}
{code}

==

The bigger problem here is that completeCacheFlush() is not called with updateLock acquired. Therefore there might still be correctness issues with the latest patch.

==

{noformat}
public void abortCacheFlush() {
+ this.isFlushInProgress.set(false);
  this.cacheFlushLock.unlock();
}
{noformat}

Shouldn't seqWrittenWhileFlush also be cleaned up in abortCacheFlush()?
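For reference, a minimal sketch of the save-and-restore approach described at the top of this comment (a simplified, single-threaded view with made-up field names; not the internal patch itself):

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class SaveAndRestoreSketch {
  final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();
  // entries removed at flush start, kept so an aborted flush can restore them
  final ConcurrentMap<String, Long> savedAtFlushStart = new ConcurrentHashMap<>();

  void startCacheFlush(String region) {
    // remove the entry up front (instead of in completeCacheFlush), saving it
    Long seq = lastSeqWritten.remove(region);
    if (seq != null) {
      savedAtFlushStart.put(region, seq);
    }
  }

  void completeCacheFlush(String region) {
    savedAtFlushStart.remove(region); // flush succeeded; nothing to restore
  }

  void abortCacheFlush(String region) {
    Long saved = savedAtFlushStart.remove(region);
    if (saved != null) {
      // The saved seq id is necessarily older than any seq id appended since
      // startCacheFlush, so it wins back the "earliest un-flushed" slot.
      lastSeqWritten.put(region, saved);
    }
  }
}
{code}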
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070545#comment-13070545 ] Ted Yu commented on HBASE-3845:
---

@Prakash: Would you be able to share your patch?

"The bigger problem here is that completeCacheFlush() is not called with updateLock acquired."
See line 1154 in HLog:
{code}
synchronized (updateLock) {
{code}
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070588#comment-13070588 ] Andrew Purtell commented on HBASE-4132:
---

@dhruba Thanks. For example, the hooks should be preArchiveStart and postArchiveStart, preArchiveCompleted and postArchiveCompleted. In part it is a naming convention; in part it is a contract: pre hooks allow the introduction of preprocessing and, importantly, the override of default behavior, with the associated short-circuiting of base processing and of any additional coprocessors. Post hooks allow the introduction of postprocessing and the modification of return values.
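As an illustration of the pre/post pairing Andrew describes, a sketch using the proposed method names (this is not a shipped API; the signatures are assumptions):

{code}
import org.apache.hadoop.fs.Path;

/** Sketch of the pre/post contract for the proposed archival hooks. */
interface WALArchivalObserverSketch {
  /**
   * Called before archival of oldPath begins. Returning true short-circuits
   * the default behavior (and any remaining coprocessors), per the pre-hook
   * contract described above.
   */
  boolean preArchiveStart(Path oldPath, Path newPath);

  /** Called after archival of oldPath has begun. */
  void postArchiveStart(Path oldPath, Path newPath);

  /** Called before the archival of oldPath is marked complete. */
  boolean preArchiveCompleted(Path oldPath, Path newPath);

  /** Called after archival completed; may observe or adjust the outcome. */
  void postArchiveCompleted(Path oldPath, Path newPath);
}
{code}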
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845: -- Attachment: HBASE-3845_6.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845: -- Status: Open (was: Patch Available) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845: -- Attachment: HBASE-3845_trunk_3.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845: -- Status: Patch Available (was: Open) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4139) [stargate] Update ScannerModel with support for filter package additions
[stargate] Update ScannerModel with support for filter package additions Key: HBASE-4139 URL: https://issues.apache.org/jira/browse/HBASE-4139 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.0 Filters have been added to the o.a.h.h.filter package without updating o.a.h.h.rest.model.ScannerModel. Bring ScannerModel up to date. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070614#comment-13070614 ] Prakash Khemani commented on HBASE-3845: In the method internalFlushcache() I don't see updatesLock.writeLock() being held around the following piece of code. {code} if (wal != null) { wal.completeCacheFlush(this.regionInfo.getEncodedNameAsBytes(), regionInfo.getTableDesc().getName(), completeSequenceId, this.getRegionInfo().isMetaRegion()); } {code} == I will upload the internal patch for reference ... -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
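For clarity, the locking Prakash is asking about would look roughly like the following if the quoted call were wrapped in the region's update lock; this is a sketch around the code above, not the committed change:

{code}
// Sketch: hold updatesLock.writeLock() across completeCacheFlush() so no
// append can slip in between the memstore snapshot and the WAL bookkeeping.
this.updatesLock.writeLock().lock();
try {
  if (wal != null) {
    wal.completeCacheFlush(this.regionInfo.getEncodedNameAsBytes(),
        regionInfo.getTableDesc().getName(), completeSequenceId,
        this.getRegionInfo().isMetaRegion());
  }
} finally {
  this.updatesLock.writeLock().unlock();
}
{code}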
[jira] [Updated] (HBASE-3899) enhance HBase RPC to support freeing up server handler threads even if response is not ready
[ https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Dogaru updated HBASE-3899: --- Attachment: HBASE-3899.patch @stack, follow-up from review board: HBaseServer.Call uses warnResponseSize from parent class. Also, similar code is in production on Facebook clusters. This patch only adds and tests new behavior, but it is not actually used yet. enhance HBase RPC to support freeing up server handler threads even if response is not ready - Key: HBASE-3899 URL: https://issues.apache.org/jira/browse/HBASE-3899 Project: HBase Issue Type: Improvement Components: ipc Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: HBASE-3899.patch, asyncRpc.txt, asyncRpc.txt In the current implementation, the server handler thread picks up an item from the incoming callqueue, processes it and then wraps the response as a Writable and sends it back to the IPC server module. This wastes thread resources when the thread is blocked for disk IO (transaction logging, read into block cache, etc). It would be nice if we can make the RPC Server Handler threads pick up a call from the IPC queue, hand it over to the application (e.g. HRegion); the application can queue it to be processed asynchronously and send a response back to the IPC server module saying that the response is not ready. The RPC Server Handler thread is now ready to pick up another request from the incoming callqueue. When the queued call is processed by the application, it indicates to the IPC module that the response is now ready to be sent back to the client. The RPC client continues to experience the same behaviour as before. An RPC client is synchronous and blocks till the response arrives. This RPC enhancement allows us to do very powerful things with the RegionServer. In the future, we can enhance the RegionServer's threading model to a message-passing model for better performance. We will not be limited by the number of threads in the RegionServer. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
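The delayed-response flow the description outlines can be sketched as follows; the Delayable name matches the file in the attached patch, but the method signatures and the RegionOp class are assumptions for illustration only:

{code}
import java.io.IOException;

// Assumed shapes, for illustration only.
interface Delayable {
  void startDelay();                // frees the handler thread for other calls
  void endDelay(Object result) throws IOException;  // response ready: send it
}

class RegionOp {
  // Inside a handler: instead of blocking on disk IO, mark the call delayed,
  // queue the work, and return; a worker thread completes it later.
  Object handle(final Delayable call) {
    call.startDelay();
    // ... enqueue work; when done, the worker invokes call.endDelay(result)
    return null;  // no response yet; the IPC layer keeps the connection open
  }
}
{code}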
[jira] [Commented] (HBASE-3899) enhance HBase RPC to support freeing up server handler threads even if response is not ready
[ https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070727#comment-13070727 ] jirapos...@reviews.apache.org commented on HBASE-3899: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1174/#review1179 --- I think we need some additional metrics for number of outstanding (delayed) calls... how do we debug cases where calls are getting orphaned? src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java https://reviews.apache.org/r/1174/#comment2475 RPC calls can return Writables or any java primitive supported by ObjectWritable. So, this should probably be Object result. src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java https://reviews.apache.org/r/1174/#comment2476 this isn't your code... but this expression is always true! src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java https://reviews.apache.org/r/1174/#comment2477 this is a no-op. need proper error handling src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java https://reviews.apache.org/r/1174/#comment2478 assert this.delayResponse src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java https://reviews.apache.org/r/1174/#comment2479 assert !delayResponse src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java https://reviews.apache.org/r/1174/#comment2480 if !delayResponse, would we ever have response == null? src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java https://reviews.apache.org/r/1174/#comment2481 shouldn't this just be a call to enqueueInSelector now? - Todd On 2011-07-22 00:17:13, Vlad Dogaru wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/1174/ bq. --- bq. bq. (Updated 2011-07-22 00:17:13) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. Free up RPC server Handler thread if the called routine specifies the call should be delayed. The RPC client sees no difference, changes are server-side only. This is based on the previous submitted patch from Dhruba. bq. bq. bq. This addresses bug HBASE-3899. bq. https://issues.apache.org/jira/browse/HBASE-3899 bq. bq. bq. Diffs bq. - bq. bq.src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java 0da7f9e bq.src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 61d3915 bq.src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/1174/diff bq. bq. bq. Testing bq. --- bq. bq. Unit tests run. Also, the patch includes a new unit test. bq. bq. bq. Thanks, bq. bq. Vlad bq. bq.
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070752#comment-13070752 ] stack commented on HBASE-4132: -- @Dhruba bq. Given the above, do you think that this patch needs to touch LogCleaner at all? If so, what is ur proposal? No proposal. Just wanted to point you at some utility we have already that you might not have known about and that might have helped you composing your addition. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4138) If zookeeper.znode.parent is not specified explicitly in Client code then HTable object loops continuously waiting for the root region by using /hbase as the base node.
[ https://issues.apache.org/jira/browse/HBASE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070755#comment-13070755 ] stack commented on HBASE-4138: -- @Ram Your reasoning sounds right to me. I agree ...we should prevent the Admin or HTable (in general any client, even RS) from creating the base nodes, and whatever is created by the master should be used by the clients. Thanks for digging in on this. If zookeeper.znode.parent is not specified explicitly in Client code then HTable object loops continuously waiting for the root region by using /hbase as the base node. --- Key: HBASE-4138 URL: https://issues.apache.org/jira/browse/HBASE-4138 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.3 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.90.4 Change the zookeeper.znode.parent property (default is /hbase). Now do not specify this change in the client code. Use the HTable Object. The HTable is not able to find the root region and keeps continuously looping. Find the stack trace: Object.wait(long) line: not available [native method] RootRegionTracker(ZooKeeperNodeTracker).blockUntilAvailable(long) line: 122 RootRegionTracker.waitRootRegionLocation(long) line: 73 HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[], boolean) line: 578 HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[]) line: 558 HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[], byte[], byte[], boolean, Object) line: 687 HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[], boolean) line: 589 HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[]) line: 558 HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[], byte[], byte[], boolean, Object) line: 687 HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[], boolean) line: 593 HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[]) line: 558 HTable.<init>(Configuration, byte[]) line: 171 HTable.<init>(Configuration, String) line: 145 HBaseTest.test() line: 45 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
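Until that is fixed, the practical workaround on the client side is to carry the cluster's actual znode parent in the client Configuration rather than relying on the /hbase default; a minimal example, where "/hbase-custom" is a placeholder value:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class ZnodeParentExample {
  public static void main(String[] args) throws Exception {
    // "/hbase-custom" is a placeholder; use the value the cluster was started with.
    Configuration conf = HBaseConfiguration.create();
    conf.set("zookeeper.znode.parent", "/hbase-custom");
    HTable table = new HTable(conf, "test");  // now finds the right base node
  }
}
{code}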
[jira] [Commented] (HBASE-4137) Implementation of Distributed Hash Table(DHT) for lookup service
[ https://issues.apache.org/jira/browse/HBASE-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070804#comment-13070804 ] stack commented on HBASE-4137: -- Can you describe what you are trying to do? What do you mean by 'lookup service'? What are you looking up? And why would we put in place a DHT for lookups when there is already a means of locating data? Thanks. Implementation of Distributed Hash Table(DHT) for lookup service - Key: HBASE-4137 URL: https://issues.apache.org/jira/browse/HBASE-4137 Project: HBase Issue Type: Improvement Components: performance Affects Versions: 0.90.3 Reporter: vamshi To implement a scalable data structure, i.e., a distributed hash table, in HBase. In HBase, to perform fast lookups, we can take the help of a Distributed Hash Table (DHT), a scalable data structure. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster
[ https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070819#comment-13070819 ] stack commented on HBASE-1938: -- bq. To me, it makes sense to precalculate the value returned by peek, and reuse it in next(). If there is no chance of the value changing between the peek and next, it sounds good (I've not looked at this code in a while). bq. It would be great to save this system call in a next(). Yes (I like how you figure there's a system call doing thread local get). bq. In fact, as this value seems to be a TLS, I don't see how it could change during the execution of next(). What do you think? (I'm being lazy. I've not looked at the code). The updates to RWCC happen at well-defined points so should be easy enough to elicit if there is a problem w/ your presumption above. bq. Last question on this: what is the use case when the getThreadReadPoint() will change during the whole scan (i.e.: between next)? IIRC, we want to let the scan see the most up-to-date view on a row though our guarantees are less than this (See http://hbase.apache.org/acid-semantics.html). bq. Most of the public methods (except reseek) are synchronized, it implies that the scanner can be shared between threads? That seems like a valid deduction to make. bq. 1) Replacement of KeyValue lowest = getLowest(); You mean in MemStore#reseek? What would you put in its place (Sorry if I'm not following the bouncing ball properly). bq. ...don't get the data getThreadReadPoint() So, we'd just hold to the current read point for how long? The full scan? That might be possible given our lax guarantees above though it would be nice to not have to give up on up to the millisecond views on rows. bq. Another option is to share getThreadReadPoint() value for the two iterators, i.e. read the value in the next() function, and give it as a parameter to getNext() What are the 'two iterators' here? Sorry N, I don't have my head as deep in this stuff as you do currently so my questions and answers above may be off. Please compensate appropriately. Make in-memory table scanning faster Key: HBASE-1938 URL: https://issues.apache.org/jira/browse/HBASE-1938 Project: HBase Issue Type: Improvement Components: performance Reporter: stack Assignee: stack Priority: Blocker Attachments: MemStoreScanPerformance.java, MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch This issue is about profiling hbase to see if I can make hbase scans run faster when all is up in memory. Talking to some users, they are seeing about 1/4 million rows a second. It should be able to go faster than this (Scanning an array of objects, they can do about 4-5x this). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-4137) Implementation of Distributed Hash Table(DHT) for lookup service
[ https://issues.apache.org/jira/browse/HBASE-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-4137. -- Resolution: Incomplete The description does not seem to apply to hbase and the description has been sprayed across a few random issues which leads me to believe the author is not clear themselves on what is wanted. Resolving as incomplete. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070825#comment-13070825 ] Gary Helmling commented on HBASE-4132: -- Hmm, this seems to be a confusion of {{org.apache.hadoop.hbase.regionserver.wal.WALObserver}} and {{org.apache.hadoop.hbase.coprocessor.WALObserver}}. Not surprising since both classes have the same name. I think the former is the WAL listener used in replication and the latter is the coprocessor interface for WALs. I know the former has been around longer, but maybe we should consider renaming it to WALListener. Or maybe we should bite the bullet and combine these two interfaces to one. (I say that knowing very little about replication and whether it would make sense/be feasible to convert it to a coprocessor implementation). Anyway, I see no problem adding {{pre/postArchiveStart}} and {{pre/postArchiveCompleted}} to {{org.apache.hadoop.hbase.coprocessor.WALObserver}}, as Andy mentions. Would that be sufficient, or should we look at adding the logRoll and logClose events from {{o.a.h.h.regionserver.wal.WALObserver}} as well? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
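The clash is easy to see side by side; both fully qualified names below appear in the comment above, and the wrapper class exists only so the example compiles:

{code}
// Both interfaces share the simple name WALObserver, so code touching both
// must fully qualify at least one of them -- hence the rename proposals.
class WalObserverNameClash {
  org.apache.hadoop.hbase.regionserver.wal.WALObserver listener;  // replication-era listener
  org.apache.hadoop.hbase.coprocessor.WALObserver coprocessor;    // coprocessor hook interface
}
{code}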
[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster
[ https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070829#comment-13070829 ] Ted Yu commented on HBASE-1938: --- bq. 1) Replacement of KeyValue lowest = getLowest(); It is in the seek function bq. Another option is to share getThreadReadPoint() value for the two iterators N was talking about the following code in MemStore.next(): {code} if (theNext == kvsetNextRow) { kvsetNextRow = getNext(kvsetIt); } else { snapshotNextRow = getNext(snapshotIt); } {code} The intent was to save the system call. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
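The single-read variant being discussed might look like the sketch below; the types are simplified stand-ins and only the control flow mirrors the quoted MemStore code, so treat it as an illustration of the idea rather than the actual change:

{code}
import java.util.Iterator;

// Simplified stand-in types; illustration only.
class ReadPointSharingSketch<KV> {
  private final ThreadLocal<Long> threadReadPoint = new ThreadLocal<Long>() {
    @Override protected Long initialValue() { return 0L; }
  };
  private Iterator<KV> kvsetIt, snapshotIt;
  private KV kvsetNextRow, snapshotNextRow, theNext;

  KV next() {
    long readPoint = threadReadPoint.get();  // one thread-local read per next()
    if (theNext == kvsetNextRow) {
      kvsetNextRow = getNext(kvsetIt, readPoint);
    } else {
      snapshotNextRow = getNext(snapshotIt, readPoint);
    }
    return theNext;
  }

  private KV getNext(Iterator<KV> it, long readPoint) {
    // the real code would skip entries newer than readPoint; elided here
    return it.hasNext() ? it.next() : null;
  }
}
{code}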
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070832#comment-13070832 ] stack commented on HBASE-4132: -- +1 on one interface only. J-D! -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Pi updated HBASE-4027: - Attachment: slabcachepatchv4.diff Added tests for eviction, now logs fine-grained stats to file. Added a bunch of documentation. A bunch - this should take care of most of the documentation concerns. Enable direct byte buffers LruBlockCache Key: HBASE-4027 URL: https://issues.apache.org/jira/browse/HBASE-4027 Project: HBase Issue Type: Improvement Reporter: Jason Rutherglen Assignee: Li Pi Priority: Minor Attachments: slabcachepatch.diff, slabcachepatchv2.diff, slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, slabcachepatchv3.diff, slabcachepatchv4.diff Java offers the creation of direct byte buffers which are allocated outside of the heap. They need to be manually free'd, which can be accomplished using an undocumented {{clean}} method. The feature will be optional. After implementing, we can benchmark for differences in speed and garbage collection behavior. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
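For context, allocation and manual freeing of a direct buffer look like this; the sun.* classes are JDK-internal (the undocumented part) and may differ across JVMs, which is exactly why the feature has to stay optional. A sketch, not the patch itself:

{code}
import java.nio.ByteBuffer;

// Sketch of off-heap allocation and the undocumented manual free.
class DirectBufferSketch {
  static ByteBuffer allocate(int size) {
    return ByteBuffer.allocateDirect(size);  // memory lives outside the Java heap
  }

  static void free(ByteBuffer buf) {
    if (buf.isDirect()) {
      // JVM-specific internal API; may change between JVM versions.
      ((sun.nio.ch.DirectBuffer) buf).cleaner().clean();
    }
  }
}
{code}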
[jira] [Commented] (HBASE-4134) The total number of regions was more than the actual region count after the hbck fix
[ https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070834#comment-13070834 ] stack commented on HBASE-4134: -- @feng nice debugging The total number of regions was more than the actual region count after the hbck fix Key: HBASE-4134 URL: https://issues.apache.org/jira/browse/HBASE-4134 Project: HBase Issue Type: Bug Affects Versions: 0.90.3 Reporter: feng xu Fix For: 0.90.4 1. I found the problem (some regions were multiply assigned) while running hbck to check the cluster's health. Here's the result: {noformat} ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020 ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020 ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is listed in META on region server 158-1-91-103:20020 but is multiply assigned to region servers 158-1-91-103:20020, 158-1-91-105:20020 Summary: -ROOT- is okay. Number of regions: 1 Deployed on: 158-1-91-105:20020 .META. is okay. Number of regions: 1 Deployed on: 158-1-91-103:20020 test1 is okay. Number of regions: 25297 Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020 14829 inconsistencies detected. Status: INCONSISTENT {noformat} 2. Then I tried to use hbck -fix to fix the problem. Everything seemed ok. But I found that the total number of regions reported by the load balancer (35029) was more than the actual region count (25299) after the fixing. Here's the related log snippet: {noformat} 2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=25299 average=8433.0 mostloaded=8433 2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=35029 average=11676.333 mostloaded=11677 leastloaded=11676 {noformat} 3. I tracked one region's behavior during the time. Taking the region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. as an example: (1) It was assigned to 158-1-91-101 at first. (2) HBCK sent a closing request to the RegionServer, and the RegionServer closed it silently without notifying HMaster. (3) The region was still carried by RS 158-1-91-103, which was known to HMaster. (4) HBCK then triggered a new assignment. The region was indeed assigned again, but the old assignment information still remained in AM#regions and AM#servers. That's why the reported region count became larger than the actual number. {noformat} Line 178967: 2011-07-22 02:47:51,247 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/52782c0241a598b3e37ca8729da0 (region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., server=HBCKServerName, state=M_ZK_REGION_OFFLINE) Line 178968: 2011-07-22 02:47:51,247 INFO org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, region=52782c0241a598b3e37ca8729da0 Line 178969: 2011-07-22 02:47:51,248 INFO org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering assignment of region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. 
Line 178970: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) available servers Line 178971: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 158-1-91-101,20020,1311231878544 Line 178983: 2011-07-22 02:47:51,285 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544, region=52782c0241a598b3e37ca8729da0 Line 179001: 2011-07-22 02:47:51,318 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=158-1-91-101,20020,1311231878544, region=52782c0241a598b3e37ca8729da0 Line 179002: 2011-07-22 02:47:51,319 DEBUG
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070835#comment-13070835 ] stack commented on HBASE-4132: -- Or, one should inherit from the other rather than repeat. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3909) Add dynamic config
[ https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070836#comment-13070836 ] Gary Helmling commented on HBASE-3909: -- It would be really nice to have this capability, but it seems way out there for 0.92. We can't depend on Hadoop trunk/0.23 classes for 0.92. We could fork the HADOOP-7001 patch or come up with our own approach, but either one is going to be a lot of work. And the server-related changes to support this seem fairly tricky for anything beyond trivial configuration options -- i.e., how to support reconfiguring the number of rpc handler threads, say. All this adds up to: I'd suggest we punt from 0.92. Add dynamic config -- Key: HBASE-3909 URL: https://issues.apache.org/jira/browse/HBASE-3909 Project: HBase Issue Type: Bug Reporter: stack Fix For: 0.92.0 I'm sure this issue exists already, at least as part of the discussion around making online schema edits possible, but no harm in this having its own issue. Ted started a conversation on this topic up on dev and Todd suggested we look at how Hadoop did it over in HADOOP-7001 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3909) Add dynamic config
[ https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070843#comment-13070843 ] Ted Yu commented on HBASE-3909: --- +1 on moving out of 0.92 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070845#comment-13070845 ] jirapos...@reviews.apache.org commented on HBASE-4027: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1191/ --- Review request for hbase and Todd Lipcon. Summary --- Uploading slabcachepatchv4 to review for Li Pi. This addresses bug HBASE-4027. https://issues.apache.org/jira/browse/HBASE-4027 Diffs - conf/hbase-env.sh 2d55d27 pom.xml 729dc37 src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 5963552 src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/SingleSizeCache.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/SlabCache.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java b600020 src/test/java/org/apache/hadoop/hbase/io/hfile/TestSingleSlabCache.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/io/hfile/TestSlabCache.java PRE-CREATION Diff: https://reviews.apache.org/r/1191/diff Testing --- Thanks, Todd -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070859#comment-13070859 ] stack commented on HBASE-4027: -- Doc is great. These could be final: + private LruBlockCache onHeapCache; + private SlabCache offHeapCache; Says 'Metrics are the combined size and hits and misses of both caches' but down in getStats we seem to be getting onheap stats only. Intentional? Same for heapSize. Do you want to leave this line in hfile? + LOG.debug("decompressedSize = " + decompressedSize); What's it mean when you say 'An exception will be thrown if the cached data is larger than the size of the allocated block'? More notes later. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070860#comment-13070860 ] Ted Yu commented on HBASE-3845: --- +1 on HBASE-3845_trunk_3.patch. Ran unit tests and they passed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3909) Add dynamic config
[ https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070861#comment-13070861 ] stack commented on HBASE-3909: -- My thought on moving issues in and out of releases is just do it with justification, rather than make the justification and then not move it, waiting on others to agree. For example, you make a good case for moving the issue out, Gary, so go for it. If someone objects, let them counter-argue and move it back. If a dispute, we can move it to the dev list to duke it out. Good stuff. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070862#comment-13070862 ] jirapos...@reviews.apache.org commented on HBASE-4027: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1191/#review1182 --- could do with some tests for MetaSlab. also some multi-threaded tests - see MultithreadedTestUtil, example usage in TestMemStoreLAB pom.xml https://reviews.apache.org/r/1191/#comment2484 did you determine that this ConcurrentLinkedHashMap was different than the one in Guava? I thought it got incorporated into Guava, which we already depend on. src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2485 punctuation wise, I think it would be easier to read if you hyphenated on-heap and off-heap. This applies to log messages below as well. src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2486 No need to line-break here src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2487 consider using StringUtils.humanReadableInt for these sizes. src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2488 @Override src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2489 when you're just overriding something from the superclass, no need for javadoc unless it says something new and exciting. If you feel like you want to put something there, you can use /** {@inheritDoc} */ to be explicit that you're inheriting from the superclass. src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2490 I think you should only put-back into the on-heap cache in the case that the 'caching' parameter is true. src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2491 hrm, the class javadoc says that the statistics should be cumulative, but this seems to just forward src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2492 TODOs src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2493 is this code used? seems like dead copy-paste code to me. src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java https://reviews.apache.org/r/1191/#comment2497 extraneous debugging left in src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java https://reviews.apache.org/r/1191/#comment2498 I think this is usually called a slab class - I think that name would be less confusing, since Meta is already used in HBase to refer to .META. src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java https://reviews.apache.org/r/1191/#comment2499 unclear what the difference is between the two. Is slabs the list of 2GB buffers, and buffers is the list of actual items that will be allocated? I think the traditional names here are slabs and items. where each slab holds some number of allocatable items Also, rather than // comments, use /** javadoc comments */ before the vars src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java https://reviews.apache.org/r/1191/#comment2500 these vars probably better called maxBlocksPerSlab and maxSlabSize, since they're upper bounds. 
src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java https://reviews.apache.org/r/1191/#comment2501 I think this code would be a little easier to understand if you split it into one loop for the full slabs, and an if statement for the partially full one. Something like: int numFullSlabs = numBlocks / maxBlocksPerSlab; boolean hasPartialSlab = (numBlocks % maxBlocksPerSlab) > 0; for (int i = 0; i < numFullSlabs; i++) { alloc one of maxSlabSize; addBuffersForSlab(slab); } if (hasPartialSlab) { alloc the partial one; addBuffersForSlab(slab); } src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java https://reviews.apache.org/r/1191/#comment2502 should be a LOG.warn src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java https://reviews.apache.org/r/1191/#comment2503 shouldn't this class have an alloc() and free() method? src/main/java/org/apache/hadoop/hbase/io/hfile/SingleSizeCache.java https://reviews.apache.org/r/1191/#comment2511 shouldn't this implement BlockCache?
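Fleshed out, the suggested restructuring could read like the sketch below; addBuffersForSlab and the surrounding class are hypothetical stand-ins from the review comment, and the slab carving is elided:

{code}
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the suggested split: one loop for full slabs,
// one if-statement for the partially full remainder.
class SlabAllocationSketch {
  private final List<ByteBuffer> slabs = new ArrayList<ByteBuffer>();

  void allocateSlabs(int numBlocks, int maxBlocksPerSlab, int blockSize) {
    int numFullSlabs = numBlocks / maxBlocksPerSlab;
    int remainder = numBlocks % maxBlocksPerSlab;

    for (int i = 0; i < numFullSlabs; i++) {
      ByteBuffer slab = ByteBuffer.allocateDirect(maxBlocksPerSlab * blockSize);
      addBuffersForSlab(slab);
    }
    if (remainder > 0) {
      ByteBuffer slab = ByteBuffer.allocateDirect(remainder * blockSize);
      addBuffersForSlab(slab);
    }
  }

  private void addBuffersForSlab(ByteBuffer slab) {
    slabs.add(slab);
    // carving the slab into fixed-size block buffers would happen here (elided)
  }
}
{code}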
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070863#comment-13070863 ] Jean-Daniel Cryans commented on HBASE-4132: --- It could be weird, if we just merge them then the Replication class (and others implementing wal.WALObserver in the future) would have imports for CP classes since ObserverContext is passed in the cp.WALObserver methods. I'd prefer a rename of either or both. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4064) Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long...
[ https://issues.apache.org/jira/browse/HBASE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaojinchao updated HBASE-4064: -- Attachment: (was: HBASE-4064_branch90V2.patch) Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long... Key: HBASE-4064 URL: https://issues.apache.org/jira/browse/HBASE-4064 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.3 Reporter: Jieshan Bean Fix For: 0.90.5 Attachments: HBASE-4064-v1.patch, HBASE-4064_branch90V2.patch, disableflow.png 1. If there is a rubbish RegionState object with PENDING_CLOSE in regionsInTransition(The RegionState was remained by some exception which should be removed, that's why I called it as rubbish object), but the region is not currently assigned anywhere, TimeoutMonitor will fall into an endless loop: 2011-06-27 10:32:21,326 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301 2011-06-27 10:32:21,326 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 2011-06-27 10:32:21,438 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining) 2011-06-27 10:32:21,441 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is not currently assigned anywhere 2011-06-27 10:32:31,207 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301 2011-06-27 10:32:31,207 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 2011-06-27 10:32:31,215 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining) 2011-06-27 10:32:31,215 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is not currently assigned anywhere 2011-06-27 10:32:41,164 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301 2011-06-27 10:32:41,164 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 2011-06-27 10:32:41,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining) 2011-06-27 10:32:41,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is not currently assigned anywhere . 
2. In the following scenario, two concurrent unassign calls on the same region can lead to the above problem: the first unassign call sends its RPC successfully; the master watches the RS_ZK_REGION_CLOSED event and, while processing it, creates a ClosedRegionHandler to remove the region's state on the master. E.g. while the ClosedRegionHandler is running in an hbase.master.executor.closeregion.threads thread (A), another unassign call for the same region runs in another thread (B). When thread B evaluates if (!regions.containsKey(region)), this.regions still holds the region info; the CPU then switches to thread A. Thread A removes the region from both this.regions and regionsInTransition, then execution switches back to thread B. Thread B continues and throws an exception with the message Server null returned java.lang.NullPointerException: Passed server is null for 9a6e26d40293663a79523c58315b930f, but without removing the newly added RegionState from regionsInTransition, so it can never be removed.
{code}
public void unassign(HRegionInfo region, boolean force) {
  LOG.debug("Starting unassignment of region " +
    region.getRegionNameAsString() + " (offlining)");
  synchronized (this.regions) {
    //
{code}
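To make the interleaving concrete, here is a minimal, self-contained toy in Java of the check-then-act race described above. The class, fields, and sendCloseRpc helper are hypothetical stand-ins, not the actual AssignmentManager code or the committed patch; real code would also not hold a lock across an RPC. The point is only that the containsKey check, the RegionState insertion, and the cleanup on failure must not be separable by another thread:
{code}
import java.util.HashMap;
import java.util.Map;

// Toy model of the race: if the containsKey check and the later RegionState
// mutation are not covered by one lock, a concurrent ClosedRegionHandler can
// interleave between them and strand a PENDING_CLOSE entry forever.
public class UnassignRaceSketch {
  // region -> hosting server; stands in for AssignmentManager's this.regions
  private final Map<String, String> regions = new HashMap<>();
  // region -> state; stands in for regionsInTransition
  private final Map<String, String> regionsInTransition = new HashMap<>();

  public void unassign(String region) {
    synchronized (regions) {
      if (!regions.containsKey(region)) {
        // Not assigned anywhere: clear any stale entry instead of leaving a
        // PENDING_CLOSE record for TimeoutMonitor to force-unassign forever.
        regionsInTransition.remove(region);
        return;
      }
      regionsInTransition.put(region, "PENDING_CLOSE");
      try {
        sendCloseRpc(region); // hypothetical helper
      } catch (RuntimeException e) {
        // On failure, drop the state we just added so it cannot linger.
        regionsInTransition.remove(region);
        throw e;
      }
    }
  }

  private void sendCloseRpc(String region) { /* elided */ }
}
{code}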
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070877#comment-13070877 ] Andrew Purtell commented on HBASE-4132: --- bq. Hmm, this seems to be a confusion of org.apache.hadoop.hbase.regionserver.wal.WALObserver and org.apache.hadoop.hbase.coprocessor.WALObserver. Aha. I agree with J-D; we should do a rename. Extend the WALObserver API to accommodate log archival - Key: HBASE-4132 URL: https://issues.apache.org/jira/browse/HBASE-4132 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.92.0 Attachments: walArchive.txt The WALObserver interface exposes the log roll events. It would be nice to extend it to accommodate log archival events as well. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070887#comment-13070887 ] Ted Yu commented on HBASE-4132: --- org.apache.hadoop.hbase.regionserver.wal.WALObserver is mostly internal. It is used for LogRoller and replication. Shall we rename it to org.apache.hadoop.hbase.regionserver.wal.WALActionsListener? See HRegionServer.getWALActionListeners():
{code}
// Replication handler is an implementation of WALActionsListener.
listeners.add(this.replicationHandler);
{code}
Extend the WALObserver API to accommodate log archival - Key: HBASE-4132 URL: https://issues.apache.org/jira/browse/HBASE-4132 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.92.0 Attachments: walArchive.txt The WALObserver interface exposes the log roll events. It would be nice to extend it to accommodate log archival events as well. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
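For discussion's sake, a rough sketch of what the renamed internal listener could look like once HBASE-4132's archival hook is added; the method names and signatures below are guesses for illustration, not the committed API:
{code}
import org.apache.hadoop.fs.Path;

// Illustrative only: the renamed internal listener with both the existing
// roll notification and a new archival notification. Names are assumptions.
public interface WALActionsListener {

  /** Called after the WAL has been rolled from oldPath to newPath. */
  void logRolled(Path oldPath, Path newPath);

  /** Called after an old WAL file has been moved to the archive directory. */
  void logArchived(Path oldPath, Path archivedPath);
}
{code}
Replication would then implement this interface without touching any coprocessor classes, which is the import-separation point J-D raised above.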
[jira] [Commented] (HBASE-3899) enhance HBase RPC to support free-ing up server handler threads even if response is not ready
[ https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070891#comment-13070891 ] jirapos...@reviews.apache.org commented on HBASE-3899: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1174/ --- (Updated 2011-07-26 01:19:52.655737) Review request for hbase.
Changes
---
* Add checking for the number of calls currently delayed. A warning message is issued if too many calls are delayed.
* Unit test to check that the above warning works.
* endDelay() now takes an Object as a parameter, not a Writable. Initially, I thought the method that ended the delay should pack the response (i.e. endDelay(new HbaseObjectWritable(retval))), but it makes more sense to pack it in setResponse.
* Address other feedback from Todd Lipcon. Thanks!
Summary
---
Free up the RPC server Handler thread if the called routine specifies that the call should be delayed. The RPC client sees no difference; changes are server-side only. This is based on the previously submitted patch from Dhruba.
This addresses bug HBASE-3899. https://issues.apache.org/jira/browse/HBASE-3899
Diffs (updated)
-
src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 61d3915
src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java 0da7f9e
src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java PRE-CREATION
Diff: https://reviews.apache.org/r/1174/diff
Testing
---
Unit tests run. Also, the patch includes a new unit test.
Thanks, Vlad
enhance HBase RPC to support free-ing up server handler threads even if response is not ready - Key: HBASE-3899 URL: https://issues.apache.org/jira/browse/HBASE-3899 Project: HBase Issue Type: Improvement Components: ipc Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: HBASE-3899.patch, asyncRpc.txt, asyncRpc.txt In the current implementation, the server handler thread picks up an item from the incoming call queue, processes it, wraps the response as a Writable, and sends it back to the IPC server module. This wastes thread resources when the thread is blocked on disk IO (transaction logging, reads into the block cache, etc). It would be nice if we could make the RPC Server Handler threads pick up a call from the IPC queue and hand it over to the application (e.g. HRegion); the application can queue it to be processed asynchronously and send a response back to the IPC server module saying that the response is not ready. The RPC Server Handler thread is then ready to pick up another request from the incoming call queue. When the queued call is processed by the application, it indicates to the IPC module that the response is now ready to be sent back to the client. The RPC client continues to experience the same behaviour as before: an RPC client is synchronous and blocks until the response arrives. This RPC enhancement allows us to do very powerful things with the RegionServer. In the future, we can enhance the RegionServer's threading model into a message-passing model for better performance; we will not be limited by the number of threads in the RegionServer. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
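As a self-contained sketch of the delayed-response pattern the review describes: a handler marks a call as delayed and returns immediately, and whoever finishes the work later supplies the result. The Delayable name appears in the diff listing above, but the method shapes and the DelayedCall/DelayedRpcSketch classes here are illustrative assumptions, not the patch's code:
{code}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch: the handler thread frees itself via startDelay(); the application
// later calls endDelay(result), which lets the responder send the reply.
interface Delayable {
  void startDelay();            // free the handler thread, hold the response
  void endDelay(Object result); // response is ready; release it to the client
}

class DelayedCall implements Delayable {
  private final CompletableFuture<Object> response = new CompletableFuture<>();
  private volatile boolean delayed;

  @Override public void startDelay() { delayed = true; }
  @Override public void endDelay(Object result) { response.complete(result); }

  boolean isDelayed() { return delayed; }
  Object awaitResponse() throws Exception { return response.get(); }
}

public class DelayedRpcSketch {
  public static void main(String[] args) throws Exception {
    ScheduledExecutorService appWork = Executors.newScheduledThreadPool(1);
    DelayedCall call = new DelayedCall();

    // Handler thread: the application asks for a delay; the handler is now
    // free to pick the next call off the queue.
    call.startDelay();
    appWork.schedule(() -> call.endDelay("deferred result"), 100, TimeUnit.MILLISECONDS);

    // Responder side: blocks only here, not in a handler thread.
    System.out.println(call.awaitResponse());
    appWork.shutdown();
  }
}
{code}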
[jira] [Commented] (HBASE-3899) enhance HBase RPC to support free-ing up server handler threads even if response is not ready
[ https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070896#comment-13070896 ] jirapos...@reviews.apache.org commented on HBASE-3899: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1174/#review1185 --- Ship it! Looks good. Have you run the full test suite with the current iteration of the patch? - Todd
On 2011-07-26 01:19:52, Vlad Dogaru wrote:
bq.
bq. ---
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1174/
bq. ---
bq.
bq. (Updated 2011-07-26 01:19:52)
bq.
bq. Review request for hbase.
bq.
bq. Summary
bq. ---
bq.
bq. Free up RPC server Handler thread if the called routine specifies the call should be delayed. The RPC client sees no difference, changes are server-side only. This is based on the previous submitted patch from Dhruba.
bq.
bq. This addresses bug HBASE-3899.
bq. https://issues.apache.org/jira/browse/HBASE-3899
bq.
bq. Diffs
bq. -
bq.
bq. src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 61d3915
bq. src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java 0da7f9e
bq. src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/1174/diff
bq.
bq. Testing
bq. ---
bq.
bq. Unit tests run. Also, the patch includes a new unit test.
bq.
bq. Thanks,
bq.
bq. Vlad
enhance HBase RPC to support free-ing up server handler threads even if response is not ready - Key: HBASE-3899 URL: https://issues.apache.org/jira/browse/HBASE-3899 Project: HBase Issue Type: Improvement Components: ipc Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: HBASE-3899.patch, asyncRpc.txt, asyncRpc.txt In the current implementation, the server handler thread picks up an item from the incoming call queue, processes it, wraps the response as a Writable, and sends it back to the IPC server module. This wastes thread resources when the thread is blocked on disk IO (transaction logging, reads into the block cache, etc). It would be nice if we could make the RPC Server Handler threads pick up a call from the IPC queue and hand it over to the application (e.g. HRegion); the application can queue it to be processed asynchronously and send a response back to the IPC server module saying that the response is not ready. The RPC Server Handler thread is then ready to pick up another request from the incoming call queue. When the queued call is processed by the application, it indicates to the IPC module that the response is now ready to be sent back to the client. The RPC client continues to experience the same behaviour as before: an RPC client is synchronous and blocks until the response arrives. This RPC enhancement allows us to do very powerful things with the RegionServer. In the future, we can enhance the RegionServer's threading model into a message-passing model for better performance; we will not be limited by the number of threads in the RegionServer. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4139) [stargate] Update ScannerModel with support for filter package additions
[ https://issues.apache.org/jira/browse/HBASE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4139: -- Attachment: HBASE-4139.patch [stargate] Update ScannerModel with support for filter package additions Key: HBASE-4139 URL: https://issues.apache.org/jira/browse/HBASE-4139 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.90.4, 0.92.0 Attachments: HBASE-4139.patch Filters have been added to the o.a.h.h.filters package without updating o.a.h.h.rest.model.ScannerModel. Bring ScannerModel up to date. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4139) [stargate] Update ScannerModel with support for filter package additions
[ https://issues.apache.org/jira/browse/HBASE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4139: -- Fix Version/s: 0.90.4 Status: Patch Available (was: Open) [stargate] Update ScannerModel with support for filter package additions Key: HBASE-4139 URL: https://issues.apache.org/jira/browse/HBASE-4139 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.90.4, 0.92.0 Attachments: HBASE-4139.patch Filters have been added to the o.a.h.h.filters package without updating o.a.h.h.rest.model.ScannerModel. Bring ScannerModel up to date. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3937) Region PENDING-OPEN timeout with unexpected ZK node state leads to an endless loop
[ https://issues.apache.org/jira/browse/HBASE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-3937: -- Fix Version/s: (was: 0.92.0) 0.94.0 Region PENDING-OPEN timeout with unexpected ZK node state leads to an endless loop --- Key: HBASE-3937 URL: https://issues.apache.org/jira/browse/HBASE-3937 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.3 Reporter: Jieshan Bean Assignee: Jieshan Bean Fix For: 0.94.0 Here is the scenario in which this problem happens:
1. HMaster assigned region A to RS1, so the RegionState was set to PENDING_OPEN.
2. Because there were too many opening requests, the open process on RS1 was blocked.
3. Some time later, TimeoutMonitor found that the assignment of A had timed out. Since the RegionState was PENDING_OPEN, it went into the following handler branch (which just puts the region into a waiting-for-assignment set):
{code}
case PENDING_OPEN:
  LOG.info("Region has been PENDING_OPEN for too " +
      "long, reassigning region=" + regionInfo.getRegionNameAsString());
  assigns.put(regionState.getRegion(), Boolean.TRUE);
  break;
{code}
So we can see that in this case we assume the ZK node state is OFFLINE. In a normal flow, that is OK.
4. But before the real assignment, RS1's blocked requests were finally processed, and that interfered with the new assignment: RS1 updated the ZK node state from OFFLINE to OPENING.
5. The new assignment started and sent the region to open on RS2. While opening, RS2 must transition the ZK node state from OFFLINE to OPENING; since the current state was already OPENING, the operation failed, and the region could not be opened successfully anymore.
So I think, to avoid this problem, in the PENDING_OPEN case of TimeoutMonitor we should transition the ZK node state to OFFLINE first. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
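A toy model of the proposed fix, under stated assumptions: the znode is reset to OFFLINE inside the PENDING_OPEN timeout branch rather than assumed to already be OFFLINE, so a stale OPENING state left behind by a slow region server cannot block the retry. This is simplified plain Java, not the actual AssignmentManager patch; the real code would go through ZooKeeper rather than a local map:
{code}
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of TimeoutMonitor's PENDING_OPEN handling.
public class PendingOpenTimeoutSketch {
  enum ZkState { OFFLINE, OPENING, OPENED }

  private final Map<String, ZkState> zkNodes = new HashMap<>();  // znode per region
  private final Map<String, Boolean> assigns = new HashMap<>();  // waiting-assign set

  void onPendingOpenTimeout(String regionName) {
    System.out.println("Region has been PENDING_OPEN for too long, reassigning region=" + regionName);
    // The key step of the fix: force the node back to OFFLINE instead of
    // assuming it is OFFLINE, so the retry's OFFLINE->OPENING transition works.
    zkNodes.put(regionName, ZkState.OFFLINE);
    assigns.put(regionName, Boolean.TRUE);
  }

  public static void main(String[] args) {
    PendingOpenTimeoutSketch sketch = new PendingOpenTimeoutSketch();
    // Precondition from step 4: RS1's late processing left the node OPENING.
    sketch.zkNodes.put("region-A", ZkState.OPENING);
    sketch.onPendingOpenTimeout("region-A");
    System.out.println("znode state after timeout handling: " + sketch.zkNodes.get("region-A")); // OFFLINE
  }
}
{code}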
[jira] [Resolved] (HBASE-4121) improve hbck tool to fix .META. hole issue.
[ https://issues.apache.org/jira/browse/HBASE-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-4121. --- Resolution: Fixed Duplicate of HBASE-4122 improve hbck tool to fix .META. hole issue. --- Key: HBASE-4121 URL: https://issues.apache.org/jira/browse/HBASE-4121 Project: HBase Issue Type: Improvement Reporter: feng xu Fix For: 0.92.0 The hbase hbck tool can detect a .META. hole, but it cannot fix the problem via --fix. I plan to improve the tool. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4114) Metrics for HFile HDFS block locality
[ https://issues.apache.org/jira/browse/HBASE-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070922#comment-13070922 ] Ming Ma commented on HBASE-4114: Thanks, Stack, Ted.
1. In the experiment table above, the total number of HDFS blocks that can be retrieved locally by the region server, as well as the total number of HDFS blocks for all HFiles, are defined at the whole-cluster level. The external program also calculates locality information per hfile, per region, and per region server. It uses the HDFS namenode, and the calculation is independent of any map reduce jobs.
2. In terms of how we can calculate this metric inside hbase, we can do it in two steps: the first is to calculate the metric independent of map reduce jobs; the second is to calculate it per map reduce job.
3. Calculating the locality index independent of map reduce jobs:
a. It will first be calculated at the hfile level { total # of HDFS blocks, total # of local HDFS blocks }; the data then gets aggregated at the region level and finally at the region server level. (A per-hfile sketch follows this entry.)
b. Impact on the namenode: there are 2 RPC calls to the NN to get block info for each hfile. If we assume 100 regions per RS, 10 hfiles per region, and 500 RSs, we will have 1M RPC hits to the NN. Most of the time that won't be an issue if we only calculate the hfile locality index when an hfile is created or when a region is loaded by the RS for the first time. Because HDFS can still move HDFS blocks around without hbase knowing it, we still need to refresh the value periodically.
c. The computation can be done in the RS or in HMaster. The RS seems better in terms of design (only the store knows the HDFS path of the hfile's location; HMaster doesn't) and extensibility (to calculate the locality index per map reduce job). The locality index can be part of HServerLoad and RegionLoad for the load balancer to use. The RS will rotate through all regions periodically in its main thread. The calculation interval defined by hbase.regionserver.msginterval might be too short for this scenario if we want to minimize the load on the NN for a large cluster (20 NN RPCs per RS per 3 sec).
d. The locality index can be a new RS metric. We can also put it on table.jsp for each region.
e. HRegionInfo is kind of static; it doesn't change over time, whereas the locality index changes over time for a given region. Maybe ClusterStatus/HServerInfo/HServerLoad/RegionLoad are better?
4. Locality index calculation for scans / map reduce jobs:
a. The original scenario is for full table scans only. If we want to provide an accurate locality index for any scan / map reduce job, this could be tricky given that i) a map reduce job can have start/end keys and filters such as a time range; ii) the block cache can be used, and thus an hfile shouldn't be accounted for if there is a cache hit; iii) the data volume read from an HDFS block is also a factor: reading a smaller buffer is different from reading a bigger buffer.
b. One useful scenario: we want to find out why map jobs sometimes run slower, so it is useful to have the metric as part of the map reduce job status page. We can estimate it by using the ganglia page to get the locality index value for the RSs at the time the map reduce job runs.
c. To provide more accurate data, we can modify TableInputFormat to a) call HBaseAdmin.getClusterStatus to get the locality index info for each region, and b) calculate the intersection between the scan specification and the ClusterStatus based on key range as well as column family. It isn't 100% accurate, but it might be good enough.
d. To be really accurate, the region server needs to provide a locality index for each scan operation back to the client.
Metrics for HFile HDFS block locality - Key: HBASE-4114 URL: https://issues.apache.org/jira/browse/HBASE-4114 Project: HBase Issue Type: Improvement Components: metrics, regionserver Reporter: Ming Ma Assignee: Ming Ma Normally, when we put hbase and HDFS in the same cluster (e.g., the region server runs on the datanode), we have reasonably good data locality, as explained by Lars. Work has also been done by Jonathan to address the startup situation. There are scenarios where regions can be on a different machine from the machines that hold the underlying HFile blocks, at least for some period of time. This will have a performance impact on whole-table scan operations and map reduce jobs during that time. 1. After the load balancer moves a region and before a compaction (which generates HFiles on the new region server) on that region, HDFS blocks can be remote. 2. When a new machine is added or removed, hbase's region assignment policy is different from HDFS's block placement policy. 3. Even if there is not much hbase activity, HDFS can load balance HFile blocks as other non-hbase applications push other
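To make step 3a concrete, here is a minimal per-hfile sketch using the standard Hadoop FileSystem API; the class and its placement are illustrative, not the eventual patch. It also shows where the two NN RPCs per hfile mentioned in 3b come from (getFileStatus plus getFileBlockLocations):
{code}
import java.io.IOException;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative helper: the fraction of an hfile's HDFS blocks that have a
// replica on the given host. Region- and RS-level indexes would aggregate
// the { total blocks, local blocks } pairs rather than averaging the ratios.
public final class HFileLocality {
  private HFileLocality() {}

  public static float localityIndex(FileSystem fs, Path hfile, String localHost)
      throws IOException {
    FileStatus status = fs.getFileStatus(hfile);                 // NN RPC #1
    BlockLocation[] blocks =
        fs.getFileBlockLocations(status, 0, status.getLen());    // NN RPC #2
    if (blocks == null || blocks.length == 0) {
      return 0f;
    }
    int local = 0;
    for (BlockLocation block : blocks) {
      for (String host : block.getHosts()) {
        if (host.equals(localHost)) {
          local++;
          break;
        }
      }
    }
    return (float) local / blocks.length;
  }
}
{code}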
[jira] [Updated] (HBASE-4134) The total number of regions was more than the actual region count after the hbck fix
[ https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] feng xu updated HBASE-4134: --- Fix Version/s: (was: 0.90.4) 0.94.0 The total number of regions was more than the actual region count after the hbck fix Key: HBASE-4134 URL: https://issues.apache.org/jira/browse/HBASE-4134 Project: HBase Issue Type: Bug Affects Versions: 0.90.3 Reporter: feng xu Fix For: 0.94.0 1. I found the problem (some regions were multiply assigned) while running hbck to check the cluster's health. Here's the result:
{noformat}
ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020
ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020
ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is listed in META on region server 158-1-91-103:20020 but is multiply assigned to region servers 158-1-91-103:20020, 158-1-91-105:20020
Summary:
-ROOT- is okay. Number of regions: 1 Deployed on: 158-1-91-105:20020
.META. is okay. Number of regions: 1 Deployed on: 158-1-91-103:20020
test1 is okay. Number of regions: 25297 Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020
14829 inconsistencies detected. Status: INCONSISTENT
{noformat}
2. Then I tried hbck -fix to fix the problem. Everything seemed OK, but I found that the total number of regions reported by the load balancer (35029) was larger than the actual region count (25299) after the fix. Here's the related log snippet:
{noformat}
2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=25299 average=8433.0 mostloaded=8433
2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=35029 average=11676.333 mostloaded=11677 leastloaded=11676
{noformat}
3. I tracked one region's behavior during that time, taking the region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. as an example:
(1) It was assigned to 158-1-91-101 at first.
(2) HBCK sent a closing request to the RegionServer, and the RegionServer closed the region silently without notifying HMaster.
(3) The region was still carried by RS 158-1-91-103 as far as HMaster knew.
(4) HBCK then triggered a new assignment. The region was indeed assigned again, but the old assignment information still remained in AM#regions and AM#servers. That's why the reported region count ended up larger than the actual number. (A toy model of this bookkeeping follows this entry.)
{noformat}
Line 178967: 2011-07-22 02:47:51,247 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/52782c0241a598b3e37ca8729da0 (region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., server=HBCKServerName, state=M_ZK_REGION_OFFLINE)
Line 178968: 2011-07-22 02:47:51,247 INFO org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, region=52782c0241a598b3e37ca8729da0
Line 178969: 2011-07-22 02:47:51,248 INFO org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering assignment of region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0.
Line 178970: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) available servers
Line 178971: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 158-1-91-101,20020,1311231878544
Line 178983: 2011-07-22 02:47:51,285 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544, region=52782c0241a598b3e37ca8729da0
Line 179001: 2011-07-22 02:47:51,318 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=158-1-91-101,20020,1311231878544, region=52782c0241a598b3e37ca8729da0
Line 179002: 2011-07-22 02:47:51,319 DEBUG
{noformat}
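The bookkeeping problem in step (4), reduced to a toy: AM#regions maps region to server and AM#servers maps server to its region set, mirroring the names in the description. The code is hypothetical, not HBase's AssignmentManager; it only shows why a new assignment must evict the old entry before the counts add up:
{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the double counting: if a new assignment does not first evict
// the old entry, the region sits in two servers' sets and the load balancer's
// region total is inflated (25299 real regions reported as 35029 above).
public class HbckReassignSketch {
  private final Map<String, String> regions = new HashMap<>();      // region -> server
  private final Map<String, Set<String>> servers = new HashMap<>(); // server -> regions

  void assign(String region, String newServer) {
    String oldServer = regions.put(region, newServer);
    if (oldServer != null && !oldServer.equals(newServer)) {
      // The step that was missing: drop the region from the old server's set,
      // otherwise it is counted once per server it ever lived on.
      servers.get(oldServer).remove(region);
    }
    servers.computeIfAbsent(newServer, s -> new HashSet<>()).add(region);
  }

  int totalRegionCount() {
    return servers.values().stream().mapToInt(Set::size).sum();
  }

  public static void main(String[] args) {
    HbckReassignSketch am = new HbckReassignSketch();
    am.assign("52782c0241a598b3e37ca8729da0", "158-1-91-103");
    am.assign("52782c0241a598b3e37ca8729da0", "158-1-91-101"); // HBCK-triggered reassign
    System.out.println(am.totalRegionCount()); // prints 1, not 2
  }
}
{code}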
[jira] [Created] (HBASE-4140) book: Update our hadoop vendor section
book: Update our hadoop vendor section -- Key: HBASE-4140 URL: https://issues.apache.org/jira/browse/HBASE-4140 Project: HBase Issue Type: Improvement Reporter: stack -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4140) book: Update our hadoop vendor section
[ https://issues.apache.org/jira/browse/HBASE-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4140: - Attachment: hadoop.txt Updated the Cloudera mention to recommend released CDH and to note the point update. Added a reference to the MapR distribution. book: Update our hadoop vendor section -- Key: HBASE-4140 URL: https://issues.apache.org/jira/browse/HBASE-4140 Project: HBase Issue Type: Improvement Reporter: stack Attachments: hadoop.txt -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070946#comment-13070946 ] Jean-Daniel Cryans commented on HBASE-4132: --- +1 Extend the WALObserver API to accommodate log archival - Key: HBASE-4132 URL: https://issues.apache.org/jira/browse/HBASE-4132 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.92.0 Attachments: walArchive.txt The WALObserver interface exposes the log roll events. It would be nice to extend it to accommodate log archival events as well. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4134) The total number of regions was more than the actual region count after the hbck fix
[ https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070954#comment-13070954 ] feng xu commented on HBASE-4134: To Ted Yu: The HBASE-4053 patch had been integrated before this issue occurred in my test cluster, so I think this issue has no relationship with HBASE-4053. The HBASE-4053 patch ensures that a region is not double-counted on one regionserver, but in this issue the region was carried by two (maybe more) regionservers. The total number of regions was more than the actual region count after the hbck fix Key: HBASE-4134 URL: https://issues.apache.org/jira/browse/HBASE-4134 Project: HBase Issue Type: Bug Affects Versions: 0.90.3 Reporter: feng xu Fix For: 0.94.0 1. I found the problem (some regions were multiply assigned) while running hbck to check the cluster's health. Here's the result:
{noformat}
ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020
ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020
ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is listed in META on region server 158-1-91-103:20020 but is multiply assigned to region servers 158-1-91-103:20020, 158-1-91-105:20020
Summary:
-ROOT- is okay. Number of regions: 1 Deployed on: 158-1-91-105:20020
.META. is okay. Number of regions: 1 Deployed on: 158-1-91-103:20020
test1 is okay. Number of regions: 25297 Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020
14829 inconsistencies detected. Status: INCONSISTENT
{noformat}
2. Then I tried hbck -fix to fix the problem. Everything seemed OK, but I found that the total number of regions reported by the load balancer (35029) was larger than the actual region count (25299) after the fix. Here's the related log snippet:
{noformat}
2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=25299 average=8433.0 mostloaded=8433
2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=35029 average=11676.333 mostloaded=11677 leastloaded=11676
{noformat}
3. I tracked one region's behavior during that time, taking the region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. as an example:
(1) It was assigned to 158-1-91-101 at first.
(2) HBCK sent a closing request to the RegionServer, and the RegionServer closed the region silently without notifying HMaster.
(3) The region was still carried by RS 158-1-91-103 as far as HMaster knew.
(4) HBCK then triggered a new assignment. The region was indeed assigned again, but the old assignment information still remained in AM#regions and AM#servers. That's why the reported region count ended up larger than the actual number.
{noformat}
Line 178967: 2011-07-22 02:47:51,247 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/52782c0241a598b3e37ca8729da0 (region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., server=HBCKServerName, state=M_ZK_REGION_OFFLINE)
Line 178968: 2011-07-22 02:47:51,247 INFO org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, region=52782c0241a598b3e37ca8729da0
Line 178969: 2011-07-22 02:47:51,248 INFO org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering assignment of region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0.
Line 178970: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) available servers
Line 178971: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 158-1-91-101,20020,1311231878544
Line 178983: 2011-07-22 02:47:51,285 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544,
{noformat}