[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
[ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaojinchao updated HBASE-4124: -- Attachment: HBASE-4124_Branch90V2.patch

ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
Key: HBASE-4124 URL: https://issues.apache.org/jira/browse/HBASE-4124 Project: HBase Issue Type: Bug Components: master Reporter: fulin wang Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, log.txt Original Estimate: 0.4h Remaining Estimate: 0.4h

Issue: The RS fails because the region is 'already online on this server' and returns; the HM then never receives the open confirmation and reports 'Regions in transition timed out'.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
[ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088146#comment-13088146 ] gaojinchao commented on HBASE-4124: --- I have finished the test. I'll describe the scenario:
step 1: start up the cluster
step 2: abort the master right after it finishes the call to sendRegionOpen(destination, regions)
step 3: start up the cluster again
The above steps reproduce the issue. When the master fails over, the meta table still records the dead server, but the region is actually being opened on a live region server.
[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
[ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088147#comment-13088147 ] gaojinchao commented on HBASE-4124: --- Sorry, a correction to step 3: start up the master again.
[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it
[ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088152#comment-13088152 ] stack commented on HBASE-4209: -- Is it because there is no shutdown hook in the master, and when in standalone mode everything runs in the one JVM, effectively the master's? In start-hbase.sh, if distMode is false, we ONLY start the master:
{code}
if [ $distMode == 'false' ]
then
  $bin/hbase-daemon.sh start master
else
  $bin/hbase-daemons.sh --config ${HBASE_CONF_DIR} start zookeeper
  $bin/hbase-daemon.sh --config ${HBASE_CONF_DIR} start master
  $bin/hbase-daemons.sh --config ${HBASE_CONF_DIR} \
    --hosts ${HBASE_REGIONSERVERS} start regionserver
  $bin/hbase-daemons.sh --config ${HBASE_CONF_DIR} \
    --hosts ${HBASE_BACKUP_MASTERS} start master-backup
fi
{code}
Inside the master it will take care of starting up all the other beasties if distMode == false.

The HBase hbase-daemon.sh SIGKILLs master when stopping it
--
Key: HBASE-4209 URL: https://issues.apache.org/jira/browse/HBASE-4209 Project: HBase Issue Type: Bug Components: master Reporter: Roman Shaposhnik

There's a bit of code in hbase-daemon.sh that makes the HBase master be SIGKILLed when stopping it, rather than trying SIGTERM (as is done for the other daemons). When HBase is executed in standalone mode (and the only daemon you need to run is the master) that causes newly created tables to go missing, as unflushed data is thrown out. If there was not a good reason to kill the master with SIGKILL, perhaps we can take that special case out and rely on SIGTERM.
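If the missing piece is indeed a shutdown hook, the difference between SIGTERM and SIGKILL can be sketched as follows (hypothetical names, not the actual HMaster code):

```java
// Minimal sketch (hypothetical names, not the actual HMaster code): a JVM
// shutdown hook that flushes pending data before exit. SIGTERM lets the JVM
// run hooks like this; SIGKILL terminates the process immediately, so any
// unflushed data is lost -- which matches the missing-tables symptom.
public class MasterShutdownSketch {
    static final StringBuilder events = new StringBuilder();

    // stand-in for flushing memstores and closing the WAL cleanly
    static void flushAndClose() {
        events.append("flushed;closed");
    }

    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(
            new Thread(MasterShutdownSketch::flushAndClose, "master-shutdown"));
        // ... serve requests; on SIGTERM or normal exit the hook runs,
        // on SIGKILL (kill -9) it never does
    }
}
```

In standalone mode all daemons share this one JVM, so killing it with SIGKILL skips any such hook for every embedded component at once.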
[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
[ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaojinchao updated HBASE-4124: -- Attachment: HBASE-4124_Branch90V2.patch
[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
[ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaojinchao updated HBASE-4124: -- Attachment: (was: HBASE-4124_Branch90V2.patch)
[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
[ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaojinchao updated HBASE-4124: -- Attachment: HBASE-4124_Branch90V2.patch
[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
[ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaojinchao updated HBASE-4124: -- Attachment: (was: HBASE-4124_Branch90V2.patch)
[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
[ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088173#comment-13088173 ] gaojinchao commented on HBASE-4124: --- I have added a test case for opening a region.
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088211#comment-13088211 ] jirapos...@reviews.apache.org commented on HBASE-4027: --
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1214/#review1585

In SingleSizeCache.cacheBlock():
CacheablePair newEntry = new CacheablePair(toBeCached.serialize(storedBlock), storedBlock);
The above operation splits toBeCached into two parts: the first is on-heap and slim; storedBlock is off-heap.

src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java
https://reviews.apache.org/r/1214/#comment3574
I think the word 'itself' in the javadoc above introduces confusion. It should be removed.

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
https://reviews.apache.org/r/1214/#comment3573
As Pi explained in the Cacheable interface, serialize() offloads the majority of the data to an off-heap ByteBuffer. What gets returned is the skeleton that lives on-heap.

- Ted

On 2011-08-19 20:21:35, Li Pi wrote:
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1214/
bq. (Updated 2011-08-19 20:21:35)
bq.
bq. Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, Jonathan Gray, and Li Pi.
bq.
bq. Summary
bq. ---
bq.
bq. Review request - I apparently can't edit tlipcon's earlier posting of my diff, so creating a new one.
bq.
bq. This addresses bug HBase-4027.
bq. https://issues.apache.org/jira/browse/HBase-4027
bq.
bq. Diffs
bq. -
bq. conf/hbase-env.sh 2d55d27
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 2d4002c
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/CacheStats.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/CachedBlock.java 3b130d8
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 097dc50
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java 1338453
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 886c31d
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/slab/Slab.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabItemEvictionWatcher.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e2c6c93
bq. src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 7b7bf73
bq. src/main/java/org/apache/hadoop/hbase/util/DirectMemoryUtils.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/io/hfile/TestCachedBlockQueue.java 1ad2ece
bq. src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java f0a9832
bq. src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSingleSizeCache.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlab.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlabCache.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStoreLAB.java d7e43a0
bq. src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 4387170
bq.
bq. Diff: https://reviews.apache.org/r/1214/diff
bq.
bq. Testing
bq. ---
bq. Ran benchmarks against it in HBase standalone mode. Wrote test cases for all classes; multithreaded test cases exist for the cache.
bq.
bq. Thanks,
bq.
bq. Li

Enable direct byte buffers LruBlockCache
Key: HBASE-4027 URL: https://issues.apache.org/jira/browse/HBASE-4027 Project: HBase Issue Type: Improvement Reporter: Jason Rutherglen Assignee: Li Pi Priority: Minor Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, hbase4027v11.5.diff, hbase4027v11.6.diff, hbase4027v11.7.diff, hbase4027v11.diff, hbase4027v12.1.diff,
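The on-heap/off-heap split Ted describes can be illustrated roughly like this (a sketch with invented names, not the patch's actual CacheablePair or Cacheable classes):

```java
import java.nio.ByteBuffer;

// Sketch (invented names, not the patch's actual classes): serialize()
// copies the bulk of a block's bytes into a direct (off-heap) ByteBuffer,
// and the object kept on-heap is only a slim handle to that buffer.
public class OffHeapSplitSketch {
    static final class SlimHandle {
        final ByteBuffer offHeap; // direct buffer holding the block bytes
        final int length;         // small on-heap metadata only
        SlimHandle(ByteBuffer offHeap, int length) {
            this.offHeap = offHeap;
            this.length = length;
        }
    }

    static SlimHandle serialize(byte[] blockBytes) {
        ByteBuffer direct = ByteBuffer.allocateDirect(blockBytes.length);
        direct.put(blockBytes);
        direct.flip();
        return new SlimHandle(direct, blockBytes.length);
    }

    static byte[] deserialize(SlimHandle handle) {
        byte[] out = new byte[handle.length];
        handle.offHeap.duplicate().get(out); // duplicate() leaves position intact
        return out;
    }
}
```

The point of the design is that the LRU-managed heap footprint per cached block stays tiny while the block payload lives outside the GC-managed heap.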
[jira] [Commented] (HBASE-4167) Potential leak of HTable instances when using HTablePool with PoolType.ThreadLocal
[ https://issues.apache.org/jira/browse/HBASE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088214#comment-13088214 ] Ted Yu commented on HBASE-4167: --- +1 on patch.

Potential leak of HTable instances when using HTablePool with PoolType.ThreadLocal
--
Key: HBASE-4167 URL: https://issues.apache.org/jira/browse/HBASE-4167 Project: HBase Issue Type: Bug Components: client Reporter: Gary Helmling Fix For: 0.92.0 Attachments: HBASE-4167.patch

(Initially discussed in HBASE-4150) In HTablePool, when obtaining a table:
{code}
private HTableInterface findOrCreateTable(String tableName) {
  HTableInterface table = tables.get(tableName);
  if (table == null) {
    table = createHTable(tableName);
  }
  return table;
}
{code}
In the case of {{ThreadLocalPool}}, it seems like there's an exposure here between when the table is created initially and when {{ThreadLocalPool.put()}} is called to set the thread-local variable (on {{PooledHTable.close()}}). Potential solution described by Karthick Sankarachary: For one thing, we might want to clear the tables variable when the {{HTablePool}} is closed (as shown below). For another, we should override the ThreadLocalPool#get method so that it removes the resource; otherwise it might end up referencing an HTableInterface that has been released.
{code}
diff --git a/src/main/java/org/apache/hadoop/hbase/client/HTablePool.java b/src/main/java/org/apache/hadoop/hbase/client/HTablePool.java
index 952a3aa..c198f15 100755
--- a/src/main/java/org/apache/hadoop/hbase/client/HTablePool.java
+++ b/src/main/java/org/apache/hadoop/hbase/client/HTablePool.java
@@ -309,6 +310,7 @@ public class HTablePool implements Closeable {
     for (String tableName : tables.keySet()) {
       closeTablePool(tableName);
     }
+    this.tables.clear();
   }
{code}
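The remove-on-get behavior suggested for ThreadLocalPool#get can be sketched like this (a simplified, generic illustration, not the actual HTablePool code):

```java
// Simplified sketch (not the actual HTablePool code): a ThreadLocal-backed
// pool whose get() clears the slot before returning, so the pool can never
// keep referencing -- or hand out twice -- a resource that has been taken.
public class ThreadLocalPoolSketch<R> {
    private final ThreadLocal<R> slot = new ThreadLocal<>();

    /** Takes the current thread's pooled resource, or null if none. */
    public R get() {
        R resource = slot.get();
        slot.remove(); // drop the reference: avoids returning a released instance
        return resource;
    }

    /** Returns a resource to the current thread's slot (e.g. on close()). */
    public void put(R resource) {
        slot.set(resource);
    }
}
```

Clearing the slot on get() is what prevents the leak: the thread-local reference can no longer outlive the caller's use of the resource.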
[jira] [Commented] (HBASE-4222) Make HLog more resilient to write pipeline failures
[ https://issues.apache.org/jira/browse/HBASE-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088219#comment-13088219 ] jirapos...@reviews.apache.org commented on HBASE-4222: --
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1590/#review1586
Ship it! TestHLog and TestLogRolling passed.
- Ted

On 2011-08-20 05:39:30, Gary Helmling wrote:
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1590/
bq. (Updated 2011-08-20 05:39:30)
bq.
bq. Review request for hbase.
bq.
bq. Summary
bq. ---
bq.
bq. This patch corrects a few problems, as I see it, with the current log rolling process:
bq.
bq. 1) HLog.LogSyncer.run() now handles an IOException in the inner while loop. Previously any IOException would cause the LogSyncer thread to exit, even if the subsequent log roll succeeded. This meant the region server kept running without a LogSyncer thread.
bq. 2) Log rolls triggered by IOExceptions were being skipped when there were no entries in the log. This would prevent the log from being recovered in a timely manner.
bq. 3) minor - FailedLogCloseException was never actually being thrown out of HLog.cleanupCurrentWriter(), resulting in inaccurate logging on RS abort.
bq.
bq. The bigger change is the addition of a configuration property -- hbase.regionserver.logroll.errors.tolerated -- that is checked against a counter of consecutive close errors to see whether or not an abort should be triggered.
bq.
bq. Prior to this patch, we could readily trigger region server aborts by rolling all the data nodes in a cluster while region servers were running. This was equally true whether write activity was happening or not. (In fact I think having concurrent write activity actually gave a better chance for the log to be rolled before all DNs in the write pipeline went down, and thus for the region server not to abort.)
bq.
bq. With this change and hbase.regionserver.logroll.errors.tolerated=2, I can roll DNs at will without causing any loss of service.
bq.
bq. I'd appreciate some scrutiny on any log rolling subtleties or interactions I may be missing here. If there are alternate/better ways to handle this in the DFSClient layer, I'd also appreciate any pointers to that.
bq.
bq. This addresses bug HBASE-4222.
bq. https://issues.apache.org/jira/browse/HBASE-4222
bq.
bq. Diffs
bq. -
bq. src/main/java/org/apache/hadoop/hbase/regionserver/LogRoller.java 8e87c83
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java c301d1b
bq. src/main/resources/hbase-default.xml 66548ca
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 5063896
bq.
bq. Diff: https://reviews.apache.org/r/1590/diff
bq.
bq. Testing
bq. ---
bq.
bq. Added a new test for rolling data nodes under a running cluster: TestLogRolling.testLogRollOnPipelineRestart().
bq.
bq. Tested patch on a running cluster with 3 slaves, rolling data nodes with and without concurrent write activity.
bq.
bq. Thanks,
bq.
bq. Gary

Make HLog more resilient to write pipeline failures
---
Key: HBASE-4222 URL: https://issues.apache.org/jira/browse/HBASE-4222 Project: HBase Issue Type: Improvement Components: wal Reporter: Gary Helmling Assignee: Gary Helmling Fix For: 0.92.0

The current implementation of HLog rolling to recover from transient errors in the write pipeline seems to have two problems:
# When {{HLog.LogSyncer}} triggers an {{IOException}} during time-based sync operations, it triggers a log rolling request in the corresponding catch block, but only after escaping from the internal while loop. As a result, the {{LogSyncer}} thread will exit and never be restarted from what I can tell, even if the log rolling was successful.
# Log rolling requests triggered by an {{IOException}} in {{sync()}} or {{append()}} never happen if no entries have yet been written to the log. This means that write errors are not immediately recovered, which extends the exposure to more errors occurring in the pipeline.
In addition, it seems like we should be able to better handle transient problems, like a rolling restart of DataNodes while the HBase RegionServers are running. Currently this will reliably cause RegionServer aborts during log rolling: either
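The tolerance counter the patch introduces (via hbase.regionserver.logroll.errors.tolerated) can be sketched as follows — a hypothetical class, not the patch itself:

```java
// Hypothetical sketch of the consecutive-error counter described above:
// failed log closes are counted, any successful roll resets the count, and
// only when the count exceeds the configured tolerance does the region
// server need to abort.
public class LogRollErrorTracker {
    private final int errorsTolerated; // hbase.regionserver.logroll.errors.tolerated
    private int consecutiveErrors = 0;

    public LogRollErrorTracker(int errorsTolerated) {
        this.errorsTolerated = errorsTolerated;
    }

    /** Records a failed close; returns true if the server should abort. */
    public boolean onCloseError() {
        consecutiveErrors++;
        return consecutiveErrors > errorsTolerated;
    }

    /** A successful roll clears the error streak. */
    public void onSuccessfulRoll() {
        consecutiveErrors = 0;
    }
}
```

With a tolerance of 2, two failed closes in a row are absorbed (giving the DataNode pipeline time to recover) and only a third consecutive failure would trigger an abort.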
[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
[ https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088224#comment-13088224 ] Subbu M Iyer commented on HBASE-4213: - Yes, I will. I am on it.

Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)
Key: HBASE-4213 URL: https://issues.apache.org/jira/browse/HBASE-4213 Project: HBase Issue Type: Improvement Reporter: Subbu M Iyer Assignee: Subbu M Iyer Fix For: 0.92.0 Attachments: HBASE-4213-Instant_schema_change.patch, HBASE-4213_Instant_schema_change_-Version_2_.patch

This Jira is a slight variation in approach to what is being done as part of https://issues.apache.org/jira/browse/HBASE-1730 Support instant schema updates such as Modify Table, Add Column, and Modify Column operations:
1. Without enabling/disabling the table.
2. Without bulk unassign/assign of regions.
[jira] [Created] (HBASE-4235) Attempts to reconnect to expired ZooKeeper sessions
Attempts to reconnect to expired ZooKeeper sessions
---
Key: HBASE-4235 URL: https://issues.apache.org/jira/browse/HBASE-4235 Project: HBase Issue Type: Task Affects Versions: 0.92.0, 0.90.5 Reporter: Andrew Purtell Assignee: Andrew Purtell

In a couple of instances of short network outages, we have afterward observed zombie HBase processes attempting over and over to reconnect to expired ZooKeeper sessions. We believe this is due to ZOOKEEPER-1159. Opening this issue as a reference to that.
[jira] [Commented] (HBASE-4229) Replace Jettison JSON encoding with Jackson in HLogPrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-4229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088252#comment-13088252 ] Andrew Purtell commented on HBASE-4229: --- +1

Replace Jettison JSON encoding with Jackson in HLogPrettyPrinter
Key: HBASE-4229 URL: https://issues.apache.org/jira/browse/HBASE-4229 Project: HBase Issue Type: Improvement Components: wal Reporter: Riley Patterson Assignee: Riley Patterson Priority: Trivial Fix For: 0.92.0 Attachments: HBASE-4229.patch

HBase makes use of both jackson (in the region server) and jettison (in HLogPrettyPrinter) for JSON encoding. Jackson seems to be better maintained, so this patch standardizes by using jackson in HLogPrettyPrinter instead of jettison.
[jira] [Updated] (HBASE-4230) Compaction threads need names
[ https://issues.apache.org/jira/browse/HBASE-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4230: -- Status: Patch Available (was: Open)

Compaction threads need names
-
Key: HBASE-4230 URL: https://issues.apache.org/jira/browse/HBASE-4230 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.92.0 Reporter: Todd Lipcon Fix For: 0.92.0 Attachments: HBASE-4230.patch

The CompactSplitThread creates executors for doing compaction work, but threads end up named things like pool-2-thread-1 which isn't very useful.
[jira] [Updated] (HBASE-4230) Compaction threads need names
[ https://issues.apache.org/jira/browse/HBASE-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4230: -- Attachment: HBASE-4230.patch Perhaps like the attached?
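The usual way to get descriptive thread names (which may or may not be exactly what the attached patch does) is to hand the executor a naming ThreadFactory:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the standard fix for anonymous "pool-2-thread-1" names: a
// ThreadFactory that stamps each worker with a descriptive, numbered name.
// (Illustrative only -- the attached patch may differ in detail.)
public class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(0);

    public NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-" + counter.incrementAndGet());
        t.setDaemon(true); // worker threads should not block JVM exit
        return t;
    }

    public static void main(String[] args) {
        // e.g. threads named regionserver-compaction-1, -2, ...
        ExecutorService pool = Executors.newFixedThreadPool(
            2, new NamedThreadFactory("regionserver-compaction"));
        pool.shutdown();
    }
}
```

With names like this, jstack output and log lines immediately identify which pool a thread belongs to.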
[jira] [Commented] (HBASE-4230) Compaction threads need names
[ https://issues.apache.org/jira/browse/HBASE-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088263#comment-13088263 ] stack commented on HBASE-4230: -- +1
[jira] [Updated] (HBASE-4229) Replace Jettison JSON encoding with Jackson in HLogPrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-4229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4229: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Thanks for the patch Riley. Applied to TRUNK.
[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088266#comment-13088266 ] Matt Corgan commented on HBASE-4218: I lean towards byte-encoding ints whenever they're used often enough to have an impact on memory. KeyValue could probably do better with some VInts. You can encode 128 values in 1 byte and decode it with just one branch to check if b[0] 0. Given the number of other byte comparisons going during reading the key, that doesn't seem too heavyweight (especially since many of those other byte comparisons are casting the byte to a positive integer before comparing). If you reserved 2-4 bytes for that same number, then you may be doing even more work. One problem with VInt decoders is that sometimes they do bounds checking which can slow things down a lot. I think validation should be done at write time, and then possibly using a block-level checksum when a block is copied back into memory. Then assume everything is correct. For prefix compression, we're talking about encoding things at the block level where most of the ints are internal pointers that are less than the block size of 64k, so most ints can fit in 2 bytes. But it's important that they be able to grow gracefully when block sizes grow beyond 64k or are configured to be bigger. I've been using two types of encoded integers: VInt and FInt. FInts are basically an optimization over VInts for cases where you have many ints with the same characteristics, and can therefore store their width at the block level rather than encoding it in every occurrence. 
VInt (variable width int) * width is not known ahead of time, so must interpret byte-by-byte * slower because of branch on each byte, but still pretty fast * only 2^7 values/byte, so 2 bytes can hold 16k values FInt (fixed width int) * width is known ahead of time and stored externally (at block level in PtBlockMeta in this project) * an FInt is faster to encode decode because of the lack of if-statements * each byte can store 2^8 values, so 2 bytes gets you 64k values (hbase block size) * a list of these numbers provides random access. important for binary searching * if encoding the numbers 0-10,000, for example, then VInts will save you 1 byte on the numbers 0-255, but that is a small % savings. so use FInts for lists of numbers - Sidenote: I've been meaning to make a CVInt (comparable variable width int) that: * sorts based on raw bytes even if different widths (good for suffixing hbase row/colQualifier values) * to interpret, count the number of leading 1 bits, and that is how many additional bytes there are beyond the first byte * bits beyond the first 0 bit comprise the value * should also be faster to decode because of fewer branches Delta Encoding of KeyValues (aka prefix compression) - Key: HBASE-4218 URL: https://issues.apache.org/jira/browse/HBASE-4218 Project: HBase Issue Type: Improvement Components: io Reporter: Jacek Migdal Labels: compression A compression for keys. Keys are sorted in HFile and they are usually very similar. Because of that, it is possible to design better compression than general purpose algorithms, It is an additional step designed to be used in memory. It aims to save memory in cache as well as speeding seeks within HFileBlocks. It should improve performance a lot, if key lengths are larger than value lengths. For example, it makes a lot of sense to use it when value is a counter. 
Initial tests on real data (key length ~90 bytes, value length = 8 bytes) show that I could achieve a decent level of compression: key compression ratio: 92%; total compression ratio: 85%; LZO on the same data: 85%; LZO after delta encoding: 91%. Meanwhile it has much better performance (20-80% faster decompression than LZO). Moreover, it should allow far more efficient seeking, which should improve performance a bit. It seems that simple compression algorithms are good enough. Most of the savings are due to prefix compression, int128 encoding, timestamp diffs, and bitfields to avoid duplication. That way, comparisons of compressed data can be much faster than a byte comparator (thanks to prefix compression and bitfields). In order to implement it in HBase, two important changes in design will be needed:
- solidify the interface to HFileBlock / HFileReader Scanner to provide seeking and iterating; access to the uncompressed buffer in HFileBlock will have bad performance
- extend comparators to support comparison assuming that the first N bytes are equal (or that some fields are equal)
Link to a discussion about something similar: http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression --
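The core of the prefix-compression savings described above is that each sorted key stores only the length of the prefix it shares with its predecessor plus the remaining suffix. A minimal sketch of that idea (hypothetical key layout and sizes, not the proposed HFile format):

```java
public class PrefixDemo {
    // Length of the common prefix of two adjacent sorted keys.
    static int commonPrefix(String a, String b) {
        int n = Math.min(a.length(), b.length()), i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    public static void main(String[] args) {
        String[] keys = {"row0001/cf:qual", "row0001/cf:qual2", "row0002/cf:qual"};
        String prev = "";
        int raw = 0, stored = 0;
        for (String k : keys) {
            int p = commonPrefix(prev, k);
            raw += k.length();
            stored += k.length() - p;   // only (prefixLen, suffix) is written
            prev = k;
        }
        // A scanner decodes by rebuilding each key from its predecessor;
        // comparisons can also skip the bytes known to be shared.
        System.out.println("raw=" + raw + " stored=" + stored);
    }
}
```

Because decoding walks forward from the previous key, this format favors sequential scans; seeking efficiently inside a block is what the interface changes above are meant to support.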
[jira] [Updated] (HBASE-4230) Compaction threads need names
[ https://issues.apache.org/jira/browse/HBASE-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4230: -- Resolution: Fixed Assignee: Andrew Purtell Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to trunk. Compaction threads need names - Key: HBASE-4230 URL: https://issues.apache.org/jira/browse/HBASE-4230 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Andrew Purtell Fix For: 0.92.0 Attachments: HBASE-4230.patch The CompactSplitThread creates executors for doing compaction work, but threads end up named things like pool-2-thread-1 which isn't very useful. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
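The general fix for anonymous pool-N-thread-M names is to hand the executor a ThreadFactory that assigns meaningful names. A sketch of that technique (illustrative only, not the committed HBase patch; the "compaction" prefix is made up):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class NamedPool {
    // Produces threads named "<prefix>-1", "<prefix>-2", ... instead of the
    // default "pool-N-thread-M", which is what shows up in jstack dumps.
    static ThreadFactory named(String prefix) {
        AtomicInteger n = new AtomicInteger(1);
        return r -> {
            Thread t = new Thread(r, prefix + "-" + n.getAndIncrement());
            t.setDaemon(true);
            return t;
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2, named("compaction"));
        String name = pool.submit(() -> Thread.currentThread().getName()).get();
        System.out.println(name);
        pool.shutdown();
    }
}
```

With named threads, a stack trace from a stuck region server immediately shows which pool each thread belongs to.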
[jira] [Created] (HBASE-4236) Don't lock the stream while serializing the response
Don't lock the stream while serializing the response Key: HBASE-4236 URL: https://issues.apache.org/jira/browse/HBASE-4236 Project: HBase Issue Type: Improvement Components: ipc Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Benoit Sigoure Priority: Minor It is not necessary to hold the lock on the stream while the response is being serialized. This unnecessarily prevents serializing responses in parallel.
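The improvement can be illustrated generically (hypothetical class and method names, not the actual HBase IPC code): serialize the response into a thread-private buffer first, and hold the shared-stream lock only for the final write.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class ResponseWriter {
    private final OutputStream stream;       // shared connection stream
    private final Object streamLock = new Object();

    ResponseWriter(OutputStream stream) { this.stream = stream; }

    // Before: serialization happened inside the synchronized block, so only
    // one response could be serialized at a time.  After: each handler
    // serializes into its own buffer, and the lock covers only the write.
    void sendResponse(byte[] payload) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        buf.write(payload, 0, payload.length);   // stand-in for real serialization
        byte[] wire = buf.toByteArray();
        synchronized (streamLock) {
            try {
                stream.write(wire);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ResponseWriter w = new ResponseWriter(out);
        w.sendResponse(new byte[] {1, 2, 3});
        System.out.println(out.size());
    }
}
```

The lock is still needed so that two responses do not interleave their bytes on the wire, but it now protects only the cheap copy, not the expensive serialization.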
[jira] [Created] (HBASE-4237) Directly remove the call being handled from the map of outstanding RPCs
Directly remove the call being handled from the map of outstanding RPCs --- Key: HBASE-4237 URL: https://issues.apache.org/jira/browse/HBASE-4237 Project: HBase Issue Type: Improvement Components: ipc Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Benoit Sigoure Priority: Minor The client has to maintain a map of RPC ID to `Call' object for every outstanding RPC. When receiving a response, the client was getting the `Call' out of the map (one O(log n) operation) and then removing it from the map (another O(log n) operation). There is no benefit in not removing it directly from the map.
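In java.util.Map, remove(key) returns the value that was mapped, so the lookup and the removal collapse into a single traversal. A sketch of the pattern (the O(log n) wording above suggests a tree-based map, but the point holds for any Map; this sketch uses ConcurrentHashMap and made-up names):

```java
import java.util.concurrent.ConcurrentHashMap;

public class RpcMapDemo {
    // rpcId -> in-flight call (a String stands in for the `Call' object)
    static final ConcurrentHashMap<Integer, String> calls = new ConcurrentHashMap<>();

    public static void main(String[] args) {
        calls.put(42, "call-42");
        // Before: two traversals of the map.
        //   String c = calls.get(rpcId);
        //   calls.remove(rpcId);
        // After: one traversal; remove() hands back the mapping it deleted.
        String c = calls.remove(42);
        System.out.println("removed " + c + ", still present: " + calls.containsKey(42));
    }
}
```

On a concurrent map this is also safer: the single remove() is atomic, whereas a separate get() and remove() leave a window for another thread to race between them.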
[jira] [Commented] (HBASE-4237) Directly remove the call being handled from the map of outstanding RPCs
[ https://issues.apache.org/jira/browse/HBASE-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088299#comment-13088299 ] Benoit Sigoure commented on HBASE-4237: --- Patch @ https://github.com/tsuna/hbase/commit/1f602391ee4cd3d11eaf3067208caeadf214b3a8
[jira] [Commented] (HBASE-4237) Directly remove the call being handled from the map of outstanding RPCs
[ https://issues.apache.org/jira/browse/HBASE-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088304#comment-13088304 ] Ted Yu commented on HBASE-4237: --- +1 on patch.
[jira] [Commented] (HBASE-4236) Don't lock the stream while serializing the response
[ https://issues.apache.org/jira/browse/HBASE-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088305#comment-13088305 ] Ted Yu commented on HBASE-4236: --- +1 on patch.
[jira] [Commented] (HBASE-4199) blockCache summary - backend
[ https://issues.apache.org/jira/browse/HBASE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088307#comment-13088307 ] Ted Yu commented on HBASE-4199: --- Patch version 4 is in a committable state. Minor comments: In BlockCache:
{code}
+ public List<BlockCacheColumnFamilySummary> getBlockCacheColumnFamilySummary(Configuration conf) throws IOException {
{code}
I think getBlockCacheColumnFamilySummaries might be a better name. For HRegionInterface:
{code}
+ * Performs a BlockCache summary and returns a List of BlockCacheColumnFamily objects.
{code}
BlockCacheColumnFamilySummary objects are returned; again, the method name should pluralize Summaries. Good work, Doug. blockCache summary - backend Key: HBASE-4199 URL: https://issues.apache.org/jira/browse/HBASE-4199 Project: HBase Issue Type: Sub-task Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: java_HBASE_4199.patch, java_HBASE_4199_v2.patch, java_HBASE_4199_v3.patch, java_HBASE_4199_v4.patch This is the backend work for the blockCache summary: a change to the BlockCache interface, summarization in LruBlockCache, BlockCacheSummaryEntry, and additions to HRegionInterface and HRegionServer. This will NOT include any of the web UI or anything else like that; that is for another sub-task.
[jira] [Commented] (HBASE-4065) TableOutputFormat ignores failure to create table instance
[ https://issues.apache.org/jira/browse/HBASE-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088309#comment-13088309 ] Ted Yu commented on HBASE-4065: --- In mapred/TableOutputFormat.java:
{code}
} catch (IOException e) {
  LOG.error(e);
  throw e;
}
{code}
Should we make their behavior consistent? TableOutputFormat ignores failure to create table instance -- Key: HBASE-4065 URL: https://issues.apache.org/jira/browse/HBASE-4065 Project: HBase Issue Type: Bug Affects Versions: 0.90.3 Reporter: Todd Lipcon Assignee: Brock Noland Fix For: 0.94.0 Attachments: HBASE-4065.1.patch If TableOutputFormat in the new API fails to create a table, it simply logs this at ERROR level and then continues on its way. Then, the first write() to the table will throw an NPE since the table hasn't been set. Instead, it should probably rethrow the exception as a RuntimeException in setConf, or do what the old-API TOF does and not create the HTable instance until getRecordWriter, where it can throw an IOE.
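The first suggested fix (rethrow as a RuntimeException in setConf rather than only logging) can be sketched generically. Every name here is hypothetical; this is not HBase's actual TableOutputFormat code:

```java
import java.io.IOException;

public class SafeSetConf {
    private Object table;   // stand-in for the HTable field

    // Before: the catch block only logged, leaving `table` null and causing
    // a NullPointerException on the first write().  After: rethrow so the
    // job fails fast during setup, with the root cause attached.
    void setConf(String tableName) {
        try {
            table = connectToTable(tableName);
        } catch (IOException e) {
            throw new RuntimeException("Failed to create table instance for " + tableName, e);
        }
    }

    // Fake connection helper that fails like a missing/unreachable table.
    static Object connectToTable(String name) throws IOException {
        if (name == null) throw new IOException("no table name configured");
        return new Object();
    }

    public static void main(String[] args) {
        SafeSetConf c = new SafeSetConf();
        try {
            c.setConf(null);
        } catch (RuntimeException e) {
            System.out.println("fails fast: " + e.getMessage());
        }
    }
}
```

Failing in setConf surfaces the misconfiguration at job setup; the alternative mentioned above (deferring HTable creation to getRecordWriter) achieves the same fail-fast behavior while keeping the checked IOException.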