[jira] [Updated] (HBASE-4486) Improve Javadoc for HTableDescriptor
[ https://issues.apache.org/jira/browse/HBASE-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash Ashok updated HBASE-4486: --- Status: Patch Available (was: In Progress) Improve Javadoc for HTableDescriptor Key: HBASE-4486 URL: https://issues.apache.org/jira/browse/HBASE-4486 Project: HBase Issue Type: Improvement Components: client, documentation Reporter: Akash Ashok Assignee: Akash Ashok Priority: Minor Attachments: HBase-4486-v2.patch, HBase-4486.patch, HTableDescriptor-v2.html, HTableDescriptor.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4486) Improve Javadoc for HTableDescriptor
[ https://issues.apache.org/jira/browse/HBASE-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash Ashok updated HBASE-4486:
---
Attachment: HTableDescriptor-v2.html, HBase-4486-v2.patch

TestShell was failing, so I am attaching the v2 patch along with the Javadoc HTML. Ran all the tests.

Improve Javadoc for HTableDescriptor
---
Key: HBASE-4486
URL: https://issues.apache.org/jira/browse/HBASE-4486
Project: HBase
Issue Type: Improvement
Components: client, documentation
Reporter: Akash Ashok
Assignee: Akash Ashok
Priority: Minor
Attachments: HBase-4486-v2.patch, HBase-4486.patch, HTableDescriptor-v2.html, HTableDescriptor.html
[jira] [Commented] (HBASE-4589) CacheOnWrite broken in some cases because it can conflict with evictOnClose
[ https://issues.apache.org/jira/browse/HBASE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128380#comment-13128380 ]

Hudson commented on HBASE-4589:
---
Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/])
HBASE-4589 CacheOnWrite broken in some cases because it can conflict with evictOnClose (jgray)

jgray :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java

CacheOnWrite broken in some cases because it can conflict with evictOnClose
---
Key: HBASE-4589
URL: https://issues.apache.org/jira/browse/HBASE-4589
Project: HBase
Issue Type: Bug
Components: io
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
Fix For: 0.92.0
Attachments: HBASE-4589-v1.patch

The commit of HBASE-4078 added some extra StoreFile verification, which opens a StoreFile reader and then closes it to ensure there is no exception. If evict-on-close is on, which it is by default, this causes all blocks of a file to be evicted even though the file is still open.

At some point we need to add a boolean to the close call, the way we already have booleans for cacheBlocks, since we need to make localized decisions in some cases. In lots of places we can simply rely on cacheConf.shouldEvictOnClose(), so this shouldn't be too burdensome.
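The idea of passing the eviction decision into the close call can be sketched as follows. This is only an illustration under stated assumptions: `EvictOnCloseSketch` and all of its methods are hypothetical names, and the "cache" is just a set of file names rather than the real HBase block cache or reader API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model, not HBase code: tracks which files have cached blocks.
final class EvictOnCloseSketch {
    private final Set<String> filesWithCachedBlocks = new HashSet<>();

    /** Simulates cache-on-write: blocks of the file are now in the cache. */
    void cacheOnWrite(String fileName) {
        filesWithCachedBlocks.add(fileName);
    }

    boolean hasCachedBlocks(String fileName) {
        return filesWithCachedBlocks.contains(fileName);
    }

    /** The caller decides, per call, whether closing should also evict the file's blocks. */
    void closeReader(String fileName, boolean evictOnClose) {
        if (evictOnClose) {
            filesWithCachedBlocks.remove(fileName);
        }
    }

    /** Verification-style open/close that must leave the cache untouched. */
    void validate(String fileName) {
        closeReader(fileName, false);
    }
}
```

With this shape, a verification pass can close its reader without evicting, while ordinary callers could pass whatever the cache configuration says.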
[jira] [Commented] (HBASE-4556) Fix all incorrect uses of InternalScanner.next(...)
[ https://issues.apache.org/jira/browse/HBASE-4556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128378#comment-13128378 ]

Hudson commented on HBASE-4556:
---
Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/])
HBASE-4556 Fix all incorrect uses of InternalScanner.next(...)

larsh :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/HMerge.java

Fix all incorrect uses of InternalScanner.next(...)
---
Key: HBASE-4556
URL: https://issues.apache.org/jira/browse/HBASE-4556
Project: HBase
Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Attachments: 4556-v1.txt, 4556.txt

There are cases all over the code where InternalScanner.next(...) is not used correctly. I see this a lot:
{code}
while(scanner.next(...)) {
}
{code}
The correct pattern is:
{code}
boolean more = false;
do {
  more = scanner.next(...);
} while (more);
{code}
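To illustrate why the `while (scanner.next(...))` form can lose data, here is a small self-contained sketch. It assumes that `next` fills the result list and returns whether more rows remain after the one just delivered; `BatchScanner` is a hypothetical stand-in written for this example, not the real `InternalScanner` interface.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class ScannerLoopSketch {
    /** Hypothetical stand-in: fills `out` with the next row and reports whether more rows remain. */
    static final class BatchScanner {
        private final Iterator<String> rows;

        BatchScanner(List<String> rows) {
            this.rows = rows.iterator();
        }

        boolean next(List<String> out) {
            if (rows.hasNext()) {
                out.add(rows.next());
            }
            return rows.hasNext();
        }
    }

    /** Pattern flagged as incorrect: the row delivered together with `false` is never processed. */
    static List<String> drainWithWhile(BatchScanner scanner) {
        List<String> processed = new ArrayList<>();
        List<String> batch = new ArrayList<>();
        while (scanner.next(batch)) {
            processed.addAll(batch);
            batch.clear();
        }
        return processed;
    }

    /** Pattern from the issue: process the results first, then test the flag. */
    static List<String> drainWithDoWhile(BatchScanner scanner) {
        List<String> processed = new ArrayList<>();
        List<String> batch = new ArrayList<>();
        boolean more;
        do {
            more = scanner.next(batch);
            processed.addAll(batch);
            batch.clear();
        } while (more);
        return processed;
    }
}
```

Under that assumption, the `while` variant silently drops the final row, whereas the `do`/`while` variant processes every row, including when the scanner is empty.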
[jira] [Commented] (HBASE-4568) Make zk dump jsp response more quickly
[ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128377#comment-13128377 ]

Hudson commented on HBASE-4568:
---
Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/])
HBASE-4568 Make zk dump jsp response faster

nspiegelberg :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/RetryCounter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/trunk/src/main/resources/hbase-webapps/master/zk.jsp

Make zk dump jsp response more quickly
---
Key: HBASE-4568
URL: https://issues.apache.org/jira/browse/HBASE-4568
Project: HBase
Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
Fix For: 0.92.0, 0.94.0
Attachments: HBASE-4568.patch

1) For each zk dump, HBase currently creates a new zk client instance every time. This is quite slow when any machine in the quorum is dead, because it connects to each machine in the zk quorum again.
{code}
HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
Configuration conf = master.getConfiguration();
HBaseAdmin hbadmin = new HBaseAdmin(conf);
HConnection connection = hbadmin.getConnection();
ZooKeeperWatcher watcher = connection.getZooKeeperWatcher();
{code}
So we can simplify this:
{code}
HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
ZooKeeperWatcher watcher = master.getZooKeeperWatcher();
{code}

2) Also, when HBase calls getServerStats() for each machine in the zk quorum, the default timeout is hard-coded to 1 minute. It would be nice to make this configurable and set it to a low value. When HBase tries to connect to each machine in the zk quorum, it creates the socket, then sets the socket timeout, and reads with this timeout. This means HBase first creates the socket and connects to the zk server with a timeout of 0, which can take a long time, because a timeout of zero is interpreted as an infinite timeout: the connection then blocks until it is established or an error occurs.

3) The recoverable ZooKeeper should use real exponential backoff when there is a connection loss exception, which will give HBase a much longer time window to recover from zk machine failures.
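Points 2 and 3 can be sketched with plain JDK calls. This is a minimal illustration, not the patch itself: `ZkClientSketch`, `connectWithTimeout` and `sleepMillis` are hypothetical names, and the doubling-with-cap policy is an assumed form of exponential backoff. The only library calls used are `java.net.Socket.connect(SocketAddress, int)` and `Socket.setSoTimeout(int)`.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class ZkClientSketch {
    /**
     * Connects with a bounded connect timeout and a bounded read timeout.
     * timeoutMs must be positive; zero would mean "wait forever".
     */
    static Socket connectWithTimeout(String host, int port, int timeoutMs) throws IOException {
        Socket socket = new Socket(); // unconnected, so nothing blocks here
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    /** Sleep before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs. */
    static long sleepMillis(long baseMs, int attempt, long maxMs) {
        if (attempt < 0) {
            throw new IllegalArgumentException("attempt must be >= 0");
        }
        if (attempt >= 62) {
            return maxMs; // the shift below would overflow
        }
        long factor = 1L << attempt;
        if (baseMs > maxMs / factor) {
            return maxMs; // product would exceed the cap
        }
        return baseMs * factor;
    }
}
```

Creating the socket unconnected and then calling `connect` with an explicit timeout avoids the implicit connect with an infinite timeout described above; the cap in `sleepMillis` keeps the retry window long but bounded.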
[jira] [Commented] (HBASE-4551) Small fixes to compile against 0.23-SNAPSHOT
[ https://issues.apache.org/jira/browse/HBASE-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128379#comment-13128379 ]

Hudson commented on HBASE-4551:
---
Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/])
HBASE-4551 Fix pom and some test cases to compile and run against Hadoop 0.23

todd :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/pom.xml
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestWALObserver.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java

Small fixes to compile against 0.23-SNAPSHOT
---
Key: HBASE-4551
URL: https://issues.apache.org/jira/browse/HBASE-4551
Project: HBase
Issue Type: Bug
Components: build
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Fix For: 0.92.0
Attachments: hbase-4551.txt, hbase-4551.txt

- fix pom.xml to properly pull the test artifacts
- fix TestHLog to not use the private cluster.getNameNode() API
[jira] [Commented] (HBASE-4078) Silent Data Offlining During HDFS Flakiness
[ https://issues.apache.org/jira/browse/HBASE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128381#comment-13128381 ]

Hudson commented on HBASE-4078:
---
Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/])
HBASE-4078 Validate store files after flush/compaction

nspiegelberg :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java

Silent Data Offlining During HDFS Flakiness
---
Key: HBASE-4078
URL: https://issues.apache.org/jira/browse/HBASE-4078
Project: HBase
Issue Type: Bug
Components: io, regionserver
Affects Versions: 0.89.20100924, 0.90.3, 0.92.0
Reporter: Nicolas Spiegelberg
Assignee: Pritam Damania
Priority: Blocker
Fix For: 0.92.0, 0.94.0
Attachments: 0001-Validate-store-files-after-compactions-flushes.patch, 0001-Validate-store-files.patch

See HBASE-1436. The bug fix for that JIRA is a temporary workaround for improperly moving partially-written files from TMP into the region directory when a FS error occurs. Unfortunately, the fix is to ignore all IO exceptions, which masks off-lining due to FS flakiness. We need to permanently fix the problem that created HBASE-1436, and then at least have the option to not open a region during times of flaky FS.
[jira] [Commented] (HBASE-4597) [book] performance.xml Adding comment about EC2
[ https://issues.apache.org/jira/browse/HBASE-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128388#comment-13128388 ]

Hudson commented on HBASE-4597:
---
Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/])
HBASE-4597 performance.xml ec2 section

dmeil :
Files :
* /hbase/trunk/src/docbkx/performance.xml

[book] performance.xml Adding comment about EC2
---
Key: HBASE-4597
URL: https://issues.apache.org/jira/browse/HBASE-4597
Project: HBase
Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
Attachments: performance_HBASE_4597.xml.patch

I added a section under performance reminding people that running HBase on EC2 isn't the same thing as running on a dedicated server. This type of question seems to come up fairly often on the dist-list.
[jira] [Commented] (HBASE-4469) Avoid top row seek by looking up bloomfilter
[ https://issues.apache.org/jira/browse/HBASE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128383#comment-13128383 ]

Hudson commented on HBASE-4469:
---
Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/])
HBASE-4469 Avoid top row seek by looking up bloomfilter (liyin via jgray)

jgray :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java

Avoid top row seek by looking up bloomfilter
---
Key: HBASE-4469
URL: https://issues.apache.org/jira/browse/HBASE-4469
Project: HBase
Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
Fix For: 0.94.0
Attachments: HBASE-4469_1.patch

The problem is that when seeking for the row/col in the hfile, we go to the top of the row in order to check for a row delete marker (delete family). However, if the bloomfilter is enabled for the column family, then whenever a delete family operation is done on a row, the row has already been added to the bloomfilter. We can take advantage of this fact to avoid seeking to the top of the row.
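A rough sketch of the bloomfilter idea, under stated assumptions: `DeleteFamilyBloomSketch` is a hypothetical toy filter (a `BitSet` with two simple hash functions), not the HBase bloom filter implementation, and it assumes that every row with a delete family marker has been added to the filter. Because a bloom filter never gives false negatives, a negative answer means the seek to the top of the row can be skipped.

```java
import java.util.BitSet;

// Toy bloom filter for illustration only; sizes and hash functions are arbitrary.
final class DeleteFamilyBloomSketch {
    private static final int BITS = 1 << 12;
    private final BitSet bits = new BitSet(BITS);

    private static int h1(String key) {
        return Math.floorMod(key.hashCode(), BITS);
    }

    private static int h2(String key) {
        return Math.floorMod(key.hashCode() * 31 + key.length(), BITS);
    }

    /** Records that a delete family marker was written for this row. */
    void add(String rowKey) {
        bits.set(h1(rowKey));
        bits.set(h2(rowKey));
    }

    /** May return a false positive, never a false negative. */
    boolean mightContain(String rowKey) {
        return bits.get(h1(rowKey)) && bits.get(h2(rowKey));
    }

    /** Only rows the filter might contain need the seek to the top of the row. */
    boolean mustSeekToTopOfRow(String rowKey) {
        return mightContain(rowKey);
    }
}
```

A false positive only costs an unnecessary seek, so correctness does not depend on the quality of the hash functions.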
[jira] [Commented] (HBASE-4282) RegionServer should abort when WAL close encounters an error with unflushed edits
[ https://issues.apache.org/jira/browse/HBASE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128384#comment-13128384 ]

Hudson commented on HBASE-4282:
---
Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/])
HBASE-4282 RegionServer should abort when WAL close fails with unflushed edits

garyh :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRollAbort.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java

RegionServer should abort when WAL close encounters an error with unflushed edits
---
Key: HBASE-4282
URL: https://issues.apache.org/jira/browse/HBASE-4282
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0, 0.90.5
Reporter: Gary Helmling
Assignee: Gary Helmling
Priority: Blocker
Fix For: 0.92.0, 0.94.0, 0.90.5
Attachments: HBASE-4282_0.90_2.patch, HBASE-4282_0.90_final.patch, HBASE-4282_0.92_final.patch, HBASE-4282_trunk_2.patch, HBASE-4282_trunk_3.patch, HBASE-4282_trunk_final.patch, HBASE-4282_trunk_prelim.patch

The ability to ride over WAL close errors on log rolling added in HBASE-4222 could lead to missing HLog entries if:
* A table has DEFERRED_LOG_FLUSH=true
* There are unflushed WALEdit entries for that table in the current SequenceFile writer buffer

Since the writes were already acknowledged to the client, just ignoring the close error to allow for another log roll doesn't seem like the right thing to do here.

We could easily flag this state and only ride over the close error if there aren't unflushed entries. This would bring the above condition back to the previous behavior of aborting the region server. However, aborting the region server in this state is still guaranteeing data loss. Is there anything we can do better in this case?
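The proposed rule (ride over the close error only when nothing is unflushed) reduces to a small decision function. The sketch below is an illustration only; `WalCloseSketch`, `Action` and `onCloseError` are hypothetical names, and the count of unflushed entries is assumed to be tracked by the caller.

```java
final class WalCloseSketch {
    enum Action { RIDE_OVER, ABORT }

    /**
     * Decides what to do when closing the current log writer fails during a roll.
     * unflushedEntries is the number of acknowledged edits still sitting in the writer buffer.
     */
    static Action onCloseError(int unflushedEntries) {
        // Acknowledged but unflushed edits would be lost silently, so do not ride over.
        return unflushedEntries > 0 ? Action.ABORT : Action.RIDE_OVER;
    }
}
```

As the issue notes, aborting in this state does not save the buffered edits; it only makes the failure visible instead of silent.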
[jira] [Commented] (HBASE-4558) Refactor TestOpenedRegionHandler and TestOpenRegionHandler.
[ https://issues.apache.org/jira/browse/HBASE-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128385#comment-13128385 ]

Hudson commented on HBASE-4558:
---
Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/])
HBASE-4558 - Addendum for TestMasterFailOver (Ram)
HBASE-4558 Refactor TestOpenedRegionHandler and TestOpenRegionHandler. (Ram)

ramkrishna :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java

ramkrishna :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/MockRegionServerServices.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/MockServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/MockServer.java

Refactor TestOpenedRegionHandler and TestOpenRegionHandler.
---
Key: HBASE-4558
URL: https://issues.apache.org/jira/browse/HBASE-4558
Project: HBase
Issue Type: Improvement
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Minor
Fix For: 0.92.0
Attachments: HBASE-4558_1.patch, HBASE-4558_2.patch, HBASE-4558_3.patch

This is an improvement task taken up to refactor TestOpenedRegionHandler and TestOpenRegionHandler so that MockServer and MockRegionServerServices can be accessed from a common utility package. If we do this, one of the testcases in TestOpenedRegionHandler need not start up a cluster, and moving the mocks into a common package will also help in mocking the server for future testcases.
[jira] [Commented] (HBASE-4596) [book] chapter reordering
[ https://issues.apache.org/jira/browse/HBASE-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128387#comment-13128387 ]

Hudson commented on HBASE-4596:
---
Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/])
HBASE-4596 book.xml chapter reordering

dmeil :
Files :
* /hbase/trunk/src/docbkx/book.xml

[book] chapter reordering
---
Key: HBASE-4596
URL: https://issues.apache.org/jira/browse/HBASE-4596
Project: HBase
Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
Attachments: book_HBASE_4596.xml.patch

Since the book grew organically things just kept getting added to the end, whether or not it was the best place for it. The first 4 chapters stay the same, the change is aimed at the chapters after HBase Shell. I'm pushing the conceptual material up front, keeping the support chapters together, and keeping the Developing HBase at the end.

For example, right after the book introduces the shell, BAM! Write a MapReduce program! Even before you know how to create a table, or even what the overall datamodel is. Etc.

Before...
* Getting started
* Configuration
* Upgrading
* HBase Shell
* HBase and MapReduce
* HBase and Schema Design
* Metrics
* Cluster Replication
* Data Model
* Architecture
* Performance Tuning
* Troubleshooting
* Building HBase
* Developing HBase
* External APIs
* HBase Operational Mgt

After...
* Getting started
* Configuration
* Upgrading
* HBase Shell
* Data Model
* HBase and Schema Design
* HBase and MapReduce
* Architecture
* External APIs
* Performance Tuning
* Troubleshooting
* HBase Operational Mgt
* Building and Developing HBase

(In another Jira this week, Cluster Replication was put under HBase Operational Mgt, Metrics were put under HBase Operational Mgt, and Building HBase was moved under Developing HBase)
[jira] [Commented] (HBASE-3446) ProcessServerShutdown fails if META moves, orphaning lots of regions
[ https://issues.apache.org/jira/browse/HBASE-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128382#comment-13128382 ] Hudson commented on HBASE-3446: --- Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/]) HBASE-3446 ProcessServerShutdown fails if META moves, orphaning lots of regions stack : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/LocalHBaseCluster.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/catalog/MetaMigrationRemovingHTD.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Result.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/RetriesExhaustedException.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java * 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaNodeTracker.java * /hbase/trunk/src/main/ruby/hbase/admin.rb * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditorNoCluster.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestMetaMigration.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMaster.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java * /hbase/trunk/src/test/ruby/hbase/admin_test.rb * /hbase/trunk/src/test/ruby/shell/shell_test.rb ProcessServerShutdown fails if META moves, orphaning lots of regions Key: HBASE-3446 URL: https://issues.apache.org/jira/browse/HBASE-3446 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.0 Reporter: Todd Lipcon Assignee: stack Priority: 
Blocker Fix For: 0.92.0 Attachments: 3446-v11.txt, 3446-v12.txt, 3446-v13.txt, 3446-v14.txt, 3446-v2.txt, 3446-v3.txt, 3446-v4.txt, 3446-v7.txt, 3446-v9.txt, 3446.txt, 3446v15.txt, 3446v23.txt

I ran a rolling restart on a 5 node cluster with lots of regions, and afterwards had LOTS of regions left orphaned. The issue appears to be that ProcessServerShutdown failed because the server hosting META was restarted around the same time as another server was being processed.
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128386#comment-13128386 ]

Hudson commented on HBASE-3417:
---
Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/])
HBASE-3417 CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme (jgray)

jgray :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
---
Key: HBASE-3417
URL: https://issues.apache.org/jira/browse/HBASE-3417
Project: HBase
Issue Type: Bug
Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
Fix For: 0.92.0
Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch

Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to the block naming code to make it more consistent, but did not do enough. We should add a separate method somewhere for generating block names using a more easily mocked scheme (rather than just the raw path, as we generate a random unique file name twice, once for tmp and then again when the file is moved into place).
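One possible shape for such a naming method is sketched below. It is an assumption-laden illustration, not the committed scheme: `BlockNameSketch`, `fileNameOf` and `blockName` are hypothetical, and the sketch presumes the file keeps the same name when it is moved from the temporary directory into place (which, per the description above, was not yet the case).

```java
final class BlockNameSketch {
    /** Last path component, e.g. "/a/b/.tmp/abc123" becomes "abc123". */
    static String fileNameOf(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Hypothetical scheme: file name plus block offset, independent of the directory. */
    static String blockName(String path, long offset) {
        return fileNameOf(path) + "_" + offset;
    }
}
```

Because the directory is ignored, blocks cached while the file is still under the temporary path would be found again under the final path, and tests can supply any file name they like.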
[jira] [Updated] (HBASE-4598) [book] adding HDFS information, updating FAQ with an EC2 reference
[ https://issues.apache.org/jira/browse/HBASE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Meil updated HBASE-4598:
---
Attachment: docbkx_HBASE_4598.patch

[book] adding HDFS information, updating FAQ with an EC2 reference
---
Key: HBASE-4598
URL: https://issues.apache.org/jira/browse/HBASE-4598
Project: HBase
Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
Attachments: docbkx_HBASE_4598.patch

book.xml
* Moved EC2 remote connection question in FAQ to Troubleshooting chapter.
* Created new general EC2 entry in FAQ with pointers to EC2 sections in Perf and Trouble chapters.
* Added HDFS section in Architecture chapter, with link to Hadoop HDFS documentation.
** These types of questions come up from time-to-time on the dist-list.

Performance.xml
* Added section in Performance chapter for HDFS
** One sub-section is a link to the umbrella Jira for HDFS tickets for low-latency reads.
** Another is a section on HBase vs. HDFS performance in a batch context.

Trouble.xml
* Moving EC2 entry from FAQ
[jira] [Updated] (HBASE-4598) [book] adding HDFS information, updating FAQ with an EC2 reference
[ https://issues.apache.org/jira/browse/HBASE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Meil updated HBASE-4598:
---
Status: Patch Available (was: Open)
[jira] [Created] (HBASE-4598) [book] adding HDFS information, updating FAQ with an EC2 reference
[book] adding HDFS information, updating FAQ with an EC2 reference
---
Key: HBASE-4598
URL: https://issues.apache.org/jira/browse/HBASE-4598
Project: HBase
Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
Attachments: docbkx_HBASE_4598.patch
[jira] [Updated] (HBASE-4598) [book] adding HDFS information, updating FAQ with an EC2 reference
[ https://issues.apache.org/jira/browse/HBASE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-4598: - Resolution: Fixed Status: Resolved (was: Patch Available) [book] adding HDFS information, updating FAQ with an EC2 reference -- Key: HBASE-4598 URL: https://issues.apache.org/jira/browse/HBASE-4598 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: docbkx_HBASE_4598.patch book.xml * Moved EC2 remote connection question in FAQ to Troubleshooting chapter. * Created new general EC2 entry in FAQ with pointers to EC2 sections in Perf and Trouble chapters. * Added HDFS section in Architecture chapter, with link to Hadoop HDFS documentation. ** These types of questions come up from time to time on the dist-list. Performance.xml * Added section in Performance chapter for HDFS. ** One sub-section is a link to the umbrella Jira for HDFS tickets for low-latency reads. ** Another is a section on HBase vs. HDFS performance in a batch context. Trouble.xml * Moved EC2 entry from FAQ.
[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows
[ https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128415#comment-13128415 ] Ted Yu commented on HBASE-4536: ---
bq. Thinking about a ScanConfig (or ScanInfo)
ScanInfo seems to be a better name.
bq. And then maybe a ScanType enum
This is good.
bq. a delete cell does not increase the version count
This should be fine.
Allow CF to retain deleted rows --- Key: HBASE-4536 URL: https://issues.apache.org/jira/browse/HBASE-4536 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 0.92.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0
Parent allows for a cluster to retain rows for a TTL or keep a minimum number of versions. However, if a client deletes a row, all versions older than the delete tombstone will be removed at the next major compaction (and even at memstore flush - see HBASE-4241). There should be a way to retain those versions to guard against software error. I see two options here:
1. Add a new flag to HColumnDescriptor, something like RETAIN_DELETED.
2. Fold this into the parent change, i.e. keep the minimum number of versions even past the delete marker.
#1 would allow for more flexibility. #2 comes somewhat naturally with parent (from a user viewpoint). Comments? Any other options?
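Editor's sketch for option 2 above: a toy model, not HBase code, of a major compaction that keeps a minimum number of versions even past a delete marker. The class name, method name, and the rule that the delete marker itself is not counted as a version (taken from the quoted remark in the comment) are assumptions made for illustration; the real semantics were still under discussion in this thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical toy model; none of these names exist in HBase.
class RetainDeletedSketch {
    /**
     * Returns the timestamps of the puts that survive a (toy) major compaction,
     * newest first. A put is covered by the delete marker when its timestamp is
     * at or below the marker's timestamp. Covered puts survive only while fewer
     * than minVersions puts have been kept; the marker itself is not counted.
     */
    static List<Long> survivors(List<Long> putTimestamps, long deleteTimestamp, int minVersions) {
        List<Long> newestFirst = new ArrayList<>(putTimestamps);
        newestFirst.sort(Collections.reverseOrder());
        List<Long> kept = new ArrayList<>();
        for (long ts : newestFirst) {
            boolean coveredByDelete = ts <= deleteTimestamp;
            if (!coveredByDelete || kept.size() < minVersions) {
                kept.add(ts);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Long> puts = List.of(10L, 20L, 30L, 40L);
        // minVersions = 0 models today's behaviour: everything under the marker goes.
        System.out.println(survivors(puts, 35L, 0));
        // minVersions = 2 models option 2: some covered versions are retained.
        System.out.println(survivors(puts, 35L, 2));
    }
}
```

With minVersions set to 0 the sketch drops every put under the marker, which is the behaviour the issue wants to guard against; a positive minVersions keeps the newest covered puts around.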
[jira] [Updated] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4563: -- Summary: When error occurs in this.parent.close(false) of split, the split region cannot write read (was: When split doing this.parent.close(false) occurs error,it'll cause the splited region cann't write read) When error occurs in this.parent.close(false) of split, the split region cannot write read Key: HBASE-4563 URL: https://issues.apache.org/jira/browse/HBASE-4563 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4, 0.92.0 Reporter: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt
Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below to mock an HDFS error.
{code:title=SplitTransaction.java|borderStyle=solid}
List<StoreFile> hstoreFilesToSplit = this.parent.close(false);
throw new IOException("some unexpected error in close store files");
{code}
2. Update the regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Scan the table; the scan fails.
The attached patch fixes the bug.
[jira] [Updated] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4563: -- Assignee: bluedavy Summary: When error occurs in this.parent.close(false) of split, the split region cannot write or read (was: When error occurs in this.parent.close(false) of split, the split region cannot write read) When error occurs in this.parent.close(false) of split, the split region cannot write or read - Key: HBASE-4563 URL: https://issues.apache.org/jira/browse/HBASE-4563 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4, 0.92.0 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt
Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below to mock an HDFS error.
{code:title=SplitTransaction.java|borderStyle=solid}
List<StoreFile> hstoreFilesToSplit = this.parent.close(false);
throw new IOException("some unexpected error in close store files");
{code}
2. Update the regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Scan the table; the scan fails.
The attached patch fixes the bug.
[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128418#comment-13128418 ] Ted Yu commented on HBASE-4562: --- In JIRA description:
bq. 5. kill the regionserver hosted the table;
When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
if (!testing) {
  MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
    this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
  throw new IOException("some unexpected error in split");
}
{code}
2. Update the regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Kill the regionserver hosting the table.
6. Wait until the master's ServerShutdownHandler.process has executed, then scan the table; the data written earlier is lost.
The attached patch fixes the bug.
[jira] [Created] (HBASE-4599) [book] performance.xml - nit grammatical error in EC2 section
[book] performance.xml - nit grammatical error in EC2 section - Key: HBASE-4599 URL: https://issues.apache.org/jira/browse/HBASE-4599 Project: HBase Issue Type: Bug Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial
[jira] [Updated] (HBASE-4599) [book] performance.xml - nit grammatical error in EC2 section
[ https://issues.apache.org/jira/browse/HBASE-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-4599: - Attachment: performance_HBASE_4599.xml.patch [book] performance.xml - nit grammatical error in EC2 section - Key: HBASE-4599 URL: https://issues.apache.org/jira/browse/HBASE-4599 Project: HBase Issue Type: Bug Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: performance_HBASE_4599.xml.patch
[jira] [Updated] (HBASE-4599) [book] performance.xml - nit grammatical error in EC2 section
[ https://issues.apache.org/jira/browse/HBASE-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-4599: - Status: Patch Available (was: Open) [book] performance.xml - nit grammatical error in EC2 section - Key: HBASE-4599 URL: https://issues.apache.org/jira/browse/HBASE-4599 Project: HBase Issue Type: Bug Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: performance_HBASE_4599.xml.patch
[jira] [Updated] (HBASE-4599) [book] performance.xml - nit grammatical error in EC2 section
[ https://issues.apache.org/jira/browse/HBASE-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-4599: - Resolution: Fixed Status: Resolved (was: Patch Available) [book] performance.xml - nit grammatical error in EC2 section - Key: HBASE-4599 URL: https://issues.apache.org/jira/browse/HBASE-4599 Project: HBase Issue Type: Bug Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: performance_HBASE_4599.xml.patch
[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128422#comment-13128422 ] Ted Yu commented on HBASE-4562: --- If offlineParentInMeta() times out, SplitRequest.run() would execute the following code:
{code}
if (st.rollback(this.server, this.server)) {
  LOG.info("Successful rollback of failed split of " + parent.getRegionNameAsString());
} else {
  this.server.abort("Abort; we got an error after point-of-no-return");
}
{code}
I agree that the comments should be consistent in all patches.
When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
if (!testing) {
  MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
    this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
  throw new IOException("some unexpected error in split");
}
{code}
2. Update the regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Kill the regionserver hosting the table.
6. Wait until the master's ServerShutdownHandler.process has executed, then scan the table; the data written earlier is lost.
The attached patch fixes the bug.
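Editor's sketch of the rollback-or-abort decision quoted in the comment above, as a self-contained toy. It is not the SplitTransaction code and not the attached patch: the class, the journal entries, the FailAt injection points, and the choice to record the point of no return before the meta edit are all assumptions made to illustrate why a failure around the meta edit has to end in an abort rather than a rollback.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy model of a split with a journal; names are illustrative only.
class SplitRollbackSketch {
    enum JournalEntry { CLOSED_PARENT, PONR }
    enum FailAt { NONE, AFTER_CLOSE, AFTER_META_EDIT }

    private final Deque<JournalEntry> journal = new ArrayDeque<>();
    boolean parentOnline = true;
    boolean parentOfflineInMeta = false;

    /** Runs the toy split and throws at the injected failure point, if any. */
    void execute(FailAt failAt) throws IOException {
        parentOnline = false;
        journal.push(JournalEntry.CLOSED_PARENT);
        if (failAt == FailAt.AFTER_CLOSE) {
            throw new IOException("injected error after closing the parent");
        }
        // Assumption: the point of no return is journaled BEFORE the meta edit,
        // so an error that surfaces after the edit took effect cannot be rolled back.
        journal.push(JournalEntry.PONR);
        parentOfflineInMeta = true;
        if (failAt == FailAt.AFTER_META_EDIT) {
            throw new IOException("injected error after the meta edit");
        }
    }

    /** Undoes journaled steps newest first; returns false once the PONR is reached. */
    boolean rollback() {
        while (!journal.isEmpty()) {
            JournalEntry entry = journal.pop();
            if (entry == JournalEntry.PONR) {
                return false;
            }
            if (entry == JournalEntry.CLOSED_PARENT) {
                parentOnline = true; // reopen the parent region
            }
        }
        return true;
    }

    /** Mirrors the shape of the quoted caller: roll back if possible, else abort. */
    static String run(FailAt failAt) {
        SplitRollbackSketch st = new SplitRollbackSketch();
        try {
            st.execute(failAt);
            return "SPLIT_DONE";
        } catch (IOException e) {
            return st.rollback() ? "ROLLED_BACK" : "ABORT";
        }
    }

    public static void main(String[] args) {
        for (FailAt failAt : FailAt.values()) {
            System.out.println(failAt + " -> " + run(failAt));
        }
    }
}
```

In this model a failure after closing the parent is recoverable by reopening it, while a failure after the meta edit leaves meta saying the parent is offline, so reopening the parent would be inconsistent and the only safe outcome is an abort.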
[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4562: -- Comment: was deleted (was: In JIRA description: bq. 5. kill the regionserver hosted the table; ) When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
if (!testing) {
  MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
    this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
  throw new IOException("some unexpected error in split");
}
{code}
2. Update the regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Kill the regionserver hosting the table.
6. Wait until the master's ServerShutdownHandler.process has executed, then scan the table; the data written earlier is lost.
The attached patch fixes the bug.
[jira] [Commented] (HBASE-4070) [Coprocessors] Improve region server metrics to report loaded coprocessors to master
[ https://issues.apache.org/jira/browse/HBASE-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128436#comment-13128436 ] jirapos...@reviews.apache.org commented on HBASE-4070: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2029/ --- (Updated 2011-10-16 16:22:16.729481) Review request for hbase and Mingjie Lai. Changes --- Updated 'testing done' section. Summary --- Proposed fix for HBASE-4070. This addresses bug HBASE-4070. https://issues.apache.org/jira/browse/HBASE-4070 Diffs - src/main/jamon/org/apache/hbase/tmpl/master/MasterStatusTmpl.jamon abeb850 src/main/jamon/org/apache/hbase/tmpl/regionserver/RSStatusTmpl.jamon be6fceb src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 01bc1dd src/main/java/org/apache/hadoop/hbase/HServerLoad.java 0c680e4 src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 92c959c src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 7d2f82e src/main/java/org/apache/hadoop/hbase/master/HMaster.java 50b49a6 src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e2e694a src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java eda5a9b Diff: https://reviews.apache.org/r/2029/diff Testing (updated) --- Two new tests : testRegionServerCoprocessorReported() and testMasterServerCoprocessorsReported() added to (existing) src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java. Thanks, Eugene [Coprocessors] Improve region server metrics to report loaded coprocessors to master Key: HBASE-4070 URL: https://issues.apache.org/jira/browse/HBASE-4070 Project: HBase Issue Type: Improvement Affects Versions: 0.90.3 Reporter: Mingjie Lai Assignee: Eugene Koontz Attachments: HBASE-4070.patch, HBASE-4070.patch, HBASE-4070.patch, HBASE-4070.patch, master-web-ui.jpg, rs-status-web-ui.jpg HBASE-3512 is about listing loaded cp classes at shell. 
To make it more generic, we need a way to report this piece of information from region to master (or just at region server level). So later on, we can display the loaded class names at shell as well as web console.
[jira] [Commented] (HBASE-4599) [book] performance.xml - nit grammatical error in EC2 section
[ https://issues.apache.org/jira/browse/HBASE-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128444#comment-13128444 ] Hudson commented on HBASE-4599: --- Integrated in HBase-TRUNK #2326 (See [https://builds.apache.org/job/HBase-TRUNK/2326/]) HBASE-4599. performance.xml - correcting small error in EC2 section dmeil : Files : * /hbase/trunk/src/docbkx/performance.xml [book] performance.xml - nit grammatical error in EC2 section - Key: HBASE-4599 URL: https://issues.apache.org/jira/browse/HBASE-4599 Project: HBase Issue Type: Bug Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: performance_HBASE_4599.xml.patch
[jira] [Commented] (HBASE-4598) [book] adding HDFS information, updating FAQ with an EC2 reference
[ https://issues.apache.org/jira/browse/HBASE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128443#comment-13128443 ] Hudson commented on HBASE-4598: --- Integrated in HBase-TRUNK #2326 (See [https://builds.apache.org/job/HBase-TRUNK/2326/]) HBASE-4598 book update (book.xml, perf.xml, trouble.xml) dmeil : Files : * /hbase/trunk/src/docbkx/book.xml * /hbase/trunk/src/docbkx/performance.xml * /hbase/trunk/src/docbkx/troubleshooting.xml [book] adding HDFS information, updating FAQ with an EC2 reference -- Key: HBASE-4598 URL: https://issues.apache.org/jira/browse/HBASE-4598 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: docbkx_HBASE_4598.patch book.xml * Moved EC2 remote connection question in FAQ to Troubleshooting chapter. * Created new general EC2 entry in FAQ with pointers to EC2 sections in Perf and Trouble chapters. * Added HDFS section in Architecture chapter, with link to Hadoop HDFS documentation. ** These types of questions come up from time to time on the dist-list. Performance.xml * Added section in Performance chapter for HDFS. ** One sub-section is a link to the umbrella Jira for HDFS tickets for low-latency reads. ** Another is a section on HBase vs. HDFS performance in a batch context. Trouble.xml * Moved EC2 entry from FAQ.
[jira] [Created] (HBASE-4600) [book] book.xml comment about what to do instead of using explicit timestamp, minor reformatting in KeyValue example
[book] book.xml comment about what to do instead of using explicit timestamp, minor reformatting in KeyValue example Key: HBASE-4600 URL: https://issues.apache.org/jira/browse/HBASE-4600 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor book.xml * further explanation on what to do instead of using explicit Put timestamp. * minor reformatting in KeyValue example.
[jira] [Updated] (HBASE-4600) [book] book.xml comment about what to do instead of using explicit timestamp, minor reformatting in KeyValue example
[ https://issues.apache.org/jira/browse/HBASE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-4600: - Status: Patch Available (was: Open) [book] book.xml comment about what to do instead of using explicit timestamp, minor reformatting in KeyValue example Key: HBASE-4600 URL: https://issues.apache.org/jira/browse/HBASE-4600 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book_HBASE_4600.xml.patch book.xml * further explanation on what to do instead of using explicit Put timestamp. * minor reformatting in KeyValue example.
[jira] [Updated] (HBASE-4600) [book] book.xml comment about what to do instead of using explicit timestamp, minor reformatting in KeyValue example
[ https://issues.apache.org/jira/browse/HBASE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-4600: - Resolution: Fixed Status: Resolved (was: Patch Available) [book] book.xml comment about what to do instead of using explicit timestamp, minor reformatting in KeyValue example Key: HBASE-4600 URL: https://issues.apache.org/jira/browse/HBASE-4600 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book_HBASE_4600.xml.patch book.xml * further explanation on what to do instead of using explicit Put timestamp. * minor reformatting in KeyValue example.
[jira] [Updated] (HBASE-4600) [book] book.xml comment about what to do instead of using explicit timestamp, minor reformatting in KeyValue example
[ https://issues.apache.org/jira/browse/HBASE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-4600: - Attachment: book_HBASE_4600.xml.patch [book] book.xml comment about what to do instead of using explicit timestamp, minor reformatting in KeyValue example Key: HBASE-4600 URL: https://issues.apache.org/jira/browse/HBASE-4600 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book_HBASE_4600.xml.patch book.xml * further explanation on what to do instead of using explicit Put timestamp. * minor reformatting in KeyValue example.
[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix
[ https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128529#comment-13128529 ] Kannan Muthukkaruppan commented on HBASE-3443: -- Now that we have lazy seeks, i.e. HBASE-4465, we should be able to revert the work/optimization done in HBASE-3082, and avoid this bug. What do you folks think?
ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix -- Key: HBASE-3443 URL: https://issues.apache.org/jira/browse/HBASE-3443 Project: HBase Issue Type: Bug Reporter: Kannan Muthukkaruppan
For incrementColumnValue(), HBASE-3082 adds an optimization to check the memstore first and to check the store files only if the column is not present in the memstore. In the presence of deletes, this optimization is not reliable. If the column is marked as deleted in the memstore, one should not look further into the store files, but currently the code does so. Sample test code outline:
{code}
admin.createTable(desc)
table = HTable.new(conf, tableName)
table.incrementColumnValue(Bytes.toBytes("row"), cf1name, Bytes.toBytes("column"), 5);
admin.flush(tableName)
sleep(2)
del = Delete.new(Bytes.toBytes("row"))
table.delete(del)
table.incrementColumnValue(Bytes.toBytes("row"), cf1name, Bytes.toBytes("column"), 5);
get = Get.new(Bytes.toBytes("row"))
keyValues = table.get(get).raw()
keyValues.each do |keyValue|
  puts "Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}";
end
{code}
The above prints:
{code}
Expect 5; Got Value=10
{code}
[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix
[ https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128530#comment-13128530 ] Ted Yu commented on HBASE-3443: --- +1 on the proposal.
ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix -- Key: HBASE-3443 URL: https://issues.apache.org/jira/browse/HBASE-3443 Project: HBase Issue Type: Bug Reporter: Kannan Muthukkaruppan
For incrementColumnValue(), HBASE-3082 adds an optimization to check the memstore first and to check the store files only if the column is not present in the memstore. In the presence of deletes, this optimization is not reliable. If the column is marked as deleted in the memstore, one should not look further into the store files, but currently the code does so. Sample test code outline:
{code}
admin.createTable(desc)
table = HTable.new(conf, tableName)
table.incrementColumnValue(Bytes.toBytes("row"), cf1name, Bytes.toBytes("column"), 5);
admin.flush(tableName)
sleep(2)
del = Delete.new(Bytes.toBytes("row"))
table.delete(del)
table.incrementColumnValue(Bytes.toBytes("row"), cf1name, Bytes.toBytes("column"), 5);
get = Get.new(Bytes.toBytes("row"))
keyValues = table.get(get).raw()
keyValues.each do |keyValue|
  puts "Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}";
end
{code}
The above prints:
{code}
Expect 5; Got Value=10
{code}
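Editor's sketch of the behaviour described in HBASE-3443, as a self-contained toy rather than HBase code. The class, its single-column model, and the sequence numbers are assumptions for illustration; the only thing taken from the issue is the rule being modelled: a memstore-first read falls through to the store files when the memstore yields no live value, even if that is because the memstore holds a delete marker.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of one counter column; names are illustrative only.
class IcvMemstoreFirstSketch {
    /** A put carrying a counter value, or a delete marker, ordered by sequence number. */
    static final class Cell {
        final long seq;
        final boolean delete;
        final long value;

        Cell(long seq, boolean delete, long value) {
            this.seq = seq;
            this.delete = delete;
            this.value = value;
        }
    }

    private final List<Cell> memstore = new ArrayList<>();
    private final List<Cell> storeFiles = new ArrayList<>();
    private final boolean memstoreFirst;
    private long nextSeq = 1;

    IcvMemstoreFirstSketch(boolean memstoreFirst) {
        this.memstoreFirst = memstoreFirst;
    }

    void flush() {
        storeFiles.addAll(memstore);
        memstore.clear();
    }

    void deleteRow() {
        memstore.add(new Cell(nextSeq++, true, 0L));
    }

    long increment(long amount) {
        Long current = memstoreFirst ? readMemstoreFirst() : readMerged();
        long updated = (current == null ? 0L : current) + amount;
        memstore.add(new Cell(nextSeq++, false, updated));
        return updated;
    }

    /** Live value in the given cells, or null if there are none or the newest is a delete. */
    private static Long resolve(List<Cell> cells) {
        Cell newest = null;
        for (Cell c : cells) {
            if (newest == null || c.seq > newest.seq) {
                newest = c;
            }
        }
        if (newest == null || newest.delete) {
            return null;
        }
        return Long.valueOf(newest.value);
    }

    /** Modelled bug: "nothing live in the memstore" is treated as "go ask the store files". */
    private Long readMemstoreFirst() {
        Long fromMemstore = resolve(memstore);
        return fromMemstore != null ? fromMemstore : resolve(storeFiles);
    }

    /** Merged view: the delete marker in the memstore masks the older flushed put. */
    private Long readMerged() {
        List<Cell> all = new ArrayList<>(storeFiles);
        all.addAll(memstore);
        return resolve(all);
    }

    /** Same sequence as the test outline: increment, flush, delete, increment. */
    static long runScenario(boolean memstoreFirst) {
        IcvMemstoreFirstSketch table = new IcvMemstoreFirstSketch(memstoreFirst);
        table.increment(5);
        table.flush();
        table.deleteRow();
        return table.increment(5);
    }

    public static void main(String[] args) {
        System.out.println("memstore-first: " + runScenario(true));
        System.out.println("merged view:    " + runScenario(false));
    }
}
```

In this model the memstore-first read yields 10 for the final increment, matching the output reported in the issue, while a read over the merged view yields the expected 5.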
[jira] [Commented] (HBASE-4600) [book] book.xml comment about what to do instead of using explicit timestamp, minor reformatting in KeyValue example
[ https://issues.apache.org/jira/browse/HBASE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128537#comment-13128537 ] Hudson commented on HBASE-4600: --- Integrated in HBase-TRUNK #2328 (See [https://builds.apache.org/job/HBase-TRUNK/2328/]) HBASE-4600 book.xml dmeil : Files : * /hbase/trunk/src/docbkx/book.xml [book] book.xml comment about what to do instead of using explicit timestamp, minor reformatting in KeyValue example Key: HBASE-4600 URL: https://issues.apache.org/jira/browse/HBASE-4600 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book_HBASE_4600.xml.patch book.xml * further explanation on what to do instead of using explicit Put timestamp. * minor reformatting in KeyValue example.
[jira] [Commented] (HBASE-4511) There is data loss when master failovers
[ https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128554#comment-13128554 ] gaojinchao commented on HBASE-4511: --- This cannot be reproduced in a real cluster, so I am downgrading its priority. There is data loss when master failovers Key: HBASE-4511 URL: https://issues.apache.org/jira/browse/HBASE-4511 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0 Reporter: gaojinchao Priority: Critical Fix For: 0.92.0 Attachments: org.apache.hadoop.hbase.master.TestMasterFailover-output.rar It goes like this: The master crashed while the RS hosting meta was crashing, but the RS didn't exit. The master starts up again and finds all living RSs. The master's verification of meta fails because this RS is crashing. The master reassigns meta, but it doesn't split the HLog, so some meta data is lost. Logs from the failing failover test case: //The test asks to kill an RS 2011-09-28 19:54:45,694 INFO [Thread-988] regionserver.HRegionServer(1443): STOPPED: Killing for unit test 2011-09-28 19:54:45,694 INFO [Thread-988] master.TestMasterFailover(1007): RS 192.168.2.102,54385,1317264874629 killed //The RS didn't crash.
2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.HMaster(458): Registering server found up in zk: 192.168.2.102,54385,1317264874629 2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.ServerManager(232): Registering server=192.168.2.102,54385,1317264874629 2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of znode /hbase/unassigned/1028785192 because node does not exist (not an error) 2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... //Meta verification failed and the master reassigned the meta, so all the regions in the meta are lost. 2011-09-28 19:54:51,773 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... 
2011-09-28 19:54:52,277 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... 2011-09-28 19:54:52,782 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKAssign(264): master:54557-0x132b31adbb30005 Creating (or updating) unassigned node for 1028785192 with OFFLINE state 2011-09-28 19:54:52,825 DEBUG [Thread-988-EventThread] zookeeper.ZooKeeperWatcher(233): master:54557-0x132b31adbb30005 Received
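The failure sequence described in HBASE-4511 (meta is reassigned without first splitting the old server's log) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `MetaReassignSketch` and all of its methods are hypothetical names invented for illustration and are not HBase classes, and the model reduces the catalog to a map plus a list of unflushed edits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names are HBase classes.
class MetaReassignSketch {
    // Catalog rows that were already persisted before the server started crashing.
    private final Map<String, String> flushed = new HashMap<>();
    // Catalog edits that exist only in the write-ahead log of the crashing server.
    private final List<String[]> unflushedLog = new ArrayList<>();

    void flushedRow(String region, String location) {
        flushed.put(region, location);
    }

    void loggedEdit(String region, String location) {
        unflushedLog.add(new String[] {region, location});
    }

    // Reopen the catalog on a new server, optionally replaying the old server's log first.
    Map<String, String> reassign(boolean replayLogFirst) {
        Map<String, String> reopened = new HashMap<>(flushed);
        if (replayLogFirst) {
            for (String[] edit : unflushedLog) {
                reopened.put(edit[0], edit[1]);
            }
        }
        return reopened;
    }

    // One persisted row plus one log-only edit, then a reassignment.
    static int rowsVisibleAfterReassign(boolean replayLogFirst) {
        MetaReassignSketch meta = new MetaReassignSketch();
        meta.flushedRow("region-a", "rs1");
        meta.loggedEdit("region-b", "rs2");
        return meta.reassign(replayLogFirst).size();
    }

    public static void main(String[] args) {
        System.out.println("without log replay: " + rowsVisibleAfterReassign(false));
        System.out.println("with log replay: " + rowsVisibleAfterReassign(true));
    }
}
```

In this model the edit that lives only in the log disappears when the catalog is reopened without replay, which mirrors the report's statement that some meta data is lost when the HLog is not split.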
[jira] [Updated] (HBASE-4511) There is data loss when master failovers
[ https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaojinchao updated HBASE-4511: -- Priority: Minor (was: Critical) There is data loss when master failovers Key: HBASE-4511 URL: https://issues.apache.org/jira/browse/HBASE-4511 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0 Reporter: gaojinchao Priority: Minor Fix For: 0.92.0 Attachments: org.apache.hadoop.hbase.master.TestMasterFailover-output.rar It goes like this: The master crashes while the RS hosting meta is also crashing, but the RS doesn't exit. The master starts up again and finds all live RSs. The master's verification of meta fails because this RS is crashing. The master reassigns meta, but it doesn't split the HLog, so some meta data is lost. Logs from a failing failover test case follow. //The log says that we want to kill an RS 2011-09-28 19:54:45,694 INFO [Thread-988] regionserver.HRegionServer(1443): STOPPED: Killing for unit test 2011-09-28 19:54:45,694 INFO [Thread-988] master.TestMasterFailover(1007): RS 192.168.2.102,54385,1317264874629 killed //The RS didn't crash. 2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.HMaster(458): Registering server found up in zk: 192.168.2.102,54385,1317264874629 2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.ServerManager(232): Registering server=192.168.2.102,54385,1317264874629 2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of znode /hbase/unassigned/1028785192 because node does not exist (not an error) 2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... //Meta verification failed and the meta was reassigned.
So all the regions in the meta is loss. 2011-09-28 19:54:51,773 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... 2011-09-28 19:54:52,277 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... 
2011-09-28 19:54:52,782 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKAssign(264): master:54557-0x132b31adbb30005 Creating (or updating) unassigned node for 1028785192 with OFFLINE state 2011-09-28 19:54:52,825 DEBUG [Thread-988-EventThread] zookeeper.ZooKeeperWatcher(233): master:54557-0x132b31adbb30005 Received ZooKeeper Event, type=NodeCreated, state=SyncConnected, path=/hbase/unassigned/1028785192 //It said
[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix
[ https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128564#comment-13128564 ]
Lars Hofhansl commented on HBASE-3443:
I agree. I think delete handling is generally a bit funky in HBase (see also HBASE-4536).
ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix
Key: HBASE-3443 URL: https://issues.apache.org/jira/browse/HBASE-3443 Project: HBase Issue Type: Bug Reporter: Kannan Muthukkaruppan
For incrementColumnValue(), HBASE-3082 adds an optimization that checks the memstore first and checks the store files only if the value is not present in the memstore. In the presence of deletes, this optimization is not reliable: if the column is marked as deleted in the memstore, one should not look further into the store files, but currently the code does so.
Sample test code outline:
{code}
admin.createTable(desc)
table = HTable.new(conf, tableName)
table.incrementColumnValue(Bytes.toBytes(row), cf1name, Bytes.toBytes(column), 5);
admin.flush(tableName)
sleep(2)
del = Delete.new(Bytes.toBytes(row))
table.delete(del)
table.incrementColumnValue(Bytes.toBytes(row), cf1name, Bytes.toBytes(column), 5);
get = Get.new(Bytes.toBytes(row))
keyValues = table.get(get).raw()
keyValues.each do |keyValue|
  puts "Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}";
end
{code}
The above prints:
{code}
Expect 5; Got Value=10
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
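The memstore-first lookup described in HBASE-3443 can be sketched with a self-contained toy model. Everything here is an assumption made for illustration: `IcvDeleteSketch` and its methods are hypothetical names, not HBase APIs, and the memstore, delete markers, and store files are reduced to plain in-memory collections without timestamps.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: these names are hypothetical and are not HBase APIs.
class IcvDeleteSketch {
    private final Map<String, Long> memstore = new HashMap<>();    // unflushed values
    private final Set<String> memstoreDeletes = new HashSet<>();   // unflushed delete markers
    private final Map<String, Long> storeFiles = new HashMap<>();  // flushed values

    // Move unflushed state into the store files, applying delete markers first.
    void flush() {
        for (String key : memstoreDeletes) {
            storeFiles.remove(key);
        }
        storeFiles.putAll(memstore);
        memstore.clear();
        memstoreDeletes.clear();
    }

    void delete(String key) {
        memstore.remove(key);
        memstoreDeletes.add(key);
    }

    // respectDeletes=false models the reported behaviour: a memstore miss falls through
    // to the store files even if the memstore holds a delete marker for the key.
    long increment(String key, long amount, boolean respectDeletes) {
        Long current = memstore.get(key);
        boolean deleted = respectDeletes && memstoreDeletes.contains(key);
        if (current == null && !deleted) {
            current = storeFiles.get(key);
        }
        long updated = (current == null ? 0L : current) + amount;
        memstore.put(key, updated);
        return updated;
    }

    // Replays the outline from the report: increment, flush, delete, increment.
    static long replay(boolean respectDeletes) {
        IcvDeleteSketch table = new IcvDeleteSketch();
        String key = "row/cf1:column";
        table.increment(key, 5, respectDeletes);
        table.flush();
        table.delete(key);
        return table.increment(key, 5, respectDeletes);
    }

    public static void main(String[] args) {
        System.out.println("ignoring delete markers: " + replay(false));   // prints 10
        System.out.println("respecting delete markers: " + replay(true)); // prints 5
    }
}
```

With delete markers ignored, the second increment picks up the flushed value and yields 10, matching the output quoted in the report; consulting the marker before falling through yields the expected 5.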
[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128573#comment-13128573 ]
Lars Hofhansl commented on HBASE-4562:
I see... Thanks Ted.
When split doing offlineParentInMeta encounters error, it'll cause data loss
Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
if (!testing) {
  MetaEditor.offlineParentInMeta(server.getCatalogTracker(), this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
  throw new IOException("some unexpected error in split");
}
{code}
2. Update the regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Kill the regionserver hosting the table.
6. Wait until the master's ServerShutdownHandler.process has executed, then scan the table; you'll find that the previously written data is lost.
The bug can be fixed by applying the attached patch.
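The fault injected in step 1 of the HBASE-4562 reproduction has a particular shape: the catalog edit is applied and only then is an error reported to the caller. The sketch below models that shape in isolation. It is a toy under stated assumptions: `SplitFaultSketch` and its methods are hypothetical names, not HBase code, and the rule that a scan visits only regions marked online is a simplification chosen for the model; the report does not say that HBase loses the data through exactly this path.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names are HBase classes.
class SplitFaultSketch {
    // Region name -> true if the toy catalog marks the region offline.
    private final Map<String, Boolean> offlineInCatalog = new LinkedHashMap<>();
    private final Map<String, List<String>> rowsByRegion = new LinkedHashMap<>();

    void createRegion(String name, List<String> rows) {
        offlineInCatalog.put(name, false);
        rowsByRegion.put(name, new ArrayList<>(rows));
    }

    // Same shape as the injected fault: the edit is applied, then an error is thrown.
    void offlineParentThenFail(String parent) throws IOException {
        offlineInCatalog.put(parent, true);
        throw new IOException("some unexpected error in split");
    }

    // In this model a scan only visits regions that the catalog marks online.
    List<String> scan() {
        List<String> visible = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : offlineInCatalog.entrySet()) {
            if (!entry.getValue()) {
                visible.addAll(rowsByRegion.get(entry.getKey()));
            }
        }
        return visible;
    }

    static int visibleRowsAfterFailedSplit() {
        SplitFaultSketch cluster = new SplitFaultSketch();
        cluster.createRegion("parent", Arrays.asList("row1", "row2"));
        try {
            cluster.offlineParentThenFail("parent");
        } catch (IOException reported) {
            // The caller sees a failure and registers no daughter regions,
            // yet the parent has already been marked offline.
        }
        return cluster.scan().size();
    }

    public static void main(String[] args) {
        System.out.println("rows visible after failed split: " + visibleRowsAfterFailedSplit());
    }
}
```

The point of the model is that an error raised after a side effect has been committed leaves the caller with a wrong picture of the catalog, so any recovery logic that assumes "failure means nothing changed" works from stale state.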
[jira] [Updated] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
bluedavy updated HBASE-4563:
Attachment: (was: HBASE-4563-0.90.patch)
When error occurs in this.parent.close(false) of split, the split region cannot write or read
Key: HBASE-4563 URL: https://issues.apache.org/jira/browse/HBASE-4563 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4, 0.92.0 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt
Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below to mock an HDFS error.
{code:title=SplitTransaction.java|borderStyle=solid}
List<StoreFile> hstoreFilesToSplit = this.parent.close(false);
throw new IOException("some unexpected error in close store files");
{code}
2. Update the regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Scan the table; the scan will fail.
The bug can be fixed by applying the attached patch.
[jira] [Updated] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bluedavy updated HBASE-4563: Attachment: (was: HBASE-4563-trunk.patch) When error occurs in this.parent.close(false) of split, the split region cannot write or read - Key: HBASE-4563 URL: https://issues.apache.org/jira/browse/HBASE-4563 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4, 0.92.0 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt
[jira] [Updated] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bluedavy updated HBASE-4563: Attachment: (was: HBASE-4563-0.92.patch) When error occurs in this.parent.close(false) of split, the split region cannot write or read - Key: HBASE-4563 URL: https://issues.apache.org/jira/browse/HBASE-4563 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4, 0.92.0 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt
[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bluedavy updated HBASE-4562: Attachment: (was: HBASE-4562-0.90.patch) When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Updated] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bluedavy updated HBASE-4563: Attachment: HBASE-4563-trunk.patch HBASE-4563-0.92.patch HBASE-4563-0.90.patch When error occurs in this.parent.close(false) of split, the split region cannot write or read - Key: HBASE-4563 URL: https://issues.apache.org/jira/browse/HBASE-4563 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4, 0.92.0 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt
[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bluedavy updated HBASE-4562: Attachment: (was: HBASE-4562-trunk.patch) When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bluedavy updated HBASE-4562: Attachment: HBASE-4562-trunk.patch HBASE-4562-0.92.patch HBASE-4562-0.90.patch When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Commented] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128575#comment-13128575 ] bluedavy commented on HBASE-4563: - I fixed the formatting. When error occurs in this.parent.close(false) of split, the split region cannot write or read - Key: HBASE-4563 URL: https://issues.apache.org/jira/browse/HBASE-4563 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4, 0.92.0 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt
[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128574#comment-13128574 ] bluedavy commented on HBASE-4562: - I fixed the comments to keep them consistent across all patches. When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128602#comment-13128602 ] Lars Hofhansl commented on HBASE-4562: -- +1 for latest patches (assuming all tests pass) When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Updated] (HBASE-4511) There is data loss when master failovers
[ https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4511: -- Fix Version/s: (was: 0.92.0) 0.94.0 There is data loss when master failovers Key: HBASE-4511 URL: https://issues.apache.org/jira/browse/HBASE-4511 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0 Reporter: gaojinchao Priority: Minor Fix For: 0.94.0 Attachments: org.apache.hadoop.hbase.master.TestMasterFailover-output.rar It goes like this: The master crashes while the RS hosting meta is also crashing, but the RS doesn't exit. The master starts up again and finds all live RSs. The master's verification of meta fails because this RS is crashing. The master reassigns meta, but it doesn't split the HLog, so some meta data is lost. Logs from a failing failover test case follow. //The log says that we want to kill an RS 2011-09-28 19:54:45,694 INFO [Thread-988] regionserver.HRegionServer(1443): STOPPED: Killing for unit test 2011-09-28 19:54:45,694 INFO [Thread-988] master.TestMasterFailover(1007): RS 192.168.2.102,54385,1317264874629 killed //The RS didn't crash. 2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.HMaster(458): Registering server found up in zk: 192.168.2.102,54385,1317264874629 2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.ServerManager(232): Registering server=192.168.2.102,54385,1317264874629 2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of znode /hbase/unassigned/1028785192 because node does not exist (not an error) 2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... //Meta verification failed and the meta was reassigned.
So all the regions in the meta is loss. 2011-09-28 19:54:51,773 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... 2011-09-28 19:54:52,277 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... 
2011-09-28 19:54:52,782 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKAssign(264): master:54557-0x132b31adbb30005 Creating (or updating) unassigned node for 1028785192 with OFFLINE state 2011-09-28 19:54:52,825 DEBUG [Thread-988-EventThread] zookeeper.ZooKeeperWatcher(233): master:54557-0x132b31adbb30005 Received ZooKeeper Event, type=NodeCreated, state=SyncConnected,
[jira] [Assigned] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-4562: - Assignee: bluedavy When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Resolved] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bluedavy resolved HBASE-4563. - Resolution: Fixed When error occurs in this.parent.close(false) of split, the split region cannot write or read - Key: HBASE-4563 URL: https://issues.apache.org/jira/browse/HBASE-4563 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4, 0.92.0 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt Follow the steps below to reproduce the problem: 1. Change SplitTransaction.java as below to mock an hdfs error. {code:title=SplitTransaction.java|borderStyle=solid} List<StoreFile> hstoreFilesToSplit = this.parent.close(false); throw new IOException("some unexpected error in close store files"); {code} 2. Update the regionserver code and restart. 3. Create a table and put some data into it. 4. Split the table. 5. Scan the table; the scan fails. The attached patch fixes the bug.
[jira] [Work started] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-4562 started by bluedavy. When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Resolved] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bluedavy resolved HBASE-4562. - Resolution: Fixed When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Reopened] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bluedavy reopened HBASE-4562: - Waiting for a committer to commit this to svn. When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Reopened] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bluedavy reopened HBASE-4563: - Waiting for a committer to commit this to svn. When error occurs in this.parent.close(false) of split, the split region cannot write or read - Key: HBASE-4563 URL: https://issues.apache.org/jira/browse/HBASE-4563 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4, 0.92.0 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt
[jira] [Updated] (HBASE-4588) The floating point arithmetic to validate memory allocation configurations need to be done as integers
[ https://issues.apache.org/jira/browse/HBASE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HBASE-4588: Attachment: configVerify2.txt Addressed Ted's review comments. The floating point arithmetic to validate memory allocation configurations need to be done as integers -- Key: HBASE-4588 URL: https://issues.apache.org/jira/browse/HBASE-4588 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: dhruba borthakur Priority: Minor Fix For: 0.92.0 Attachments: configVerify1.txt, configVerify2.txt The floating point arithmetic to validate memory allocation configurations needs to be done as integers. On our cluster, we had block cache = 0.6 and memstore = 0.2. It was saying this was > 0.8 when it is actually equal. Minor bug but annoying nonetheless.
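To make the HBASE-4588 report above concrete, here is a minimal sketch of doing the check on whole percentages instead of floats. It is not the code from configVerify2.txt: the class name, the method names, and the 20 percent free-heap threshold are assumptions made up for this illustration.

```java
// Hypothetical illustration only; not the attached patch.
final class MemoryConfigCheck {
    // Assumed minimum share of the heap, in percent, that must stay free.
    static final int MIN_FREE_PERCENT = 20;

    // Naive check in float arithmetic: rounding error in the subtraction
    // can make a configuration that sits exactly at the limit look invalid.
    static boolean hasEnoughFreeHeapFloat(float blockCache, float memstore) {
        return 1.0f - (blockCache + memstore) >= MIN_FREE_PERCENT / 100.0f;
    }

    // Integer check: convert each fraction to a whole percentage first,
    // then compare exact ints.
    static boolean hasEnoughFreeHeap(float blockCache, float memstore) {
        int blockCachePercent = Math.round(blockCache * 100);
        int memstorePercent = Math.round(memstore * 100);
        return 100 - (blockCachePercent + memstorePercent) >= MIN_FREE_PERCENT;
    }

    public static void main(String[] args) {
        // Block cache = 0.6 and memstore = 0.2, as in the report.
        System.out.println("float check:   " + hasEnoughFreeHeapFloat(0.6f, 0.2f));
        System.out.println("integer check: " + hasEnoughFreeHeap(0.6f, 0.2f));
    }
}
```

Rounding to whole percentages trades precision for exactness: settings finer than one percent would collapse together, which may or may not be acceptable for the real check.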
[jira] [Updated] (HBASE-4588) The floating point arithmetic to validate memory allocation configurations need to be done as integers
[ https://issues.apache.org/jira/browse/HBASE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HBASE-4588: Attachment: configVerify2.txt Attaching the appropriate patch file with review comments fixes. The floating point arithmetic to validate memory allocation configurations need to be done as integers -- Key: HBASE-4588 URL: https://issues.apache.org/jira/browse/HBASE-4588 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: dhruba borthakur Priority: Minor Fix For: 0.92.0 Attachments: configVerify1.txt, configVerify2.txt, configVerify2.txt
[jira] [Commented] (HBASE-4588) The floating point arithmetic to validate memory allocation configurations need to be done as integers
[ https://issues.apache.org/jira/browse/HBASE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128637#comment-13128637 ] Ted Yu commented on HBASE-4588: --- +1 on patch v2. The floating point arithmetic to validate memory allocation configurations need to be done as integers -- Key: HBASE-4588 URL: https://issues.apache.org/jira/browse/HBASE-4588 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: dhruba borthakur Priority: Minor Fix For: 0.92.0 Attachments: configVerify1.txt, configVerify2.txt, configVerify2.txt
[jira] [Updated] (HBASE-4588) The floating point arithmetic to validate memory allocation configurations need to be done as integers
[ https://issues.apache.org/jira/browse/HBASE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4588: -- Attachment: (was: configVerify2.txt) The floating point arithmetic to validate memory allocation configurations need to be done as integers -- Key: HBASE-4588 URL: https://issues.apache.org/jira/browse/HBASE-4588 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: dhruba borthakur Priority: Minor Fix For: 0.92.0 Attachments: configVerify1.txt, configVerify2.txt
[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog
[ https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128650#comment-13128650 ] jirapos...@reviews.apache.org commented on HBASE-4528: -- bq. On 2011-10-15 11:55:54, Ted Yu wrote: bq. We're closer. bq. Thanks for the perseverance, Dhruba. I will post another version of this patch with some typos corrected. bq. On 2011-10-15 11:55:54, Ted Yu wrote: bq. /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 1884 bq. https://reviews.apache.org/r/2141/diff/5/?file=50446#file50446line1884 bq. bq. We know that w != null here, so w.getWriteNumber() should be passed to rollbackMemstore(). There is no need to explicitly pass in w.getWriteNumber(). All the keys that are hanging off the familyMaps variable have their memstoreTS set appropriately. These were set in the call to applyFamilyMapToMemstore(). This memstoreTS will be used in the rollback methods to ensure that only keys in the memstore that also have a matching memstoreTS value are removed. bq. On 2011-10-15 11:55:54, Ted Yu wrote: bq. /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 2195 bq. https://reviews.apache.org/r/2141/diff/5/?file=50446#file50446line2195 bq. bq. We should have memstoreTS parameter here. No need to pass in memstoreTS. The kvs hanging off the parameter 'familyMaps' already have the memstoreTS that was used to insert these keys in the memstore. bq. On 2011-10-15 11:55:54, Ted Yu wrote: bq. /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 2228 bq. https://reviews.apache.org/r/2141/diff/5/?file=50446#file50446line2228 bq. bq. I think this should be in a finally block corresponding to the try at line 2205. I do not think a finally block is needed. If the getlock itself threw an exception, then there is no reason to do a releaseLock. Nothing else in this code section can throw an exception. - Dhruba --- This is an automatically generated e-mail. 
To reply, visit: https://reviews.apache.org/r/2141/#review2611 --- On 2011-10-15 07:32:28, Dhruba Borthakur wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2141/ bq. --- bq. bq. (Updated 2011-10-15 07:32:28) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. The changes the multiPut operation so that the sync to the wal occurs outside the rowlock. bq. bq. This enhancement is done only to HRegion.mut(Put[]) because this is the only method that gets invoked from an application. The HRegion.put(Put) is used only by unit tests and should possibly be deprecated. bq. bq. bq. This addresses bug HBASE-4528. bq. https://issues.apache.org/jira/browse/HBASE-4528 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183585 bq. /src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java 1183585 bq./src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183585 bq. /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183585 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183585 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1183585 bq./src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java PRE-CREATION bq./src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1183585 bq. /src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 1183585 bq. bq. Diff: https://reviews.apache.org/r/2141/diff bq. bq. bq. Testing bq. --- bq. bq. I ran TestLogRolling over and over again, about 50 times, not failed a single time. bq. bq. bq. Thanks, bq. bq. Dhruba bq. bq. 
The put operation can release the rowlock before sync-ing the Hlog -- Key: HBASE-4528 URL: https://issues.apache.org/jira/browse/HBASE-4528 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt This allows for better throughput when there are hot rows. A single row update improves from 100 puts/sec/server to 5000 puts/sec/server. -- This message is
[jira] [Updated] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog
[ https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HBASE-4528: Attachment: appendNoSyncPut5.txt Fixed typos. Performance numbers were run on hbase-92 with a variant of hdfs 0.20. vanilla hdfs: 1200 put/sec (no patch), 5000 puts/sec (with patch) synconsync hdfs : 80 put/sec (no patch) The synconsync-version-of-hdfs is an internal version of hdfs that makes the datanode issue a sync() on the corresponding ext3 block file for every invocation of DFSClient.sync(). This ensures that an hbase transaction is really, really on disk before the put rpc returns to the client. The put operation can release the rowlock before sync-ing the Hlog -- Key: HBASE-4528 URL: https://issues.apache.org/jira/browse/HBASE-4528 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt, appendNoSyncPut5.txt
[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog
[ https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128655#comment-13128655 ] jirapos...@reviews.apache.org commented on HBASE-4528: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2141/ --- (Updated 2011-10-17 04:39:55.174101) Review request for hbase. Changes --- Fixed typos. Summary --- This changes the multiPut operation so that the sync to the wal occurs outside the rowlock. This enhancement is done only to HRegion.mut(Put[]) because this is the only method that gets invoked from an application. The HRegion.put(Put) is used only by unit tests and should possibly be deprecated. This addresses bug HBASE-4528. https://issues.apache.org/jira/browse/HBASE-4528 Diffs (updated) - /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1184991 /src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java 1184991 /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1184991 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1184991 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184991 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1184991 /src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1184991 /src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 1184991 Diff: https://reviews.apache.org/r/2141/diff Testing --- I ran TestLogRolling over and over again, about 50 times, and it did not fail a single time. 
Thanks, Dhruba The put operation can release the rowlock before sync-ing the Hlog -- Key: HBASE-4528 URL: https://issues.apache.org/jira/browse/HBASE-4528 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt, appendNoSyncPut5.txt
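The ordering discussed in the HBASE-4528 review (append to the log and update the memstore under the row lock, sync outside it, and on failure roll back only the entries carrying this write's number) can be sketched roughly as below. This is an illustration under stated assumptions, not the patch: HotRowStore, Wal, Cell and every method here are hypothetical names, a single lock stands in for per-row locks, and read-visibility control is left out entirely.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; none of these types are HBase classes.
final class HotRowStore {
    /** Stand-in for the write-ahead log: append is cheap, sync is slow. */
    interface Wal {
        void append(String row, String value);
        void sync() throws IOException;
    }

    /** One memstore entry, tagged with the write number that inserted it. */
    static final class Cell {
        final String row;
        final String value;
        final long writeNumber;

        Cell(String row, String value, long writeNumber) {
            this.row = row;
            this.value = value;
            this.writeNumber = writeNumber;
        }
    }

    private final ReentrantLock rowLock = new ReentrantLock(); // one lock for brevity
    private final List<Cell> memstore = new CopyOnWriteArrayList<>();
    private final Wal wal;
    private long nextWriteNumber = 1; // guarded by rowLock

    HotRowStore(Wal wal) {
        this.wal = wal;
    }

    /** Appends to the log and the memstore while holding the row lock. */
    private long applyUnderLock(String row, String value) {
        rowLock.lock();
        try {
            long writeNumber = nextWriteNumber++;
            wal.append(row, value); // buffered, not yet durable
            memstore.add(new Cell(row, value, writeNumber));
            return writeNumber;
        } finally {
            rowLock.unlock(); // released before the slow sync
        }
    }

    void put(String row, String value) throws IOException {
        final long writeNumber = applyUnderLock(row, value);
        try {
            wal.sync(); // other writers to the row can proceed meanwhile
        } catch (IOException e) {
            // Roll back only the entries this write inserted.
            memstore.removeIf(c -> c.writeNumber == writeNumber);
            throw e;
        }
    }

    /** Latest value written for the row, or null if there is none. */
    String get(String row) {
        String latest = null;
        for (Cell c : memstore) {
            if (c.row.equals(row)) {
                latest = c.value;
            }
        }
        return latest;
    }

    public static void main(String[] args) throws IOException {
        HotRowStore store = new HotRowStore(new Wal() {
            public void append(String row, String value) { }
            public void sync() { }
        });
        store.put("row1", "v1");
        System.out.println(store.get("row1"));
    }
}
```

Because the rollback matches on the write number stored with each entry, the caller does not have to pass it around separately, which is the point made in the review replies about memstoreTS.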
[jira] [Commented] (HBASE-2856) TestAcidGuarantee broken on trunk
[ https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128657#comment-13128657 ] Ted Yu commented on HBASE-2856: --- For patch v7, boolean ignoreCount is added to checkColumn(). I think javadoc for this new parameter should be added to ColumnTracker.java Javadoc for long readPointToUse of ScanQueryMatcher ctor should be added. Javadoc for boolean useRWCC of StoreFileScanner ctor and getScannersForStoreFiles() should be added. There is duplicate code in StoreFileScanner.next(): lines 164 to 172. TestAcidGuarantee broken on trunk -- Key: HBASE-2856 URL: https://issues.apache.org/jira/browse/HBASE-2856 Project: HBase Issue Type: Bug Affects Versions: 0.89.20100621 Reporter: ryan rawson Assignee: Amitanand Aiyer Priority: Blocker Fix For: 0.94.0 Attachments: 2856-v2.txt, 2856-v3.txt, 2856-v4.txt, 2856-v5.txt, acid.txt TestAcidGuarantee has a test whereby it attempts to read a number of columns from a row, and every so often the first column of N is different, when it should be the same. This is a bug deep inside the scanner whereby the first peek() of a row is done at time T then the rest of the read is done at T+1 after a flush, thus the memstoreTS data is lost, and previously 'uncommitted' data becomes committed and flushed to disk. One possible solution is to introduce the memstoreTS (or similarly equivalent value) to the HFile thus allowing us to preserve read consistency past flushes. Another solution involves fixing the scanners so that peek() is not destructive (and thus might return different things at different times alas).
[jira] [Commented] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128660#comment-13128660 ] Lars Hofhansl commented on HBASE-4563: -- @Ted... You wanna commit, or should I? I'm happy to. When error occurs in this.parent.close(false) of split, the split region cannot write or read - Key: HBASE-4563 URL: https://issues.apache.org/jira/browse/HBASE-4563 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4, 0.92.0 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt
[jira] [Commented] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128665#comment-13128665 ] Ted Yu commented on HBASE-4563: --- @Lars: Go ahead. When error occurs in this.parent.close(false) of split, the split region cannot write or read - Key: HBASE-4563 URL: https://issues.apache.org/jira/browse/HBASE-4563 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4, 0.92.0 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt
[jira] [Resolved] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-4563. -- Resolution: Fixed Hadoop Flags: Reviewed Committed to 0.90, 0.92, and trunk When error occurs in this.parent.close(false) of split, the split region cannot write or read - Key: HBASE-4563 URL: https://issues.apache.org/jira/browse/HBASE-4563 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4, 0.92.0 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt
[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128675#comment-13128675 ] Lars Hofhansl commented on HBASE-4562: -- Committing this too. When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128678#comment-13128678 ] Lars Hofhansl commented on HBASE-4562: -- 0.90 patch fails to apply. @bluedavy, could you double check all patches against the latest of the 0.90, 0.92, trunk, respectively? When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows
[ https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128680#comment-13128680 ] jirapos...@reviews.apache.org commented on HBASE-4536: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ --- (Updated 2011-10-17 05:32:49.376397)

Review request for hbase, Ted Yu and Jonathan Gray.

Changes
-------
New day, new version of the patch :)
o Added yet more tests.
o Introduced class Store.ScanInfo and enum StoreScanner.ScanType to make more sense of the options passed to ScanQueryMatcher.
This is hopefully close.

Summary
-------
HBase timerange Gets and Scans allow time travel in HBase, i.e. looking at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however: deletes would always mask all puts in the past.

This change adds a flag that can be set on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS.

This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a raw Scan, which retrieves all delete markers and deleted rows.

The change is smallish, but the logic is intricate, so please review carefully.

This addresses bug HBASE-4536.
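To make the visibility rules in the summary above concrete, here is a minimal toy model in plain Java. It is only a sketch and not the patch itself: the class KeepDeletedSketch, its Cell type, and the read/masked helpers are hypothetical names invented for illustration, and none of them are HBase classes. It assumes a single column whose delete marker masks every put at or below the marker's timestamp, and it ignores TTL, VERSIONS, and compaction entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: every name here is hypothetical, none of this is HBase API.
class KeepDeletedSketch {

    /** One version of a single column: a put carrying a value, or a delete marker. */
    static final class Cell {
        final long ts;
        final String value; // null marks a delete marker

        Cell(long ts, String value) {
            this.ts = ts;
            this.value = value;
        }

        boolean isDelete() {
            return value == null;
        }
    }

    /**
     * Returns what a read over the time range [minTs, maxTs) would see.
     * keepDeleted = false: a delete marker masks older puts even if the marker is outside the range.
     * keepDeleted = true:  only delete markers inside the range mask puts.
     * raw = true:          nothing is masked and delete markers are reported as "DELETE@ts".
     */
    static List<String> read(List<Cell> cells, long minTs, long maxTs,
                             boolean keepDeleted, boolean raw) {
        List<String> out = new ArrayList<>();
        for (Cell c : cells) {
            boolean inRange = c.ts >= minTs && c.ts < maxTs;
            if (!inRange) {
                continue;
            }
            if (c.isDelete()) {
                if (raw) {
                    out.add("DELETE@" + c.ts);
                }
                continue;
            }
            if (raw || !masked(c, cells, minTs, maxTs, keepDeleted)) {
                out.add(c.value);
            }
        }
        return out;
    }

    /** True if some delete marker at or after the put's timestamp hides the put. */
    static boolean masked(Cell put, List<Cell> cells, long minTs, long maxTs,
                          boolean keepDeleted) {
        for (Cell d : cells) {
            if (!d.isDelete() || d.ts < put.ts) {
                continue;
            }
            boolean markerInRange = d.ts >= minTs && d.ts < maxTs;
            if (!keepDeleted || markerInRange) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Two puts at ts=10 and ts=20, then a delete marker at ts=30.
        List<Cell> cells = List.of(new Cell(10, "v1"), new Cell(20, "v2"), new Cell(30, null));
        System.out.println(read(cells, 0, 25, false, false));  // flag off, range before the delete
        System.out.println(read(cells, 0, 25, true, false));   // flag on, range before the delete
        System.out.println(read(cells, 0, 100, true, false));  // flag on, range includes the delete
        System.out.println(read(cells, 0, 100, true, true));   // raw read
    }
}
```

In this model, a read over [0, 25) returns nothing with the flag off and both puts with the flag on, a read that covers the marker hides the puts either way, and a raw read lists the puts together with the marker. Results come back in insertion order here, which is a simplification of how a real scanner orders versions.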
https://issues.apache.org/jira/browse/HBASE-4536

Diffs (updated)
-----
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1184947
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1184947

Diff: https://reviews.apache.org/r/2178/diff

Testing
-------
All tests pass now.

Thanks,
Lars

Allow CF to retain deleted rows --- Key: HBASE-4536 URL: https://issues.apache.org/jira/browse/HBASE-4536 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 0.92.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 Parent allows for a cluster to retain rows for a TTL or keep a minimum number of versions. However, if a client deletes a row all versions older than the delete