[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128685#comment-13128685 ]

bluedavy commented on HBASE-4562:

The patch-0.90 is for 0.90.4...

When split doing offlineParentInMeta encounters error, it'll cause data loss

Key: HBASE-4562
URL: https://issues.apache.org/jira/browse/HBASE-4562
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
Fix For: 0.90.5
Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt

Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below, to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
if (!testing) {
  MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
      this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
  // Simulate a failure right after the parent has been offlined in .META.
  throw new IOException("some unexpected error in split");
}
{code}
2. Update the regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Kill the regionserver hosting the table.
6. Wait until the master's ServerShutdownHandler.process has executed, then scan the table; you'll find that the data written earlier is lost.

The attached patch fixes the bug.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2856) TestAcidGuarantee broken on trunk
[ https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128687#comment-13128687 ]

jirapos...@reviews.apache.org commented on HBASE-2856:

This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2224/#review2614

Amit: Did you rebase before uploading the new patch? That, unfortunately, is making it hard to isolate the changes between r6 and r7. Will review tomorrow morning.

But I did read your description of the issues you mentioned. Regarding (b): we had already discussed it in person. That makes sense. And really nice catch on (a) too!! That is indeed subtle and tricky. Super!!!

- Kannan

On 2011-10-15 04:08:41, Amitanand Aiyer wrote:
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2224/
bq.
bq. (Updated 2011-10-15 04:08:41)
bq.
bq. Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.
bq.
bq. Summary
bq.
bq. address the 2856 issues by writing the memstoreTS to the disk.
bq.
bq. version v11 of the patch.
bq.
bq. uploading it here for easier review process.
bq.
bq. This addresses bug HBASE-2856.
bq. https://issues.apache.org/jira/browse/HBASE-2856
bq.
bq. Diffs
bq.
bq. /pom.xml 1183581
bq. /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnCount.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1183581
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1183581
bq. /src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 1183581
bq. /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1183581
bq. /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1183581
bq. /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1183581
bq. /src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1183581
bq. /src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java 1183581
bq. /src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1183581
bq. /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1183581
bq.
bq. Diff: https://reviews.apache.org/r/2224/diff
bq.
bq. Testing
bq.
bq. mvn test
bq.
bq. Thanks,
bq.
bq. Amitanand

TestAcidGuarantee broken on trunk

Key: HBASE-2856
URL: https://issues.apache.org/jira/browse/HBASE-2856
Project: HBase
Issue Type: Bug
Affects Versions: 0.89.20100621
Reporter: ryan rawson
Assignee: Amitanand Aiyer
Priority: Blocker
Fix For: 0.94.0
Attachments: 2856-v2.txt, 2856-v3.txt, 2856-v4.txt, 2856-v5.txt, acid.txt

TestAcidGuarantee has a test whereby it attempts to read a number of columns from a row, and every so often the first column of N is different, when it should be the same. This is a bug deep inside the scanner whereby the first peek() of a row is done at time T then the rest of the read is done at T+1 after a flush, thus the memstoreTS data is lost, and previously 'uncommitted' data becomes committed and flushed to disk. One possible solution is
[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128690#comment-13128690 ]

bluedavy commented on HBASE-4562:

Em, OK, I renamed the current patch for 0.90.4.

When split doing offlineParentInMeta encounters error, it'll cause data loss

Key: HBASE-4562
URL: https://issues.apache.org/jira/browse/HBASE-4562
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
Fix For: 0.90.5
Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

bluedavy updated HBASE-4562:

Attachment: (was: HBASE-4562-0.90.patch)

When split doing offlineParentInMeta encounters error, it'll cause data loss

Key: HBASE-4562
URL: https://issues.apache.org/jira/browse/HBASE-4562
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
Fix For: 0.90.5
Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

bluedavy updated HBASE-4562:

Attachment: (was: test-4562-0.90.txt)

When split doing offlineParentInMeta encounters error, it'll cause data loss

Key: HBASE-4562
URL: https://issues.apache.org/jira/browse/HBASE-4562
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
Fix For: 0.90.5
Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

bluedavy updated HBASE-4562:

Attachment: test-4562-0.90.4.txt
            HBASE-4562-0.90.4.patch

When split doing offlineParentInMeta encounters error, it'll cause data loss

Key: HBASE-4562
URL: https://issues.apache.org/jira/browse/HBASE-4562
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
Fix For: 0.90.5
Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows
[ https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128698#comment-13128698 ]

jirapos...@reviews.apache.org commented on HBASE-4536:

This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2616

http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
https://reviews.apache.org/r/2178/#comment5911

Please add 'any' in front of columns and change columns to singular form.
Maybe columns.size should be checked against 0 as well?

http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
https://reviews.apache.org/r/2178/#comment5912

Much shorter and more readable now.

- Ted

On 2011-10-17 05:32:49, Lars Hofhansl wrote:
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2178/
bq.
bq. (Updated 2011-10-17 05:32:49)
bq.
bq. Review request for hbase, Ted Yu and Jonathan Gray.
bq.
bq. Summary
bq.
bq. HBase timerange Gets and Scans allow time travel in HBase, i.e. looking at the state of the data at any point in the past, provided the data is still around.
bq. This did not work for deletes, however. Deletes would always mask all puts in the past.
bq. This change adds a flag that can be set on HColumnDescriptor to enable retention of deleted rows.
bq. These rows are still subject to TTL and/or VERSIONS.
bq.
bq. This changes the following:
bq. 1. There is a new flag on HColumnDescriptor enabling that behavior.
bq. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
bq. 3. Do not unconditionally collect all deleted rows during a compaction.
bq. 4. Allow a raw Scan, which retrieves all delete markers and deleted rows.
bq.
bq. The change is small'ish, but the logic is intricate, so please review carefully.
bq.
bq. This addresses bug HBASE-4536.
bq. https://issues.apache.org/jira/browse/HBASE-4536
bq.
bq. Diffs
bq.
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1184947
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1184947
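
The time-range rule in item 2 of the HBASE-4536 summary may be easier to follow with a toy model. This is only a sketch and not HBase code: `KeepDeletedSketch`, `Cell` and `read` are hypothetical names, and it assumes a single column, a half-open time range [minTs, maxTs), and a delete marker that masks every put at or below its own timestamp.

```java
import java.util.ArrayList;
import java.util.List;

final class KeepDeletedSketch {
    /** One version of a single column: either a put or a delete marker (hypothetical type). */
    static final class Cell {
        final long ts;
        final boolean isDelete;
        final String value;

        Cell(long ts, boolean isDelete, String value) {
            this.ts = ts;
            this.isDelete = isDelete;
            this.value = value;
        }
    }

    /** Returns the put values visible to a read over [minTs, maxTs). */
    static List<String> read(List<Cell> cells, long minTs, long maxTs) {
        // Only delete markers that fall inside the time range take part in masking.
        long newestDelete = Long.MIN_VALUE;
        for (Cell c : cells) {
            if (c.isDelete && c.ts >= minTs && c.ts < maxTs) {
                newestDelete = Math.max(newestDelete, c.ts);
            }
        }
        // A put is visible if it is inside the range and newer than every in-range marker.
        List<String> visible = new ArrayList<>();
        for (Cell c : cells) {
            if (!c.isDelete && c.ts >= minTs && c.ts < maxTs && c.ts > newestDelete) {
                visible.add(c.value);
            }
        }
        return visible;
    }
}
```

With a put at timestamp 10, a delete marker at 20 and a put at 30, a read over [0, 15) still sees the first put in this model, because the marker lies outside the range, while a read over [0, 25) sees nothing.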
[jira] [Commented] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128709#comment-13128709 ]

Hudson commented on HBASE-4563:

Integrated in HBase-TRUNK #2329 (See [https://builds.apache.org/job/HBase-TRUNK/2329/])
HBASE-4563 When error occurs in this.parent.close(false) of split, the split region cannot write or read

larsh :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java

When error occurs in this.parent.close(false) of split, the split region cannot write or read

Key: HBASE-4563
URL: https://issues.apache.org/jira/browse/HBASE-4563
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.4, 0.92.0
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
Fix For: 0.90.5
Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt

Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below, to mock an HDFS error.
{code:title=SplitTransaction.java|borderStyle=solid}
List<StoreFile> hstoreFilesToSplit = this.parent.close(false);
// Simulate a failure right after the parent region's store files are closed.
throw new IOException("some unexpected error in close store files");
{code}
2. Update the regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Scan the table; the scan will fail.

The attached patch fixes the bug.
[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

bluedavy updated HBASE-4562:

Attachment: test-4562-0.90.txt
            HBASE-4562-0.90.patch

When split doing offlineParentInMeta encounters error, it'll cause data loss

Key: HBASE-4562
URL: https://issues.apache.org/jira/browse/HBASE-4562
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
Fix For: 0.90.5
Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128713#comment-13128713 ]

bluedavy commented on HBASE-4562:

@Lars I attached the patch for the latest 0.90; please apply it again and commit, thanks.

When split doing offlineParentInMeta encounters error, it'll cause data loss

Key: HBASE-4562
URL: https://issues.apache.org/jira/browse/HBASE-4562
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
Fix For: 0.90.5
Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
[jira] [Created] (HBASE-4601) HBase table admin tool
HBase table admin tool

Key: HBASE-4601
URL: https://issues.apache.org/jira/browse/HBASE-4601
Project: HBase
Issue Type: New Feature
Reporter: Thomas Pan

Similar to the HFile tool, we need an HBase table-level tool to handle the following tasks:
1. Balance table-level regions among all the region servers in the cluster. Different balancing algorithms need to be available to choose from.
2. Create tables with the proper settings.
3. Alter existing tables.
4. Compact existing tables, whether minor or major.
5. Erase existing tables.
[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog
[ https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128730#comment-13128730 ]

jirapos...@reviews.apache.org commented on HBASE-4528:

This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2141/#review2617

@Dhruba
Do you have any figures showing the improvement in write performance?

- ramkrishna

On 2011-10-17 04:39:55, Dhruba Borthakur wrote:
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2141/
bq.
bq. (Updated 2011-10-17 04:39:55)
bq.
bq. Review request for hbase.
bq.
bq. Summary
bq.
bq. This changes the multiPut operation so that the sync to the wal occurs outside the rowlock.
bq.
bq. This enhancement is done only to HRegion.mut(Put[]) because this is the only method that gets invoked from an application. The HRegion.put(Put) is used only by unit tests and should possibly be deprecated.
bq.
bq. This addresses bug HBASE-4528.
bq. https://issues.apache.org/jira/browse/HBASE-4528
bq.
bq. Diffs
bq.
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1184991
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java 1184991
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1184991
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1184991
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184991
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1184991
bq. /src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java PRE-CREATION
bq. /src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1184991
bq. /src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 1184991
bq.
bq. Diff: https://reviews.apache.org/r/2141/diff
bq.
bq. Testing
bq.
bq. I ran TestLogRolling over and over again, about 50 times, and it did not fail a single time.
bq.
bq. Thanks,
bq.
bq. Dhruba

The put operation can release the rowlock before sync-ing the Hlog

Key: HBASE-4528
URL: https://issues.apache.org/jira/browse/HBASE-4528
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Fix For: 0.94.0
Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt, appendNoSyncPut5.txt

This allows for better throughput when there are hot rows. A single row update improves from 100 puts/sec/server to 5000 puts/sec/server.
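
The ordering described for HBASE-4528 (append under the row lock, sync after releasing it) can be sketched roughly as below. This is a toy model under stated assumptions, not the patch itself: `SyncOutsideRowLockSketch` and all of its members are hypothetical, a single `ReentrantLock` stands in for one row's lock, in-memory lists stand in for the log, and the sketch does not model how the real change keeps unsynced edits hidden from readers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

final class SyncOutsideRowLockSketch {
    private final ReentrantLock rowLock = new ReentrantLock();   // stands in for one row's lock
    private final List<String> walBuffer = new ArrayList<>();    // appended but not yet durable
    private final List<String> walSynced = new ArrayList<>();    // entries considered durable
    private final Map<String, String> memstore = new ConcurrentHashMap<>();
    private volatile boolean lockHeldDuringSync = false;

    void put(String row, String value) {
        rowLock.lock();
        try {
            appendNoSync(row + "=" + value); // cheap: buffer the edit only
            memstore.put(row, value);
        } finally {
            rowLock.unlock();                // release before the slow step
        }
        sync();                              // the expensive sync runs outside the row lock
    }

    private synchronized void appendNoSync(String entry) {
        walBuffer.add(entry);
    }

    private synchronized void sync() {
        // Record whether the caller still holds the row lock while syncing.
        if (rowLock.isHeldByCurrentThread()) {
            lockHeldDuringSync = true;
        }
        walSynced.addAll(walBuffer);
        walBuffer.clear();
    }

    String get(String row) { return memstore.get(row); }

    synchronized int syncedEntries() { return walSynced.size(); }

    boolean wasLockHeldDuringSync() { return lockHeldDuringSync; }
}
```

The point of the ordering is that other writers to the same hot row only wait for the short append, not for the sync, which is where the throughput gain quoted in the issue would come from.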
[jira] [Commented] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read
[ https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128742#comment-13128742 ]

Hudson commented on HBASE-4563:

Integrated in HBase-0.92 #66 (See [https://builds.apache.org/job/HBase-0.92/66/])
HBASE-4563 When error occurs in this.parent.close(false) of split, the split region cannot write or read

larsh :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java

When error occurs in this.parent.close(false) of split, the split region cannot write or read

Key: HBASE-4563
URL: https://issues.apache.org/jira/browse/HBASE-4563
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.4, 0.92.0
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
Fix For: 0.90.5
Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, test-4563-trunk.txt
[jira] [Commented] (HBASE-4486) Improve Javadoc for HTableDescriptor
[ https://issues.apache.org/jira/browse/HBASE-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128747#comment-13128747 ]

ramkrishna.s.vasudevan commented on HBASE-4486:

{code}
+   * Construct a table descriptor specifying table name.
    * @param name Table name.
{code}
We have 2 constructors. Your explanation clearly says that one constructor takes the table name and the other takes the byte array of the table name. Maybe we can also add
 * @param name Table name - as byte array.

Improve Javadoc for HTableDescriptor

Key: HBASE-4486
URL: https://issues.apache.org/jira/browse/HBASE-4486
Project: HBase
Issue Type: Improvement
Components: client, documentation
Reporter: Akash Ashok
Assignee: Akash Ashok
Priority: Minor
Attachments: HBase-4486-v2.patch, HBase-4486.patch, HTableDescriptor-v2.html, HTableDescriptor.html
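
The Javadoc suggestion in the HBASE-4486 comment might look roughly like this on a pair of overloaded constructors. `TableDescriptorSketch` is a hypothetical stand-in written for illustration; it is not the real HTableDescriptor, and the UTF-8 conversion is an assumption of the sketch.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a table descriptor, used only to show the Javadoc wording. */
final class TableDescriptorSketch {
    private final byte[] name;

    /**
     * Construct a table descriptor specifying table name.
     * @param name Table name, as a String.
     */
    TableDescriptorSketch(String name) {
        this(name.getBytes(StandardCharsets.UTF_8));
    }

    /**
     * Construct a table descriptor specifying table name.
     * @param name Table name, as a byte array.
     */
    TableDescriptorSketch(byte[] name) {
        this.name = name.clone(); // defensive copy so callers cannot mutate the stored name
    }

    /** @return the table name decoded as a String. */
    String getNameAsString() {
        return new String(name, StandardCharsets.UTF_8);
    }
}
```

Spelling out the parameter type in each `@param` line lets a reader tell the two overloads apart from the generated documentation alone.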
[jira] [Commented] (HBASE-4591) TTL for old HLogs should be calculated from last modification time.
[ https://issues.apache.org/jira/browse/HBASE-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128761#comment-13128761 ]

ramkrishna.s.vasudevan commented on HBASE-4591:

+1

TTL for old HLogs should be calculated from last modification time.

Key: HBASE-4591
URL: https://issues.apache.org/jira/browse/HBASE-4591
Project: HBase
Issue Type: Improvement
Components: master
Affects Versions: 0.89.20100621
Reporter: Madhuwanti Vaidya
Assignee: Madhuwanti Vaidya
Priority: Minor
[jira] [Commented] (HBASE-4595) HFilePrettyPrinter Scanned kv count always 0
[ https://issues.apache.org/jira/browse/HBASE-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128764#comment-13128764 ]

ramkrishna.s.vasudevan commented on HBASE-4595:

+1

HFilePrettyPrinter Scanned kv count always 0

Key: HBASE-4595
URL: https://issues.apache.org/jira/browse/HBASE-4595
Project: HBase
Issue Type: Bug
Components: io
Affects Versions: 0.92.0, 0.94.0, 0.92.1
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
Attachments: HBASE-4595.patch

The count variable used to print the Scanned kv count is never incremented. A local count variable in scanKeysValues() method is updated instead.
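
The shape of the bug described in HBASE-4595 can be reproduced in a few lines. This is an illustrative sketch only: `ShadowedCounterSketch` and its methods are hypothetical and are not taken from HFilePrettyPrinter.

```java
final class ShadowedCounterSketch {
    private int count; // the field that the summary line would print

    /** Buggy shape: a local variable with the same name hides the field. */
    void scanBuggy(String[] kvs) {
        int count = 0; // shadows this.count, so the field never changes
        for (String kv : kvs) {
            count++;
        }
    }

    /** Fixed shape: increment the field itself. */
    void scanFixed(String[] kvs) {
        for (String kv : kvs) {
            count++;
        }
    }

    int scannedCount() {
        return count;
    }
}
```

After `scanBuggy` the reported count stays at 0 no matter how many entries were scanned, which matches the symptom in the issue title.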
[jira] [Assigned] (HBASE-4462) Properly treating SocketTimeoutException
[ https://issues.apache.org/jira/browse/HBASE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan reassigned HBASE-4462: - Assignee: ramkrishna.s.vasudevan Properly treating SocketTimeoutException Key: HBASE-4462 URL: https://issues.apache.org/jira/browse/HBASE-4462 Project: HBase Issue Type: Improvement Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Assignee: ramkrishna.s.vasudevan Fix For: 0.90.5 SocketTimeoutException is currently treated like any IOE inside of HCM.getRegionServerWithRetries and I think this is a problem. This method should only do retries in cases where we are pretty sure the operation will complete, but with STE we already waited for (by default) 60 seconds and nothing happened. I found this while debugging Douglas Campbell's problem on the mailing list where it seemed like he was using the same scanner from multiple threads, but actually it was just the same client doing retries while the first run didn't even finish yet (that's another problem). You could see the first scanner, then up to two other handlers waiting for it to finish in order to run (because of the synchronization on RegionScanner). So what should we do? We could treat STE as a DoNotRetryException and let the client deal with it, or we could retry only once. There's also the option of having a different behavior for get/put/icv/scan, the issue with operations that modify a cell is that you don't know if the operation completed or not (same when a RS dies hard after completing let's say a Put but just before returning to the client). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
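One of the options raised in HBASE-4462, treating a SocketTimeoutException as non-retryable while still retrying other IOExceptions, could look roughly like the sketch below. This is only an illustration: `callWithRetries` is a hypothetical helper, not the signature of HCM.getRegionServerWithRetries.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

class RetrySketch {
    /**
     * Hypothetical helper: retries plain IOExceptions up to maxAttempts times,
     * but rethrows a SocketTimeoutException immediately.
     */
    static <T> T callWithRetries(Callable<T> op, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (SocketTimeoutException ste) {
                // The first call may still be running on the server;
                // a retry would only queue up behind it.
                throw ste;
            } catch (IOException ioe) {
                last = ioe; // remember the failure and try again
            }
        }
        throw last;
    }
}
```

The catch order matters: SocketTimeoutException is a subclass of IOException, so it has to be handled first.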
[jira] [Work started] (HBASE-4602) Make the suite run in at least half the time
[ https://issues.apache.org/jira/browse/HBASE-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-4602 started by nkeywal. Make the suite run in at least half the time Key: HBASE-4602 URL: https://issues.apache.org/jira/browse/HBASE-4602 Project: HBase Issue Type: Umbrella Environment: All. Reporter: nkeywal Assignee: nkeywal - Cutting down on the number of cluster spinups by coalescing related tests rather than have each spin up its own cluster - Make cluster start/stop faster - Rewriting long-running tests so they do not need to be run on a cluster; e.g. by instead mocking expected signals/messages - Move long running tests out of the unit test suite to instead run as part of the recently introduced 'integration test' step -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4602) Make the suite run in at least half the time
Make the suite run in at least half the time Key: HBASE-4602 URL: https://issues.apache.org/jira/browse/HBASE-4602 Project: HBase Issue Type: Umbrella Environment: All. Reporter: nkeywal Assignee: nkeywal - Cutting down on the number of cluster spinups by coalescing related tests rather than have each spin up its own cluster - Make cluster start/stop faster - Rewriting long-running tests so they do not need to be run on a cluster; e.g. by instead mocking expected signals/messages - Move long running tests out of the unit test suite to instead run as part of the recently introduced 'integration test' step -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4602) Make the suite run in at least half the time
[ https://issues.apache.org/jira/browse/HBASE-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4602: --- Attachment: tests_list.xlsx Test list and execution time, as of 14th oct. 2011 Make the suite run in at least half the time Key: HBASE-4602 URL: https://issues.apache.org/jira/browse/HBASE-4602 Project: HBase Issue Type: Umbrella Environment: All. Reporter: nkeywal Assignee: nkeywal Attachments: tests_list.xlsx - Cutting down on the number of cluster spinups by coalescing related tests rather than have each spin up its own cluster - Make cluster start/stop faster - Rewriting long-running tests so they do not need to be run on a cluster; e.g. by instead mocking expected signals/messages - Move long running tests out of the unit test suite to instead run as part of the recently introduced 'integration test' step -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4603) Uneeded sleep time for tests in hbase.master.ServerManager#waitForRegionServers
Uneeded sleep time for tests in hbase.master.ServerManager#waitForRegionServers --- Key: HBASE-4603 URL: https://issues.apache.org/jira/browse/HBASE-4603 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.92.0 Environment: all. Reporter: nkeywal Assignee: nkeywal Priority: Minor This function waits for at least 2 times hbase.master.wait.on.regionservers.interval, which defaults to 3 seconds, i.e. 6 seconds for every mini HBase cluster start. In the context of a mini cluster, this wait is not useful, as the region servers are created locally. Changing this to a lower value such as 100ms saves 5.8 seconds per HBase cluster start. It should lower the build time on the Apache server by more than 8%. Being more aggressive (removing all the wait time) could be possible as well. To be studied later.
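The 5.8 second figure in HBASE-4603 follows from the "at least 2 times the interval" rule stated in the description. A small worked example, assuming only that rule (the class and method names are made up for illustration):

```java
class WaitSavings {
    /** Minimum startup wait in ms: the description says at least 2 intervals. */
    static long minWaitMs(long intervalMs) {
        return 2 * intervalMs;
    }

    /** Time saved per cluster start when the interval is lowered. */
    static long savedPerStartMs(long oldIntervalMs, long newIntervalMs) {
        return minWaitMs(oldIntervalMs) - minWaitMs(newIntervalMs);
    }

    public static void main(String[] args) {
        // 2 * 3000 ms - 2 * 100 ms = 5800 ms
        System.out.println(savedPerStartMs(3000, 100) + " ms saved per cluster start");
    }
}
```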
[jira] [Updated] (HBASE-4603) Uneeded sleep time for tests in hbase.master.ServerManager#waitForRegionServers
[ https://issues.apache.org/jira/browse/HBASE-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4603: --- Attachment: 20111017_4603_MiniHBaseCluster.patch Fix by setting the value of hbase.master.wait.on.regionservers.interval in the MiniHBaseCluster class. Uneeded sleep time for tests in hbase.master.ServerManager#waitForRegionServers --- Key: HBASE-4603 URL: https://issues.apache.org/jira/browse/HBASE-4603 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.92.0 Environment: all. Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 20111017_4603_MiniHBaseCluster.patch This function waits for at least 2 times hbase.master.wait.on.regionservers.interval, which defaults to 3 seconds, i.e. 6 seconds for every mini HBase cluster start. In the context of a mini cluster, this wait is not useful, as the region servers are created locally. Changing this to a lower value such as 100ms saves 5.8 seconds per HBase cluster start. It should lower the build time on the Apache server by more than 8%. Being more aggressive (removing all the wait time) could be possible as well. To be studied later.
[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.
[ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128867#comment-13128867 ] Jonathan Hsieh commented on HBASE-4570: --- @stack I've done testing on trunk and an 0.90 branch, and the symptoms encountered with the testing programs are fixed. It would be great to get this into 0.90, 0.92 and trunk. Thanks! Scan ACID problem with concurrent puts. --- Key: HBASE-4570 URL: https://issues.apache.org/jira/browse/HBASE-4570 Project: HBase Issue Type: Bug Components: client, regionserver Affects Versions: 0.90.1, 0.90.3 Reporter: Jonathan Hsieh Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt When scanning a table sometimes rows that have multiple column families get split into two rows if there are concurrent writes. In this particular case we are overwriting the contents of a Get directly back onto itself as a Put. For example, this is a two cf row (with f1, f2, .. f9 cfs). It is actually returned as two rows (#55 and #56). Interestingly if the two were merged we would have a single proper row. 
Row row024461 had time stamps: [55: keyvalues={row024461/f0:data/1318200440867/Put/vlen=1000, row024461/f0:qual/1318200440867/Put/vlen=10, row024461/f1:data/1318200440867/Put/vlen=1000, row024461/f1:qual/1318200440867/Put/vlen=10, row024461/f2:data/1318200440867/Put/vlen=1000, row024461/f2:qual/1318200440867/Put/vlen=10, row024461/f3:data/1318200440867/Put/vlen=1000, row024461/f3:qual/1318200440867/Put/vlen=10, row024461/f4:data/1318200440867/Put/vlen=1000, row024461/f4:qual/1318200440867/Put/vlen=10}, 56: keyvalues={row024461/f5:data/1318200440867/Put/vlen=1000, row024461/f5:qual/1318200440867/Put/vlen=10, row024461/f6:data/1318200440867/Put/vlen=1000, row024461/f6:qual/1318200440867/Put/vlen=10, row024461/f7:data/1318200440867/Put/vlen=1000, row024461/f7:qual/1318200440867/Put/vlen=10, row024461/f8:data/1318200440867/Put/vlen=1000, row024461/f8:qual/1318200440867/Put/vlen=10, row024461/f9:data/1318200440867/Put/vlen=1000, row024461/f9:qual/1318200440867/Put/vlen=10}] I've only tested this on 0.90.1+patches and 0.90.3+patches, but it is consistent and duplicatable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4604) hbase.client.TestHTablePool could start a single cluster instead of one per method
hbase.client.TestHTablePool could start a single cluster instead of one per method -- Key: HBASE-4604 URL: https://issues.apache.org/jira/browse/HBASE-4604 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.92.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor This test starts and stops one cluster per method, while it would be possible to start it once for the whole class. Using a single cluster allows the test to take 20s instead of 175s (measured after HBASE-4603; it took much longer before).
[jira] [Updated] (HBASE-4604) hbase.client.TestHTablePool could start a single cluster instead of one per method
[ https://issues.apache.org/jira/browse/HBASE-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4604: --- Attachment: 20111017_4604_TestHTablePool.patch hbase.client.TestHTablePool could start a single cluster instead of one per method -- Key: HBASE-4604 URL: https://issues.apache.org/jira/browse/HBASE-4604 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.92.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 20111017_4604_TestHTablePool.patch This test starts and stops one cluster per method, while it would be possible to start it once for the whole class. Using a single cluster allows the test to take 20s instead of 175s (measured after HBASE-4603; it took much longer before).
[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families
[ https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129013#comment-13129013 ] Todd Lipcon commented on HBASE-4552: The trick is making sure it's atomic inside the region server - not just that the client sends all of the files for a given region in one RPC. If there are any concurrent scanners, then they should either see all of the new data or none of the new data on a given row. So we need some region-wide coordination. I think probably we have to take a write-lock on HRegion#lock. multi-CF bulk load is not atomic across column families --- Key: HBASE-4552 URL: https://issues.apache.org/jira/browse/HBASE-4552 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.92.0 Reporter: Todd Lipcon Fix For: 0.92.0 Currently the bulk load API simply imports one HFile at a time. With multi-column-family support, this is inappropriate, since different CFs show up separately. Instead, the IPC endpoint should take a map of CF -> HFiles, so we can online them all under a single region-wide lock.
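The region-wide coordination suggested in HBASE-4552 can be sketched with a read/write lock: the bulk load takes the write lock while it brings the files of every column family online, and scanners take the read lock. This is a simplified model under stated assumptions, not HRegion code; `RegionSketch`, its string-based file names, and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class RegionSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Column family name -> names of the files currently online for it.
    private final Map<String, List<String>> filesByFamily = new HashMap<>();

    /** Brings the files of all families online while holding the write lock. */
    void bulkLoad(Map<String, List<String>> newFilesByFamily) {
        lock.writeLock().lock();
        try {
            for (Map.Entry<String, List<String>> e : newFilesByFamily.entrySet()) {
                filesByFamily.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                             .addAll(e.getValue());
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** A reader's view: under the read lock it sees all of a bulk load or none of it. */
    Map<String, Integer> fileCounts() {
        lock.readLock().lock();
        try {
            Map<String, Integer> counts = new HashMap<>();
            for (Map.Entry<String, List<String>> e : filesByFamily.entrySet()) {
                counts.put(e.getKey(), e.getValue().size());
            }
            return counts;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Passing the whole family-to-files map in one call is what makes a single lock acquisition possible; importing one file per call cannot give that guarantee.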
[jira] [Commented] (HBASE-4486) Improve Javadoc for HTableDescriptor
[ https://issues.apache.org/jira/browse/HBASE-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129017#comment-13129017 ] Akash Ashok commented on HBASE-4486: Thanks Ram. I shall modify that. I would be glad to incorporate if there are any other comments. Improve Javadoc for HTableDescriptor Key: HBASE-4486 URL: https://issues.apache.org/jira/browse/HBASE-4486 Project: HBase Issue Type: Improvement Components: client, documentation Reporter: Akash Ashok Assignee: Akash Ashok Priority: Minor Attachments: HBase-4486-v2.patch, HBase-4486.patch, HTableDescriptor-v2.html, HTableDescriptor.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4570) Scan ACID problem with concurrent puts.
[ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4570: -- Fix Version/s: 0.90.5 Scan ACID problem with concurrent puts. --- Key: HBASE-4570 URL: https://issues.apache.org/jira/browse/HBASE-4570 Project: HBase Issue Type: Bug Components: client, regionserver Affects Versions: 0.90.1, 0.90.3 Reporter: Jonathan Hsieh Fix For: 0.90.5 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt When scanning a table sometimes rows that have multiple column families get split into two rows if there are concurrent writes. In this particular case we are overwriting the contents of a Get directly back onto itself as a Put. For example, this is a two cf row (with f1, f2, .. f9 cfs). It is actually returned as two rows (#55 and #56). Interestingly if the two were merged we would have a single proper row. Row row024461 had time stamps: [55: keyvalues={row024461/f0:data/1318200440867/Put/vlen=1000, row024461/f0:qual/1318200440867/Put/vlen=10, row024461/f1:data/1318200440867/Put/vlen=1000, row024461/f1:qual/1318200440867/Put/vlen=10, row024461/f2:data/1318200440867/Put/vlen=1000, row024461/f2:qual/1318200440867/Put/vlen=10, row024461/f3:data/1318200440867/Put/vlen=1000, row024461/f3:qual/1318200440867/Put/vlen=10, row024461/f4:data/1318200440867/Put/vlen=1000, row024461/f4:qual/1318200440867/Put/vlen=10}, 56: keyvalues={row024461/f5:data/1318200440867/Put/vlen=1000, row024461/f5:qual/1318200440867/Put/vlen=10, row024461/f6:data/1318200440867/Put/vlen=1000, row024461/f6:qual/1318200440867/Put/vlen=10, row024461/f7:data/1318200440867/Put/vlen=1000, row024461/f7:qual/1318200440867/Put/vlen=10, row024461/f8:data/1318200440867/Put/vlen=1000, row024461/f8:qual/1318200440867/Put/vlen=10, row024461/f9:data/1318200440867/Put/vlen=1000, row024461/f9:qual/1318200440867/Put/vlen=10}] I've only tested this on 0.90.1+patches and 0.90.3+patches, but it is consistent and duplicatable. 
[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.
[ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129026#comment-13129026 ] Todd Lipcon commented on HBASE-4570: Cool, I will commit this to 90, 92, and trunk momentarily. Scan ACID problem with concurrent puts. --- Key: HBASE-4570 URL: https://issues.apache.org/jira/browse/HBASE-4570 Project: HBase Issue Type: Bug Components: client, regionserver Affects Versions: 0.90.1, 0.90.3 Reporter: Jonathan Hsieh Fix For: 0.90.5 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt When scanning a table sometimes rows that have multiple column families get split into two rows if there are concurrent writes. In this particular case we are overwriting the contents of a Get directly back onto itself as a Put. For example, this is a two cf row (with f1, f2, .. f9 cfs). It is actually returned as two rows (#55 and #56). Interestingly if the two were merged we would have a single proper row. Row row024461 had time stamps: [55: keyvalues={row024461/f0:data/1318200440867/Put/vlen=1000, row024461/f0:qual/1318200440867/Put/vlen=10, row024461/f1:data/1318200440867/Put/vlen=1000, row024461/f1:qual/1318200440867/Put/vlen=10, row024461/f2:data/1318200440867/Put/vlen=1000, row024461/f2:qual/1318200440867/Put/vlen=10, row024461/f3:data/1318200440867/Put/vlen=1000, row024461/f3:qual/1318200440867/Put/vlen=10, row024461/f4:data/1318200440867/Put/vlen=1000, row024461/f4:qual/1318200440867/Put/vlen=10}, 56: keyvalues={row024461/f5:data/1318200440867/Put/vlen=1000, row024461/f5:qual/1318200440867/Put/vlen=10, row024461/f6:data/1318200440867/Put/vlen=1000, row024461/f6:qual/1318200440867/Put/vlen=10, row024461/f7:data/1318200440867/Put/vlen=1000, row024461/f7:qual/1318200440867/Put/vlen=10, row024461/f8:data/1318200440867/Put/vlen=1000, row024461/f8:qual/1318200440867/Put/vlen=10, row024461/f9:data/1318200440867/Put/vlen=1000, row024461/f9:qual/1318200440867/Put/vlen=10}] I've only tested 
this on 0.90.1+patches and 0.90.3+patches, but it is consistent and duplicatable.
[jira] [Resolved] (HBASE-4570) Scan ACID problem with concurrent puts.
[ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HBASE-4570. Resolution: Fixed Fix Version/s: 0.92.0 Assignee: Jonathan Hsieh Hadoop Flags: Reviewed Fixed in 90, 92, trunk branches Scan ACID problem with concurrent puts. --- Key: HBASE-4570 URL: https://issues.apache.org/jira/browse/HBASE-4570 Project: HBase Issue Type: Bug Components: client, regionserver Affects Versions: 0.90.1, 0.90.3 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Fix For: 0.92.0, 0.90.5 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt When scanning a table sometimes rows that have multiple column families get split into two rows if there are concurrent writes. In this particular case we are overwriting the contents of a Get directly back onto itself as a Put. For example, this is a two cf row (with f1, f2, .. f9 cfs). It is actually returned as two rows (#55 and #56). Interestingly if the two were merged we would have a single proper row. 
Row row024461 had time stamps: [55: keyvalues={row024461/f0:data/1318200440867/Put/vlen=1000, row024461/f0:qual/1318200440867/Put/vlen=10, row024461/f1:data/1318200440867/Put/vlen=1000, row024461/f1:qual/1318200440867/Put/vlen=10, row024461/f2:data/1318200440867/Put/vlen=1000, row024461/f2:qual/1318200440867/Put/vlen=10, row024461/f3:data/1318200440867/Put/vlen=1000, row024461/f3:qual/1318200440867/Put/vlen=10, row024461/f4:data/1318200440867/Put/vlen=1000, row024461/f4:qual/1318200440867/Put/vlen=10}, 56: keyvalues={row024461/f5:data/1318200440867/Put/vlen=1000, row024461/f5:qual/1318200440867/Put/vlen=10, row024461/f6:data/1318200440867/Put/vlen=1000, row024461/f6:qual/1318200440867/Put/vlen=10, row024461/f7:data/1318200440867/Put/vlen=1000, row024461/f7:qual/1318200440867/Put/vlen=10, row024461/f8:data/1318200440867/Put/vlen=1000, row024461/f8:qual/1318200440867/Put/vlen=10, row024461/f9:data/1318200440867/Put/vlen=1000, row024461/f9:qual/1318200440867/Put/vlen=10}] I've only tested this on 0.90.1+patches and 0.90.3+patches, but it is consistent and duplicatable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2739) Master should fail to start if it cannot successfully split logs
[ https://issues.apache.org/jira/browse/HBASE-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129027#comment-13129027 ] Alex Newman commented on HBASE-2739: poke Master should fail to start if it cannot successfully split logs Key: HBASE-2739 URL: https://issues.apache.org/jira/browse/HBASE-2739 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.20.4, 0.90.0 Reporter: Todd Lipcon Assignee: Alex Newman Priority: Critical In trunk, in splitLogAfterStartup(), we log the error splitting, but don't shut down. Depending on configuration, we should probably shut down here rather than continue with dataloss. In 0.20, we print the stacktrace to stdout in verifyClusterState, but continue through and often fail to start up -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats
[ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129035#comment-13129035 ] Todd Lipcon commented on HBASE-3929: Thanks for updating the patch to trunk. A couple of comments (fun to look back over my own code from a few months back): - let's rename {{pkv}} to {{prevKV}} - in the case of an empty HFile, we would currently throw a divide-by-zero. In LongStats.toString, we should check for count == 0 and return "no data" or something Add option to HFile tool to produce basic stats --- Key: HBASE-3929 URL: https://issues.apache.org/jira/browse/HBASE-3929 Project: HBase Issue Type: New Feature Components: io Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.94.0 Attachments: hbase-3929-draft.patch, hbase-3929-draft.txt In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it: - min/mean/max key size, value size (uncompressed) - min/mean/max number of columns per row (uncompressed) - min/mean/max number of bytes per row (uncompressed) - the key of the largest row
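The empty-file guard suggested for LongStats.toString could look like the sketch below. The patch itself is not shown in this thread, so the fields and output format here are assumptions; only the count == 0 check and the "no data" result come from the review comment.

```java
class LongStatsSketch {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private long sum = 0;
    private long count = 0;

    /** Records one observation. */
    void collect(long value) {
        if (value < min) min = value;
        if (value > max) max = value;
        sum += value;
        count++;
    }

    @Override
    public String toString() {
        if (count == 0) {
            // Empty HFile: nothing was collected, so there is no mean to compute.
            return "no data";
        }
        return "count: " + count + "\tmin: " + min + "\tmax: " + max
            + "\tmean: " + ((double) sum / count);
    }
}
```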
[jira] [Commented] (HBASE-2739) Master should fail to start if it cannot successfully split logs
[ https://issues.apache.org/jira/browse/HBASE-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129036#comment-13129036 ] Ted Yu commented on HBASE-2739: --- In TRUNK, MasterFileSystem.splitLog() has the following: {code} } catch (IOException e) { LOG.error("Failed distributed splitting " + serverNames, e); } ... } catch (IOException e) { LOG.error("Failed splitting " + logDir.toString(), e); } finally { {code} Master should fail to start if it cannot successfully split logs Key: HBASE-2739 URL: https://issues.apache.org/jira/browse/HBASE-2739 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.20.4, 0.90.0 Reporter: Todd Lipcon Assignee: Alex Newman Priority: Critical In trunk, in splitLogAfterStartup(), we log the error splitting, but don't shut down. Depending on configuration, we should probably shut down here rather than continue with dataloss. In 0.20, we print the stacktrace to stdout in verifyClusterState, but continue through and often fail to start up
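The behavior HBASE-2739 asks for, shutting down on a split failure "depending on configuration" instead of only logging it, might be sketched as follows. The `Splitter` interface and the `abortOnFailure` flag are hypothetical stand-ins, not names from MasterFileSystem.

```java
import java.io.IOException;

class SplitLogSketch {
    /** Hypothetical stand-in for the real log-splitting work. */
    interface Splitter {
        void split() throws IOException;
    }

    /**
     * Runs the splitter and returns true on success. On failure the error is
     * logged; if abortOnFailure is set it is also rethrown so that startup
     * stops, otherwise false is returned and startup continues.
     */
    static boolean splitLogAfterStartup(Splitter splitter, boolean abortOnFailure)
            throws IOException {
        try {
            splitter.split();
            return true;
        } catch (IOException e) {
            System.err.println("Failed splitting: " + e.getMessage());
            if (abortOnFailure) {
                throw e; // let the caller shut the master down
            }
            return false;
        }
    }
}
```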
[jira] [Created] (HBASE-4605) Add constraints as a top-level feature
Add constraints as a top-level feature -- Key: HBASE-4605 URL: https://issues.apache.org/jira/browse/HBASE-4605 Project: HBase Issue Type: Improvement Components: client, coprocessors Affects Versions: 0.94.0 Reporter: Jesse Yates Assignee: Jesse Yates From Jesse's comment on dev: {quote} What I would like to propose is a simple interface that people can use to implement a 'constraint' (matching the classic database definition). This would help ease of adoption by helping HBase more easily check that box, help minimize code duplication across organizations, and lead to easier adoption. Essentially, people would implement a 'Constraint' interface for checking keys before they are put into a table. Puts that are valid get written to the table, but if not, people can throw an exception that gets propagated back to the client explaining why the put was invalid. Constraints would be set on a per-table basis and the user would be expected to ensure the jars containing the constraint are present on the machines serving that table. Yes, people could roll their own mechanism for doing this via coprocessors each time, but this would make it easier to do so, so you only have to implement a very minimal interface and not worry about the specifics. {quote}
[jira] [Resolved] (HBASE-4579) CST.requestCompaction semantics changed, logs are now spammed when too many store files
[ https://issues.apache.org/jira/browse/HBASE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans resolved HBASE-4579. --- Resolution: Fixed Release Note: MemStoreFlusher will now request only 1 compaction when waiting because there are too many store files, instead of one per check Hadoop Flags: Reviewed Committed to 0.92 and trunk, thanks for the review Ted! CST.requestCompaction semantics changed, logs are now spammed when too many store files --- Key: HBASE-4579 URL: https://issues.apache.org/jira/browse/HBASE-4579 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.0 Attachments: HBASE-4579-v2.patch, HBASE-4579.patch Another bug I'm not so sure what's going on. I see this in my log: {quote} 2011-10-12 00:23:43,435 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:44,335 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:45,236 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:46,136 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:47,036 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:47,936 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact {quote} It spams for a while, and a little later instead I get: {quote} 2011-10-12 00:26:52,139 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:53,040 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:53,940 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. 
Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:54,840 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:55,741 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:56,641 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. {quote} I believe I also saw something like that for flushes, but the region was closing so at least I know why it was spamming (would be nice if it just unrequested the flush): {quote} 2011-10-12 00:26:40,693 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 
2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,921 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,922 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false,
[jira] [Commented] (HBASE-4605) Add constraints as a top-level feature
[ https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129064#comment-13129064 ] Jesse Yates commented on HBASE-4605:
I've been thinking about how to implement this and see two ways to go about it.
Method 1: My idea is to write a ConstraintProcessor that is a system-level CP with system-wide support for setting constraints on a table. This requires adding 'top-level' configuration values that the user would set for constraints to run (which would be ordered like coprocessors), but they would just implement the 'Constraint' interface. This means modifying HTD and the shell to enable all of this. It also means people need to distribute the jars and set conf values similar to what they would have had to do before, but we would handle making sure the implemented Constraints get run in the right place, and propagate the errors (e.g. ConstraintFailedException) back to the users.
Pluses: 1) Allows users to get back multiple reasons why a put failed. 2) Allows a ConstraintImpl to be a subclass of any arbitrary class and not bound to some abstract constraint. 3) People don't have to worry about it being a coprocessor - it is notionally divorced.
Minuses: 1) Requires changing a bunch of code in HTableDescriptor and essentially duplicating a lot of the checking/setting already done for coprocessors. This can be gotten around by generalizing the mechanism for storing classes in the HTD.
Method 2 (already implemented, patch coming): Add a superclass AbstractConstraint which only exposes a check(Put) method. It is actually a Coprocessor which is loaded, processes the check, and then returns the error to the client (wrapped in an IOException) on failure.
Pluses: 1) We don't have to implement any new mechanisms for specifying the constraint; people just have to add it as a coprocessor.
Minuses: 1) It could be confusing, since with this mechanism you just want people to think in terms of Constraints, not coprocessors. 2) You are bound to extending the AbstractCoprocessor, not just implementing the interface. 3) If just one constraint fails, then the put is rejected, so you can't find out all the reasons it would fail (useful if cleaning data). 4) It doesn't really help 'simplify' the use of HBase. In fact, it increases the complexity.
Add constraints as a top-level feature -- Key: HBASE-4605 URL: https://issues.apache.org/jira/browse/HBASE-4605 Project: HBase Issue Type: Improvement Components: client, coprocessors Affects Versions: 0.94.0 Reporter: Jesse Yates Assignee: Jesse Yates
From Jesse's comment on dev: {quote} What I would like to propose is a simple interface that people can use to implement a 'constraint' (matching the classic database definition). This would help ease of adoption by helping HBase more easily check that box, help minimize code duplication across organizations, and lead to easier adoption. Essentially, people would implement a 'Constraint' interface for checking keys before they are put into a table. Puts that are valid get written to the table; if not, they can throw an exception that gets propagated back to the client explaining why the put was invalid. Constraints would be set on a per-table basis and the user would be expected to ensure the jars containing the constraint are present on the machines serving that table. Yes, people could roll their own mechanism for doing this via coprocessors each time, but this would make it easier to do so, so you only have to implement a very minimal interface and not worry about the specifics. {quote}
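To make the trade-off between the two methods in the comment above concrete, here is a minimal sketch in plain Java. It is not the patch and does not use any HBase classes: FakePut, Constraint, checkAll and checkFirst are hypothetical names invented for illustration, and returning a reason string (rather than throwing) is an assumption made to keep the example small. checkAll shows the "multiple reasons" behaviour listed as a plus of Method 1; checkFirst shows the "first failure rejects the put" behaviour listed as a minus of Method 2.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ConstraintSketch {
    /** Hypothetical stand-in for a Put (column name -> value); NOT the HBase Put class. */
    static final class FakePut {
        final Map<String, String> columns = new LinkedHashMap<>();
        FakePut add(String column, String value) {
            columns.put(column, value);
            return this;
        }
    }

    /** Hypothetical minimal interface: return null when the put is valid, else a reason. */
    interface Constraint {
        String check(FakePut put);
    }

    /** Method 1 flavour: run every constraint in order and collect all failure reasons. */
    static List<String> checkAll(List<Constraint> constraints, FakePut put) {
        List<String> reasons = new ArrayList<>();
        for (Constraint c : constraints) {
            String reason = c.check(put);
            if (reason != null) {
                reasons.add(reason);
            }
        }
        return reasons;
    }

    /** Method 2 flavour: the first failing constraint rejects the put with an IOException. */
    static void checkFirst(List<Constraint> constraints, FakePut put) throws IOException {
        for (Constraint c : constraints) {
            String reason = c.check(put);
            if (reason != null) {
                throw new IOException(reason);
            }
        }
    }

    public static void main(String[] args) {
        Constraint notEmpty = p -> p.columns.isEmpty() ? "put has no columns" : null;
        Constraint hasId = p -> p.columns.containsKey("id") ? null : "missing id column";
        List<Constraint> constraints = Arrays.asList(notEmpty, hasId);
        // An empty put violates both constraints; checkAll reports each reason.
        System.out.println(checkAll(constraints, new FakePut()));
    }
}
```

With checkFirst, a client cleaning data would have to fix and resubmit once per violated constraint; with checkAll it sees every reason in one round trip.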
[jira] [Created] (HBASE-4606) Remove spam in HCM and fix a list.size == 0
Remove spam in HCM and fix a list.size == 0 --- Key: HBASE-4606 URL: https://issues.apache.org/jira/browse/HBASE-4606 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.92.0 As discussed on the ML, HCM in 0.92 is being spammy with expecting X results which is a debug leftover. Also right next to it I see a list.size == 0, which should be converted into isEmpty. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4606) Remove spam in HCM and fix a list.size == 0
[ https://issues.apache.org/jira/browse/HBASE-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-4606: -- Attachment: HBASE-4606.patch Here's the patch I'm committing. Remove spam in HCM and fix a list.size == 0 --- Key: HBASE-4606 URL: https://issues.apache.org/jira/browse/HBASE-4606 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.92.0 Attachments: HBASE-4606.patch As discussed on the ML, HCM in 0.92 is being spammy with expecting X results which is a debug leftover. Also right next to it I see a list.size == 0, which should be converted into isEmpty. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4606) Remove spam in HCM and fix a list.size == 0
[ https://issues.apache.org/jira/browse/HBASE-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129077#comment-13129077 ] Ted Yu commented on HBASE-4606: --- +1 on patch. Remove spam in HCM and fix a list.size == 0 --- Key: HBASE-4606 URL: https://issues.apache.org/jira/browse/HBASE-4606 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.92.0 Attachments: HBASE-4606.patch As discussed on the ML, HCM in 0.92 is being spammy with expecting X results which is a debug leftover. Also right next to it I see a list.size == 0, which should be converted into isEmpty. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-1611) Have shell output binary hex-encoded rather than octal-encoded
[ https://issues.apache.org/jira/browse/HBASE-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129085#comment-13129085 ] Alex Newman commented on HBASE-1611: I am a bit confused about this bug. When I print strings, it seems to be trying to use hexadecimal, but every once in a while it just falls apart. For instance, the following works fine: \x00\x00\x00\x00 column=on:\x00\x00\x1A\x1F, timestamp=1318533963647, value=\x00\x00\x00\x00 but \x00\x00\x00\x00 column=on:\x00\x00\x1A$, timestamp=1318533963647, value=\x00\x00\x00\x00 falls apart, for some reason. Is this what you are discussing? Have shell output binary hex-encoded rather than octal-encoded -- Key: HBASE-1611 URL: https://issues.apache.org/jira/browse/HBASE-1611 Project: HBase Issue Type: Bug Reporter: stack Labels: noob Native Ruby String dump and inspect output unprintables in octal. Don't seem to be able to change that fact. Figure way to do them as hex to match binaries in UI.
[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats
[ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-3929: --- Attachment: HBASE-3929-v2.patch Add option to HFile tool to produce basic stats --- Key: HBASE-3929 URL: https://issues.apache.org/jira/browse/HBASE-3929 Project: HBase Issue Type: New Feature Components: io Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.94.0 Attachments: HBASE-3929-v2.patch, hbase-3929-draft.patch, hbase-3929-draft.txt In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it: - min/mean/max key size, value size (uncompressed) - min/mean/max number of columns per row (uncompressed) - min/mean/max number of bytes per row (uncompressed) - the key of the largest row -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats
[ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129088#comment-13129088 ] Matteo Bertozzi commented on HBASE-3929: Currently HFilePrettyPrinter raises a couple of exceptions if the HFile is empty, because it doesn't check whether seekTo() returns true or false, and the first call after seekTo() is a scanner.getKeyValue(), so you get an NPE... I've added a v2 patch with the pkv rename, count == 0 handled, and seekTo checked to fix the NPE. Add option to HFile tool to produce basic stats --- Key: HBASE-3929 URL: https://issues.apache.org/jira/browse/HBASE-3929 Project: HBase Issue Type: New Feature Components: io Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.94.0 Attachments: HBASE-3929-v2.patch, hbase-3929-draft.patch, hbase-3929-draft.txt In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it: - min/mean/max key size, value size (uncompressed) - min/mean/max number of columns per row (uncompressed) - min/mean/max number of bytes per row (uncompressed) - the key of the largest row
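The empty-file failure described in the comment above comes down to using the scanner before confirming that seekTo() found anything. A minimal sketch of the guarded pattern follows, assuming a simplified scanner: FakeScanner and keyLengthStats are hypothetical stand-ins written for this example, not the real HFileScanner or HFilePrettyPrinter code, and the statistics are reduced to count/min/max of key length.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class EmptyFileScanSketch {
    /** Hypothetical stand-in for an HFile scanner; NOT the real HFileScanner API. */
    static final class FakeScanner {
        private final List<String> keys;
        private int pos = -1;

        FakeScanner(List<String> keys) {
            this.keys = keys;
        }

        /** Positions at the first entry; returns false when the file has no entries. */
        boolean seekTo() {
            if (keys.isEmpty()) {
                return false;
            }
            pos = 0;
            return true;
        }

        /** Returns the current entry, or null if the scanner was never positioned. */
        String getKeyValue() {
            return pos < 0 ? null : keys.get(pos);
        }

        /** Advances to the next entry; returns false at the end of the file. */
        boolean next() {
            if (pos + 1 >= keys.size()) {
                return false;
            }
            pos++;
            return true;
        }
    }

    /** Returns {count, minKeyLength, maxKeyLength}; all zeros for an empty file. */
    static long[] keyLengthStats(FakeScanner scanner) {
        // Guard: without this check, getKeyValue() returns null on an empty
        // file and the length() call below would throw a NullPointerException.
        if (!scanner.seekTo()) {
            return new long[] {0, 0, 0};
        }
        long count = 0;
        long min = Long.MAX_VALUE;
        long max = 0;
        do {
            int len = scanner.getKeyValue().length();
            count++;
            min = Math.min(min, len);
            max = Math.max(max, len);
        } while (scanner.next());
        return new long[] {count, min, max};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
            keyLengthStats(new FakeScanner(Arrays.asList("a", "bbb", "cc")))));
        System.out.println(Arrays.toString(
            keyLengthStats(new FakeScanner(Collections.<String>emptyList()))));
    }
}
```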
[jira] [Resolved] (HBASE-4606) Remove spam in HCM and fix a list.size == 0
[ https://issues.apache.org/jira/browse/HBASE-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans resolved HBASE-4606. --- Resolution: Fixed Hadoop Flags: Reviewed Committed to 0.92 and trunk, thanks for taking a look at the patch Ted. Remove spam in HCM and fix a list.size == 0 --- Key: HBASE-4606 URL: https://issues.apache.org/jira/browse/HBASE-4606 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.92.0 Attachments: HBASE-4606.patch As discussed on the ML, HCM in 0.92 is being spammy with expecting X results which is a debug leftover. Also right next to it I see a list.size == 0, which should be converted into isEmpty. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4605) Add constraints as a top-level feature
[ https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129094#comment-13129094 ] Gary Helmling commented on HBASE-4605: -- {quote} My idea is to write a ConstraintProcessor that is a system-level CP with system-wide support for setting constraints on a table. This requires adding 'top-level' configuration values that the user would set for constraints to run (which would be ordered like coprocessors), but they would just implement the 'Constraint' interface. This means modifying HTD and the shell to enable all of this. {quote} I like the idea of using a system-level coprocessor with a minimal extension interface for the checks to be performed. For the actual interface, you could even use Predicate from the google guava lib, or have Constraint just be a named interface that extends Predicate<Put>. Not critical, but plugging in to a standard interface instead of doing a one-off may enable future uses... For setting the constraint implementations to be applied per table, I agree that using table attributes is probably easiest. But I don't see why we need to modify HTableDescriptor to enable this. We currently have HBASE-4554, which is looking to enable setting table attributes for coprocessors from the shell. It seems like we could make that sufficiently generic to enable both the coprocessors case and this with just changes to the shell code? Add constraints as a top-level feature -- Key: HBASE-4605 URL: https://issues.apache.org/jira/browse/HBASE-4605 Project: HBase Issue Type: Improvement Components: client, coprocessors Affects Versions: 0.94.0 Reporter: Jesse Yates Assignee: Jesse Yates From Jesse's comment on dev: {quote} What I would like to propose is a simple interface that people can use to implement a 'constraint' (matching the classic database definition). This would help ease of adoption by helping HBase more easily check that box, help minimize code duplication across organizations, and lead to easier adoption. Essentially, people would implement a 'Constraint' interface for checking keys before they are put into a table. Puts that are valid get written to the table; if not, they can throw an exception that gets propagated back to the client explaining why the put was invalid. Constraints would be set on a per-table basis and the user would be expected to ensure the jars containing the constraint are present on the machines serving that table. Yes, people could roll their own mechanism for doing this via coprocessors each time, but this would make it easier to do so, so you only have to implement a very minimal interface and not worry about the specifics. {quote}
[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog
[ https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129096#comment-13129096 ] jirapos...@reviews.apache.org commented on HBASE-4528: -- bq. On 2011-10-17 08:46:53, ramkrishna vasudevan wrote: bq. @Dhruba bq. Any figures to know the improvement in write performance? I had updated the performance numbers in the JIRA comments. Please let me know if they are adequate. - Dhruba --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2141/#review2617 --- On 2011-10-17 04:39:55, Dhruba Borthakur wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2141/ bq. --- bq. bq. (Updated 2011-10-17 04:39:55) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. The changes the multiPut operation so that the sync to the wal occurs outside the rowlock. bq. bq. This enhancement is done only to HRegion.mut(Put[]) because this is the only method that gets invoked from an application. The HRegion.put(Put) is used only by unit tests and should possibly be deprecated. bq. bq. bq. This addresses bug HBASE-4528. bq. https://issues.apache.org/jira/browse/HBASE-4528 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1184991 bq. /src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java 1184991 bq./src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1184991 bq. /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1184991 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184991 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1184991 bq./src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java PRE-CREATION bq./src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1184991 bq. 
/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 1184991 bq. bq. Diff: https://reviews.apache.org/r/2141/diff bq. bq. bq. Testing bq. --- bq. bq. I ran TestLogRolling over and over again, about 50 times, not failed a single time. bq. bq. bq. Thanks, bq. bq. Dhruba bq. bq. The put operation can release the rowlock before sync-ing the Hlog -- Key: HBASE-4528 URL: https://issues.apache.org/jira/browse/HBASE-4528 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt, appendNoSyncPut5.txt This allows for better throughput when there are hot rows. A single row update improves from 100 puts/sec/server to 5000 puts/sec/server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4274) RS should periodically ping its HLog pipeline even if no writes are arriving
[ https://issues.apache.org/jira/browse/HBASE-4274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129147#comment-13129147 ] Ted Yu commented on HBASE-4274: --- Gary has addressed rolling restart of DNs. Can we move this issue to 0.94 ? RS should periodically ping its HLog pipeline even if no writes are arriving Key: HBASE-4274 URL: https://issues.apache.org/jira/browse/HBASE-4274 Project: HBase Issue Type: Improvement Components: regionserver, wal Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.92.0 If you restart HDFS underneath HBase, when HBase isn't taking any write load, the region servers won't notice that there's any problem until the next time they take a write, at which point they will abort (because the pipeline is gone from beneath them). It would be better if they wrote some garbage to their HLog once every few seconds as a sort of keepalive, so they will aggressively abort as soon as there's an issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4361) Certain filter expressions fail in the shell
[ https://issues.apache.org/jira/browse/HBASE-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4361: -- Fix Version/s: (was: 0.92.0) 0.94.0 Moving the remaining work to 0.94 Certain filter expressions fail in the shell Key: HBASE-4361 URL: https://issues.apache.org/jira/browse/HBASE-4361 Project: HBase Issue Type: Bug Components: filters, shell Affects Versions: 0.92.0 Reporter: Todd Lipcon Priority: Critical Fix For: 0.94.0 Attachments: Filter Language.docx, small-improvements.txt Running the following in the shell hangs and then fails: {noformat} scan 't1', { FILTER = SingleColumnValueFilter(, '1', 'f1', 'col_a') } {noformat} The error seems to be: org.jruby.exceptions.RaiseException: (NoMethodError) undefined method `write' for true:TrueClass -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4605) Add constraints as a top-level feature
[ https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129155#comment-13129155 ] nkeywal commented on HBASE-4605: It would be a very useful feature. For what it's worth, I am quite fine with method 2. Some points I found useful in the SQL engines: - the capacity to disable the constraints for a connection (basically because you're doing a big set of operations and you want the best possible performance) - the capacity to disable them globally (upgrade). I also wonder how we should manage the evolution of the constraints. On an SQL db, there is only one possible set; considering the amount of data, the traditional upgrade of a traditional SQL db may not be possible here, so managing the evolution of constraints would make sense. A comment as well: in a SQL system, the constraints are linked to the transaction: they are checked once the transaction is committed. This is important for cross-table constraint checks, as there is no transaction between tables in HBase. I would tend to believe that's mainly a question of documentation (insert in the right order), but it's something to remember anyway (especially as you want to duplicate the relationships with HBase more than with a sql db)... I don't know the status of HCatalog, but I think the HCatalog schema will be transformable into HBase constraints, adding value to the two of them... Add constraints as a top-level feature -- Key: HBASE-4605 URL: https://issues.apache.org/jira/browse/HBASE-4605 Project: HBase Issue Type: Improvement Components: client, coprocessors Affects Versions: 0.94.0 Reporter: Jesse Yates Assignee: Jesse Yates From Jesse's comment on dev: {quote} What I would like to propose is a simple interface that people can use to implement a 'constraint' (matching the classic database definition). This would help ease of adoption by helping HBase more easily check that box, help minimize code duplication across organizations, and lead to easier adoption. Essentially, people would implement a 'Constraint' interface for checking keys before they are put into a table. Puts that are valid get written to the table; if not, they can throw an exception that gets propagated back to the client explaining why the put was invalid. Constraints would be set on a per-table basis and the user would be expected to ensure the jars containing the constraint are present on the machines serving that table. Yes, people could roll their own mechanism for doing this via coprocessors each time, but this would make it easier to do so, so you only have to implement a very minimal interface and not worry about the specifics. {quote}
[jira] [Commented] (HBASE-4388) Second start after migration from 90 to trunk crashes
[ https://issues.apache.org/jira/browse/HBASE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129156#comment-13129156 ] Ted Yu commented on HBASE-4388: --- The new test would be similar to TestDFSUpgradeFromImage which uses hadoop-14-dfs-dir.tgz as a source for the old image. We should create some HBase table using hbase-0.90 and tar the entire dataset. Then write a unit test which untars the old dataset, and starts hbase-0.92. This should successfully read the old image and upgrade it. Second start after migration from 90 to trunk crashes - Key: HBASE-4388 URL: https://issues.apache.org/jira/browse/HBASE-4388 Project: HBase Issue Type: Bug Components: master, migration Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: stack Priority: Blocker Fix For: 0.92.0 Attachments: 4388-v2.txt, 4388-v3.txt, 4388.txt, hbase-master-nase.log, meta.tgz I started a trunk cluster to upgrade from 90, inserted a ton of data, then did a clean shutdown. When I started again, I got the following exception: 11/09/13 12:29:09 INFO master.HMaster: Meta has HRI with HTDs. Updating meta now. 11/09/13 12:29:09 FATAL master.HMaster: Unhandled exception. Starting shutdown. 
java.lang.NegativeArraySizeException: -102
 at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:147)
 at org.apache.hadoop.hbase.HTableDescriptor.readFields(HTableDescriptor.java:606)
 at org.apache.hadoop.hbase.migration.HRegionInfo090x.readFields(HRegionInfo090x.java:641)
 at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:133)
 at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:103)
 at org.apache.hadoop.hbase.util.Writables.getHRegionInfoForMigration(Writables.java:228)
 at org.apache.hadoop.hbase.catalog.MetaEditor.getHRegionInfoForMigration(MetaEditor.java:350)
 at org.apache.hadoop.hbase.catalog.MetaEditor$1.visit(MetaEditor.java:273)
 at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:633)
 at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:255)
 at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:235)
 at org.apache.hadoop.hbase.catalog.MetaEditor.updateMetaWithNewRegionInfo(MetaEditor.java:284)
 at org.apache.hadoop.hbase.catalog.MetaEditor.migrateRootAndMeta(MetaEditor.java:298)
 at org.apache.hadoop.hbase.master.HMaster.updateMetaWithNewHRI(HMaster.java:529)
 at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:472)
 at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:309)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4254) Get tests passing on Hadoop 23
[ https://issues.apache.org/jira/browse/HBASE-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4254: -- Fix Version/s: (was: 0.92.0) 0.94.0 HBASE-4510 is marked for 0.94 Moving this issue to 0.94 as well. Get tests passing on Hadoop 23 -- Key: HBASE-4254 URL: https://issues.apache.org/jira/browse/HBASE-4254 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.94.0 Currently some 30 or so tests are failing on the HBase-trunk-on-hadoop-23 build. It looks like most are reflection-based issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4254) Get tests passing on Hadoop 23
[ https://issues.apache.org/jira/browse/HBASE-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129168#comment-13129168 ] Todd Lipcon commented on HBASE-4254: Why move to 0.94? This isn't marked a blocker, and we should be targeting the 92 branch, even if it doesn't make it for 0.92.0. Get tests passing on Hadoop 23 -- Key: HBASE-4254 URL: https://issues.apache.org/jira/browse/HBASE-4254 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.94.0 Currently some 30 or so tests are failing on the HBase-trunk-on-hadoop-23 build. It looks like most are reflection-based issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4254) Get tests passing on Hadoop 23
[ https://issues.apache.org/jira/browse/HBASE-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129178#comment-13129178 ] Ted Yu commented on HBASE-4254: --- Feel free to bring this and HBASE-4510 to 0.92, knowing they're not blockers. Get tests passing on Hadoop 23 -- Key: HBASE-4254 URL: https://issues.apache.org/jira/browse/HBASE-4254 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.94.0 Currently some 30 or so tests are failing on the HBase-trunk-on-hadoop-23 build. It looks like most are reflection-based issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4585) Avoid seek operation when current kv is deleted
[ https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liyin Tang updated HBASE-4585: -- Attachment: (was: hbase-4585-apache.patch) Avoid seek operation when current kv is deleted --- Key: HBASE-4585 URL: https://issues.apache.org/jira/browse/HBASE-4585 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: hbase-4585-89.patch When the current kv is deleted during the matching in the ScanQueryMatcher, currently the matcher will return skip and continue to seek. Actually, if the current kv is deleted because of family deleted or column deleted, the matcher should seek to next col. If the current kv is deleted because of version deleted, the matcher should just return skip. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4585) Avoid seek operation when current kv is deleted
[ https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liyin Tang updated HBASE-4585: -- Attachment: (was: hbase-4585-trunk.patch) Avoid seek operation when current kv is deleted --- Key: HBASE-4585 URL: https://issues.apache.org/jira/browse/HBASE-4585 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: hbase-4585-89.patch When the current kv is deleted during the matching in the ScanQueryMatcher, currently the matcher will return skip and continue to seek. Actually, if the current kv is deleted because of family deleted or column deleted, the matcher should seek to next col. If the current kv is deleted because of version deleted, the matcher should just return skip. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4585) Avoid seek operation when current kv is deleted
[ https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liyin Tang updated HBASE-4585: -- Attachment: hbase-4585-apache.patch The patch for apache-trunk is ready. Avoid seek operation when current kv is deleted --- Key: HBASE-4585 URL: https://issues.apache.org/jira/browse/HBASE-4585 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: hbase-4585-89.patch When the current kv is deleted during the matching in the ScanQueryMatcher, currently the matcher will return skip and continue to seek. Actually, if the current kv is deleted because of family deleted or column deleted, the matcher should seek to next col. If the current kv is deleted because of version deleted, the matcher should just return skip. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4553) The update of .tableinfo is not atomic; we remove then rename
[ https://issues.apache.org/jira/browse/HBASE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129188#comment-13129188 ] Ted Yu commented on HBASE-4553: --- We should impose an upper limit on the amount of time we wait in the loop. The update of .tableinfo is not atomic; we remove then rename - Key: HBASE-4553 URL: https://issues.apache.org/jira/browse/HBASE-4553 Project: HBase Issue Type: Task Reporter: stack Priority: Critical Fix For: 0.92.0 Attachments: HBase-4553-TestAvroServer.patch This comes out of HBASE-4547. The rename in 0.20 hdfs fails if the file already exists. In 0.20+ it's better, but there are still 'some' issues if there is an existing reader when the file is renamed. This issue is about fixing this (though we depend on the fix first being in hdfs).
[jira] [Updated] (HBASE-4585) Avoid seek operation when current kv is deleted
[ https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liyin Tang updated HBASE-4585: -- Attachment: hbase-4585-apache-trunk.patch Avoid seek operation when current kv is deleted --- Key: HBASE-4585 URL: https://issues.apache.org/jira/browse/HBASE-4585 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: hbase-4585-89.patch, hbase-4585-apache-trunk.patch When the current kv is deleted during the matching in the ScanQueryMatcher, currently the matcher will return skip and continue to seek. Actually, if the current kv is deleted because of family deleted or column deleted, the matcher should seek to next col. If the current kv is deleted because of version deleted, the matcher should just return skip. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4538) NPE in AssignmentManager#updateTimers
[ https://issues.apache.org/jira/browse/HBASE-4538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4538: -- Fix Version/s: (was: 0.92.0) 0.94.0 This was the last time NPE happened, 28 builds ago: https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/42/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin/testOnlineChangeTableSchema/ If it happens again, we would get accurate line numbers. NPE in AssignmentManager#updateTimers - Key: HBASE-4538 URL: https://issues.apache.org/jira/browse/HBASE-4538 Project: HBase Issue Type: Bug Reporter: stack Fix For: 0.94.0 Saw this in a failed TestAdmin on 0.92 {code} 2011-10-05 01:18:58,890 ERROR [MASTER_OPEN_REGION-sv4r9s38,52146,131098450-2] executor.EventHandler(171): Caught throwable while processing event RS_ZK_REGION_OPENED java.lang.NullPointerException at org.apache.hadoop.hbase.master.AssignmentManager.updateTimers(AssignmentManager.java:1053) at org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:1027) at org.apache.hadoop.hbase.master.handler.OpenedRegionHandler.process(OpenedRegionHandler.java:108) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Work started] (HBASE-4585) Avoid seek operation when current kv is deleted
[ https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-4585 started by Liyin Tang. Avoid seek operation when current kv is deleted --- Key: HBASE-4585 URL: https://issues.apache.org/jira/browse/HBASE-4585 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: hbase-4585-89.patch, hbase-4585-apache-trunk.patch When the current kv is deleted during the matching in the ScanQueryMatcher, currently the matcher will return skip and continue to seek. Actually, if the current kv is deleted because of family deleted or column deleted, the matcher should seek to next col. If the current kv is deleted because of version deleted, the matcher should just return skip. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4605) Add constraints as a top-level feature
[ https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129203#comment-13129203 ] Jesse Yates commented on HBASE-4605: @gary: {quote} I like the idea of using a system level coprocessor with a minimal extension interface for the checks to be performed. For the actual interface, you could even use Predicate from the google guava lib, or have Constraint just be a named interface that extends Predicate<Put>. Not critical, but plugging in to a standard interface instead of doing a one-off may enable future uses... {quote} That is exactly what I was thinking for the top-level implementation. {quote} seems like we could make that sufficiently generic to enable both the coprocessors case and this with just changes to the shell code {quote} +1 Right now coprocessors have a special syntax for loading on table level, which feels kind of clunky to do by hand (specifying COPROCESSOR$). I feel like we could definitely help enable setting values with a more concrete syntax (like a setCoprocessor method that we have on the HTableDescriptor now), which should handle the numbering, etc. So using an abstract version of the stuff from 4554 would definitely help with that. I don't know if we can just update the shell though - we would probably need to update the java connection as well. Right now the code for storing things in the conf would be fine, we just need to abstract it a little bit, so it would look something like:
{code}
public void addCoprocessor(String name) { addProcessingElement("coprocessor$", name); }
public void addConstraint(String name) { addProcessingElement("constraint$", name); }
public void addProcessingElement(String tag, String value) {
  ... // all the checking/adding currently in addCoprocessor
}
{code}
@keywal: Since they are just table configuration values, turning them on/off will be relatively painless. 
Cross-table transactions are a separate can of worms and really go against the whole design paradigm of HBase (see discussion on dev about this). This would be optimized to do single table checking, though people could implement cross table checks at serious cost (and later we can build in more optimized mechanisms if it is a common thing people do). {quote} HCatalog schema will be transformable as HBase constraints, adding value to the two of them... {quote} That should be super simple, it would just take a simple tool to create the corresponding constraints. I would use constraints to enforce things like data sanitization, rather than schema enforcement (it's the last-ditch barrier to things going into a table properly, since shipping things across the wire is expensive), which should be done client side, but it could definitely work. Add constraints as a top-level feature -- Key: HBASE-4605 URL: https://issues.apache.org/jira/browse/HBASE-4605 Project: HBase Issue Type: Improvement Components: client, coprocessors Affects Versions: 0.94.0 Reporter: Jesse Yates Assignee: Jesse Yates From Jesse's comment on dev: {quote} What I would like to propose is a simple interface that people can use to implement a 'constraint' (matching the classic database definition). This would help ease of adoption by helping HBase more easily check that box, help minimize code duplication across organizations, and lead to easier adoption. Essentially, people would implement a 'Constraint' interface for checking keys before they are put into a table. Puts that are valid get written to the table; for invalid puts, people can throw an exception that gets propagated back to the client explaining why the put was invalid. Constraints would be set on a per-table basis and the user would be expected to ensure the jars containing the constraint are present on the machines serving that table. 
Yes, people could roll their own mechanism for doing this via coprocessors each time, but this would make it easier to do so, so you only have to implement a very minimal interface and not worry about the specifics. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
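The proposal above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Constraint`, `ConstraintViolationException`, and `ConstrainedTable` are hypothetical names, and a plain `Map<String, String>` stands in for an HBase Put; none of this is taken from a patch on the issue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical exception that would be propagated back to the client
// to explain why a put was rejected.
class ConstraintViolationException extends RuntimeException {
    ConstraintViolationException(String message) {
        super(message);
    }
}

// Hypothetical minimal interface: inspect a put before it is written.
interface Constraint {
    void check(Map<String, String> put);
}

// Stand-in for a table that carries per-table constraints (not an HBase class).
class ConstrainedTable {
    private final List<Constraint> constraints = new ArrayList<>();
    private final List<Map<String, String>> rows = new ArrayList<>();

    void addConstraint(Constraint constraint) {
        constraints.add(constraint);
    }

    // Runs every constraint; the put is stored only if none of them throws.
    void put(Map<String, String> put) {
        for (Constraint constraint : constraints) {
            constraint.check(put);
        }
        rows.add(new HashMap<>(put));
    }

    int size() {
        return rows.size();
    }
}
```

A constraint in this sketch is just a lambda, for example one that throws `ConstraintViolationException` when a required column is missing; valid puts pass through untouched.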
[jira] [Commented] (HBASE-2739) Master should fail to start if it cannot successfully split logs
[ https://issues.apache.org/jira/browse/HBASE-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129225#comment-13129225 ] Alex Newman commented on HBASE-2739: - Is the appropriate way to fail just to throw? What's the science here? - I just created a conf variable which controls whether or not this should cause a shutdown. Are there any other cases where we do/don't want to shut down? Do we need a different variable for distributed / single node splitting? Master should fail to start if it cannot successfully split logs Key: HBASE-2739 URL: https://issues.apache.org/jira/browse/HBASE-2739 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.20.4, 0.90.0 Reporter: Todd Lipcon Assignee: Alex Newman Priority: Critical In trunk, in splitLogAfterStartup(), we log the error splitting, but don't shut down. Depending on configuration, we should probably shut down here rather than continue with data loss. In 0.20, we print the stacktrace to stdout in verifyClusterState, but continue through and often fail to start up.
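The configuration-controlled behaviour discussed above might look something like the sketch below. It is an assumption-laden illustration: the key `example.master.abort.on.split.failure` is hypothetical (not a real HBase setting), a `Map` stands in for the HBase configuration object, and `LogSplitter` is an invented interface.

```java
import java.io.IOException;
import java.util.Map;

class SplitLogStartup {
    // Hypothetical key used only for this sketch; not a real HBase setting.
    static final String ABORT_KEY = "example.master.abort.on.split.failure";

    // Invented stand-in for whatever performs the actual log splitting.
    interface LogSplitter {
        void splitLogs() throws IOException;
    }

    /**
     * Returns true if splitting succeeded, false if it failed but the
     * configuration says to continue. Throws if configured to fail fast.
     */
    static boolean splitLogAfterStartup(Map<String, String> conf, LogSplitter splitter)
            throws IOException {
        boolean abortOnFailure = Boolean.parseBoolean(conf.getOrDefault(ABORT_KEY, "true"));
        try {
            splitter.splitLogs();
            return true;
        } catch (IOException e) {
            if (abortOnFailure) {
                // Fail fast: the caller is expected to shut the master down.
                throw new IOException("Log splitting failed; aborting startup", e);
            }
            // Continue at the operator's risk; edits in unsplit logs may be lost.
            System.err.println("Log splitting failed, continuing: " + e.getMessage());
            return false;
        }
    }
}
```

Defaulting to the fail-fast branch is a design choice of this sketch, made because the issue argues that continuing risks data loss.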
[jira] [Updated] (HBASE-4585) Avoid seek operation when current kv is deleted
[ https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4585: -- Fix Version/s: 0.94.0 Avoid seek operation when current kv is deleted --- Key: HBASE-4585 URL: https://issues.apache.org/jira/browse/HBASE-4585 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Fix For: 0.94.0 Attachments: hbase-4585-89.patch, hbase-4585-apache-trunk.patch When the current kv is deleted during the matching in the ScanQueryMatcher, currently the matcher will return skip and continue to seek. Actually, if the current kv is deleted because of family deleted or column deleted, the matcher should seek to next col. If the current kv is deleted because of version deleted, the matcher should just return skip. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
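The decision described in this issue can be sketched as a small mapping from the kind of delete to the matcher's return code. The class and enum names below are illustrative only and are not claimed to match the attached patches; in particular, returning INCLUDE for a non-deleted kv is a simplification, since a real matcher performs further checks.

```java
class DeleteAwareMatcher {
    // Illustrative reasons why the current kv may be masked by a delete.
    enum DeleteResult { FAMILY_DELETED, COLUMN_DELETED, VERSION_DELETED, NOT_DELETED }

    // Illustrative subset of the codes a matcher could return.
    enum MatchCode { INCLUDE, SKIP, SEEK_NEXT_COL }

    static MatchCode match(DeleteResult result) {
        switch (result) {
            case FAMILY_DELETED:
            case COLUMN_DELETED:
                // Every remaining version of this column is masked as well,
                // so jump straight to the next column.
                return MatchCode.SEEK_NEXT_COL;
            case VERSION_DELETED:
                // Only this version is masked; other versions may still be live.
                return MatchCode.SKIP;
            default:
                return MatchCode.INCLUDE;
        }
    }
}
```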
[jira] [Created] (HBASE-4607) SplitLogWorker should correctly terminate when waiting for ZK node
SplitLogWorker should correctly terminate when waiting for ZK node -- Key: HBASE-4607 URL: https://issues.apache.org/jira/browse/HBASE-4607 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor This is an attempt to fix the fact that SplitLogWorker threads are not being terminated properly in some unit tests. This probably does not happen in production because the master always creates the log-splitting ZK node, but it does happen in 89-fb. Thanks to Prakash Khemani for help on this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4254) Get tests passing on Hadoop 23
[ https://issues.apache.org/jira/browse/HBASE-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4254: -- Fix Version/s: (was: 0.94.0) 0.92.0 Get tests passing on Hadoop 23 -- Key: HBASE-4254 URL: https://issues.apache.org/jira/browse/HBASE-4254 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.92.0 Currently some 30 or so tests are failing on the HBase-trunk-on-hadoop-23 build. It looks like most are reflection-based issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4606) Remove spam in HCM and fix a list.size == 0
[ https://issues.apache.org/jira/browse/HBASE-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129279#comment-13129279 ] Hudson commented on HBASE-4606: --- Integrated in HBase-0.92 #71 (See [https://builds.apache.org/job/HBase-0.92/71/]) HBASE-4606 Remove spam in HCM and fix a list.size == 0 jdcryans : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java Remove spam in HCM and fix a list.size == 0 --- Key: HBASE-4606 URL: https://issues.apache.org/jira/browse/HBASE-4606 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.92.0 Attachments: HBASE-4606.patch As discussed on the ML, HCM in 0.92 is being spammy with expecting X results which is a debug leftover. Also right next to it I see a list.size == 0, which should be converted into isEmpty. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
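For readers unfamiliar with the second cleanup, the change is of the following shape. The class and method names here are made up for illustration; the actual code in HConnectionManager is not reproduced.

```java
import java.util.List;

class EmptyCheck {
    // Preferred form: isEmpty() states the intent directly.
    static boolean nothingToProcess(List<?> results) {
        return results.isEmpty(); // previously written as: results.size() == 0
    }
}
```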
[jira] [Commented] (HBASE-4460) Support running an embedded ThriftServer within a RegionServer
[ https://issues.apache.org/jira/browse/HBASE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129286#comment-13129286 ] jirapos...@reviews.apache.org commented on HBASE-4460: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2410/ --- Review request for hbase, Dhruba Borthakur, Gary Helmling, Michael Stack, and Andrew Purtell. Summary --- Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together. This addresses bug HBASE-4460. https://issues.apache.org/jira/browse/HBASE-4460 Diffs - /src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 1174376 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1174376 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java PRE-CREATION Diff: https://reviews.apache.org/r/2410/diff Testing --- Running this already on our hbase-92-based branch and running test site. Thanks, Jonathan Support running an embedded ThriftServer within a RegionServer -- Key: HBASE-4460 URL: https://issues.apache.org/jira/browse/HBASE-4460 Project: HBase Issue Type: New Feature Components: regionserver, thrift Reporter: Jonathan Gray Assignee: Jonathan Gray Attachments: HBASE-4460-v1.patch Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together. 
This JIRA is just about the embedded ThriftServer. Will open others for the rest. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-4562. -- Resolution: Fixed Hadoop Flags: Reviewed Committed to 0.90, 0.92, and trunk. The 0.90 patch still did not apply to the current 0.90 branch. I applied the changes manually this time, but in the future it would be great to base patches on the latest state of the branch in SVN. When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below, to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
if (!testing) {
  MetaEditor.offlineParentInMeta(server.getCatalogTracker(), this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
  throw new IOException("some unexpected error in split");
}
{code}
2. Update the regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Kill the regionserver hosting the table.
6. Wait some time after the master's ServerShutdownHandler.process executes, then scan the table; you'll find that the data written earlier is lost.
The attached patch fixes the bug.
[jira] [Updated] (HBASE-4607) Split log worker should terminate properly when waiting for znode
[ https://issues.apache.org/jira/browse/HBASE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-4607: -- Summary: Split log worker should terminate properly when waiting for znode (was: SplitLogWorker should correctly terminate when waiting for ZK node) Split log worker should terminate properly when waiting for znode - Key: HBASE-4607 URL: https://issues.apache.org/jira/browse/HBASE-4607 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor This is an attempt to fix the fact that SplitLogWorker threads are not being terminated properly in some unit tests. This probably does not happen in production because the master always creates the log-splitting ZK node, but it does happen in 89-fb. Thanks to Prakash Khemani for help on this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4603) Uneeded sleep time for tests in hbase.master.ServerManager#waitForRegionServers
[ https://issues.apache.org/jira/browse/HBASE-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129295#comment-13129295 ] Jonathan Gray commented on HBASE-4603: -- There was a nice param in HBASE-3380 that is in 90 but not 92/trunk. I'm going to see if we can get that brought into the active branches, then we can just set the maxServers config to the # of RS set to start, and then it will just work instantly w/o having to wait for this interval/sleep loop. Uneeded sleep time for tests in hbase.master.ServerManager#waitForRegionServers --- Key: HBASE-4603 URL: https://issues.apache.org/jira/browse/HBASE-4603 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.92.0 Environment: all. Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 20111017_4603_MiniHBaseCluster.patch This function waits for at least 2 times hbase.master.wait.on.regionservers.interval, which defaults to 3 seconds, i.e. 6 seconds for every mini HBase cluster start. In the context of a mini cluster this is not useful, as the region servers are created locally. Changing this to a lower value such as 100ms (2 x 100ms = 0.2 seconds of waiting) saves 5.8 seconds per HBase cluster start. It should lower the build time on the apache server by more than 8%. Being more aggressive (removing all the wait time) could be possible as well. To be studied later.
[jira] [Commented] (HBASE-4607) Split log worker should terminate properly when waiting for znode
[ https://issues.apache.org/jira/browse/HBASE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129297#comment-13129297 ] jirapos...@reviews.apache.org commented on HBASE-4607: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2411/ --- Review request for hbase and Prakash Khemani. Summary --- This is an attempt to fix the fact that SplitLogWorker threads are not being terminated properly in some unit tests. This probably does not happen in production because the master always creates the log-splitting ZK node, but it does happen in 89-fb. Thanks to Prakash Khemani for help on this. This addresses bug hbase-4607. https://issues.apache.org/jira/browse/hbase-4607 Diffs - src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java a43e0b3 Diff: https://reviews.apache.org/r/2411/diff Testing --- I will run unit tests and post an update when they have passed. Thanks, Mikhail Split log worker should terminate properly when waiting for znode - Key: HBASE-4607 URL: https://issues.apache.org/jira/browse/HBASE-4607 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor This is an attempt to fix the fact that SplitLogWorker threads are not being terminated properly in some unit tests. This probably does not happen in production because the master always creates the log-splitting ZK node, but it does happen in 89-fb. Thanks to Prakash Khemani for help on this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
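The kind of fix described here, a worker that can be stopped while it is still waiting for a znode to appear, can be sketched as below. This is not the SplitLogWorker code from the diff: `ZnodeWaiter` is a hypothetical class, and the `BooleanSupplier` stands in for a ZooKeeper existence check.

```java
import java.util.function.BooleanSupplier;

class ZnodeWaiter implements Runnable {
    private final BooleanSupplier nodeExists; // stand-in for a ZooKeeper exists() call
    private final long pollMillis;
    private volatile boolean stopRequested = false;
    private volatile boolean nodeSeen = false;

    ZnodeWaiter(BooleanSupplier nodeExists, long pollMillis) {
        this.nodeExists = nodeExists;
        this.pollMillis = pollMillis;
    }

    // Asks the worker to exit; callers should also interrupt the thread
    // so that a sleeping worker wakes up promptly.
    void stop() {
        stopRequested = true;
    }

    boolean nodeSeen() {
        return nodeSeen;
    }

    @Override
    public void run() {
        try {
            while (!stopRequested) {
                if (nodeExists.getAsBoolean()) {
                    nodeSeen = true;
                    return;
                }
                Thread.sleep(pollMillis);
            }
        } catch (InterruptedException e) {
            // Restore the interrupt status and fall through so the thread ends.
            Thread.currentThread().interrupt();
        }
    }
}
```

The point of the sketch is that both exit paths (the stop flag and the interrupt) end the loop even when the node never appears, which is the situation the unit tests reportedly hit.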
[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers
[ https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129293#comment-13129293 ] Jonathan Gray commented on HBASE-3380: -- So it looks like we thought we'd do a proper fix for 0.92, but do we have one? There's some good config params that were committed as part of this JIRA into 0.90 that are now not available in 0.92. Should this be committed to 0.92 and trunk? I'd like to at least bring these config params over since they are pretty nice (and will make a more elegant solution to stuff like HBASE-4603). Master failover can split logs of live servers -- Key: HBASE-3380 URL: https://issues.apache.org/jira/browse/HBASE-3380 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jonathan Gray Priority: Blocker Fix For: 0.90.0 Attachments: HBASE-3380-v1.patch, HBASE-3380-v2.patch The reason why TestMasterFailover fails is that when it does the master failover, the new master doesn't wait long enough for all region servers to checkin so it goes ahead and split logs... 
which doesn't work because of the way lease timeouts work: {noformat} 2010-12-21 07:30:36,977 DEBUG [Master:0;vesta.apache.org:33170] wal.HLogSplitter(256): Splitting hlog 1 of 1: hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204, length=0 2010-12-21 07:30:36,977 DEBUG [WriterThread-1] wal.HLogSplitter$WriterThread(619): Writer thread Thread[WriterThread-1,5,main]: starting 2010-12-21 07:30:36,977 DEBUG [WriterThread-2] wal.HLogSplitter$WriterThread(619): Writer thread Thread[WriterThread-2,5,main]: starting 2010-12-21 07:30:36,977 INFO [Master:0;vesta.apache.org:33170] util.FSUtils(625): Recovering file hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204 2010-12-21 07:30:36,979 WARN [IPC Server handler 8 on 49187] namenode.FSNamesystem(1122): DIR* NameSystem.startFile: failed to create file /user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204 for DFSClient_hb_m_vesta.apache.org:33170_1292916630791 on client 127.0.0.1, because this file is already being created by DFSClient_hb_rs_vesta.apache.org,38743,1292916616340_1292916617166 on 127.0.0.1 ... 
2010-12-21 07:33:44,332 WARN [Master:0;vesta.apache.org:33170] util.FSUtils(644): Waited 187354ms for lease recovery on hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204 for DFSClient_hb_m_vesta.apache.org:33170_1292916630791 on client 127.0.0.1, because this file is already being created by DFSClient_hb_rs_vesta.apache.org,38743,1292916616340_1292916617166 on 127.0.0.1 {noformat} I think that we should always check in ZK the number of live region servers before waiting for them to check in, this way we know how many we should expect during failover. There's also a case where we still want to timeout, since RS can die during that time, but we should wait a bit longer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
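The suggestion above (learn the expected number of live region servers first, then wait for that many to check in, but still time out) can be sketched as follows. It is an illustration only: `RegionServerWait` is a hypothetical class, the expected count is passed in as if it had already been read from ZK, and the `IntSupplier` stands in for the master's count of checked-in servers.

```java
import java.util.function.IntSupplier;

class RegionServerWait {
    /**
     * Polls until checkedIn reaches expected or timeoutMillis elapses.
     * Returns the number of servers checked in when the wait ended.
     */
    static int waitForRegionServers(int expected, IntSupplier checkedIn,
                                    long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        int count = checkedIn.getAsInt();
        // Keep the timeout: a region server may die while we are waiting.
        while (count < expected && System.nanoTime() - deadline < 0) {
            Thread.sleep(pollMillis);
            count = checkedIn.getAsInt();
        }
        return count;
    }
}
```

A caller would compare the returned count with the expected one and only treat the missing servers as dead (and split their logs) after the timeout has genuinely expired.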
[jira] [Created] (HBASE-4608) HLog Compression
HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
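The dictionary idea mentioned above can be illustrated with a minimal sketch: repeated strings such as table names or column family names are replaced by small integer indexes after their first occurrence. `WalDictionary` is a hypothetical class written for this note; it says nothing about the actual HLog format or the eventual patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WalDictionary {
    private final Map<String, Integer> toIndex = new HashMap<>();
    private final List<String> toValue = new ArrayList<>();

    // Returns the index for a value, assigning a new one on first sight.
    int encode(String value) {
        Integer index = toIndex.get(value);
        if (index == null) {
            index = toValue.size();
            toIndex.put(value, index);
            toValue.add(value);
        }
        return index;
    }

    // Maps an index back to the original value.
    String decode(int index) {
        return toValue.get(index);
    }

    int size() {
        return toValue.size();
    }
}
```

In this sketch a writer would emit the full string only the first time and the index afterwards, while a reader rebuilds the same dictionary in the same order as it reads the log.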
[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers
[ https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129308#comment-13129308 ] Ted Yu commented on HBASE-3380: --- +1 on bringing over the parameters. Master failover can split logs of live servers -- Key: HBASE-3380 URL: https://issues.apache.org/jira/browse/HBASE-3380 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jonathan Gray Priority: Blocker Fix For: 0.90.0 Attachments: HBASE-3380-v1.patch, HBASE-3380-v2.patch The reason why TestMasterFailover fails is that when it does the master failover, the new master doesn't wait long enough for all region servers to checkin so it goes ahead and split logs... which doesn't work because of the way lease timeouts work: {noformat} 2010-12-21 07:30:36,977 DEBUG [Master:0;vesta.apache.org:33170] wal.HLogSplitter(256): Splitting hlog 1 of 1: hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204, length=0 2010-12-21 07:30:36,977 DEBUG [WriterThread-1] wal.HLogSplitter$WriterThread(619): Writer thread Thread[WriterThread-1,5,main]: starting 2010-12-21 07:30:36,977 DEBUG [WriterThread-2] wal.HLogSplitter$WriterThread(619): Writer thread Thread[WriterThread-2,5,main]: starting 2010-12-21 07:30:36,977 INFO [Master:0;vesta.apache.org:33170] util.FSUtils(625): Recovering file hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204 2010-12-21 07:30:36,979 WARN [IPC Server handler 8 on 49187] namenode.FSNamesystem(1122): DIR* NameSystem.startFile: failed to create file /user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204 for DFSClient_hb_m_vesta.apache.org:33170_1292916630791 on client 127.0.0.1, because this file is already being created by DFSClient_hb_rs_vesta.apache.org,38743,1292916616340_1292916617166 on 127.0.0.1 ... 
2010-12-21 07:33:44,332 WARN [Master:0;vesta.apache.org:33170] util.FSUtils(644): Waited 187354ms for lease recovery on hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204 for DFSClient_hb_m_vesta.apache.org:33170_1292916630791 on client 127.0.0.1, because this file is already being created by DFSClient_hb_rs_vesta.apache.org,38743,1292916616340_1292916617166 on 127.0.0.1 {noformat} I think that we should always check in ZK the number of live region servers before waiting for them to check in, this way we know how many we should expect during failover. There's also a case where we still want to timeout, since RS can die during that time, but we should wait a bit longer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
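The idea in the description above (learn from ZK how many region servers are alive, wait for that many to check in, but still give up after a timeout) can be sketched roughly as follows. This is a minimal illustration and not the code from the attached patches: `CheckinWaiter`, `waitForCheckins`, and the `expected`, `timeoutMs`, and `intervalMs` parameters are hypothetical names, and the supplier stands in for whatever counts the servers that have reported to the new master.

```java
import java.util.function.IntSupplier;

/** Hypothetical helper for illustration only; not part of HBase. */
final class CheckinWaiter {
    /**
     * Polls checkedIn until it reaches expected or timeoutMs has elapsed.
     * Returns the last observed count so the caller can decide what to do
     * when some servers never showed up (for example, because they died).
     */
    static int waitForCheckins(IntSupplier checkedIn, int expected,
                               long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        int seen = checkedIn.getAsInt();
        // Compare via subtraction so the check stays correct if nanoTime wraps.
        while (seen < expected && deadline - System.nanoTime() > 0) {
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
            seen = checkedIn.getAsInt();
        }
        return seen;
    }
}
```

A caller would pass the live-server count read from ZK as `expected` and treat a smaller return value as "timed out, proceed with the servers that did check in".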
[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers
[ https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129334#comment-13129334 ] Jean-Daniel Cryans commented on HBASE-3380: --- Let's do it, +1. Master failover can split logs of live servers -- Key: HBASE-3380 URL: https://issues.apache.org/jira/browse/HBASE-3380 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jonathan Gray Priority: Blocker Fix For: 0.90.0 Attachments: HBASE-3380-v1.patch, HBASE-3380-v2.patch
[jira] [Resolved] (HBASE-4574) Node deleted but still in RIT printed too often
[ https://issues.apache.org/jira/browse/HBASE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans resolved HBASE-4574. --- Resolution: Duplicate Fix Version/s: (was: 0.92.0) Dup of HBASE-4308 Node deleted but still in RIT printed too often - Key: HBASE-4574 URL: https://issues.apache.org/jira/browse/HBASE-4574 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Looking at the 0.92 master logs, I see I often get this message: bq. WARN org.apache.hadoop.hbase.master.AssignmentManager: Node deleted but still in RIT: TestTable,blah. state=OPEN, ts=1318369648361, server=blah The issue seems to be due to a race between OpenedRegionHandler and watchers in AssignmentManager. Specifically, ORH first deletes the znode then deletes the in-memory RIT data structure (via regionOnline). Between the two steps a watcher is triggered and if it arrives first then it will see the region still in RIT. If the message is really supposed to be a warning then in the current form it's useless as people will see this message 99% of the time because of this race.
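The ordering problem described above can be reproduced deterministically in a toy model. The sketch below is not HBase code: `RitRaceDemo`, its two sets, and `onNodeDeleted` are stand-ins, and the watcher is invoked synchronously at the point where the real one could fire. It only shows why deleting the znode before clearing the in-memory state lets the watcher observe a region that is still in RIT; it does not claim that reordering is how HBASE-4308 addressed it.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the ordering problem; all names are illustrative, not HBase APIs. */
final class RitRaceDemo {
    final Set<String> znodes = new HashSet<>();
    final Set<String> regionsInTransition = new HashSet<>();
    boolean warned = false;

    static RitRaceDemo withRegion(String region) {
        RitRaceDemo demo = new RitRaceDemo();
        demo.znodes.add(region);
        demo.regionsInTransition.add(region);
        return demo;
    }

    /** Stand-in for the watcher callback: warns if the region is still in RIT. */
    void onNodeDeleted(String region) {
        if (regionsInTransition.contains(region)) {
            warned = true; // corresponds to "Node deleted but still in RIT"
        }
    }

    /** Order described in the issue: delete the znode first, then clear RIT. */
    void openedZnodeFirst(String region) {
        znodes.remove(region);
        onNodeDeleted(region);              // watcher fires between the two steps
        regionsInTransition.remove(region);
    }

    /** Reversed order, shown only for contrast: clear RIT, then delete the znode. */
    void openedRitFirst(String region) {
        regionsInTransition.remove(region);
        znodes.remove(region);
        onNodeDeleted(region);              // watcher now finds nothing in RIT
    }
}
```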
[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers
[ https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129339#comment-13129339 ] Jonathan Gray commented on HBASE-3380: -- What's the best practice here? Should I just commit this to 92 and trunk and make a note here? Should I open a new jira since this is so old? (Thanks for input guys) Master failover can split logs of live servers -- Key: HBASE-3380 URL: https://issues.apache.org/jira/browse/HBASE-3380 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jonathan Gray Priority: Blocker Fix For: 0.90.0 Attachments: HBASE-3380-v1.patch, HBASE-3380-v2.patch
[jira] [Commented] (HBASE-4607) Split log worker should terminate properly when waiting for znode
[ https://issues.apache.org/jira/browse/HBASE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129340#comment-13129340 ] jirapos...@reviews.apache.org commented on HBASE-4607: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2411/#review2635 --- src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java https://reviews.apache.org/r/2411/#comment5940 If exitWorker is false, we would enter taskLoop at line 167. Is this desirable ? - Ted On 2011-10-17 22:41:49, Mikhail Bautin wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2411/ bq. --- bq. bq. (Updated 2011-10-17 22:41:49) bq. bq. bq. Review request for hbase and Prakash Khemani. bq. bq. bq. Summary bq. --- bq. bq. This is an attempt to fix the fact that SplitLogWorker threads are not being terminated properly in some unit tests. This probably does not happen in production because the master always creates the log-splitting ZK node, but it does happen in 89-fb. Thanks to Prakash Khemani for help on this. bq. bq. bq. This addresses bug hbase-4607. bq. https://issues.apache.org/jira/browse/hbase-4607 bq. bq. bq. Diffs bq. - bq. bq.src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java a43e0b3 bq. bq. Diff: https://reviews.apache.org/r/2411/diff bq. bq. bq. Testing bq. --- bq. bq. I will run unit tests and post an update when they have passed. bq. bq. bq. Thanks, bq. bq. Mikhail bq. bq. Split log worker should terminate properly when waiting for znode - Key: HBASE-4607 URL: https://issues.apache.org/jira/browse/HBASE-4607 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor This is an attempt to fix the fact that SplitLogWorker threads are not being terminated properly in some unit tests. 
This probably does not happen in production because the master always creates the log-splitting ZK node, but it does happen in 89-fb. Thanks to Prakash Khemani for help on this.
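The termination problem discussed in this review can be illustrated with a small stand-alone worker. This is a sketch under assumptions, not the patched SplitLogWorker: `WaitingWorker`, `znodeExists`, and `enteredTaskLoop` are hypothetical names, and the 10 ms poll replaces whatever wait/notify mechanism the real class uses. The point is only that the wait loop re-checks the exit flag, and that the task loop is skipped when the worker was asked to stop while waiting.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical worker; the flag name mirrors the discussion but this is not the HBase class. */
final class WaitingWorker implements Runnable {
    private final BooleanSupplier znodeExists;
    private volatile boolean exitWorker = false;
    volatile boolean enteredTaskLoop = false;

    WaitingWorker(BooleanSupplier znodeExists) {
        this.znodeExists = znodeExists;
    }

    /** Asks the worker to stop; safe to call from another thread. */
    void stop() {
        exitWorker = true;
    }

    @Override
    public void run() {
        // Wait for the znode, but re-check the exit flag on every iteration.
        while (!exitWorker && !znodeExists.getAsBoolean()) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
        if (exitWorker) {
            return; // asked to stop while waiting: do not start the task loop
        }
        enteredTaskLoop = true; // a real worker would start processing tasks here
    }
}
```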
[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers
[ https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129341#comment-13129341 ] Jean-Daniel Cryans commented on HBASE-3380: --- Since it's almost a year old I'd prefer a new jira. Master failover can split logs of live servers -- Key: HBASE-3380 URL: https://issues.apache.org/jira/browse/HBASE-3380 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jonathan Gray Priority: Blocker Fix For: 0.90.0 Attachments: HBASE-3380-v1.patch, HBASE-3380-v2.patch
[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers
[ https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129342#comment-13129342 ] Ted Yu commented on HBASE-3380: --- Please open new JIRA where we may come up with better idea. Master failover can split logs of live servers -- Key: HBASE-3380 URL: https://issues.apache.org/jira/browse/HBASE-3380 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jonathan Gray Priority: Blocker Fix For: 0.90.0 Attachments: HBASE-3380-v1.patch, HBASE-3380-v2.patch
[jira] [Created] (HBASE-4609) ThriftServer.getRegionInfo() is expecting old ServerName format, need to use new Addressing class instead
ThriftServer.getRegionInfo() is expecting old ServerName format, need to use new Addressing class instead - Key: HBASE-4609 URL: https://issues.apache.org/jira/browse/HBASE-4609 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Minor Fix For: 0.92.0 ThriftServer.getRegionInfo() is expecting the old ServerName that doesn't include start code. Need to fix.
[jira] [Updated] (HBASE-4609) ThriftServer.getRegionInfo() is expecting old ServerName format, need to use new Addressing class instead
[ https://issues.apache.org/jira/browse/HBASE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Gray updated HBASE-4609: - Attachment: HBASE-4609-v1.patch As advertised. Against trunk. ThriftServer.getRegionInfo() is expecting old ServerName format, need to use new Addressing class instead - Key: HBASE-4609 URL: https://issues.apache.org/jira/browse/HBASE-4609 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Minor Fix For: 0.92.0 Attachments: HBASE-4609-v1.patch
[jira] [Updated] (HBASE-4609) ThriftServer.getRegionInfo() is expecting old ServerName format, need to use new Addressing class instead
[ https://issues.apache.org/jira/browse/HBASE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Gray updated HBASE-4609: - Status: Patch Available (was: Open) ThriftServer.getRegionInfo() is expecting old ServerName format, need to use new Addressing class instead - Key: HBASE-4609 URL: https://issues.apache.org/jira/browse/HBASE-4609 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Minor Fix For: 0.92.0 Attachments: HBASE-4609-v1.patch
[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers
[ https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129344#comment-13129344 ] Jonathan Gray commented on HBASE-3380: -- Heartbeats still exist so I'm not sure much is different in 92 since we tackled this, right? I will open a new JIRA though. Master failover can split logs of live servers -- Key: HBASE-3380 URL: https://issues.apache.org/jira/browse/HBASE-3380 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jonathan Gray Priority: Blocker Fix For: 0.90.0 Attachments: HBASE-3380-v1.patch, HBASE-3380-v2.patch
[jira] [Created] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug) - Key: HBASE-4610 URL: https://issues.apache.org/jira/browse/HBASE-4610 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs. For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way.
[jira] [Commented] (HBASE-4605) Add constraints as a top-level feature
[ https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129393#comment-13129393 ] Andrew Purtell commented on HBASE-4605: --- +1 on choice #1, with generalization of argument passing to coprocessors via table attributes. HBASE-4048 and HBASE-4554 are a start there. Add constraints as a top-level feature -- Key: HBASE-4605 URL: https://issues.apache.org/jira/browse/HBASE-4605 Project: HBase Issue Type: Improvement Components: client, coprocessors Affects Versions: 0.94.0 Reporter: Jesse Yates Assignee: Jesse Yates From Jesse's comment on dev: {quote} What I would like to propose is a simple interface that people can use to implement a 'constraint' (matching the classic database definition). This would help ease of adoption by helping HBase more easily check that box, help minimize code duplication across organizations, and lead to easier adoption. Essentially, people would implement a 'Constraint' interface for checking keys before they are put into a table. Puts that are valid get written to the table, but if not, people can throw an exception that gets propagated back to the client explaining why the put was invalid. Constraints would be set on a per-table basis and the user would be expected to ensure the jars containing the constraint are present on the machines serving that table. Yes, people could roll their own mechanism for doing this via coprocessors each time, but this would make it easier to do so, so you only have to implement a very minimal interface and not worry about the specifics. {quote}
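To make the proposal quoted above more concrete, here is one possible shape for such an interface. Everything in this sketch is an assumption made for illustration: `Constraint`, `ConstraintException`, `NonEmptyValue`, and `ConstrainedTable` are hypothetical and are not taken from any HBase patch, and the in-memory table only imitates the "valid puts are written, invalid puts raise an exception" behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical exception carrying the reason a put was rejected. */
class ConstraintException extends RuntimeException {
    ConstraintException(String message) {
        super(message);
    }
}

/** Hypothetical minimal interface; not an existing HBase API. */
interface Constraint {
    /** Throws ConstraintException if the put should be rejected. */
    void check(byte[] row, byte[] value);
}

/** Example constraint: rejects puts whose value is missing or empty. */
final class NonEmptyValue implements Constraint {
    @Override
    public void check(byte[] row, byte[] value) {
        if (value == null || value.length == 0) {
            throw new ConstraintException("value must not be empty");
        }
    }
}

/** In-memory stand-in for a table with per-table constraints. */
final class ConstrainedTable {
    private final List<Constraint> constraints;
    private final List<byte[]> written = new ArrayList<>();

    ConstrainedTable(List<Constraint> constraints) {
        this.constraints = constraints;
    }

    /** Runs every constraint first; the value is stored only if none of them throws. */
    void put(byte[] row, byte[] value) {
        for (Constraint c : constraints) {
            c.check(row, value);
        }
        written.add(value);
    }

    int size() {
        return written.size();
    }
}
```

In this sketch the exception message is what would travel back to the client; how a real implementation hooks into coprocessors and table attributes is left open, as in the discussion.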
[jira] [Created] (HBASE-4611) Add support for Phabricator/Differential as an alternative code review tool
Add support for Phabricator/Differential as an alternative code review tool --- Key: HBASE-4611 URL: https://issues.apache.org/jira/browse/HBASE-4611 Project: HBase Issue Type: Task Reporter: Jonathan Gray From http://phabricator.org/ : Phabricator is an open source collection of web applications which make it easier to write, review, and share source code. It is currently available as an early release. Phabricator was developed at Facebook. It's open source so pretty much anyone could host an instance of this software. To begin with, there will be a public-facing instance located at http://reviews.facebook.net (sponsored by Facebook and hosted by the OSUOSL http://osuosl.org). We will use this JIRA to deal with adding (and ensuring) Apache-friendly support that will allow us to do code reviews with Phabricator for HBase.
[jira] [Commented] (HBASE-4611) Add support for Phabricator/Differential as an alternative code review tool
[ https://issues.apache.org/jira/browse/HBASE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129397#comment-13129397 ] Jonathan Gray commented on HBASE-4611: -- In addition to being a (better) code review tool, the Phabricator suite also includes stuff like repo/revision browsing, nice command-line tools, pastebin, etc. which should be available for the HBase repos. Add support for Phabricator/Differential as an alternative code review tool --- Key: HBASE-4611 URL: https://issues.apache.org/jira/browse/HBASE-4611 Project: HBase Issue Type: Task Reporter: Jonathan Gray
[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129416#comment-13129416 ] bluedavy commented on HBASE-4562: - em, thanks. When split doing offlineParentInMeta encounters error, it'll cause data loss Key: HBASE-4562 URL: https://issues.apache.org/jira/browse/HBASE-4562 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: bluedavy Assignee: bluedavy Priority: Blocker Fix For: 0.90.5 Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt
Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below, to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
if (!testing) {
  MetaEditor.offlineParentInMeta(server.getCatalogTracker(), this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
  // Injected failure: simulates an error right after the parent is offlined in META.
  throw new IOException("some unexpected error in split");
}
{code}
2. Update the regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Kill the regionserver hosting the table.
6. Wait some time after the master's ServerShutdownHandler.process executes, then scan the table; you'll find that the data written before is lost.
The attached patch fixes the bug.
[jira] [Commented] (HBASE-4606) Remove spam in HCM and fix a list.size == 0
[ https://issues.apache.org/jira/browse/HBASE-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129420#comment-13129420 ] Hudson commented on HBASE-4606: --- Integrated in HBase-TRUNK #2332 (See [https://builds.apache.org/job/HBase-TRUNK/2332/]) HBASE-4606 Remove spam in HCM and fix a list.size == 0 jdcryans : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java Remove spam in HCM and fix a list.size == 0 --- Key: HBASE-4606 URL: https://issues.apache.org/jira/browse/HBASE-4606 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.92.0 Attachments: HBASE-4606.patch As discussed on the ML, HCM in 0.92 is being spammy with expecting X results which is a debug leftover. Also right next to it I see a list.size == 0, which should be converted into isEmpty. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
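For readers unfamiliar with the second half of HBASE-4606, here is a minimal sketch of the size-to-isEmpty conversion it describes. The class and method names are hypothetical and are not taken from HConnectionManager or the attached patch; the point is only that isEmpty() states the intent directly, and that for some collections (for example ConcurrentLinkedQueue) size() is not a constant-time operation.

```java
import java.util.ArrayList;
import java.util.List;

public class IsEmptyExample {
    // Hypothetical helper for illustration only; not HConnectionManager code.
    static boolean nothingToProcess(List<?> results) {
        // Before the cleanup this would have read: results.size() == 0
        return results.isEmpty();
    }

    public static void main(String[] args) {
        List<String> results = new ArrayList<>();
        System.out.println(nothingToProcess(results));
        results.add("row1");
        System.out.println(nothingToProcess(results));
    }
}
```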
[jira] [Commented] (HBASE-4579) CST.requestCompaction semantics changed, logs are now spammed when too many store files
[ https://issues.apache.org/jira/browse/HBASE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129419#comment-13129419 ] Hudson commented on HBASE-4579: --- Integrated in HBase-TRUNK #2332 (See [https://builds.apache.org/job/HBase-TRUNK/2332/]) HBASE-4579 CST.requestCompaction semantics changed, logs are now spammed when too many store files forgot the CHANGES HBASE-4579 CST.requestCompaction semantics changed, logs are now spammed when too many store files jdcryans : Files : * /hbase/trunk/CHANGES.txt jdcryans : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java CST.requestCompaction semantics changed, logs are now spammed when too many store files --- Key: HBASE-4579 URL: https://issues.apache.org/jira/browse/HBASE-4579 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.0 Attachments: HBASE-4579-v2.patch, HBASE-4579.patch Another bug I'm not so sure what's going on. I see this in my log: {quote} 2011-10-12 00:23:43,435 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:44,335 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:45,236 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:46,136 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:47,036 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:47,936 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact {quote} It spams for a while, and a little later instead I get: {quote} 2011-10-12 00:26:52,139 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. 
Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:53,040 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:53,940 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:54,840 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:55,741 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:56,641 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. {quote} I believe I also saw something like that for flushes, but the region was closing so at least I know why it was spamming (would be nice if it just unrequested the flush): {quote} 2011-10-12 00:26:40,693 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 
2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12
[jira] [Updated] (HBASE-4486) Improve Javadoc for HTableDescriptor
[ https://issues.apache.org/jira/browse/HBASE-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash Ashok updated HBASE-4486: --- Attachment: HBase-4486-v3.patch Ram's Comments Incorporated Improve Javadoc for HTableDescriptor Key: HBASE-4486 URL: https://issues.apache.org/jira/browse/HBASE-4486 Project: HBase Issue Type: Improvement Components: client, documentation Reporter: Akash Ashok Assignee: Akash Ashok Priority: Minor Attachments: HBase-4486-v2.patch, HBase-4486-v3.patch, HBase-4486.patch, HTableDescriptor-v2.html, HTableDescriptor.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4580) Create some invalid zk nodes when a clean cluster start.
[ https://issues.apache.org/jira/browse/HBASE-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129430#comment-13129430 ] jirapos...@reviews.apache.org commented on HBASE-4580: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2420/ --- Review request for hbase. Summary --- https://issues.apache.org/jira/browse/HBASE-4580 This addresses bug HBASE-4580. https://issues.apache.org/jira/browse/HBASE-4580 Diffs - /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1185442 Diff: https://reviews.apache.org/r/2420/diff Testing --- 1. I tested it in real cluster(3 nodes, created a table with 15 regions). a)restart the cluster. b)kill master and then start master c)kill master and one region server, then start master. 2. all the UT test cased passed.(I tested twice) Results : Tests in error: testBadOriginalRootLocation(org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster): unknown host: example.org Tests run: 1031, Failures: 0, Errors: 1, Skipped: 16 The TestCatalogTrackerOnCluster passed in a connected network environment. T E S T S --- Running org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.502 sec Results : Tests run: 1, Failures: 0, Errors: 0, Skipped: 0 Thanks, jinchao Create some invalid zk nodes when a clean cluster start. Key: HBASE-4580 URL: https://issues.apache.org/jira/browse/HBASE-4580 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0 Reporter: gaojinchao Assignee: gaojinchao Fix For: 0.92.0 Attachments: HBASE-4580_TrunkV1.patch The below logs said that we created a invalid zk node when restarted a cluster. it mistakenly believed that the regions belong to a dead server. 
2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: Meta updated status = true 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: ROOT/Meta already up-to date with new HRI. 2011-10-11 05:05:29,151 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 771d63e9327383159553619a4f2dc74f with OFFLINE state 2011-10-11 05:05:29,161 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 3cf860dd323fe6360f571aeafc129f95 with OFFLINE state 2011-10-11 05:05:29,170 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 4065350214452a9d5c55243c734bef08 with OFFLINE state 2011-10-11 05:05:29,178 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 4e81613f82a39fc6e5e89f96e7b3ccc4 with OFFLINE state 2011-10-11 05:05:29,187 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for e21b9e1545a28953aba0098fda5c9cd9 with OFFLINE state 2011-10-11 05:05:29,195 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 5cd9f55eecd43d088bbd505f6795131f with OFFLINE state 2011-10-11 05:05:29,229 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for db5f641452a70b09b85a92970e4198c7 with OFFLINE state 2011-10-11 05:05:29,237 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for a7b20a653919e7f41bfb2ed349af7d21 with OFFLINE state 2011-10-11 05:05:29,253 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for c9385619425f737eab1a6624d2e097a8 with OFFLINE state // we cleaned all zk nodes. 
2011-10-11 05:05:29,262 INFO org.apache.hadoop.hbase.master.AssignmentManager: Clean cluster startup. Assigning userregions 2011-10-11 05:05:29,262 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Deleting any existing unassigned nodes 2011-10-11 05:05:29,367 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 9 region(s) across 1 server(s), retainAssignment=true 2011-10-11 05:05:29,369 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Timeout-on-RIT=9000 2011-10-11 05:05:29,369 DEBUG org.apache.hadoop.hbase.master.AssignmentManager:
[jira] [Updated] (HBASE-4605) Add constraints as a top-level feature
[ https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jesse Yates updated HBASE-4605:
---
Attachment: java_Constraint_v2.patch

Constraint implementation that is just added as a coprocessor. Not implemented as a Precondition for ease, though it could be ported over to that fairly easily. Basically, putting this up for posterity since the consensus seems to be pursuing #1 above.
Also, is there a better way to pass back exceptions from coprocessors? Right now, the exception causes a retry, which is a huge timeout problem.

Add constraints as a top-level feature
--
Key: HBASE-4605
URL: https://issues.apache.org/jira/browse/HBASE-4605
Project: HBase
Issue Type: Improvement
Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Attachments: java_Constraint_v2.patch

From Jesse's comment on dev:
{quote}
What I would like to propose is a simple interface that people can use to implement a 'constraint' (matching the classic database definition). This would help ease of adoption by helping HBase more easily check that box, help minimize code duplication across organizations, and lead to easier adoption.
Essentially, people would implement a 'Constraint' interface for checking keys before they are put into a table. Puts that are valid get written to the table, but if not, people can throw an exception that gets propagated back to the client explaining why the put was invalid.
Constraints would be set on a per-table basis and the user would be expected to ensure the jars containing the constraint are present on the machines serving that table.
Yes, people could roll their own mechanism for doing this via coprocessors each time, but this would make it easier to do so, so you only have to implement a very minimal interface and not worry about the specifics.
{quote}
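To make the proposal quoted above concrete, the following is a rough sketch of what a minimal 'Constraint' interface could look like. It is plain Java with no HBase dependency, and every name in it (Constraint, ConstraintException, NonEmptyValue, validate) is an assumption made for illustration; it is not the API in java_Constraint_v2.patch.

```java
import java.util.Arrays;
import java.util.List;

public class ConstraintSketch {
    /** Hypothetical exception that carries the rejection reason back to the client. */
    static class ConstraintException extends Exception {
        ConstraintException(String message) {
            super(message);
        }
    }

    /** Hypothetical minimal interface: inspect a put before it is written. */
    interface Constraint {
        void check(byte[] rowKey, byte[] value) throws ConstraintException;
    }

    /** Example constraint: reject puts that carry an empty value. */
    static class NonEmptyValue implements Constraint {
        @Override
        public void check(byte[] rowKey, byte[] value) throws ConstraintException {
            if (value == null || value.length == 0) {
                throw new ConstraintException("value must not be empty");
            }
        }
    }

    /** Runs every constraint set on a table; the first failure rejects the put. */
    static void validate(List<Constraint> constraints, byte[] rowKey, byte[] value)
            throws ConstraintException {
        for (Constraint c : constraints) {
            c.check(rowKey, value);
        }
    }

    public static void main(String[] args) {
        List<Constraint> perTable = Arrays.<Constraint>asList(new NonEmptyValue());
        try {
            validate(perTable, "row1".getBytes(), "v".getBytes());
            System.out.println("put accepted");
            validate(perTable, "row2".getBytes(), new byte[0]);
        } catch (ConstraintException e) {
            System.out.println("put rejected: " + e.getMessage());
        }
    }
}
```

In a real deployment the check would presumably run server-side (for instance from a coprocessor hook), which is why the proposal asks users to make the constraint jars available on the machines serving the table.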
[jira] [Commented] (HBASE-4580) Create some invalid zk nodes when a clean cluster start.
[ https://issues.apache.org/jira/browse/HBASE-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129444#comment-13129444 ] jirapos...@reviews.apache.org commented on HBASE-4580: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2420/ --- (Updated 2011-10-18 02:57:00.553590) Review request for hbase. Changes --- Sorry, I uploaded error patch file that lost a line of code Summary --- https://issues.apache.org/jira/browse/HBASE-4580 This addresses bug HBASE-4580. https://issues.apache.org/jira/browse/HBASE-4580 Diffs (updated) - /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1185442 Diff: https://reviews.apache.org/r/2420/diff Testing --- 1. I tested it in real cluster(3 nodes, created a table with 15 regions). a)restart the cluster. b)kill master and then start master c)kill master and one region server, then start master. 2. all the UT test cased passed.(I tested twice) Results : Tests in error: testBadOriginalRootLocation(org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster): unknown host: example.org Tests run: 1031, Failures: 0, Errors: 1, Skipped: 16 The TestCatalogTrackerOnCluster passed in a connected network environment. T E S T S --- Running org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.502 sec Results : Tests run: 1, Failures: 0, Errors: 0, Skipped: 0 Thanks, jinchao Create some invalid zk nodes when a clean cluster start. Key: HBASE-4580 URL: https://issues.apache.org/jira/browse/HBASE-4580 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0 Reporter: gaojinchao Assignee: gaojinchao Fix For: 0.92.0 Attachments: HBASE-4580_TrunkV1.patch The below logs said that we created a invalid zk node when restarted a cluster. it mistakenly believed that the regions belong to a dead server. 
2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: Meta updated status = true 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: ROOT/Meta already up-to date with new HRI. 2011-10-11 05:05:29,151 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 771d63e9327383159553619a4f2dc74f with OFFLINE state 2011-10-11 05:05:29,161 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 3cf860dd323fe6360f571aeafc129f95 with OFFLINE state 2011-10-11 05:05:29,170 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 4065350214452a9d5c55243c734bef08 with OFFLINE state 2011-10-11 05:05:29,178 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 4e81613f82a39fc6e5e89f96e7b3ccc4 with OFFLINE state 2011-10-11 05:05:29,187 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for e21b9e1545a28953aba0098fda5c9cd9 with OFFLINE state 2011-10-11 05:05:29,195 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 5cd9f55eecd43d088bbd505f6795131f with OFFLINE state 2011-10-11 05:05:29,229 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for db5f641452a70b09b85a92970e4198c7 with OFFLINE state 2011-10-11 05:05:29,237 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for a7b20a653919e7f41bfb2ed349af7d21 with OFFLINE state 2011-10-11 05:05:29,253 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for c9385619425f737eab1a6624d2e097a8 with OFFLINE state // we cleaned all zk nodes. 
2011-10-11 05:05:29,262 INFO org.apache.hadoop.hbase.master.AssignmentManager: Clean cluster startup. Assigning userregions 2011-10-11 05:05:29,262 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Deleting any existing unassigned nodes 2011-10-11 05:05:29,367 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 9 region(s) across 1 server(s), retainAssignment=true 2011-10-11 05:05:29,369 DEBUG
[jira] [Commented] (HBASE-4580) Create some invalid zk nodes when a clean cluster start.
[ https://issues.apache.org/jira/browse/HBASE-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129449#comment-13129449 ] jirapos...@reviews.apache.org commented on HBASE-4580: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2420/#review2639 --- Overall patch looks good. TestCatalogTrackerOnCluster#testBadOriginalRootLocation passed. See minor comments below. /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java https://reviews.apache.org/r/2420/#comment5941 Should read 'or regions that were in RIT' /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java https://reviews.apache.org/r/2420/#comment5942 Should be on line 2229. 'is' should be 'in' /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java https://reviews.apache.org/r/2420/#comment5943 Some people prefer the old format: ampersand at the end of first line signifying continuation on the second line /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java https://reviews.apache.org/r/2420/#comment5944 There should be a space between if and left parenthesis - Ted On 2011-10-18 02:57:00, jinchao gao wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2420/ bq. --- bq. bq. (Updated 2011-10-18 02:57:00) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. https://issues.apache.org/jira/browse/HBASE-4580 bq. bq. bq. This addresses bug HBASE-4580. bq. https://issues.apache.org/jira/browse/HBASE-4580 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1185442 bq. bq. Diff: https://reviews.apache.org/r/2420/diff bq. bq. bq. Testing bq. --- bq. bq. 1. I tested it in real cluster(3 nodes, created a table with 15 regions). bq. a)restart the cluster. bq. b)kill master and then start master bq. c)kill master and one region server, then start master. 
bq. bq. 2. all the UT test cased passed.(I tested twice) bq. Results : bq. bq. Tests in error: bq. testBadOriginalRootLocation(org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster): unknown host: example.org bq. bq. Tests run: 1031, Failures: 0, Errors: 1, Skipped: 16 bq. bq. The TestCatalogTrackerOnCluster passed in a connected network environment. bq. T E S T S bq. --- bq. Running org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster bq. Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.502 sec bq. bq. Results : bq. bq. Tests run: 1, Failures: 0, Errors: 0, Skipped: 0 bq. bq. bq. Thanks, bq. bq. jinchao bq. bq. Create some invalid zk nodes when a clean cluster start. Key: HBASE-4580 URL: https://issues.apache.org/jira/browse/HBASE-4580 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0 Reporter: gaojinchao Assignee: gaojinchao Fix For: 0.92.0 Attachments: HBASE-4580_TrunkV1.patch The below logs said that we created a invalid zk node when restarted a cluster. it mistakenly believed that the regions belong to a dead server. 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: Meta updated status = true 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: ROOT/Meta already up-to date with new HRI. 
2011-10-11 05:05:29,151 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 771d63e9327383159553619a4f2dc74f with OFFLINE state 2011-10-11 05:05:29,161 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 3cf860dd323fe6360f571aeafc129f95 with OFFLINE state 2011-10-11 05:05:29,170 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 4065350214452a9d5c55243c734bef08 with OFFLINE state 2011-10-11 05:05:29,178 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 4e81613f82a39fc6e5e89f96e7b3ccc4 with OFFLINE state 2011-10-11 05:05:29,187 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for
[jira] [Updated] (HBASE-4580) Some invalid zk nodes were created when a clean cluster restarts
[ https://issues.apache.org/jira/browse/HBASE-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4580: -- Summary: Some invalid zk nodes were created when a clean cluster restarts (was: Create some invalid zk nodes when a clean cluster start.) Some invalid zk nodes were created when a clean cluster restarts Key: HBASE-4580 URL: https://issues.apache.org/jira/browse/HBASE-4580 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0 Reporter: gaojinchao Assignee: gaojinchao Fix For: 0.92.0 Attachments: HBASE-4580_TrunkV1.patch The below logs said that we created a invalid zk node when restarted a cluster. it mistakenly believed that the regions belong to a dead server. 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: Meta updated status = true 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: ROOT/Meta already up-to date with new HRI. 2011-10-11 05:05:29,151 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 771d63e9327383159553619a4f2dc74f with OFFLINE state 2011-10-11 05:05:29,161 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 3cf860dd323fe6360f571aeafc129f95 with OFFLINE state 2011-10-11 05:05:29,170 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 4065350214452a9d5c55243c734bef08 with OFFLINE state 2011-10-11 05:05:29,178 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 4e81613f82a39fc6e5e89f96e7b3ccc4 with OFFLINE state 2011-10-11 05:05:29,187 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for e21b9e1545a28953aba0098fda5c9cd9 with OFFLINE state 2011-10-11 05:05:29,195 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 5cd9f55eecd43d088bbd505f6795131f with OFFLINE state 2011-10-11 05:05:29,229 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for db5f641452a70b09b85a92970e4198c7 with OFFLINE state 2011-10-11 05:05:29,237 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for a7b20a653919e7f41bfb2ed349af7d21 with OFFLINE state 2011-10-11 05:05:29,253 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Creating (or updating) unassigned node for c9385619425f737eab1a6624d2e097a8 with OFFLINE state // we cleaned all zk nodes. 2011-10-11 05:05:29,262 INFO org.apache.hadoop.hbase.master.AssignmentManager: Clean cluster startup. Assigning userregions 2011-10-11 05:05:29,262 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Deleting any existing unassigned nodes 2011-10-11 05:05:29,367 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 9 region(s) across 1 server(s), retainAssignment=true 2011-10-11 05:05:29,369 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Timeout-on-RIT=9000 2011-10-11 05:05:29,369 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 9 region(s) to C3S3,54366,1318323920153 2011-10-11 05:05:29,369 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done 2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Async create of unassigned node for 771d63e9327383159553619a4f2dc74f with OFFLINE state 2011-10-11 05:05:29,371 INFO org.apache.hadoop.hbase.master.HMaster: Master has completed initialization 2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Async create of unassigned node for 3cf860dd323fe6360f571aeafc129f95 with OFFLINE state 2011-10-11 05:05:29,371 DEBUG 
org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Async create of unassigned node for 4065350214452a9d5c55243c734bef08 with OFFLINE state 2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Async create of unassigned node for 4e81613f82a39fc6e5e89f96e7b3ccc4 with OFFLINE state 2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a38 Async create of unassigned node for e21b9e1545a28953aba0098fda5c9cd9 with OFFLINE state 2011-10-11 05:05:29,372 DEBUG
[jira] [Commented] (HBASE-4605) Add constraints as a top-level feature
[ https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129456#comment-13129456 ]
Ted Yu commented on HBASE-4605:
---
See the following from HBASE-4014 w.r.t. exceptions and coprocessors:
The general gist here is to wrap each of {Master,RegionServer}CoprocessorHost's coprocessor calls inside a try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) } block. handleCoprocessorThrowable() is responsible for either passing 'e' along to the client (if 'e' is an IOException) or, otherwise, aborting the service (Regionserver or Master).

Add constraints as a top-level feature
--
Key: HBASE-4605
URL: https://issues.apache.org/jira/browse/HBASE-4605
Project: HBase
Issue Type: Improvement
Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Attachments: java_Constraint_v2.patch

From Jesse's comment on dev:
{quote}
What I would like to propose is a simple interface that people can use to implement a 'constraint' (matching the classic database definition). This would help ease of adoption by helping HBase more easily check that box, help minimize code duplication across organizations, and lead to easier adoption.
Essentially, people would implement a 'Constraint' interface for checking keys before they are put into a table. Puts that are valid get written to the table, but if not, people can throw an exception that gets propagated back to the client explaining why the put was invalid.
Constraints would be set on a per-table basis and the user would be expected to ensure the jars containing the constraint are present on the machines serving that table.
Yes, people could roll their own mechanism for doing this via coprocessors each time, but this would make it easier to do so, so you only have to implement a very minimal interface and not worry about the specifics.
{quote}
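The wrapping pattern described in the comment above can be sketched as follows. This is a stand-alone illustration, not the HBASE-4014 code: the Hook interface, the invoke() and abort() methods, and the aborted flag are hypothetical stand-ins, and only the handleCoprocessorThrowable name and its IOException-versus-abort behaviour come from the comment.

```java
import java.io.IOException;

public class CoprocessorHostSketch {
    /** Hypothetical stand-in for a single coprocessor hook invocation. */
    interface Hook {
        void run() throws Exception;
    }

    /** Set when the sketch "aborts"; a real host would shut the service down. */
    boolean aborted = false;

    /** Wraps a coprocessor call in try/catch (Throwable), as the comment describes. */
    void invoke(Hook hook) throws IOException {
        try {
            hook.run();
        } catch (Throwable e) {
            handleCoprocessorThrowable(e);
        }
    }

    /** IOExceptions are passed along to the client; anything else aborts the service. */
    void handleCoprocessorThrowable(Throwable e) throws IOException {
        if (e instanceof IOException) {
            throw (IOException) e;
        }
        abort("coprocessor threw an unexpected throwable", e);
    }

    /** Hypothetical abort hook; here it only records that an abort was requested. */
    void abort(String why, Throwable cause) {
        aborted = true;
    }

    public static void main(String[] args) {
        CoprocessorHostSketch host = new CoprocessorHostSketch();
        try {
            host.invoke(() -> { throw new IOException("put violates constraint"); });
        } catch (IOException e) {
            System.out.println("client sees: " + e.getMessage());
        }
        System.out.println("aborted after IOException: " + host.aborted);
    }
}
```

Under this scheme a constraint that wants to reject a put without taking down the server would need to throw an IOException (or a subclass of it), since any other throwable is treated as a coprocessor bug.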
[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129457#comment-13129457 ]
Lars Hofhansl commented on HBASE-4562:
--
No problem. Thanks for patch :)

When split doing offlineParentInMeta encounters error, it'll cause data loss
Key: HBASE-4562
URL: https://issues.apache.org/jira/browse/HBASE-4562
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
Fix For: 0.90.5
Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt

Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below, to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
if (!testing) {
  MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
      this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
  throw new IOException("some unexpected error in split");
}
{code}
2. Update the regionserver code and restart;
3. Create a table and put some data into it;
4. Split the table;
5. Kill the regionserver hosting the table;
6. Wait until the master's ServerShutdownHandler.process has executed, then scan the table; you'll find that the data written earlier is lost.
Applying the patch fixes the bug.
[jira] [Issue Comment Edited] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129457#comment-13129457 ]
Lars Hofhansl edited comment on HBASE-4562 at 10/18/11 3:40 AM:
No problem. Thanks for the patch :)
was (Author: lhofhansl): No problem. Thanks for patch :)

When split doing offlineParentInMeta encounters error, it'll cause data loss
Key: HBASE-4562
URL: https://issues.apache.org/jira/browse/HBASE-4562
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
Fix For: 0.90.5
Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt

Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below, to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
if (!testing) {
  MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
      this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
  throw new IOException("some unexpected error in split");
}
{code}
2. Update the regionserver code and restart;
3. Create a table and put some data into it;
4. Split the table;
5. Kill the regionserver hosting the table;
6. Wait until the master's ServerShutdownHandler.process has executed, then scan the table; you'll find that the data written earlier is lost.
Applying the patch fixes the bug.
[jira] [Commented] (HBASE-4510) HDFS-1620 related changes downstream (For compiling with HDFS 0.23+)
[ https://issues.apache.org/jira/browse/HBASE-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129458#comment-13129458 ]

jirapos...@reviews.apache.org commented on HBASE-4510:

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2108/
---

(Updated 2011-10-18 03:42:06.955026)

Review request for hbase.

Changes
---
Updated patch that should fix 0.23 builds.

Summary
---
HBase no longer seems to compile against 0.23 after the HDFS-1620 naming refactorings were carried out. Two solutions:
1. We use the new class names. This breaks HBase's backward compatibility with older Hadoop releases (is that a concern with future releases?)
2. HBase gets its own set of constants, as the upstream one is not marked for public usage. This needs a little more maintenance on HBase's side.

Way (2) seems more viable. I've attached an initial patch that doesn't fix all the trouble yet, but introduces the first way of changes.

The remaining issue lies in the use of DistributedFileSystem's safemode methods (which are private audience as well) inside HBase for safemode waits and checks (via HBase's FSUtils class). Since this uses an enum, it is more difficult to handle without upstream intervention. Thoughts?

This addresses bug HBASE-4510.
https://issues.apache.org/jira/browse/HBASE-4510

Diffs (updated)
---
src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java dcd0937
src/main/java/org/apache/hadoop/hbase/util/FSUtils.java 789dd3b

Diff: https://reviews.apache.org/r/2108/diff

Testing
---

Thanks,
Harsh

HDFS-1620 related changes downstream (For compiling with HDFS 0.23+)
Key: HBASE-4510
URL: https://issues.apache.org/jira/browse/HBASE-4510
Project: HBase
Issue Type: Task
Affects Versions: 0.94.0
Reporter: Harsh J
Assignee: Harsh J
Priority: Blocker

HBase no longer seems to compile against 0.23 after the HDFS-1620 naming refactorings were carried out. Two solutions:
* We use the new class names. This breaks HBase's backward compatibility with older Hadoop releases (is that a concern with future releases?)
* HBase gets its own set of constants, as the upstream one is not marked for public usage. This needs a little more maintenance on HBase's side.
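The two options in the HBASE-4510 description can be illustrated with a small, self-contained Java sketch. It is only an illustration under stated assumptions, not the contents of the attached patch: the constant name and its value are hypothetical, and the two Hadoop class names in `main` are assumed to be the old and new names involved in the rename. The local constant corresponds to option 2 (HBase keeps its own copy), while the class-name probe shows how code could detect which upstream name is present without linking against either.

```java
/** Sketch of a compatibility shim; names and values here are illustrative assumptions. */
class HdfsCompatSketch {
    /** Option 2: a local copy of an upstream value (hypothetical name, assumed value). */
    static final long LEASE_SOFTLIMIT_PERIOD_MS = 60_000L;

    /** Returns the first candidate class name that can be loaded, or null if none can. */
    static String firstLoadable(String... candidates) {
        for (String name : candidates) {
            try {
                Class.forName(name);
                return name;
            } catch (ClassNotFoundException | LinkageError e) {
                // Not on the classpath (or not loadable); try the next candidate.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed new and old upstream class names; neither needs to be present to run this.
        String found = firstLoadable(
                "org.apache.hadoop.hdfs.protocol.HdfsConstants",
                "org.apache.hadoop.hdfs.protocol.FSConstants");
        System.out.println("upstream constants class: " + found);
        System.out.println("local lease soft limit (ms): " + LEASE_SOFTLIMIT_PERIOD_MS);
    }
}
```

A probe like this does not address the safemode enum mentioned in the review summary, since calling a method that takes an enum from a renamed class would need reflection on the method and its argument as well.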