[jira] [Commented] (HBASE-10915) Decouple region closing (HM and HRS) from ZK
[ https://issues.apache.org/jira/browse/HBASE-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988593#comment-13988593 ]

Mikhail Antonov commented on HBASE-10915:

The last patch attached passed all tests on my Jenkins against the latest trunk. Not sure if hadoop-qa is stuck on something..

Decouple region closing (HM and HRS) from ZK

Key: HBASE-10915
URL: https://issues.apache.org/jira/browse/HBASE-10915
Project: HBase
Issue Type: Sub-task
Components: Consensus, Zookeeper
Affects Versions: 0.99.0
Reporter: Mikhail Antonov
Assignee: Mikhail Antonov
Attachments: HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch

Decouple region closing from ZK. This includes the RS side (CloseRegionHandler), the HM side (ClosedRegionHandler), and the code using them (HRegionServer, RSRpcServices, etc.). May need small changes in AssignmentManager.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-10336) Remove deprecated usage of Hadoop HttpServer in InfoServer
[ https://issues.apache.org/jira/browse/HBASE-10336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Charles updated HBASE-10336:

Attachment: HBASE-10336-9.patch

HBASE-10336-9.patch is rebased on trunk. TestSSLHttpServer fails here, but I suspect a JDK issue (running on JDK 8). It would be worth running it on Jenkins. HBASE-10336 must land before HBASE-6581 (run on hadoop-3) can progress. This is all the more needed since Hadoop has made HttpServer final in hadoop trunk...

Remove deprecated usage of Hadoop HttpServer in InfoServer

Key: HBASE-10336
URL: https://issues.apache.org/jira/browse/HBASE-10336
Project: HBase
Issue Type: Bug
Affects Versions: 0.99.0
Reporter: Eric Charles
Assignee: Eric Charles
Attachments: HBASE-10336-1.patch, HBASE-10336-2.patch, HBASE-10336-3.patch, HBASE-10336-4.patch, HBASE-10336-5.patch, HBASE-10336-6.patch, HBASE-10336-7.patch, HBASE-10336-8.patch, HBASE-10336-9.patch

Recent changes in Hadoop HttpServer cause an NPE when running on hadoop 3.0.0-SNAPSHOT. The way we use HttpServer is deprecated and will probably not be fixed (see HDFS-5760). We'd better move to the newly proposed builder pattern, which means we can no longer use inheritance to build our nice InfoServer.
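The move to a builder means InfoServer has to wrap a built server by composition rather than extend HttpServer. A minimal plain-Java sketch of that composition-over-inheritance shift, using toy stand-in classes (these are illustrative names only, not Hadoop's actual HttpServer API):

```java
public class BuilderComposition {
  // Stand-in for a server that is now constructed via a builder and is final,
  // so it can no longer be subclassed. Not the real Hadoop HttpServer.
  static final class HttpServer {
    private final String name;
    private final int port;
    private HttpServer(String name, int port) { this.name = name; this.port = port; }

    static class Builder {
      private String name = "server";
      private int port = 0;
      Builder setName(String n) { this.name = n; return this; }
      Builder setPort(int p) { this.port = p; return this; }
      HttpServer build() { return new HttpServer(name, port); }
    }

    String status() { return name + ":" + port; }
  }

  // InfoServer wraps a built instance instead of inheriting from HttpServer.
  static class InfoServer {
    private final HttpServer server;
    InfoServer(int port) {
      this.server = new HttpServer.Builder().setName("info").setPort(port).build();
    }
    String status() { return server.status(); }
  }

  public static void main(String[] args) {
    System.out.println(new InfoServer(8080).status()); // info:8080
  }
}
```

The wrapping class then re-exposes only the operations it actually needs, which is the API-hygiene upside of losing inheritance.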
[jira] [Updated] (HBASE-10965) Automate detection of presence of Filter#filterRow()
[ https://issues.apache.org/jira/browse/HBASE-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-10965:

Attachment: 10965-v7.txt

Patch v7 adds a test where a user filter extends Filter directly.

Automate detection of presence of Filter#filterRow()

Key: HBASE-10965
URL: https://issues.apache.org/jira/browse/HBASE-10965
Project: HBase
Issue Type: Task
Components: Filters
Reporter: Ted Yu
Assignee: Ted Yu
Attachments: 10965-v1.txt, 10965-v2.txt, 10965-v3.txt, 10965-v4.txt, 10965-v6.txt, 10965-v7.txt

There is a potential inconsistency between the return value of Filter#hasFilterRow() and the presence of Filter#filterRow(). Filters may override Filter#filterRow() while leaving the return value of Filter#hasFilterRow() as false (inherited from FilterBase). The downside to depending purely on hasFilterRow() to tell us whether a custom filter overrides filterRow(List) or filterRow() is that the check below may be rendered ineffective:

{code}
if (nextKv == KV_LIMIT) {
  if (this.filter != null && filter.hasFilterRow()) {
    throw new IncompatibleFilterException(
        "Filter whose hasFilterRow() returns true is incompatible with scan with limit!");
  }
}
{code}

When a user forgets to override hasFilterRow(), the above check becomes useless. Another limitation is that we cannot optimize FilterList#filterRow() through short circuit when FilterList#hasFilterRow() returns false. See https://issues.apache.org/jira/browse/HBASE-11093?focusedCommentId=13985149&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13985149

This JIRA aims to remove the inconsistency by automatically detecting the presence of an overridden Filter#filterRow(). If filterRow() is implemented and not inherited from FilterBase, it is equivalent to having hasFilterRow() return true.
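The detection described above can be sketched with plain-Java reflection: if filterRow()'s declaring class is not FilterBase, the filter has overridden it, regardless of what hasFilterRow() claims. The classes below are simplified stand-ins for illustration, not HBase's actual Filter hierarchy (the real patch also has to consider the filterRow(List) overload):

```java
public class FilterRowDetection {
  // Stand-in for FilterBase: default implementations that do no row filtering.
  static class FilterBase {
    public boolean filterRow() { return false; }
    public boolean hasFilterRow() { return false; }
  }

  // Overrides filterRow() but forgets to override hasFilterRow() -- the
  // inconsistent case the JIRA describes.
  static class CustomFilter extends FilterBase {
    @Override
    public boolean filterRow() { return true; }
  }

  static class PlainFilter extends FilterBase { }

  // True when filterRow() is declared below FilterBase, whatever
  // hasFilterRow() says; falls back to the filter's own answer otherwise.
  static boolean hasFilterRow(FilterBase f) {
    try {
      return f.getClass().getMethod("filterRow").getDeclaringClass() != FilterBase.class
          || f.hasFilterRow();
    } catch (NoSuchMethodException e) {
      throw new AssertionError(e); // filterRow() always exists on FilterBase
    }
  }

  public static void main(String[] args) {
    System.out.println(hasFilterRow(new CustomFilter())); // true, despite hasFilterRow() == false
    System.out.println(hasFilterRow(new PlainFilter()));  // false
  }
}
```

Method#getDeclaringClass is what makes the check automatic: getMethod resolves to the most-derived public declaration, so an override is visible without the filter author having to remember anything.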
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeffrey Zhong updated HBASE-8763:

Attachment: (was: hbase-8763.patch)

[BRAINSTORM] Combine MVCC and SeqId

Key: HBASE-8763
URL: https://issues.apache.org/jira/browse/HBASE-8763
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763_wip1.patch

HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having both mvcc and the seqId complicates the comparator semantics a lot with regard to flush + WAL replay + compactions + delete markers and out-of-order puts. Thinking more about it, I don't think we need an MVCC write number that is different from the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine the mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct; it should be so by my current understanding.
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeffrey Zhong updated HBASE-8763:

Attachment: hbase-8763-v1.patch

[BRAINSTORM] Combine MVCC and SeqId

Key: HBASE-8763
URL: https://issues.apache.org/jira/browse/HBASE-8763
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763_wip1.patch
[jira] [Created] (HBASE-11113) clone_snapshot command prints wrong name upon error
Andrew Purtell created HBASE-11113:

Summary: clone_snapshot command prints wrong name upon error
Key: HBASE-11113
URL: https://issues.apache.org/jira/browse/HBASE-11113
Project: HBase
Issue Type: Bug
Affects Versions: 0.98.2
Reporter: Andrew Purtell
Priority: Trivial
Fix For: 0.99.0, 0.98.3

{code}
hbase> clone_snapshot 'snapshot', 'existing_table_name'

ERROR: Table already exists: snapshot!
{code}
[jira] [Updated] (HBASE-10933) hbck -fixHdfsOrphans is not working properly it throws null pointer exception
[ https://issues.apache.org/jira/browse/HBASE-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Sharma updated HBASE-10933:

Assignee: Y. SREENIVASULU REDDY (was: Deepak Sharma)

hbck -fixHdfsOrphans is not working properly it throws null pointer exception

Key: HBASE-10933
URL: https://issues.apache.org/jira/browse/HBASE-10933
Project: HBase
Issue Type: Bug
Components: hbck
Affects Versions: 0.94.16, 0.98.2
Reporter: Deepak Sharma
Assignee: Y. SREENIVASULU REDDY
Priority: Critical

If the .regioninfo file does not exist for an HBase region, then running hbck repair or hbck -fixHdfsOrphans is not able to resolve the problem; it throws a NullPointerException:

{code}
2014-04-08 20:11:49,750 INFO [main] util.HBaseFsck (HBaseFsck.java:adoptHdfsOrphans(470)) - Attempting to handle orphan hdfs dir: hdfs://10.18.40.28:54310/hbase/TestHdfsOrphans1/5a3de9ca65e587cb05c9384a3981c950
java.lang.NullPointerException
    at org.apache.hadoop.hbase.util.HBaseFsck$TableInfo.access$000(HBaseFsck.java:1939)
    at org.apache.hadoop.hbase.util.HBaseFsck.adoptHdfsOrphan(HBaseFsck.java:497)
    at org.apache.hadoop.hbase.util.HBaseFsck.adoptHdfsOrphans(HBaseFsck.java:471)
    at org.apache.hadoop.hbase.util.HBaseFsck.restoreHdfsIntegrity(HBaseFsck.java:591)
    at org.apache.hadoop.hbase.util.HBaseFsck.offlineHdfsIntegrityRepair(HBaseFsck.java:369)
    at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:447)
    at org.apache.hadoop.hbase.util.HBaseFsck.exec(HBaseFsck.java:3769)
    at org.apache.hadoop.hbase.util.HBaseFsck.run(HBaseFsck.java:3587)
    at com.huawei.isap.test.smartump.hadoop.hbase.HbaseHbckRepair.repairToFixHdfsOrphans(HbaseHbckRepair.java:244)
    at com.huawei.isap.test.smartump.hadoop.hbase.HbaseHbckRepair.setUp(HbaseHbckRepair.java:84)
    at junit.framework.TestCase.runBare(TestCase.java:132)
    at junit.framework.TestResult$1.protect(TestResult.java:110)
    at junit.framework.TestResult.runProtected(TestResult.java:128)
    at junit.framework.TestResult.run(TestResult.java:113)
    at junit.framework.TestCase.run(TestCase.java:124)
    at junit.framework.TestSuite.runTest(TestSuite.java:243)
    at junit.framework.TestSuite.run(TestSuite.java:238)
    at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
{code}

The problem occurs because in the HBaseFsck method

{code}
private void adoptHdfsOrphan(HbckInfo hi)
{code}

we initialize tableInfo from the SortedMap<String, TableInfo> tablesInfo object:

{code}
TableInfo tableInfo = tablesInfo.get(tableName);
{code}

but in private SortedMap<String, TableInfo> loadHdfsRegionInfos() there is a check:

{code}
for (HbckInfo hbi : hbckInfos) {
  if (hbi.getHdfsHRI() == null) {
    // was an orphan
    continue;
  }
{code}

so if a region is an orphan, its table is never added to the SortedMap<String, TableInfo> tablesInfo, and the later lookup returns null, producing the NullPointerException.
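The shape of the bug, and the defensive lookup a fix would need, can be sketched with a toy stand-in for the tablesInfo map (this is illustrative code, not the actual HBaseFsck implementation): a null result for an orphan region's table must be handled explicitly instead of dereferenced.

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class OrphanLookup {
  // Stand-in for adoptHdfsOrphan's lookup against the map built by
  // loadHdfsRegionInfos(), where orphan regions' tables were skipped.
  static String adoptOrphan(SortedMap<String, String> tablesInfo, String tableName) {
    String tableInfo = tablesInfo.get(tableName);
    if (tableInfo == null) {
      // Defensive path instead of the NPE: report and skip (a real fix might
      // instead create the missing TableInfo entry here).
      return "skipped orphan table: " + tableName;
    }
    return "adopted into: " + tableInfo;
  }

  public static void main(String[] args) {
    SortedMap<String, String> tablesInfo = new TreeMap<>();
    tablesInfo.put("t1", "TableInfo(t1)");
    System.out.println(adoptOrphan(tablesInfo, "t1"));
    System.out.println(adoptOrphan(tablesInfo, "TestHdfsOrphans1")); // previously NPE'd
  }
}
```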
[jira] [Commented] (HBASE-10965) Automate detection of presence of Filter#filterRow()
[ https://issues.apache.org/jira/browse/HBASE-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988837#comment-13988837 ]

Ted Yu commented on HBASE-10965:

I ran the test suite on Linux. There were 3 test failures:

{code}
Failed tests:
  testQuarantineMissingHFile(org.apache.hadoop.hbase.util.TestHBaseFsck): expected:<2> but was:<1>

Tests in error:
  testFlushCommitsWithAbort(org.apache.hadoop.hbase.client.TestMultiParallel): test timed out after 30 milliseconds
  testMasterRestartWhenTableInEnabling(org.apache.hadoop.hbase.master.TestAssignmentManager)
{code}

The above tests don't use filters. They passed on re-run.

Automate detection of presence of Filter#filterRow()

Key: HBASE-10965
URL: https://issues.apache.org/jira/browse/HBASE-10965
Project: HBase
Issue Type: Task
Components: Filters
Reporter: Ted Yu
Assignee: Ted Yu
Attachments: 10965-v1.txt, 10965-v2.txt, 10965-v3.txt, 10965-v4.txt, 10965-v6.txt, 10965-v7.txt
[jira] [Updated] (HBASE-11109) flush region sequence id may not be larger than all edits flushed
[ https://issues.apache.org/jira/browse/HBASE-11109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-11109:

Attachment: 11109v2.txt

hadoopqa is gone? retry

flush region sequence id may not be larger than all edits flushed

Key: HBASE-11109
URL: https://issues.apache.org/jira/browse/HBASE-11109
Project: HBase
Issue Type: Sub-task
Components: wal
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack
Priority: Critical
Fix For: 0.99.0
Attachments: 11109.txt, 11109v2.txt, 11109v2.txt

This was found by [~jeffreyz]. See the parent issue. We have had this issue since we put the ring buffer/disruptor into the WAL (HBASE-10156). An edit's region sequence id is set only after the edit has traversed the ring buffer. When flushing, we just take whatever the current region sequence id is. Crossing the ring buffer may take some time and is done by background threads. The flusher may be taking the region sequence id even though some edits have not yet made it across the ring buffer: i.e. edits that are actually in scope for the flush may have region sequence ids in excess of the flush sequence id reported. The consequences are not exactly clear. I would rather not have to find out, so let's fix this here.
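The race described above can be shown with a toy model (this is not HBase's actual FSHLog code): edits get their sequence ids only after a background consumer drains them from the buffer, so a flush that simply reads the current id can miss in-flight edits unless it first waits for everything already submitted.

```java
import java.util.concurrent.atomic.AtomicLong;

public class FlushSeqId {
  // Edits get a sequence id only after crossing the (simulated) ring buffer.
  final AtomicLong submitted = new AtomicLong(); // edits handed to the buffer
  final AtomicLong assigned = new AtomicLong();  // edits that have a sequence id

  void append() { submitted.incrementAndGet(); }
  void drainOne() { assigned.incrementAndGet(); } // stand-in for the background consumer

  // Unsafe: reads the current assigned id while edits may still be in flight,
  // so the reported flush id can be smaller than ids of edits in the flush.
  long flushSeqIdUnsafe() { return assigned.get(); }

  // Safe: wait until every edit submitted before the flush point has an id.
  long flushSeqIdSafe() {
    long target = submitted.get();
    while (assigned.get() < target) {
      drainOne(); // stand-in for blocking on the consumer catching up
    }
    return assigned.get();
  }

  public static void main(String[] args) {
    FlushSeqId wal = new FlushSeqId();
    wal.append(); wal.append(); wal.append(); // 3 edits submitted
    wal.drainOne();                           // only 1 has a sequence id so far
    System.out.println(wal.flushSeqIdUnsafe()); // 1: misses two in-flight edits
    System.out.println(wal.flushSeqIdSafe());   // 3: covers everything submitted
  }
}
```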
[jira] [Commented] (HBASE-11096) stop method of Master and RegionServer coprocessor is not invoked
[ https://issues.apache.org/jira/browse/HBASE-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1391#comment-1391 ]

Andrew Purtell commented on HBASE-11096:

We should have a quick and simple unit test for this issue. It's no problem to add a new one rather than change an existing test.

stop method of Master and RegionServer coprocessor is not invoked

Key: HBASE-11096
URL: https://issues.apache.org/jira/browse/HBASE-11096
Project: HBase
Issue Type: Bug
Affects Versions: 0.96.2, 0.98.1, 0.94.19
Reporter: Qiang Tian
Assignee: Qiang Tian
Fix For: 0.99.0, 0.96.3, 0.94.20, 0.98.3
Attachments: HBASE-11096-trunk-v0.patch, HBASE-11096-trunk-v0.patch, HBASE-11096-trunk-v1.patch

The stop method of coprocessors specified by hbase.coprocessor.master.classes and hbase.coprocessor.regionserver.classes is not invoked. If a coprocessor allocates OS resources, this could cause a master/regionserver resource leak or a hang during exit.
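The expected lifecycle, and roughly what such a unit test would assert, can be sketched with stand-in classes (these are not HBase's actual coprocessor host interfaces): a host that tracks loaded coprocessors and invokes stop() on each one at shutdown.

```java
import java.util.ArrayList;
import java.util.List;

public class CoprocessorLifecycle {
  interface Coprocessor {
    void start();
    void stop();
  }

  // Stand-in host: the fix under discussion amounts to making shutdown
  // call stop() on every loaded coprocessor.
  static class Host {
    private final List<Coprocessor> loaded = new ArrayList<>();
    void load(Coprocessor c) { loaded.add(c); c.start(); }
    void shutdown() { for (Coprocessor c : loaded) c.stop(); }
  }

  // Records its lifecycle so a test can verify stop() was actually invoked.
  static class TrackingCoprocessor implements Coprocessor {
    boolean started, stopped;
    public void start() { started = true; }
    public void stop() { stopped = true; }
  }

  public static void main(String[] args) {
    Host host = new Host();
    TrackingCoprocessor cp = new TrackingCoprocessor();
    host.load(cp);
    host.shutdown();
    System.out.println(cp.stopped); // true: the resource-release hook ran
  }
}
```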
[jira] [Commented] (HBASE-10602) Cleanup HTable public interface
[ https://issues.apache.org/jira/browse/HBASE-10602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1392#comment-1392 ]

Andrew Purtell commented on HBASE-10602:

bq. It has been discussed (in the master redesign issue, and by Stack in RB for this issue) that at the RPC layer, we would want to fire up a request like createTable to the master, and get a token back so that we can query the results later with it.

Server task management with a client API for querying task status was also proposed on HBASE-10170 for long-running coprocessor endpoint invocations.

Cleanup HTable public interface

Key: HBASE-10602
URL: https://issues.apache.org/jira/browse/HBASE-10602
Project: HBase
Issue Type: Improvement
Components: Client, Usability
Reporter: Nick Dimiduk
Assignee: Enis Soztutar
Priority: Blocker
Fix For: 0.99.0
Attachments: hbase-10602_v1.patch

HBASE-6580 replaced the preferred means of HTableInterface acquisition with the HConnection#getTable factory methods. HBASE-9117 removes the HConnection cache, placing the burden of responsible connection cleanup on whoever acquires it. The remaining HTable constructors use a Connection instance and manage their own HConnection on the caller's behalf. This is convenient but also a surprising source of poor performance for anyone accustomed to the previous connection-caching behavior. I propose deprecating those remaining constructors for 0.98/0.96 and removing them for 1.0. While I'm at it, I suggest we pursue some API hygiene in general and convert HTable into an interface. I'm sure there are method overloads accepting String/byte[]/TableName where just TableName is sufficient. Can that be done for 1.0 as well?
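The ownership model being argued for — the caller owns one expensive connection, and tables are lightweight views obtained from a factory method — can be sketched with toy classes (illustrative stand-ins, not HBase's real HConnection/HTable API):

```java
public class ConnectionOwnership {
  // Toy stand-in for a per-table handle; cheap to create and close.
  static class Table implements AutoCloseable {
    final String name;
    Table(String n) { name = n; }
    public void close() { }
  }

  // Toy stand-in for the expensive, caller-managed connection. Tables are
  // obtained via a factory method instead of a constructor that would
  // silently create (and possibly leak) its own connection.
  static class Connection implements AutoCloseable {
    boolean closed;
    Table getTable(String name) { return new Table(name); }
    public void close() { closed = true; }
  }

  static Connection createConnection() { return new Connection(); }

  public static void main(String[] args) throws Exception {
    // Caller-managed lifecycle: one connection, many cheap tables,
    // everything released deterministically by try-with-resources.
    try (Connection conn = createConnection();
         Table t = conn.getTable("mytable")) {
      System.out.println(t.name);
    }
  }
}
```

The performance surprise in the deprecated constructors is exactly the inversion of this: each HTable built its own connection, so callers paid the expensive setup per table instead of once.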
[jira] [Commented] (HBASE-10926) Use global procedure to flush table memstore cache
[ https://issues.apache.org/jira/browse/HBASE-10926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1394#comment-1394 ]

Andrew Purtell commented on HBASE-10926:

It would be good if you could report back a successful local unit test suite run [~jerryjch], but the committer can do this at commit time as well.

Use global procedure to flush table memstore cache

Key: HBASE-10926
URL: https://issues.apache.org/jira/browse/HBASE-10926
Project: HBase
Issue Type: Improvement
Components: Admin
Affects Versions: 0.96.2, 0.98.1
Reporter: Jerry He
Assignee: Jerry He
Fix For: 0.99.0
Attachments: HBASE-10926-trunk-v1.patch, HBASE-10926-trunk-v2.patch, HBASE-10926-trunk-v3.patch, HBASE-10926-trunk-v4.patch

Currently, a user can trigger a table flush through the hbase shell or the HBaseAdmin API. To flush the table cache, each region server hosting the regions is contacted and flushed sequentially, which is inefficient. In HBase snapshots, a global procedure is used to coordinate and flush the regions in a distributed way. Let's provide a distributed table flush for general use.
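The efficiency argument can be shown in plain Java: contacting each region server sequentially costs the sum of the individual flush times, while a coordinated flush runs them concurrently and costs roughly the maximum. This is a toy sketch of the concurrency shape only, not the actual procedure framework:

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelFlush {
  // Run every per-regionserver flush concurrently instead of one at a time.
  static void flushAll(List<Runnable> regionServerFlushes) {
    ExecutorService pool = Executors.newFixedThreadPool(regionServerFlushes.size());
    for (Runnable flush : regionServerFlushes) {
      pool.execute(flush);
    }
    pool.shutdown();
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    // Each countDown() stands in for one region server flushing its memstores.
    CountDownLatch done = new CountDownLatch(3);
    flushAll(List.of(done::countDown, done::countDown, done::countDown));
    System.out.println(done.getCount()); // 0: every server flushed
  }
}
```

A real distributed procedure adds coordination on top of this (barriers, error handling, member acks), but the latency win comes from the same fan-out.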
[jira] [Commented] (HBASE-11109) flush region sequence id may not be larger than all edits flushed
[ https://issues.apache.org/jira/browse/HBASE-11109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988909#comment-13988909 ]

Mikhail Antonov commented on HBASE-11109:

bq. hadoopqa is gone? retry

Probably. I tried cancelling/resubmitting the consensus patches multiple times over the past few days and didn't get any builds kicked off..

flush region sequence id may not be larger than all edits flushed

Key: HBASE-11109
URL: https://issues.apache.org/jira/browse/HBASE-11109
Project: HBase
Issue Type: Sub-task
Components: wal
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack
Priority: Critical
Fix For: 0.99.0
Attachments: 11109.txt, 11109v2.txt, 11109v2.txt