[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143780#comment-13143780 ]

Phabricator commented on HBASE-4746:

tedyu has commented on the revision "[jira] [HBASE-4746] [89-fb] Use a random ZK client port in unit tests so we can run them in parallel".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/MiniZooKeeperCluster.java:64 How was 0xc000 chosen?
  src/main/java/org/apache/hadoop/hbase/MiniZooKeeperCluster.java:113 Should this be debug? I imagine there would be many mini zookeeper clusters running in parallel on the build machine.
  src/main/java/org/apache/hadoop/hbase/client/HTable.java:90 Nice. We can mark this ctor deprecated in HBase 0.92. For HBase TRUNK, I think we should remove this ctor and the ctor on line 104.

REVISION DETAIL
  https://reviews.facebook.net/D255

Use a random ZK client port in unit tests so we can run them in parallel
---
Key: HBASE-4746
URL: https://issues.apache.org/jira/browse/HBASE-4746
Project: HBase
Issue Type: Improvement
Reporter: Mikhail Bautin
Attachments: D255.1.patch

The hard-coded ZK client port has long been a problem for running the HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows running the unit tests in parallel in 10 minutes. A fix for the trunk will follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
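For readers following the "random free port" idea, here is a minimal sketch of how such a port could be picked. It is not the code from D255: the class and method names are hypothetical, and the choice of range is an assumption. 0xc000 (49152) happens to be the start of the IANA dynamic/private port range, which may or may not be why the patch uses it.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Random;

/** Hypothetical helper (not from the patch): finds a client port that is free right now. */
final class RandomPortPicker {
    // Assumption: stay inside the IANA dynamic/private range 49152-65535.
    static final int MIN_PORT = 0xC000;
    static final int MAX_PORT = 0xFFFF;

    /** Tries random ports in [MIN_PORT, MAX_PORT] until one can be bound. */
    static int pickFreePort(Random random, int maxAttempts) throws IOException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int candidate = MIN_PORT + random.nextInt(MAX_PORT - MIN_PORT + 1);
            try (ServerSocket probe = new ServerSocket(candidate)) {
                // Bind succeeded; the probe socket is closed when we leave the try block.
                return candidate;
            } catch (IOException portInUse) {
                // Port is taken or not bindable; fall through and try another candidate.
            }
        }
        throw new IOException("No free port found after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) throws IOException {
        int port = pickFreePort(new Random(), 100);
        System.out.println("A mini ZK cluster could listen on port " + port);
    }
}
```

Note that another process can still grab the port between the probe and the real bind, so a test harness would probably want to retry on bind failure as well, and would have to hand the chosen port to every client configuration, as the description says.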
[jira] [Updated] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HBASE-4742:
---
    Attachment: D237.1.patch

Liyin requested code review of "[jira] [HBASE-4742] Split dead server's log in parallel".
Reviewers: Kannan, khemani, Karthik, mbautin, JIRA

When one region server goes down, the master shuts down the region server and splits its log. However, splitting the log is a blocking call and takes some time. If more than one region server goes down, the master splits their logs one by one, which is inefficient. Since we have distributed log splitting, we could split the logs of the dead servers in parallel.

TEST PLAN
  1) add one unit test for the multiple region server shutdown case
  2) run all the unit tests
  3) test it in the dev cluster

REVISION DETAIL
  https://reviews.facebook.net/D237

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java
  src/test/java/org/apache/hadoop/hbase/master/TestMultiRegionServerShutDown.java

Split dead server's log in parallel
---
Key: HBASE-4742
URL: https://issues.apache.org/jira/browse/HBASE-4742
Project: HBase
Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
Attachments: D237.1.patch
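To make the proposal concrete, here is a rough sketch of splitting several dead servers' logs concurrently while still waiting for all of them before moving on. It is not the D237 patch: `splitLog` is a hypothetical stand-in for the blocking split call, and the thread-pool sizing is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only: one blocking split per dead server, run concurrently. */
final class ParallelLogSplitSketch {
    /** Hypothetical stand-in for the blocking per-server log split. */
    static String splitLog(String deadServer) throws InterruptedException {
        Thread.sleep(50); // simulate the blocking work
        return deadServer;
    }

    /** Splits all logs in parallel and returns only after every split has finished. */
    static List<String> splitAll(List<String> deadServers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, deadServers.size()));
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String server : deadServers) {
                tasks.add(() -> splitLog(server));
            }
            List<String> finished = new ArrayList<>();
            // invokeAll blocks until all tasks are done; futures come back in task order.
            for (Future<String> result : pool.invokeAll(tasks)) {
                finished.add(result.get()); // rethrows if a split failed
            }
            return finished;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Split logs of: " + splitAll(List.of("rs1", "rs2", "rs3")));
    }
}
```

With three servers the total wait is roughly one split duration instead of three, and regions would only be reopened after `splitAll` returns.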
[jira] [Commented] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143781#comment-13143781 ]

Phabricator commented on HBASE-4742:

Karthik has commented on the revision "[jira] [HBASE-4742] Split dead server's log in parallel".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:65 Could we write a one-line comment about the state? I think ABORT means there is no need to split any logs, right?
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:331 Might be missing something here... this break will just cause the operation to return as if it succeeded, but log splitting is still in progress, right? We would then proceed to open the regions without ensuring this is completed? Maybe we need:
    this.requeue();
    return false;

REVISION DETAIL
  https://reviews.facebook.net/D237
[jira] [Commented] (HBASE-4722) TestGlobalMemStoreSize has started failing
[ https://issues.apache.org/jira/browse/HBASE-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143785#comment-13143785 ]

Hudson commented on HBASE-4722:
---
Integrated in HBase-TRUNK #2409 (See [https://builds.apache.org/job/HBase-TRUNK/2409/])
    HBASE-4722 TestGlobalMemStoreSize has started failing; ADDENDUM

stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestGlobalMemStoreSize.java

TestGlobalMemStoreSize has started failing
--
Key: HBASE-4722
URL: https://issues.apache.org/jira/browse/HBASE-4722
Project: HBase
Issue Type: Bug
Reporter: stack
Priority: Critical
Attachments: 4722.txt, logging-v2.txt, logging.txt

I'm digging in. It fails occasionally for me locally too.
[jira] [Commented] (HBASE-4727) Don't unconditionally delete UNASSIGNED ZNode for a region.
[ https://issues.apache.org/jira/browse/HBASE-4727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143823#comment-13143823 ]

ramkrishna.s.vasudevan commented on HBASE-4727:
---
@Madhuwanti
Could you please take a look at HBASE-4540? I think that solves your problem. If I am wrong, let me know.

Don't unconditionally delete UNASSIGNED ZNode for a region.
---
Key: HBASE-4727
URL: https://issues.apache.org/jira/browse/HBASE-4727
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.89.20100924
Reporter: Madhuwanti Vaidya
Assignee: Madhuwanti Vaidya
Priority: Minor

Unconditionally deleting an UNASSIGNED ZNode when the master processes RS2ZK_REGION_OPENED (from the toDo queue) for a region has caused multiply assigned regions or unassigned regions. One proposed fix is to check whether the ZNode is actually in the state RS2ZK_REGION_OPENED before deleting it. Another fix is to not delete the ZNode at all.
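The first proposed fix (check the ZNode state before deleting) amounts to a compare-and-delete. Below is a minimal sketch of that idea, assuming a toy in-memory map in place of ZooKeeper. The class and method names are hypothetical, and the second state name is used for illustration only. Against a real ZooKeeper ensemble the check and the delete would have to be tied together some other way, for example by passing the version read alongside the data to `ZooKeeper.delete(path, version)`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Toy model (not HBase code): region name -> state stored in its UNASSIGNED znode. */
final class ConditionalZNodeDelete {
    static final String OPENED = "RS2ZK_REGION_OPENED";

    private final ConcurrentMap<String, String> unassigned = new ConcurrentHashMap<>();

    void setState(String region, String state) {
        unassigned.put(region, state);
    }

    boolean exists(String region) {
        return unassigned.containsKey(region);
    }

    /**
     * Deletes the node only if it is currently in the OPENED state.
     * ConcurrentMap.remove(key, value) performs the check and the removal atomically.
     */
    boolean deleteIfOpened(String region) {
        return unassigned.remove(region, OPENED);
    }

    public static void main(String[] args) {
        ConditionalZNodeDelete zk = new ConditionalZNodeDelete();
        zk.setState("regionA", OPENED);
        zk.setState("regionB", "M2ZK_REGION_OFFLINE"); // illustrative "some other state"
        System.out.println("regionA deleted: " + zk.deleteIfOpened("regionA"));
        System.out.println("regionB deleted: " + zk.deleteIfOpened("regionB"));
    }
}
```

The point of the atomic form is that a node which has moved to another state between the read and the delete is left alone, which is the situation the issue describes.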
[jira] [Assigned] (HBASE-4729) Race between online altering and splitting kills the master
[ https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan reassigned HBASE-4729:
-
    Assignee: ramkrishna.s.vasudevan

Race between online altering and splitting kills the master
---
Key: HBASE-4729
URL: https://issues.apache.org/jira/browse/HBASE-4729
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: ramkrishna.s.vasudevan
Fix For: 0.92.0, 0.94.0

I was running an online alter while regions were splitting, and suddenly the master died and left my table half-altered (haven't restarted the master yet). What killed the master:

{quote}
2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception creating node CLOSING
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
  at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
  at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
  at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
  at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
  at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
  at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
  at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
  at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:662)
{quote}

A znode was created because the region server was splitting the region 4 seconds before:

{quote}
2011-11-02 17:06:40,704 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
2011-11-02 17:06:40,704 DEBUG org.apache.hadoop.hbase.regionserver.SplitTransaction: regionserver:62023-0x132f043bbde0710 Creating ephemeral node for f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state
2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Attempting to transition node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLITTING
...
2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Successfully transitioned node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLIT
2011-11-02 17:06:44,061 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the master to process the split for f7e1783e65ea8d621a4bc96ad310f101
{quote}

Now that the master is dead the region server is spewing those last two lines like mad.
[jira] [Commented] (HBASE-4739) Master dying while going to close a region can leave it in transition forever
[ https://issues.apache.org/jira/browse/HBASE-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143830#comment-13143830 ]

ramkrishna.s.vasudevan commented on HBASE-4739:
---
Can we try issuing a close call after some timeout? Currently the code below in HRegionServer

{code}
protected boolean closeRegion(HRegionInfo region, final boolean abort, final boolean zk) {
  if (this.regionsInTransitionInRS.containsKey(region.getEncodedNameAsBytes())) {
    LOG.warn("Received close for region we are already opening or closing; " + region.getEncodedName());
    return false;
  }
{code}

prevents a second close call from getting through only while the region is already in transition on the RS. So in the case J-D described, if we issue a close call after a timeout, it should be able to close the region, right? Correct me if I am wrong.

Master dying while going to close a region can leave it in transition forever
-
Key: HBASE-4739
URL: https://issues.apache.org/jira/browse/HBASE-4739
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Priority: Minor
Fix For: 0.92.0, 0.94.0, 0.90.5

I saw this in the aftermath of HBASE-4729 on a 0.92 refreshed yesterday. When the master died it had just created the RIT znode for a region but hadn't yet told the RS to close it. When the master restarted it saw the znode and started printing this:

{quote}
2011-11-03 00:02:49,130 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: TestTable,0007560564,1320253568406.f76899564cabe7e9857c3aeb526ec9dc. state=CLOSING, ts=1320253605285, server=sv4r11s38,62003,1320195046948
2011-11-03 00:02:49,130 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been CLOSING for too long, this should eventually complete or the server will expire, doing nothing
{quote}

It's never going to happen, and it's blocking balancing. I'm marking this as minor since I believe this situation is pretty rare unless you hit other bugs while trying out stuff to root bugs out.
[jira] [Commented] (HBASE-4726) RS should close region if it fails to mark it as 'OPENED'.
[ https://issues.apache.org/jira/browse/HBASE-4726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143831#comment-13143831 ]

ramkrishna.s.vasudevan commented on HBASE-4726:
---
This may also be fixed by HBASE-4540.

RS should close region if it fails to mark it as 'OPENED'.
--
Key: HBASE-4726
URL: https://issues.apache.org/jira/browse/HBASE-4726
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.89.20100924
Reporter: Madhuwanti Vaidya
Assignee: Madhuwanti Vaidya
Priority: Minor

Currently if a RS fails to mark a region as 'OPENED' it only logs an error. It will leave the region open - this has caused duplicate region assignments in one of our production clusters.
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-4737:
---
    Status: Open (was: Patch Available)

Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
--
Key: HBASE-4737
URL: https://issues.apache.org/jira/browse/HBASE-4737
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch

1) Split the tests into 3 categories
 - small: no cluster, less than 15s, can be run in parallel with other tests in a JVM
 - medium: 45s, not flaky, useful to detect bugs immediately
 - large: the remaining tests
2) Allow running a subset: developers should need to run only small and medium before submitting a patch
 - will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329

Small is the default. All other tests will have to be marked Medium or Large with a JUnit category.

Proposed split:
Small: 122 classes, 479 methods, ~3 minutes (when not run in parallel)
Medium: 78 classes, 373 methods, ~23 minutes
Large: 34 classes, 221 methods, ~60 minutes

I will have to extract the methods that are today in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this will be done in a second step (and another JIRA).
MEDIUM LIST (name; number of methods, time)
org.apache.hadoop.hbase.avro.TestAvroServer  3  31.468
org.apache.hadoop.hbase.catalog.TestCatalogTracker  8  4.174
org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster  1  3.888
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor  5  30.157
org.apache.hadoop.hbase.client.replication.TestReplicationAdmin  1  0.762
org.apache.hadoop.hbase.client.TestHCM  3  21.961
org.apache.hadoop.hbase.client.TestHTablePool  18  26.274
org.apache.hadoop.hbase.client.TestHTableUtil  2  16.997
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD  3  24.629
org.apache.hadoop.hbase.client.TestMetaScanner  1  16.365
org.apache.hadoop.hbase.client.TestMultiParallel  10  34.077
org.apache.hadoop.hbase.client.TestTimestampsFilter  3  27.547
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol  44  16.834
org.apache.hadoop.hbase.coprocessor.TestClassLoading  5  31.346
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint  2  32.736
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort  1  13.874
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove  1  16.923
org.apache.hadoop.hbase.coprocessor.TestMasterObserver  3  29.97
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass  2  14.976
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface  5  33.353
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort  1  16.596
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove  1  18.183
org.apache.hadoop.hbase.coprocessor.TestWALObserver  3  19.373
org.apache.hadoop.hbase.filter.TestColumnRangeFilter  1  19.045
org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache  5  24.294
org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache  7  19.818
org.apache.hadoop.hbase.io.hfile.TestHFileBlock  7  25.226
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache  7  0.343
org.apache.hadoop.hbase.mapreduce.TestImportTsv  8  40.391
org.apache.hadoop.hbase.master.TestActiveMasterManager  2  0.724
org.apache.hadoop.hbase.master.TestHMasterRPCException  1  1.17
org.apache.hadoop.hbase.master.TestLogsCleaner  1  2.953
org.apache.hadoop.hbase.master.TestMaster  1  18.918
org.apache.hadoop.hbase.master.TestOpenedRegionHandler  2  20.57
org.apache.hadoop.hbase.master.TestSplitLogManager  10  13.979
org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion  3  21.675
org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler  3  0.887
org.apache.hadoop.hbase.regionserver.TestBlocksRead  4  1.42
org.apache.hadoop.hbase.regionserver.TestCompoundBloomFilter  3  22.694
org.apache.hadoop.hbase.regionserver.TestFSErrorsExposed  3  29.764
org.apache.hadoop.hbase.regionserver.TestHRegion  57  28.552
org.apache.hadoop.hbase.regionserver.TestMasterAddressManager  1  0.525
org.apache.hadoop.hbase.regionserver.TestMultiColumnScanner  6
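The small/medium/large rule from the issue description can be written down as a tiny classifier. This is only a sketch of one reading of the rule: it assumes "45s" means "at most 45 seconds", it ignores flakiness, and the class, enum and parameter names are hypothetical. It is not how the patch assigns categories (those are JUnit categories set by hand), and several classes in the medium list run well under 15 seconds, which the description says will be sorted out in a later step.

```java
/** Sketch of the proposed size rule; names and thresholds are an interpretation. */
final class TestSizeClassifier {
    enum Size { SMALL, MEDIUM, LARGE }

    /**
     * small:  no cluster and under 15 seconds
     * medium: at most 45 seconds (assumed reading of "45s")
     * large:  everything else
     */
    static Size classify(boolean needsCluster, double seconds) {
        if (!needsCluster && seconds < 15.0) {
            return Size.SMALL;
        }
        if (seconds <= 45.0) {
            return Size.MEDIUM;
        }
        return Size.LARGE;
    }

    public static void main(String[] args) {
        // Durations below are made-up inputs, not measurements from the list.
        System.out.println(classify(false, 3.0));  // cluster-free and quick
        System.out.println(classify(true, 20.0));  // needs a mini cluster
        System.out.println(classify(true, 60.0));  // too slow for medium
    }
}
```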
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-4737:
---
    Attachment: 2003_4737_pom.patch

same file, new test
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-4737:
---
    Status: Patch Available (was: Open)
[jira] [Created] (HBASE-4748) Restart the cluster after alter table(online) completely losses the table information
Restart the cluster after alter table(online) completely losses the table information
-
Key: HBASE-4748
URL: https://issues.apache.org/jira/browse/HBASE-4748
Project: HBase
Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan

1. Start a cluster.
2. Alter a table.
3. Restart the master using ./hbase-daemon.sh restart master
4. Kill the RS after the master restarts.
5. Start the RS again.
6. No table operations can be performed on the table that was altered, but admin.listTables() is able to list the altered table.
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143932#comment-13143932 ]

Hadoop QA commented on HBASE-4737:
--
-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12502384/2003_4737_pom.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

    -1 javadoc. The javadoc tool appears to have generated -164 warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs. The patch appears to introduce 46 new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/170//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/170//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/170//console

This message is automatically generated.
MEDIUM LIST (name; number of methods, time)
org.apache.hadoop.hbase.avro.TestAvroServer 3 31.468
org.apache.hadoop.hbase.catalog.TestCatalogTracker 8 4.174
org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster 1 3.888
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor 5 30.157
org.apache.hadoop.hbase.client.replication.TestReplicationAdmin 1 0.762
org.apache.hadoop.hbase.client.TestHCM 3 21.961
org.apache.hadoop.hbase.client.TestHTablePool 18 26.274
org.apache.hadoop.hbase.client.TestHTableUtil 2 16.997
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD 3 24.629
org.apache.hadoop.hbase.client.TestMetaScanner 1 16.365
org.apache.hadoop.hbase.client.TestMultiParallel 10 34.077
org.apache.hadoop.hbase.client.TestTimestampsFilter 3 27.547
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol 44 16.834
org.apache.hadoop.hbase.coprocessor.TestClassLoading 5 31.346
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint 2 32.736
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort 1 13.874
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove 1 16.923
org.apache.hadoop.hbase.coprocessor.TestMasterObserver 3 29.97
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass 2 14.976
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface 5 33.353
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort 1 16.596
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove 1 18.183
org.apache.hadoop.hbase.coprocessor.TestWALObserver 3 19.373
org.apache.hadoop.hbase.filter.TestColumnRangeFilter 1 19.045
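As a rough illustration of the three categories described above, the rule could be written down as follows. This is only a sketch: the class TestSizeClassifier, its enum, and the reading of "45s" as an upper bound for medium are assumptions made for the example and are not part of the patch. In the proposal itself the category is declared on each test with a JUnit category, not computed.

```java
import java.time.Duration;

/** Hypothetical helper; names are illustrative and not taken from the HBase patch. */
final class TestSizeClassifier {
    enum Size { SMALL, MEDIUM, LARGE }

    // Limits from the proposal: small is under 15s; medium is read here as "up to 45s" (assumption).
    static final Duration SMALL_LIMIT = Duration.ofSeconds(15);
    static final Duration MEDIUM_LIMIT = Duration.ofSeconds(45);

    static Size classify(boolean needsCluster, boolean flaky, Duration runtime) {
        // small: no cluster and less than 15s
        if (!needsCluster && runtime.compareTo(SMALL_LIMIT) < 0) {
            return Size.SMALL;
        }
        // medium: not flaky and within the 45s budget
        if (!flaky && runtime.compareTo(MEDIUM_LIMIT) <= 0) {
            return Size.MEDIUM;
        }
        // large: everything that remains
        return Size.LARGE;
    }

    public static void main(String[] args) {
        System.out.println(classify(false, false, Duration.ofSeconds(3)));
        System.out.println(classify(true, false, Duration.ofMillis(31468)));
        System.out.println(classify(true, true, Duration.ofSeconds(20)));
    }
}
```

Under these assumptions a fast test that still needs a cluster falls into medium, which would be consistent with sub-second entries appearing in the medium list.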
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-4737:
---
Status: Open (was: Patch Available)
MEDIUM LIST (name; number of methods, time)
org.apache.hadoop.hbase.avro.TestAvroServer 3 31.468
org.apache.hadoop.hbase.catalog.TestCatalogTracker 8 4.174
org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster 1 3.888
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor 5 30.157
org.apache.hadoop.hbase.client.replication.TestReplicationAdmin 1 0.762
org.apache.hadoop.hbase.client.TestHCM 3 21.961
org.apache.hadoop.hbase.client.TestHTablePool 18 26.274
org.apache.hadoop.hbase.client.TestHTableUtil 2 16.997
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD 3 24.629
org.apache.hadoop.hbase.client.TestMetaScanner 1 16.365
org.apache.hadoop.hbase.client.TestMultiParallel 10 34.077
org.apache.hadoop.hbase.client.TestTimestampsFilter 3 27.547
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol 44 16.834
org.apache.hadoop.hbase.coprocessor.TestClassLoading 5 31.346
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint 2 32.736
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort 1 13.874
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove 1 16.923
org.apache.hadoop.hbase.coprocessor.TestMasterObserver 3 29.97
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass 2 14.976
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface 5 33.353
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort 1 16.596
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove 1 18.183
org.apache.hadoop.hbase.coprocessor.TestWALObserver 3 19.373
org.apache.hadoop.hbase.filter.TestColumnRangeFilter 1 19.045
org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache 5 24.294
org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache 7 19.818
org.apache.hadoop.hbase.io.hfile.TestHFileBlock 7 25.226
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache 7 0.343
org.apache.hadoop.hbase.mapreduce.TestImportTsv 8 40.391
org.apache.hadoop.hbase.master.TestActiveMasterManager 2 0.724
org.apache.hadoop.hbase.master.TestHMasterRPCException 1 1.17
org.apache.hadoop.hbase.master.TestLogsCleaner 1 2.953
org.apache.hadoop.hbase.master.TestMaster 1 18.918
org.apache.hadoop.hbase.master.TestOpenedRegionHandler 2 20.57
org.apache.hadoop.hbase.master.TestSplitLogManager 10 13.979
org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion 3 21.675
org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler 3 0.887
org.apache.hadoop.hbase.regionserver.TestBlocksRead 4 1.42
org.apache.hadoop.hbase.regionserver.TestCompoundBloomFilter 3 22.694
org.apache.hadoop.hbase.regionserver.TestFSErrorsExposed 3 29.764
org.apache.hadoop.hbase.regionserver.TestHRegion 57 28.552
org.apache.hadoop.hbase.regionserver.TestMasterAddressManager 1 0.525
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-4737:
---
Attachment: 2003_4737_pom.dummy.patch

Dummy patch to test the build. Does nothing.
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-4737:
---
Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-2502) HBase won't bind to designated interface when more than one network interface is available
[ https://issues.apache.org/jira/browse/HBASE-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13143959#comment-13143959 ]

Harsh J commented on HBASE-2502:

Isn't this solved by using the hbase.master.dns.interface and hbase.master.dns.nameserver props right now?

HBase won't bind to designated interface when more than one network interface is available
--
Key: HBASE-2502
URL: https://issues.apache.org/jira/browse/HBASE-2502
Project: HBase
Issue Type: Bug
Reporter: stack

See this message by Michael Segel up on the list: http://www.mail-archive.com/hbase-user@hadoop.apache.org/msg10042.html
This comes up from time to time.
[jira] [Created] (HBASE-4749) TestMasterFailover case occasional fails
TestMasterFailover case occasional fails

Key: HBASE-4749
URL: https://issues.apache.org/jira/browse/HBASE-4749
Project: HBase
Issue Type: Bug
Components: test
Affects Versions: 0.92.0
Reporter: gaojinchao
Priority: Minor
Fix For: 0.92.0

See these logs: https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/105/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testMasterFailoverWithMockedRITOnDeadRS/
[jira] [Commented] (HBASE-4749) TestMasterFailover case occasional fails
[ https://issues.apache.org/jira/browse/HBASE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13143990#comment-13143990 ]

gaojinchao commented on HBASE-4749:
---
This seems to be a bug in TRUNK. In version 0.90, when we kill a RS and start a Master at the same time, the Master does not add the dying RS to the online set. In version 0.92, however, it does add the dying RS to the online set. This produces a number of unusual scenarios:
1. If root/meta is on the dying RS, we may lose data because the HLog is not split. See issue https://issues.apache.org/jira/browse/HBASE-4511.
2. In the testMasterFailoverWithMockedRITOnDeadRS case, the mocked scenarios become invalid. See these logs:

// we kill this RS (1320357166142)
2011-11-03 21:52:56,007 INFO [Thread-986] master.TestMasterFailover(1011): Killing RS juno.apache.org,60001,1320357166142
// we pick up this RS (1320357166142) through its zk node
2011-11-03 21:52:57,356 INFO [Master:0;juno.apache.org,51313,1320357176029] master.HMaster(464): Registering server found up in zk: juno.apache.org,60001,1320357166142
2011-11-03 21:52:57,357 INFO [Master:0;juno.apache.org,51313,1320357176029] master.ServerManager(239): Registering server=juno.apache.org,60001,1320357166142

So I think we should wait until the killed RS has shut down before starting a new HMaster.
[jira] [Commented] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144009#comment-13144009 ]

Phabricator commented on HBASE-4742:

tedyu has commented on the revision [jira] [HBASE-4742] Split dead server's log in parallel.

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:351 Something seems to be missing after the log statement. Minor: missing space between double quote and 'and'
src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:311 A return statement is missing, if I understand correctly.

REVISION DETAIL
https://reviews.facebook.net/D237

Split dead server's log in parallel
---
Key: HBASE-4742
URL: https://issues.apache.org/jira/browse/HBASE-4742
Project: HBase
Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
Attachments: D237.1.patch

When a region server goes down, the master will shut down the region server and split its log. However, splitting the log is a blocking call and takes some time. If more than one region server goes down, the master will split their logs one by one, which is not efficient. Since we have the distributed log split, we could split the logs from the dead servers in parallel.
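The idea in the description above, submitting one blocking split per dead server and waiting for all of them together, might be sketched like this. It is an illustration only and not the code under review in D237: ParallelLogSplitSketch, splitAll and splitLog are hypothetical names, and splitLog is a stand-in for the real blocking split call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; not the ProcessServerShutdown implementation. */
final class ParallelLogSplitSketch {

    /** Stand-in for the real, blocking per-server log split. */
    static String splitLog(String deadServer) {
        return "split:" + deadServer;
    }

    /** Splits the logs of all dead servers concurrently; results keep the input order. */
    static List<String> splitAll(List<String> deadServers, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String server : deadServers) {
                Callable<String> task = () -> splitLog(server);
                pending.add(pool.submit(task)); // each split runs on a pool thread
            }
            List<String> done = new ArrayList<>();
            for (Future<String> f : pending) {
                try {
                    done.add(f.get()); // wait for this server's split to finish
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while splitting logs", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("log split failed", e.getCause());
                }
            }
            return done;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(splitAll(Arrays.asList("rs1", "rs2", "rs3"), 2));
    }
}
```

With this shape the total wait is bounded by the slowest split (given enough threads) instead of the sum of all splits.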
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144033#comment-13144033 ] Hadoop QA commented on HBASE-4737: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502387/2003_4737_pom.dummy.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -164 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 46 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler org.apache.hadoop.hbase.client.TestAdmin org.apache.hadoop.hbase.util.TestFSUtils org.apache.hadoop.hbase.master.TestDistributedLogSplitting org.apache.hadoop.hbase.master.TestMasterFailover Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/171//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/171//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/171//console This message is automatically generated. 
[jira] [Commented] (HBASE-4749) TestMasterFailover case occasional fails
[ https://issues.apache.org/jira/browse/HBASE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144036#comment-13144036 ]

Ted Yu commented on HBASE-4749:
---
Thanks for the finding, Jinchao. From the log of build 105:
{code}
Killing RS juno.apache.org,60001,1320357166142
2011-11-03 21:52:56,007 FATAL [Thread-986] regionserver.HRegionServer(1523): ABORTING region server juno.apache.org,60001,1320357166142: Killing for unit test
...
2011-11-03 21:52:56,011 WARN [Thread-986] regionserver.HRegionServer(1545): Unable to report fatal error to master
java.lang.reflect.UndeclaredThrowableException
at $Proxy16.reportRSFatalError(Unknown Source)
at org.apache.hadoop.hbase.regionserver.HRegionServer.abort(HRegionServer.java:1541)
...
2011-11-03 21:52:57,356 INFO [Master:0;juno.apache.org,51313,1320357176029] master.HMaster(464): Registering server found up in zk: juno.apache.org,60001,1320357166142
2011-11-03 21:52:57,357 INFO [Master:0;juno.apache.org,51313,1320357176029] master.ServerManager(239): Registering server=juno.apache.org,60001,1320357166142
...
2011-11-03 21:52:57,586 INFO [Thread-986-EventThread] zookeeper.RegionServerTracker(93): RegionServer ephemeral node deleted, processing expiration [juno.apache.org,60001,1320357166142]
2011-11-03 21:52:57,588 INFO [RegionServer:1;juno.apache.org,60001,1320357166142] regionserver.HRegionServer(744): stopping server juno.apache.org,60001,1320357166142; zookeeper connection closed.
{code}
We can see that there was a 570ms delay for the completion of the region server shutdown handler. That was why re-registration of the dead region server happened. Since reportRSFatalError() encountered an exception, we cannot rely on this callback to reach the master.

We have two options:
1. devise a mechanism to tell the new master the identity of the dead region server
2. insert a sleep of, say, 1 second before starting the new master

Option 1 introduces extra complexity into Master. I am not sure if it is worth it just for test purposes. Many people wouldn't like option 2. More discussion is welcome.
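In the spirit of Jinchao's suggestion to wait until the killed RS has shut down, a test could poll for that condition with a timeout instead of sleeping for a fixed second. The helper below is a generic sketch under that assumption: WaitForSketch and waitFor are hypothetical names, and the condition passed in (for example "the region server thread is no longer alive") is left to the test.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Hypothetical test helper; not an existing HBase utility. */
final class WaitForSketch {

    /**
     * Polls the condition every intervalMs until it holds or timeoutMs has elapsed.
     * Returns true if the condition was observed to hold, false on timeout or interrupt.
     */
    static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without seeing the condition
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated "region server" that finishes shutting down after about 100 ms.
        AtomicBoolean stopped = new AtomicBoolean(false);
        new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException ignored) {
                // fall through and mark as stopped
            }
            stopped.set(true);
        }).start();

        boolean sawShutdown = waitFor(5_000, 10, stopped::get);
        System.out.println("shutdown observed before starting new master: " + sawShutdown);
    }
}
```

Compared with a fixed sleep, the test continues as soon as the shutdown is visible and fails explicitly if it never is.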
[jira] [Commented] (HBASE-4722) TestGlobalMemStoreSize has started failing
[ https://issues.apache.org/jira/browse/HBASE-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144080#comment-13144080 ]

stack commented on HBASE-4722:
--
I think this fix should work. When I go back to the pre-patch code and insert an assert that fails if the memstore size changes between our getting its size and our actual use of it, it fails after a few runs:
{code}
Caused by: java.io.IOException: currentMemStoreSize=17000, actual=17608
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1284)
... 32 more
{code}
I'll leave this issue open a while longer in case the fix is not complete.

TestGlobalMemStoreSize has started failing
--
Key: HBASE-4722
URL: https://issues.apache.org/jira/browse/HBASE-4722
Project: HBase
Issue Type: Bug
Reporter: stack
Priority: Critical
Attachments: 4722.txt, logging-v2.txt, logging.txt

I'm digging in. It fails occasionally for me locally too.
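The kind of check described in the comment above, failing when the size sampled earlier no longer matches the size at the point of use, can be illustrated with a small stand-alone class. This is a sketch only and not the HBase patch or HRegion code: MemStoreSizeCheckSketch and its methods are hypothetical, an unchecked exception replaces the IOException from the trace, and the atomic variant shows just one possible way to close such a gap.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration of a sample-then-use race on a size counter. */
final class MemStoreSizeCheckSketch {
    private final AtomicLong memstoreSize = new AtomicLong();

    long add(long delta) {
        return memstoreSize.addAndGet(delta);
    }

    long size() {
        return memstoreSize.get();
    }

    /** Uses a size sampled earlier; throws if writers changed it in the meantime. */
    long flushWithSampledSize(long sampledSize) {
        long actual = memstoreSize.get();
        if (actual != sampledSize) {
            throw new IllegalStateException(
                "currentMemStoreSize=" + sampledSize + ", actual=" + actual);
        }
        return memstoreSize.addAndGet(-sampledSize);
    }

    /** Reads and resets the counter in one atomic step, leaving no gap for writers. */
    long flushAtomically() {
        return memstoreSize.getAndSet(0);
    }

    public static void main(String[] args) {
        MemStoreSizeCheckSketch region = new MemStoreSizeCheckSketch();
        region.add(17000);
        long sampled = region.size(); // size read ahead of the flush
        region.add(608);              // a write lands between the read and the use
        try {
            region.flushWithSampledSize(sampled);
        } catch (IllegalStateException e) {
            System.out.println("check failed: " + e.getMessage());
        }
        System.out.println("flushed atomically: " + region.flushAtomically());
    }
}
```

The numbers are borrowed from the trace above purely as sample values.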
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4737: --- Status: Open (was: Patch Available) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM -- Key: HBASE-4737 URL: https://issues.apache.org/jira/browse/HBASE-4737 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch

1) Split the tests in 3 categories
- small: no cluster, less than 15s, can be run in parallel with other tests in a JVM
- medium: 45s, not flaky, useful to detect bugs immediately
- large: remaining
2) Allow running a subset: developers should need to run only small and medium before submitting a patch
- will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329

Small is the default. All other tests will have to be marked Medium or Large with a JUnit category.

Proposed split:
Small: 122 classes, 479 methods, ~3 minutes without parallelization
Medium: 78 classes, 373 methods, ~23 minutes
Large: 34 classes, 221 methods, ~60 minutes

I will have to extract the methods that are today in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this will be done in a second step (and another JIRA).

MEDIUM LIST (name; number of methods; time)
org.apache.hadoop.hbase.avro.TestAvroServer  3  31.468
org.apache.hadoop.hbase.catalog.TestCatalogTracker  8  4.174
org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster  1  3.888
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor  5  30.157
org.apache.hadoop.hbase.client.replication.TestReplicationAdmin  1  0.762
org.apache.hadoop.hbase.client.TestHCM  3  21.961
org.apache.hadoop.hbase.client.TestHTablePool  18  26.274
org.apache.hadoop.hbase.client.TestHTableUtil  2  16.997
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD  3  24.629
org.apache.hadoop.hbase.client.TestMetaScanner  1  16.365
org.apache.hadoop.hbase.client.TestMultiParallel  10  34.077
org.apache.hadoop.hbase.client.TestTimestampsFilter  3  27.547
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol  44  16.834
org.apache.hadoop.hbase.coprocessor.TestClassLoading  5  31.346
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint  2  32.736
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort  1  13.874
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove  1  16.923
org.apache.hadoop.hbase.coprocessor.TestMasterObserver  3  29.97
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass  2  14.976
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface  5  33.353
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort  1  16.596
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove  1  18.183
org.apache.hadoop.hbase.coprocessor.TestWALObserver  3  19.373
org.apache.hadoop.hbase.filter.TestColumnRangeFilter  1  19.045
org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache  5  24.294
org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache  7  19.818
org.apache.hadoop.hbase.io.hfile.TestHFileBlock  7  25.226
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache  7  0.343
org.apache.hadoop.hbase.mapreduce.TestImportTsv  8  40.391
org.apache.hadoop.hbase.master.TestActiveMasterManager  2  0.724
org.apache.hadoop.hbase.master.TestHMasterRPCException  1  1.17
org.apache.hadoop.hbase.master.TestLogsCleaner  1  2.953
org.apache.hadoop.hbase.master.TestMaster  1  18.918
org.apache.hadoop.hbase.master.TestOpenedRegionHandler  2  20.57
org.apache.hadoop.hbase.master.TestSplitLogManager  10  13.979
org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion  3  21.675
org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler  3  0.887
org.apache.hadoop.hbase.regionserver.TestBlocksRead  4  1.42
org.apache.hadoop.hbase.regionserver.TestCompoundBloomFilter  3  22.694
org.apache.hadoop.hbase.regionserver.TestFSErrorsExposed  3  29.764
org.apache.hadoop.hbase.regionserver.TestHRegion  57  28.552
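The small/medium/large rule in the HBASE-4737 description can be sketched as a tiny classifier. This is only an illustration, not code from the patch: the class name TestSizeSketch, the needsCluster flag and the sample durations are hypothetical, and reading "medium: 45s" as an upper bound of 45 seconds is an assumption.

```java
// Hypothetical helper, not part of HBase or of the HBASE-4737 patch.
// It buckets a test class using the thresholds quoted in the issue description.
class TestSizeSketch {
    enum Size { SMALL, MEDIUM, LARGE }

    // "small: no cluster, less than 15s"
    static final double SMALL_LIMIT_SECONDS = 15.0;
    // Assumption: "medium: 45s" is read as "less than 45 seconds".
    static final double MEDIUM_LIMIT_SECONDS = 45.0;

    static Size classify(boolean needsCluster, double seconds) {
        // Small tests must not start a cluster and must finish quickly.
        if (!needsCluster && seconds < SMALL_LIMIT_SECONDS) {
            return Size.SMALL;
        }
        // Anything else under the medium limit is medium.
        if (seconds < MEDIUM_LIMIT_SECONDS) {
            return Size.MEDIUM;
        }
        // "large: remaining"
        return Size.LARGE;
    }

    public static void main(String[] args) {
        // Sample inputs are made up for illustration.
        System.out.println(classify(false, 2.0));
        System.out.println(classify(true, 2.0));
        System.out.println(classify(true, 30.0));
        System.out.println(classify(false, 90.0));
    }
}
```

In the proposal itself the category is declared by hand on each test class with a JUnit category rather than computed; a classifier like this would only help when drafting the initial split from measured timings.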
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4737: --- Attachment: 2003_4737_pom.v2.patch Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM -- Key: HBASE-4737 URL: https://issues.apache.org/jira/browse/HBASE-4737 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4737: --- Status: Patch Available (was: Open) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM -- Key: HBASE-4737 URL: https://issues.apache.org/jira/browse/HBASE-4737 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch
[jira] [Commented] (HBASE-4749) TestMasterFailover case occasional fails
[ https://issues.apache.org/jira/browse/HBASE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144085#comment-13144085 ] Ted Yu commented on HBASE-4749: --- A third option: delete the ephemeral node for the aborted region server before starting the new master. TestMasterFailover case occasional fails Key: HBASE-4749 URL: https://issues.apache.org/jira/browse/HBASE-4749 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.0 Reporter: gaojinchao Priority: Minor Fix For: 0.92.0 See these logs: https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/105/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testMasterFailoverWithMockedRITOnDeadRS/ -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-4741) Online schema change doesn't return errors
[ https://issues.apache.org/jira/browse/HBASE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-4741: - Assignee: (was: Ted Yu) I may not have time to work on this in the next week. Online schema change doesn't return errors -- Key: HBASE-4741 URL: https://issues.apache.org/jira/browse/HBASE-4741 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.0 Attachments: 4741-v2.txt, 4741-v3.txt, 4741-v4.txt, 4741.txt

Still after the fun I had over in HBASE-4729, I tried to finish altering my table (remove a family) since only half of it was changed, so I did this:
{quote}
hbase(main):002:0> alter 'TestTable', NAME => 'allo', METHOD => 'delete'
Updating all regions with the new schema...
244/244 regions updated.
Done.
0 row(s) in 1.2480 seconds
{quote}
Nice, it all looks good, but over in the master log:
{quote}
org.apache.hadoop.hbase.InvalidFamilyOperationException: Family 'allo' does not exist so cannot be deleted
    at org.apache.hadoop.hbase.master.handler.TableDeleteFamilyHandler.handleTableOperation(TableDeleteFamilyHandler.java:56)
    at org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:86)
    at org.apache.hadoop.hbase.master.HMaster.deleteColumn(HMaster.java:1011)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:348)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1242)
{quote}
Maybe we should do checks before launching the async task. Marking critical as this is a regression.
[jira] [Updated] (HBASE-4741) Online schema change doesn't return errors
[ https://issues.apache.org/jira/browse/HBASE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4741: -- Attachment: 4741-v4.txt Patch v4 removes an unnecessary deleteColumn() call. Online schema change doesn't return errors -- Key: HBASE-4741 URL: https://issues.apache.org/jira/browse/HBASE-4741 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: Ted Yu Priority: Critical Fix For: 0.92.0 Attachments: 4741-v2.txt, 4741-v3.txt, 4741-v4.txt, 4741.txt Still after the fun I had over in HBASE-4729, I tried to finish altering my table (remove a family) since only half of it was changed so I did this: {quote} hbase(main):002:0 alter 'TestTable', NAME = 'allo', METHOD = 'delete' Updating all regions with the new schema... 244/244 regions updated. Done. 0 row(s) in 1.2480 seconds {quote} Nice it all looks good, but over in the master log: {quote} org.apache.hadoop.hbase.InvalidFamilyOperationException: Family 'allo' does not exist so cannot be deleted at org.apache.hadoop.hbase.master.handler.TableDeleteFamilyHandler.handleTableOperation(TableDeleteFamilyHandler.java:56) at org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:86) at org.apache.hadoop.hbase.master.HMaster.deleteColumn(HMaster.java:1011) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:348) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1242) {quote} Maybe we should do checks before launching the async task. Marking critical as this is a regression. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
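The suggestion in HBASE-4741 to "do checks before launching the async task" can be illustrated with a small, self-contained sketch. It is not the attached patch and does not use HBase classes: SchemaChangeSketch, UnknownFamilyException, deleteFamily and hasFamily are hypothetical names, and an in-memory set stands in for the table descriptor. The idea shown is only that validation happens on the caller's thread, so the error reaches the client instead of ending up solely in the master log.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not HBase code: validate synchronously, then go async.
class SchemaChangeSketch {
    // Hypothetical stand-in for an "invalid family" error returned to the client.
    static class UnknownFamilyException extends Exception {
        UnknownFamilyException(String message) { super(message); }
    }

    // In-memory stand-in for the column families of one table.
    private final Set<String> families = ConcurrentHashMap.newKeySet();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    SchemaChangeSketch(String... initialFamilies) {
        for (String family : initialFamilies) {
            families.add(family);
        }
    }

    boolean hasFamily(String family) {
        return families.contains(family);
    }

    Future<?> deleteFamily(String family) throws UnknownFamilyException {
        // Check first, on the caller's thread, so the caller sees the failure.
        if (!families.contains(family)) {
            throw new UnknownFamilyException(
                "Family '" + family + "' does not exist so cannot be deleted");
        }
        // Only a request that passed validation reaches the async executor.
        return executor.submit(() -> { families.remove(family); });
    }

    void shutdown() {
        executor.shutdown();
    }

    public static void main(String[] args) throws Exception {
        SchemaChangeSketch table = new SchemaChangeSketch("info");
        try {
            table.deleteFamily("allo");
        } catch (UnknownFamilyException e) {
            System.out.println("Rejected before async work: " + e.getMessage());
        }
        table.deleteFamily("info").get(); // wait for the async part to finish
        System.out.println("info still present: " + table.hasFamily("info"));
        table.shutdown();
    }
}
```

The check and the later removal are not atomic in this sketch; a real implementation would also have to handle a family that disappears between validation and execution.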
[jira] [Updated] (HBASE-4337) Update HBase directory structure layout to be aligned with Hadoop
[ https://issues.apache.org/jira/browse/HBASE-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated HBASE-4337: - Attachment: HBASE-4337.patch Added a new tarball layout that is aligned with Hadoop. The default layout is untouched. To build the directory structure layout aligned with Hadoop, build the system like this: {noformat} mvn -P rpm,deb,binary -DskipTests {noformat} Update HBase directory structure layout to be aligned with Hadoop - Key: HBASE-4337 URL: https://issues.apache.org/jira/browse/HBASE-4337 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.90.3 Reporter: Eric Yang Assignee: Eric Yang Attachments: HBASE-4337.patch In HADOOP-6255, a proposal was made for a common directory layout for the Hadoop ecosystem. This jira is to track the necessary work for making the HBase directory structure aligned with Hadoop for better integration.
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4737: --- Status: Open (was: Patch Available) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM -- Key: HBASE-4737 URL: https://issues.apache.org/jira/browse/HBASE-4737 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4737: --- Attachment: 2003_4737_pom.v2.patch Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM -- Key: HBASE-4737 URL: https://issues.apache.org/jira/browse/HBASE-4737 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4737: --- Status: Patch Available (was: Open) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM -- Key: HBASE-4737 URL: https://issues.apache.org/jira/browse/HBASE-4737 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch
[jira] [Updated] (HBASE-4337) Update HBase directory structure layout to be aligned with Hadoop
[ https://issues.apache.org/jira/browse/HBASE-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated HBASE-4337: - Affects Version/s: (was: 0.90.3) 0.92.0 Release Note: Added binary only profile for building binary only tar ball. Status: Patch Available (was: Open) Update HBase directory structure layout to be aligned with Hadoop - Key: HBASE-4337 URL: https://issues.apache.org/jira/browse/HBASE-4337 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.0 Reporter: Eric Yang Assignee: Eric Yang Attachments: HBASE-4337.patch
[jira] [Commented] (HBASE-4749) TestMasterFailover case occasional fails
[ https://issues.apache.org/jira/browse/HBASE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144107#comment-13144107 ] stack commented on HBASE-4749: -- Nice work Jinchao. I think you've pinpointed the root issue. Having the master register regionservers that have not heartbeated the master but that still have an ephemeral node up in zk looks dangerous. It was added by hbase-1502, by me, where I purged master/regionserver control via heartbeats. I think we need to remove this bit of code. Looking TestMasterFailover case occasional fails Key: HBASE-4749 URL: https://issues.apache.org/jira/browse/HBASE-4749 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.0 Reporter: gaojinchao Priority: Minor Fix For: 0.92.0 See these logs: https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/105/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testMasterFailoverWithMockedRITOnDeadRS/
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144114#comment-13144114 ]

stack commented on HBASE-4737:

@N I'm having a similar issue over in HBASE-4553. Trying to figure out what it is.

Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM

Key: HBASE-4737
URL: https://issues.apache.org/jira/browse/HBASE-4737
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch

1) Split the tests into 3 categories:
- small: no cluster, less than 15s, can be run in parallel with other tests in a JVM
- medium: 45s, not flaky, useful to detect bugs immediately
- large: the remaining tests
2) Allow running a subset: developers should need to run only small and medium before submitting a patch
- will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329

Small is the default. All other tests will have to be marked Medium or Large with a JUnit category.

Proposed split:
Small: 122 classes, 479 methods, ~3 minutes (when not run in parallel)
Medium: 78 classes, 373 methods, ~23 minutes
Large: 34 classes, 221 methods, ~60 minutes

I will have to extract the methods that are today in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this will be done in a second step (and another JIRA).

MEDIUM LIST (name; number of methods; time)
org.apache.hadoop.hbase.avro.TestAvroServer  3  31.468
org.apache.hadoop.hbase.catalog.TestCatalogTracker  8  4.174
org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster  1  3.888
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor  5  30.157
org.apache.hadoop.hbase.client.replication.TestReplicationAdmin  1  0.762
org.apache.hadoop.hbase.client.TestHCM  3  21.961
org.apache.hadoop.hbase.client.TestHTablePool  18  26.274
org.apache.hadoop.hbase.client.TestHTableUtil  2  16.997
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD  3  24.629
org.apache.hadoop.hbase.client.TestMetaScanner  1  16.365
org.apache.hadoop.hbase.client.TestMultiParallel  10  34.077
org.apache.hadoop.hbase.client.TestTimestampsFilter  3  27.547
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol  44  16.834
org.apache.hadoop.hbase.coprocessor.TestClassLoading  5  31.346
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint  2  32.736
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort  1  13.874
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove  1  16.923
org.apache.hadoop.hbase.coprocessor.TestMasterObserver  3  29.97
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass  2  14.976
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface  5  33.353
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort  1  16.596
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove  1  18.183
org.apache.hadoop.hbase.coprocessor.TestWALObserver  3  19.373
org.apache.hadoop.hbase.filter.TestColumnRangeFilter  1  19.045
org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache  5  24.294
org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache  7  19.818
org.apache.hadoop.hbase.io.hfile.TestHFileBlock  7  25.226
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache  7  0.343
org.apache.hadoop.hbase.mapreduce.TestImportTsv  8  40.391
org.apache.hadoop.hbase.master.TestActiveMasterManager  2  0.724
org.apache.hadoop.hbase.master.TestHMasterRPCException  1  1.17
org.apache.hadoop.hbase.master.TestLogsCleaner  1  2.953
org.apache.hadoop.hbase.master.TestMaster  1  18.918
org.apache.hadoop.hbase.master.TestOpenedRegionHandler  2  20.57
org.apache.hadoop.hbase.master.TestSplitLogManager  10  13.979
org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion  3  21.675
org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler  3  0.887
org.apache.hadoop.hbase.regionserver.TestBlocksRead  4  1.42
org.apache.hadoop.hbase.regionserver.TestCompoundBloomFilter  3  22.694
org.apache.hadoop.hbase.regionserver.TestFSErrorsExposed  3  29.764
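The category rules in the HBASE-4737 description can be read as a small decision function. The following is only a sketch of that reading, not code from the patch: the class `TestSizeSketch` and its method are hypothetical, and it assumes that "medium: 45s" means an upper bound of 45 seconds and that durations are in seconds.

```java
// Hypothetical helper, not part of HBase or of the attached patches.
final class TestSizeSketch {
    enum Size { SMALL, MEDIUM, LARGE }

    // Assumption: "small" means no cluster and under 15 seconds,
    // "medium" means under 45 seconds and not flaky, "large" is the rest.
    static Size classify(boolean needsCluster, boolean flaky, double seconds) {
        if (!needsCluster && seconds < 15.0) {
            return Size.SMALL;
        }
        if (!flaky && seconds < 45.0) {
            return Size.MEDIUM;
        }
        return Size.LARGE;
    }

    public static void main(String[] args) {
        // A fast test without a cluster, a 30 second cluster test, a 90 second test.
        System.out.println(classify(false, false, 3.0));
        System.out.println(classify(true, false, 30.0));
        System.out.println(classify(true, false, 90.0));
    }
}
```

In the proposal itself the category is declared on each test with a JUnit category rather than computed, so a function like this would only be useful for producing a first draft of the split from measured durations.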
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144118#comment-13144118 ]

stack commented on HBASE-4737:

Hmm... maybe the last HBASE-4553 patch is something else...
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144130#comment-13144130 ]

nkeywal commented on HBASE-4737:

The default profile is not taken into account; I don't know why. But putting the property in the project itself should work. That's what v2 is about. With v1, I think the process was forked only once for all tests, which broke the HBase tests and then surefire :-/
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-4737:

Attachment: hbasetests.sh

The script first launches the small tests (the list is hardcoded until the categories are set in the source code), then runs the other tests with two parallel Maven instances. It can replay the failed tests. Still beta, but it improves the build time by around 40%, to less than 1 hour for the whole test suite.
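The flow described for hbasetests.sh (small tests first, the rest on two parallel workers, then a replay of the failures) might look roughly like the sketch below. This is not the attached script: `ParallelRunSketch`, `runAll`, and `runSuite` are hypothetical names, threads stand in for the two Maven instances, and the `runner` predicate is an assumed stand-in for "run this test and report whether it passed".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical sketch of the flow; not the hbasetests.sh script itself.
final class ParallelRunSketch {
    // Runs every test on a pool of `workers` threads and returns the names that failed.
    static List<String> runAll(List<String> tests, Predicate<String> runner, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (String test : tests) {
                Callable<Boolean> task = () -> runner.test(test);
                results.add(pool.submit(task));
            }
            List<String> failed = new ArrayList<>();
            for (int i = 0; i < tests.size(); i++) {
                if (!results.get(i).get()) {
                    failed.add(tests.get(i));
                }
            }
            return failed;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("test run aborted", e);
        } finally {
            pool.shutdown();
        }
    }

    // Small tests first, the others on two workers, then one replay of the failures.
    // Returns the tests that still fail after the replay.
    static List<String> runSuite(List<String> small, List<String> others,
                                 Predicate<String> runner) {
        List<String> failed = new ArrayList<>(runAll(small, runner, 1));
        failed.addAll(runAll(others, runner, 2));
        return runAll(failed, runner, 1);
    }
}
```

Replaying failures on a single worker is a design choice of this sketch: it separates tests that fail only under parallel load from tests that fail on their own.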
[jira] [Commented] (HBASE-4749) TestMasterFailover case occasional fails
[ https://issues.apache.org/jira/browse/HBASE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144182#comment-13144182 ]

Ted Yu commented on HBASE-4749:

Fourth option: use a Guava ConcurrentMap for ServerManager.onlineServers, where the expiration period can be adjusted by a new config parameter:

{code}
ConcurrentMap<ServerName, HServerLoad> onlineServers =
    new MapMaker().expiration(2, TimeUnit.MINUTES).evictionListener(listener).makeMap();
{code}

The registered listener would call expireServer() for the underlying region server. In HMaster.regionServerReport(), we refresh the entry (through a call to ServerManager) for the reporting region server. In terms of testing, the above approach may incur an additional waiting period.
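The idea behind this option (entries expire unless refreshed by a report, and a listener is told about each expiry) can be illustrated without Guava. The sketch below uses only the JDK rather than the MapMaker calls quoted above, whose availability varies by Guava version. `HeartbeatExpirySketch` and its methods are hypothetical, server names are plain strings, and the clock is passed in explicitly so the behavior is deterministic.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical stand-in for an expiring onlineServers map; not HBase code.
final class HeartbeatExpirySketch {
    private final Map<String, Long> lastReportMillis = new ConcurrentHashMap<>();
    private final long timeoutMillis;
    private final Consumer<String> onExpire;

    HeartbeatExpirySketch(long timeoutMillis, Consumer<String> onExpire) {
        this.timeoutMillis = timeoutMillis;
        this.onExpire = onExpire;
    }

    // Called for every report from a server: refreshes its entry.
    void report(String server, long nowMillis) {
        lastReportMillis.put(server, nowMillis);
    }

    // Removes every server whose last report is at least timeoutMillis old
    // and notifies the listener, which plays the role of expireServer().
    void expireStale(long nowMillis) {
        Iterator<Map.Entry<String, Long>> it = lastReportMillis.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() >= timeoutMillis) {
                String server = entry.getKey();
                it.remove();
                onExpire.accept(server);
            }
        }
    }

    boolean isOnline(String server) {
        return lastReportMillis.containsKey(server);
    }
}
```

The waiting period mentioned in the comment shows up here directly: a dead server stays in the map until `timeoutMillis` has elapsed since its last report, so a test would have to wait at least that long (or inject a clock, as this sketch does).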
[jira] [Updated] (HBASE-4749) TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails
[ https://issues.apache.org/jira/browse/HBASE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4749:

Summary: TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails (was: TestMasterFailover case occasional fails)
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144190#comment-13144190 ]

Ted Yu commented on HBASE-4737:

Awesome, N. It would be nice to see what happens after HBASE-4746 is integrated.
[jira] [Updated] (HBASE-4749) TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails
[ https://issues.apache.org/jira/browse/HBASE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4749:

Priority: Critical (was: Minor)

This issue may require changes in master code.
[jira] [Commented] (HBASE-2425) Crossport HADOOP-1849 rpc fix
[ https://issues.apache.org/jira/browse/HBASE-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144218#comment-13144218 ]

jirapos...@reviews.apache.org commented on HBASE-2425:

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2718/#review3049

Just a couple of comments. Otherwise looks good to me.

src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
https://reviews.apache.org/r/2718/#comment6797
We could eliminate the flag and use status instead. Are there plans for other bits being set in this? Otherwise, we always have length and error can be determined from Status. Or would removing this break asynchbase in other ways?

src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
https://reviews.apache.org/r/2718/#comment6796
Good

src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java
https://reviews.apache.org/r/2718/#comment6792
Don't think this is necessary? The super.readFields(in) should throw VersionMismatchException if the read version doesn't match our getVersion().

src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java
https://reviews.apache.org/r/2718/#comment6793
Probably better to use super.write(out) here. Same code, but future proof to changes.

src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java
https://reviews.apache.org/r/2718/#comment6794
Should be able to remove this now that Invocation implements VersionedWritable.

src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java
https://reviews.apache.org/r/2718/#comment6795
Should be able to remove this now that Invocation implements VersionedWritable. I didn't see any dependency on rpc version outside of the Invocation serialization.

- Gary

On 2011-11-04 00:11:21, Michael Stack wrote:
bq. This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2718/
bq. (Updated 2011-11-04 00:11:21)
bq. Review request for hbase.
bq. Summary
bq. Versions of Gary suggestions
bq. This addresses bug hbase-2425. https://issues.apache.org/jira/browse/hbase-2425
bq. Diffs
bq. src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java fce5490
bq. src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java 2fa4d6f
bq. src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java 6f88357
bq. src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java 6fcb771
bq. src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 1365411
bq. src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 4a8918a
bq. src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java e60f970
bq. src/main/java/org/apache/hadoop/hbase/ipc/ProtocolSignature.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/ipc/Status.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/ipc/VersionedProtocol.java fb07374
bq. src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java 60a9248
bq. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 8de2314
bq. src/main/java/org/apache/hadoop/hbase/master/HMaster.java 0d0e4c5
bq. src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 12bd33e
bq. src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java 888f428
bq. src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java e5b6a78
bq. Diff: https://reviews.apache.org/r/2718/diff
bq. Testing
bq. Thanks,
bq. Michael

Crossport HADOOP-1849 rpc fix

Key: HBASE-2425
URL: https://issues.apache.org/jira/browse/HBASE-2425
Project: HBase
Issue Type: Task
Reporter: stack
Labels: moved_from_0_20_5

Suggested over in HBASE-2360.
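The review's point about Invocation.java is that a versioned base class can own the version byte: the subclass calls super.write(out) and super.readFields(in), and the base class rejects a mismatched version. The sketch below illustrates that pattern with JDK classes only. `VersionedSketch` and `InvocationSketch` are hypothetical stand-ins, not Hadoop's VersionedWritable or HBase's Invocation, and a plain IOException stands in for VersionMismatchException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical base class: writes and checks a one-byte version.
abstract class VersionedSketch {
    abstract byte getVersion();

    void write(DataOutput out) throws IOException {
        out.writeByte(getVersion());
    }

    void readFields(DataInput in) throws IOException {
        byte found = in.readByte();
        if (found != getVersion()) {
            throw new IOException("version mismatch: expected " + getVersion()
                    + ", found " + found);
        }
    }
}

// Hypothetical subclass: delegates the version handling to the base class.
final class InvocationSketch extends VersionedSketch {
    String methodName = "";

    @Override
    byte getVersion() {
        return 1;
    }

    @Override
    void write(DataOutput out) throws IOException {
        super.write(out);          // version byte first
        out.writeUTF(methodName);  // then the payload
    }

    @Override
    void readFields(DataInput in) throws IOException {
        super.readFields(in);      // throws if the version does not match
        methodName = in.readUTF();
    }

    static byte[] toBytes(InvocationSketch invocation) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            invocation.write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static InvocationSketch fromBytes(byte[] bytes) {
        try {
            InvocationSketch invocation = new InvocationSketch();
            invocation.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return invocation;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this shape, a separate version check in the subclass or in the RPC engine is redundant, which is the reason the review suggests removing those checks.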
[jira] [Commented] (HBASE-4741) Online schema change doesn't return errors
[ https://issues.apache.org/jira/browse/HBASE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144217#comment-13144217 ]

Hadoop QA commented on HBASE-4741:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502464/4741-v4.txt against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -164 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 46 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
org.apache.hadoop.hbase.client.TestAdmin
org.apache.hadoop.hbase.coprocessor.TestMasterObserver
org.apache.hadoop.hbase.master.TestDistributedLogSplitting
org.apache.hadoop.hbase.master.TestMasterFailover

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/172//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/172//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/172//console

This message is automatically generated.

Online schema change doesn't return errors

Key: HBASE-4741
URL: https://issues.apache.org/jira/browse/HBASE-4741
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Priority: Critical
Fix For: 0.92.0
Attachments: 4741-v2.txt, 4741-v3.txt, 4741-v4.txt, 4741.txt

Still after the fun I had over in HBASE-4729, I tried to finish altering my table (removing a family), since only half of it was changed, so I did this:

{quote}
hbase(main):002:0> alter 'TestTable', NAME => 'allo', METHOD => 'delete'
Updating all regions with the new schema...
244/244 regions updated.
Done.
0 row(s) in 1.2480 seconds
{quote}

Nice, it all looks good, but over in the master log:

{quote}
org.apache.hadoop.hbase.InvalidFamilyOperationException: Family 'allo' does not exist so cannot be deleted
at org.apache.hadoop.hbase.master.handler.TableDeleteFamilyHandler.handleTableOperation(TableDeleteFamilyHandler.java:56)
at org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:86)
at org.apache.hadoop.hbase.master.HMaster.deleteColumn(HMaster.java:1011)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:348)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1242)
{quote}

Maybe we should do checks before launching the async task. Marking critical as this is a regression.
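The suggestion to "do checks before launching the async task" amounts to validating the request on the caller's thread, so that the error reaches the client, and only then handing the slow work to a background task. The sketch below shows that ordering with JDK classes only. `DeleteFamilySketch` is hypothetical and is not how HMaster or TableDeleteFamilyHandler are implemented; an IllegalArgumentException stands in for InvalidFamilyOperationException.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical table schema holder; not HBase code.
final class DeleteFamilySketch {
    private final Set<String> families = ConcurrentHashMap.newKeySet();

    DeleteFamilySketch(String... existingFamilies) {
        for (String family : existingFamilies) {
            families.add(family);
        }
    }

    // The existence check runs synchronously, so the caller sees the failure.
    // Only a valid request reaches the asynchronous part.
    CompletableFuture<Void> deleteFamily(String family) {
        if (!families.contains(family)) {
            throw new IllegalArgumentException(
                    "Family '" + family + "' does not exist so cannot be deleted");
        }
        return CompletableFuture.runAsync(() -> {
            families.remove(family);
        });
    }

    boolean hasFamily(String family) {
        return families.contains(family);
    }
}
```

In the reported scenario, the shell printed "Done." while the exception appeared only in the master log; with a synchronous check of this kind the `alter` call for 'allo' would fail immediately instead.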
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144224#comment-13144224 ] Hadoop QA commented on HBASE-4737: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502487/hbasetests.sh against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/176//console This message is automatically generated. Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM -- Key: HBASE-4737 URL: https://issues.apache.org/jira/browse/HBASE-4737 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, hbasetests.sh 1) Split the tests in 3 categories - small: no cluster, less than 15s, can be run in parallel with other tests in a JVM - medium: 45s, no flaky, useful to detect bugs immediatly - large: remaining 2) Allow to run a subset: developpers should need to run only small and medium before submitting a patch - will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329 Small is the default. All other tests will have to be marked Medium or Large with a JUnit category. 
Proposed split:
Small: 122 classes, 479 methods, ~3 minutes when not run in parallel
Medium: 78 classes, 373 methods, ~23 minutes
Large: 34 classes, 221 methods, ~60 minutes

I will have to extract the methods that are currently in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this will be done in a second step (and another JIRA).

MEDIUM LIST (name; number of methods, time)
org.apache.hadoop.hbase.avro.TestAvroServer  3  31.468
org.apache.hadoop.hbase.catalog.TestCatalogTracker  8  4.174
org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster  1  3.888
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor  5  30.157
org.apache.hadoop.hbase.client.replication.TestReplicationAdmin  1  0.762
org.apache.hadoop.hbase.client.TestHCM  3  21.961
org.apache.hadoop.hbase.client.TestHTablePool  18  26.274
org.apache.hadoop.hbase.client.TestHTableUtil  2  16.997
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD  3  24.629
org.apache.hadoop.hbase.client.TestMetaScanner  1  16.365
org.apache.hadoop.hbase.client.TestMultiParallel  10  34.077
org.apache.hadoop.hbase.client.TestTimestampsFilter  3  27.547
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol  44  16.834
org.apache.hadoop.hbase.coprocessor.TestClassLoading  5  31.346
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint  2  32.736
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort  1  13.874
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove  1  16.923
org.apache.hadoop.hbase.coprocessor.TestMasterObserver  3  29.97
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass  2  14.976
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface  5  33.353
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort  1  16.596
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove  1  18.183
org.apache.hadoop.hbase.coprocessor.TestWALObserver  3  19.373
org.apache.hadoop.hbase.filter.TestColumnRangeFilter  1  19.045
org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache  5  24.294
org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache  7  19.818
org.apache.hadoop.hbase.io.hfile.TestHFileBlock  7  25.226
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache  7  0.343
org.apache.hadoop.hbase.mapreduce.TestImportTsv  8  40.391
org.apache.hadoop.hbase.master.TestActiveMasterManager  2  0.724
org.apache.hadoop.hbase.master.TestHMasterRPCException  1  1.17
org.apache.hadoop.hbase.master.TestLogsCleaner  1  2.953
org.apache.hadoop.hbase.master.TestMaster  1  18.918
org.apache.hadoop.hbase.master.TestOpenedRegionHandler  2  20.57
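The category rules in the issue description can be written down as a small decision function. This is a minimal sketch, not part of any patch attached to the issue: the class and method names are made up, and it reads "medium: 45s" as "runs in under 45 seconds", which is an assumption. It also explains why sub-second tests can still land in the medium list: duration alone does not make a test small if it needs a cluster.

```java
// Hypothetical sketch of the small/medium/large rules from the issue text.
class TestSizeSketch {

    enum Size { SMALL, MEDIUM, LARGE }

    /**
     * small:  no cluster and less than 15s
     * medium: not flaky and (assumed) less than 45s
     * large:  everything else
     */
    static Size classify(double seconds, boolean needsCluster, boolean flaky) {
        if (!needsCluster && seconds < 15.0) {
            return Size.SMALL;
        }
        if (!flaky && seconds < 45.0) {
            return Size.MEDIUM;
        }
        return Size.LARGE;
    }

    public static void main(String[] args) {
        // Durations taken from the medium list; the cluster flags are guesses.
        System.out.println(classify(31.468, true, false));
        System.out.println(classify(0.343, true, false));
        System.out.println(classify(3.0, false, false));
        System.out.println(classify(60.0, true, false));
    }
}
```

In the real build the category would be attached to each test class with a JUnit category, as the description says; the function above only makes the thresholds explicit.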
[jira] [Commented] (HBASE-4337) Update HBase directory structure layout to be aligned with Hadoop
[ https://issues.apache.org/jira/browse/HBASE-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144223#comment-13144223 ]

Hadoop QA commented on HBASE-4337:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12502466/HBASE-4337.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 javadoc. The javadoc tool appears to have generated -164 warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

-1 findbugs. The patch appears to introduce 46 new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestDistributedLogSplitting
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
org.apache.hadoop.hbase.master.TestMasterFailover

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/175//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/175//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/175//console

Update HBase directory structure layout to be aligned with Hadoop
-----------------------------------------------------------------
Key: HBASE-4337
URL: https://issues.apache.org/jira/browse/HBASE-4337
Project: HBase
Issue Type: Bug
Components: build
Affects Versions: 0.92.0
Reporter: Eric Yang
Assignee: Eric Yang
Attachments: HBASE-4337.patch

In HADOOP-6255, a proposal was made for common directory layout for Hadoop ecosystem. This jira is to track the necessary work for making HBase directory structure aligned with Hadoop for better integration.
[jira] [Commented] (HBASE-2425) Crossport HADOOP-1849 rpc fix
[ https://issues.apache.org/jira/browse/HBASE-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144228#comment-13144228 ]

jirapos...@reviews.apache.org commented on HBASE-2425:
------------------------------------------------------

bq. On 2011-11-04 18:06:38, Gary Helmling wrote:
bq. Just a couple of comments. Otherwise looks good to me.

Excellent. Thanks for the review.

bq. On 2011-11-04 18:06:38, Gary Helmling wrote:
bq. src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java, line 322
bq. https://reviews.apache.org/r/2718/diff/1/?file=56222#file56222line322
bq.
bq. We could eliminate the flag and use status instead. Are there plans for other bits being set in this? Otherwise, we always have length and error can be determined from Status. Or would removing this break asynchbase in other ways?

It's for asynchbase -- so it can tell the difference between a response with length and one without (it's trying to support all hbase versions).

bq. On 2011-11-04 18:06:38, Gary Helmling wrote:
bq. src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java, line 98
bq. https://reviews.apache.org/r/2718/diff/1/?file=56223#file56223line98
bq.
bq. Don't think this is necessary? The super.readFields(in) should throw VersionMismatchException if the read version doesn't match our getVersion().

My mistake. Will fix.

bq. On 2011-11-04 18:06:38, Gary Helmling wrote:
bq. src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java, line 116
bq. https://reviews.apache.org/r/2718/diff/1/?file=56223#file56223line116
bq.
bq. Probably better to use super.write(out) here. Same code, but future proof to changes.

Ditto.

bq. On 2011-11-04 18:06:38, Gary Helmling wrote:
bq. src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java, line 60
bq. https://reviews.apache.org/r/2718/diff/1/?file=56227#file56227line60
bq.
bq. Should be able to remove this now that Invocation implements VersionedWritable.

Agreed.

bq. On 2011-11-04 18:06:38, Gary Helmling wrote:
bq. src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java, line 343
bq. https://reviews.apache.org/r/2718/diff/1/?file=56227#file56227line343
bq.
bq. Should be able to remove this now that Invocation implements VersionedWritable. I didn't see any dependency on rpc version outside of the Invocation serialization.

Good.

- Michael

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2718/#review3049
---

On 2011-11-04 00:11:21, Michael Stack wrote:
bq.
bq. ---
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2718/
bq. ---
bq.
bq. (Updated 2011-11-04 00:11:21)
bq.
bq. Review request for hbase.
bq.
bq. Summary
bq. ---
bq.
bq. Versions of Gary suggestions
bq.
bq. This addresses bug hbase-2425.
bq. https://issues.apache.org/jira/browse/hbase-2425
bq.
bq. Diffs
bq. -
bq.
bq. src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java fce5490
bq. src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java 2fa4d6f
bq. src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java 6f88357
bq. src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java 6fcb771
bq. src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 1365411
bq. src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 4a8918a
bq. src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java e60f970
bq. src/main/java/org/apache/hadoop/hbase/ipc/ProtocolSignature.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/ipc/Status.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/ipc/VersionedProtocol.java fb07374
bq. src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java 60a9248
bq. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 8de2314
bq. src/main/java/org/apache/hadoop/hbase/master/HMaster.java 0d0e4c5
bq. src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 12bd33e
bq. src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java 888f428
bq. src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java e5b6a78
bq.
bq. Diff: https://reviews.apache.org/r/2718/diff
bq.
bq. Testing
bq. ---
bq.
bq. Thanks,
bq.
bq. Michael
bq.

Crossport HADOOP-1849 rpc fix
-----------------------------
Key: HBASE-2425
URL: https://issues.apache.org/jira/browse/HBASE-2425
Project: HBase
Issue Type:
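The review points about super.readFields(in) and super.write(out) follow the usual versioned-serialization pattern: the base class owns the version byte, and subclasses delegate to it instead of repeating the check. The sketch below is a self-contained imitation of that pattern and makes no claim about the patch under review; VersionedRecord, VersionMismatch and Call are hypothetical stand-ins for VersionedWritable, VersionMismatchException and Invocation, written against java.io only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-ins; not the Hadoop or HBase classes themselves.
class VersionedRecordSketch {

    static class VersionMismatch extends IOException {
        VersionMismatch(byte expected, byte found) {
            super("expected version " + expected + " but found " + found);
        }
    }

    /** Base class that writes and verifies a one-byte version prefix. */
    abstract static class VersionedRecord {
        abstract byte getVersion();

        void write(DataOutput out) throws IOException {
            out.writeByte(getVersion());
        }

        void readFields(DataInput in) throws IOException {
            byte found = in.readByte();
            if (found != getVersion()) {
                throw new VersionMismatch(getVersion(), found);
            }
        }
    }

    /** Subclass that delegates version handling to the base class. */
    static class Call extends VersionedRecord {
        String methodName = "";

        @Override
        byte getVersion() {
            return 1;
        }

        @Override
        void write(DataOutput out) throws IOException {
            super.write(out);         // version byte first
            out.writeUTF(methodName);
        }

        @Override
        void readFields(DataInput in) throws IOException {
            super.readFields(in);     // throws on a version mismatch
            methodName = in.readUTF();
        }
    }

    static byte[] toBytes(VersionedRecord record) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        record.write(new DataOutputStream(buffer));
        return buffer.toByteArray();
    }

    static Call fromBytes(byte[] bytes) throws IOException {
        Call call = new Call();
        call.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        return call;
    }

    public static void main(String[] args) throws IOException {
        Call call = new Call();
        call.methodName = "get";
        byte[] wire = toBytes(call);
        System.out.println("round trip: " + fromBytes(wire).methodName);

        wire[0] = 2; // simulate a peer that speaks a different version
        try {
            fromBytes(wire);
        } catch (VersionMismatch e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this shape, a separate version check in the subclass is redundant, which is the point of the "Don't think this is necessary?" comment.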
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-4737:
--------------------------
Status: Open (was: Patch Available)

It seems that the v2 is OK.

Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
--------------------------------------------------------------------------------------------------
Key: HBASE-4737
URL: https://issues.apache.org/jira/browse/HBASE-4737
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, hbasetests.sh

1) Split the tests in 3 categories
- small: no cluster, less than 15s, can be run in parallel with other tests in a JVM
- medium: 45s, not flaky, useful to detect bugs immediately
- large: remaining
2) Allow running a subset: developers should need to run only small and medium before submitting a patch
- will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329

Small is the default. All other tests will have to be marked Medium or Large with a JUnit category.

Proposed split:
Small: 122 classes, 479 methods, ~3 minutes when not run in parallel
Medium: 78 classes, 373 methods, ~23 minutes
Large: 34 classes, 221 methods, ~60 minutes

I will have to extract the methods that are currently in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this will be done in a second step (and another JIRA).
MEDIUM LIST (name; number of methods, time)
org.apache.hadoop.hbase.avro.TestAvroServer  3  31.468
org.apache.hadoop.hbase.catalog.TestCatalogTracker  8  4.174
org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster  1  3.888
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor  5  30.157
org.apache.hadoop.hbase.client.replication.TestReplicationAdmin  1  0.762
org.apache.hadoop.hbase.client.TestHCM  3  21.961
org.apache.hadoop.hbase.client.TestHTablePool  18  26.274
org.apache.hadoop.hbase.client.TestHTableUtil  2  16.997
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD  3  24.629
org.apache.hadoop.hbase.client.TestMetaScanner  1  16.365
org.apache.hadoop.hbase.client.TestMultiParallel  10  34.077
org.apache.hadoop.hbase.client.TestTimestampsFilter  3  27.547
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol  44  16.834
org.apache.hadoop.hbase.coprocessor.TestClassLoading  5  31.346
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint  2  32.736
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort  1  13.874
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove  1  16.923
org.apache.hadoop.hbase.coprocessor.TestMasterObserver  3  29.97
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass  2  14.976
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface  5  33.353
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort  1  16.596
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove  1  18.183
org.apache.hadoop.hbase.coprocessor.TestWALObserver  3  19.373
org.apache.hadoop.hbase.filter.TestColumnRangeFilter  1  19.045
org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache  5  24.294
org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache  7  19.818
org.apache.hadoop.hbase.io.hfile.TestHFileBlock  7  25.226
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache  7  0.343
org.apache.hadoop.hbase.mapreduce.TestImportTsv  8  40.391
org.apache.hadoop.hbase.master.TestActiveMasterManager  2  0.724
org.apache.hadoop.hbase.master.TestHMasterRPCException  1  1.17
org.apache.hadoop.hbase.master.TestLogsCleaner  1  2.953
org.apache.hadoop.hbase.master.TestMaster  1  18.918
org.apache.hadoop.hbase.master.TestOpenedRegionHandler  2  20.57
org.apache.hadoop.hbase.master.TestSplitLogManager  10  13.979
org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion  3  21.675
org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler  3  0.887
org.apache.hadoop.hbase.regionserver.TestBlocksRead  4  1.42
org.apache.hadoop.hbase.regionserver.TestCompoundBloomFilter  3  22.694
org.apache.hadoop.hbase.regionserver.TestFSErrorsExposed  3  29.764
[jira] [Commented] (HBASE-4749) TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails
[ https://issues.apache.org/jira/browse/HBASE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144234#comment-13144234 ]

Ted Yu commented on HBASE-4749:
-------------------------------

I tried option 2. I ran TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS in a loop 20 times and didn't get a failure. Previously it was very easy to reproduce the failure.

TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails
-----------------------------------------------------------------------------
Key: HBASE-4749
URL: https://issues.apache.org/jira/browse/HBASE-4749
Project: HBase
Issue Type: Bug
Components: test
Affects Versions: 0.92.0
Reporter: gaojinchao
Priority: Critical
Fix For: 0.92.0

Look at these logs:
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/105/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testMasterFailoverWithMockedRITOnDeadRS/
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144235#comment-13144235 ]

nkeywal commented on HBASE-4737:
--------------------------------

@ted: going by the comment in HBASE-4746 ("it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow."), it seems that it should work quite well in the end!

Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
--------------------------------------------------------------------------------------------------
Key: HBASE-4737
URL: https://issues.apache.org/jira/browse/HBASE-4737
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, hbasetests.sh

1) Split the tests in 3 categories
- small: no cluster, less than 15s, can be run in parallel with other tests in a JVM
- medium: 45s, not flaky, useful to detect bugs immediately
- large: remaining
2) Allow running a subset: developers should need to run only small and medium before submitting a patch
- will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329

Small is the default. All other tests will have to be marked Medium or Large with a JUnit category.

Proposed split:
Small: 122 classes, 479 methods, ~3 minutes when not run in parallel
Medium: 78 classes, 373 methods, ~23 minutes
Large: 34 classes, 221 methods, ~60 minutes

I will have to extract the methods that are currently in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this will be done in a second step (and another JIRA).
MEDIUM LIST (name; number of methods, time)
org.apache.hadoop.hbase.avro.TestAvroServer  3  31.468
org.apache.hadoop.hbase.catalog.TestCatalogTracker  8  4.174
org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster  1  3.888
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor  5  30.157
org.apache.hadoop.hbase.client.replication.TestReplicationAdmin  1  0.762
org.apache.hadoop.hbase.client.TestHCM  3  21.961
org.apache.hadoop.hbase.client.TestHTablePool  18  26.274
org.apache.hadoop.hbase.client.TestHTableUtil  2  16.997
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD  3  24.629
org.apache.hadoop.hbase.client.TestMetaScanner  1  16.365
org.apache.hadoop.hbase.client.TestMultiParallel  10  34.077
org.apache.hadoop.hbase.client.TestTimestampsFilter  3  27.547
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol  44  16.834
org.apache.hadoop.hbase.coprocessor.TestClassLoading  5  31.346
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint  2  32.736
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort  1  13.874
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove  1  16.923
org.apache.hadoop.hbase.coprocessor.TestMasterObserver  3  29.97
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass  2  14.976
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface  5  33.353
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort  1  16.596
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove  1  18.183
org.apache.hadoop.hbase.coprocessor.TestWALObserver  3  19.373
org.apache.hadoop.hbase.filter.TestColumnRangeFilter  1  19.045
org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache  5  24.294
org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache  7  19.818
org.apache.hadoop.hbase.io.hfile.TestHFileBlock  7  25.226
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache  7  0.343
org.apache.hadoop.hbase.mapreduce.TestImportTsv  8  40.391
org.apache.hadoop.hbase.master.TestActiveMasterManager  2  0.724
org.apache.hadoop.hbase.master.TestHMasterRPCException  1  1.17
org.apache.hadoop.hbase.master.TestLogsCleaner  1  2.953
org.apache.hadoop.hbase.master.TestMaster  1  18.918
org.apache.hadoop.hbase.master.TestOpenedRegionHandler  2  20.57
org.apache.hadoop.hbase.master.TestSplitLogManager  10  13.979
org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion  3  21.675
org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler  3  0.887
org.apache.hadoop.hbase.regionserver.TestBlocksRead  4  1.42
[jira] [Commented] (HBASE-2425) Crossport HADOOP-1849 rpc fix
[ https://issues.apache.org/jira/browse/HBASE-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144239#comment-13144239 ]

jirapos...@reviews.apache.org commented on HBASE-2425:
------------------------------------------------------

bq. On 2011-11-04 18:06:38, Gary Helmling wrote:
bq. src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java, line 322
bq. https://reviews.apache.org/r/2718/diff/1/?file=56222#file56222line322
bq.
bq. We could eliminate the flag and use status instead. Are there plans for other bits being set in this? Otherwise, we always have length and error can be determined from Status. Or would removing this break asynchbase in other ways?
bq.
bq. Michael Stack wrote:
bq. It's for asynchbase -- so it can tell the difference between a response with length and one without (it's trying to support all hbase versions).

So do we need an additional flag for status set here, so that asynchbase can tell when that's included? Or can it pick that up from the version number?

- Gary

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2718/#review3049
---

On 2011-11-04 00:11:21, Michael Stack wrote:
bq.
bq. ---
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2718/
bq. ---
bq.
bq. (Updated 2011-11-04 00:11:21)
bq.
bq. Review request for hbase.
bq.
bq. Summary
bq. ---
bq.
bq. Versions of Gary suggestions
bq.
bq. This addresses bug hbase-2425.
bq. https://issues.apache.org/jira/browse/hbase-2425
bq.
bq. Diffs
bq. -
bq.
bq. src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java fce5490
bq. src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java 2fa4d6f
bq. src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java 6f88357
bq. src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java 6fcb771
bq. src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 1365411
bq. src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 4a8918a
bq. src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java e60f970
bq. src/main/java/org/apache/hadoop/hbase/ipc/ProtocolSignature.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/ipc/Status.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/ipc/VersionedProtocol.java fb07374
bq. src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java 60a9248
bq. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 8de2314
bq. src/main/java/org/apache/hadoop/hbase/master/HMaster.java 0d0e4c5
bq. src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 12bd33e
bq. src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java 888f428
bq. src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java e5b6a78
bq.
bq. Diff: https://reviews.apache.org/r/2718/diff
bq.
bq. Testing
bq. ---
bq.
bq. Thanks,
bq.
bq. Michael
bq.

Crossport HADOOP-1849 rpc fix
-----------------------------
Key: HBASE-2425
URL: https://issues.apache.org/jira/browse/HBASE-2425
Project: HBase
Issue Type: Task
Reporter: stack
Labels: moved_from_0_20_5

Suggested over in HBASE-2360.
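The flag under discussion can be pictured as a single byte in the response header whose bits say what else is present. The sketch below is purely illustrative: the bit positions, names and helper methods are assumptions made for this example and are not taken from the patch. It shows why a client that supports several server versions benefits from an explicit "length present" bit, and why a further bit (or a version check) would be needed to announce an additional status field.

```java
// Hypothetical layout of a one-byte response flag; bit values are assumed.
class ResponseFlagSketch {

    static final byte ERROR_BIT = 0x1;   // response carries an error
    static final byte LENGTH_BIT = 0x2;  // response carries a length field

    /** Packs the two conditions into one flag byte. */
    static byte encode(boolean error, boolean hasLength) {
        byte flag = 0;
        if (error) {
            flag |= ERROR_BIT;
        }
        if (hasLength) {
            flag |= LENGTH_BIT;
        }
        return flag;
    }

    static boolean isError(byte flag) {
        return (flag & ERROR_BIT) != 0;
    }

    static boolean hasLength(byte flag) {
        return (flag & LENGTH_BIT) != 0;
    }

    public static void main(String[] args) {
        byte oldStyle = encode(false, false); // older server: no length sent
        byte newStyle = encode(false, true);  // newer server: length follows
        System.out.println("old style has length: " + hasLength(oldStyle));
        System.out.println("new style has length: " + hasLength(newStyle));
        System.out.println("error with length: " + isError(encode(true, true)));
    }
}
```

Dropping the flag in favour of a status field alone would leave such a client with no in-band way to tell the two response shapes apart, which is the concern raised in the reply.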
[jira] [Updated] (HBASE-4553) The update of .tableinfo is not atomic; we remove then rename
[ https://issues.apache.org/jira/browse/HBASE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4553:
------------------------
Status: Open (was: Patch Available)

The update of .tableinfo is not atomic; we remove then rename
-------------------------------------------------------------
Key: HBASE-4553
URL: https://issues.apache.org/jira/browse/HBASE-4553
Project: HBase
Issue Type: Task
Reporter: stack
Assignee: stack
Priority: Critical
Fix For: 0.92.0
Attachments: 3446-v8.txt, 4553-v10.txt, 4553-v11.txt, 4553-v12.txt, 4553-v12.txt, 4553-v12.txt, 4553-v12.txt, 4553-v13.txt, 4553-v5.txt, 4553-v9.txt, HBase-4553-TestAvroServer.patch

This comes of HBASE-4547. The rename in 0.20 hdfs fails if the file exists already. In 0.20+ it's better, but there are still 'some' issues if there is an existing reader when the file is renamed. This issue is about fixing this (though we depend on the fix first being in hdfs).
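For comparison, the usual way to avoid a remove-then-rename window is to write the new contents to a temporary file in the same directory and move it over the target in one step. The sketch below shows that idea on a local filesystem with java.nio.file; it is an analogy only and says nothing about how HDFS rename behaves or how the attached patches solve the problem. AtomicReplaceSketch and replaceAtomically are hypothetical names, and whether ATOMIC_MOVE replaces an existing target is implementation specific (it does on typical POSIX filesystems).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Local-filesystem analogy only; not HDFS and not the HBase patch.
class AtomicReplaceSketch {

    /**
     * Writes the new contents to a temp file next to the target, then moves
     * it into place. Readers see either the old file or the new one, never a
     * moment where the target is missing.
     */
    static void replaceAtomically(Path target, byte[] contents) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, ".tableinfo", ".tmp");
        try {
            Files.write(tmp, contents);
            // Replacement of an existing target is implementation specific.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tableinfo-demo");
        Path target = dir.resolve("tableinfo");
        replaceAtomically(target, "schema v1".getBytes(StandardCharsets.UTF_8));
        replaceAtomically(target, "schema v2".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

The temp file is created in the target's directory on purpose: an atomic move generally only works within a single filesystem.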
[jira] [Updated] (HBASE-4553) The update of .tableinfo is not atomic; we remove then rename
[ https://issues.apache.org/jira/browse/HBASE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4553:
------------------------
Status: Patch Available (was: Open)

The update of .tableinfo is not atomic; we remove then rename
-------------------------------------------------------------
Key: HBASE-4553
URL: https://issues.apache.org/jira/browse/HBASE-4553
Project: HBase
Issue Type: Task
Reporter: stack
Assignee: stack
Priority: Critical
Fix For: 0.92.0
Attachments: 3446-v8.txt, 4553-v10.txt, 4553-v11.txt, 4553-v12.txt, 4553-v12.txt, 4553-v12.txt, 4553-v12.txt, 4553-v13.txt, 4553-v5.txt, 4553-v9.txt, HBase-4553-TestAvroServer.patch

This comes of HBASE-4547. The rename in 0.20 hdfs fails if the file exists already. In 0.20+ it's better, but there are still 'some' issues if there is an existing reader when the file is renamed. This issue is about fixing this (though we depend on the fix first being in hdfs).
[jira] [Updated] (HBASE-4553) The update of .tableinfo is not atomic; we remove then rename
[ https://issues.apache.org/jira/browse/HBASE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4553:
------------------------
Attachment: 4553-v13.txt

Draining servers test case needed updating to match the changes this patch brings on.

The update of .tableinfo is not atomic; we remove then rename
-------------------------------------------------------------
Key: HBASE-4553
URL: https://issues.apache.org/jira/browse/HBASE-4553
Project: HBase
Issue Type: Task
Reporter: stack
Assignee: stack
Priority: Critical
Fix For: 0.92.0
Attachments: 3446-v8.txt, 4553-v10.txt, 4553-v11.txt, 4553-v12.txt, 4553-v12.txt, 4553-v12.txt, 4553-v12.txt, 4553-v13.txt, 4553-v5.txt, 4553-v9.txt, HBase-4553-TestAvroServer.patch

This comes of HBASE-4547. The rename in 0.20 hdfs fails if the file exists already. In 0.20+ it's better, but there are still 'some' issues if there is an existing reader when the file is renamed. This issue is about fixing this (though we depend on the fix first being in hdfs).
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144242#comment-13144242 ]

Hadoop QA commented on HBASE-4737:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12502469/2003_4737_pom.v2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 1 new or modified tests.

-1 javadoc. The javadoc tool appears to have generated -164 warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

-1 findbugs. The patch appears to introduce 46 new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestMasterFailover
org.apache.hadoop.hbase.master.TestDistributedLogSplitting

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/174//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/174//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/174//console
Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
--------------------------------------------------------------------------------------------------
Key: HBASE-4737
URL: https://issues.apache.org/jira/browse/HBASE-4737
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, hbasetests.sh

1) Split the tests in 3 categories
- small: no cluster, less than 15s, can be run in parallel with other tests in a JVM
- medium: 45s, not flaky, useful to detect bugs immediately
- large: remaining
2) Allow running a subset: developers should need to run only small and medium before submitting a patch
- will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329

Small is the default. All other tests will have to be marked Medium or Large with a JUnit category.

Proposed split:
Small: 122 classes, 479 methods, ~3 minutes when not run in parallel
Medium: 78 classes, 373 methods, ~23 minutes
Large: 34 classes, 221 methods, ~60 minutes

I will have to extract the methods that are currently in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this will be done in a second step (and another JIRA).
MEDIUM LIST (name; number of methods, time)
org.apache.hadoop.hbase.avro.TestAvroServer  3  31.468
org.apache.hadoop.hbase.catalog.TestCatalogTracker  8  4.174
org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster  1  3.888
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor  5  30.157
org.apache.hadoop.hbase.client.replication.TestReplicationAdmin  1  0.762
org.apache.hadoop.hbase.client.TestHCM  3  21.961
org.apache.hadoop.hbase.client.TestHTablePool  18  26.274
org.apache.hadoop.hbase.client.TestHTableUtil  2  16.997
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD  3  24.629
org.apache.hadoop.hbase.client.TestMetaScanner  1  16.365
org.apache.hadoop.hbase.client.TestMultiParallel  10  34.077
org.apache.hadoop.hbase.client.TestTimestampsFilter  3  27.547
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol  44  16.834
org.apache.hadoop.hbase.coprocessor.TestClassLoading  5  31.346
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint  2  32.736
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort  1  13.874
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove  1  16.923
org.apache.hadoop.hbase.coprocessor.TestMasterObserver  3  29.97
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass  2  14.976
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface  5  33.353
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort  1  16.596
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove  1  18.183
org.apache.hadoop.hbase.coprocessor.TestWALObserver  3  19.373
[jira] [Commented] (HBASE-4749) TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails
[ https://issues.apache.org/jira/browse/HBASE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144249#comment-13144249 ]

stack commented on HBASE-4749:
------------------------------

Yeah, that would fix the test but we'd be left w/ hbase-4511 -- where on master failover, if root or meta verification fails because the hosting server is going down... we'll miss edits. Let me see if I can fix.

TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails
-----------------------------------------------------------------------------
Key: HBASE-4749
URL: https://issues.apache.org/jira/browse/HBASE-4749
Project: HBase
Issue Type: Bug
Components: test
Affects Versions: 0.92.0
Reporter: gaojinchao
Priority: Critical
Fix For: 0.92.0

Look at these logs:
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/105/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testMasterFailoverWithMockedRITOnDeadRS/
[jira] [Updated] (HBASE-4511) There is data loss when master failovers
[ https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4511:

Fix Version/s: 0.92.0 (was: 0.94.0)

There is data loss when master failovers
Key: HBASE-4511
URL: https://issues.apache.org/jira/browse/HBASE-4511
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Priority: Minor
Fix For: 0.92.0
Attachments: org.apache.hadoop.hbase.master.TestMasterFailover-output.rar

It goes like this:
The master crashed while, at the same time, the RS hosting meta was crashing but had not yet exited.
The master started up again and found all living RSs.
The master's verification of meta failed because this RS was crashing.
The master reassigned meta but did not split the HLog, so some meta data was lost.

Here are the logs from a failing failover test case.

// The test says it is killing an RS.
2011-09-28 19:54:45,694 INFO [Thread-988] regionserver.HRegionServer(1443): STOPPED: Killing for unit test
2011-09-28 19:54:45,694 INFO [Thread-988] master.TestMasterFailover(1007): RS 192.168.2.102,54385,1317264874629 killed

// The RS had not crashed yet.
2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.HMaster(458): Registering server found up in zk: 192.168.2.102,54385,1317264874629
2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.ServerManager(232): Registering server=192.168.2.102,54385,1317264874629
2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of znode /hbase/unassigned/1028785192 because node does not exist (not an error)
2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487...

// Meta verification failed and the master reassigned meta, so all the regions in meta were lost.
2011-09-28 19:54:51,773 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting
2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487...
2011-09-28 19:54:52,277 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting
2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487...
2011-09-28 19:54:52,782 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting
2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKAssign(264): master:54557-0x132b31adbb30005 Creating (or updating) unassigned node for 1028785192 with OFFLINE state
2011-09-28 19:54:52,825 DEBUG [Thread-988-EventThread] zookeeper.ZooKeeperWatcher(233): master:54557-0x132b31adbb30005 Received ZooKeeper Event, type=NodeCreated, state=SyncConnected,
[jira] [Commented] (HBASE-4749) TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails
[ https://issues.apache.org/jira/browse/HBASE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144263#comment-13144263 ]

Ted Yu commented on HBASE-4749:

I think testMasterFailoverWithMockedRITOnDeadRS should be forked into two tests:
1. the aborted RS carried .META.
2. the aborted RS didn't carry .META.
This would make each test behave deterministically.
[jira] [Commented] (HBASE-4733) Rest client does not close zookeeper sessions (leaking sessions for each GET or PUT)
[ https://issues.apache.org/jira/browse/HBASE-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144264#comment-13144264 ]

jack levin commented on HBASE-4733:

FYI, this patch applies cleanly to 0.90.4; those who run it in production should be able to fix their REST issue right away. It works well in our cluster.

Rest client does not close zookeeper sessions (leaking sessions for each GET or PUT)
Key: HBASE-4733
URL: https://issues.apache.org/jira/browse/HBASE-4733
Project: HBase
Issue Type: Bug
Components: rest
Affects Versions: 0.90.4
Environment: Fedora 13.
Reporter: jack levin
Labels: newbie
Fix For: 0.90.4
Attachments: my.patch

Once the leaked ZooKeeper connections/sessions grow to 2000, ZooKeeper can no longer accept connections, which causes REST RPC calls to fail. Here is the log while the problem is in progress:

2011-10-26 01:35:49,270 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.102.255.3 - max is 2000
2011-10-26 01:35:49,300 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.102.255.4 - max is 2000
2011-10-26 01:35:49,317 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.102.255.4 - max is 2000
2011-10-26 01:35:49,321 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.102.255.3 - max is 2000
2011-10-26 01:35:49,323 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.101.255.3 - max is 2000
2011-10-26 01:35:49,367 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.101.255.2 - max is 2000
2011-10-26 01:35:49,375 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.102.255.4 - max is 2000
2011-10-26 01:35:49,382 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.102.255.2 - max is 2000
2011-10-26 01:35:49,404 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.101.255.2 - max is 2000
2011-10-26 01:35:49,429 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.102.255.2 - max is 2000
2011-10-26 01:35:49,439 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.102.255.3 - max is 2000
2011-10-26 01:35:49,469 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.102.255.3 - max is 2000
2011-10-26 01:35:49,489 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.101.255.3 - max is 2000
2011-10-26 01:35:49,501 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.101.255.2 - max is 2000
2011-10-26 01:35:49,584 WARN org.apache.zookeeper.server.NIOServerCnxn: Too many connections from /10.102.255.2 - max is 2000

After the fix, the log looks much better, and we can observe ZooKeeper connections closing after every RPC call:

2011-11-02 15:50:14,339 INFO org.apache.zookeeper.server.NIOServerCnxn: Accepted socket connection from /10.101.255.2:37225
2011-11-02 15:50:14,339 INFO org.apache.zookeeper.server.NIOServerCnxn: Client attempting to establish new session at /10.101.255.2:37225
2011-11-02 15:50:14,340 INFO org.apache.zookeeper.server.NIOServerCnxn: Established session 0x3352857cb1f19f with negotiated timeout 18 for client /10.101.255.2:37225
2011-11-02 15:50:14,363 INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client /10.101.255.2:37225 which had sessionid 0x3352857cb1f19f
2011-11-02 15:50:14,723 INFO org.apache.zookeeper.server.NIOServerCnxn: Accepted socket connection from /10.101.255.2:38089
2011-11-02 15:50:14,723 INFO org.apache.zookeeper.server.NIOServerCnxn: Client attempting to establish new session at /10.101.255.2:38089
2011-11-02 15:50:14,725 INFO org.apache.zookeeper.server.NIOServerCnxn: Established session 0x3352857cb1f1a0 with negotiated timeout 18 for client /10.101.255.2:38089
2011-11-02 15:50:14,771 INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client /10.101.255.2:38089 which had sessionid 0x3352857cb1f1a0
2011-11-02 15:50:16,326 INFO org.apache.zookeeper.server.NIOServerCnxn: Accepted socket connection from /10.101.255.3:34085
2011-11-02 15:50:16,326 INFO org.apache.zookeeper.server.NIOServerCnxn: Client attempting to establish new session at /10.101.255.3:34085
2011-11-02 15:50:16,328 INFO org.apache.zookeeper.server.NIOServerCnxn: Established session 0x3352857cb1f1a1 with negotiated timeout 18 for client /10.101.255.3:34085
[jira] [Commented] (HBASE-4684) REST server is leaking ZK connections in 0.90
[ https://issues.apache.org/jira/browse/HBASE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144265#comment-13144265 ]

jack levin commented on HBASE-4684:

Here is an applicable patch for 0.90.4 (tested to work):

--- src/main/java/org/apache/hadoop/hbase/rest/TableResource.java
+++ src/main/java/org/apache/hadoop/hbase/rest/TableResource.java
@@ -37,6 +37,7 @@
 import org.apache.hadoop.hbase.HColumnDescriptor;
 import org.apache.hadoop.hbase.HConstants;
 import org.apache.hadoop.hbase.HTableDescriptor;
+import org.apache.hadoop.hbase.client.HConnectionManager;
 import org.apache.hadoop.hbase.TableNotFoundException;
 import org.apache.hadoop.hbase.client.HBaseAdmin;
 import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
@@ -135,6 +136,7 @@
 try {
   HBaseAdmin admin = new HBaseAdmin(servlet.getConfiguration());
   HTableDescriptor htd = admin.getTableDescriptor(Bytes.toBytes(table));
+  HConnectionManager.deleteConnection(admin.getConfiguration(), false);
   for (HColumnDescriptor hcd: htd.getFamilies()) {
     for (Map.Entry<ImmutableBytesWritable, ImmutableBytesWritable> e:
         hcd.getValues().entrySet()) {

REST server is leaking ZK connections in 0.90
Key: HBASE-4684
URL: https://issues.apache.org/jira/browse/HBASE-4684
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Assignee: Andrew Purtell
Priority: Critical
Fix For: 0.90.5
Attachments: HBASE-4684-0.90.patch, HBASE-4684-v2-0.90.patch

As reported a month ago, http://search-hadoop.com/m/FD6gmKzrxY1, the REST server is leaking ZK connections. Upon investigation I see that TableResource.scanTransformAttrs creates a new HBA per minute per table (when the server is getting requests) but never deletes the connection created in there. There are a bunch of other places where HBAs are created but not cleaned up afterwards, like SchemaResource, StorageClusterStatusResource, StorageClusterVersionResource, ExistsResource, etc. Those places shouldn't be as leaky under normal circumstances, though. Thanks to Jack Levin for bringing up this issue again when he tried to upgrade.
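The leak described in this issue, one connection opened per request and never released, can be illustrated without HBase at all. The following is a minimal sketch under stated assumptions: `FakeAdmin` and `OPEN_SESSIONS` are hypothetical stand-ins, not the HBase client API, and the release is done with try-with-resources here, whereas the patch quoted above calls `HConnectionManager.deleteConnection` inline.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch only: stand-in types, not the HBase client API. */
final class ConnectionLeakSketch {
    /** Counts sessions that are currently open, as a ZK server would see them. */
    static final AtomicInteger OPEN_SESSIONS = new AtomicInteger();

    /** Hypothetical stand-in for an admin object that opens a session when created. */
    static final class FakeAdmin implements AutoCloseable {
        FakeAdmin() {
            OPEN_SESSIONS.incrementAndGet();
        }

        String describe(String table) {
            return "descriptor:" + table;
        }

        @Override
        public void close() {
            OPEN_SESSIONS.decrementAndGet();
        }
    }

    /** Leaky variant: a new admin per request, never released. */
    static String describeLeaky(String table) {
        FakeAdmin admin = new FakeAdmin();
        return admin.describe(table);
    }

    /** Fixed variant: the session is released even if describe() throws. */
    static String describeAndRelease(String table) {
        try (FakeAdmin admin = new FakeAdmin()) {
            return admin.describe(table);
        }
    }

    /** Runs the given number of requests and returns how many sessions stay open. */
    static int sessionsLeftOpen(int requests, boolean release) {
        OPEN_SESSIONS.set(0);
        for (int i = 0; i < requests; i++) {
            if (release) {
                describeAndRelease("t" + i);
            } else {
                describeLeaky("t" + i);
            }
        }
        return OPEN_SESSIONS.get();
    }

    public static void main(String[] args) {
        System.out.println("leaky:    " + sessionsLeftOpen(2000, false));
        System.out.println("released: " + sessionsLeftOpen(2000, true));
    }
}
```

In the leaky variant the open-session count grows with the number of requests, which matches the shape of the "Too many connections ... max is 2000" symptom reported in HBASE-4733; in the released variant it returns to zero after each request.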
[jira] [Commented] (HBASE-4684) REST server is leaking ZK connections in 0.90
[ https://issues.apache.org/jira/browse/HBASE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144286#comment-13144286 ]

Andrew Purtell commented on HBASE-4684:

The patch on this issue fixes the problem and doesn't interfere with potentially concurrent actions.
[jira] [Updated] (HBASE-4747) Upgrade maven surefire plugin to 2.10
[ https://issues.apache.org/jira/browse/HBASE-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4747:

Status: Patch Available (was: Open)

Upgrade maven surefire plugin to 2.10
Key: HBASE-4747
URL: https://issues.apache.org/jira/browse/HBASE-4747
Project: HBase
Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
Attachments: 4747.txt

Quite often, we see the following when running unit tests:
{code}
Running org.apache.hadoop.hbase.master.TestMasterFailover
Exception in thread ThreadedStreamConsumer java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2882)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
    at java.lang.StringBuffer.append(StringBuffer.java:224)
    at org.apache.maven.surefire.report.TestSetRunListener.getAsString(TestSetRunListener.java:201)
    at org.apache.maven.surefire.report.TestSetRunListener.testError(TestSetRunListener.java:139)
    at org.apache.maven.plugin.surefire.booterclient.output.ForkClient.consumeLine(ForkClient.java:112)
    at org.apache.maven.plugin.surefire.booterclient.output.ThreadedStreamConsumer$Pumper.run(ThreadedStreamConsumer.java:67)
    at java.lang.Thread.run(Thread.java:680)
{code}
This was due to https://jira.codehaus.org/browse/SUREFIRE-754, which has been fixed in surefire 2.10.
We should upgrade to version 2.10.
[jira] [Commented] (HBASE-4684) REST server is leaking ZK connections in 0.90
[ https://issues.apache.org/jira/browse/HBASE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144288#comment-13144288 ]

Andrew Purtell commented on HBASE-4684:

I'm going to commit this today unless there is an objection.
[jira] [Updated] (HBASE-4747) Upgrade maven surefire plugin to 2.10
[ https://issues.apache.org/jira/browse/HBASE-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4747:

Status: Open (was: Patch Available)
[jira] [Commented] (HBASE-4733) Rest client does not close zookeeper sessions (leaking sessions for each GET or PUT)
[ https://issues.apache.org/jira/browse/HBASE-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144289#comment-13144289 ]

Andrew Purtell commented on HBASE-4733:

bq. FYI this patch applies well to 0.90.4, those folks that use it in production should be able to fix their REST issue right away.

Yes
[jira] [Commented] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144316#comment-13144316 ]

Phabricator commented on HBASE-4742:

khemani has commented on the revision [jira] [HBASE-4742] Split dead server's log in parallel.

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:316 Name the thread for better jstacks.
src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:311 I don't think there is a need for an ABORT state. There is no aborting the log-splitting; we should just keep retrying. In this particular case, when deadServer doesn't exist, you can return success.
src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:295 You could put the retry loop with sleeps here? Then the main thread has a very simple state where it just requeues itself if the log-splitting is not done.

REVISION DETAIL
https://reviews.facebook.net/D237

Split dead server's log in parallel
Key: HBASE-4742
URL: https://issues.apache.org/jira/browse/HBASE-4742
Project: HBase
Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
Attachments: D237.1.patch

When one region server goes down, the master will shut down the region server and split its log. However, splitting the log is a blocking call and takes some time. If more than one region server goes down, the master splits their logs one by one, which is not efficient. Since we have distributed log splitting, we could split the logs from the dead servers in parallel.
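The review comments above (name the thread, keep retrying rather than aborting, put the retry loop with sleeps in the worker) can be sketched with plain java.util.concurrent. This is an illustrative sketch only, not the code in D237: `LogSplitter`, `splitWithRetry`, `splitAll`, and `failsOnceFor` are hypothetical names, and every IOException is assumed to be transient.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch only: hypothetical names, not the ProcessServerShutdown code under review. */
final class ParallelLogSplitSketch {
    /** Hypothetical stand-in for the real splitting work; returns true when done. */
    interface LogSplitter {
        boolean split(String deadServer) throws IOException;
    }

    /** Retries until the split succeeds (no ABORT state); returns the attempt count. */
    static int splitWithRetry(LogSplitter splitter, String deadServer, long sleepMs)
            throws InterruptedException {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                if (splitter.split(deadServer)) {
                    return attempts;
                }
            } catch (IOException e) {
                // Treated as transient in this sketch: sleep, then try again.
            }
            Thread.sleep(sleepMs);
        }
    }

    /** Splits the logs of all dead servers in parallel on named worker threads. */
    static List<Integer> splitAll(List<String> deadServers, LogSplitter splitter) {
        AtomicInteger threadIds = new AtomicInteger();
        // Named threads make jstack output easier to read.
        ThreadFactory named = runnable -> {
            Thread t = new Thread(runnable, "log-splitter-" + threadIds.incrementAndGet());
            t.setDaemon(true);
            return t;
        };
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, deadServers.size()), named);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (String server : deadServers) {
                Callable<Integer> task = () -> splitWithRetry(splitter, server, 10L);
                pending.add(pool.submit(task));
            }
            List<Integer> attempts = new ArrayList<>();
            for (Future<Integer> f : pending) {
                attempts.add(f.get()); // results come back in submission order
            }
            return attempts;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("log splitting was interrupted or failed", e);
        } finally {
            pool.shutdownNow();
        }
    }

    /** Test helper: a splitter that fails exactly once for the given server. */
    static LogSplitter failsOnceFor(String flakyServer) {
        Set<String> alreadyFailed = ConcurrentHashMap.newKeySet();
        return server -> {
            if (server.equals(flakyServer) && alreadyFailed.add(server)) {
                throw new IOException("simulated transient failure for " + server);
            }
            return true;
        };
    }

    public static void main(String[] args) {
        // Prints the number of attempts each server needed.
        System.out.println(splitAll(Arrays.asList("rs1", "rs2", "rs3"), failsOnceFor("rs2")));
    }
}
```

One thread per dead server keeps the sketch short; a real master would more likely bound the pool size and requeue unfinished work, as the comment on line 295 suggests.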
[jira] [Commented] (HBASE-4553) The update of .tableinfo is not atomic; we remove then rename
[ https://issues.apache.org/jira/browse/HBASE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144321#comment-13144321 ]

Hadoop QA commented on HBASE-4553:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12502501/4553-v13.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 27 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -164 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 48 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
    org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
    org.apache.hadoop.hbase.client.TestAdmin
    org.apache.hadoop.hbase.master.TestDistributedLogSplitting
    org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
    org.apache.hadoop.hbase.master.TestMasterFailover

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/177//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/177//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/177//console

This message is automatically generated.

The update of .tableinfo is not atomic; we remove then rename
Key: HBASE-4553
URL: https://issues.apache.org/jira/browse/HBASE-4553
Project: HBase
Issue Type: Task
Reporter: stack
Assignee: stack
Priority: Critical
Fix For: 0.92.0
Attachments: 3446-v8.txt, 4553-v10.txt, 4553-v11.txt, 4553-v12.txt, 4553-v12.txt, 4553-v12.txt, 4553-v12.txt, 4553-v13.txt, 4553-v5.txt, 4553-v9.txt, HBase-4553-TestAvroServer.patch

This comes out of HBASE-4547. The rename in 0.20 HDFS fails if the file already exists. In 0.20+ it's better, but there are still 'some' issues if there is an existing reader when the file is renamed. This issue is about fixing this (though we depend on the fix first being in HDFS).
[jira] [Updated] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HBASE-4742:

Attachment: D237.2.patch

Liyin updated the revision [jira] [HBASE-4742] Split dead server's log in parallel.
Reviewers: Kannan, khemani, Karthik, mbautin, JIRA

Thanks Prakash, Ted and Karthik. I removed the ABORT status now.

REVISION DETAIL
https://reviews.facebook.net/D237

AFFECTED FILES
src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java
src/test/java/org/apache/hadoop/hbase/master/TestMultiRegionServerShutDown.java
[jira] [Created] (HBASE-4750) Make thrift2 ThriftHBaseServiceHandler more friendly to concurrent tests
Make thrift2 ThriftHBaseServiceHandler more friendly to concurrent tests Key: HBASE-4750 URL: https://issues.apache.org/jira/browse/HBASE-4750 Project: HBase Issue Type: Task Reporter: Ted Yu Quite often we saw the following reported by HadoopQA: {code} testExists(org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler) Time elapsed: 0.062 sec ERROR! java.lang.IllegalArgumentException: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master. at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.waitRootRegionLocation(RootRegionTracker.java:81) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:753) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:733) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:866) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:765) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:733) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:866) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:769) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:733) at org.apache.hadoop.hbase.client.HTable.init(HTable.java:202) at org.apache.hadoop.hbase.client.HTableFactory.createHTableInterface(HTableFactory.java:36) at org.apache.hadoop.hbase.client.HTablePool.createHTable(HTablePool.java:268) at org.apache.hadoop.hbase.client.HTablePool.findOrCreateTable(HTablePool.java:198) at org.apache.hadoop.hbase.client.HTablePool.getTable(HTablePool.java:173) at 
org.apache.hadoop.hbase.client.HTablePool.getTable(HTablePool.java:216) at org.apache.hadoop.hbase.thrift2.ThriftHBaseServiceHandler.getTable(ThriftHBaseServiceHandler.java:64) at org.apache.hadoop.hbase.thrift2.ThriftHBaseServiceHandler.exists(ThriftHBaseServiceHandler.java:115) at org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler.testExists(TestThriftHBaseServiceHandler.java:123) {code} Methods in ThriftHBaseServiceHandler don't accept Configuration parameter. This makes parallelizing tests harder. Looking deeper, we can see that HTablePool methods such as getTable() and findOrCreateTable() don't accept Configuration parameter either. So we have to pass Configuration object to HTablePool ctor. This means we need to add ThriftHBaseServiceHandler ctor which takes Configuration parameter. Instead of the following in TestThriftHBaseServiceHandler: {code} ThriftHBaseServiceHandler handler = new ThriftHBaseServiceHandler(); {code} We should be using the new ThriftHBaseServiceHandler ctor and pass HBaseTestingUtility's Configuration so that HTablePool ctor can receive it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
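The constructor change proposed above can be illustrated without the real HBase classes. The following is a minimal sketch under stated assumptions: `Conf`, `TablePool`, and `Handler` are hypothetical stand-ins for `Configuration`, `HTablePool`, and `ThriftHBaseServiceHandler`, and the port value 21818 is invented for the example. It only shows why a pool whose `getTable()` takes no configuration has to receive one through its constructor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for org.apache.hadoop.conf.Configuration.
final class Conf {
    private final Map<String, String> props = new HashMap<>();

    void set(String key, String value) { props.put(key, value); }

    String get(String key, String defaultValue) {
        return props.getOrDefault(key, defaultValue);
    }
}

// Hypothetical stand-in for HTablePool: getTable() takes no configuration,
// so the pool has to capture one at construction time.
final class TablePool {
    private final Conf conf;

    TablePool(Conf conf) { this.conf = conf; }

    String getTable(String tableName) {
        // A real pool would open a table; this sketch only reports which
        // ZooKeeper client port the table would have been connected to.
        return tableName + "@" + conf.get("hbase.zookeeper.property.clientPort", "2181");
    }
}

// Hypothetical stand-in for ThriftHBaseServiceHandler.
final class Handler {
    private final TablePool pool;

    // Existing style: the handler builds a fresh default configuration.
    Handler() { this(new Conf()); }

    // Proposed style: the caller (for example a test) supplies the configuration.
    Handler(Conf conf) { this.pool = new TablePool(conf); }

    String describeTable(String tableName) { return pool.getTable(tableName); }
}

final class HandlerDemo {
    public static void main(String[] args) {
        Conf testConf = new Conf();
        testConf.set("hbase.zookeeper.property.clientPort", "21818"); // invented test port
        System.out.println(new Handler().describeTable("t1"));         // uses the default port
        System.out.println(new Handler(testConf).describeTable("t1")); // uses the test's port
    }
}
```

In a test, the configuration passed in would be the one held by HBaseTestingUtility, so that the handler and the mini cluster agree on their settings.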
[jira] [Updated] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-4742: --- Attachment: D237.3.patch Liyin updated the revision [jira] [HBASE-4742] Split dead server's log in parallel. Reviewers: Kannan, khemani, Karthik, mbautin, JIRA Address Karthik's comments. REVISION DETAIL https://reviews.facebook.net/D237 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java src/test/java/org/apache/hadoop/hbase/master/TestMultiRegionServerShutDown.java Split dead server's log in parallel --- Key: HBASE-4742 URL: https://issues.apache.org/jira/browse/HBASE-4742 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D237.1.patch, D237.2.patch, D237.3.patch When a region server goes down, the master shuts the region server down and splits its log. However, splitting a log is a blocking call and can take some time. If more than one region server goes down, the master splits their logs one by one, which is inefficient. Since we have distributed log splitting, we could split the logs of the dead servers in parallel.
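The idea of splitting the logs of several dead servers at the same time can be sketched with a plain thread pool. This is only an illustration, not the patch under review: `splitLog` is a hypothetical placeholder for the blocking per-server split, and the server names are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelLogSplit {
    // Hypothetical placeholder for the blocking split of one dead server's log.
    static String splitLog(String deadServer) throws InterruptedException {
        Thread.sleep(50); // simulate a slow, blocking call
        return "split:" + deadServer;
    }

    // Submits one split task per dead server, then waits for all of them.
    static List<String> splitAll(List<String> deadServers) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, deadServers.size()));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String server : deadServers) {
                Callable<String> task = () -> splitLog(server);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // collected in submission order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for log splits", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a log split failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(splitAll(Arrays.asList("rs1", "rs2", "rs3")));
    }
}
```

With one thread per dead server, the total wait is roughly that of the slowest split rather than the sum of all splits; a real master would presumably cap the pool size.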
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144340#comment-13144340 ] Ted Yu commented on HBASE-4737: --- Using the shell script on MacBook I got: {code} Failed tests: testSimplePutDelete(org.apache.hadoop.hbase.replication.TestMasterReplication): Waited too much time for put replication queueFailover(org.apache.hadoop.hbase.replication.TestReplication): Waited too much time for queueFailover replication Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 142.877 sec FAILURE! {code} Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM -- Key: HBASE-4737 URL: https://issues.apache.org/jira/browse/HBASE-4737 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, hbasetests.sh
1) Split the tests in 3 categories
- small: no cluster, less than 15s, can be run in parallel with other tests in a JVM
- medium: 45s, not flaky, useful to detect bugs immediately
- large: remaining
2) Allow running a subset: developers should need to run only small and medium before submitting a patch
- will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329
Small is the default. All other tests will have to be marked Medium or Large with a JUnit category.
Proposed split:
Small: 122 classes, 479 methods, ~3 minutes (when not run in parallel)
Medium: 78 classes, 373 methods, ~23 minutes
Large: 34 classes, 221 methods, ~60 minutes
I will have to extract the methods that are today in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this will be done in a second step (and another JIRA).
MEDIUM LIST (name; number of methods; time in seconds)
org.apache.hadoop.hbase.avro.TestAvroServer; 3; 31.468
org.apache.hadoop.hbase.catalog.TestCatalogTracker; 8; 4.174
org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster; 1; 3.888
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor; 5; 30.157
org.apache.hadoop.hbase.client.replication.TestReplicationAdmin; 1; 0.762
org.apache.hadoop.hbase.client.TestHCM; 3; 21.961
org.apache.hadoop.hbase.client.TestHTablePool; 18; 26.274
org.apache.hadoop.hbase.client.TestHTableUtil; 2; 16.997
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD; 3; 24.629
org.apache.hadoop.hbase.client.TestMetaScanner; 1; 16.365
org.apache.hadoop.hbase.client.TestMultiParallel; 10; 34.077
org.apache.hadoop.hbase.client.TestTimestampsFilter; 3; 27.547
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol; 44; 16.834
org.apache.hadoop.hbase.coprocessor.TestClassLoading; 5; 31.346
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint; 2; 32.736
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort; 1; 13.874
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove; 1; 16.923
org.apache.hadoop.hbase.coprocessor.TestMasterObserver; 3; 29.97
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass; 2; 14.976
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface; 5; 33.353
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort; 1; 16.596
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove; 1; 18.183
org.apache.hadoop.hbase.coprocessor.TestWALObserver; 3; 19.373
org.apache.hadoop.hbase.filter.TestColumnRangeFilter; 1; 19.045
org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache; 5; 24.294
org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache; 7; 19.818
org.apache.hadoop.hbase.io.hfile.TestHFileBlock; 7; 25.226
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache; 7; 0.343
org.apache.hadoop.hbase.mapreduce.TestImportTsv; 8; 40.391
org.apache.hadoop.hbase.master.TestActiveMasterManager; 2; 0.724
org.apache.hadoop.hbase.master.TestHMasterRPCException; 1; 1.17
org.apache.hadoop.hbase.master.TestLogsCleaner; 1; 2.953
org.apache.hadoop.hbase.master.TestMaster; 1; 18.918
org.apache.hadoop.hbase.master.TestOpenedRegionHandler; 2; 20.57
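The three categories proposed in this issue can be written down as a small decision function. This is a rough sketch only: the class and method names are hypothetical, and reading "medium: 45s" as "at most 45 seconds" is an assumption. The actual proposed lists evidently involve further judgment, since some sub-second tests still appear under Medium.

```java
final class TestSizeSketch {
    enum Size { SMALL, MEDIUM, LARGE }

    // small:  no cluster and under 15 seconds
    // medium: not flaky and (assumed) at most 45 seconds
    // large:  everything else
    static Size categorize(boolean usesCluster, boolean flaky, double seconds) {
        if (!usesCluster && seconds < 15.0) {
            return Size.SMALL;
        }
        if (!flaky && seconds <= 45.0) {
            return Size.MEDIUM;
        }
        return Size.LARGE;
    }

    public static void main(String[] args) {
        System.out.println(categorize(false, false, 3.9));  // no cluster, short
        System.out.println(categorize(true, false, 31.5));  // cluster, under 45 seconds
        System.out.println(categorize(true, true, 142.9));  // long and flaky
    }
}
```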
[jira] [Commented] (HBASE-4747) Upgrade maven surefire plugin to 2.10
[ https://issues.apache.org/jira/browse/HBASE-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144347#comment-13144347 ] stack commented on HBASE-4747: -- +1 Can you apply this Ted or do you need me to? Upgrade maven surefire plugin to 2.10 - Key: HBASE-4747 URL: https://issues.apache.org/jira/browse/HBASE-4747 Project: HBase Issue Type: Task Reporter: Ted Yu Assignee: Ted Yu Attachments: 4747.txt Quite often, we see the following when running unit tests: {code} Running org.apache.hadoop.hbase.master.TestMasterFailover Exception in thread ThreadedStreamConsumer java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2882) at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100) at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390) at java.lang.StringBuffer.append(StringBuffer.java:224) at org.apache.maven.surefire.report.TestSetRunListener.getAsString(TestSetRunListener.java:201) at org.apache.maven.surefire.report.TestSetRunListener.testError(TestSetRunListener.java:139) at org.apache.maven.plugin.surefire.booterclient.output.ForkClient.consumeLine(ForkClient.java:112) at org.apache.maven.plugin.surefire.booterclient.output.ThreadedStreamConsumer$Pumper.run(ThreadedStreamConsumer.java:67) at java.lang.Thread.run(Thread.java:680) {code} This was due to https://jira.codehaus.org/browse/SUREFIRE-754 which has been fixed in surefire 2.10 We should upgrade to version 2.10 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2425) Crossport HADOOP-1849 rpc fix
[ https://issues.apache.org/jira/browse/HBASE-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144348#comment-13144348 ] jirapos...@reviews.apache.org commented on HBASE-2425: -- bq. On 2011-11-04 18:06:38, Gary Helmling wrote: bq. src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java, line 322 bq. https://reviews.apache.org/r/2718/diff/1/?file=56222#file56222line322 bq. bq. We could eliminate the flag and use status instead. Are there plans for other bits being set in this? Otherwise, we always have length and error can be determined from Status. Or would removing this break asynchbase in other ways? bq. bq. Michael Stack wrote: bq. Its for asynchbase -- so it can tell diff between a response with length and one w/o (its trying to support all hbase versions). bq. bq. Gary Helmling wrote: bq. So do we need an additional flag for status set here, so that asynchbase can tell when that's included? Or can it pick that up from the version number? asynchbase doesn't work against 0.92 yet; therefore, the flag will mean length + status (will check w/ the B man). - Michael --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2718/#review3049 --- On 2011-11-04 00:11:21, Michael Stack wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2718/ bq. --- bq. bq. (Updated 2011-11-04 00:11:21) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. Versions of Gary suggestions bq. bq. bq. This addresses bug hbase-2425. bq. https://issues.apache.org/jira/browse/hbase-2425 bq. bq. bq. Diffs bq. - bq. bq. src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java fce5490 bq.src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java 2fa4d6f bq. 
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java 6f88357 bq.src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java 6fcb771 bq.src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 1365411 bq.src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 4a8918a bq.src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java e60f970 bq.src/main/java/org/apache/hadoop/hbase/ipc/ProtocolSignature.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/ipc/Status.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/ipc/VersionedProtocol.java fb07374 bq.src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java 60a9248 bq.src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 8de2314 bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java 0d0e4c5 bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 12bd33e bq.src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java 888f428 bq. src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java e5b6a78 bq. bq. Diff: https://reviews.apache.org/r/2718/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Michael bq. bq. Crossport HADOOP-1849 rpc fix - Key: HBASE-2425 URL: https://issues.apache.org/jira/browse/HBASE-2425 Project: HBase Issue Type: Task Reporter: stack Labels: moved_from_0_20_5 Suggested over in HBASE-2360. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4747) Upgrade maven surefire plugin to 2.10
[ https://issues.apache.org/jira/browse/HBASE-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144350#comment-13144350 ] Ted Yu commented on HBASE-4747: --- Integrated to TRUNK. Thanks for the review Stack.
[jira] [Updated] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.
[ https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-4740: -- Attachment: 0001-HBASE-4740-bulkload-HBASE-4552-API-can-t-tell-if-err.patch [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error. Key: HBASE-4740 URL: https://issues.apache.org/jira/browse/HBASE-4740 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Blocker Fix For: 0.92.0 Attachments: 0001-HBASE-4740-bulkload-HBASE-4552-API-can-t-tell-if-err.patch Running TestHFileOutputFormat more frequently seems to show that it has become flaky. It is difficult to tell whether this is because of an unrecoverable failure or a recoverable failure. To make this visible to tests and to users, we need to change the interface of the bulkload call on HRegionServer. The change should make successful RPCs return true, recoverable failures return false, and unrecoverable failures throw an IOException. This is an RPC change, so it would be really good to get this API right before the final 0.92 goes out.
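The return contract described in this issue (true on success, false on a recoverable failure, IOException on an unrecoverable one) lets a caller decide when retrying makes sense. The following sketch assumes a hypothetical `BulkLoadCall` interface and retry helper; neither is the actual HRegionServer API, and the file path is invented.

```java
import java.io.IOException;

final class BulkLoadRetry {
    // Hypothetical stand-in for the region server bulk load RPC:
    // true = loaded, false = recoverable failure, IOException = unrecoverable.
    interface BulkLoadCall {
        boolean bulkLoad(String hfilePath) throws IOException;
    }

    // Retries recoverable failures up to maxAttempts times and lets an
    // IOException propagate immediately, because retrying would not help.
    static boolean loadWithRetries(BulkLoadCall call, String hfilePath, int maxAttempts)
            throws IOException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (call.bulkLoad(hfilePath)) {
                return true;
            }
            // false: recoverable, so fall through and try again
        }
        return false; // still failing after maxAttempts recoverable failures
    }

    public static void main(String[] args) throws IOException {
        final int[] calls = {0};
        // Simulated server that fails recoverably twice, then succeeds.
        boolean loaded = loadWithRetries(path -> ++calls[0] >= 3, "/tmp/example-hfile", 5);
        System.out.println("loaded=" + loaded + " after " + calls[0] + " calls");
    }
}
```

Keeping the recoverable case as a plain false return means a test can assert on it directly, while an exception still marks the case where the caller should give up.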
[jira] [Updated] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.
[ https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-4740: -- Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.
[ https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144364#comment-13144364 ] Jonathan Hsieh commented on HBASE-4740: --- Review here: https://reviews.apache.org/r/2730/
[jira] [Commented] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.
[ https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144362#comment-13144362 ] Jonathan Hsieh commented on HBASE-4740: --- It turns out that I was splitting in the wrong place, and splitting an empty region returns scary error messages when it should return an innocuous one.
[jira] [Commented] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.
[ https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144370#comment-13144370 ] Hadoop QA commented on HBASE-4740: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502534/0001-HBASE-4740-bulkload-HBASE-4552-API-can-t-tell-if-err.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/178//console This message is automatically generated.
[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
[ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144394#comment-13144394 ] Alex Newman commented on HBASE-2600: So I think the issue with ; is that it comes after [0-9]. What about using ! or something like that? Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid Key: HBASE-2600 URL: https://issues.apache.org/jira/browse/HBASE-2600 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: Alex Newman This is an idea that Ryan and I have been kicking around on and off for a while now. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. This issue is about changing the way we name regions. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). Converting to the new method, we'd have to run a migration on startup changing the content in meta. Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 32-bit encoding of regionnames waaay too susceptible to hash clashes proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. 
If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
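The difference between the two naming schemes can be illustrated with a sorted in-memory map standing in for the meta table. This is only an analogy under stated assumptions: the class, the region names, and the "\uffff" sentinel for the last region's end row are invented, end rows are treated as exclusive, and a `TreeMap` does not reproduce the cost of walking backward through store files.

```java
import java.util.TreeMap;

final class MetaLookupSketch {
    // Keyed by region START row: the containing region is the entry with the
    // greatest key <= row, which corresponds to a "closest row before" search.
    static String findByStartRow(TreeMap<String, String> meta, String row) {
        return meta.floorEntry(row).getValue();
    }

    // Keyed by region END row (exclusive): the containing region is the entry
    // with the smallest key > row, i.e. the first entry a forward scan finds.
    static String findByEndRow(TreeMap<String, String> meta, String row) {
        return meta.higherEntry(row).getValue();
    }

    public static void main(String[] args) {
        // Three regions: A = ["", "g"), B = ["g", "p"), C = ["p", end of table).
        TreeMap<String, String> byStart = new TreeMap<>();
        byStart.put("", "A");
        byStart.put("g", "B");
        byStart.put("p", "C");

        TreeMap<String, String> byEnd = new TreeMap<>();
        byEnd.put("g", "A");
        byEnd.put("p", "B");
        byEnd.put("\uffff", "C"); // invented sentinel for "end of table"

        for (String row : new String[] {"a", "g", "hello", "zebra"}) {
            System.out.println(row + " -> " + findByStartRow(byStart, row)
                    + " / " + findByEndRow(byEnd, row));
        }
    }
}
```

Both lookups return the same region for every row; the point of the proposal is that the end-row form maps onto a forward scan, which is the natural access pattern in HBase.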
[jira] [Updated] (HBASE-4498) HBase RPM/DEB packages attempt to setup ZooKeeper environment incorrectly
[ https://issues.apache.org/jira/browse/HBASE-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated HBASE-4498: - Resolution: Duplicate Status: Resolved (was: Patch Available) The patch is part of the patch for HBASE-4635. HBase RPM/DEB packages attempt to setup ZooKeeper environment incorrectly - Key: HBASE-4498 URL: https://issues.apache.org/jira/browse/HBASE-4498 Project: HBase Issue Type: Bug Components: build, scripts Affects Versions: 0.92.0 Environment: Java, Linux Reporter: Eric Yang Assignee: Eric Yang Fix For: 0.92.0 Attachments: HBASE-4498.patch HBase RPM packaging was done prior to the completion of ZooKeeper RPM packaging. update-hbase-env.sh expects the ZooKeeper environment script to exist at /etc/default/zookeeper-env.sh. After several revisions of ZOOKEEPER-999, the ZooKeeper community decided to remove /etc/default/zookeeper-env.sh. Hence, update-hbase-env.sh should not depend on /etc/default/zookeeper-env.sh. Instead, update-hbase-env.sh should assume ZooKeeper exists in /usr for RPM/deb packages.
[jira] [Commented] (HBASE-4467) Handle inconsistencies in Hadoop libraries naming in hbase script
[ https://issues.apache.org/jira/browse/HBASE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144400#comment-13144400 ] Hadoop QA commented on HBASE-4467: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12496268/HBASE-4467.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/179//console This message is automatically generated. Handle inconsistencies in Hadoop libraries naming in hbase script - Key: HBASE-4467 URL: https://issues.apache.org/jira/browse/HBASE-4467 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.92.0, 0.94.0 Reporter: Lars George Assignee: Lars George Priority: Trivial Fix For: 0.92.0, 0.94.0 Attachments: HBASE-4467.patch When using a Hadoop tarball whose core library is named hadoop-x.y.z-core rather than hadoop-core-x.y.z, the hbase script throws errors.
{noformat} $ bin/start-hbase.sh ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No such file or directory Exception in thread main java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No such file or directory Exception in thread main java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) localhost: starting zookeeper, logging to /projects/opensource/hbase-trunk-rw//logs/hbase-larsgeorge-zookeeper-de1-app-mbp-2.out localhost: /projects/opensource/hadoop-0.20.2-append localhost: ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No such file or directory localhost: Exception in thread main java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName localhost: Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName localhost:at java.net.URLClassLoader$1.run(URLClassLoader.java:202) localhost:at java.security.AccessController.doPrivileged(Native Method) localhost:at java.net.URLClassLoader.findClass(URLClassLoader.java:190) localhost:at 
java.lang.ClassLoader.loadClass(ClassLoader.java:306) localhost:at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) localhost:at java.lang.ClassLoader.loadClass(ClassLoader.java:247) starting master, logging to /projects/opensource/hbase-trunk-rw/bin/../logs/hbase-larsgeorge-master-de1-app-mbp-2.out /projects/opensource/hadoop-0.20.2-append ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No such file or directory Exception in thread main java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) localhost: starting regionserver, logging to /projects/opensource/hbase-trunk-rw//logs/hbase-larsgeorge-regionserver-de1-app-mbp-2.out localhost: /projects/opensource/hadoop-0.20.2-append
[jira] [Commented] (HBASE-2742) Provide strong authentication with a secure RPC engine
[ https://issues.apache.org/jira/browse/HBASE-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144406#comment-13144406 ] jirapos...@reviews.apache.org commented on HBASE-2742: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1991/#review3056 --- Ship it! - Andrew On 2011-10-26 20:23:19, Gary Helmling wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/1991/ bq. --- bq. bq. (Updated 2011-10-26 20:23:19) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. This patch creates a new secure RPC engine for HBase, which provides Kerberos based authentication of clients, and a token-based authentication mechanism for mapreduce jobs. Primary components of the patch are: bq. bq. - a new maven profile for secure Hadoop/HBase: hadoop-0.20S bq.- Secure Hadoop dependent classes are separated under a pseudo-module in the security/ directory. These source and test directories are only including if building the secure Hadoop profile bq.- Currently the security classes get packaged with the regular HBase build artifacts. We need a way to at least override project.version, so we can append something like a -security suffix indicating the additional security components. bq.- The pseudo-module here is really a half-step forward. It enables the security code to be optionally included in the build for now, and sets up the structure for a security module. But we still will want to pursue full modularization (see HBASE-4336), which will allow packing the security code in a separate build artifact. bq. bq. - a new RPC engine providing kerberos and token-based authentication: org.apache.hadoop.hbase.ipc.SecureRpcEngine bq.- implementation under security/src/main/java/org/apache/hadoop/hbase/ipc/ bq.- The implementation classes extend the existing HBaseClient and HBaseServer to share as much of the RPC code as possible. 
The main override is of the connection classes to allow control over the SASL negotiation of secure connections
bq.
bq. - existing RPC changes
bq.   - The existing HBaseClient and HBaseServer have been modified to make subclassing possible
bq.   - All references to Hadoop UserGroupInformation have been replaced with org.apache.hadoop.hbase.security.User to insulate from future dependencies on specific Hadoop versions
bq.
bq. - a coprocessor endpoint for obtaining new authentication tokens: TokenProvider, and supporting classes for token generation and synchronization (incorporating HBASE-3615)
bq.   - implementation is under security/src/main/java/org/apache/hadoop/hbase/security/token/
bq.   - Secret keys for token generation and verification are synchronized throughout the cluster in zookeeper, under /hbase/tokenauth/keys
bq.
bq.
bq. To enable secure RPC, add the following configuration to hbase-site.xml:
bq.
bq.   <property>
bq.     <name>hadoop.security.authorization</name>
bq.     <value>true</value>
bq.   </property>
bq.   <property>
bq.     <name>hadoop.security.authentication</name>
bq.     <value>kerberos</value>
bq.   </property>
bq.   <property>
bq.     <name>hbase.rpc.engine</name>
bq.     <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
bq.   </property>
bq.   <property>
bq.     <name>hbase.coprocessor.region.classes</name>
bq.     <value>org.apache.hadoop.hbase.security.token.TokenProvider</value>
bq.   </property>
bq.
bq. In addition, the master and regionserver processes must be configured for kerberos authentication using the properties:
bq.
bq. * hbase.(master|regionserver).keytab.file
bq. * hbase.(master|regionserver).kerberos.principal
bq. * hbase.(master|regionserver).kerberos.https.principal
bq.
bq.
bq. This addresses bug HBASE-2742.
bq. https://issues.apache.org/jira/browse/HBASE-2742
bq.
bq.
bq. Diffs
bq. -
bq.
bq. conf/hbase-policy.xml PRE-CREATION
bq. pom.xml 9d42e2b
bq. security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java PRE-CREATION
bq.
security/src/main/java/org/apache/hadoop/hbase/ipc/SecureConnectionHeader.java PRE-CREATION bq.security/src/main/java/org/apache/hadoop/hbase/ipc/SecureRpcEngine.java PRE-CREATION bq.security/src/main/java/org/apache/hadoop/hbase/ipc/SecureServer.java PRE-CREATION bq.security/src/main/java/org/apache/hadoop/hbase/ipc/Status.java PRE-CREATION bq. security/src/main/java/org/apache/hadoop/hbase/security/AccessDeniedException.java PRE-CREATION bq. security/src/main/java/org/apache/hadoop/hbase/security/HBasePolicyProvider.java
[jira] [Commented] (HBASE-4415) Add configuration script for setup HBase (hbase-setup-conf.sh)
[ https://issues.apache.org/jira/browse/HBASE-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144405#comment-13144405 ] Eric Yang commented on HBASE-4415: -- Ted, HBASE-4498 explains why zookeeper-env.sh was removed: the ZooKeeper community does not support it in their RPM creation. Stack, the hbase-env.sh in src/packages/templates/conf is used by hbase-setup-conf.sh as part of the post-installation process. It is designed to be a template, whereas conf/hbase-env.sh can be edited in place, so I think the two should be kept separate to be safe. I am happy to convert it to use the conf/hbase-env.sh location, but this would make conf/hbase-env.sh a template. Is this worth proceeding with? Add configuration script for setup HBase (hbase-setup-conf.sh) -- Key: HBASE-4415 URL: https://issues.apache.org/jira/browse/HBASE-4415 Project: HBase Issue Type: New Feature Components: scripts Affects Versions: 0.90.4, 0.92.0 Environment: Java 6, Linux Reporter: Eric Yang Assignee: Eric Yang Fix For: 0.90.4, 0.92.0 Attachments: HBASE-4415-1.patch, HBASE-4415-2.patch, HBASE-4415-3.patch, HBASE-4415-4.patch, HBASE-4415-5.patch, HBASE-4415-6.patch, HBASE-4415.patch The goal of this JIRA is to provide an installation script for configuring the HBase environment and configuration, using the same *-setup-conf.sh pattern as all Hadoop-related projects.
For HBase, the usage of the script looks like this:
{noformat}
usage: ./hbase-setup-conf.sh parameters
Optional parameters:
  --hadoop-conf=/etc/hadoop                Set Hadoop configuration directory location
  --hadoop-home=/usr                       Set Hadoop directory location
  --hadoop-namenode=localhost              Set Hadoop namenode hostname
  --hadoop-replication=3                   Set HDFS replication
  --hbase-home=/usr                        Set HBase directory location
  --hbase-conf=/etc/hbase                  Set HBase configuration directory location
  --hbase-log=/var/log/hbase               Set HBase log directory location
  --hbase-pid=/var/run/hbase               Set HBase pid directory location
  --hbase-user=hbase                       Set HBase user
  --java-home=/usr/java/default            Set JAVA_HOME directory location
  --kerberos-realm=KERBEROS.EXAMPLE.COM    Set Kerberos realm
  --kerberos-principal-id=_HOST            Set Kerberos principal ID
  --keytab-dir=/etc/security/keytabs       Set keytab directory
  --regionservers=localhost                Set regionservers hostnames
  --zookeeper-home=/usr                    Set ZooKeeper directory location
  --zookeeper-quorum=localhost             Set ZooKeeper Quorum
  --zookeeper-snapshot=/var/lib/zookeeper  Set ZooKeeper snapshot location
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4635) Remove dependency of java for rpm/deb packaging
[ https://issues.apache.org/jira/browse/HBASE-4635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated HBASE-4635: - Attachment: HBASE-4635-1.patch Updated patch to remove conflict with HBASE-4415. Changes for HBASE-4498 are now part of HBASE-4415. Remove dependency of java for rpm/deb packaging --- Key: HBASE-4635 URL: https://issues.apache.org/jira/browse/HBASE-4635 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.92.0 Environment: Java, Ubuntu, RHEL Reporter: Eric Yang Assignee: Eric Yang Attachments: HBASE-4635-1.patch, HBASE-4635.patch Comment from HBASE-3606: Eric, it looks like the hbase rpm spec file sets a dependency on jdk. Can we remove the jdk dependency? Not everyone will be installing jdk through rpm. There are multiple ways to install Java on Linux. It would be better to remove the Java dependency declaration from the packaging.
[jira] [Commented] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144418#comment-13144418 ] Phabricator commented on HBASE-4742: khemani has commented on the revision [jira] [HBASE-4742] Split dead server's log in parallel. INLINE COMMENTS src/test/java/org/apache/hadoop/hbase/master/TestMultiRegionServerShutDown.java:152 How can the master close? It should not under any of these circumstances. REVISION DETAIL https://reviews.facebook.net/D237 Split dead server's log in parallel --- Key: HBASE-4742 URL: https://issues.apache.org/jira/browse/HBASE-4742 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D237.1.patch, D237.2.patch, D237.3.patch When one region server goes down, the master will shut down the region server and split its log. However, splitting the log is a blocking call and takes some time. If more than one region server goes down, the master will split their logs one by one, which is not efficient. Since we have distributed log splitting, we could split the logs from the dead servers in parallel.
[jira] [Commented] (HBASE-3025) Coprocessor based simple access control
[ https://issues.apache.org/jira/browse/HBASE-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144417#comment-13144417 ] jirapos...@reviews.apache.org commented on HBASE-3025: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2041/#review3057 --- Ship it! security/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessControlFilter.java https://reviews.apache.org/r/2041/#comment6817 Would 1 slave be sufficient? security/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java https://reviews.apache.org/r/2041/#comment6818 Should we wait for the ACL table to become available here? I've seen this after making changes that alter connection setup timing: org.apache.hadoop.hbase.TableNotFoundException: _acl_ at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:863) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:732) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:697) at org.apache.hadoop.hbase.client.HTable.init(HTable.java:196) at org.apache.hadoop.hbase.client.HTable.init(HTable.java:152) at org.apache.hadoop.hbase.security.rbac.TestAccessController.setupBeforeClass(TestAccessController.java:95) - Andrew On 2011-11-01 21:18:27, Gary Helmling wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2041/ bq. --- bq. bq. (Updated 2011-11-01 21:18:27) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. This patch implements access control list based authorization of HBase operations. The patch depends on the currently posted patch for HBASE-2742 (secure RPC engine). bq. bq. Key parts of the implementation are: bq. bq. 
* AccessControlLists - encapsulates storage of permission grants in a metadata table (_acl_). This differs from previous implementation where the .META. table was used to store permissions. bq. bq. * AccessController - bq.- implements MasterObserver and RegionObserver, performing authorization checks in each of the preXXX() hooks. If authorization fails, an AccessDeniedException is thrown. bq.- implements AccessControllerProtocol as a coprocessor endpoint to provide RPC methods for granting, revoking and listing permissions. bq. bq. * ZKPermissionWatcher (and TableAuthManager) - synchronizes ACL entries and updates throughout the cluster nodes using ZK. ACL entries are stored in per-table znodes as /hbase/acl/tablename. bq. bq. * Additional ruby shell scripts providing the grant, revoke and user_permission commands bq. bq. * Support for a new OWNER attribute in HTableDescriptor. I could separate out this change into a new JIRA for discussion, but I don't see it as currently useful outside of security. Alternately, I could handle the OWNER attribute completely in AccessController without changing HTD, but that would make interaction via hbase shell a bit uglier. bq. bq. bq. This addresses bug HBASE-3025. bq. https://issues.apache.org/jira/browse/HBASE-3025 bq. bq. bq. Diffs bq. - bq. bq. security/src/main/java/org/apache/hadoop/hbase/security/access/AccessControlFilter.java PRE-CREATION bq. security/src/main/java/org/apache/hadoop/hbase/security/access/AccessControlLists.java PRE-CREATION bq. security/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java PRE-CREATION bq. security/src/main/java/org/apache/hadoop/hbase/security/access/AccessControllerProtocol.java PRE-CREATION bq. security/src/main/java/org/apache/hadoop/hbase/security/access/Permission.java PRE-CREATION bq. security/src/main/java/org/apache/hadoop/hbase/security/access/TableAuthManager.java PRE-CREATION bq. 
security/src/main/java/org/apache/hadoop/hbase/security/access/TablePermission.java PRE-CREATION bq. security/src/main/java/org/apache/hadoop/hbase/security/access/UserPermission.java PRE-CREATION bq. security/src/main/java/org/apache/hadoop/hbase/security/access/ZKPermissionWatcher.java PRE-CREATION bq. security/src/test/java/org/apache/hadoop/hbase/security/access/SecureTestUtil.java PRE-CREATION bq. security/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessControlFilter.java PRE-CREATION bq.
[jira] [Updated] (HBASE-4511) There is data loss when master failovers
[ https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4511: - Attachment: sketch.txt I think there is a hole here as identified by Gaojinchao. It's probably of rare incidence but it's easy to reason about. Say we fail verifying the root region location when a new master joins an existing cluster; we will reassign it. We may have failed, though, because the server has not yet expired in zk. In this latter case, the root will likely be assigned to a new server. When the server that had been carrying the root eventually goes down, the shutdown handling code may treat it as a server that was carrying root (if root location has not yet been updated) or it may not (if root location has been updated). In either case, its edits will likely be skipped when root opens in its new location. Ditto for the meta region. Here is a sketch of a patch that does the following:
+ If on master start, verification of the root location fails AND the server we were verifying is a member of the online set, expire the server -- do not reassign root (let shutdown handler do the reassigning).
+ Ditto for meta (only don't re-expire a server if we just did it on root check).
Drastic, but I think this is 'correct'. Patch is not working right though... need to check it out later. There is data loss when master failovers Key: HBASE-4511 URL: https://issues.apache.org/jira/browse/HBASE-4511 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0 Reporter: gaojinchao Priority: Minor Fix For: 0.92.0 Attachments: org.apache.hadoop.hbase.master.TestMasterFailover-output.rar, sketch.txt It goes like this: The master crashed; at the same time the RS carrying meta is crashing, but the RS doesn't exit. The master starts up again and finds all living RSs. The master's verification of meta fails, because this RS is crashing. The master reassigns meta, but it doesn't split the HLog, so some meta data is lost.
Logs from a failing failover test case:
// The log says that we want to kill an RS
2011-09-28 19:54:45,694 INFO [Thread-988] regionserver.HRegionServer(1443): STOPPED: Killing for unit test
2011-09-28 19:54:45,694 INFO [Thread-988] master.TestMasterFailover(1007): RS 192.168.2.102,54385,1317264874629 killed
// The RS didn't crash.
2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.HMaster(458): Registering server found up in zk: 192.168.2.102,54385,1317264874629
2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.ServerManager(232): Registering server=192.168.2.102,54385,1317264874629
2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of znode /hbase/unassigned/1028785192 because node does not exist (not an error)
2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487...
// Meta verification failed and the meta was reassigned, so all the regions in the meta are lost.
2011-09-28 19:54:51,773 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting
2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META.
server: null 2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... 2011-09-28 19:54:52,277 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720]
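One possible reading of the two '+' rules in the comment on this issue, written as a small decision function. This is only a sketch under stated assumptions: the class, enum, and method names are hypothetical and are not taken from sketch.txt or from the HBase master code, and the comment itself says the real patch was not yet working.

```java
import java.util.Set;

// Hypothetical illustration only; none of these names exist in HBase.
final class CatalogVerifySketch {
    enum Action { NONE, REASSIGN, EXPIRE_SERVER }

    /**
     * Decide what a starting master does after checking a catalog region
     * (root first, then meta) hosted on the given server.
     *
     * @param verified whether the location check succeeded
     * @param server   the server the region was last known to be on
     * @param online   servers currently registered as online
     * @param expired  servers already expired during this startup (mutated)
     */
    static Action decide(boolean verified, String server,
                         Set<String> online, Set<String> expired) {
        if (verified) {
            return Action.NONE; // location is fine, nothing to do
        }
        if (server != null && online.contains(server)) {
            // Still counted as online: expire it and let the shutdown
            // handler split logs and reassign. Set.add returns false if the
            // root check already expired this server, so do not expire twice.
            return expired.add(server) ? Action.EXPIRE_SERVER : Action.NONE;
        }
        // Server is not in the online set: plain reassignment.
        return Action.REASSIGN;
    }
}
```

Calling decide once for root and once for meta with the same expired set covers the "don't re-expire a server if we just did it on root check" case.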
[jira] [Commented] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144433#comment-13144433 ] Phabricator commented on HBASE-4742: khemani has commented on the revision [jira] [HBASE-4742] Split dead server's log in parallel. Also, I think we should maintain the old single-threaded behavior when distributed log splitting is not enabled. (You might not have to make any change because of splitLogLock in the HMaster.splitLog() function ... but please verify) REVISION DETAIL https://reviews.facebook.net/D237 Split dead server's log in parallel --- Key: HBASE-4742 URL: https://issues.apache.org/jira/browse/HBASE-4742 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D237.1.patch, D237.2.patch, D237.3.patch When one region server goes down, the master will shut down the region server and split its log. However, splitting the log is a blocking call and takes some time. If more than one region server goes down, the master will split their logs one by one, which is not efficient. Since we have distributed log splitting, we could split the logs from the dead servers in parallel.
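To make the reviewer's request concrete, here is a minimal sketch of splitting logs in parallel only when distributed splitting is enabled, and otherwise one at a time under a lock. It is not the D237 patch: LogSplitSketch, SPLIT_LOCK, and splitLog(String) are hypothetical stand-ins (the lock merely plays the role the comment attributes to splitLogLock), and the per-server work is a placeholder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; not HBase code.
final class LogSplitSketch {
    // Stand-in for the lock that serializes non-distributed splitting.
    private static final ReentrantLock SPLIT_LOCK = new ReentrantLock();

    // Placeholder for the real work of splitting one dead server's log.
    static String splitLog(String server) {
        return "split:" + server;
    }

    static List<String> splitAll(List<String> deadServers, boolean distributed) {
        List<String> results = new ArrayList<>();
        if (!distributed) {
            // Old behavior: one server at a time, guarded by the lock.
            for (String server : deadServers) {
                SPLIT_LOCK.lock();
                try {
                    results.add(splitLog(server));
                } finally {
                    SPLIT_LOCK.unlock();
                }
            }
            return results;
        }
        // Distributed splitting enabled: submit one task per dead server.
        ExecutorService pool =
            Executors.newFixedThreadPool(Math.max(1, deadServers.size()));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String server : deadServers) {
                Callable<String> task = () -> splitLog(server);
                futures.add(pool.submit(task));
            }
            // Collect results in submission order.
            for (Future<String> future : futures) {
                results.add(future.get());
            }
        } catch (Exception e) {
            throw new RuntimeException("log split failed", e);
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```

Both branches return results in the order of the input list, so callers see the same outcome either way; only the elapsed time differs.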
[jira] [Updated] (HBASE-4553) The update of .tableinfo is not atomic; we remove then rename
[ https://issues.apache.org/jira/browse/HBASE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4553: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed branch and trunk. The update of .tableinfo is not atomic; we remove then rename - Key: HBASE-4553 URL: https://issues.apache.org/jira/browse/HBASE-4553 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Priority: Critical Fix For: 0.92.0 Attachments: 3446-v8.txt, 4553-v10.txt, 4553-v11.txt, 4553-v12.txt, 4553-v12.txt, 4553-v12.txt, 4553-v12.txt, 4553-v13.txt, 4553-v5.txt, 4553-v9.txt, HBase-4553-TestAvroServer.patch This comes out of HBASE-4547. The rename in 0.20 hdfs fails if the file already exists. In 0.20+ it's better, but there are still 'some' issues if there is an existing reader when the file is renamed. This issue is about fixing this (though we depend on the fix first being in hdfs).
[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
[ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144439#comment-13144439 ] Lars Hofhansl commented on HBASE-2600: -- I think it does not matter as long as it comes after ','. The current format is: tableName','... In order to enforce that the empty region is at the end, we can have that entry use the format: tableName';'... That way we ensure it sorts after all other keys for that table. Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid Key: HBASE-2600 URL: https://issues.apache.org/jira/browse/HBASE-2600 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: Alex Newman This is an idea that Ryan and I have been kicking around on and off for a while now. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. This issue is about changing the way we name regions. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). Converting to the new method, we'd have to run a migration on startup changing the content in meta. Up to this, the randomid component of a region name has been the timestamp of region creation.
HBASE-2531 32-bit encoding of regionnames waaay too susceptible to hash clashes proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
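The difference between the two lookups described in this issue can be sketched with a sorted map standing in for the meta table. This is an illustration under simplifying assumptions, not HBase code: keys are bare start or end rows (no table name or randomid), end rows are treated as exclusive, and the last region's empty end row is replaced by a hypothetical "\uffff" sentinel, which sidesteps the very ordering question discussed in the comments.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; a TreeMap stands in for the meta table.
final class MetaLookupSketch {
    // Keyed by START row: the containing region is the closest key at or
    // before the wanted row, i.e. a backward lookup in the spirit of
    // getClosestRowBefore.
    static String byStartRow(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> entry = meta.floorEntry(row);
        return entry == null ? null : entry.getValue();
    }

    // Keyed by (exclusive) END row: the containing region is the first key
    // strictly after the wanted row, i.e. the first hit of a forward scan.
    static String byEndRow(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> entry = meta.higherEntry(row);
        return entry == null ? null : entry.getValue();
    }
}
```

With regions A = [ "", "g" ), B = [ "g", "p" ) and C = [ "p", end ), the start-keyed map holds "", "g", "p" and the end-keyed map holds "g", "p", "\uffff"; both lookups return the same region for any row, but only the second one moves forward through the sorted keys.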
[jira] [Updated] (HBASE-4725) NPE in AM#updateTimers
[ https://issues.apache.org/jira/browse/HBASE-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4725: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed branch and trunk. NPE in AM#updateTimers -- Key: HBASE-4725 URL: https://issues.apache.org/jira/browse/HBASE-4725 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 0.92.0 Attachments: am.txt
[jira] [Commented] (HBASE-4749) TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails
[ https://issues.apache.org/jira/browse/HBASE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1312#comment-1312 ] stack commented on HBASE-4749: -- So, looking at this more, I think it's correct to register servers that are up in zk but that have not reported in (and have not expired yet). And HBASE-4511 is a real issue (I've commented over there). Since HBASE-4511 is a rare issue IMO, I don't think it is a blocker/critical fix needed for 0.92. To fix this test, let's do something basic like one of Ted's suggestions above. TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails - Key: HBASE-4749 URL: https://issues.apache.org/jira/browse/HBASE-4749 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.0 Reporter: gaojinchao Priority: Critical Fix For: 0.92.0 See these logs: https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/105/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testMasterFailoverWithMockedRITOnDeadRS/
[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
[ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1311#comment-1311 ] Alex Newman commented on HBASE-2600: Consider what would happen if you had two tables, one named foo and one named foo1. Wouldn't you have a row order like:
foo,A
foo,B
foo,C
foo1,A - notice 1 comes before ; in ascii
foo1,B
foo1,C
foo;,
foo1;,
Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid Key: HBASE-2600 URL: https://issues.apache.org/jira/browse/HBASE-2600 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: Alex Newman This is an idea that Ryan and I have been kicking around on and off for a while now. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. This issue is about changing the way we name regions. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). Converting to the new method, we'd have to run a migration on startup changing the content in meta. Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 32-bit encoding of regionnames waaay too susceptible to hash clashes proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region.
If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
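The ordering concern raised in the comment on this issue is easy to check by sorting a handful of sample keys. The sketch below assumes plain ASCII keys, for which Java's natural String ordering agrees with byte-wise lexicographic ordering; MetaKeyOrderSketch is a hypothetical helper, and the keys are the simplified table-plus-delimiter-plus-row forms used in the thread, not real region names.

```java
import java.util.Arrays;

// Hypothetical illustration only; not HBase code.
final class MetaKeyOrderSketch {
    // Returns a sorted copy of the given keys. For ASCII-only keys this
    // matches byte-wise lexicographic order.
    static String[] sorted(String... keys) {
        String[] copy = keys.clone();
        Arrays.sort(copy);
        return copy;
    }
}
```

Because ',' (0x2C) sorts before '1' (0x31), which sorts before ';' (0x3B), the key "foo;," lands after every "foo1" key rather than directly after the last "foo," key, so a ';' delimiter alone does not keep one table's entries contiguous when another table's name extends it.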
[jira] [Updated] (HBASE-4749) TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails
[ https://issues.apache.org/jira/browse/HBASE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4749: - Status: Patch Available (was: Open) TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails - Key: HBASE-4749 URL: https://issues.apache.org/jira/browse/HBASE-4749 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.0 Reporter: gaojinchao Priority: Critical Fix For: 0.92.0 Attachments: 4749.txt See these logs: https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/105/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testMasterFailoverWithMockedRITOnDeadRS/
[jira] [Updated] (HBASE-4749) TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails
[ https://issues.apache.org/jira/browse/HBASE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4749: - Attachment: 4749.txt Here is a wait on the regionserver that does not wait for a fixed period -- it actually waits until the RS is down. Running the test, it seems to be working. I'll let my tests run a bit longer... TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS occasionally fails - Key: HBASE-4749 URL: https://issues.apache.org/jira/browse/HBASE-4749 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.0 Reporter: gaojinchao Priority: Critical Fix For: 0.92.0 Attachments: 4749.txt See these logs: https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/105/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testMasterFailoverWithMockedRITOnDeadRS/
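The idea of waiting on a condition rather than sleeping for a fixed period can be sketched as a small polling helper. This is not the content of 4749.txt: WaitSketch and waitUntil are hypothetical names, and the condition passed in stands for whatever check the test uses to decide that the region server is really down.

```java
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; not the attached patch.
final class WaitSketch {
    /**
     * Polls the condition until it holds or the timeout elapses.
     *
     * @return true if the condition became true, false on timeout or interrupt
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;               // e.g. the RS thread is no longer alive
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;              // gave up; the caller should fail the test
            }
            try {
                Thread.sleep(pollMs);      // short pause between checks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A test would call something like waitUntil(() -> !rsThread.isAlive(), 30_000, 100), where rsThread is an assumed handle on the region server thread, and fail if the result is false. The timeout keeps a hung server from blocking the build forever while still removing the race that a fixed sleep leaves open.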
[jira] [Commented] (HBASE-2742) Provide strong authentication with a secure RPC engine
[ https://issues.apache.org/jira/browse/HBASE-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1316#comment-1316 ]

jirapos...@reviews.apache.org commented on HBASE-2742:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1991/#review3058
-----------------------------------------------------------

I made a start; more to follow later.

conf/hbase-policy.xml
https://reviews.apache.org/r/1991/#comment6821

    Apache license?

pom.xml
https://reviews.apache.org/r/1991/#comment6819

    Do we need to change this when we ship 0.92? Can you use a variable here, e.g. ${pom.version}?

pom.xml
https://reviews.apache.org/r/1991/#comment6820

    This is probably the best place for this code for now. I was thinking src/security, but that gets weird with both test code and main code. This is pseudo-maven-modules for now.

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6822

    Minor nit: My guess is that this is not your code. Why not just return the condition? There is no need for this if ... return true else return false.

- Michael

On 2011-10-26 20:23:19, Gary Helmling wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1991/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-10-26 20:23:19)
bq.
bq.
bq. Review request for hbase.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. This patch creates a new secure RPC engine for HBase, which provides Kerberos-based authentication of clients, and a token-based authentication mechanism for MapReduce jobs. Primary components of the patch are:
bq.
bq. - a new Maven profile for secure Hadoop/HBase: hadoop-0.20S
bq.    - Secure Hadoop dependent classes are separated under a pseudo-module in the security/ directory. These source and test directories are only included when building with the secure Hadoop profile.
bq.    - Currently the security classes get packaged with the regular HBase build artifacts. We need a way to at least override project.version, so we can append something like a -security suffix indicating the additional security components.
bq.    - The pseudo-module here is really a half-step forward. It enables the security code to be optionally included in the build for now, and sets up the structure for a security module. But we will still want to pursue full modularization (see HBASE-4336), which will allow packaging the security code in a separate build artifact.
bq.
bq. - a new RPC engine providing Kerberos and token-based authentication: org.apache.hadoop.hbase.ipc.SecureRpcEngine
bq.    - implementation under security/src/main/java/org/apache/hadoop/hbase/ipc/
bq.    - The implementation classes extend the existing HBaseClient and HBaseServer to share as much of the RPC code as possible. The main override is of the connection classes, to allow control over the SASL negotiation of secure connections.
bq.
bq. - existing RPC changes
bq.    - The existing HBaseClient and HBaseServer have been modified to make subclassing possible.
bq.    - All references to Hadoop UserGroupInformation have been replaced with org.apache.hadoop.hbase.security.User to insulate from future dependencies on specific Hadoop versions.
bq.
bq. - a coprocessor endpoint for obtaining new authentication tokens: TokenProvider, and supporting classes for token generation and synchronization (incorporating HBASE-3615)
bq.    - implementation is under security/src/main/java/org/apache/hadoop/hbase/security/token/
bq.    - Secret keys for token generation and verification are synchronized throughout the cluster in ZooKeeper, under /hbase/tokenauth/keys
bq.
bq.
bq. To enable secure RPC, add the following configuration to hbase-site.xml:
bq.
bq.   <property>
bq.     <name>hadoop.security.authorization</name>
bq.     <value>true</value>
bq.   </property>
bq.   <property>
bq.     <name>hadoop.security.authentication</name>
bq.     <value>kerberos</value>
bq.   </property>
bq.   <property>
bq.     <name>hbase.rpc.engine</name>
bq.     <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
bq.   </property>
bq.   <property>
bq.     <name>hbase.coprocessor.region.classes</name>
bq.     <value>org.apache.hadoop.hbase.security.token.TokenProvider</value>
bq.   </property>
bq.
bq. In addition, the master and regionserver processes must be configured for Kerberos authentication using the properties:
bq.
bq. * hbase.(master|regionserver).keytab.file
bq. * hbase.(master|regionserver).kerberos.principal
bq. * hbase.(master|regionserver).kerberos.https.principal
bq.
bq.
bq. This addresses bug HBASE-2742.
bq. https://issues.apache.org/jira/browse/HBASE-2742
bq.
bq.
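As an illustration of the Kerberos property list quoted above, here is a minimal Java sketch of a pre-flight check that expands the hbase.(master|regionserver).* pattern into its six concrete keys and reports the ones that are unset. The class SecureRpcConfigCheck, the method missingKerberosKeys, and the placeholder values are hypothetical and are not part of the patch or of HBase. A plain Map stands in for the real configuration object.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-flight check; this class is not part of HBase or the patch. */
final class SecureRpcConfigCheck {

    // The two process roles and three key suffixes named in the review summary.
    private static final String[] ROLES = {"master", "regionserver"};
    private static final String[] SUFFIXES = {
        "keytab.file", "kerberos.principal", "kerberos.https.principal"
    };

    /** Expands hbase.(master|regionserver).<suffix> and returns the keys that are unset or blank. */
    static List<String> missingKerberosKeys(Map<String, String> conf) {
        List<String> missing = new ArrayList<>();
        for (String role : ROLES) {
            for (String suffix : SUFFIXES) {
                String key = "hbase." + role + "." + suffix;
                String value = conf.get(key);
                if (value == null || value.trim().isEmpty()) {
                    missing.add(key);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Placeholder values for illustration only.
        conf.put("hbase.master.keytab.file", "/path/to/hbase.keytab");
        conf.put("hbase.master.kerberos.principal", "hbase/host.example.com@EXAMPLE.COM");
        for (String key : missingKerberosKeys(conf)) {
            System.out.println("missing: " + key);
        }
    }
}
```

A check like this only confirms that the keys are present. It does not validate the keytab file or the principal format.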
[jira] [Created] (HBASE-4751) Make TestAdmin#testEnableTableRoundRobinAssignment friendly to concurrent tests
Make TestAdmin#testEnableTableRoundRobinAssignment friendly to concurrent tests
-------------------------------------------------------------------------------

                 Key: HBASE-4751
                 URL: https://issues.apache.org/jira/browse/HBASE-4751
             Project: HBase
          Issue Type: Task
            Reporter: Ted Yu

From https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2410/artifact/trunk/target/surefire-reports/org.apache.hadoop.hbase.client.TestAdmin.txt :
{code}
testEnableTableRoundRobinAssignment(org.apache.hadoop.hbase.client.TestAdmin)  Time elapsed: 4.345 sec  ERROR!
java.lang.IllegalArgumentException: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
	at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.waitRootRegionLocation(RootRegionTracker.java:81)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:753)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:733)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:866)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:765)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:733)
	at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:202)
	at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:157)
	at org.apache.hadoop.hbase.client.TestAdmin.testEnableTableRoundRobinAssignment(TestAdmin.java:604)
{code}

This was due to:
{code}
HTable metaTable = new HTable(HConstants.META_TABLE_NAME);
{code}

A few lines above, we have the correct usage:
{code}
HTable ht = new HTable(TEST_UTIL.getConfiguration(), tableName);
{code}

--
This message is automatically generated by JIRA.
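The difference between the two HTable constructor calls in HBASE-4751 can be sketched without any HBase dependency. In the toy model below, a handle built without an explicit configuration falls back to built-in defaults and therefore misses the overrides made by the test harness, while a handle built from the cluster's own configuration sees them. All class and method names (ConfigPropagationSketch, ToyCluster, ToyTable, defaultConf) and the "/hbase-test" value are hypothetical stand-ins, not HBase APIs. Only the key name 'zookeeper.znode.parent' is taken from the error message, and "/hbase" is assumed as its default.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-ins; none of these classes are HBase APIs. */
final class ConfigPropagationSketch {

    static final String ZNODE_PARENT_KEY = "zookeeper.znode.parent";

    /** Mimics a freshly created configuration: built-in defaults only, no test overrides. */
    static Map<String, String> defaultConf() {
        Map<String, String> conf = new HashMap<>();
        conf.put(ZNODE_PARENT_KEY, "/hbase");
        return conf;
    }

    /** Toy cluster that only accepts clients whose znode parent matches its own. */
    static final class ToyCluster {
        private final Map<String, String> conf;

        ToyCluster(Map<String, String> conf) {
            this.conf = conf;
        }

        Map<String, String> getConfiguration() {
            return conf;
        }

        boolean accepts(Map<String, String> clientConf) {
            return conf.get(ZNODE_PARENT_KEY).equals(clientConf.get(ZNODE_PARENT_KEY));
        }
    }

    /** Toy table handle; the no-argument form falls back to defaults, like the failing line. */
    static final class ToyTable {
        final Map<String, String> conf;

        ToyTable() {
            this(defaultConf());
        }

        ToyTable(Map<String, String> conf) {
            this.conf = conf;
        }
    }

    public static void main(String[] args) {
        Map<String, String> testConf = defaultConf();
        testConf.put(ZNODE_PARENT_KEY, "/hbase-test"); // hypothetical override made by the test harness
        ToyCluster cluster = new ToyCluster(testConf);

        // Built from defaults: does not match the cluster's settings.
        System.out.println(cluster.accepts(new ToyTable().conf)); // prints false
        // Built from the cluster's configuration: matches.
        System.out.println(cluster.accepts(new ToyTable(cluster.getConfiguration()).conf)); // prints true
    }
}
```

The same reasoning applies to any setting the test cluster changes at startup: every client object in a test needs to be built from the shared test configuration rather than from a new default one.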