[jira] [Commented] (HBASE-7160) Improve IdLock and remove its minor defects
[ https://issues.apache.org/jira/browse/HBASE-7160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13502652#comment-13502652 ] Hadoop QA commented on HBASE-7160: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12554644/HBASE-7160-V3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 98 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 24 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3397//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3397//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3397//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3397//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3397//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3397//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3397//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3397//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3397//console This message is automatically generated. Improve IdLock and remove its minor defects --- Key: HBASE-7160 URL: https://issues.apache.org/jira/browse/HBASE-7160 Project: HBase Issue Type: Improvement Reporter: Hiroshi Ikeda Assignee: Hiroshi Ikeda Priority: Minor Attachments: HBASE-7160.patch, HBASE-7160-V2.patch, HBASE-7160-V3.patch Combination of synchronizations and concurrent collections complicates the code, and it is hard to trace the code and to confirm its correctness. We should re-create the class and make it more understandable. 
In the current code, I find the following minor defects:

(1) In the case that a thread is waiting for a lock in getLockEntry() and another thread is releasing the lock by calling releaseLockEntry(), a third thread that tries to get the lock by calling getLockEntry() falls into a busy loop until the waiting thread wakes up and gets the lock. Even if notify() wakes up the blocked thread and causes an immediate context switch to the woken thread, synchronization might block the woken thread and cause another context switch, so the busy loop might continue for a while.

(2) In the same case as (1), since releasing the lock merely notifies without removing the lock entry from the map, interrupting the waiting thread might leave an unused lock entry (entry.numWaiters == 0) in the map. This is a memory leak unless the id of the lock is used again.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
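To make the two defects concrete, here is a minimal sketch of a per-id lock that avoids both of them. It is not the attached patch and not the existing IdLock: the class name SimpleIdLock, its methods, and the choice of a single monitor guarding a plain HashMap are all assumptions made for illustration. A late arrival either takes the free lock or blocks in wait() (no busy loop), and a waiter that is interrupted removes the entry if nobody else holds or waits for it (no leaked entry).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a per-id lock; not the HBase IdLock implementation. */
final class SimpleIdLock {
    private static final class Entry {
        boolean locked = true; // an entry is created in the locked state
        int numWaiters = 0;    // threads currently blocked on this id
    }

    // Guarded by the monitor of this SimpleIdLock instance.
    private final Map<Long, Entry> entries = new HashMap<>();

    /** Blocks until the lock for the given id is acquired. */
    synchronized void lock(long id) throws InterruptedException {
        Entry entry = entries.get(id);
        if (entry == null) {
            entries.put(id, new Entry());
            return;
        }
        entry.numWaiters++;
        try {
            while (entry.locked) {
                wait(); // blocks instead of spinning, also for a third thread
            }
            entry.locked = true;
        } finally {
            entry.numWaiters--;
            // Reached with locked == false only after an interrupt: drop the
            // entry if it is now unused, so it cannot leak in the map.
            if (!entry.locked && entry.numWaiters == 0) {
                entries.remove(id);
            }
        }
    }

    /** Releases the lock for the given id (no owner check in this sketch). */
    synchronized void unlock(long id) {
        Entry entry = entries.get(id);
        if (entry == null || !entry.locked) {
            throw new IllegalStateException("id is not locked: " + id);
        }
        if (entry.numWaiters == 0) {
            entries.remove(id);
        } else {
            entry.locked = false;
            notifyAll(); // coarse: wakes the waiters of every id
        }
    }

    /** Number of entries currently in the map, useful for leak checks. */
    synchronized int size() {
        return entries.size();
    }
}
```

The single monitor keeps the reasoning simple at the cost of contention between unrelated ids; a real replacement would likely want finer-grained waiting.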
[jira] [Commented] (HBASE-7205) Coprocessor classloader is replicated for all regions in the HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502738#comment-13502738 ]

Adrian Muraru commented on HBASE-7205:
--

There is a caching attempt in the current implementation of {code}org.apache.hadoop.hbase.coprocessor.CoprocessorHost#load{code}: the custom class loader is cached on the thread context:
{code}
Thread.currentThread().setContextClassLoader(cl);
{code}
However, the classes are loaded from the current class loader (which is different from the thread class loader):
{code}
implClass = getClass().getClassLoader().loadClass(className);
{code}

[~te...@apache.org] I'm not sure RegionCoprocessorHost is the right place; we need similar behavior on MasterCoprocessorHost, so it should be one level down.

Coprocessor classloader is replicated for all regions in the HRegionServer
--

Key: HBASE-7205
URL: https://issues.apache.org/jira/browse/HBASE-7205
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 0.92.2, 0.94.2
Reporter: Adrian Muraru
Priority: Critical
Fix For: 0.96.0

HBASE-6308 introduced a new custom CoprocessorClassLoader to load the coprocessor classes, and a new instance of this CL is created for every HRegion opened. This leads to OOME-PermGen when the number of regions goes above hundreds per region server.
Having the table coprocessor jailed in a separate classloader is good; however, we should create only one for all regions of a table in each HRS.
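The mismatch described in this comment can be reproduced with the JDK alone. The following is a small sketch, not HBase code: ContextLoaderDemo and loadersDiffer() are hypothetical names, and an empty URLClassLoader stands in for the coprocessor class loader. It shows that setting the thread context class loader does not change the loader returned by getClass().getClassLoader(), so a loader cached only on the thread context is never consulted by the second call.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical demo; the empty URLClassLoader stands in for a custom loader. */
final class ContextLoaderDemo {
    /**
     * Installs a fresh loader as the thread context class loader and reports
     * whether it differs from the loader that defined this class.
     */
    static boolean loadersDiffer() {
        Thread thread = Thread.currentThread();
        ClassLoader saved = thread.getContextClassLoader();
        ClassLoader custom =
            new URLClassLoader(new URL[0], ContextLoaderDemo.class.getClassLoader());
        thread.setContextClassLoader(custom);
        try {
            ClassLoader viaThread = thread.getContextClassLoader();
            ClassLoader viaClass = ContextLoaderDemo.class.getClassLoader();
            // viaThread is the custom loader; viaClass is unaffected by the
            // setContextClassLoader() call above.
            return viaThread != viaClass;
        } finally {
            thread.setContextClassLoader(saved); // restore the previous loader
        }
    }
}
```

Calling ContextLoaderDemo.loadersDiffer() restores the previous context loader before returning, so it is safe to run inside other code.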
[jira] [Created] (HBASE-7209) Enhance dev-support/test-patch.sh with running test suite using 'mvn package'
Ted Yu created HBASE-7209:
--

Summary: Enhance dev-support/test-patch.sh with running test suite using 'mvn package'
Key: HBASE-7209
URL: https://issues.apache.org/jira/browse/HBASE-7209
Project: HBase
Issue Type: Bug
Reporter: Ted Yu

Currently in dev-support/test-patch.sh, every 'mvn package' command is accompanied by '-DskipTests'.
This may hide issue(s) when tests are run from the jar file.

We should enhance the script to run the test suite using the 'mvn package' command.
[jira] [Commented] (HBASE-7109) integration tests on cluster are not getting picked up from distribution
[ https://issues.apache.org/jira/browse/HBASE-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13502740#comment-13502740 ] Ted Yu commented on HBASE-7109: --- I logged HBASE-7209: Enhance dev-support/test-patch.sh with running test suite using 'mvn package' integration tests on cluster are not getting picked up from distribution Key: HBASE-7109 URL: https://issues.apache.org/jira/browse/HBASE-7109 Project: HBase Issue Type: Sub-task Components: test Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.96.0 Attachments: HBASE-7109-squashed.patch, HBASE-7109-v2-squashed.patch, HBASE-7109-v3-squashed.patch, HBASE-7109-v4-squashed.patch, HBASE-7109-v5.patch, HBASE-7109-v5.patch, HBASE-7109-v6-0.94.patch, HBASE-7109-v6.patch The method of finding test classes only works on local build (or its full copy), not if the distribution is used. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7205) Coprocessor classloader is replicated for all regions in the HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502744#comment-13502744 ]

Ted Yu commented on HBASE-7205:
--

@Adrian:
Were you suggesting that we should retrieve the thread classloader first so that we can reuse the CoprocessorClassLoader instance?

Coprocessor classloader is replicated for all regions in the HRegionServer
--

Key: HBASE-7205
URL: https://issues.apache.org/jira/browse/HBASE-7205
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 0.92.2, 0.94.2
Reporter: Adrian Muraru
Priority: Critical
Fix For: 0.96.0

HBASE-6308 introduced a new custom CoprocessorClassLoader to load the coprocessor classes, and a new instance of this CL is created for every HRegion opened. This leads to OOME-PermGen when the number of regions goes above hundreds per region server.
Having the table coprocessor jailed in a separate classloader is good; however, we should create only one for all regions of a table in each HRS.
[jira] [Commented] (HBASE-7205) Coprocessor classloader is replicated for all regions in the HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502763#comment-13502763 ]

Adrian Muraru commented on HBASE-7205:
--

[~te...@apache.org] Checking the thread classloader would be a quick fix in the current implementation; however, it would break the following use case:

1. The user creates a table and adds a RegionObserver CP (say org.mycompany.MyRegionObserver) available in, say, hdfs:///lib/mycp-*0.1*.jar (in turn, a new classloader will be cached on the thread context).
2. The user updates the implementation and overwrites the COPROCESSOR attribute on the table to load from a new jar (say hdfs:///lib/mycp-*0.2*.jar). In turn, the old/cached classloader will be used to load the MyRegionObserver class; in other words, there is no way to drop an old classloader.

This might, though, be considered a corner case and documented properly.

The real solution here, in my mind, would be to cache the classloaders keyed by the external jar's canonical path. This would require two new methods on org.apache.hadoop.hbase.Server:

1. getExternalClassLoader(String key)
2. setExternalClassLoader(String key, ClassLoader cl)

Coprocessor classloader is replicated for all regions in the HRegionServer
--

Key: HBASE-7205
URL: https://issues.apache.org/jira/browse/HBASE-7205
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 0.92.2, 0.94.2
Reporter: Adrian Muraru
Priority: Critical
Fix For: 0.96.0

HBASE-6308 introduced a new custom CoprocessorClassLoader to load the coprocessor classes, and a new instance of this CL is created for every HRegion opened. This leads to OOME-PermGen when the number of regions goes above hundreds per region server.
Having the table coprocessor jailed in a separate classloader is good; however, we should create only one for all regions of a table in each HRS.
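The proposal to cache class loaders keyed by the jar's canonical path could look roughly like the sketch below. The two method names come from the comment; everything else is an assumption: the holder class ExternalClassLoaderCache is hypothetical (the comment suggests putting the methods on org.apache.hadoop.hbase.Server), the return type of setExternalClassLoader is not specified in the comment and is chosen here so that concurrent callers converge on one loader, and size() exists only for inspection.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical holder for class loaders shared by all regions on a server. */
final class ExternalClassLoaderCache {
    // Key: canonical path of the external jar, e.g. hdfs:///lib/mycp-0.1.jar
    private final ConcurrentMap<String, ClassLoader> loaders = new ConcurrentHashMap<>();

    /** Returns the cached loader for the key, or null if none is cached. */
    ClassLoader getExternalClassLoader(String key) {
        return loaders.get(key);
    }

    /**
     * Caches the loader unless one is already present for the key.
     * Returns the loader that ends up cached (an assumption of this sketch),
     * so every region opening the same jar shares a single instance.
     */
    ClassLoader setExternalClassLoader(String key, ClassLoader cl) {
        ClassLoader previous = loaders.putIfAbsent(key, cl);
        return previous != null ? previous : cl;
    }

    /** Number of cached loaders; only for inspection in this sketch. */
    int size() {
        return loaders.size();
    }
}
```

With this keying, the 0.1 and 0.2 jars from the use case above map to different entries, so updating the COPROCESSOR attribute picks up a new loader instead of the stale one. The sketch never evicts an entry, so the loader for the old jar stays cached.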
[jira] [Commented] (HBASE-7167) Thrift's deleteMultiple() raises exception instead of returning list of failed deletes
[ https://issues.apache.org/jira/browse/HBASE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502789#comment-13502789 ]

ramkrishna.s.vasudevan commented on HBASE-7167:
--

bq. + * Allways returns an empty list for backwards compatibility.

Please change 'Allways' to 'always'. The changes seem fine. Maybe the doc change alone would be enough. Anyway, +1 on the patch.

Thrift's deleteMultiple() raises exception instead of returning list of failed deletes
--

Key: HBASE-7167
URL: https://issues.apache.org/jira/browse/HBASE-7167
Project: HBase
Issue Type: Bug
Components: Thrift
Reporter: Daniel Gómez Ferro
Assignee: Daniel Gómez Ferro
Attachments: HBASE-7167.patch, HBASE-7167.patch

Thrift API claims deleteMultiple() returns the list of failed Deletes, but the current implementation throws a TIOError instead.
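The contract being documented here (an error is thrown on failure, and the returned list is always empty for backwards compatibility) can be sketched as follows. This is not the ThriftHBaseServiceHandler code: SketchIOError stands in for the Thrift-generated TIOError, BatchDeleter stands in for the table, and rows are plain strings, all assumptions made to keep the example self-contained.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of the documented deleteMultiple() contract. */
final class DeleteMultipleSketch {
    /** Stand-in for the Thrift-generated TIOError (assumption). */
    static final class SketchIOError extends Exception {
        SketchIOError(String message) {
            super(message);
        }
    }

    /** Stand-in for the table operation that performs the batch delete. */
    interface BatchDeleter {
        void delete(List<String> rows) throws IOException;
    }

    /**
     * Deletes all rows. Any failure surfaces as SketchIOError; on success the
     * result is always an empty list, kept only for backwards compatibility.
     */
    static List<String> deleteMultiple(BatchDeleter table, List<String> rows)
            throws SketchIOError {
        try {
            table.delete(rows);
        } catch (IOException e) {
            throw new SketchIOError(e.getMessage());
        }
        return Collections.emptyList();
    }
}
```

A caller written against the old description (inspect the returned list for failed deletes) would silently miss failures, which is why the documentation change matters even without a behaviour change.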
[jira] [Created] (HBASE-7210) Backport HBASE-6059 to 0.94
ramkrishna.s.vasudevan created HBASE-7210: - Summary: Backport HBASE-6059 to 0.94 Key: HBASE-7210 URL: https://issues.apache.org/jira/browse/HBASE-7210 Project: HBase Issue Type: Bug Affects Versions: 0.94.2 Reporter: ramkrishna.s.vasudevan Fix For: 0.94.4 HBASE-6059 seems to be an important issue. Chunhui has already given a patch for 94. Need to rebase if it does not apply cleanly. Raising a new one as the old issue is already closed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7167) Thrift's deleteMultiple() raises exception instead of returning list of failed deletes
[ https://issues.apache.org/jira/browse/HBASE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Gómez Ferro updated HBASE-7167: -- Attachment: HBASE-7167.patch Fix typo spotted by Ramkrishna Thrift's deleteMultiple() raises exception instead of returning list of failed deletes -- Key: HBASE-7167 URL: https://issues.apache.org/jira/browse/HBASE-7167 Project: HBase Issue Type: Bug Components: Thrift Reporter: Daniel Gómez Ferro Assignee: Daniel Gómez Ferro Attachments: HBASE-7167.patch, HBASE-7167.patch, HBASE-7167.patch Thrift API claims deleteMultiple() returns the list of failed Deletes, but the current implementation throws a TIOError instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7168) [dev] in the script called 'hbase', we don't check for errors when generating the classpath with mvn
[ https://issues.apache.org/jira/browse/HBASE-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502798#comment-13502798 ]

nkeywal commented on HBASE-7168:
--

Committed, thanks for the review, Elliott & Stack!

Note: I renamed MAVEN_OPTIONS to MAVEN_OPTS, as I saw at the last moment that it was the naming pattern for the other options.

[dev] in the script called 'hbase', we don't check for errors when generating the classpath with mvn
--

Key: HBASE-7168
URL: https://issues.apache.org/jira/browse/HBASE-7168
Project: HBase
Issue Type: Bug
Components: scripts
Affects Versions: 0.96.0
Environment: dev
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.96.0
Attachments: 7168.v1.patch

When it happens, it's difficult to guess. Let's fix this.
[jira] [Resolved] (HBASE-7168) [dev] in the script called 'hbase', we don't check for errors when generating the classpath with mvn
[ https://issues.apache.org/jira/browse/HBASE-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal resolved HBASE-7168. Resolution: Fixed Fix Version/s: 0.96.0 Hadoop Flags: Reviewed [dev] in the script called 'hbase', we don't check for errors when generating the classpath with mvn Key: HBASE-7168 URL: https://issues.apache.org/jira/browse/HBASE-7168 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.96.0 Environment: dev Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 7168.v1.patch When it happens, it's difficult to guess. Let's fix this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-7211) Improve hbase ref guide for the testing part.
nkeywal created HBASE-7211:
--

Summary: Improve hbase ref guide for the testing part.
Key: HBASE-7211
URL: https://issues.apache.org/jira/browse/HBASE-7211
Project: HBase
Issue Type: Bug
Components: documentation
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor

Here is some stuff I saw. I will propose a fix in a week or so; please add any comments or issues you have in mind.

??15.6.1. Apache HBase Modules??
=> We should be able to use categories in all modules. The default should be small; but any test manipulating the time needs to be in a specific jvm (hence medium), so it's not always related to minicluster.

??15.6.3.6. hbasetests.sh??
=> We can remove this chapter, and the script. The script is not totally useless, but I think nobody actually uses it.

=> Add a chapter on flakiness. Some tests are, unfortunately, flaky. While their number decreases, we still have some. Rules are:
- don't write flaky tests! :-)
- small tests cannot be flaky, as it blocks other test execution. Corollary: if you have an issue with a small test, it's either your environment or a severe issue.
- rerun the test a few times to validate; check the ports and file descriptors used.

??mvn test -P localTests -Dtest=MyTest??
=> We could actually activate the localTests profile whenever -Dtest is used. If we do that, we can remove the reference to localTests from the doc.

??mvn test -P runSmallTests??
??mvn test -P runMediumTests??
=> I'm not sure they are actually used. We could remove them from the pom.xml (and the doc).

??The HBase build uses a patched version of the maven surefire plugin??
=> Hopefully, we will be able to remove this soon :-)

??Integration tests are described TODO: POINTER_TO_INTEGRATION_TEST_SECTION??
=> Should be documented
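The "rerun the test a few times to validate" rule can be sketched in plain Java as a small helper that repeats a check and counts failures. This is not part of the HBase build or of JUnit; FlakyCheck, Check and countFailures are hypothetical names, and in practice the same effect is usually obtained by rerunning the test from the command line.

```java
/** Hypothetical helper for telling a flaky check from a consistently failing one. */
final class FlakyCheck {
    /** A check that signals failure by throwing. */
    interface Check {
        void run() throws Exception;
    }

    /**
     * Runs the check the given number of times and returns how many runs
     * failed. 0 means stable, a value equal to runs means a consistent
     * failure, and anything in between suggests flakiness.
     */
    static int countFailures(Check check, int runs) {
        int failures = 0;
        for (int i = 0; i < runs; i++) {
            try {
                check.run();
            } catch (Exception | AssertionError e) {
                failures++;
            }
        }
        return failures;
    }
}
```

A result strictly between 0 and the number of runs is the interesting case: per the rules above, for a small test that points at the environment (ports, file descriptors) or at a severe issue.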
[jira] [Commented] (HBASE-7168) [dev] in the script called 'hbase', we don't check for errors when generating the classpath with mvn
[ https://issues.apache.org/jira/browse/HBASE-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13502819#comment-13502819 ] Hudson commented on HBASE-7168: --- Integrated in HBase-TRUNK #3560 (See [https://builds.apache.org/job/HBase-TRUNK/3560/]) HBASE-7168 In the script called 'hbase', we don't check the errors when generating the classpath with mvn (Revision 1412565) Result = FAILURE nkeywal : Files : * /hbase/trunk/bin/hbase [dev] in the script called 'hbase', we don't check for errors when generating the classpath with mvn Key: HBASE-7168 URL: https://issues.apache.org/jira/browse/HBASE-7168 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.96.0 Environment: dev Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 7168.v1.patch When it happens, it's difficult to guess. Let's fix this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7205) Coprocessor classloader is replicated for all regions in the HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502820#comment-13502820 ]

Ted Yu commented on HBASE-7205:
--

As I said earlier, RegionCoprocessorHost has access to RegionServerServices.
Should the new methods be added to RegionServerServices?

In the case you described above, should we allow the path for ver 0.1 of the jar to be purged from the new classloader map?

Coprocessor classloader is replicated for all regions in the HRegionServer
--

Key: HBASE-7205
URL: https://issues.apache.org/jira/browse/HBASE-7205
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 0.92.2, 0.94.2
Reporter: Adrian Muraru
Priority: Critical
Fix For: 0.96.0

HBASE-6308 introduced a new custom CoprocessorClassLoader to load the coprocessor classes, and a new instance of this CL is created for every HRegion opened. This leads to OOME-PermGen when the number of regions goes above hundreds per region server.
Having the table coprocessor jailed in a separate classloader is good; however, we should create only one for all regions of a table in each HRS.
[jira] [Commented] (HBASE-7167) Thrift's deleteMultiple() raises exception instead of returning list of failed deletes
[ https://issues.apache.org/jira/browse/HBASE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13502829#comment-13502829 ] Hadoop QA commented on HBASE-7167: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12554683/HBASE-7167.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 98 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 24 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3398//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3398//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3398//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3398//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3398//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3398//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3398//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3398//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3398//console This message is automatically generated. Thrift's deleteMultiple() raises exception instead of returning list of failed deletes -- Key: HBASE-7167 URL: https://issues.apache.org/jira/browse/HBASE-7167 Project: HBase Issue Type: Bug Components: Thrift Reporter: Daniel Gómez Ferro Assignee: Daniel Gómez Ferro Attachments: HBASE-7167.patch, HBASE-7167.patch, HBASE-7167.patch Thrift API claims deleteMultiple() returns the list of failed Deletes, but the current implementation throws a TIOError instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7167) Thrift's deleteMultiple() raises exception instead of returning list of failed deletes
[ https://issues.apache.org/jira/browse/HBASE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7167: -- Fix Version/s: 0.96.0 Release Note: Updated documentation to reflect that Thrift2 API deleteMultiple() throws a TIOError if any delete fails. (was: Updated documentation to reflect that Thrift2 API deleteMultiple() throws a TIOError if a delete fails.) Hadoop Flags: Reviewed Integrated latest patch to trunk. Thanks for the patch, Daniel. Thanks for the review, Ram. Thrift's deleteMultiple() raises exception instead of returning list of failed deletes -- Key: HBASE-7167 URL: https://issues.apache.org/jira/browse/HBASE-7167 Project: HBase Issue Type: Bug Components: Thrift Reporter: Daniel Gómez Ferro Assignee: Daniel Gómez Ferro Fix For: 0.96.0 Attachments: HBASE-7167.patch, HBASE-7167.patch, HBASE-7167.patch Thrift API claims deleteMultiple() returns the list of failed Deletes, but the current implementation throws a TIOError instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7167) Thrift's deleteMultiple() raises exception instead of returning list of failed deletes
[ https://issues.apache.org/jira/browse/HBASE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13502843#comment-13502843 ] Hudson commented on HBASE-7167: --- Integrated in HBase-TRUNK #3561 (See [https://builds.apache.org/job/HBase-TRUNK/3561/]) HBASE-7167 Thrift's deleteMultiple() raises exception instead of returning list of failed deletes (Daniel Gomez) (Revision 1412594) Result = FAILURE tedyu : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftUtilities.java * /hbase/trunk/hbase-server/src/main/resources/org/apache/hadoop/hbase/thrift2/hbase.thrift Thrift's deleteMultiple() raises exception instead of returning list of failed deletes -- Key: HBASE-7167 URL: https://issues.apache.org/jira/browse/HBASE-7167 Project: HBase Issue Type: Bug Components: Thrift Reporter: Daniel Gómez Ferro Assignee: Daniel Gómez Ferro Fix For: 0.96.0 Attachments: HBASE-7167.patch, HBASE-7167.patch, HBASE-7167.patch Thrift API claims deleteMultiple() returns the list of failed Deletes, but the current implementation throws a TIOError instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6774) Immediate assignment of regions that don't have entries in HLog
[ https://issues.apache.org/jira/browse/HBASE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502845#comment-13502845 ]

nkeywal commented on HBASE-6774:
--

For the master based solution:

If we go for the regionserver - master - zookeeper solution, it's not perfect imho, because we just add an agent in the middle. The master could store the region information without going to ZK:
- Faster than the solution with ZK, because we would not write to the disk.
- If we lose the master, we lose the data, but it's not an issue (just that the recovery will be slower: we will have to read all the logs).
- The master becomes an element of the write path (for the first write in a memstore). I'm not at ease with that.

At the end of the day, I agree with what Stack said previously: let's not add a new component in the write path. This is valid for both the master & ZK.

So we're left with the other options:
- specific WAL for .meta.
- adding meta data at the end of the WAL.

I'm currently looking at them.

Immediate assignment of regions that don't have entries in HLog
---

Key: HBASE-6774
URL: https://issues.apache.org/jira/browse/HBASE-6774
Project: HBase
Issue Type: Improvement
Components: master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal

The algo is today, after a failure detection:
- split the logs
- when all the logs are split, assign the regions

But some regions can have no entries at all in the HLog. There are many reasons for this:
- kind of reference or historical tables. Bulk written sometimes then read only.
- sequential rowkeys. In this case, most of the regions will be read only. But they can be in a regionserver with a lot of writes.
- tables flushed often for safety reasons. I'm thinking about meta here.

For meta, we can imagine flushing very often. Hence, the recovery for meta, in many cases, will be the failure detection time.
There are different possible algos:

Option 1)
A new task is added, in parallel of the split. This task reads all the HLog. If there is no entry for a region, this region is assigned.
Pro: simple
Cons: We will need to read all the files. Add a read.

Option 2)
The master writes in ZK the number of log files, per region.
When the regionserver starts the split, it reads the full block (64M) and decreases the log file counter of the region. If it reaches 0, the assign starts. At the end of its split, the region server decreases the counter as well. This allows starting the assign even if not all the HLog are finished. It would allow making some regions available even if we have an issue in one of the log files.
Pro: parallel
Cons: adds something to do for the region server. Requires reading the whole file before starting to write.

Option 3)
Add some metadata at the end of the log file. The last log file won't have meta data, as if we are recovering, it's because the server crashed. But the others will. And the last log file should be smaller (half a block on average).

Option 4)
Still some metadata, but in a different file.
Cons: writes are increased (but not that much, we just need to write the region once).
Pros: if we lose the HLog files (major failure, no replica available) we can still continue with the regions that were not written at this stage.

I think it should be done, even if none of the algorithms above is totally convincing yet. It's linked as well to locality and short circuit reads: with these two points, reading the file twice becomes much less of an issue, for example. My current preference would be to open the file twice in the region server, once for splitting as of today, once for a quick read looking for unused regions. Who knows, maybe it would even be faster this way: the quick read thread would warm up the different caches for the splitting thread.
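The "quick read looking for unused regions" from Option 1 boils down to a set difference: start with every region hosted by the failed server and remove each region that appears in a log entry. The sketch below illustrates only that step. UneditedRegions and regionsWithoutEdits are hypothetical names, regions are represented by plain strings, and reading real HLog files is out of scope.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of the quick read pass from Option 1. */
final class UneditedRegions {
    /**
     * Returns the regions of the failed server that have no entry in the
     * log, i.e. the regions that could be assigned without waiting for
     * the split to finish.
     *
     * @param regionsOnServer      all regions the failed server was hosting
     * @param regionOfEachLogEntry the region name of each log entry, in order
     */
    static Set<String> regionsWithoutEdits(Collection<String> regionsOnServer,
                                           Iterable<String> regionOfEachLogEntry) {
        Set<String> remaining = new TreeSet<>(regionsOnServer);
        for (String region : regionOfEachLogEntry) {
            remaining.remove(region);
            if (remaining.isEmpty()) {
                break; // every region has edits, no need to read further
            }
        }
        return remaining;
    }
}
```

The pass needs no per-region state beyond the shrinking set, which is what makes the quick read cheap compared with the split itself; the early exit only helps when every region turns out to have edits.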
[jira] [Commented] (HBASE-3533) Allow HBASE_LIBRARY_PATH env var to specify extra locations of native libs
[ https://issues.apache.org/jira/browse/HBASE-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502852#comment-13502852 ]

Andrés Pipicello commented on HBASE-3533:
--

Hi,

It seems the modifications were lost in revision 1150282.
(see http://svn.apache.org/viewvc/hbase/trunk/bin/hbase?r1=1133211&r2=1150282&diff_format=h)

Allow HBASE_LIBRARY_PATH env var to specify extra locations of native libs
--

Key: HBASE-3533
URL: https://issues.apache.org/jira/browse/HBASE-3533
Project: HBase
Issue Type: Improvement
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Attachments: hbase-3533.txt

Would be handy when you have native libs at other spots on the system (e.g. I often want to test hadoop-lzo changes directly out of its build dir)
[jira] [Commented] (HBASE-7205) Coprocessor classloader is replicated for all regions in the HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502856#comment-13502856 ]

Lars Hofhansl commented on HBASE-7205:
--------------------------------------

[~apurtell] Want to have a look?

Coprocessor classloader is replicated for all regions in the HRegionServer
--------------------------------------------------------------------------

Key: HBASE-7205
URL: https://issues.apache.org/jira/browse/HBASE-7205
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 0.92.2, 0.94.2
Reporter: Adrian Muraru
Priority: Critical
Fix For: 0.96.0

HBASE-6308 introduced a new custom CoprocessorClassLoader to load the coprocessor classes, and a new instance of this CL is created for every HRegion opened. This leads to OOME-PermGen when the number of regions goes above hundreds per region server.
Having the table coprocessor jailed in a separate classloader is good; however, we should create only one for all regions of a table in each HRS.
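The "only one classloader for all regions" idea in the description can be sketched as a small cache. This is a minimal illustration, not the HBase fix: the class name `SharedCoprocessorLoaders`, the choice of the jar location as cache key, and the use of a plain `URLClassLoader` in place of CoprocessorClassLoader are all assumptions made for the example.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical cache: one class loader per coprocessor jar, shared by every region that asks. */
final class SharedCoprocessorLoaders {
  private final ConcurrentMap<String, ClassLoader> loaders = new ConcurrentHashMap<>();
  private final ClassLoader parent;

  SharedCoprocessorLoaders(ClassLoader parent) {
    this.parent = parent;
  }

  /** Returns the loader for the given jar, creating it only on the first request. */
  ClassLoader loaderFor(URI jar) {
    return loaders.computeIfAbsent(jar.toString(), key -> {
      try {
        // Assumption: a plain URLClassLoader stands in for the real coprocessor loader.
        return new URLClassLoader(new URL[] { jar.toURL() }, parent);
      } catch (MalformedURLException e) {
        throw new IllegalArgumentException("Bad coprocessor jar location: " + jar, e);
      }
    });
  }

  /** Number of distinct loaders created so far. */
  int size() {
    return loaders.size();
  }
}
```

With such a cache, opening hundreds of regions of the same table would load the coprocessor classes once per region server rather than once per region, which is what keeps PermGen usage bounded.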
[jira] [Updated] (HBASE-7210) Backport HBASE-6059 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-7210:
------------------------------------------

Attachment: 6059-94.patch

Chunhui's 0.94 patch. Please review.

Backport HBASE-6059 to 0.94
---------------------------

Key: HBASE-7210
URL: https://issues.apache.org/jira/browse/HBASE-7210
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.2
Reporter: ramkrishna.s.vasudevan
Fix For: 0.94.4
Attachments: 6059-94.patch

HBASE-6059 seems to be an important issue. Chunhui has already given a patch for 94. Need to rebase if it does not apply cleanly.
Raising a new one as the old issue is already closed.
[jira] [Commented] (HBASE-7205) Coprocessor classloader is replicated for all regions in the HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502881#comment-13502881 ]

Lars Hofhansl commented on HBASE-7205:
--------------------------------------

Looking at HBASE-6308, it seems the memory behavior didn't fundamentally change.
If this is indeed a new problem, I will make this a blocker and sink the current 0.94.3RC for this.

Any patch here should also explore the memory behavior in a test (although I am not entirely sure how one would do that).

Coprocessor classloader is replicated for all regions in the HRegionServer
--------------------------------------------------------------------------

Key: HBASE-7205
URL: https://issues.apache.org/jira/browse/HBASE-7205
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 0.92.2, 0.94.2
Reporter: Adrian Muraru
Priority: Critical
Fix For: 0.96.0, 0.94.4

HBASE-6308 introduced a new custom CoprocessorClassLoader to load the coprocessor classes, and a new instance of this CL is created for every HRegion opened. This leads to OOME-PermGen when the number of regions goes above hundreds per region server.
Having the table coprocessor jailed in a separate classloader is good; however, we should create only one for all regions of a table in each HRS.
[jira] [Updated] (HBASE-7205) Coprocessor classloader is replicated for all regions in the HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-7205:
---------------------------------

Fix Version/s: 0.94.4

Coprocessor classloader is replicated for all regions in the HRegionServer
--------------------------------------------------------------------------

Key: HBASE-7205
URL: https://issues.apache.org/jira/browse/HBASE-7205
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 0.92.2, 0.94.2
Reporter: Adrian Muraru
Priority: Critical
Fix For: 0.96.0, 0.94.4

HBASE-6308 introduced a new custom CoprocessorClassLoader to load the coprocessor classes, and a new instance of this CL is created for every HRegion opened. This leads to OOME-PermGen when the number of regions goes above hundreds per region server.
Having the table coprocessor jailed in a separate classloader is good; however, we should create only one for all regions of a table in each HRS.
[jira] [Commented] (HBASE-7205) Coprocessor classloader is replicated for all regions in the HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502887#comment-13502887 ]

Lars Hofhansl commented on HBASE-7205:
--------------------------------------

The other option is to revert HBASE-6308 from 0.94 and only put that fix into 0.96. (At this point I am favoring that.)

Coprocessor classloader is replicated for all regions in the HRegionServer
--------------------------------------------------------------------------

Key: HBASE-7205
URL: https://issues.apache.org/jira/browse/HBASE-7205
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 0.92.2, 0.94.2
Reporter: Adrian Muraru
Priority: Critical
Fix For: 0.96.0, 0.94.4

HBASE-6308 introduced a new custom CoprocessorClassLoader to load the coprocessor classes, and a new instance of this CL is created for every HRegion opened. This leads to OOME-PermGen when the number of regions goes above hundreds per region server.
Having the table coprocessor jailed in a separate classloader is good; however, we should create only one for all regions of a table in each HRS.
[jira] [Commented] (HBASE-7205) Coprocessor classloader is replicated for all regions in the HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502888#comment-13502888 ]

Lars Hofhansl commented on HBASE-7205:
--------------------------------------

[~jbaldassari]

Coprocessor classloader is replicated for all regions in the HRegionServer
--------------------------------------------------------------------------

Key: HBASE-7205
URL: https://issues.apache.org/jira/browse/HBASE-7205
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 0.92.2, 0.94.2
Reporter: Adrian Muraru
Priority: Critical
Fix For: 0.96.0, 0.94.4

HBASE-6308 introduced a new custom CoprocessorClassLoader to load the coprocessor classes, and a new instance of this CL is created for every HRegion opened. This leads to OOME-PermGen when the number of regions goes above hundreds per region server.
Having the table coprocessor jailed in a separate classloader is good; however, we should create only one for all regions of a table in each HRS.
[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96
[ https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502896#comment-13502896 ]

Jonathan Hsieh commented on HBASE-6055:
---------------------------------------

If I get some +1's or no comments after Monday, I'll update it.

Snapshots in HBase 0.96
-----------------------

Key: HBASE-6055
URL: https://issues.apache.org/jira/browse/HBASE-6055
Project: HBase
Issue Type: New Feature
Components: Client, master, regionserver, snapshots, Zookeeper
Reporter: Jesse Yates
Assignee: Jesse Yates
Fix For: hbase-6055, 0.96.0
Attachments: Snapshots in HBase.docx

Continuation of HBASE-50 for the current trunk. Since the implementation has drastically changed, opening as a new ticket.
[jira] [Created] (HBASE-7212) Globally Barriered Procedure mechanism
Jonathan Hsieh created HBASE-7212:
----------------------------------

Summary: Globally Barriered Procedure mechanism
Key: HBASE-7212
URL: https://issues.apache.org/jira/browse/HBASE-7212
Project: HBase
Issue Type: Sub-task
Components: snapshots
Affects Versions: hbase-6055
Reporter: Jonathan Hsieh
Fix For: hbase-6055

This is a simplified version of what was proposed in HBASE-6573. Instead of claiming to be a 2pc or 3pc implementation (which implies logging at each actor, and recovery operations), this just provides a best-effort global barrier mechanism called a Procedure.

Users need only implement methods to acquireBarrier, to act when insideBarrier, and to releaseBarrier that use the ExternalException cooperative error checking mechanism.

Globally consistent snapshots require the ability to quiesce writes to a set of region servers before the snapshot operation is executed. Also, if any node fails, it needs to be able to notify the others so that they abort. The first cut of other online snapshots doesn't need the full barrier but may still use this for its error propagation mechanisms.

This version removes the extra layer incurred in the previous implementation due to the use of generics, separates the coordinator and members, and reduces the amount of inheritance used in favor of composition.
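The three user-implemented steps named above can be sketched in a few lines. This is only an in-process illustration under stated assumptions: the `BarrierMember` interface and `BarrierCoordinator` class are hypothetical names, members are called sequentially in one JVM rather than across region servers, and a plain `Exception` stands in for the ExternalException mechanism.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical member contract; the method names follow the issue text. */
interface BarrierMember {
  void acquireBarrier() throws Exception;

  void insideBarrier() throws Exception;

  void releaseBarrier();
}

/** Best-effort coordinator: no per-actor logging and no recovery, as in the description. */
final class BarrierCoordinator {
  /** Returns true if every member passed through the barrier, false if the run was aborted. */
  static boolean run(List<BarrierMember> members) {
    List<BarrierMember> acquired = new ArrayList<>();
    try {
      // Phase 1: every member must reach the barrier (for snapshots: quiesce writes).
      for (BarrierMember m : members) {
        m.acquireBarrier();
        acquired.add(m);
      }
      // Phase 2: all members are inside the barrier and do the globally consistent work.
      for (BarrierMember m : members) {
        m.insideBarrier();
      }
      return true;
    } catch (Exception e) {
      // Any failure aborts the whole procedure.
      return false;
    } finally {
      // Release only the members that actually acquired the barrier.
      for (BarrierMember m : acquired) {
        m.releaseBarrier();
      }
    }
  }
}
```

The point of the two loops is that no member enters `insideBarrier` until all members have completed `acquireBarrier`, and a failure in any member prevents the rest from proceeding.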
[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96
[ https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502910#comment-13502910 ]

Matteo Bertozzi commented on HBASE-6055:
----------------------------------------

+1 on separate offline and online, but maybe we can keep a root jira to keep track of all the dependencies, and a general design doc

{code}
+ Snapshot in HBase
|-- HFile Archiver
|-- Offline Snapshot
    |- Offline Snapshot
    |- Cleaner
    |- Restore/Clone
    |- Shell
    |- ...
|-- Online Snapshot
    |- Procedure
    |- Exception Framework
    |- Timestamp Snapshot
    |- ...
{code}

Snapshots in HBase 0.96
-----------------------

Key: HBASE-6055
URL: https://issues.apache.org/jira/browse/HBASE-6055
Project: HBase
Issue Type: New Feature
Components: Client, master, regionserver, snapshots, Zookeeper
Reporter: Jesse Yates
Assignee: Jesse Yates
Fix For: hbase-6055, 0.96.0
Attachments: Snapshots in HBase.docx

Continuation of HBASE-50 for the current trunk. Since the implementation has drastically changed, opening as a new ticket.
[jira] [Commented] (HBASE-7168) [dev] in the script called 'hbase', we don't check for errors when generating the classpath with mvn
[ https://issues.apache.org/jira/browse/HBASE-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502960#comment-13502960 ]

Hudson commented on HBASE-7168:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #271 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/271/])
HBASE-7168 In the script called 'hbase', we don't check the errors when generating the classpath with mvn (Revision 1412565)

Result = FAILURE
nkeywal :
Files :
* /hbase/trunk/bin/hbase

[dev] in the script called 'hbase', we don't check for errors when generating the classpath with mvn
----------------------------------------------------------------------------------------------------

Key: HBASE-7168
URL: https://issues.apache.org/jira/browse/HBASE-7168
Project: HBase
Issue Type: Bug
Components: scripts
Affects Versions: 0.96.0
Environment: dev
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.96.0
Attachments: 7168.v1.patch

When it happens, it's difficult to guess. Let's fix this.
[jira] [Commented] (HBASE-7167) Thrift's deleteMultiple() raises exception instead of returning list of failed deletes
[ https://issues.apache.org/jira/browse/HBASE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502961#comment-13502961 ]

Hudson commented on HBASE-7167:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #271 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/271/])
HBASE-7167 Thrift's deleteMultiple() raises exception instead of returning list of failed deletes (Daniel Gomez) (Revision 1412594)

Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftUtilities.java
* /hbase/trunk/hbase-server/src/main/resources/org/apache/hadoop/hbase/thrift2/hbase.thrift

Thrift's deleteMultiple() raises exception instead of returning list of failed deletes
--------------------------------------------------------------------------------------

Key: HBASE-7167
URL: https://issues.apache.org/jira/browse/HBASE-7167
Project: HBase
Issue Type: Bug
Components: Thrift
Reporter: Daniel Gómez Ferro
Assignee: Daniel Gómez Ferro
Fix For: 0.96.0
Attachments: HBASE-7167.patch, HBASE-7167.patch, HBASE-7167.patch

Thrift API claims deleteMultiple() returns the list of failed Deletes, but the current implementation throws a TIOError instead.
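The contract the description asks for (return the failed deletes rather than throw) can be illustrated with a generic sketch. Nothing here is taken from the attached patch or from the Thrift handler: `MultiDelete`, `RowDeleter`, and the per-item try/catch are hypothetical, chosen only to show the shape of the return value.

```java
import java.util.ArrayList;
import java.util.List;

final class MultiDelete {
  /** Hypothetical stand-in for a single delete call that may fail. */
  interface RowDeleter<D> {
    void delete(D delete) throws Exception;
  }

  /** Applies every delete and returns the ones that failed instead of throwing. */
  static <D> List<D> deleteMultiple(List<D> deletes, RowDeleter<D> deleter) {
    List<D> failed = new ArrayList<>();
    for (D d : deletes) {
      try {
        deleter.delete(d);
      } catch (Exception e) {
        // Collect the failure and keep going, so the caller sees every failed delete.
        failed.add(d);
      }
    }
    return failed;
  }
}
```

A caller can then treat an empty list as full success and retry or report only the returned items.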
[jira] [Updated] (HBASE-7206) External Exception framework v2 (simplifies and replaces HBASE-6571)
[ https://issues.apache.org/jira/browse/HBASE-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-7206:
----------------------------------

Attachment: 121122-external-exceptions.pdf

Attached a quick deck motivating and describing the lifecycle of the external exceptions framework.

External Exception framework v2 (simplifies and replaces HBASE-6571)
--------------------------------------------------------------------

Key: HBASE-7206
URL: https://issues.apache.org/jira/browse/HBASE-7206
Project: HBase
Issue Type: Sub-task
Components: snapshots
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Fix For: hbase-6055, 0.96.0
Attachments: 121122-external-exceptions.pdf

This provides a way of sending exceptions from 'external' threads/processes (not the main executing thread) to others that poll cooperatively for external exceptions. Some examples of how this can be used include: having a separate timeout thread that injects an exception when a time limit has elapsed (TimeoutExceptionInjector, was OperationAttemptTimer), or having an exception from a separate process delivered to a local thread.

This simplified version is centered around the ExternalException class. Instead of using generics and ErrorListener interfaces, this more straightforward implementation eliminates many of the builders/factories and generics.
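The cooperative polling described above can be sketched as follows. This is not the framework's API: `ExternalErrorSink` and `TimeoutInjector` are hypothetical names (the issue itself mentions ExternalException and TimeoutExceptionInjector, whose signatures are not shown here), and a plain `Exception` is used as the carried error.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder that external threads write to and the working thread polls. */
final class ExternalErrorSink {
  private final AtomicReference<Exception> error = new AtomicReference<>();

  /** Called from an external thread; the first reported error wins. */
  void receive(Exception e) {
    error.compareAndSet(null, e);
  }

  /** Called cooperatively by the working thread between steps of its operation. */
  void rethrowIfFailed() throws Exception {
    Exception e = error.get();
    if (e != null) {
      throw e;
    }
  }

  boolean hasFailed() {
    return error.get() != null;
  }
}

/** Hypothetical timeout injector: reports an error once the time limit has elapsed. */
final class TimeoutInjector {
  // Daemon timer so a forgotten injector does not keep the JVM alive.
  private final Timer timer = new Timer("timeout-injector", true);

  void start(ExternalErrorSink sink, long limitMillis) {
    timer.schedule(new TimerTask() {
      @Override
      public void run() {
        sink.receive(new Exception("operation exceeded " + limitMillis + " ms"));
      }
    }, limitMillis);
  }

  /** Stops the timer when the operation finishes in time. */
  void cancel() {
    timer.cancel();
  }
}
```

The working thread never gets interrupted; it only sees the error at the points where it chooses to call `rethrowIfFailed()`, which is what makes the scheme cooperative.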
[jira] [Commented] (HBASE-6774) Immediate assignment of regions that don't have entries in HLog
[ https://issues.apache.org/jira/browse/HBASE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502985#comment-13502985 ]

Devaraj Das commented on HBASE-6774:
------------------------------------

I am starting to prototype the "specific wal for .meta." approach (leveraging the implementation of FSHlog) to get a feel for the complexity, etc. Will keep folks posted (and probably raise a separate jira as well).

Immediate assignment of regions that don't have entries in HLog
---------------------------------------------------------------

Key: HBASE-6774
URL: https://issues.apache.org/jira/browse/HBASE-6774
Project: HBase
Issue Type: Improvement
Components: master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal

The algorithm today, after a failure detection, is:
- split the logs
- when all the logs are split, assign the regions

But some regions can have no entries at all in the HLog. There are many reasons for this:
- kind of reference or historical tables. Bulk written sometimes, then read only.
- sequential rowkeys. In this case, most of the regions will be read only. But they can be in a regionserver with a lot of writes.
- tables flushed often for safety reasons. I'm thinking about meta here.

For meta, we can imagine flushing very often. Hence, the recovery for meta, in many cases, will be the failure detection time.

There are several possible algorithms:

Option 1) A new task is added, in parallel with the split. This task reads all the HLogs. If there is no entry for a region, this region is assigned.
Pro: simple.
Cons: we will need to read all the files, which adds a read.

Option 2) The master writes in ZK the number of log files, per region. When the regionserver starts the split, it reads the full block (64M) and decreases the log file counter of the region. If it reaches 0, the assign starts. At the end of its split, the region server decreases the counter as well. This allows the assign to start even if not all the HLogs are finished. It would make some regions available even if we have an issue in one of the log files.
Pro: parallel.
Cons: adds something to do for the region server. Requires reading the whole file before starting to write.

Option 3) Add some metadata at the end of the log file. The last log file won't have metadata: if we are recovering, it's because the server crashed. But the others will. And the last log file should be smaller (half a block on average).

Option 4) Still some metadata, but in a different file.
Cons: writes are increased (but not that much, we just need to write the region once).
Pros: if we lose the HLog files (major failure, no replica available) we can still continue with the regions that were not written at this stage.

I think it should be done, even if none of the algorithms above is totally convincing yet. It's linked as well to locality and short circuit reads: with these two points, reading the file twice becomes much less of an issue, for example. My current preference would be to open the file twice in the region server, once for splitting as of today, once for a quick read looking for unused regions. Who knows, maybe it would even be faster this way: the quick read thread would warm up the different caches for the splitting thread.
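The "quick read looking for unused regions" from the last paragraph can be sketched as a set difference. This is a minimal illustration under assumptions: the class `UnusedRegionFinder` is hypothetical, regions are identified by plain strings, and the log is modeled as an iterable of region names (one per entry) rather than a real HLog reader.

```java
import java.util.HashSet;
import java.util.Set;

final class UnusedRegionFinder {
  /**
   * @param hostedRegions regions the failed server was hosting
   * @param logEntryRegions the region name of each log entry, in log order
   * @return regions with no log entry, which could be assigned without waiting for the split
   */
  static Set<String> regionsWithoutEdits(Set<String> hostedRegions,
                                         Iterable<String> logEntryRegions) {
    Set<String> unused = new HashSet<>(hostedRegions);
    for (String region : logEntryRegions) {
      unused.remove(region);
      if (unused.isEmpty()) {
        break; // every hosted region has edits, so there is no need to read further
      }
    }
    return unused;
  }
}
```

The early exit matters for the cost discussion above: on a server where every region is written to, the quick read stops as soon as that is established instead of scanning all the files.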
[jira] [Commented] (HBASE-6469) Failure on enable/disable table will cause table state in zk to be left as enabling/disabling until master is restart
[ https://issues.apache.org/jira/browse/HBASE-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503001#comment-13503001 ]

Wataru Yukawa commented on HBASE-6469:
--------------------------------------

Recently, I ran into the following problem:
http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/25974

Environment: HBase 0.94.2

The result of hbase hbck is INCONSISTENT.
The table state in zk is left as ENABLING.
The table is neither enabled nor disabled.

Although I can't reproduce the problem, the following procedure may be related:

{noformat}
hbase(main):003:0> create 'table1', 'rowkey1'
hbase(main):005:0> disable 'table1'
hbase(main):007:0> alter 'table1', METHOD => 'table_att', DEFERRED_LOG_FLUSH => 'true'
hbase(main):011:0> enable 'table1'
//error
hbase(main):009:0> is_enabled 'table1'
false
hbase(main):006:0> is_disabled 'table1'
false
hbase(main):011:0> drop 'table1'
//error
{noformat}

The solution is the following:
* hbase hbck -fix
* HMaster reboot

I hope this information will be helpful.

Failure on enable/disable table will cause table state in zk to be left as enabling/disabling until master is restart
---------------------------------------------------------------------------------------------------------------------

Key: HBASE-6469
URL: https://issues.apache.org/jira/browse/HBASE-6469
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.2, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.96.0, 0.94.4

In the Enable/DisableTableHandler code, if something goes wrong in handling, the table state in zk is left as ENABLING / DISABLING. After that we cannot force any more action from the API or CLI, and the only recovery path is restarting the master.

{code}
if (done) {
  // Flip the table to enabled.
  this.assignmentManager.getZKTable().setEnabledTable(
    this.tableNameStr);
  LOG.info("Table '" + this.tableNameStr
    + "' was successfully enabled. Status: done=" + done);
} else {
  LOG.warn("Table '" + this.tableNameStr
    + "' wasn't successfully enabled. Status: done=" + done);
}
{code}

Here, if done is false, the table state is not changed. There is also no way to set skipTableStateCheck from cli / api.
We have run into this issue a couple of times before.
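One way to avoid leaving a table stuck in a transitional state is to always settle on a terminal state once the handler finishes. The sketch below is an assumption-laden illustration, not the fix adopted for this issue: `TableStateTracker` is a hypothetical in-memory stand-in for the ZK table state, and rolling back to DISABLED on failure is just one possible policy.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.BooleanSupplier;

/** Hypothetical in-memory stand-in for the table state kept in zk. */
final class TableStateTracker {
  enum State { DISABLED, ENABLING, ENABLED }

  private final ConcurrentMap<String, State> states = new ConcurrentHashMap<>();

  void setState(String table, State state) {
    states.put(table, state);
  }

  State getState(String table) {
    return states.get(table);
  }

  /** Marks the table ENABLING, runs the work, then settles on ENABLED or back on DISABLED. */
  boolean enable(String table, BooleanSupplier assignAllRegions) {
    states.put(table, State.ENABLING);
    boolean done = false;
    try {
      done = assignAllRegions.getAsBoolean();
    } finally {
      // Runs on failure and on exception alike, so ENABLING is never left behind.
      states.put(table, done ? State.ENABLED : State.DISABLED);
    }
    return done;
  }
}
```

With this shape, the `done == false` branch from the snippet above would leave the table in a state from which `enable` or `drop` can be retried, instead of requiring a master restart.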