[jira] [Commented] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309619#comment-14309619 ]

Jesse Yates commented on HBASE-12979:

Ah, that's the doc I was looking for! All I could find was the Java3 spec. So I think the above patch is fine... I'll see if we can get a test run going then.

Use setters instead of return values for handing back statistics from HRegion methods

Key: HBASE-12979
URL: https://issues.apache.org/jira/browse/HBASE-12979
Project: HBase
Issue Type: Improvement
Affects Versions: 0.98.10
Reporter: Andrew Purtell
Assignee: Jesse Yates
Labels: phoenix
Fix For: 0.98.10.1
Attachments: hbase-12979-v0-master.patch

In HBASE-5162 (and backports such as HBASE-12729) we modified some HRegion methods to return statistics for consumption by callers. The statistics are ultimately passed back to the client as load feedback. [~lhofhansl] thinks handing back this information as return values from HRegion methods is a weird mix of concerns. This also produced a difficult to anticipate binary compatibility issue with Phoenix. There was no compile time issue because the code of course was not structured to assign from a method returning void, yet the method signature changes so the JVM cannot resolve it if older Phoenix binaries are installed into a 0.98.10 release.

Let's change the HRegion methods back to returning 'void' and use setters instead. Officially we don't support use of HRegion (HBASE-12566) but we do not need to go out of our way to break things (smile) so I would also like to make a patch release containing just this change to help out our sister project.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
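The binary compatibility problem described in this issue comes from how the JVM links calls: a compiled call site records the full method descriptor, return type included. A method that used to return void and now returns an object is therefore a different method to already-compiled callers, even though recompiling the same source would succeed. The sketch below only prints the two descriptors side by side; the class name and the Stats type are hypothetical stand-ins, not HBase code.

```java
import java.lang.invoke.MethodType;

public class ReturnTypeDescriptorDemo {
    // Hypothetical stand-in for a statistics object; not an HBase class.
    static final class Stats {}

    // Builds the JVM descriptor of a method taking one String argument
    // and returning the given type.
    static String descriptor(Class<?> returnType) {
        return MethodType.methodType(returnType, String.class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Descriptor that a caller compiled against the void version links to.
        System.out.println(descriptor(void.class));
        // Descriptor the class actually offers after the return type changed.
        System.out.println(descriptor(Stats.class));
    }
}
```

Because the two strings differ, a caller compiled against the void version cannot be linked against the changed class. Returning to void and publishing the statistics through a setter keeps the old descriptor intact.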
[jira] [Updated] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Yates updated HBASE-12979:

Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-12956) Binding to 0.0.0.0 is broken after HBASE-10569
[ https://issues.apache.org/jira/browse/HBASE-12956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Esteban Gutierrez updated HBASE-12956:

Attachment: HBASE-12956-v3.txt

Binding to 0.0.0.0 is broken after HBASE-10569

Key: HBASE-12956
URL: https://issues.apache.org/jira/browse/HBASE-12956
Project: HBase
Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Blocker
Fix For: 1.0.0, 2.0.0, 1.1.0
Attachments: 0001-HBASE-12956-Binding-to-0.0.0.0-is-broken-after-HBASE.patch, HBASE-12956-v2.txt, HBASE-12956-v3.txt

After the Region Server and Master code was merged, we lost the functionality to bind to 0.0.0.0 via hbase.regionserver.ipc.address and znodes now get created with the wildcard address which means that RSs and the master cannot connect to each other. Thanks to [~dimaspivak] for reporting the issue.
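The failure mode in this issue is that the wildcard address is valid for binding but meaningless as a published address, because a peer cannot connect to 0.0.0.0. The sketch below shows one generic way to keep the two apart. It is not taken from the attached patches; the class, the method name, the fallback hostname, and the port are all illustrative assumptions.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

public class WildcardBindDemo {
    // Chooses the host to publish to peers (for example in a znode).
    // The bind address is used as-is unless it is the wildcard address,
    // in which case a resolvable hostname is published instead.
    static String advertisedHost(InetSocketAddress bindAddress, String fallbackHostname) {
        InetAddress addr = bindAddress.getAddress();
        if (addr == null || addr.isAnyLocalAddress()) {
            return fallbackHostname;
        }
        return addr.getHostAddress();
    }

    public static void main(String[] args) {
        // Port number and hostname are illustrative only.
        InetSocketAddress wildcard = new InetSocketAddress("0.0.0.0", 16020);
        InetSocketAddress specific = new InetSocketAddress("127.0.0.1", 16020);
        System.out.println(advertisedHost(wildcard, "rs1.example.com"));
        System.out.println(advertisedHost(specific, "rs1.example.com"));
    }
}
```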
[jira] [Commented] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309841#comment-14309841 ]

Lars Hofhansl commented on HBASE-12979:

+1 That just looks so much cleaner too. Is that the absolute only place where we need to call this?
[jira] [Commented] (HBASE-12035) Client does an RPC to master everytime a region is relocated
[ https://issues.apache.org/jira/browse/HBASE-12035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309761#comment-14309761 ]

Hadoop QA commented on HBASE-12035:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697068/HBASE-12035.patch against master branch at commit 2583e8de574ae4b002c5dbc80b0da666b42dd699.

ATTACHMENT ID: 12697068

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 42 new or modified tests.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 checkstyle. The applied patch does not increase the total number of checkstyle errors
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 lineLengths. The patch does not introduce lines longer than 100
+1 site. The mvn site goal succeeds with this patch.
-1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestMetaWithReplicas
-1 core zombie tests. There are 1 zombie test(s): at org.apache.hadoop.hbase.namespace.TestNamespaceAuditor.testRegionMerge(TestNamespaceAuditor.java:308)

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12717//console

This message is automatically generated.

Client does an RPC to master everytime a region is relocated

Key: HBASE-12035
URL: https://issues.apache.org/jira/browse/HBASE-12035
Project: HBase
Issue Type: Improvement
Components: Client, master
Affects Versions: 2.0.0
Reporter: Enis Soztutar
Assignee: Andrey Stepachev
Priority: Critical
Fix For: 2.0.0
Attachments: 12035v2.txt, HBASE-12035 (1) (1).patch, HBASE-12035 (1) (1).patch, HBASE-12035 (1).patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch

HBASE-7767 moved table enabled|disabled state to be kept in hdfs instead of zookeeper. isTableDisabled() which is used in HConnectionImplementation.relocateRegion() now became a master RPC call rather than a zookeeper client call. Since we do relocateRegion() calls everytime we want to relocate a region
[jira] [Commented] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309937#comment-14309937 ]

Hadoop QA commented on HBASE-12980:

+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697092/12980.txt against master branch at commit 2583e8de574ae4b002c5dbc80b0da666b42dd699.

ATTACHMENT ID: 12697092

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 checkstyle. The applied patch does not increase the total number of checkstyle errors
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 lineLengths. The patch does not introduce lines longer than 100
+1 site. The mvn site goal succeeds with this patch.
+1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12719//console

This message is automatically generated.

Delete of a table may not clean all rows from hbase:meta

Key: HBASE-12980
URL: https://issues.apache.org/jira/browse/HBASE-12980
Project: HBase
Issue Type: Sub-task
Components: Operability
Reporter: stack
Assignee: stack
Fix For: 1.0.0, 2.0.0, 1.1.0, 0.98.11
Attachments: 12980.txt

One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table.
[jira] [Updated] (HBASE-12035) Client does an RPC to master everytime a region is relocated
[ https://issues.apache.org/jira/browse/HBASE-12035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-12035:

Attachment: HBASE-12035 (2).patch

Client does an RPC to master everytime a region is relocated

Key: HBASE-12035
URL: https://issues.apache.org/jira/browse/HBASE-12035
Project: HBase
Issue Type: Improvement
Components: Client, master
Affects Versions: 2.0.0
Reporter: Enis Soztutar
Assignee: Andrey Stepachev
Priority: Critical
Fix For: 2.0.0
Attachments: 12035v2.txt, HBASE-12035 (1) (1).patch, HBASE-12035 (1) (1).patch, HBASE-12035 (1).patch, HBASE-12035 (2).patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch

HBASE-7767 moved table enabled|disabled state to be kept in HDFS instead of ZooKeeper. isTableDisabled(), which is used in HConnectionImplementation.relocateRegion(), has now become a master RPC call rather than a ZooKeeper client call. Since we call relocateRegion() every time we want to relocate a region (region moved, RS down, etc.), this implies that when the master is down, some of the clients for uncached regions will be affected. See HBASE-7767 and HBASE-11974 for some more background.
[jira] [Commented] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309933#comment-14309933 ]

Andrew Purtell commented on HBASE-12979:

I've thought about returning stats on query results but that should be a separate issue. This one seems good to go.
[jira] [Created] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
stack created HBASE-12980:

Summary: Delete of a table may not clean all rows from hbase:meta
Key: HBASE-12980
URL: https://issues.apache.org/jira/browse/HBASE-12980
Project: HBase
Issue Type: Sub-task
Components: Operability
Reporter: stack
Assignee: stack

One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table.
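The cleanup proposed here amounts to purging by row key rather than by the contents of info:regioninfo, so that rows with an empty or missing column are still removed. The sketch below models hbase:meta as an in-memory sorted map and is not the attached patch. The class, the method, and the sample row keys are hypothetical; the only assumption carried over from HBase is that a region's row key in hbase:meta begins with the table name followed by a comma.

```java
import java.util.TreeMap;

public class MetaPurgeSketch {
    // Removes every row belonging to the table, whatever its column contents,
    // and returns the number of rows removed.
    static int purgeTableRows(TreeMap<String, String> meta, String table) {
        // The trailing comma keeps "t1" from matching rows of table "t10".
        String prefix = table + ",";
        int before = meta.size();
        meta.keySet().removeIf(rowKey -> rowKey.startsWith(prefix));
        return before - meta.size();
    }

    public static void main(String[] args) {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("t1,,1423000000000.aaa.", "regioninfo-bytes");
        meta.put("t1,row500,1423000000001.bbb.", ""); // miswritten, empty info:regioninfo
        meta.put("t10,,1423000000002.ccc.", "regioninfo-bytes");
        System.out.println(purgeTableRows(meta, "t1"));
        System.out.println(meta.keySet());
    }
}
```

A cleanup that first parses info:regioninfo to decide which rows to delete would skip the second row above; matching on the key prefix does not depend on that column.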
[jira] [Commented] (HBASE-11568) Async WAL replication for region replicas
[ https://issues.apache.org/jira/browse/HBASE-11568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309745#comment-14309745 ]

Hudson commented on HBASE-11568:

FAILURE: Integrated in HBase-1.1 #150 (See [https://builds.apache.org/job/HBase-1.1/150/])
HBASE-11568. Addendum to add a file that I missed earlier. (ddas: rev 78c50af3ec2f98053a4c736183b15499e574c113)
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/RegionAdminServiceCallable.java

Async WAL replication for region replicas

Key: HBASE-11568
URL: https://issues.apache.org/jira/browse/HBASE-11568
Project: HBase
Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 2.0.0, 1.1.0
Attachments: 11568-2-branch-1.txt, 11568-branch-1.txt, hbase-11568_v2.patch, hbase-11568_v3.patch

As mentioned in parent issue, and design docs for phase-1 (HBASE-10070) and Phase-2 (HBASE-11183), implement asynchronous WAL replication from the WAL files of the primary region to the secondary region replicas. The WAL replication will build upon the pluggable replication framework introduced in HBASE-11367, and the distributed WAL replay.

Upon having some experience with the patch, we changed the design so that there is only one replication queue for doing the async wal replication to secondary replicas rather than having a queue per region replica. This is due to the fact that, we do not want to tail the logs of every region server for a single region replica.

Handling of flushes/compactions and memstore accounting will be handled in other subtasks.
[jira] [Commented] (HBASE-12891) have hbck do region consistency checks in parallel
[ https://issues.apache.org/jira/browse/HBASE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309764#comment-14309764 ]

Hadoop QA commented on HBASE-12891:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697063/HBASE-12891-v1.patch against master branch at commit 2583e8de574ae4b002c5dbc80b0da666b42dd699.

ATTACHMENT ID: 12697063

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 checkstyle. The applied patch does not increase the total number of checkstyle errors
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 lineLengths. The patch does not introduce lines longer than 100
+1 site. The mvn site goal succeeds with this patch.
+1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12716//console

This message is automatically generated.

have hbck do region consistency checks in parallel

Key: HBASE-12891
URL: https://issues.apache.org/jira/browse/HBASE-12891
Project: HBase
Issue Type: Improvement
Affects Versions: 2.0.0, 0.98.10, 1.1.0
Reporter: churro morales
Assignee: churro morales
Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.11
Attachments: HBASE-12891-v1.patch, HBASE-12891.98.patch, HBASE-12891.patch, HBASE-12891.patch

We have a lot of regions on our cluster ~500k and noticed that hbck took quite some time in checkAndFixConsistency(). [~davelatham] patched our cluster to do this check in parallel to speed things up. I'll attach the patch.
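Running per-region checks in parallel, as this issue proposes for checkAndFixConsistency(), can be sketched with a fixed thread pool. This is a generic illustration and not the attached patch: the class, the checkRegion() placeholder, and the thread count are hypothetical, and a real check would inspect region state instead of the region name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCheckSketch {
    // Hypothetical per-region check; returns true when the region looks consistent.
    static boolean checkRegion(String regionName) {
        return !regionName.isEmpty();
    }

    // Submits one task per region and counts the regions that failed the check.
    static int countInconsistent(List<String> regions, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (String region : regions) {
                tasks.add(() -> checkRegion(region));
            }
            int inconsistent = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Boolean> result : pool.invokeAll(tasks)) {
                if (!result.get()) {
                    inconsistent++;
                }
            }
            return inconsistent;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while checking regions", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("region check failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(countInconsistent(Arrays.asList("r1", "", "r3"), 2));
    }
}
```

With around 500k regions, one task per region keeps the sketch simple; batching several regions per task would reduce scheduling overhead, but that is a tuning choice the issue text does not discuss.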
[jira] [Commented] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309885#comment-14309885 ]

Hadoop QA commented on HBASE-12979:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697077/hbase-12979-v0-master.patch against master branch at commit 2583e8de574ae4b002c5dbc80b0da666b42dd699.

ATTACHMENT ID: 12697077

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 checkstyle. The applied patch does not increase the total number of checkstyle errors
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 lineLengths. The patch does not introduce lines longer than 100
+1 site. The mvn site goal succeeds with this patch.
+1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12718//console

This message is automatically generated.
[jira] [Commented] (HBASE-12978) hbase:meta has a row missing hregioninfo and it causes my long-running job to fail
[ https://issues.apache.org/jira/browse/HBASE-12978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309630#comment-14309630 ]

stack commented on HBASE-12978:

So, this missing row kept showing up in my long-running tests. I thought I had a problem where I could easily manufacture an empty info:regioninfo, but then, comparing notes, I noticed it was the same row showing up across tests; indeed, the delete of a table will leave behind table rows if the info:regioninfo column is missing. Making a subtask to fix this annoyance that hampers testing.

hbase:meta has a row missing hregioninfo and it causes my long-running job to fail

Key: HBASE-12978
URL: https://issues.apache.org/jira/browse/HBASE-12978
Project: HBase
Issue Type: Bug
Reporter: stack
Fix For: 1.0.0

Testing 1.0.0 trying long-running tests. A row in hbase:meta was missing its HRI entry. It caused the job to fail. Around the time of the first task failure, there are balances of the hbase:meta region and it was on a server that crashed. I tried to look at what happened around the time of our writing hbase:meta and I ran into another issue: 20 logs of 256MBs filled with WrongRegionException written over a minute or two. The actual update of hbase:meta was not in the logs; it had been rotated off.
[jira] [Commented] (HBASE-12978) hbase:meta has a row missing hregioninfo and it causes my long-running job to fail
[ https://issues.apache.org/jira/browse/HBASE-12978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309680#comment-14309680 ]

stack commented on HBASE-12978:

It was HBASE-12980, the 'delete of a table may not clean all rows from hbase:meta' that was manifesting as HBASE-12974, 'Opaque AsyncProcess failure...' (not sure how to fix that one yet -- it remains opaque)
[jira] [Updated] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12980: -- Attachment: 12980.txt Patch and a test that fails if we do not do the added cleanup. I think it good to get into 1.0.0 because it was frustrating my testing efforts. Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0, 0.98.11 Attachments: 12980.txt One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
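The cleanup described above (purge every row belonging to the deleted table, whether or not info:regioninfo is present) can be sketched against an in-memory stand-in for hbase:meta. This is only an illustration: the class and method names are hypothetical, the sorted map is not the HBase client API, and the sketch assumes meta row keys start with the table name followed by a comma.

```java
import java.util.Iterator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for hbase:meta: row key -> serialized regioninfo.
// An empty value models a row whose info:regioninfo column came up empty.
final class MetaPurgeSketch {
    private MetaPurgeSketch() {}

    /** Removes every row whose key starts with "table," and returns how many were removed. */
    static int purgeTableRows(NavigableMap<String, String> meta, String table) {
        String prefix = table + ",";
        int removed = 0;
        // Keys sharing the prefix are contiguous in sorted order, starting at the prefix itself.
        Iterator<String> it = meta.tailMap(prefix, true).keySet().iterator();
        while (it.hasNext()) {
            if (!it.next().startsWith(prefix)) {
                break; // walked past the last row of this table
            }
            it.remove(); // removed regardless of what the row holds
            removed++;
        }
        return removed;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> meta = new TreeMap<>();
        meta.put("t1,,1", "");        // row with an empty regioninfo
        meta.put("t1,m,2", "hri");
        meta.put("t10,,3", "hri");    // a different table that shares a name prefix
        System.out.println(purgeTableRows(meta, "t1") + " rows purged, left: " + meta.keySet());
    }
}
```

The point of matching on the row key alone is that the purge does not depend on being able to parse the row's contents, which is exactly the case that left rows behind.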
[jira] [Updated] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12980: -- Fix Version/s: 0.98.11 1.1.0 2.0.0 Status: Patch Available (was: Open) Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0, 0.98.11 Attachments: 12980.txt One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309918#comment-14309918 ] Jesse Yates commented on HBASE-12979: - Well, these are the only places we changed it before. I'll commit this today unless there are any objections. Use setters instead of return values for handing back statistics from HRegion methods - Key: HBASE-12979 URL: https://issues.apache.org/jira/browse/HBASE-12979 Project: HBase Issue Type: Improvement Affects Versions: 0.98.10 Reporter: Andrew Purtell Assignee: Jesse Yates Labels: phoenix Fix For: 0.98.10.1 Attachments: hbase-12979-v0-master.patch In HBASE-5162 (and backports such as HBASE-12729) we modified some HRegion methods to return statistics for consumption by callers. The statistics are ultimately passed back to the client as load feedback. [~lhofhansl] thinks handing back this information as return values from HRegion methods is a weird mix of concerns. This also produced a difficult to anticipate binary compatibility issue with Phoenix. There was no compile time issue because the code of course was not structured to assign from a method returning void, yet the method signature changes so the JVM cannot resolve it if older Phoenix binaries are installed into a 0.98.10 release. Let's change the HRegion methods back to returning 'void' and use setters instead. Officially we don't support use of HRegion (HBASE-12566) but we do not need to go out of our way to break things (smile) so I would also like to make a patch release containing just this change to help out our sister project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
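The binary compatibility problem in the description comes from the JVM resolving methods by name plus descriptor, and the descriptor includes the return type. A minimal sketch of that, assuming a single Object parameter as a placeholder (it is not the real HRegion signature):

```java
import java.lang.invoke.MethodType;

// Illustrative only: the Object parameter stands in for whatever the real
// HRegion methods accept. Only the return type differs between the two shapes.
final class DescriptorDemo {
    private DescriptorDemo() {}

    /** JVM method descriptor for a method taking one Object and returning returnType. */
    static String descriptor(Class<?> returnType) {
        return MethodType.methodType(returnType, Object.class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(void.class));   // shape the older callers were compiled against
        System.out.println(descriptor(Object.class)); // shape after the method returns a value
    }
}
```

A caller compiled against the void form carries the first descriptor in its bytecode, so it fails to link against a class that only offers the second, even though the same source would recompile cleanly. Going back to void and handing the statistics over through a setter keeps the original descriptor intact.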
[jira] [Updated] (HBASE-12070) Add an option to hbck to fix ZK inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-12070: --- Status: Patch Available (was: In Progress) Add an option to hbck to fix ZK inconsistencies --- Key: HBASE-12070 URL: https://issues.apache.org/jira/browse/HBASE-12070 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 1.1.0 Reporter: Sudarshan Kadambi Assignee: Stephen Yuan Jiang Fix For: 1.1.0 Attachments: HBASE-12070.v1-branch-1.patch, HBASE-12070.v2-branch-1.patch If the HMaster bounces in the middle of table creation, we could be left in a state where a znode exists for the table, but that hasn't percolated into META or to HDFS. We've run into this a couple times on our clusters. Once the table is in this state, the only fix is to rm the znode using the zookeeper-client. Doing this manually looks a bit error prone. Could an option be added to hbck to catch and fix such inconsistencies? A more general issue I'd like comment on is whether it makes sense for HMaster to be maintaining its own write-ahead log? The idea would be that on a bounce, the master would discover it was in the middle of creating a table and either rollback or complete that operation? An issue that we observed recently was that a table that was in DISABLING state before a bounce was not in that state after. A write-ahead log to persist table state changes seems useful. Now, all of this state could be in ZK instead of the WAL - it doesn't matter where it gets persisted as long as it does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
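The check requested here amounts to finding tables that have a znode but no counterpart in META. A rough sketch, assuming the two sets of table names have already been collected; the class and method names are hypothetical and this is not how hbck is implemented:

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: how the two input sets get populated (ZooKeeper
// listing, META scan) is left out on purpose.
final class ZkOrphanCheck {
    private ZkOrphanCheck() {}

    /** Returns the tables that have a znode but no entry in META, in sorted order. */
    static Set<String> orphanZnodes(Set<String> tablesInZk, Set<String> tablesInMeta) {
        Set<String> orphans = new TreeSet<>(tablesInZk);
        orphans.removeAll(tablesInMeta);
        return orphans;
    }

    public static void main(String[] args) {
        Set<String> zk = new TreeSet<>(Set.of("t1", "t2", "half_created"));
        Set<String> meta = new TreeSet<>(Set.of("t1", "t2"));
        System.out.println("candidates for znode cleanup: " + orphanZnodes(zk, meta));
    }
}
```

A fix option would then remove the znodes for the returned names, which is the step the description says is currently done by hand with the zookeeper-client.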
[jira] [Updated] (HBASE-12070) Add an option to hbck to fix ZK inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-12070: --- Status: In Progress (was: Patch Available) Add an option to hbck to fix ZK inconsistencies --- Key: HBASE-12070 URL: https://issues.apache.org/jira/browse/HBASE-12070 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 1.1.0 Reporter: Sudarshan Kadambi Assignee: Stephen Yuan Jiang Fix For: 1.1.0 Attachments: HBASE-12070.v1-branch-1.patch, HBASE-12070.v2-branch-1.patch If the HMaster bounces in the middle of table creation, we could be left in a state where a znode exists for the table, but that hasn't percolated into META or to HDFS. We've run into this a couple times on our clusters. Once the table is in this state, the only fix is to rm the znode using the zookeeper-client. Doing this manually looks a bit error prone. Could an option be added to hbck to catch and fix such inconsistencies? A more general issue I'd like comment on is whether it makes sense for HMaster to be maintaining its own write-ahead log? The idea would be that on a bounce, the master would discover it was in the middle of creating a table and either rollback or complete that operation? An issue that we observed recently was that a table that was in DISABLING state before a bounce was not in that state after. A write-ahead log to persist table state changes seems useful. Now, all of this state could be in ZK instead of the WAL - it doesn't matter where it gets persisted as long as it does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12891) Parallel execution for Hbck checkRegionConsistency
[ https://issues.apache.org/jira/browse/HBASE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12891: --- Summary: Parallel execution for Hbck checkRegionConsistency (was: have hbck do region consistency checks in parallel) Parallel execution for Hbck checkRegionConsistency -- Key: HBASE-12891 URL: https://issues.apache.org/jira/browse/HBASE-12891 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0, 0.98.10, 1.1.0 Reporter: churro morales Assignee: churro morales Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.11 Attachments: HBASE-12891-v1.patch, HBASE-12891.98.patch, HBASE-12891.patch, HBASE-12891.patch We have a lot of regions on our cluster ~500k and noticed that hbck took quite some time in checkAndFixConsistency(). [~davelatham] patched our cluster to do this check in parallel to speed things up. I'll attach the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12891) Parallel execution for Hbck checkRegionConsistency
[ https://issues.apache.org/jira/browse/HBASE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310039#comment-14310039 ] Andrew Purtell commented on HBASE-12891: Ok for branch-1.0 [~enis] ? Going to commit this everywhere else shortly unless there is an objection. Parallel execution for Hbck checkRegionConsistency -- Key: HBASE-12891 URL: https://issues.apache.org/jira/browse/HBASE-12891 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0, 0.98.10, 1.1.0 Reporter: churro morales Assignee: churro morales Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.11 Attachments: HBASE-12891-v1.patch, HBASE-12891.98.patch, HBASE-12891.patch, HBASE-12891.patch We have a lot of regions on our cluster ~500k and noticed that hbck took quite some time in checkAndFixConsistency(). [~davelatham] patched our cluster to do this check in parallel to speed things up. I'll attach the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310223#comment-14310223 ] Hudson commented on HBASE-12980: FAILURE: Integrated in HBase-1.1 #151 (See [https://builds.apache.org/job/HBase-1.1/151/]) HBASE-12980 Delete of a table may not clean all rows from hbase:meta (stack: rev 9293bf26ea898ed1cf195ad9c0ef0f7a9cc2e087) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestEnableTableHandler.java Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: 12980.0.98.txt, 12980.txt One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-12979: Fix Version/s: 1.1.0 1.0.1 2.0.0 Use setters instead of return values for handing back statistics from HRegion methods - Key: HBASE-12979 URL: https://issues.apache.org/jira/browse/HBASE-12979 Project: HBase Issue Type: Improvement Affects Versions: 0.98.10 Reporter: Andrew Purtell Assignee: Jesse Yates Labels: phoenix Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.10.1 Attachments: hbase-12979-v0-master.patch In HBASE-5162 (and backports such as HBASE-12729) we modified some HRegion methods to return statistics for consumption by callers. The statistics are ultimately passed back to the client as load feedback. [~lhofhansl] thinks handing back this information as return values from HRegion methods is a weird mix of concerns. This also produced a difficult to anticipate binary compatibility issue with Phoenix. There was no compile time issue because the code of course was not structured to assign from a method returning void, yet the method signature changes so the JVM cannot resolve it if older Phoenix binaries are installed into a 0.98.10 release. Let's change the HRegion methods back to returning 'void' and use setters instead. Officially we don't support use of HRegion (HBASE-12566) but we do not need to go out of our way to break things (smile) so I would also like to make a patch release containing just this change to help out our sister project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12981) FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4
[ https://issues.apache.org/jira/browse/HBASE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309957#comment-14309957 ] Lars Hofhansl commented on HBASE-12981: --- {code} + idx = Math.abs(idx); {code} Whoa, that can make things much worse, no? FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 --- Key: HBASE-12981 URL: https://issues.apache.org/jira/browse/HBASE-12981 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.98.10 Reporter: stack Assignee: stack Fix For: 0.98.11 Attachments: 12981.0.98.txt A user reported the below. It happens after the RS has been running a while. 2015-01-20 22:33:23,031 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) 2015-01-20 22:33:23,035 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: regionserver60020-WAL.AsyncWriter exiting ## Similarly on Node 23 - on 12-20-2014 05:13: 2014-12-20 05:13:40,715 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -3 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) ### Looking in code, I can't see how this could come about other than our write seqid ran over the top of a long (unlikely). I think this is a 0.98 issue since 1.0+ is different here. It does: int index = Math.abs(this.syncRunnerIndex++) % this.syncRunners.length; I'm going to add logging of the circumstance that produces a negative index and then defense against our using negative indices; there could be more going on in here, more than I can see. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
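One reason to be wary of Math.abs as the guard: it returns Integer.MIN_VALUE unchanged, so an expression of the form Math.abs(counter) % length can still go negative once an int counter wraps around. A minimal sketch of a wrap-safe alternative, assuming the index is derived from an int counter and Java 8 or later is available; RingIndex and slot are hypothetical names, not FSHLog code:

```java
// Hypothetical helper, not taken from FSHLog: maps any int counter value
// onto a valid array slot, including after the counter overflows.
final class RingIndex {
    private RingIndex() {}

    /** Always returns a value in [0, length) for length > 0. */
    static int slot(int counter, int length) {
        return Math.floorMod(counter, length);
    }

    public static void main(String[] args) {
        int length = 5;
        // |Integer.MIN_VALUE| does not fit in an int, so Math.abs hands the
        // value back unchanged and the remainder keeps its negative sign.
        System.out.println(Math.abs(Integer.MIN_VALUE) % length);
        // floorMod takes the sign of the divisor, so the result is a usable index.
        System.out.println(slot(Integer.MIN_VALUE, length));
    }
}
```

This only makes the index valid. It says nothing about why the index went negative in the first place, so the logging proposed in the description is still worth having.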
[jira] [Commented] (HBASE-12981) FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4
[ https://issues.apache.org/jira/browse/HBASE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310062#comment-14310062 ] Enis Soztutar commented on HBASE-12981: --- Could this be HBASE-11200 ? We have seen this in action using earlier 0.98 versions. FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 --- Key: HBASE-12981 URL: https://issues.apache.org/jira/browse/HBASE-12981 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.98.10 Reporter: stack Assignee: stack Fix For: 0.98.11 Attachments: 12981.0.98.txt A user reported the below. It happens after the RS has been running a while. 2015-01-20 22:33:23,031 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) 2015-01-20 22:33:23,035 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: regionserver60020-WAL.AsyncWriter exiting ## Similarly on Node 23 - on 12-20-2014 05:13: 2014-12-20 05:13:40,715 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -3 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) ### Looking in code, I can't see how this could come about other than our write seqid ran over the top of a long (unlikely). I think this is a 0.98 issue since 1.0+ is different here. It does: int index = Math.abs(this.syncRunnerIndex++) % this.syncRunners.length; I'm going to add logging of the circumstance that produces a negative index and then defense against our using negative indices; there could be more going on in here, more than I can see. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12981) FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4
[ https://issues.apache.org/jira/browse/HBASE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12981: -- Affects Version/s: (was: 0.98.10) 0.98.1 FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 --- Key: HBASE-12981 URL: https://issues.apache.org/jira/browse/HBASE-12981 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.98.1 Reporter: stack Assignee: stack Attachments: 12981.0.98.txt A user reported the below. It happens after the RS has been running a while. 2015-01-20 22:33:23,031 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) 2015-01-20 22:33:23,035 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: regionserver60020-WAL.AsyncWriter exiting ## Similarly on Node 23 - on 12-20-2014 05:13: 2014-12-20 05:13:40,715 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -3 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) ### Looking in code, I can't see how this could come about other than our write seqid ran over the top of a long (unlikely). I think this is a 0.98 issue since 1.0+ is different here. It does: int index = Math.abs(this.syncRunnerIndex++) % this.syncRunners.length; I'm going to add logging of the circumstance that produces a negative index and then defense against our using negative indices; there could be more going on in here, more than I can see. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12981) FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4
[ https://issues.apache.org/jira/browse/HBASE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12981: -- Status: Patch Available (was: Open) FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 --- Key: HBASE-12981 URL: https://issues.apache.org/jira/browse/HBASE-12981 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.98.10 Reporter: stack Assignee: stack Fix For: 0.98.11 Attachments: 12981.0.98.txt A user reported the below. It happens after the RS has been running a while. 2015-01-20 22:33:23,031 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) 2015-01-20 22:33:23,035 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: regionserver60020-WAL.AsyncWriter exiting ## Similarly on Node 23 - on 12-20-2014 05:13: 2014-12-20 05:13:40,715 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -3 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) ### Looking in code, I can't see how this could come about other than our write seqid ran over the top of a long (unlikely). I think this is a 0.98 issue since 1.0+ is different here. It does: int index = Math.abs(this.syncRunnerIndex++) % this.syncRunners.length; I'm going to add logging of the circumstance that produces a negative index and then defense against our using negative indices; there could be more going on in here, more than I can see. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12956) Binding to 0.0.0.0 is broken after HBASE-10569
[ https://issues.apache.org/jira/browse/HBASE-12956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-12956: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch. Thanks Esteban for the patch, and Dima for reporting. Binding to 0.0.0.0 is broken after HBASE-10569 -- Key: HBASE-12956 URL: https://issues.apache.org/jira/browse/HBASE-12956 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Priority: Blocker Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: 0001-HBASE-12956-Binding-to-0.0.0.0-is-broken-after-HBASE.patch, HBASE-12956-v2.txt, HBASE-12956-v3.txt After the Region Server and Master code was merged, we lost the functionality to bind to 0.0.0.0 via hbase.regionserver.ipc.address and znodes now get created with the wildcard address which means that RSs and the master cannot connect to each other. Thanks to [~dimaspivak] for reporting the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
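The failure mode described here is a wildcard bind address being published as if it were a reachable address. A small sketch of keeping the two apart, assuming the server knows its own hostname; the class, the method and the example hostname are hypothetical and this is not the committed patch:

```java
import java.net.InetSocketAddress;

// Hypothetical helper: picks the host to publish (for example in a znode),
// which may differ from the address the server socket is bound to.
final class AdvertisedAddress {
    private AdvertisedAddress() {}

    /** Returns localHostname when bound to the wildcard address, else the bind host itself. */
    static String advertisedHost(InetSocketAddress bindAddress, String localHostname) {
        if (bindAddress.getAddress() != null && bindAddress.getAddress().isAnyLocalAddress()) {
            // 0.0.0.0 means "all interfaces" and is useless to a remote peer.
            return localHostname;
        }
        return bindAddress.getHostString();
    }

    public static void main(String[] args) {
        InetSocketAddress wildcard = new InetSocketAddress("0.0.0.0", 60020);
        System.out.println(advertisedHost(wildcard, "rs1.example.com"));
    }
}
```

With this split the server can still listen on every interface while peers are handed a name they can actually connect to.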
[jira] [Updated] (HBASE-12891) Parallel execution for Hbck checkRegionConsistency
[ https://issues.apache.org/jira/browse/HBASE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12891: --- Resolution: Fixed Fix Version/s: (was: 1.0.1) Status: Resolved (was: Patch Available) Pushed to 0.98, branch-1, and master. Parallel execution for Hbck checkRegionConsistency -- Key: HBASE-12891 URL: https://issues.apache.org/jira/browse/HBASE-12891 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0, 0.98.10, 1.1.0 Reporter: churro morales Assignee: churro morales Fix For: 2.0.0, 1.1.0, 0.98.11 Attachments: HBASE-12891-v1.patch, HBASE-12891.98.patch, HBASE-12891.patch, HBASE-12891.patch We have a lot of regions on our cluster ~500k and noticed that hbck took quite some time in checkAndFixConsistency(). [~davelatham] patched our cluster to do this check in parallel to speed things up. I'll attach the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310168#comment-14310168 ] Enis Soztutar commented on HBASE-12979: --- Let's get this in this RC. The patch is trivial enough. Use setters instead of return values for handing back statistics from HRegion methods - Key: HBASE-12979 URL: https://issues.apache.org/jira/browse/HBASE-12979 Project: HBase Issue Type: Improvement Affects Versions: 0.98.10 Reporter: Andrew Purtell Assignee: Jesse Yates Labels: phoenix Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.10.1 Attachments: hbase-12979-v0-master.patch In HBASE-5162 (and backports such as HBASE-12729) we modified some HRegion methods to return statistics for consumption by callers. The statistics are ultimately passed back to the client as load feedback. [~lhofhansl] thinks handing back this information as return values from HRegion methods is a weird mix of concerns. This also produced a difficult to anticipate binary compatibility issue with Phoenix. There was no compile time issue because the code of course was not structured to assign from a method returning void, yet the method signature changes so the JVM cannot resolve it if older Phoenix binaries are installed into a 0.98.10 release. Let's change the HRegion methods back to returning 'void' and use setters instead. Officially we don't support use of HRegion (HBASE-12566) but we do not need to go out of our way to break things (smile) so I would also like to make a patch release containing just this change to help out our sister project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12891) Parallel execution for Hbck checkRegionConsistency
[ https://issues.apache.org/jira/browse/HBASE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310174#comment-14310174 ] Enis Soztutar commented on HBASE-12891: --- Patch looks good, but shouldn't we rethrow the exception rather than just logging it? As written, the patch changes the behavior of hbck, which used to throw the exception. Parallel execution for Hbck checkRegionConsistency -- Key: HBASE-12891 URL: https://issues.apache.org/jira/browse/HBASE-12891 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0, 0.98.10, 1.1.0 Reporter: churro morales Assignee: churro morales Fix For: 2.0.0, 1.1.0, 0.98.11 Attachments: HBASE-12891-v1.patch, HBASE-12891.98.patch, HBASE-12891.patch, HBASE-12891.patch We have a lot of regions on our cluster ~500k and noticed that hbck took quite some time in checkAndFixConsistency(). [~davelatham] patched our cluster to do this check in parallel to speed things up. I'll attach the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
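The behavior asked for in this comment (run the per-region checks in parallel, but still surface a failure to the caller instead of only logging it) could look roughly like the following. It is a sketch with hypothetical names, not the attached patch, and it assumes each check can be expressed as a Callable:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not hbck code: fans checks out to a pool and
// rethrows the first failure so callers see the same exception as before.
final class ParallelChecks {
    private ParallelChecks() {}

    static void runAll(List<Callable<Void>> checks, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // invokeAll blocks until every check has finished or failed.
            List<Future<Void>> futures = pool.invokeAll(checks);
            for (Future<Void> future : futures) {
                try {
                    future.get();
                } catch (ExecutionException e) {
                    Throwable cause = e.getCause();
                    if (cause instanceof Exception) {
                        throw (Exception) cause; // unwrap so the original type is preserved
                    }
                    throw e;
                }
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<Void>> checks = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            final int region = i;
            checks.add(() -> {
                System.out.println("checked region " + region);
                return null;
            });
        }
        runAll(checks, 2);
    }
}
```

Unwrapping the ExecutionException matters for compatibility: a caller that previously caught an IOException from the sequential loop keeps working without changes.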
[jira] [Commented] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310110#comment-14310110 ] Hadoop QA commented on HBASE-12980: ---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697157/12980.0.98.txt against 0.98 branch at commit 3b56d2a0bc36f9dcb901bb709b8d9ae58df955ff. ATTACHMENT ID: 12697157
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:red}-1 javac{color}. The patch appears to cause mvn compile goal to fail. Compilation errors resume:
[ERROR] COMPILATION ERROR :
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java:[34,30] error: cannot find symbol
[ERROR] symbol: class MetaTableAccessor
[ERROR] symbol: class Table
[ERROR] symbol: variable MetaTableAccessor
[ERROR] symbol: class Table
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) on project hbase-server: Compilation failure: Compilation failure:
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java:[34,30] error: cannot find symbol
[ERROR] symbol: class MetaTableAccessor
[ERROR] location: package org.apache.hadoop.hbase
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java:[44,37] error: cannot find symbol
[ERROR] symbol: class Table
[ERROR] location: package org.apache.hadoop.hbase.client
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java:[156,21] error: cannot find symbol
[ERROR] symbol: variable MetaTableAccessor
[ERROR] location: class DeleteTableHandler
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java:[157,4] error: cannot find symbol
[ERROR] symbol: class Table
[ERROR] location: class DeleteTableHandler
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java:[158,25] error: cannot find symbol
[ERROR] - [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn goals -rf :hbase-server
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12724//console
This message is automatically generated.
Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: 12980.0.98.txt, 12980.txt One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-12979: Attachment: hbase-12979-v0-0.98.patch Attaching patch for 0.98 Use setters instead of return values for handing back statistics from HRegion methods - Key: HBASE-12979 URL: https://issues.apache.org/jira/browse/HBASE-12979 Project: HBase Issue Type: Improvement Affects Versions: 0.98.10 Reporter: Andrew Purtell Assignee: Jesse Yates Labels: phoenix Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.10.1 Attachments: hbase-12979-v0-0.98.patch, hbase-12979-v0-master.patch In HBASE-5162 (and backports such as HBASE-12729) we modified some HRegion methods to return statistics for consumption by callers. The statistics are ultimately passed back to the client as load feedback. [~lhofhansl] thinks handing back this information as return values from HRegion methods is a weird mix of concerns. This also produced a difficult to anticipate binary compatibility issue with Phoenix. There was no compile time issue because the code of course was not structured to assign from a method returning void, yet the method signature changes so the JVM cannot resolve it if older Phoenix binaries are installed into a 0.98.10 release. Let's change the HRegion methods back to returning 'void' and use setters instead. Officially we don't support use of HRegion (HBASE-12566) but we do not need to go out of our way to break things (smile) so I would also like to make a patch release containing just this change to help out our sister project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310179#comment-14310179 ] Jesse Yates commented on HBASE-12979: - And just committed to branch-1.0 and 0.98. I'll attach the committed patch for 0.98 momentarily, others cherry-picked cleanly Use setters instead of return values for handing back statistics from HRegion methods - Key: HBASE-12979 URL: https://issues.apache.org/jira/browse/HBASE-12979 Project: HBase Issue Type: Improvement Affects Versions: 0.98.10 Reporter: Andrew Purtell Assignee: Jesse Yates Labels: phoenix Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.10.1 Attachments: hbase-12979-v0-0.98.patch, hbase-12979-v0-master.patch In HBASE-5162 (and backports such as HBASE-12729) we modified some HRegion methods to return statistics for consumption by callers. The statistics are ultimately passed back to the client as load feedback. [~lhofhansl] thinks handing back this information as return values from HRegion methods is a weird mix of concerns. This also produced a difficult to anticipate binary compatibility issue with Phoenix. There was no compile time issue because the code of course was not structured to assign from a method returning void, yet the method signature changes so the JVM cannot resolve it if older Phoenix binaries are installed into a 0.98.10 release. Let's change the HRegion methods back to returning 'void' and use setters instead. Officially we don't support use of HRegion (HBASE-12566) but we do not need to go out of our way to break things (smile) so I would also like to make a patch release containing just this change to help out our sister project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309961#comment-14309961 ] Jesse Yates commented on HBASE-12979: - Just committed to master and 1.1.0. I'll wait on the commit to 1.0.1, but I'm assuming [~enis] wants this there as well. Waiting on 0.98 too, just to make sure it makes it into all the upstream branches first. Use setters instead of return values for handing back statistics from HRegion methods - Key: HBASE-12979 URL: https://issues.apache.org/jira/browse/HBASE-12979 Project: HBase Issue Type: Improvement Affects Versions: 0.98.10 Reporter: Andrew Purtell Assignee: Jesse Yates Labels: phoenix Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.10.1 Attachments: hbase-12979-v0-master.patch In HBASE-5162 (and backports such as HBASE-12729) we modified some HRegion methods to return statistics for consumption by callers. The statistics are ultimately passed back to the client as load feedback. [~lhofhansl] thinks handing back this information as return values from HRegion methods is a weird mix of concerns. This also produced a difficult to anticipate binary compatibility issue with Phoenix. There was no compile time issue because the code of course was not structured to assign from a method returning void, yet the method signature changes so the JVM cannot resolve it if older Phoenix binaries are installed into a 0.98.10 release. Let's change the HRegion methods back to returning 'void' and use setters instead. Officially we don't support use of HRegion (HBASE-12566) but we do not need to go out of our way to break things (smile) so I would also like to make a patch release containing just this change to help out our sister project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12070) Add an option to hbck to fix ZK inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-12070: --- Attachment: HBASE-12070.v2-branch-1.patch Add an option to hbck to fix ZK inconsistencies --- Key: HBASE-12070 URL: https://issues.apache.org/jira/browse/HBASE-12070 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 1.1.0 Reporter: Sudarshan Kadambi Assignee: Stephen Yuan Jiang Fix For: 1.1.0 Attachments: HBASE-12070.v1-branch-1.patch, HBASE-12070.v2-branch-1.patch If the HMaster bounces in the middle of table creation, we could be left in a state where a znode exists for the table, but that hasn't percolated into META or to HDFS. We've run into this a couple times on our clusters. Once the table is in this state, the only fix is to rm the znode using the zookeeper-client. Doing this manually looks a bit error prone. Could an option be added to hbck to catch and fix such inconsistencies? A more general issue I'd like comment on is whether it makes sense for HMaster to be maintaining its own write-ahead log? The idea would be that on a bounce, the master would discover it was in the middle of creating a table and either rollback or complete that operation? An issue that we observed recently was that a table that was in DISABLING state before a bounce was not in that state after. A write-ahead log to persist table state changes seems useful. Now, all of this state could be in ZK instead of the WAL - it doesn't matter where it gets persisted as long as it does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310012#comment-14310012 ] Andrew Purtell commented on HBASE-12980: +1 Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0, 0.98.11 Attachments: 12980.txt One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12956) Binding to 0.0.0.0 is broken after HBASE-10569
[ https://issues.apache.org/jira/browse/HBASE-12956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310011#comment-14310011 ] Hadoop QA commented on HBASE-12956: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697106/HBASE-12956-v3.txt against master branch at commit 2583e8de574ae4b002c5dbc80b0da666b42dd699. ATTACHMENT ID: 12697106 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12720//console This message is 
automatically generated. Binding to 0.0.0.0 is broken after HBASE-10569 -- Key: HBASE-12956 URL: https://issues.apache.org/jira/browse/HBASE-12956 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Priority: Blocker Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: 0001-HBASE-12956-Binding-to-0.0.0.0-is-broken-after-HBASE.patch, HBASE-12956-v2.txt, HBASE-12956-v3.txt After the Region Server and Master code was merged, we lost the functionality to bind to 0.0.0.0 via hbase.regionserver.ipc.address and znodes now get created with the wildcard address which means that RSs and the master cannot connect to each other. Thanks to [~dimaspivak] for reporting the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12980: -- Fix Version/s: (was: 0.98.11) It is NOT in 0.98. Too much work backporting. Attaching patch of what I'd done. It is modern APIs and stuff not in 0.98. Will have to find equivalents. Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: 12980.txt One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310038#comment-14310038 ] stack edited comment on HBASE-12980 at 2/6/15 10:47 PM: Thanks [~apurtell] Committed to master and to branch-1. Waiting on [~enis] for clearance on 1.0. was (Author: stack): Thanks [~apurtell] Committed to master, 0.98 and to branch-1. Waiting on [~enis] for clearance on 1.0. Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: 12980.txt One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12981) FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4
[ https://issues.apache.org/jira/browse/HBASE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310096#comment-14310096 ] Sean Busbey commented on HBASE-12981: - {quote} {code} + idx = Math.abs(idx); {code} Whoa, that can make things much worse, no? {quote} At this point idx should be in the range (-asyncSyncers.length, asyncSyncers.length) so taking the absolute value should be okay. Usually I'd do a 0 check and add the divisor, but I don't think it makes a difference unless this is a very hot code path. FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 --- Key: HBASE-12981 URL: https://issues.apache.org/jira/browse/HBASE-12981 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.98.10 Reporter: stack Assignee: stack Fix For: 0.98.11 Attachments: 12981.0.98.txt A user reported the below. It happens after the RS has been running a while. 015-01-20 22:33:23,031 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) 2015-01-20 22:33:23,035 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: regionserver60020-WAL.AsyncWriter exiting ## Similarly on Node 23 - on 12-20-2014 05:13: 2014-12-20 05:13:40,715 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -3 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) ### Looking in code, I can't see how this could come about other than our write seqid ran over the top of a long (unlikely). I think this is a 0.98 issue since 1.0+ is different here. 
It does: int index = Math.abs(this.syncRunnerIndex++) % this.syncRunners.length; I'm going to add logging of the circumstance that produces a negative index and then defense against our using negative indices; there could be more going on in here, more than I can see. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
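The ordering of Math.abs and the modulo discussed in this thread matters at one edge case: Math.abs(Integer.MIN_VALUE) is still negative, so taking the absolute value before the modulo can yield a negative index, while taking it after the modulo cannot, because the remainder is already in (-length, length). The following is a small sketch of the three variants, assuming an int counter and a positive array length. The class and method names are made up for illustration and are not taken from FSHLog, and Math.floorMod requires Java 8.

```java
public class SyncerIndex {
    /** abs before modulo: breaks when counter == Integer.MIN_VALUE. */
    static int absThenMod(int counter, int length) {
        return Math.abs(counter) % length;
    }

    /** modulo before abs: the remainder lies in (-length, length), so abs is safe. */
    static int modThenAbs(int counter, int length) {
        return Math.abs(counter % length);
    }

    /** floorMod always returns a value in [0, length) for a positive length (Java 8+). */
    static int floorMod(int counter, int length) {
        return Math.floorMod(counter, length);
    }

    public static void main(String[] args) {
        int length = 5; // assumed number of syncer threads, for illustration only
        System.out.println(absThenMod(Integer.MIN_VALUE, length)); // prints -3
        System.out.println(modThenAbs(Integer.MIN_VALUE, length)); // prints 3
        System.out.println(floorMod(Integer.MIN_VALUE, length));   // prints 2
        System.out.println(floorMod(-7, length));                  // prints 3
    }
}
```

Note that modThenAbs and floorMod map negative counters to different slots (3 versus 2 above). Both are valid array indices, which is all a round-robin pick like this one needs.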
[jira] [Commented] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310102#comment-14310102 ] Enis Soztutar commented on HBASE-12980: --- +1 for 1.0. Did not cut the RC yet. Thanks Stack. Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: 12980.0.98.txt, 12980.txt One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12035) Client does an RPC to master everytime a region is relocated
[ https://issues.apache.org/jira/browse/HBASE-12035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310097#comment-14310097 ] Hadoop QA commented on HBASE-12035: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697119/HBASE-12035%20%282%29.patch against master branch at commit 2583e8de574ae4b002c5dbc80b0da666b42dd699. ATTACHMENT ID: 12697119 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 42 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.master.TestDistributedLogSplitting org.apache.hadoop.hbase.client.TestMetaWithReplicas org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12721//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12721//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12721//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12721//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12721//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12721//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12721//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12721//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12721//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12721//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12721//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12721//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12721//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12721//console This message is automatically generated. Client does an RPC to master everytime a region is relocated Key: HBASE-12035 URL: https://issues.apache.org/jira/browse/HBASE-12035 Project: HBase Issue Type: Improvement Components: Client, master Affects Versions: 2.0.0 Reporter: Enis Soztutar Assignee: Andrey Stepachev Priority: Critical Fix For: 2.0.0 Attachments: 12035v2.txt, HBASE-12035 (1) (1).patch, HBASE-12035 (1) (1).patch, HBASE-12035 (1).patch, HBASE-12035 (2).patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch HBASE-7767 moved table enabled|disabled state to be kept in hdfs instead of zookeeper. isTableDisabled() which is used in HConnectionImplementation.relocateRegion() now became a master RPC call rather than a zookeeper client call. Since we do relocateRegion() calls everytime we want to relocate a region
[jira] [Commented] (HBASE-12981) FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4
[ https://issues.apache.org/jira/browse/HBASE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310116#comment-14310116 ] stack commented on HBASE-12981: --- Thanks [~enis] I think you are right. I was looking at wrong 0.98 version. This is a patched 0.98.1 (CDH5.1.3). It doesn't have HBASE-11200. Let me close this as a duplicate (Thanks for reviews [~lhofhansl] and [~busbey] -- agree w/ Sean). FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 --- Key: HBASE-12981 URL: https://issues.apache.org/jira/browse/HBASE-12981 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.98.10 Reporter: stack Assignee: stack Fix For: 0.98.11 Attachments: 12981.0.98.txt A user reported the below. It happens after the RS has been running a while. 015-01-20 22:33:23,031 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) 2015-01-20 22:33:23,035 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: regionserver60020-WAL.AsyncWriter exiting ## Similarly on Node 23 - on 12-20-2014 05:13: 2014-12-20 05:13:40,715 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -3 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) ### Looking in code, I can't see how this could come about other than our write seqid ran over the top of a long (unlikely). I think this a 0.98 issue since 1.0+ is different here. It does: int index = Math.abs(this.syncRunnerIndex++) % this.syncRunners.length; I'm going to add logging of the circumstance that produces a negative index and then defense against our using negative indices; there could be more going on in here, more than I can see. 
[jira] [Updated] (HBASE-12954) Ability impaired using HBase on multihomed hosts
[ https://issues.apache.org/jira/browse/HBASE-12954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-12954: --- Attachment: 12954-v12.txt Patch v12 is rebased on master branch. Ability impaired using HBase on multihomed hosts Key: HBASE-12954 URL: https://issues.apache.org/jira/browse/HBASE-12954 Project: HBase Issue Type: Bug Affects Versions: 0.98.4 Reporter: Clay B. Assignee: Ted Yu Priority: Minor Attachments: 12954-v1.txt, 12954-v10.txt, 12954-v11.txt, 12954-v12.txt, 12954-v7.txt, 12954-v8.txt, Hadoop Three Interfaces.png For HBase clusters running on unusual networks (such as NAT'd cloud environments or physical machines with multiple IPs per network interface) it would be ideal to have a way to both specify: # which IP interface to which HBase master or region-server will bind # what hostname HBase will advertise in Zookeeper both for a master or region-server process While efforts such as HBASE-8640 go a long way to normalize these two sources of information, it is not possible in the current design of the properties available to an administrator for these to be unambiguously specified. One has been able to request {{hbase.master.ipc.address}} or {{hbase.regionserver.ipc.address}} but one cannot specify the desired HBase {{hbase.master.hostname}}. (It was removed in HBASE-1357; furthermore, I am unaware of a region-server equivalent.) I use a configuration management system to generate all of my configuration files on a per-machine basis. As such, an option to generate a file specifying exactly which hostname to use would be helpful. Today, specifying the bind address for HBase works and one can use an HBase-only DNS for faking what to put in Zookeeper but this is far from ideal. Network interfaces have no intrinsic IP address, nor hostname. Specifying a DNS server is awkward as the DNS server may differ from the system's resolver and is a single IP address. Similarly, on hosts which use a transient VIP (e.g. 
through keepalived) for other services, it means there's a seemingly non-deterministic hostname choice made by HBase depending on the state of the VIP at daemon start-up time. I will attach two networking examples I use which become very difficult to manage under the current properties. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12981) FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4
[ https://issues.apache.org/jira/browse/HBASE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12981: -- Resolution: Duplicate Fix Version/s: (was: 0.98.11) Status: Resolved (was: Patch Available) Dup of HBASE-11200 FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 --- Key: HBASE-12981 URL: https://issues.apache.org/jira/browse/HBASE-12981 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.98.1 Reporter: stack Assignee: stack Attachments: 12981.0.98.txt A user reported the below. It happens after the RS has been running a while. 015-01-20 22:33:23,031 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) 2015-01-20 22:33:23,035 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: regionserver60020-WAL.AsyncWriter exiting ## Similarly on Node 23 - on 12-20-2014 05:13: 2014-12-20 05:13:40,715 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -3 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at java.lang.Thread.run(Thread.java:745) ### Looking in code, I can't see how this could come about other than our write seqid ran over the top of a long (unlikely). I think this a 0.98 issue since 1.0+ is different here. It does: int index = Math.abs(this.syncRunnerIndex++) % this.syncRunners.length; I'm going to add logging of the circumstance that produces a negative index and then defense against our using negative indices; there could be more going on in here, more than I can see. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12981) FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4
[ https://issues.apache.org/jira/browse/HBASE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310226#comment-14310226 ] Hadoop QA commented on HBASE-12981: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697132/12981.0.98.txt against 0.98 branch at commit 75148385ee5b2065992aea19a810436196576f20. ATTACHMENT ID: 12697132 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 23 warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 42 release audit warnings (more than the master's current 0 warnings). {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): at org.apache.hadoop.hbase.TestChoreService.testCorePoolIncrease(TestChoreService.java:395) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/patchReleaseAuditWarnings.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/checkstyle-aggregate.html Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12722//console This message is automatically generated. FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 --- Key: HBASE-12981 URL: https://issues.apache.org/jira/browse/HBASE-12981 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.98.1 Reporter: stack Assignee: stack Attachments: 12981.0.98.txt A user reported the below. It happens after the RS has been running a while. 015-01-20 22:33:23,031 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149) at
[jira] [Commented] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310224#comment-14310224 ] Hudson commented on HBASE-12979: FAILURE: Integrated in HBase-1.1 #151 (See [https://builds.apache.org/job/HBase-1.1/151/]) HBASE-12979 Use setters instead of return values for handing back statistics from HRegion methods (jyates: rev 073badfd7f91ae22fad71b14a58e2b17a2f956cb) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java Use setters instead of return values for handing back statistics from HRegion methods - Key: HBASE-12979 URL: https://issues.apache.org/jira/browse/HBASE-12979 Project: HBase Issue Type: Improvement Affects Versions: 0.98.10 Reporter: Andrew Purtell Assignee: Jesse Yates Labels: phoenix Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.10.1 Attachments: hbase-12979-v0-0.98.patch, hbase-12979-v0-master.patch In HBASE-5162 (and backports such as HBASE-12729) we modified some HRegion methods to return statistics for consumption by callers. The statistics are ultimately passed back to the client as load feedback. [~lhofhansl] thinks handing back this information as return values from HRegion methods is a weird mix of concerns. This also produced a difficult to anticipate binary compatibility issue with Phoenix. There was no compile time issue because the code of course was not structured to assign from a method returning void, yet the method signature changes so the JVM cannot resolve it if older Phoenix binaries are installed into a 0.98.10 release. Let's change the HRegion methods back to returning 'void' and use setters instead. Officially we don't support use of HRegion (HBASE-12566) but we do not need to go out of our way to break things (smile) so I would also like to make a patch release containing just this change to help out our sister project. 
[jira] [Updated] (HBASE-12981) FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4
[ https://issues.apache.org/jira/browse/HBASE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-12981:
--
Attachment: 12981.0.98.txt

Patch for 0.98. Logs (every minute) if we come across a negative index. Outputs the inputs used in the modulo so we can get a better understanding.

FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4
---
Key: HBASE-12981
URL: https://issues.apache.org/jira/browse/HBASE-12981
Project: HBase
Issue Type: Bug
Components: wal
Affects Versions: 0.98.10
Reporter: stack
Assignee: stack
Fix For: 0.98.11
Attachments: 12981.0.98.txt

A user reported the below. It happens after the RS has been running a while.

2015-01-20 22:33:23,031 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -4
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149)
    at java.lang.Thread.run(Thread.java:745)
2015-01-20 22:33:23,035 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: regionserver60020-WAL.AsyncWriter exiting

## Similarly on Node 23 - on 12-20-2014 05:13:

2014-12-20 05:13:40,715 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED java.lang.ArrayIndexOutOfBoundsException: -3
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1149)
    at java.lang.Thread.run(Thread.java:745)

###

Looking in code, I can't see how this could come about other than our write seqid ran over the top of a long (unlikely). I think this is a 0.98 issue since 1.0+ is different here. It does:

int index = Math.abs(this.syncRunnerIndex++) % this.syncRunners.length;

I'm going to add logging of the circumstance that produces a negative index and then defense against our using negative indices; there could be more going on in here, more than I can see.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
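For anyone following along, here is a minimal sketch of how a modulo can hand back a negative array index in Java at all. It is not the FSHLog code and makes no claim about the root cause in the report above; the class name, the `safeIndex` helper, and the runner count of 5 are all made up for illustration. Java's `%` keeps the sign of the dividend, and `Math.abs(Integer.MIN_VALUE)` is still negative, so one possible defense is `Math.floorMod` (Java 8+).

```java
// Illustrative only: names and the array length are assumptions, not HBase code.
class NegativeIndexDemo {

    /** Hypothetical helper: maps any counter, negative or not, onto [0, length). */
    static int safeIndex(long counter, int length) {
        return (int) Math.floorMod(counter, (long) length);
    }

    public static void main(String[] args) {
        int length = 5; // assumed number of slots

        // The remainder operator keeps the sign of the dividend.
        System.out.println(-4 % length); // -4

        // Math.abs cannot represent |Integer.MIN_VALUE|, so the result stays negative.
        System.out.println(Math.abs(Integer.MIN_VALUE) % length); // -3

        // floorMod always lands inside the array bounds for a positive length.
        System.out.println(safeIndex(-4L, length)); // 1
        System.out.println(safeIndex(Integer.MIN_VALUE, length)); // 2
    }
}
```

Logging the dividend and the divisor whenever the plain `%` result is negative, as the 0.98 patch describes, would tell us which of these cases (if either) is actually being hit.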
[jira] [Commented] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310038#comment-14310038 ] stack commented on HBASE-12980: --- Thanks [~apurtell] Committed to master, 0.98 and to branch-1. Waiting on [~enis] for clearance on 1.0. Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0, 0.98.11 Attachments: 12980.txt One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12980: -- Attachment: 12980.0.98.txt Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: 12980.0.98.txt, 12980.txt One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310134#comment-14310134 ] stack commented on HBASE-12980: --- Pushed to branch-1.0. Thanks [~enis] Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: 12980.0.98.txt, 12980.txt One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310269#comment-14310269 ] Hadoop QA commented on HBASE-12979: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697171/hbase-12979-v0-0.98.patch against 0.98 branch at commit 1426f85b15b5b613e0d36693758ce2fc2ade82bf. ATTACHMENT ID: 12697171 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12728//console This message is automatically generated. Use setters instead of return values for handing back statistics from HRegion methods - Key: HBASE-12979 URL: https://issues.apache.org/jira/browse/HBASE-12979 Project: HBase Issue Type: Improvement Affects Versions: 0.98.10 Reporter: Andrew Purtell Assignee: Jesse Yates Labels: phoenix Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.10.1 Attachments: hbase-12979-v0-0.98.patch, hbase-12979-v0-master.patch In HBASE-5162 (and backports such as HBASE-12729) we modified some HRegion methods to return statistics for consumption by callers. The statistics are ultimately passed back to the client as load feedback. [~lhofhansl] thinks handing back this information as return values from HRegion methods is a weird mix of concerns. This also produced a difficult to anticipate binary compatibility issue with Phoenix. 
There was no compile time issue because the code of course was not structured to assign from a method returning void, yet the method signature changes so the JVM cannot resolve it if older Phoenix binaries are installed into a 0.98.10 release. Let's change the HRegion methods back to returning 'void' and use setters instead. Officially we don't support use of HRegion (HBASE-12566) but we do not need to go out of our way to break things (smile) so I would also like to make a patch release containing just this change to help out our sister project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
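To make the binary compatibility point concrete: the JVM links a call site by method name plus descriptor, and the descriptor includes the return type. A caller compiled against a `void` method therefore stops resolving (typically with a NoSuchMethodError) once the method returns something, even though the same source would recompile cleanly. The sketch below only prints the two descriptors using the JDK's `MethodType`; the class name, the `descriptor` helper, and the parameter type are placeholders and not the real HRegion signatures.

```java
import java.lang.invoke.MethodType;

// Illustrative only: String/Object stand in for the real HRegion parameter and return types.
class DescriptorDemo {

    /** Builds the JVM method descriptor for the given return and parameter types. */
    static String descriptor(Class<?> returnType, Class<?>... params) {
        return MethodType.methodType(returnType, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // What an older caller was compiled against: a method returning void.
        System.out.println(descriptor(void.class, String.class));   // (Ljava/lang/String;)V

        // The same method after it starts returning a value: a different descriptor.
        System.out.println(descriptor(Object.class, String.class)); // (Ljava/lang/String;)Ljava/lang/Object;
    }
}
```

Going back to `void` plus a setter keeps the descriptor that older binaries already reference.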
[jira] [Commented] (HBASE-12974) Opaque AsyncProcess failure: RetriesExhaustedWithDetailsException but no detail
[ https://issues.apache.org/jira/browse/HBASE-12974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310279#comment-14310279 ]

stack commented on HBASE-12974:
---
Thanks [~nkeywal] It was an odd case that provoked this. Let me see if I can improve it... (Yeah, I see how after ten times we get more detail... nice)

Opaque AsyncProcess failure: RetriesExhaustedWithDetailsException but no detail
---
Key: HBASE-12974
URL: https://issues.apache.org/jira/browse/HBASE-12974
Project: HBase
Issue Type: Sub-task
Components: integration tests
Affects Versions: 1.0.0
Reporter: stack
Assignee: stack

I'm trying to do longer running tests but when I up the numbers for a task I run into this:

{code}
2015-02-04 15:35:10,267 FATAL [IPC Server handler 17 on 43975] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1419986015214_0204_m_02_3 - exited : org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: IOException: 1 time,
    at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:227)
    at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1700(AsyncProcess.java:207)
    at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1658)
    at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:208)
    at org.apache.hadoop.hbase.client.BufferedMutatorImpl.doMutate(BufferedMutatorImpl.java:141)
    at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:98)
    at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.persist(IntegrationTestBigLinkedList.java:449)
    at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.map(IntegrationTestBigLinkedList.java:407)
    at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.map(IntegrationTestBigLinkedList.java:355)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
{code}

It's telling me an action failed, but only 1 time and with an empty IOE? I'm kinda stumped. Starting up this issue to see if I can get to the bottom of it.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12954) Ability impaired using HBase on multihomed hosts
[ https://issues.apache.org/jira/browse/HBASE-12954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-12954:
---
Attachment: 12954-v12.txt

Ability impaired using HBase on multihomed hosts
---
Key: HBASE-12954
URL: https://issues.apache.org/jira/browse/HBASE-12954
Project: HBase
Issue Type: Bug
Affects Versions: 0.98.4
Reporter: Clay B.
Assignee: Ted Yu
Priority: Minor
Attachments: 12954-v1.txt, 12954-v10.txt, 12954-v11.txt, 12954-v12.txt, 12954-v12.txt, 12954-v7.txt, 12954-v8.txt, Hadoop Three Interfaces.png

For HBase clusters running on unusual networks (such as NAT'd cloud environments or physical machines with multiple IPs per network interface) it would be ideal to have a way to specify both:
# which IP interface the HBase master or region-server will bind to
# what hostname HBase will advertise in Zookeeper, for both master and region-server processes

While efforts such as HBASE-8640 go a long way to normalize these two sources of information, the properties currently available to an administrator do not allow them to be specified unambiguously. One has been able to request {{hbase.master.ipc.address}} or {{hbase.regionserver.ipc.address}} but one cannot specify the desired HBase {{hbase.master.hostname}}. (It was removed in HBASE-1357; further, I am unaware of a region-server equivalent.)

I use a configuration management system to generate all of my configuration files on a per-machine basis. As such, an option to generate a file specifying exactly which hostname to use would be helpful.

Today, specifying the bind address for HBase works and one can use an HBase-only DNS for faking what to put in Zookeeper, but this is far from ideal. Network interfaces have no intrinsic IP address, nor hostname. Specifying a DNS server is awkward as the DNS server may differ from the system's resolver and is a single IP address. Similarly, on hosts which use a transient VIP (e.g. through keepalived) for other services, it means there's a seemingly non-deterministic hostname choice made by HBase depending on the state of the VIP at daemon start-up time.

I will attach two networking examples I use which become very difficult to manage under the current properties.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12954) Ability impaired using HBase on multihomed hosts
[ https://issues.apache.org/jira/browse/HBASE-12954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310277#comment-14310277 ]

Ted Yu commented on HBASE-12954:
---
{code}
python dev-support/findHangingTests.py https://builds.apache.org/job/PreCommit-HBASE-Build/12725/console
Fetching the console output from the URL
Printing hanging tests
Hanging test : org.apache.hadoop.hbase.TestChoreService
Printing Failing tests
{code}
Not related to the patch.

Ability impaired using HBase on multihomed hosts
---
Key: HBASE-12954
URL: https://issues.apache.org/jira/browse/HBASE-12954
Project: HBase
Issue Type: Bug
Affects Versions: 0.98.4
Reporter: Clay B.
Assignee: Ted Yu
Priority: Minor
Attachments: 12954-v1.txt, 12954-v10.txt, 12954-v11.txt, 12954-v12.txt, 12954-v12.txt, 12954-v7.txt, 12954-v8.txt, Hadoop Three Interfaces.png

For HBase clusters running on unusual networks (such as NAT'd cloud environments or physical machines with multiple IPs per network interface) it would be ideal to have a way to specify both:
# which IP interface the HBase master or region-server will bind to
# what hostname HBase will advertise in Zookeeper, for both master and region-server processes

While efforts such as HBASE-8640 go a long way to normalize these two sources of information, the properties currently available to an administrator do not allow them to be specified unambiguously. One has been able to request {{hbase.master.ipc.address}} or {{hbase.regionserver.ipc.address}} but one cannot specify the desired HBase {{hbase.master.hostname}}. (It was removed in HBASE-1357; further, I am unaware of a region-server equivalent.)

I use a configuration management system to generate all of my configuration files on a per-machine basis. As such, an option to generate a file specifying exactly which hostname to use would be helpful.

Today, specifying the bind address for HBase works and one can use an HBase-only DNS for faking what to put in Zookeeper, but this is far from ideal. Network interfaces have no intrinsic IP address, nor hostname. Specifying a DNS server is awkward as the DNS server may differ from the system's resolver and is a single IP address. Similarly, on hosts which use a transient VIP (e.g. through keepalived) for other services, it means there's a seemingly non-deterministic hostname choice made by HBase depending on the state of the VIP at daemon start-up time.

I will attach two networking examples I use which become very difficult to manage under the current properties.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12035) Client does an RPC to master everytime a region is relocated
[ https://issues.apache.org/jira/browse/HBASE-12035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-12035:
--
Attachment: HBASE-12035 (2).patch

Retry. The failures do not seem related, being classic flakies.

Client does an RPC to master everytime a region is relocated
---
Key: HBASE-12035
URL: https://issues.apache.org/jira/browse/HBASE-12035
Project: HBase
Issue Type: Improvement
Components: Client, master
Affects Versions: 2.0.0
Reporter: Enis Soztutar
Assignee: Andrey Stepachev
Priority: Critical
Fix For: 2.0.0
Attachments: 12035v2.txt, HBASE-12035 (1) (1).patch, HBASE-12035 (1) (1).patch, HBASE-12035 (1).patch, HBASE-12035 (2).patch, HBASE-12035 (2).patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch, HBASE-12035.patch

HBASE-7767 moved table enabled|disabled state to be kept in hdfs instead of zookeeper. isTableDisabled(), which is used in HConnectionImplementation.relocateRegion(), has now become a master RPC call rather than a zookeeper client call. Since we call relocateRegion() every time we want to relocate a region (region moved, RS down, etc.), this implies that when the master is down, some of the clients accessing uncached regions will be affected. See HBASE-7767 and HBASE-11974 for some more background.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310290#comment-14310290 ] stack commented on HBASE-12980: --- +1 [~apurtell] You are a better man than me. Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: 12980.txt, HBASE-12980-0.98.patch One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12891) Parallel execution for Hbck checkRegionConsistency
[ https://issues.apache.org/jira/browse/HBASE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310293#comment-14310293 ] Andrew Purtell commented on HBASE-12891: bq. should not we rethrow the exception, rather than just printing a log for it. It changes the behavior for hbck where it used to throw the exception. I think if that's a concern we could make the change in a quick followup Parallel execution for Hbck checkRegionConsistency -- Key: HBASE-12891 URL: https://issues.apache.org/jira/browse/HBASE-12891 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0, 0.98.10, 1.1.0 Reporter: churro morales Assignee: churro morales Fix For: 2.0.0, 1.1.0, 0.98.11 Attachments: HBASE-12891-v1.patch, HBASE-12891.98.patch, HBASE-12891.patch, HBASE-12891.patch We have a lot of regions on our cluster ~500k and noticed that hbck took quite some time in checkAndFixConsistency(). [~davelatham] patched our cluster to do this check in parallel to speed things up. I'll attach the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
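On the rethrow-versus-log question, a rough sketch of what "check in parallel but still surface the exception" could look like with plain java.util.concurrent. This is not the HBASE-12891 patch; `ParallelCheckDemo`, `checkAll`, and the per-region callables are stand-ins invented for illustration. The point is only that `Future.get()` rethrows a task failure as an ExecutionException, so the caller can keep the old throwing behavior instead of only logging.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: each Callable stands in for one region consistency check.
class ParallelCheckDemo {

    /** Runs all checks on a fixed pool; returns how many passed, rethrows the first failure. */
    static int checkAll(List<Callable<Boolean>> checks, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int passed = 0;
            for (Future<Boolean> result : pool.invokeAll(checks)) {
                // get() throws ExecutionException if the task threw, so nothing is swallowed.
                if (result.get()) {
                    passed++;
                }
            }
            return passed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<Boolean>> checks = new ArrayList<>();
        checks.add(() -> true);
        checks.add(() -> true);
        checks.add(() -> false);
        System.out.println(checkAll(checks, 2)); // 2

        checks.add(() -> { throw new IllegalStateException("region check failed"); });
        try {
            checkAll(checks, 2);
        } catch (ExecutionException e) {
            System.out.println(e.getCause().getMessage()); // region check failed
        }
    }
}
```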
[jira] [Commented] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
[ https://issues.apache.org/jira/browse/HBASE-12949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310301#comment-14310301 ]

Jerry He commented on HBASE-12949:
--
In this particular case, I have not determined why the corruption happened.

Scanner can be stuck in infinite loop if the HFile is corrupted
---
Key: HBASE-12949
URL: https://issues.apache.org/jira/browse/HBASE-12949
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.3, 0.98.10
Reporter: Jerry He
Attachments: HBASE-12949-master.patch

We've encountered a problem where compaction hangs and never completes. After looking into it further, we found that the compaction scanner was stuck in an infinite loop. See stack below.

{noformat}
org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:296)
org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:257)
org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:697)
org.apache.hadoop.hbase.regionserver.StoreScanner.seekToNextRow(StoreScanner.java:672)
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:223)
{noformat}

We identified the hfile that seems to be corrupted. Using HFile tool shows the following:

{noformat}
[biadmin@hdtest009 bin]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -v -k -m -f /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
15/01/23 11:53:17 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
15/01/23 11:53:18 INFO util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
15/01/23 11:53:18 INFO util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C
15/01/23 11:53:18 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
Scanning - /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
WARNING, previous row is greater then current row
    filename - /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
    previous - \x00/20110203-094231205-79442793-1410161293068203000\x0Aattributes16794406\x00\x00\x01\x00\x00\x00\x00\x00\x00
    current -
Exception in thread main java.nio.BufferUnderflowException
    at java.nio.Buffer.nextGetIndex(Buffer.java:489)
    at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:347)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:856)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:768)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:362)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:262)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:220)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.main(HFilePrettyPrinter.java:539)
    at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:802)
{noformat}

Turning on Java Assert shows the following:

{noformat}
Exception in thread main java.lang.AssertionError: Key 20110203-094231205-79442793-1410161293068203000/attributes:16794406/1099511627776/Minimum/vlen=15/mvcc=0 followed by a smaller key //0/Minimum/vlen=0/mvcc=0 in cf attributes
    at org.apache.hadoop.hbase.regionserver.StoreScanner.checkScanOrder(StoreScanner.java:672)
{noformat}

It shows that the hfile seems to be corrupted -- the keys don't seem to be right. But the Scanner is not able to give a meaningful error; instead it gets stuck in an infinite loop here:

{code}
KeyValueHeap.generalizedSeek()
while ((scanner = heap.poll()) != null) {
}
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
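One way to picture a "meaningful error instead of a hang" is an always-on version of the ordering check that the Java assert performed above: fail fast with an IOException as soon as a key sorts before its predecessor. The sketch below is a generic illustration over an iterator, not the attached patch and not the KeyValueHeap or StoreScanner code; `OrderCheckDemo` and `scanChecked` are made-up names, and plain strings stand in for KeyValues.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;

// Illustrative only: strings stand in for KeyValues, a Comparator for the KV comparator.
class OrderCheckDemo {

    /** Walks the keys and throws as soon as one sorts before the key that preceded it. */
    static <T> int scanChecked(Iterator<T> keys, Comparator<? super T> cmp) throws IOException {
        T prev = null;
        int count = 0;
        while (keys.hasNext()) {
            T cur = keys.next();
            if (prev != null && cmp.compare(prev, cur) > 0) {
                // Surface the corruption to the caller rather than continuing to seek.
                throw new IOException("Key " + prev + " followed by a smaller key " + cur);
            }
            prev = cur;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Comparator<String> cmp = Comparator.naturalOrder();
        try {
            System.out.println(scanChecked(Arrays.asList("a", "b", "c").iterator(), cmp)); // 3
            scanChecked(Arrays.asList("b", "a").iterator(), cmp);
        } catch (IOException e) {
            System.out.println(e.getMessage()); // Key b followed by a smaller key a
        }
    }
}
```

Whether a check like this belongs in the scanner hot path is a separate cost question; the sketch only shows the shape of the failure mode.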
[jira] [Commented] (HBASE-11567) Write bulk load COMMIT events to WAL
[ https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310302#comment-14310302 ]

Enis Soztutar commented on HBASE-11567:
---
Can you change the following:

{code}
+WALKey key = new WALKey(info.getEncodedNameAsBytes(), tn);
{code}

to

{code}
// we use HLogKey here instead of WALKey directly to support legacy coprocessors.
WALKey key = new HLogKey(info.getEncodedNameAsBytes(), tn);
{code}

The other similar methods in WALUtil (writeCompactionMarker) do that. Other than that +1. I have checked the PB refactor I was talking about earlier. It seems it is compatible in wire format, and I also tried manually serializing and deserializing with this change.

Write bulk load COMMIT events to WAL
---
Key: HBASE-11567
URL: https://issues.apache.org/jira/browse/HBASE-11567
Project: HBase
Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Alex Newman
Attachments: HBASE-11567-v1.patch, HBASE-11567-v2.patch, HBASE-11567-v4-rebase.patch, hbase-11567-v3.patch, hbase-11567-v4.patch

Similar to writing flush (HBASE-11511), compaction (HBASE-2231) and region open/close (HBASE-11512) events to WAL, we should persist bulk load events to WAL. This is especially important for secondary region replicas, since we can use this information to pick up primary regions' files from secondary replicas. A design doc for secondary replica replication can be found at HBASE-11183.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
[ https://issues.apache.org/jira/browse/HBASE-12949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310299#comment-14310299 ]

stack commented on HBASE-12949:
---
So, what are you thinking on this issue [~jerryhe]? How come we get an infinite loop if we were supposed to have fallen back on hdfs block? Maybe that didn't happen in this case? Or we need to throw a more violent exception?

Scanner can be stuck in infinite loop if the HFile is corrupted
---
Key: HBASE-12949
URL: https://issues.apache.org/jira/browse/HBASE-12949
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.3, 0.98.10
Reporter: Jerry He
Attachments: HBASE-12949-master.patch

We've encountered a problem where compaction hangs and never completes. After looking into it further, we found that the compaction scanner was stuck in an infinite loop. See stack below.

{noformat}
org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:296)
org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:257)
org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:697)
org.apache.hadoop.hbase.regionserver.StoreScanner.seekToNextRow(StoreScanner.java:672)
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:223)
{noformat}

We identified the hfile that seems to be corrupted. Using HFile tool shows the following:

{noformat}
[biadmin@hdtest009 bin]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -v -k -m -f /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
15/01/23 11:53:17 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
15/01/23 11:53:18 INFO util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
15/01/23 11:53:18 INFO util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C
15/01/23 11:53:18 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
Scanning - /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
WARNING, previous row is greater then current row
    filename - /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
    previous - \x00/20110203-094231205-79442793-1410161293068203000\x0Aattributes16794406\x00\x00\x01\x00\x00\x00\x00\x00\x00
    current -
Exception in thread main java.nio.BufferUnderflowException
    at java.nio.Buffer.nextGetIndex(Buffer.java:489)
    at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:347)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:856)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:768)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:362)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:262)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:220)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.main(HFilePrettyPrinter.java:539)
    at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:802)
{noformat}

Turning on Java Assert shows the following:

{noformat}
Exception in thread main java.lang.AssertionError: Key 20110203-094231205-79442793-1410161293068203000/attributes:16794406/1099511627776/Minimum/vlen=15/mvcc=0 followed by a smaller key //0/Minimum/vlen=0/mvcc=0 in cf attributes
    at org.apache.hadoop.hbase.regionserver.StoreScanner.checkScanOrder(StoreScanner.java:672)
{noformat}

It shows that the hfile seems to be corrupted -- the keys don't seem to be right. But the Scanner is not able to give a meaningful error; instead it gets stuck in an infinite loop here:

{code}
KeyValueHeap.generalizedSeek()
while ((scanner = heap.poll()) != null) {
}
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HBASE-12891) Parallel execution for Hbck checkRegionConsistency
[ https://issues.apache.org/jira/browse/HBASE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar reopened HBASE-12891: --- Parallel execution for Hbck checkRegionConsistency -- Key: HBASE-12891 URL: https://issues.apache.org/jira/browse/HBASE-12891 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0, 0.98.10, 1.1.0 Reporter: churro morales Assignee: churro morales Fix For: 2.0.0, 1.1.0, 0.98.11 Attachments: HBASE-12891-v1.patch, HBASE-12891.98.patch, HBASE-12891.patch, HBASE-12891.patch We have a lot of regions on our cluster ~500k and noticed that hbck took quite some time in checkAndFixConsistency(). [~davelatham] patched our cluster to do this check in parallel to speed things up. I'll attach the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310322#comment-14310322 ] Hudson commented on HBASE-12980: FAILURE: Integrated in HBase-TRUNK #6100 (See [https://builds.apache.org/job/HBase-TRUNK/6100/]) HBASE-12980 Delete of a table may not clean all rows from hbase:meta (stack: rev 57319c536a136d331b925614417a8deba159ad8c) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestEnableTableHandler.java Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: 12980.txt, HBASE-12980-0.98.patch One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-10569) Co-locate meta and master
[ https://issues.apache.org/jira/browse/HBASE-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310324#comment-14310324 ] Hudson commented on HBASE-10569: FAILURE: Integrated in HBase-TRUNK #6100 (See [https://builds.apache.org/job/HBase-TRUNK/6100/]) HBASE-12956 Binding to 0.0.0.0 is broken after HBASE-10569 (enis: rev 3b56d2a0bc36f9dcb901bb709b8d9ae58df955ff) * hbase-server/src/test/java/org/apache/hadoop/hbase/TestHBaseTestingUtility.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java Co-locate meta and master - Key: HBASE-10569 URL: https://issues.apache.org/jira/browse/HBASE-10569 Project: HBase Issue Type: Improvement Components: master, Region Assignment Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.99.0 Attachments: Co-locateMetaAndMasterHBASE-10569.pdf, hbase-10569_v1.patch, hbase-10569_v2.patch, hbase-10569_v3.1.patch, hbase-10569_v3.patch, master_rs.pdf I was thinking simplifying/improving the region assignments. The first step is to co-locate the meta and the master as many people agreed on HBASE-5487. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11567) Write bulk load COMMIT events to WAL
[ https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310527#comment-14310527 ] Hudson commented on HBASE-11567: FAILURE: Integrated in HBase-1.0 #720 (See [https://builds.apache.org/job/HBase-1.0/720/]) HBASE-11567 Write bulk load COMMIT events to WAL (Only partial patch containing PB changes) (enis: rev 2395d69c23dccba029659a5f28d72631494fee8f) * hbase-protocol/src/main/protobuf/WAL.proto * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/VisibilityLabelsProtos.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java * hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java Write bulk load COMMIT events to WAL Key: HBASE-11567 URL: https://issues.apache.org/jira/browse/HBASE-11567 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Alex Newman Fix For: 2.0.0, 1.1.0 Attachments: HBASE-11567-v1.patch, HBASE-11567-v2.patch, HBASE-11567-v4-rebase.patch, hbase-11567-branch-1.0-partial.patch, hbase-11567-v3.patch, hbase-11567-v4.patch Similar to writing flush (HBASE-11511), compaction(HBASE-2231) to WAL and region open/close (HBASE-11512) , we should persist bulk load events to WAL. This is especially important for secondary region replicas, since we can use this information to pick up primary regions' files from secondary replicas. A design doc for secondary replica replication can be found at HBASE-11183. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12982) Adding timeouts to TestChoreService
[ https://issues.apache.org/jira/browse/HBASE-12982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12982: -- Attachment: 12982.txt Just adding timeouts on all tests in TestChoreService Adding timeouts to TestChoreService --- Key: HBASE-12982 URL: https://issues.apache.org/jira/browse/HBASE-12982 Project: HBase Issue Type: Improvement Reporter: stack Attachments: 12982.txt One of the lads fingered TimeChoreService as acting up going zombie. Adding timeouts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310556#comment-14310556 ] Hudson commented on HBASE-12979: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #798 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/798/]) HBASE-12979 Use setters instead of return values for handing back statistics from HRegion methods (jyates: rev d776789fc42065e1422c5fc419fe0fd566e2043c) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java Use setters instead of return values for handing back statistics from HRegion methods - Key: HBASE-12979 URL: https://issues.apache.org/jira/browse/HBASE-12979 Project: HBase Issue Type: Improvement Affects Versions: 0.98.10 Reporter: Andrew Purtell Assignee: Jesse Yates Labels: phoenix Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.11, 0.98.10.1 Attachments: hbase-12979-v0-0.98.patch, hbase-12979-v0-master.patch In HBASE-5162 (and backports such as HBASE-12729) we modified some HRegion methods to return statistics for consumption by callers. The statistics are ultimately passed back to the client as load feedback. [~lhofhansl] thinks handing back this information as return values from HRegion methods is a weird mix of concerns. This also produced a difficult to anticipate binary compatibility issue with Phoenix. There was no compile time issue because the code of course was not structured to assign from a method returning void, yet the method signature changes so the JVM cannot resolve it if older Phoenix binaries are installed into a 0.98.10 release. Let's change the HRegion methods back to returning 'void' and use setters instead. 
Officially we don't support use of HRegion (HBASE-12566) but we do not need to go out of our way to break things (smile) so I would also like to make a patch release containing just this change to help out our sister project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
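Editorial note on the binary compatibility point above: the JVM resolves a call site by method name plus full descriptor, and the descriptor includes the return type. A caller compiled against a void-returning method therefore fails to link (typically with NoSuchMethodError) once the method returns something else, even though the source would still compile. The sketch below only prints the two descriptors for a hypothetical one-argument method; the class name DescriptorDemo and the String parameter are illustrative assumptions, not the real HRegion signatures.

```java
import java.lang.invoke.MethodType;

final class DescriptorDemo {
    // Descriptor of a hypothetical method declared as "void m(String)".
    static String voidDescriptor() {
        return MethodType.methodType(void.class, String.class).toMethodDescriptorString();
    }

    // Same parameter list, but the hypothetical method now returns Object.
    static String objectDescriptor() {
        return MethodType.methodType(Object.class, String.class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Older callers carry the first descriptor in their class files;
        // a class that only declares the second cannot satisfy them at link time.
        System.out.println(voidDescriptor());
        System.out.println(objectDescriptor());
    }
}
```

Moving the statistics behind a setter keeps the original void descriptor intact, which is why older binaries would link again.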
[jira] [Commented] (HBASE-12891) Parallel execution for Hbck checkRegionConsistency
[ https://issues.apache.org/jira/browse/HBASE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310555#comment-14310555 ] Hudson commented on HBASE-12891: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #798 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/798/]) HBASE-12891 Parallel execution for Hbck checkRegionConsistency (apurtell: rev de08a82c64beb265d89037ceabda76309e3912fc) * hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java Parallel execution for Hbck checkRegionConsistency -- Key: HBASE-12891 URL: https://issues.apache.org/jira/browse/HBASE-12891 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0, 0.98.10, 1.1.0 Reporter: churro morales Assignee: churro morales Fix For: 2.0.0, 1.1.0, 0.98.11 Attachments: HBASE-12891-v1.patch, HBASE-12891.98.patch, HBASE-12891.patch, HBASE-12891.patch, hbase-12891-addendum1.patch We have a lot of regions on our cluster ~500k and noticed that hbck took quite some time in checkAndFixConsistency(). [~davelatham] patched our cluster to do this check in parallel to speed things up. I'll attach the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
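Editorial note: the idea of running the per-region check in parallel can be sketched with a plain thread pool. This is not the attached patch or the HBaseFsck code; checkRegion is a hypothetical stand-in for the real consistency check, and the region names and pool size are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelCheckSketch {
    // Hypothetical per-region check: returns true when the region looks consistent.
    static boolean checkRegion(String regionName) {
        return !regionName.isEmpty();
    }

    // Submits one task per region to a fixed-size pool and counts the failures.
    static int countInconsistent(List<String> regions, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (String region : regions) {
                tasks.add(() -> checkRegion(region));
            }
            int inconsistent = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Boolean> result : pool.invokeAll(tasks)) {
                if (!result.get()) {
                    inconsistent++;
                }
            }
            return inconsistent;
        } finally {
            pool.shutdown();
        }
    }
}
```

With several hundred thousand regions, one task per region may be too fine-grained; grouping regions into batches per task is a likely refinement.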
[jira] [Commented] (HBASE-11861) Native MOB Compaction mechanisms.
[ https://issues.apache.org/jira/browse/HBASE-11861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310567#comment-14310567 ] ramkrishna.s.vasudevan commented on HBASE-11861: Committed? Oh I was reviewing this and thought would complete it over the weekend or the beginning of next week. No problem anyway. Thanks [~jmhsieh] and [~jingcheng...@intel.com]. Native MOB Compaction mechanisms. - Key: HBASE-11861 URL: https://issues.apache.org/jira/browse/HBASE-11861 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: 2.0.0 Reporter: Jonathan Hsieh Assignee: Jingcheng Du Fix For: hbase-11339 Attachments: 141030-mob-compaction.pdf, HBASE-11861-V1.diff, HBASE-11861-V2.diff, HBASE-11861-V3.diff, HBASE-11861-V4.diff, HBASE-11861-V5.diff, HBASE-11861-V6.diff, HBASE-11861.diff, mob compaction-out-of-region.pdf, mob compaction.pdf Currently, the first cut of mob will have external processes to age off old mob data (the ttl cleaner), and to compact away deleted or over written data (the sweep tool). From an operational point of view, having two external tools, especially one that relies on MapReduce is undesirable. In this issue we'll tackle integrating these into hbase without requiring external processes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12108) HBaseConfiguration: set classloader before loading xml files
[ https://issues.apache.org/jira/browse/HBASE-12108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-12108: -- Fix Version/s: (was: 1.0.1) 1.0.0 HBaseConfiguration: set classloader before loading xml files Key: HBASE-12108 URL: https://issues.apache.org/jira/browse/HBASE-12108 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.98.6 Reporter: Aniket Bhatnagar Priority: Minor Labels: class_loader, configuration, patch Fix For: 1.0.0, 2.0.0, 1.1.0, 0.98.11 Attachments: HBaseConfiguration_HBASE_HBASE-12108.patch In a setup where HBase jars are loaded in a child classloader whose parent has loaded the hadoop-common jar, HBaseConfiguration.create() throws a "hbase-default.xml file seems to be for and old version of HBase (null)..." exception. The ClassLoader should be set on the Hadoop conf object before calling the addHbaseResources method. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-12979: -- Fix Version/s: (was: 1.0.1) 1.0.0 Use setters instead of return values for handing back statistics from HRegion methods - Key: HBASE-12979 URL: https://issues.apache.org/jira/browse/HBASE-12979 Project: HBase Issue Type: Improvement Affects Versions: 0.98.10 Reporter: Andrew Purtell Assignee: Jesse Yates Labels: phoenix Fix For: 1.0.0, 2.0.0, 1.1.0, 0.98.11, 0.98.10.1 Attachments: hbase-12979-v0-0.98.patch, hbase-12979-v0-master.patch In HBASE-5162 (and backports such as HBASE-12729) we modified some HRegion methods to return statistics for consumption by callers. The statistics are ultimately passed back to the client as load feedback. [~lhofhansl] thinks handing back this information as return values from HRegion methods is a weird mix of concerns. This also produced a difficult to anticipate binary compatibility issue with Phoenix. There was no compile time issue because the code of course was not structured to assign from a method returning void, yet the method signature changes so the JVM cannot resolve it if older Phoenix binaries are installed into a 0.98.10 release. Let's change the HRegion methods back to returning 'void' and use setters instead. Officially we don't support use of HRegion (HBASE-12566) but we do not need to go out of our way to break things (smile) so I would also like to make a patch release containing just this change to help out our sister project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12962) TestHFileBlockIndex.testBlockIndex() commented out during HBASE-10531
[ https://issues.apache.org/jira/browse/HBASE-12962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-12962: -- Fix Version/s: (was: 1.0.1) 1.1.0 TestHFileBlockIndex.testBlockIndex() commented out during HBASE-10531 - Key: HBASE-12962 URL: https://issues.apache.org/jira/browse/HBASE-12962 Project: HBase Issue Type: Bug Components: test Affects Versions: 1.0.0, 2.0.0, 1.0.1 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: HBASE-12962.patch Accidentally during HBASE-10531 the test case testBlockIndex() in TestHFileBlockIndex was commented out. Apologies for that. Not sure how that happened. This patch uncomments the commented out test case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11544) [Ergonomics] hbase.client.scanner.caching is dogged and will try to return batch even if it means OOME
[ https://issues.apache.org/jira/browse/HBASE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310528#comment-14310528 ] stack commented on HBASE-11544: --- [~jonathan.lawlor] Could we do as [~lhofhansl] suggests in a follow-on issue? [Ergonomics] hbase.client.scanner.caching is dogged and will try to return batch even if it means OOME -- Key: HBASE-11544 URL: https://issues.apache.org/jira/browse/HBASE-11544 Project: HBase Issue Type: Bug Reporter: stack Assignee: Jonathan Lawlor Priority: Critical Labels: beginner Running some tests, I set hbase.client.scanner.caching=1000. Dataset has large cells. I kept OOME'ing. Serverside, we should measure how much we've accumulated and return to the client whatever we've gathered once we pass out a certain size threshold rather than keep accumulating till we OOME. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
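Editorial note: the server-side behaviour proposed in the description (track how much has accumulated and return early once a size threshold is passed) can be sketched as below. This is a minimal illustration, not the HBase scanner code; SizeLimitedBatch, nextBatch and the byte[] cell representation are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class SizeLimitedBatch {
    // Collects cells starting at 'start' until either the row limit (the
    // "caching" value) or the byte threshold is reached, whichever comes first.
    static List<byte[]> nextBatch(List<byte[]> cells, int start, int maxRows, long maxBytes) {
        List<byte[]> batch = new ArrayList<>();
        long accumulated = 0;
        for (int i = start; i < cells.size() && batch.size() < maxRows; i++) {
            byte[] cell = cells.get(i);
            batch.add(cell);
            accumulated += cell.length;
            if (accumulated >= maxBytes) {
                break; // past the threshold: hand back what has been gathered so far
            }
        }
        return batch;
    }
}
```

The cell that crosses the threshold is still included, so each call returns at least one cell when any remain and the scan always makes progress.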
[jira] [Commented] (HBASE-8340) WAL compression handling of seeks seems to be either inefficient or incorrect
[ https://issues.apache.org/jira/browse/HBASE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310546#comment-14310546 ] Lars Hofhansl commented on HBASE-8340: -- Can we close this then? WAL compression handling of seeks seems to be either inefficient or incorrect - Key: HBASE-8340 URL: https://issues.apache.org/jira/browse/HBASE-8340 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin In next(...): {code} if (compressionContext != null && emptyCompressionContext) { emptyCompressionContext = false; } return ... {code} In seek() {code} if (compressionContext != null && emptyCompressionContext) { while (next() != null) { if (getPosition() == pos) { emptyCompressionContext = false; break; } } ... reader.seek(pos); {code} So, seek will seek the file directly if either any next, or any seek, has been called before. I am not sure what this code is for, but my best guess is that it is to populate the dictionary for compression. If it is so, it would seem that one next() call (or even one seek() call) would not be enough, and seek must always use next(), otherwise it is incorrect. If we assume that one next() is enough to be able to use reader.seek, as the current code would seem to imply, then there's no need for the first seek to call next() in a loop - it can call next once and then do reader.seek. Note: even in case if all of this works fine because external usage creates the object and does one seek before any next-s, and no seeks after (the only bug-free pattern currently possible with both methods used if I'm not mistaken), then the code needs to be tightened and bug potential removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12978) hbase:meta has a row missing hregioninfo and it causes my long-running job to fail
[ https://issues.apache.org/jira/browse/HBASE-12978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310547#comment-14310547 ] stack commented on HBASE-12978: --- @enis Regards HBASE-12974, suggest 1.0.1. The cause was itself obscure. Hopefully its a rare case. Ditto with this main issue of missing info:regioninfo. I don't have handle on it yet but seems rare enough. hbase:meta has a row missing hregioninfo and it causes my long-running job to fail -- Key: HBASE-12978 URL: https://issues.apache.org/jira/browse/HBASE-12978 Project: HBase Issue Type: Bug Reporter: stack Fix For: 1.0.0 Testing 1.0.0 trying long-running tests. A row in hbase:meta was missing its HRI entry. It caused the job to fail. Around the time of the first task failure, there are balances of the hbase:meta region and it was on a server that crashed. I tried to look at what happened around time of our writing hbase:meta and I ran into another issue; 20 logs of 256MBs filled with WrongRegionException written over a minute or two. The actual update of hbase:meta was not in the logs, it'd been rotated off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11567) Write bulk load COMMIT events to WAL
[ https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310551#comment-14310551 ] Hudson commented on HBASE-11567: FAILURE: Integrated in HBase-1.1 #154 (See [https://builds.apache.org/job/HBase-1.1/154/]) HBASE-11567 Write bulk load COMMIT events to WAL (jeffreyz: rev b0b0a74fef6382643c6ff8d07167ad90ff0d7c43) * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/FilterProtos.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java * pom.xml * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java * hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALUtil.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/VisibilityLabelsProtos.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestBulkLoad.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionServerBulkLoad.java * hbase-protocol/src/main/protobuf/WAL.proto * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALActionsListener.java Write bulk load COMMIT events to WAL Key: HBASE-11567 URL: https://issues.apache.org/jira/browse/HBASE-11567 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Alex Newman Fix For: 2.0.0, 1.1.0 Attachments: HBASE-11567-v1.patch, HBASE-11567-v2.patch, HBASE-11567-v4-rebase.patch, hbase-11567-branch-1.0-partial.patch, hbase-11567-v3.patch, hbase-11567-v4.patch Similar to writing flush (HBASE-11511), compaction(HBASE-2231) to WAL and region open/close (HBASE-11512) , we should persist bulk load events to WAL. 
This is especially important for secondary region replicas, since we can use this information to pick up primary regions' files from secondary replicas. A design doc for secondary replica replication can be found at HBASE-11183. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12982) Adding timeouts to TestChoreService
[ https://issues.apache.org/jira/browse/HBASE-12982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310553#comment-14310553 ] Hudson commented on HBASE-12982: FAILURE: Integrated in HBase-1.1 #155 (See [https://builds.apache.org/job/HBase-1.1/155/]) HBASE-12982 Adding timeouts to TestChoreService (stack: rev cd996ea240b6cf612130eae28c31d0ee06ead437) * hbase-common/src/test/java/org/apache/hadoop/hbase/TestChoreService.java Adding timeouts to TestChoreService --- Key: HBASE-12982 URL: https://issues.apache.org/jira/browse/HBASE-12982 Project: HBase Issue Type: Improvement Reporter: stack Assignee: stack Fix For: 2.0.0, 1.1.0 Attachments: 12982.txt One of the lads fingered TimeChoreService as acting up going zombie. Adding timeouts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12983) HBase book mentions hadoo.ssl.enabled when it should be hbase.ssl.enabled
Esteban Gutierrez created HBASE-12983: - Summary: HBase book mentions hadoo.ssl.enabled when it should be hbase.ssl.enabled Key: HBASE-12983 URL: https://issues.apache.org/jira/browse/HBASE-12983 Project: HBase Issue Type: Bug Components: documentation Reporter: Esteban Gutierrez In the HBase book we say the following: {quote} A default HBase install uses insecure HTTP connections for web UIs for the master and region servers. To enable secure HTTP (HTTPS) connections instead, set *hadoop.ssl.enabled* to true in hbase-site.xml. This does not change the port used by the Web UI. To change the port for the web UI for a given HBase component, configure that port’s setting in hbase-site.xml. These settings are: {quote} The property should be *hbase.ssl.enabled* instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12983) HBase book mentions hadoop.ssl.enabled when it should be hbase.ssl.enabled
[ https://issues.apache.org/jira/browse/HBASE-12983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban Gutierrez updated HBASE-12983: -- Summary: HBase book mentions hadoop.ssl.enabled when it should be hbase.ssl.enabled (was: HBase book mentions hadoo.ssl.enabled when it should be hbase.ssl.enabled) HBase book mentions hadoop.ssl.enabled when it should be hbase.ssl.enabled -- Key: HBASE-12983 URL: https://issues.apache.org/jira/browse/HBASE-12983 Project: HBase Issue Type: Bug Components: documentation Reporter: Esteban Gutierrez In the HBase book we say the following: {quote} A default HBase install uses insecure HTTP connections for web UIs for the master and region servers. To enable secure HTTP (HTTPS) connections instead, set *hadoop.ssl.enabled* to true in hbase-site.xml. This does not change the port used by the Web UI. To change the port for the web UI for a given HBase component, configure that port’s setting in hbase-site.xml. These settings are: {quote} The property should be *hbase.ssl.enabled* instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12982) Adding timeouts to TestChoreService
[ https://issues.apache.org/jira/browse/HBASE-12982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310573#comment-14310573 ] Hudson commented on HBASE-12982: FAILURE: Integrated in HBase-TRUNK #6104 (See [https://builds.apache.org/job/HBase-TRUNK/6104/]) HBASE-12982 Adding timeouts to TestChoreService (stack: rev ac175b1bd9ec3878f50458382563810142df032d) * hbase-common/src/test/java/org/apache/hadoop/hbase/TestChoreService.java Adding timeouts to TestChoreService --- Key: HBASE-12982 URL: https://issues.apache.org/jira/browse/HBASE-12982 Project: HBase Issue Type: Improvement Reporter: stack Assignee: stack Fix For: 2.0.0, 1.1.0 Attachments: 12982.txt One of the lads fingered TimeChoreService as acting up going zombie. Adding timeouts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12967) Invalid FQCNs in alter table command leaves the table unusable
[ https://issues.apache.org/jira/browse/HBASE-12967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-12967: -- Fix Version/s: (was: 1.0.0) 1.0.1 Invalid FQCNs in alter table command leaves the table unusable -- Key: HBASE-12967 URL: https://issues.apache.org/jira/browse/HBASE-12967 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Priority: Critical Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.11 Refer to this thread http://osdir.com/ml/general/2015-02/msg03547.html A user tries to alter a table with a new split policy. Due to an invalid classname the table does not get enabled and the table becomes unusable. I think Procedure V2 is the long-term solution for this, but we at least need to provide a workaround or a set of steps to recover from it. Any fix before Procedure V2 comes into place would be useful for the already released versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12978) hbase:meta has a row missing hregioninfo and it causes my long-running job to fail
[ https://issues.apache.org/jira/browse/HBASE-12978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-12978: -- Fix Version/s: (was: 1.0.0) 1.0.1 hbase:meta has a row missing hregioninfo and it causes my long-running job to fail -- Key: HBASE-12978 URL: https://issues.apache.org/jira/browse/HBASE-12978 Project: HBase Issue Type: Bug Reporter: stack Fix For: 1.0.1 Testing 1.0.0 trying long-running tests. A row in hbase:meta was missing its HRI entry. It caused the job to fail. Around the time of the first task failure, there are balances of the hbase:meta region and it was on a server that crashed. I tried to look at what happened around time of our writing hbase:meta and I ran into another issue; 20 logs of 256MBs filled with WrongRegionException written over a minute or two. The actual update of hbase:meta was not in the logs, it'd been rotated off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12976) Set default value for hbase.client.scanner.max.result.size
[ https://issues.apache.org/jira/browse/HBASE-12976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-12976: -- Fix Version/s: (was: 1.0.1) 1.0.0 Set default value for hbase.client.scanner.max.result.size -- Key: HBASE-12976 URL: https://issues.apache.org/jira/browse/HBASE-12976 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 1.0.0, 2.0.0, 1.1.0, 0.98.11 Attachments: 12976-v2.txt, 12976.txt Setting scanner caching is somewhat of a black art. It's hard to estimate ahead of time how large the result set will be. I propose we set hbase.client.scanner.max.result.size to 2mb. That is a good compromise between performance and buffer usage on typical networks (avoiding OOMs when the caching was chosen too high). To an HTable client this is completely transparent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11567) Write bulk load COMMIT events to WAL
[ https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310523#comment-14310523 ] Hadoop QA commented on HBASE-11567: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697205/hbase-11567-branch-1.0-partial.patch against branch-1.0 branch at commit 7f4146bf4d4df84041b284a76d917d602b5531da. ATTACHMENT ID: 12697205 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 9 warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//artifact/patchprocess/checkstyle-aggregate.html Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12730//artifact/patchprocess/patchJavadocWarnings.txt Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12730//console This message is automatically generated. Write bulk load COMMIT events to WAL Key: HBASE-11567 URL: https://issues.apache.org/jira/browse/HBASE-11567 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Alex Newman Fix For: 2.0.0, 1.1.0 Attachments: HBASE-11567-v1.patch, HBASE-11567-v2.patch, HBASE-11567-v4-rebase.patch, hbase-11567-branch-1.0-partial.patch, hbase-11567-v3.patch, hbase-11567-v4.patch Similar to writing flush (HBASE-11511), compaction(HBASE-2231) to WAL and region open/close (HBASE-11512) , we should persist bulk load events to WAL. This is especially important for secondary region replicas, since we can use this information to pick up primary regions' files from secondary replicas. A design doc for secondary replica replication can be found at HBASE-11183. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12982) Adding timeouts to TestChoreService
stack created HBASE-12982: - Summary: Adding timeouts to TestChoreService Key: HBASE-12982 URL: https://issues.apache.org/jira/browse/HBASE-12982 Project: HBase Issue Type: Improvement Reporter: stack One of the lads fingered TimeChoreService as acting up going zombie. Adding timeouts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-12982) Adding timeouts to TestChoreService
[ https://issues.apache.org/jira/browse/HBASE-12982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-12982. --- Resolution: Fixed Fix Version/s: 1.1.0 2.0.0 Assignee: stack Pushed to branch-1+ Adding timeouts to TestChoreService --- Key: HBASE-12982 URL: https://issues.apache.org/jira/browse/HBASE-12982 Project: HBase Issue Type: Improvement Reporter: stack Assignee: stack Fix For: 2.0.0, 1.1.0 Attachments: 12982.txt One of the lads fingered TimeChoreService as acting up going zombie. Adding timeouts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12982) Adding timeouts to TestChoreService
[ https://issues.apache.org/jira/browse/HBASE-12982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310526#comment-14310526 ] stack commented on HBASE-12982: --- [~jonathan.lawlor] FYI Adding timeouts to TestChoreService --- Key: HBASE-12982 URL: https://issues.apache.org/jira/browse/HBASE-12982 Project: HBase Issue Type: Improvement Reporter: stack Assignee: stack Fix For: 2.0.0, 1.1.0 Attachments: 12982.txt One of the lads fingered TimeChoreService as acting up going zombie. Adding timeouts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11567) Write bulk load COMMIT events to WAL
[ https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310550#comment-14310550 ] Hudson commented on HBASE-11567: FAILURE: Integrated in HBase-TRUNK #6103 (See [https://builds.apache.org/job/HBase-TRUNK/6103/]) HBASE-11567 Write bulk load COMMIT events to WAL (Alex Newman, Jeffrey Zhong) (jeffreyz: rev 3f4427739d9ff698d39f2687f11f65967c67340d) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestBulkLoad.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/FilterProtos.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALActionsListener.java * pom.xml * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/VisibilityLabelsProtos.java * hbase-protocol/src/main/protobuf/WAL.proto * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionServerBulkLoad.java * hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALUtil.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java Write bulk load COMMIT events to WAL Key: HBASE-11567 URL: https://issues.apache.org/jira/browse/HBASE-11567 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Alex Newman Fix For: 2.0.0, 1.1.0 Attachments: HBASE-11567-v1.patch, HBASE-11567-v2.patch, HBASE-11567-v4-rebase.patch, hbase-11567-branch-1.0-partial.patch, hbase-11567-v3.patch, hbase-11567-v4.patch Similar to writing flush (HBASE-11511), compaction(HBASE-2231) to WAL and region open/close (HBASE-11512) , we should persist bulk load events to WAL. 
This is especially important for secondary region replicas, since we can use this information to pick up primary regions' files from secondary replicas. A design doc for secondary replica replication can be found at HBASE-11183. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12891) Parallel execution for Hbck checkRegionConsistency
[ https://issues.apache.org/jira/browse/HBASE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310578#comment-14310578 ] Hudson commented on HBASE-12891: FAILURE: Integrated in HBase-0.98 #841 (See [https://builds.apache.org/job/HBase-0.98/841/]) Revert HBASE-12891 Parallel execution for Hbck checkRegionConsistency (apurtell: rev 5f57c07713031f4b6ec1e536812f5811ac3d31cd) * hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java Parallel execution for Hbck checkRegionConsistency -- Key: HBASE-12891 URL: https://issues.apache.org/jira/browse/HBASE-12891 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0, 0.98.10, 1.1.0 Reporter: churro morales Assignee: churro morales Fix For: 2.0.0, 1.1.0, 0.98.11 Attachments: HBASE-12891-v1.patch, HBASE-12891.98.patch, HBASE-12891.patch, HBASE-12891.patch, hbase-12891-addendum1.patch We have a lot of regions on our cluster ~500k and noticed that hbck took quite some time in checkAndFixConsistency(). [~davelatham] patched our cluster to do this check in parallel to speed things up. I'll attach the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310577#comment-14310577 ] Hudson commented on HBASE-12980: FAILURE: Integrated in HBase-0.98 #841 (See [https://builds.apache.org/job/HBase-0.98/841/]) HBASE-12980 Delete of a table may not clean all rows from hbase:meta (apurtell: rev be82ea78f3cee109ce30b8c72b6ed4ac09df7ed6) * hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestDeleteTableHandler.java * hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0, 0.98.11 Attachments: 12980.txt, HBASE-12980-0.98.patch One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12984) SSL cannot be used by the InfoPort in branch-1
Esteban Gutierrez created HBASE-12984: - Summary: SSL cannot be used by the InfoPort in branch-1 Key: HBASE-12984 URL: https://issues.apache.org/jira/browse/HBASE-12984 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 2.0.0, 1.1.0 Reporter: Esteban Gutierrez Priority: Blocker Setting {{hbase.ssl.enabled}} to {{true}} doesn't enable SSL on the InfoServer. Found that the problem lies in the InfoServer and HttpConfig, in how we set up the protocol in the HttpServer: {code} for (URI ep : endpoints) { Connector listener = null; String scheme = ep.getScheme(); if ("http".equals(scheme)) { listener = HttpServer.createDefaultChannelConnector(); } else if ("https".equals(scheme)) { SslSocketConnector c = new SslSocketConnectorSecure(); c.setNeedClientAuth(needsClientAuth); c.setKeyPassword(keyPassword); {code} It depends on which endpoints have been added by the InfoServer: {code} builder .setName(name) .addEndpoint(URI.create("http://" + bindAddress + ":" + port)) .setAppDir(HBASE_APP_DIR).setFindPort(findPort).setConf(c); {code} Basically we always use http and never check via HttpConfig whether {{hbase.ssl.enabled}} was set to true. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12984) SSL cannot be used by the InfoPort in branch-1
[ https://issues.apache.org/jira/browse/HBASE-12984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban Gutierrez updated HBASE-12984: -- Description: Setting {{hbase.ssl.enabled}} to {{true}} doesn't enable SSL on the InfoServer. Found that the problem lies in the InfoServer and HttpConfig, in how we set up the protocol in the HttpServer: {code} for (URI ep : endpoints) { Connector listener = null; String scheme = ep.getScheme(); if ("http".equals(scheme)) { listener = HttpServer.createDefaultChannelConnector(); } else if ("https".equals(scheme)) { SslSocketConnector c = new SslSocketConnectorSecure(); c.setNeedClientAuth(needsClientAuth); c.setKeyPassword(keyPassword); {code} It depends on which endpoints have been added by the InfoServer: {code} builder .setName(name) .addEndpoint(URI.create("http://" + bindAddress + ":" + port)) .setAppDir(HBASE_APP_DIR).setFindPort(findPort).setConf(c); {code} Basically we always use http; we never check via HttpConfig whether {{hbase.ssl.enabled}} was set to true and assign the right scheme based on the configuration. was: Setting {{hbase.ssl.enabled}} to {{true}} doesn't enable SSL on the InfoServer. Found that the problem lies in the InfoServer and HttpConfig, in how we set up the protocol in the HttpServer: {code} for (URI ep : endpoints) { Connector listener = null; String scheme = ep.getScheme(); if ("http".equals(scheme)) { listener = HttpServer.createDefaultChannelConnector(); } else if ("https".equals(scheme)) { SslSocketConnector c = new SslSocketConnectorSecure(); c.setNeedClientAuth(needsClientAuth); c.setKeyPassword(keyPassword); {code} It depends on which endpoints have been added by the InfoServer: {code} builder .setName(name) .addEndpoint(URI.create("http://" + bindAddress + ":" + port)) .setAppDir(HBASE_APP_DIR).setFindPort(findPort).setConf(c); {code} Basically we always use http and never check via HttpConfig whether {{hbase.ssl.enabled}} was set to true. 
SSL cannot be used by the InfoPort in branch-1 -- Key: HBASE-12984 URL: https://issues.apache.org/jira/browse/HBASE-12984 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 2.0.0, 1.1.0 Reporter: Esteban Gutierrez Priority: Blocker Setting {{hbase.ssl.enabled}} to {{true}} doesn't enable SSL on the InfoServer. Found that the problem lies in the InfoServer and HttpConfig, in how we set up the protocol in the HttpServer: {code} for (URI ep : endpoints) { Connector listener = null; String scheme = ep.getScheme(); if ("http".equals(scheme)) { listener = HttpServer.createDefaultChannelConnector(); } else if ("https".equals(scheme)) { SslSocketConnector c = new SslSocketConnectorSecure(); c.setNeedClientAuth(needsClientAuth); c.setKeyPassword(keyPassword); {code} It depends on which endpoints have been added by the InfoServer: {code} builder .setName(name) .addEndpoint(URI.create("http://" + bindAddress + ":" + port)) .setAppDir(HBASE_APP_DIR).setFindPort(findPort).setConf(c); {code} Basically we always use http; we never check via HttpConfig whether {{hbase.ssl.enabled}} was set to true and assign the right scheme based on the configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
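To illustrate the HBASE-12984 description above: the HttpServer loop only creates an SSL connector when an endpoint's scheme is https, so the endpoint URI would have to be built from the configuration rather than hard-coded to http. The following is a minimal sketch of that idea, not the InfoServer code or the eventual patch. The class name InfoEndpointSketch, the method endpointFor, and the boolean standing in for {{hbase.ssl.enabled}} are all hypothetical, and the port is just an example value.

```java
import java.net.URI;

// Hypothetical sketch; not the HBase InfoServer code and not the eventual patch.
final class InfoEndpointSketch {

  /**
   * Builds the endpoint URI. The sslEnabled flag stands in for the value of
   * hbase.ssl.enabled; how that value is read from the configuration is left out.
   */
  static URI endpointFor(boolean sslEnabled, String bindAddress, int port) {
    // Choose the scheme from the flag instead of always using "http".
    String scheme = sslEnabled ? "https" : "http";
    return URI.create(scheme + "://" + bindAddress + ":" + port);
  }

  public static void main(String[] args) {
    // Example values only.
    System.out.println(endpointFor(false, "0.0.0.0", 16010));
    System.out.println(endpointFor(true, "0.0.0.0", 16010));
  }
}
```

With a URI built this way, the "https".equals(scheme) branch quoted in the description would be reachable whenever the flag is true.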
[jira] [Commented] (HBASE-12979) Use setters instead of return values for handing back statistics from HRegion methods
[ https://issues.apache.org/jira/browse/HBASE-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310235#comment-14310235 ] Enis Soztutar commented on HBASE-12979: --- Why is the version 0.98.10.1? Did I miss anything? Use setters instead of return values for handing back statistics from HRegion methods - Key: HBASE-12979 URL: https://issues.apache.org/jira/browse/HBASE-12979 Project: HBase Issue Type: Improvement Affects Versions: 0.98.10 Reporter: Andrew Purtell Assignee: Jesse Yates Labels: phoenix Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.10.1 Attachments: hbase-12979-v0-0.98.patch, hbase-12979-v0-master.patch In HBASE-5162 (and backports such as HBASE-12729) we modified some HRegion methods to return statistics for consumption by callers. The statistics are ultimately passed back to the client as load feedback. [~lhofhansl] thinks handing back this information as return values from HRegion methods is a weird mix of concerns. This also produced a difficult-to-anticipate binary compatibility issue with Phoenix. There was no compile-time issue because the code of course was not structured to assign from a method returning void, yet the method signature changed, so the JVM cannot resolve it if older Phoenix binaries are installed into a 0.98.10 release. Let's change the HRegion methods back to returning 'void' and use setters instead. Officially we don't support use of HRegion (HBASE-12566) but we do not need to go out of our way to break things (smile) so I would also like to make a patch release containing just this change to help out our sister project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
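The binary compatibility point in the HBASE-12979 description can be made concrete: the JVM links a call site by method name plus descriptor, and the descriptor includes the return type. The sketch below only prints two descriptors to show that they differ. The single String parameter and the Object return type are placeholders, not the real HRegion signatures, and the class name DescriptorSketch is made up for this example.

```java
import java.lang.invoke.MethodType;

// Illustrative only: shows why changing a return type breaks already-compiled callers.
final class DescriptorSketch {

  /** JVM descriptor for a method that takes one String and returns returnType. */
  static String descriptor(Class<?> returnType) {
    return MethodType.methodType(returnType, String.class).toMethodDescriptorString();
  }

  public static void main(String[] args) {
    // What a caller compiled against the void version looks for at link time.
    System.out.println(descriptor(void.class));   // (Ljava/lang/String;)V
    // What the class offers once the method returns a value instead.
    System.out.println(descriptor(Object.class)); // (Ljava/lang/String;)Ljava/lang/Object;
  }
}
```

Because the two strings differ, a binary compiled against the void form cannot resolve the method once the return type changes, even though recompiling the same source would succeed. Keeping the return type as void and handing results back through a setter leaves the descriptor untouched.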
[jira] [Updated] (HBASE-11409) Add more flexibility for input directory structure to LoadIncrementalHFiles
[ https://issues.apache.org/jira/browse/HBASE-11409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-11409: -- Status: Patch Available (was: Open) See if it will work against hadoopqa (it has the 0.94 in its name so should be applied to 0.94 branch) Add more flexibility for input directory structure to LoadIncrementalHFiles --- Key: HBASE-11409 URL: https://issues.apache.org/jira/browse/HBASE-11409 Project: HBase Issue Type: Bug Affects Versions: 0.94.20 Reporter: churro morales Assignee: churro morales Attachments: HBASE-11409-0.94.patch, HBASE-11409.0.94.v1.patch Use case: We were trying to combine two very large tables into a single table. Thus we ran jobs in one datacenter that populated certain column families and another datacenter which populated other column families. Took a snapshot and exported them to their respective datacenters. Wanted to simply take the hdfs restored snapshot and use LoadIncremental to merge the data. It would be nice to add support where we could run LoadIncremental on a directory where the depth of store files is something other than two (current behavior). With snapshots it would be nice if you could pass a restored hdfs snapshot's directory and have the tool run. I am attaching a patch where I parameterize the bulkLoad timeout as well as the default store file depth. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-12897) Minimum memstore size is a percentage
[ https://issues.apache.org/jira/browse/HBASE-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-12897. --- Resolution: Fixed Fix Version/s: 1.1.0 2.0.0 1.0.0 Hadoop Flags: Reviewed Pushed to branch-1+. This patch doesn't work for 0.98. It's different. Thanks [~churromorales] Minimum memstore size is a percentage - Key: HBASE-12897 URL: https://issues.apache.org/jira/browse/HBASE-12897 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 0.98.10, 1.1.0 Reporter: churro morales Assignee: churro morales Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: HBASE-12897.patch We have a cluster which is optimized for random reads. Thus we have a large block cache and a small memstore. Currently our heap is 20GB and we wanted to configure the memstore to take 4% or 800MB. Right now the minimum memstore size is 5%. What do you guys think about reducing the minimum size to 1%? Suppose we log a warning if the memstore is below 5% but allow it? What do you folks think? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
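The proposal in the HBASE-12897 description (lower the hard minimum to 1%, warn below 5%, but allow it) can be sketched as follows. This is an illustration of the proposal as worded in the issue, not the committed patch. The class MemstoreLimitSketch, its constants, the integer-percent parameter, and the upper bound of 100 are assumptions made for the example, and 20GB is taken as 20 * 10^9 bytes so that the arithmetic matches the 800MB figure in the issue.

```java
// Hypothetical sketch of the proposal in the issue; not the committed patch.
final class MemstoreLimitSketch {
  static final int HARD_MIN_PERCENT = 1;   // proposed new minimum
  static final int WARN_BELOW_PERCENT = 5; // the minimum at the time of the issue

  /** Returns the memstore size in bytes for the given heap size and percentage. */
  static long memstoreBytes(long heapBytes, int percent) {
    if (percent < HARD_MIN_PERCENT || percent > 100) {
      throw new IllegalArgumentException("memstore percent out of range: " + percent);
    }
    if (percent < WARN_BELOW_PERCENT) {
      // Allowed, but flagged, as suggested in the issue.
      System.err.println("WARN: memstore is " + percent + "% of heap, below "
          + WARN_BELOW_PERCENT + "%");
    }
    return heapBytes * percent / 100;
  }

  public static void main(String[] args) {
    // 4% of a 20GB heap (20 * 10^9 bytes).
    System.out.println(memstoreBytes(20_000_000_000L, 4)); // 800000000
  }
}
```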
[jira] [Commented] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
[ https://issues.apache.org/jira/browse/HBASE-12949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310295#comment-14310295 ] Jerry He commented on HBASE-12949: -- Hi, [~stack], [~ram_krish] Thanks for the comments. I dug into the HBase checksum a little, as you guys pointed out. The code is doing what is documented, which is: HBase checksum is on by default, so HBase will verify checksums for hfile blocks. Checksum verification inside FileSystem will be switched off. If the hbase-checksum verification fails, it will fall back to using FileSystem checksums. It will go back to HBase checksums later, as Stack mentioned; presumably HDFS will get the good blocks again. Scanner can be stuck in infinite loop if the HFile is corrupted --- Key: HBASE-12949 URL: https://issues.apache.org/jira/browse/HBASE-12949 Project: HBase Issue Type: Bug Affects Versions: 0.94.3, 0.98.10 Reporter: Jerry He Attachments: HBASE-12949-master.patch We've encountered a problem where compaction hangs and never completes. After looking into it further, we found that the compaction scanner was stuck in an infinite loop. See stack below. {noformat} org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:296) org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:257) org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:697) org.apache.hadoop.hbase.regionserver.StoreScanner.seekToNextRow(StoreScanner.java:672) org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529) org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:223) {noformat} We identified the hfile that seems to be corrupted. 
Using HFile tool shows the following: {noformat} [biadmin@hdtest009 bin]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -v -k -m -f /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7 15/01/23 11:53:17 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available 15/01/23 11:53:18 INFO util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32 15/01/23 11:53:18 INFO util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C 15/01/23 11:53:18 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS Scanning - /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7 WARNING, previous row is greater then current row filename - /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7 previous - \x00/20110203-094231205-79442793-1410161293068203000\x0Aattributes16794406\x00\x00\x01\x00\x00\x00\x00\x00\x00 current - Exception in thread "main" java.nio.BufferUnderflowException at java.nio.Buffer.nextGetIndex(Buffer.java:489) at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:347) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:856) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:768) at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:362) at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:262) at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:220) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.main(HFilePrettyPrinter.java:539) at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:802) {noformat} Turning on Java Assert shows the following: {noformat} 
Exception in thread "main" java.lang.AssertionError: Key 20110203-094231205-79442793-1410161293068203000/attributes:16794406/1099511627776/Minimum/vlen=15/mvcc=0 followed by a smaller key //0/Minimum/vlen=0/mvcc=0 in cf attributes at org.apache.hadoop.hbase.regionserver.StoreScanner.checkScanOrder(StoreScanner.java:672) {noformat} This shows that the hfile seems to be corrupted -- the keys don't seem to be right. But the Scanner does not give a meaningful error; instead it gets stuck in an infinite loop here: {code} KeyValueHeap.generalizedSeek() while ((scanner = heap.poll()) != null) { } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
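The assertion output in the HBASE-12949 description suggests one way a scanner could surface a meaningful error: check key order with a real exception rather than a Java assert, which is normally disabled. The sketch below shows only that idea and is not the attached patch. It compares raw byte arrays as unsigned bytes, which is a simplification of how HBase orders keys, and the names ScanOrderSketch, compareUnsigned and checkOrder are hypothetical.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical sketch: fail fast on out-of-order keys instead of relying on assert.
final class ScanOrderSketch {

  /** Lexicographic comparison treating bytes as unsigned (a simplified key order). */
  static int compareUnsigned(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) {
        return d;
      }
    }
    return a.length - b.length;
  }

  /** Throws IOException as soon as a key is smaller than its predecessor. */
  static void checkOrder(List<byte[]> keys) throws IOException {
    byte[] prev = null;
    for (byte[] cur : keys) {
      if (prev != null && compareUnsigned(prev, cur) > 0) {
        throw new IOException("Key followed by a smaller key; file may be corrupt");
      }
      prev = cur;
    }
  }
}
```

An exception like this would let a caller such as a compaction abort with a clear message rather than spin.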
[jira] [Updated] (HBASE-12891) Parallel execution for Hbck checkRegionConsistency
[ https://issues.apache.org/jira/browse/HBASE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-12891: -- Attachment: hbase-12891-addendum1.patch Reopened for addendum. Attaching what I had in mind (although a bit ugly). Rethrowing the exception is important IMO: if we cannot get the information for a subset of regions and then do some fixes based on that later in the hbck execution, it might be disastrous. Let's be safe instead. Parallel execution for Hbck checkRegionConsistency -- Key: HBASE-12891 URL: https://issues.apache.org/jira/browse/HBASE-12891 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0, 0.98.10, 1.1.0 Reporter: churro morales Assignee: churro morales Fix For: 2.0.0, 1.1.0, 0.98.11 Attachments: HBASE-12891-v1.patch, HBASE-12891.98.patch, HBASE-12891.patch, HBASE-12891.patch, hbase-12891-addendum1.patch We have a lot of regions on our cluster, ~500k, and noticed that hbck took quite some time in checkAndFixConsistency(). [~davelatham] patched our cluster to do this check in parallel to speed things up. I'll attach the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
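The point about rethrowing in the HBASE-12891 addendum comment can be illustrated with a small sketch of running checks in parallel and propagating the first failure rather than swallowing it. This is not the addendum patch or the HBaseFsck code. The class ParallelCheckSketch and the method checkAll are hypothetical, and each Callable stands in for one region's consistency check.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; not the HBaseFsck implementation.
final class ParallelCheckSketch {

  /**
   * Runs all checks on a fixed-size pool and returns how many completed.
   * If any check failed, its exception is rethrown to the caller so that
   * later fix-up steps never run on partial information.
   */
  static int checkAll(List<Callable<Void>> checks, int threads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // invokeAll blocks until every task has finished or failed.
      List<Future<Void>> futures = pool.invokeAll(checks);
      int completed = 0;
      for (Future<Void> f : futures) {
        try {
          f.get();
          completed++;
        } catch (ExecutionException e) {
          Throwable cause = e.getCause();
          if (cause instanceof Exception) {
            throw (Exception) cause; // surface the original failure
          }
          throw e;
        }
      }
      return completed;
    } finally {
      pool.shutdownNow();
    }
  }
}
```

The design choice is that a failure in any one check aborts the whole run, which matches the "let's be safe" reasoning in the comment.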
[jira] [Updated] (HBASE-12980) Delete of a table may not clean all rows from hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12980: --- Resolution: Fixed Fix Version/s: 0.98.11 Status: Resolved (was: Patch Available) Delete of a table may not clean all rows from hbase:meta Key: HBASE-12980 URL: https://issues.apache.org/jira/browse/HBASE-12980 Project: HBase Issue Type: Sub-task Components: Operability Reporter: stack Assignee: stack Fix For: 1.0.0, 2.0.0, 1.1.0, 0.98.11 Attachments: 12980.txt, HBASE-12980-0.98.patch One such case is if we miswrite the info:regioninfo column and it comes up empty, this row will remain in the table. We have a set of 'finally' cleanup tasks on table delete. Let me add one that for sure purges any rows to do with the deleted table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12956) Binding to 0.0.0.0 is broken after HBASE-10569
[ https://issues.apache.org/jira/browse/HBASE-12956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310353#comment-14310353 ] Hudson commented on HBASE-12956: SUCCESS: Integrated in HBase-1.0 #718 (See [https://builds.apache.org/job/HBase-1.0/718/]) HBASE-12956 Binding to 0.0.0.0 is broken after HBASE-10569 (enis: rev 15140bf48491d92dae2d514f2cc84c09205d87b7) * hbase-server/src/test/java/org/apache/hadoop/hbase/TestHBaseTestingUtility.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java Binding to 0.0.0.0 is broken after HBASE-10569 -- Key: HBASE-12956 URL: https://issues.apache.org/jira/browse/HBASE-12956 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Priority: Blocker Fix For: 1.0.0, 2.0.0, 1.1.0 Attachments: 0001-HBASE-12956-Binding-to-0.0.0.0-is-broken-after-HBASE.patch, HBASE-12956-v2.txt, HBASE-12956-v3.txt After the Region Server and Master code was merged, we lost the functionality to bind to 0.0.0.0 via hbase.regionserver.ipc.address and znodes now get created with the wildcard address which means that RSs and the master cannot connect to each other. Thanks to [~dimaspivak] for reporting the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)