[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414302#comment-13414302 ]

Hudson commented on HBASE-6394:
---

Integrated in HBase-0.92 #476 (See [https://builds.apache.org/job/HBase-0.92/476/])
HBASE-6394 verifyrep MR job map tasks throws NullPointerException (Revision 1361471)
Result = FAILURE
jxiang :
Files :
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java

> verifyrep MR job map tasks throws NullPointerException
> ---
>
> Key: HBASE-6394
> URL: https://issues.apache.org/jira/browse/HBASE-6394
> Project: HBase
> Issue Type: Bug
> Components: replication
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Priority: Minor
> Fix For: 0.92.2, 0.96.0, 0.94.1
>
> Attachments: 6394-trunk.patch, 6394-trunk_v2.patch
>
>
> {noformat}
> 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running child
> java.lang.NullPointerException
>   at org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapred.Child.main(Child.java:264)
> 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
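
The stack trace quoted above places the NullPointerException inside Verifier.cleanup(), called from Mapper.run(); the patch itself is not quoted in this thread. What follows is only a minimal sketch, assuming the failure comes from cleanup() closing a resource that map() creates lazily, so that a task which never receives a row reaches cleanup() with the field still null. The class LazyResourceMapperSketch and its members are hypothetical stand-ins and are not the VerifyReplication code.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a mapper with a lazily opened resource.
// This is NOT the actual VerifyReplication.Verifier implementation.
final class LazyResourceMapperSketch {
    private Closeable scanner;   // stays null until map() sees its first row
    int closeCalls = 0;          // counts how often the resource was closed

    // Called once per input row; opens the resource on first use.
    void map(String row) {
        if (scanner == null) {
            scanner = () -> closeCalls++;
        }
    }

    // Called once at the end of the task, even if map() never ran.
    void cleanup() {
        if (scanner != null) {   // guard: without it an empty split would throw an NPE here
            try {
                scanner.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            scanner = null;
        }
    }

    public static void main(String[] args) {
        LazyResourceMapperSketch emptySplit = new LazyResourceMapperSketch();
        emptySplit.cleanup();    // no rows were mapped; must not throw
        System.out.println("closes after empty split: " + emptySplit.closeCalls);

        LazyResourceMapperSketch oneRow = new LazyResourceMapperSketch();
        oneRow.map("row1");
        oneRow.cleanup();
        System.out.println("closes after one row: " + oneRow.closeCalls);
    }
}
```

The guard keeps cleanup() safe to call whether or not map() ever ran, which matters because the framework invokes it once per task regardless of how many rows the split contained.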
[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414289#comment-13414289 ]

Hudson commented on HBASE-6394:
---

Integrated in HBase-TRUNK #3127 (See [https://builds.apache.org/job/HBase-TRUNK/3127/])
HBASE-6394 verifyrep MR job map tasks throws NullPointerException (Revision 1361469)
Result = FAILURE
jxiang :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java
[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414288#comment-13414288 ]

Hudson commented on HBASE-6394:
---

Integrated in HBase-0.94-security #42 (See [https://builds.apache.org/job/HBase-0.94-security/42/])
HBASE-6394 verifyrep MR job map tasks throws NullPointerException (Revision 1361470)
Result = FAILURE
jxiang :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java
[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
[ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414283#comment-13414283 ]

Hudson commented on HBASE-6389:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #94 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/94/])
HBASE-6389 Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments (Aditya Kishore) (Revision 1361456)
Result = FAILURE
larsh :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java

> Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
>
>
> Key: HBASE-6389
> URL: https://issues.apache.org/jira/browse/HBASE-6389
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Aditya Kishore
> Assignee: Aditya Kishore
> Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch
>
>
> Continuing from HBASE-6375.
> It seems I was mistaken in my assumption that changing the value of "hbase.master.wait.on.regionservers.mintostart" to a sufficient number (from the default of 1) can help prevent assignment of all regions to one (or a small number of) region server(s).
> While this was the case in 0.90.x and 0.92.x, the behavior changed from 0.94.0 onwards to address HBASE-4993.
> From 0.94.0 onwards, the Master will proceed immediately after the timeout has lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not been reached.
> Reading the current conditions of waitForRegionServers() clarifies it:
> {code:title=ServerManager.java (trunk rev:1360470)}
> ..
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
>   throws InterruptedException {
> ..
>     while (
>       !this.master.isStopped() &&
>         slept < timeout &&
>         count < maxToStart &&
>         (lastCountChange+interval > now || count < minToStart)
>       ){
> ..
> {code}
> So with the current conditions, the wait will end as soon as the timeout is reached, even if fewer RSes than required have checked in with the Master, and the master will proceed with region assignment among these RSes alone.
> As mentioned in -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-, and I concur, this could have a disastrous effect in a large cluster, especially now that MSLAB is turned on.
> To enforce the required quorum as specified by "hbase.master.wait.on.regionservers.mintostart" irrespective of the timeout, these conditions need to be modified as follows:
> {code:title=ServerManager.java}
> ..
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time AND
>    *    the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
> ..
> ..
>     int minToStart = this.master.getConfiguration().
>       getInt("hbase.master.wait.on.regionservers.mintostart", 1);
>     int maxToStart = this.master.getConfiguration().
>       getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
>     if (maxToStart < minToStart) {
>       maxToStart = minToStart;
>     }
> ..
> ..
>     whi
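
The difference between the two wait conditions can be checked in isolation. The sketch below restates the current loop condition as quoted from ServerManager.java, and a proposed condition that is inferred from the revised Javadoc only, because the proposed loop is cut off in this message. The class WaitConditionSketch and both method names are hypothetical; this is not HBase code and makes no claim about what the committed patch contains.

```java
// Sketch only: pure functions mirroring the two loop conditions discussed above.
final class WaitConditionSketch {

    // Condition as quoted from ServerManager.java (trunk rev 1360470):
    // the loop exits as soon as the timeout elapses, whatever the count is.
    static boolean keepWaitingCurrent(boolean stopped, long slept, long timeout,
                                      int count, int minToStart, int maxToStart,
                                      long lastCountChange, long interval, long now) {
        return !stopped
            && slept < timeout
            && count < maxToStart
            && (lastCountChange + interval > now || count < minToStart);
    }

    // ASSUMPTION: derived from the proposed Javadoc. The loop exits only when
    // mintostart is reached AND the count has been stable for 'interval'
    // AND the timeout has elapsed (or the master stops, or maxtostart is reached).
    static boolean keepWaitingProposed(boolean stopped, long slept, long timeout,
                                       int count, int minToStart, int maxToStart,
                                       long lastCountChange, long interval, long now) {
        return !stopped
            && count < maxToStart
            && (lastCountChange + interval > now
                || slept < timeout
                || count < minToStart);
    }

    public static void main(String[] args) {
        // Timeout has just elapsed and only 2 of the 5 required servers checked in.
        boolean current = keepWaitingCurrent(
            false, 4500L, 4500L, 2, 5, Integer.MAX_VALUE, 0L, 1500L, 4500L);
        boolean proposed = keepWaitingProposed(
            false, 4500L, 4500L, 2, 5, Integer.MAX_VALUE, 0L, 1500L, 4500L);
        System.out.println("current keeps waiting:  " + current);
        System.out.println("proposed keeps waiting: " + proposed);
    }
}
```

With these inputs the current condition stops waiting once the timeout has elapsed, while the inferred proposed condition keeps waiting because the count is still below mintostart. Moving the timeout check inside the disjunction is what turns it from a hard upper bound on the wait into a lower bound.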
[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414284#comment-13414284 ]

Hudson commented on HBASE-6394:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #94 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/94/])
HBASE-6394 verifyrep MR job map tasks throws NullPointerException (Revision 1361469)
Result = FAILURE
jxiang :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java
[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414282#comment-13414282 ]

Hudson commented on HBASE-6394:
---

Integrated in HBase-0.94 #318 (See [https://builds.apache.org/job/HBase-0.94/318/])
HBASE-6394 verifyrep MR job map tasks throws NullPointerException (Revision 1361470)
Result = ABORTED
jxiang :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java
[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414274#comment-13414274 ]

Hadoop QA commented on HBASE-6394:
---

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12536484/6394-trunk_v2.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
       Please justify why no new tests are needed for this patch.
       Also please list what manual steps were performed to verify this patch.

    +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

    -1 findbugs. The patch appears to introduce 8 new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:
       org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks
       org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2386//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2386//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2386//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2386//console

This message is automatically generated.
[jira] [Updated] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-6394:
---

       Resolution: Fixed
    Fix Version/s: 0.94.1
                   0.96.0
                   0.92.2
           Status: Resolved  (was: Patch Available)

Integrated to 0.92, 0.94 and 0.96. Thanks Ted for the review.
[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
[ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414265#comment-13414265 ]

Hudson commented on HBASE-6389:
---

Integrated in HBase-TRUNK #3126 (See [https://builds.apache.org/job/HBase-TRUNK/3126/])
HBASE-6389 Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments (Aditya Kishore) (Revision 1361456)
Result = FAILURE
larsh :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java
[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
[ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414250#comment-13414250 ]

Hudson commented on HBASE-6389:
---

Integrated in HBase-0.94 #316 (See [https://builds.apache.org/job/HBase-0.94/316/])
HBASE-6389 Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments (Aditya Kishore) (Revision 1361458)
Result = FAILURE
larsh :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java
[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
[ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414241#comment-13414241 ]

Hudson commented on HBASE-6389:
---

Integrated in HBase-0.94-security #41 (See [https://builds.apache.org/job/HBase-0.94-security/41/])
HBASE-6389 Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments (Aditya Kishore) (Revision 1361458)
Result = FAILURE
larsh :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java
[jira] [Commented] (HBASE-6384) hbck should group together those sidelined regions need to be bulk loaded later
[ https://issues.apache.org/jira/browse/HBASE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414243#comment-13414243 ] Hudson commented on HBASE-6384: --- Integrated in HBase-0.94-security #41 (See [https://builds.apache.org/job/HBase-0.94-security/41/]) HBASE-6384 hbck should group together those sidelined regions need to be bulk loaded later (Revision 1361036) Result = FAILURE jxiang : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java > hbck should group together those sidelined regions need to be bulk loaded > later > --- > > Key: HBASE-6384 > URL: https://issues.apache.org/jira/browse/HBASE-6384 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang > Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 > > Attachments: 6384-trunk.patch > > > Currently, hbck sidelines some regions to break big overlap groups to avoid > possible compaction and region split. These sidelined regions should be > bulk loaded back later. Information about these regions is in the output. > It will be much easier to group them together under the same sideline rootdir, > for example, /hbase/.hbck/to_be_loaded/. If so, even we lose the output > file, we still know what regions to load back. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6380) bulkload should update the store.storeSize
[ https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414242#comment-13414242 ]

Hudson commented on HBASE-6380:
---
Integrated in HBase-0.94-security #41 (See [https://builds.apache.org/job/HBase-0.94-security/41/])
HBASE-6380 bulkload should update the store.storeSize (Revision 1361204)
Result = FAILURE
stack :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java

> bulkload should update the store.storeSize
> --
>
> Key: HBASE-6380
> URL: https://issues.apache.org/jira/browse/HBASE-6380
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Jie Huang
> Assignee: Jie Huang
> Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch
>
>
> After bulkloading some HFiles into the table, we found that the force-split did not
> work because the MidKey was NULL. The force-split worked normally only after we
> restarted the HBase service.
[jira] [Updated] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
[ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6389:
-
Resolution: Fixed
Status: Resolved (was: Patch Available)

Committed to 0.94 and 0.96

> Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
>
>
> Key: HBASE-6389
> URL: https://issues.apache.org/jira/browse/HBASE-6389
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Aditya Kishore
> Assignee: Aditya Kishore
> Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch
>
>
> Continuing from HBASE-6375.
> It seems I was mistaken in my assumption that changing the value of
> "hbase.master.wait.on.regionservers.mintostart" to a sufficient number (from the
> default of 1) can help prevent assignment of all regions to one (or a small
> number of) region server(s).
> While this was the case in 0.90.x and 0.92.x, the behavior has changed from
> 0.94.0 onwards to address HBASE-4993.
> From 0.94.0 onwards, Master will proceed immediately after the timeout has
> lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not been
> reached.
> Reading the current conditions of waitForRegionServers() clarifies it
> {code:title=ServerManager.java (trunk rev:1360470)}
>
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
>   throws InterruptedException {
>
>
>     while (
>       !this.master.isStopped() &&
>         slept < timeout &&
>         count < maxToStart &&
>         (lastCountChange+interval > now || count < minToStart)
>       ){
>
> {code}
> So with the current conditions, the wait will end as soon as the timeout is
> reached, even if fewer RSes than required have checked in with the Master, and the
> master will proceed with the region assignment among these RSes alone.
> As mentioned in
> -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-,
> and I concur, this could have a disastrous effect in a large cluster, especially
> now that MSLAB is turned on.
> To enforce the required quorum as specified by
> "hbase.master.wait.on.regionservers.mintostart" irrespective of the timeout,
> these conditions need to be modified as follows
> {code:title=ServerManager.java}
> ..
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time AND
>    *    the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
> ..
> ..
>     int minToStart = this.master.getConfiguration().
>       getInt("hbase.master.wait.on.regionservers.mintostart", 1);
>     int maxToStart = this.master.getConfiguration().
>       getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
>     if (maxToStart < minToStart) {
>       maxToStart = minToStart;
>     }
> ..
> ..
>     while (
>       !this.master.isStopped() &&
>         count < maxToStart &&
>         (lastCountChange+interval > now || timeout > slept || count < minToStart)
>       ){
> ..
> {code}
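The difference between the two loop conditions quoted in the description can be checked in isolation. What follows is a minimal sketch and not the committed patch: the class name `WaitConditionSketch`, the method names, and the sample numbers are hypothetical, and only the two boolean expressions are copied from the `while` conditions quoted from ServerManager.java.

```java
// Sketch only: class, method and parameter layout are hypothetical.
// The boolean expressions mirror the two while-conditions quoted in the issue.
final class WaitConditionSketch {

  /** Loop condition quoted from trunk rev 1360470 (before the change). */
  static boolean keepWaitingOld(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && slept < timeout
        && count < maxToStart
        && (lastCountChange + interval > now || count < minToStart);
  }

  /** Loop condition proposed in the issue description. */
  static boolean keepWaitingNew(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && count < maxToStart
        && (lastCountChange + interval > now || timeout > slept || count < minToStart);
  }

  public static void main(String[] args) {
    // Hypothetical situation: the timeout (4500 ms) has already elapsed,
    // but only 1 of the 3 required region servers has checked in.
    boolean oldWaits = keepWaitingOld(false, 5000, 4500, 1, 3,
        Integer.MAX_VALUE, 0, 1500, 5000);
    boolean newWaits = keepWaitingNew(false, 5000, 4500, 1, 3,
        Integer.MAX_VALUE, 0, 1500, 5000);
    System.out.println("old condition keeps waiting: " + oldWaits); // false
    System.out.println("new condition keeps waiting: " + newWaits); // true
  }
}
```

With these assumed inputs the old expression ends the wait once `slept` passes `timeout`, while the proposed expression keeps looping until `count` reaches `minToStart`, which is the behavior the description asks for.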
[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414211#comment-13414211 ]

Hadoop QA commented on HBASE-6394:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12536479/6394-trunk.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

-1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

-1 findbugs. The patch appears to introduce 8 new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2385//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2385//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2385//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2385//console

This message is automatically generated.
> verifyrep MR job map tasks throws NullPointerException
> ---
>
> Key: HBASE-6394
> URL: https://issues.apache.org/jira/browse/HBASE-6394
> Project: HBase
> Issue Type: Bug
> Components: replication
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Priority: Minor
> Attachments: 6394-trunk.patch, 6394-trunk_v2.patch
>
>
> {noformat}
> 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running child
> java.lang.NullPointerException
>   at org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapred.Child.main(Child.java:264)
> 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
> {noformat}
[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
[ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414201#comment-13414201 ] Lars Hofhansl commented on HBASE-6389: -- Ran TestHLogRecordReader locally. Passes fine (I did not expect that to be related to this patch).
[jira] [Comment Edited] (HBASE-6391) Master restart when enabling table will lead to region assignned twice
[ https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414186#comment-13414186 ]

Lars Hofhansl edited comment on HBASE-6391 at 7/14/12 12:24 AM:

I think this could be closed as DUP as well. Moving to 0.94.2 for now.

was (Author: lhofhansl):
I think this could be closed to DUP as well. Moving to 0.94.2 for now.

> Master restart when enabling table will lead to region assignned twice
> --
>
> Key: HBASE-6391
> URL: https://issues.apache.org/jira/browse/HBASE-6391
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.0
> Reporter: zhou wenjian
> Fix For: 0.94.2
>
>
> The scenario can be reproduced as follows.
> While a table is being enabled, some regions are already online on a region server
> and some are still being processed.
> Restart the master at that point.
> During master failover:
> // Region is being served and on an active server
> // add only if region not in disabled and enabling table
> if (false == checkIfRegionBelongsToDisabled(regionInfo)
>     && false == checkIfRegionsBelongsToEnabling(regionInfo)) {
>   regions.put(regionInfo, regionLocation);
>   addToServers(regionLocation, regionInfo);
> }
> the already opened regions are not added to the regions map in the master,
> and in the subsequent recoverTableInEnablingState the regions are assigned again.
> That leaves the cluster in an inconsistent state.
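The sequence in the HBASE-6391 description can be illustrated with a toy model. This is a sketch under stated assumptions, not HBase code: `EnablingFailoverSketch`, `assignedTwice` and all parameters are hypothetical, regions, tables and servers are plain strings, and the second step is only a stand-in for what the description attributes to recoverTableInEnablingState.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only: names are hypothetical and no HBase types are used.
final class EnablingFailoverSketch {

  /**
   * Returns the regions that are already online but would be assigned again
   * because the failover guard skipped them.
   */
  static List<String> assignedTwice(Map<String, String> onlineRegionToServer,
      Map<String, String> regionToTable,
      Set<String> disabledTables,
      Set<String> enablingTables) {
    // Step 1: rebuild the master's view, mirroring the guard quoted in the issue:
    // an online region is recorded only if its table is neither disabled nor enabling.
    Map<String, String> masterRegions = new TreeMap<>();
    for (Map.Entry<String, String> e : onlineRegionToServer.entrySet()) {
      String table = regionToTable.get(e.getKey());
      if (!disabledTables.contains(table) && !enablingTables.contains(table)) {
        masterRegions.put(e.getKey(), e.getValue());
      }
    }
    // Step 2 (assumed stand-in for recoverTableInEnablingState): every region of
    // an enabling table that the master does not list as online gets assigned.
    Map<String, String> sorted = new TreeMap<>(regionToTable);
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, String> e : sorted.entrySet()) {
      boolean enabling = enablingTables.contains(e.getValue());
      boolean unknownToMaster = !masterRegions.containsKey(e.getKey());
      boolean actuallyOnline = onlineRegionToServer.containsKey(e.getKey());
      if (enabling && unknownToMaster && actuallyOnline) {
        result.add(e.getKey());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> online = new TreeMap<>();
    online.put("r1", "rs1");            // r1 is already open on rs1
    Map<String, String> regionToTable = new TreeMap<>();
    regionToTable.put("r1", "t1");
    regionToTable.put("r2", "t1");      // r2 is still being processed
    Set<String> enabling = Collections.singleton("t1");
    Set<String> disabled = Collections.emptySet();
    System.out.println(assignedTwice(online, regionToTable, disabled, enabling)); // prints [r1]
  }
}
```

In this model `r1` is reported because it is online yet absent from the master's rebuilt map, so the recovery step would assign it a second time. A region of a table that is not in the enabling set never shows up in the result.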
[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414189#comment-13414189 ] Zhihong Ted Yu commented on HBASE-6394: --- +1 on patch v2.
[jira] [Updated] (HBASE-6391) Master restart when enabling table will lead to region assignned twice
[ https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6391: - Fix Version/s: (was: 0.94.1) 0.94.2 I think this could be closed to DUP as well. Moving to 0.94.2 for now.
[jira] [Updated] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6394: --- Attachment: 6394-trunk_v2.patch
[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
[ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414185#comment-13414185 ] Lars Hofhansl commented on HBASE-6389: -- +1 on last patch. If there are no objections I'll commit this to 0.94 and 0.96. Let's discuss the failure after timeout idea in a different jira.
[jira] [Updated] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6394: --- Status: Patch Available (was: Open) Addressed Ted's comment.
[jira] [Updated] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6394: --- Status: Open (was: Patch Available)
[jira] [Resolved] (HBASE-6395) TestFSSchedulerApp should be in scheduler.fair package
[ https://issues.apache.org/jira/browse/HBASE-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu resolved HBASE-6395.
---
Resolution: Won't Fix

This should have been a MAPREDUCE JIRA.

> TestFSSchedulerApp should be in scheduler.fair package
> ---
>
> Key: HBASE-6395
> URL: https://issues.apache.org/jira/browse/HBASE-6395
> Project: HBase
> Issue Type: Bug
> Reporter: Zhihong Ted Yu
>
> MAPREDUCE-3451 added Fair Scheduler to MRv2.
> TestFSSchedulerApp was added under
> src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair
> but its package was declared to be
> org.apache.hadoop.yarn.server.resourcemanager.scheduler
[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414179#comment-13414179 ] Jimmy Xiang commented on HBASE-6394: Sure, I will add that. > verifyrep MR job map tasks throws NullPointerException > --- > > Key: HBASE-6394 > URL: https://issues.apache.org/jira/browse/HBASE-6394 > Project: HBase > Issue Type: Bug > Components: replication >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Minor > Attachments: 6394-trunk.patch > > > {noformat} > 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: > Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1 > 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running > child > java.lang.NullPointerException > at > org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325) > at org.apache.hadoop.mapred.Child$4.run(Child.java:270) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) > at org.apache.hadoop.mapred.Child.main(Child.java:264) > 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup > for the task > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6395) TestFSSchedulerApp should be in scheduler.fair package
Zhihong Ted Yu created HBASE-6395:
---
Summary: TestFSSchedulerApp should be in scheduler.fair package
Key: HBASE-6395
URL: https://issues.apache.org/jira/browse/HBASE-6395
Project: HBase
Issue Type: Bug
Reporter: Zhihong Ted Yu

MAPREDUCE-3451 added Fair Scheduler to MRv2.
TestFSSchedulerApp was added under
src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair
but its package was declared to be
org.apache.hadoop.yarn.server.resourcemanager.scheduler
[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414161#comment-13414161 ]

Zhihong Ted Yu commented on HBASE-6394:
---
{code}
+replicatedScanner.close();
{code}
I was expecting 'replicatedScanner = null' following the above call.

> verifyrep MR job map tasks throws NullPointerException
> ---
>
> Key: HBASE-6394
> URL: https://issues.apache.org/jira/browse/HBASE-6394
> Project: HBase
> Issue Type: Bug
> Components: replication
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Priority: Minor
> Attachments: 6394-trunk.patch
>
> {noformat}
> 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running child
> java.lang.NullPointerException
>     at org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>     at org.apache.hadoop.mapred.Child.main(Child.java:264)
> 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
> {noformat}
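For readers following the review comment, here is a minimal sketch of a null-safe cleanup that also resets the field after closing it. It is not the attached patch. VerifierSketch and CountingScanner are hypothetical stand-ins, and the assumption that the scanner is opened lazily (and is therefore still null when a task maps no rows) is mine, inferred only from the NullPointerException in cleanup().

```java
import java.io.Closeable;

/** Hypothetical stand-in for the mapper in VerifyReplication; not the real HBase class. */
class VerifierSketch {

  /** Hypothetical scanner that counts close() calls so the behavior can be observed. */
  static final class CountingScanner implements Closeable {
    int closeCalls = 0;

    @Override
    public void close() {
      closeCalls++;
    }
  }

  // Assumption: this stays null until the first row is mapped.
  CountingScanner replicatedScanner;

  /** Opens the scanner on first use. */
  void map() {
    if (replicatedScanner == null) {
      replicatedScanner = new CountingScanner();
    }
  }

  /** Safe to call when nothing was mapped, and safe to call more than once. */
  void cleanup() {
    if (replicatedScanner != null) {
      replicatedScanner.close();
      replicatedScanner = null; // the follow-up asked for in the comment above
    }
  }
}
```

Resetting the field makes a second cleanup() call a no-op instead of a second close() on the same scanner.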
[jira] [Updated] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6394: --- Status: Patch Available (was: Open) The log is from a previous version of HBase. So it is a little bit off with trunk. > verifyrep MR job map tasks throws NullPointerException > --- > > Key: HBASE-6394 > URL: https://issues.apache.org/jira/browse/HBASE-6394 > Project: HBase > Issue Type: Bug > Components: replication >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Minor > Attachments: 6394-trunk.patch > > > {noformat} > 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: > Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1 > 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running > child > java.lang.NullPointerException > at > org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325) > at org.apache.hadoop.mapred.Child$4.run(Child.java:270) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) > at org.apache.hadoop.mapred.Child.main(Child.java:264) > 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup > for the task > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
Jimmy Xiang created HBASE-6394:
---
Summary: verifyrep MR job map tasks throws NullPointerException
Key: HBASE-6394
URL: https://issues.apache.org/jira/browse/HBASE-6394
Project: HBase
Issue Type: Bug
Components: replication
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Attachments: 6394-trunk.patch

{noformat}
2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.NullPointerException
    at org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
{noformat}
[jira] [Updated] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6394: --- Attachment: 6394-trunk.patch > verifyrep MR job map tasks throws NullPointerException > --- > > Key: HBASE-6394 > URL: https://issues.apache.org/jira/browse/HBASE-6394 > Project: HBase > Issue Type: Bug > Components: replication >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Minor > Attachments: 6394-trunk.patch > > > {noformat} > 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: > Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1 > 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running > child > java.lang.NullPointerException > at > org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325) > at org.apache.hadoop.mapred.Child$4.run(Child.java:270) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) > at org.apache.hadoop.mapred.Child.main(Child.java:264) > 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup > for the task > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6380) bulkload should update the store.storeSize
[ https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414154#comment-13414154 ] Hudson commented on HBASE-6380: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #93 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/93/]) HBASE-6380 bulkload should update the store.storeSize (Revision 1361203) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java > bulkload should update the store.storeSize > -- > > Key: HBASE-6380 > URL: https://issues.apache.org/jira/browse/HBASE-6380 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 0.94.0, 0.96.0 >Reporter: Jie Huang >Assignee: Jie Huang >Priority: Critical > Fix For: 0.96.0, 0.94.1 > > Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch > > > After bulkloading some HFiles into the Table, we found the force-split didn't > work because of the MidKey == NULL. Only if we re-booted the HBase service, > the force-split can work normally. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-6381) AssignmentManager should use the same logic for clean startup and failover
[ https://issues.apache.org/jira/browse/HBASE-6381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang reassigned HBASE-6381:
---
Assignee: Jimmy Xiang

> AssignmentManager should use the same logic for clean startup and failover
> ---
>
> Key: HBASE-6381
> URL: https://issues.apache.org/jira/browse/HBASE-6381
> Project: HBase
> Issue Type: Bug
> Components: master
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
>
> Currently AssignmentManager handles clean startup and failover very differently.
> The two code paths are mingled together, so it is hard to tell which logic belongs to which case.
> We should clean it up and share the same logic so that AssignmentManager handles both cases the same way.
> This way, the code will be much easier to understand and maintain.
[jira] [Assigned] (HBASE-6392) UnknownRegionException blocks hbck from sideline big overlap regions
[ https://issues.apache.org/jira/browse/HBASE-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang reassigned HBASE-6392:
---
Assignee: Jimmy Xiang

> UnknownRegionException blocks hbck from sideline big overlap regions
> ---
>
> Key: HBASE-6392
> URL: https://issues.apache.org/jira/browse/HBASE-6392
> Project: HBase
> Issue Type: Bug
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
>
> Before sidelining a big overlap region, hbck first tries to close it and take it offline. However, it sometimes throws NotServingRegion or UnknownRegionException.
> This could be because the region is not open/assigned at all, or because of some other issue.
> We should figure out why and fix it.
> It would also be better to print in the log the command line for bulk loading sidelined regions back, if any.
[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
[ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414106#comment-13414106 ]

Hadoop QA commented on HBASE-6389:
---
-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12536453/HBASE-6389_trunk.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 6 new or modified tests.
    +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).
    -1 findbugs. The patch appears to introduce 8 new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestHLogRecordReader

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2384//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2384//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2384//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2384//console

This message is automatically generated.

> Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
> ---
>
> Key: HBASE-6389
> URL: https://issues.apache.org/jira/browse/HBASE-6389
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Aditya Kishore
> Assignee: Aditya Kishore
> Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch
>
> Continuing from HBASE-6375.
> It seems I was mistaken in my assumption that changing the value of "hbase.master.wait.on.regionservers.mintostart" to a sufficient number (from the default of 1) can help prevent assignment of all regions to one (or a small number of) region server(s).
> While this was the case in 0.90.x and 0.92.x, the behavior changed from 0.94.0 onwards to address HBASE-4993.
> From 0.94.0 onwards, Master will proceed immediately after the timeout has lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not been reached.
> Reading the current conditions of waitForRegionServers() clarifies it:
> {code:title=ServerManager.java (trunk rev:1360470)}
>
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
>   throws InterruptedException {
>
>
>     while (
>       !this.master.isStopped() &&
>         slept < timeout &&
>         count < maxToStart &&
>         (lastCountChange+interval > now || count < minToStart)
>       ){
>
> {code}
> So with the current conditions, the wait will end as soon as the timeout is reached, even if fewer RSes have checked in with the Master, and the master will proceed with the region assignment among these RSes alone.
> As mentioned in -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-, and I concur, this could have a disastrous effect in a large cluster, especially now that MSLAB is turned on.
> To enforce the required quorum as specified by "hbase.master.wait.on.regionservers.mintostart" irrespective of timeout, these conditions need to be modified as follows:
> {code:title=ServerManager.java}
> ..
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number
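To make the effect of the quoted loop condition concrete, the sketch below copies that condition into a standalone predicate. It only mirrors the four clauses quoted from ServerManager.java. The class name CurrentWaitCondition, the parameter list and the sample values are illustrative assumptions, not HBase code.

```java
/** Illustrative only: mirrors the quoted while-condition, not the real ServerManager. */
final class CurrentWaitCondition {

  /** Returns true while the master would keep waiting for region servers. */
  static boolean keepWaiting(boolean stopped, long slept, long timeout, int count,
      int minToStart, int maxToStart, long lastCountChange, long interval, long now) {
    return !stopped
        && slept < timeout            // the timeout alone is enough to end the wait
        && count < maxToStart
        && (lastCountChange + interval > now || count < minToStart);
  }

  public static void main(String[] args) {
    // One region server has checked in, three are required, and the timeout has just elapsed.
    boolean waiting = keepWaiting(false, 4500, 4500, 1, 3, Integer.MAX_VALUE, 0, 1500, 4500);
    System.out.println("still waiting: " + waiting);
  }
}
```

Because `slept < timeout` is joined to the other clauses with `&&`, the predicate turns false once the timeout elapses, whatever the value of `count < minToStart`.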
[jira] [Updated] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
[ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Kishore updated HBASE-6389:
---
Attachment: HBASE-6389_trunk.patch

The test failures were the result of masked errors in the test code, which this change brought out. There were two such errors.
# The function org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster() was overriding the values of 'mintostart' and 'maxtostart' with a single value, even if the caller had set them explicitly.
# org.apache.hadoop.hbase.regionserver.TestRSKilledWhenMasterInitializing did not set these values even though it kills one RS during master initialization.

The attached patch fixes these two.

> Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
> ---
>
> Key: HBASE-6389
> URL: https://issues.apache.org/jira/browse/HBASE-6389
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Aditya Kishore
> Assignee: Aditya Kishore
> Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch
>
> Continuing from HBASE-6375.
> It seems I was mistaken in my assumption that changing the value of "hbase.master.wait.on.regionservers.mintostart" to a sufficient number (from the default of 1) can help prevent assignment of all regions to one (or a small number of) region server(s).
> While this was the case in 0.90.x and 0.92.x, the behavior changed from 0.94.0 onwards to address HBASE-4993.
> From 0.94.0 onwards, Master will proceed immediately after the timeout has lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not been reached.
> Reading the current conditions of waitForRegionServers() clarifies it:
> {code:title=ServerManager.java (trunk rev:1360470)}
>
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
>   throws InterruptedException {
>
>
>     while (
>       !this.master.isStopped() &&
>         slept < timeout &&
>         count < maxToStart &&
>         (lastCountChange+interval > now || count < minToStart)
>       ){
>
> {code}
> So with the current conditions, the wait will end as soon as the timeout is reached, even if fewer RSes have checked in with the Master, and the master will proceed with the region assignment among these RSes alone.
> As mentioned in -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-, and I concur, this could have a disastrous effect in a large cluster, especially now that MSLAB is turned on.
> To enforce the required quorum as specified by "hbase.master.wait.on.regionservers.mintostart" irrespective of timeout, these conditions need to be modified as follows:
> {code:title=ServerManager.java}
> ..
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time AND
>    *    the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
> ..
> ..
>     int minToStart = this.master.getConfiguration().
>       getInt("hbase.master.wait.on.regionservers.mintostart", 1);
>     int maxToStart = this.master.getConfiguration().
>       getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
>     if (maxToStart < minToStart) {
>       maxToStart = minToStart;
>     }
> ..
> ..
>     while (
>       !this.master.isStopped() &&
>         count < maxToStart &&
>         (lastCountChange+interval > now || t
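The proposed loop condition is cut off in this message, so the sketch below does not reproduce the patch. It is one reading of the proposed javadoc: the wait ends when the master is stopped, or 'maxtostart' servers have reported, or all three of "mintostart reached", "no new server for 'interval'" and "timeout reached" hold. ProposedWaitCondition and its parameters are hypothetical names chosen for illustration.

```java
/** Illustrative only: derived from the proposed javadoc, not from the attached patch. */
final class ProposedWaitCondition {

  /** Clamp from the quoted snippet: maxToStart is raised to minToStart if it is smaller. */
  static int effectiveMax(int minToStart, int maxToStart) {
    return Math.max(minToStart, maxToStart);
  }

  /** Returns true while the master would keep waiting for region servers. */
  static boolean keepWaiting(boolean stopped, long slept, long timeout, int count,
      int minToStart, int maxToStart, long lastCountChange, long interval, long now) {
    int max = effectiveMax(minToStart, maxToStart);
    // Third stop condition: quorum reached AND quiet for 'interval' AND timeout elapsed.
    boolean quorumQuietAndTimedOut = count >= minToStart
        && lastCountChange + interval <= now
        && slept >= timeout;
    return !stopped && count < max && !quorumQuietAndTimedOut;
  }

  public static void main(String[] args) {
    // Timeout long past, but only 1 of the 3 required servers has reported.
    System.out.println(keepWaiting(false, 10000, 4500, 1, 3, Integer.MAX_VALUE, 0, 1500, 10000));
    // Quorum reached, no new server for the interval, timeout elapsed.
    System.out.println(keepWaiting(false, 10000, 4500, 3, 3, Integer.MAX_VALUE, 0, 1500, 10000));
  }
}
```

Under this reading the timeout can no longer end the wait by itself. By De Morgan's law the negated third condition is the disjunction `count < minToStart || lastCountChange + interval > now || slept < timeout`.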
[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
[ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414031#comment-13414031 ]

Aditya Kishore commented on HBASE-6389:
---
I like the idea of treating the timeout as an error case, and if we do decide on that, two things need to be taken care of.
# The current default timeout of 4.5 sec may not be appropriate and may require upward revision (to the tune of a few minutes), and
# The master would need to do a cluster shutdown including other standby masters, otherwise each standby master may continue after the previous one has given up. In the worst case, if somehow 'minToStart' number of RSes join the last master, the cluster may be left with no standby master.

For this JIRA, I would like to revert to the original behavior of the Master (until 0.92) of waiting for 'minToStart' number of RSes.

> Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
> ---
>
> Key: HBASE-6389
> URL: https://issues.apache.org/jira/browse/HBASE-6389
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Aditya Kishore
> Assignee: Aditya Kishore
> Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6389_trunk.patch
>
> Continuing from HBASE-6375.
> It seems I was mistaken in my assumption that changing the value of "hbase.master.wait.on.regionservers.mintostart" to a sufficient number (from the default of 1) can help prevent assignment of all regions to one (or a small number of) region server(s).
> While this was the case in 0.90.x and 0.92.x, the behavior changed from 0.94.0 onwards to address HBASE-4993.
> From 0.94.0 onwards, Master will proceed immediately after the timeout has lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not been reached.
> Reading the current conditions of waitForRegionServers() clarifies it:
> {code:title=ServerManager.java (trunk rev:1360470)}
>
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
>   throws InterruptedException {
>
>
>     while (
>       !this.master.isStopped() &&
>         slept < timeout &&
>         count < maxToStart &&
>         (lastCountChange+interval > now || count < minToStart)
>       ){
>
> {code}
> So with the current conditions, the wait will end as soon as the timeout is reached, even if fewer RSes have checked in with the Master, and the master will proceed with the region assignment among these RSes alone.
> As mentioned in -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-, and I concur, this could have a disastrous effect in a large cluster, especially now that MSLAB is turned on.
> To enforce the required quorum as specified by "hbase.master.wait.on.regionservers.mintostart" irrespective of timeout, these conditions need to be modified as follows:
> {code:title=ServerManager.java}
> ..
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time AND
>    *    the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
> ..
> ..
>     int minToStart = this.master.getConfiguration().
>       getInt("hbase.master.wait.on.regionservers.mintostart", 1);
>     int maxToStart = this.master.getConfiguration().
>       getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
>     if (maxToStart < minToStart) {
>       maxTo
[jira] [Assigned] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark reassigned HBASE-4050: Assignee: Elliott Clark (was: Alex Baranau) > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Elliott Clark >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, > HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, > HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413945#comment-13413945 ] Elliott Clark commented on HBASE-4050: -- Test failure looks un-related. Works on my machine. > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, > HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, > HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6393) Decouple audit event creation from storage in AccessController
[ https://issues.apache.org/jira/browse/HBASE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Vanzin updated HBASE-6393:
---
Attachment: accesslogger-v1.patch

Current version of my code, tested with a custom implementation of the new AccessLogger interface.

> Decouple audit event creation from storage in AccessController
> ---
>
> Key: HBASE-6393
> URL: https://issues.apache.org/jira/browse/HBASE-6393
> Project: HBase
> Issue Type: Brainstorming
> Components: security
> Reporter: Marcelo Vanzin
> Attachments: accesslogger-v1.patch
>
> Currently, AccessController takes care of both generating audit events (by performing access checks) and storing them (by creating a log message and writing it to the AUDITLOG logger).
> This makes the logging system the only way to catch audit events. It means that if someone wants to do something fancier (like writing these records to a database somewhere), they need to hack through the logging system and parse the messages generated by AccessController, which is not optimal.
> The attached patch decouples generation and storage by introducing a new interface, used by AccessController, to log the audit events. The current, log-based storage is kept in place so that current users won't be affected by the change.
> I'm filing this as an RFC at this point, so the patch is not totally clean; it's on top of HBase 0.92 (which is easier for me to test) and doesn't have any unit tests, for starters. But the changes should be very similar on trunk - I don't remember changes in this particular area of the code between those versions.
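As a rough illustration of the decoupling described above, here is a minimal sketch. The actual interface is in accesslogger-v1.patch and is not shown in this thread, so every name and signature below (AccessLoggerSketch, logAccess, both implementations, AccessCheckerSketch) is hypothetical. java.util.logging merely stands in for whatever logger HBase uses for AUDITLOG.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

/** Hypothetical shape of the audit sink; the real interface lives in the attached patch. */
interface AccessLoggerSketch {
  void logAccess(String user, String request, boolean allowed);
}

/** Preserves the current behavior: format a message and write it to a logger. */
final class LogBackedAccessLogger implements AccessLoggerSketch {
  private static final Logger AUDITLOG = Logger.getLogger("AUDITLOG");

  @Override
  public void logAccess(String user, String request, boolean allowed) {
    AUDITLOG.info("Access " + (allowed ? "allowed" : "denied")
        + " for user " + user + "; request: " + request);
  }
}

/** Alternative sink keeping structured records, e.g. in place of a database writer. */
final class RecordingAccessLogger implements AccessLoggerSketch {
  final List<String> records = new ArrayList<>();

  @Override
  public void logAccess(String user, String request, boolean allowed) {
    records.add(user + "|" + request + "|" + allowed);
  }
}

/** Stand-in for AccessController: it decides, then reports to whichever sink it was given. */
final class AccessCheckerSketch {
  private final AccessLoggerSketch sink;

  AccessCheckerSketch(AccessLoggerSketch sink) {
    this.sink = sink;
  }

  boolean check(String user, String request, boolean permitted) {
    sink.logAccess(user, request, permitted);
    return permitted;
  }
}
```

The checker never formats a log line itself, so a caller that wants database storage supplies a different sink and does not have to parse log messages.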
[jira] [Created] (HBASE-6393) Decouple audit event creation from storage in AccessController
Marcelo Vanzin created HBASE-6393:
-------------------------------------

Summary: Decouple audit event creation from storage in AccessController
Key: HBASE-6393
URL: https://issues.apache.org/jira/browse/HBASE-6393
Project: HBase
Issue Type: Brainstorming
Components: security
Reporter: Marcelo Vanzin

Currently, AccessController takes care of both generating audit events (by performing access checks) and storing them (by creating a log message and writing it to the AUDITLOG logger).

This makes the logging system the only way to catch audit events. It means that if someone wants to do something fancier (like writing these records to a database somewhere), they need to hack through the logging system, and parse the messages generated by AccessController, which is not optimal.

The attached patch decouples generation and storage by introducing a new interface, used by AccessController, to log the audit events. The current, log-based storage is kept in place so that current users won't be affected by the change.

I'm filing this as an RFC at this point, so the patch is not totally clean; it's on top of HBase 0.92 (which is easier for me to test) and doesn't have any unit tests, for starters. But the changes should be very similar on trunk - I don't remember changes in this particular area of the code between those versions.
[jira] [Commented] (HBASE-6288) In hbase-daemons.sh, description of the default backup-master file path is wrong
[ https://issues.apache.org/jira/browse/HBASE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413921#comment-13413921 ]

Benjamin Kim commented on HBASE-6288:
-------------------------------------

This took a while because I was away on vacation. Here are the patches =)

> In hbase-daemons.sh, description of the default backup-master file path is wrong
> ---------------------------------------------------------------------------------
>
> Key: HBASE-6288
> URL: https://issues.apache.org/jira/browse/HBASE-6288
> Project: HBase
> Issue Type: Task
> Components: master, scripts, shell
> Affects Versions: 0.92.0, 0.92.1, 0.94.0
> Reporter: Benjamin Kim
> Attachments: HBASE-6288-92-1.patch, HBASE-6288-92.patch, HBASE-6288-94.patch, HBASE-6288-trunk.patch
>
> In hbase-daemons.sh, the description of the default backup-master file path is wrong:
> {code}
> # HBASE_BACKUP_MASTERS File naming remote hosts.
> # Default is ${HADOOP_CONF_DIR}/backup-masters
> {code}
> It says the default backup-masters file path is in the Hadoop conf dir, but
> shouldn't this be HBASE_CONF_DIR?
> Adding the following lines to conf/hbase-env.sh would also be helpful:
> {code}
> # File naming hosts on which backup HMaster will run.
> # $HBASE_HOME/conf/backup-masters by default.
> export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
> {code}
[jira] [Updated] (HBASE-6288) In hbase-daemons.sh, description of the default backup-master file path is wrong
[ https://issues.apache.org/jira/browse/HBASE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Kim updated HBASE-6288:
--------------------------------
    Attachment: HBASE-6288-trunk.patch
                HBASE-6288-94.patch
                HBASE-6288-92-1.patch
                HBASE-6288-92.patch

> In hbase-daemons.sh, description of the default backup-master file path is wrong
> ---------------------------------------------------------------------------------
>
> Key: HBASE-6288
> URL: https://issues.apache.org/jira/browse/HBASE-6288
> Project: HBase
> Issue Type: Task
> Components: master, scripts, shell
> Affects Versions: 0.92.0, 0.92.1, 0.94.0
> Reporter: Benjamin Kim
> Attachments: HBASE-6288-92-1.patch, HBASE-6288-92.patch, HBASE-6288-94.patch, HBASE-6288-trunk.patch
>
> In hbase-daemons.sh, the description of the default backup-master file path is wrong:
> {code}
> # HBASE_BACKUP_MASTERS File naming remote hosts.
> # Default is ${HADOOP_CONF_DIR}/backup-masters
> {code}
> It says the default backup-masters file path is in the Hadoop conf dir, but
> shouldn't this be HBASE_CONF_DIR?
> Adding the following lines to conf/hbase-env.sh would also be helpful:
> {code}
> # File naming hosts on which backup HMaster will run.
> # $HBASE_HOME/conf/backup-masters by default.
> export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
> {code}
[jira] [Updated] (HBASE-5997) Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
[ https://issues.apache.org/jira/browse/HBASE-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-5997:
----------------------------------
    Attachment: HBASE-5997_94 V3.patch

Patch addressing Stack's comment.

> Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
> ----------------------------------------------------------------
>
> Key: HBASE-5997
> URL: https://issues.apache.org/jira/browse/HBASE-5997
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: Anoop Sam John
> Fix For: 0.94.2
> Attachments: HBASE-5997_0.94.patch, HBASE-5997_94 V2.patch, HBASE-5997_94 V3.patch, Testcase.patch.txt
>
> Please refer to the comment
> https://issues.apache.org/jira/browse/HBASE-5922?focusedCommentId=13269346&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13269346.
> Raised this issue to address that comment, just so we don't forget it.
[jira] [Commented] (HBASE-5997) Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
[ https://issues.apache.org/jira/browse/HBASE-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413884#comment-13413884 ]

Anoop Sam John commented on HBASE-5997:
---------------------------------------

bq. On the second item when we do the compare, are the offsets to where the key bytes start or to where the key starts (with its length preamble)? For sure, we are comparing the row portions of keys?

The offset points to the key (with its length preamble). KeyComparator is used, and we can see there how the rowLength is taken into account. We compare the full key (row key, then CF, qualifier, ...).

> Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
> ----------------------------------------------------------------
>
> Key: HBASE-5997
> URL: https://issues.apache.org/jira/browse/HBASE-5997
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: Anoop Sam John
> Fix For: 0.94.2
> Attachments: HBASE-5997_0.94.patch, HBASE-5997_94 V2.patch, Testcase.patch.txt
>
> Please refer to the comment
> https://issues.apache.org/jira/browse/HBASE-5922?focusedCommentId=13269346&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13269346.
> Raised this issue to address that comment, just so we don't forget it.
[jira] [Commented] (HBASE-6378) the javadoc of setEnabledTable maybe not describe accurately
[ https://issues.apache.org/jira/browse/HBASE-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413883#comment-13413883 ]

David S. Wang commented on HBASE-6378:
--------------------------------------

From the patch:

+ * Sets the ENABLED state in the cache and Creates or force updates an node to
+ * the ENABLED state for the specified table.

I'd modify the above to be:

+ * Sets the ENABLED state in the cache and creates or force updates a node to
+ * ENABLED state for the specified table.

> the javadoc of setEnabledTable maybe not describe accurately
> ------------------------------------------------------------
>
> Key: HBASE-6378
> URL: https://issues.apache.org/jira/browse/HBASE-6378
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.0
> Reporter: zhou wenjian
> Fix For: 0.94.2
> Attachments: 6378.patch
>
> /**
>  * Sets the ENABLED state in the cache and deletes the zookeeper node. Fails
>  * silently if the node is not in enabled in zookeeper
>  *
>  * @param tableName
>  * @throws KeeperException
>  */
> public void setEnabledTable(final String tableName) throws KeeperException {
>   setTableState(tableName, TableState.ENABLED);
> }
> When setEnabledTable is called, it updates the cache and the zookeeper
> node, rather than deleting the zk node.
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413872#comment-13413872 ]

Hadoop QA commented on HBASE-4050:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12536404/HBASE-4050-8.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 11 new or modified tests.
    +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).
    -1 findbugs. The patch appears to introduce 10 new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests:
       org.apache.hadoop.hbase.catalog.TestMetaReaderEditor

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2383//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2383//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2383//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2383//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2383//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2383//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2383//console

This message is automatically generated.

> Update HBase metrics framework to metrics2 framework
> ----------------------------------------------------
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
> Issue Type: New Feature
> Components: metrics
> Affects Versions: 0.90.4
> Environment: Java 6
> Reporter: Eric Yang
> Assignee: Alex Baranau
> Priority: Critical
> Fix For: 0.96.0
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+,
> and it might get removed in future Hadoop release. Hence, HBase needs to
> revise the dependency of MetricsContext to use Metrics2 framework.
[jira] [Created] (HBASE-6392) UnknownRegionException blocks hbck from sideline big overlap regions
Jimmy Xiang created HBASE-6392:
----------------------------------

Summary: UnknownRegionException blocks hbck from sideline big overlap regions
Key: HBASE-6392
URL: https://issues.apache.org/jira/browse/HBASE-6392
Project: HBase
Issue Type: Bug
Reporter: Jimmy Xiang

Before sidelining a big overlap region, hbck first tries to close it and offline it. However, sometimes it throws NotServingRegion or UnknownRegionException. It could be because the region is not open/assigned at all, or some other issue. We should figure out why and fix it.

By the way, it's better to print out in the log the command line to bulk load back sidelined regions, if any.
[jira] [Commented] (HBASE-6384) hbck should group together those sidelined regions need to be bulk loaded later
[ https://issues.apache.org/jira/browse/HBASE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413871#comment-13413871 ]

Jimmy Xiang commented on HBASE-6384:
------------------------------------

@Jon, printing the actual bulk load command line is a good idea. It will be addressed in HBASE-6392.

> hbck should group together those sidelined regions need to be bulk loaded later
> -------------------------------------------------------------------------------
>
> Key: HBASE-6384
> URL: https://issues.apache.org/jira/browse/HBASE-6384
> Project: HBase
> Issue Type: Improvement
> Components: hbck
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
> Attachments: 6384-trunk.patch
>
> Currently, hbck sidelines some regions to break big overlap groups to avoid
> possible compaction and region split. These sidelined regions should be
> bulk loaded back later. Information about these regions is in the output.
> It will be much easier to group them together under the same sideline rootdir,
> for example, /hbase/.hbck/to_be_loaded/. If so, even if we lose the output
> file, we still know what regions to load back.
[jira] [Commented] (HBASE-6390) append() and increment() may result in inconsistent result on retries.
[ https://issues.apache.org/jira/browse/HBASE-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413870#comment-13413870 ]

Andrew Purtell commented on HBASE-6390:
---------------------------------------

So what you are looking for here is a way for a user to, perhaps optionally, make idempotent requests out of Append and Increment, correct?

Let me volunteer a couple of strawmen:

1) Could overload the timestamp of the Append and Increment requests. If the request is "out of date" relative to another request already applied, throw back a DoNotRetryException (or just a DNRE for that op if submitted as a MultiAction). This is roughly how ZooKeeper handles this class of distributed synchronization issue. Timestamp becomes a global sequence number. Not a logical sequence number, so clocks must be closely synchronized. Each memstore would track the (server side) time of the most recent in-place update mutation. Could go further and keep a soft cache of in-place update times by row or even KV for use by append/increment/ICV. If more specific information gets evicted from the cache due to pressure, then fallback to the per-memstore global timestamp would still ensure correctness, but potentially with more resubmission work for the client/app.

2) A more generic option could be:
* Extend the API where the user can set an optional cookie (a long).
* Keep a ring buffer of recent cookies up on the server.
* Check the buffer first if a request with given cookie has already been applied and throw an exception back to the client if so.

Wouldn't guarantee correctness outside of some time bound. Also I worry about state management on the server. How large would that buffer need to be to capture all cookies submitted within ~(2 * time bound)?

> append() and increment() may result in inconsistent result on retries.
> ----------------------------------------------------------------------
>
> Key: HBASE-6390
> URL: https://issues.apache.org/jira/browse/HBASE-6390
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Ashutosh Jindal
>
> The append() and increment() APIs can give inconsistent results in the following
> scenarios:
> 1- For example, if the client does not receive the response in the specified time,
> it retries. The first call to increment/append has already been applied, and the
> retry applies the operation a second time.
> 2- If the sync() to the WAL fails we get an IOException; on getting the
> exception a retry is done, which again results in doing the
> increment/append again.
> We may need some sort of rollback for the second problem.
> For the first one we need to see how to handle this.
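The second strawman above (an optional client cookie checked against a bounded buffer of recently applied cookies) can be sketched in Java as follows. This is only an illustration under stated assumptions: `RecentCookies` and `markApplied` are hypothetical names, not an HBase API, and the "ring buffer" is modelled as a FIFO queue plus a set for constant-time lookup.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical bounded record of recently applied request cookies. */
final class RecentCookies {
    private final int capacity;
    private final ArrayDeque<Long> order = new ArrayDeque<>(); // oldest cookie first
    private final Set<Long> seen = new HashSet<>();            // same cookies, for O(1) lookup

    RecentCookies(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /**
     * Records the cookie and returns true if it has not been seen recently.
     * Returns false for a duplicate, in which case the caller would reject
     * the retried Append/Increment instead of applying it a second time.
     */
    synchronized boolean markApplied(long cookie) {
        if (!seen.add(cookie)) {
            return false;
        }
        order.addLast(cookie);
        if (order.size() > capacity) {
            seen.remove(order.removeFirst()); // forget the oldest cookie
        }
        return true;
    }
}
```

Once a cookie has been evicted, a late retry carrying it is accepted again, which is the "no correctness guarantee outside of some time bound" caveat from the comment; the capacity has to cover every cookie submitted within the retry window.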
[jira] [Updated] (HBASE-5376) Add more logging to triage HBASE-5312: Closed parent region present in Hlog.lastSeqWritten
[ https://issues.apache.org/jira/browse/HBASE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-5376:
----------------------------------
    Affects Version/s: 0.90.7
        Fix Version/s: (was: 0.90.7)

> Add more logging to triage HBASE-5312: Closed parent region present in Hlog.lastSeqWritten
> ------------------------------------------------------------------------------------------
>
> Key: HBASE-5376
> URL: https://issues.apache.org/jira/browse/HBASE-5376
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 0.90.7
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Priority: Minor
> Attachments: hbase-5376.txt
>
> It is hard to find out what exactly caused HBASE-5312. Some logging will
> help shed some light.
[jira] [Commented] (HBASE-6272) In-memory region state is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413856#comment-13413856 ]

Jimmy Xiang commented on HBASE-6272:
------------------------------------

@Stack, thanks a lot for the review. I will respond on RB.

I will backport this patch to 0.92 and 0.94 after it is applied to trunk.

> In-memory region state is inconsistent
> --------------------------------------
>
> Key: HBASE-6272
> URL: https://issues.apache.org/jira/browse/HBASE-6272
> Project: HBase
> Issue Type: Bug
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
>
> AssignmentManager stores region state related information in several places:
> regionsInTransition, regions (region info to server name map), and servers
> (server name to region info set map). However, access to these places is
> not coordinated properly. This leads to inconsistent in-memory region state
> information. Sometimes a region could even be offline and not in
> transition.
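To illustrate what "coordinated access" to the three structures named in HBASE-6272 could look like, here is a minimal Java sketch that keeps them behind one lock and updates them together. `RegionStateBook` and its methods are hypothetical, regions and servers are reduced to plain strings, and this is not how the actual patch is structured.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical holder that updates all three views of region state under a single lock. */
final class RegionStateBook {
    private final Set<String> regionsInTransition = new HashSet<>();
    private final Map<String, String> regions = new HashMap<>();      // region -> server
    private final Map<String, Set<String>> servers = new HashMap<>(); // server -> regions

    /** Marks a region as in transition and removes any previous assignment in the same step. */
    synchronized void startTransition(String region) {
        regionsInTransition.add(region);
        detach(region);
    }

    /** Marks a region as online on a server, updating both maps and clearing the transition. */
    synchronized void regionOnline(String region, String server) {
        regionsInTransition.remove(region);
        detach(region);
        regions.put(region, server);
        servers.computeIfAbsent(server, s -> new HashSet<>()).add(region);
    }

    synchronized boolean isInTransition(String region) {
        return regionsInTransition.contains(region);
    }

    synchronized String serverOf(String region) {
        return regions.get(region);
    }

    /** Returns a copy so callers cannot modify internal state outside the lock. */
    synchronized Set<String> regionsOn(String server) {
        Set<String> hosted = servers.get(server);
        return hosted == null ? new HashSet<String>() : new HashSet<String>(hosted);
    }

    /** Removes the region from both assignment maps; callers must hold the lock. */
    private void detach(String region) {
        String old = regions.remove(region);
        if (old != null) {
            Set<String> hosted = servers.get(old);
            if (hosted != null) {
                hosted.remove(region);
            }
        }
    }
}
```

Because every mutation touches all three structures inside one synchronized method, a reader never observes a region that is listed under a server in one map but assigned elsewhere in the other.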
[jira] [Updated] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Elliott Clark updated HBASE-4050:
---------------------------------
    Attachment: HBASE-4050-8.patch

Addressed stack's comments.

> Update HBase metrics framework to metrics2 framework
> ----------------------------------------------------
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
> Issue Type: New Feature
> Components: metrics
> Affects Versions: 0.90.4
> Environment: Java 6
> Reporter: Eric Yang
> Assignee: Alex Baranau
> Priority: Critical
> Fix For: 0.96.0
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+,
> and it might get removed in future Hadoop release. Hence, HBase needs to
> revise the dependency of MetricsContext to use Metrics2 framework.
[jira] [Commented] (HBASE-6380) bulkload should update the store.storeSize
[ https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413801#comment-13413801 ]

Hudson commented on HBASE-6380:
-------------------------------

Integrated in HBase-0.94 #315 (See [https://builds.apache.org/job/HBase-0.94/315/])
    HBASE-6380 bulkload should update the store.storeSize (Revision 1361204)

    Result = FAILURE
stack :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java

> bulkload should update the store.storeSize
> ------------------------------------------
>
> Key: HBASE-6380
> URL: https://issues.apache.org/jira/browse/HBASE-6380
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Jie Huang
> Assignee: Jie Huang
> Priority: Critical
> Fix For: 0.96.0, 0.94.1
> Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch
>
> After bulkloading some HFiles into the table, we found that force-split didn't
> work because MidKey == NULL. Only after we restarted the HBase service
> did force-split work normally.
[jira] [Commented] (HBASE-6380) bulkload should update the store.storeSize
[ https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413785#comment-13413785 ]

Hudson commented on HBASE-6380:
-------------------------------

Integrated in HBase-TRUNK #3125 (See [https://builds.apache.org/job/HBase-TRUNK/3125/])
    HBASE-6380 bulkload should update the store.storeSize (Revision 1361203)

    Result = FAILURE
stack :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java

> bulkload should update the store.storeSize
> ------------------------------------------
>
> Key: HBASE-6380
> URL: https://issues.apache.org/jira/browse/HBASE-6380
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Jie Huang
> Assignee: Jie Huang
> Priority: Critical
> Fix For: 0.96.0, 0.94.1
> Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch
>
> After bulkloading some HFiles into the table, we found that force-split didn't
> work because MidKey == NULL. Only after we restarted the HBase service
> did force-split work normally.
[jira] [Commented] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.
[ https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413772#comment-13413772 ]

stack commented on HBASE-6299:
------------------------------

bq. Is it possible that we can do something in an earlier stage to prevent double assignment? like in forceRegionStateToOffline()?

Yes. Let's try. I was going to try and write up a reproduction of the bugs you describe above in a harness so we can play with them in isolation rather than have to blow up someone's world.

> RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.
> -----------------------------------------------------------------------------
>
> Key: HBASE-6299
> URL: https://issues.apache.org/jira/browse/HBASE-6299
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.6, 0.94.0
> Reporter: Maryann Xue
> Assignee: Maryann Xue
> Priority: Critical
> Attachments: HBASE-6299-v2.patch, HBASE-6299.patch
>
> 1. HMaster tries to assign a region to an RS.
> 2. HMaster creates a RegionState for this region and puts it into
> regionsInTransition.
> 3. In the first assign attempt, HMaster calls RS.openRegion(). The RS
> receives the open region request and starts to proceed, with success
> eventually. However, due to network problems, HMaster fails to receive the
> response for the openRegion() call, and the call times out.
> 4. HMaster attempts to assign for a second time, choosing another RS.
> 5. But since the HMaster's OpenedRegionHandler has been triggered by the
> region open of the previous RS, and the RegionState has already been removed
> from regionsInTransition, HMaster considers the unassigned ZK node
> "RS_ZK_REGION_OPENING" updated by the second attempt invalid and ignores it.
> 6. The unassigned ZK node stays, and a later unassign fails because
> RS_ZK_REGION_CLOSING cannot be created.
> {code}
> 2012-06-29 07:03:38,870 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for region CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.; plan=hri=CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568., src=swbss-hadoop-004,60020,1340890123243, dest=swbss-hadoop-006,60020,1340890678078
> 2012-06-29 07:03:38,870 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. to swbss-hadoop-006,60020,1340890678078
> 2012-06-29 07:03:38,870 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=swbss-hadoop-002:6, region=b713fd655fa02395496c5a6e39ddf568
> 2012-06-29 07:06:28,882 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, region=b713fd655fa02395496c5a6e39ddf568
> 2012-06-29 07:06:32,291 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, region=b713fd655fa02395496c5a6e39ddf568
> 2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=swbss-hadoop-006,60020,1340890678078, region=b713fd655fa02395496c5a6e39ddf568
> 2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED event for CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. from serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=518945, regions=575, usedHeap=15282, maxHeap=31301); deleting unassigned node
> 2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:6-0x2377fee2ae80007 Deleting existing unassigned node for b713fd655fa02395496c5a6e39ddf568 that is in expected state RS_ZK_REGION_OPENED
> 2012-06-29 07:06:32,301 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:6-0x2377fee2ae80007 Successfully deleted unassigned node for region b713fd655fa02395496c5a6e39ddf568 in expected state RS_ZK_REGION_OPENED
> 2012-06-29 07:06:32,301 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: The master has opened the region CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. that was online on serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=518945, regions=575, usedHeap=15282, maxHeap=31301)
> 2012-06-29 07:07:41,140 WARN org.apache.hadoop.hb
[jira] [Commented] (HBASE-6272) In-memory region state is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413752#comment-13413752 ]

stack commented on HBASE-6272:
------------------------------

@Ram What do you think? Do you think we should commit this to 0.96 and build fixes like 6060 on top of this, or Maryann's issue on OFFLINE? Or do you want to hold off? At the moment I'm thinking that fixes for 6060 will be big changes, not easily backported.

@Jimmy I added a review over on RB. It's looking good.

> In-memory region state is inconsistent
> --------------------------------------
>
> Key: HBASE-6272
> URL: https://issues.apache.org/jira/browse/HBASE-6272
> Project: HBase
> Issue Type: Bug
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
>
> AssignmentManager stores region state related information in several places:
> regionsInTransition, regions (region info to server name map), and servers
> (server name to region info set map). However, access to these places is
> not coordinated properly. This leads to inconsistent in-memory region state
> information. Sometimes a region could even be offline and not in
> transition.
[jira] [Updated] (HBASE-6380) bulkload should update the store.storeSize
[ https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6380:
-------------------------
       Resolution: Fixed
    Fix Version/s: 0.94.1
                   0.96.0
     Hadoop Flags: Reviewed
           Status: Resolved (was: Patch Available)

Committed to 0.94 and trunk. Thanks for the patch, Jie.

> bulkload should update the store.storeSize
> ------------------------------------------
>
> Key: HBASE-6380
> URL: https://issues.apache.org/jira/browse/HBASE-6380
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Jie Huang
> Assignee: Jie Huang
> Priority: Critical
> Fix For: 0.96.0, 0.94.1
> Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch
>
> After bulkloading some HFiles into the table, we found that force-split didn't
> work because MidKey == NULL. Only after we restarted the HBase service
> did force-split work normally.
[jira] [Commented] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413673#comment-13413673 ] Hudson commented on HBASE-5533: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #92 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/92/]) HBASE-6377. HBASE-5533 metrics miss all operations submitted via MultiAction Committed 6377-trunk-remove-get-put-delete-histograms.patch (Revision 1361026) Result = FAILURE apurtell : Files : * /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/ServerMetricsTmpl.jamon * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java > Add more metrics to HBase > - > > Key: HBASE-5533 > URL: https://issues.apache.org/jira/browse/HBASE-5533 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.92.2, 0.94.0 >Reporter: Shaneal Manek >Assignee: Shaneal Manek >Priority: Minor > Fix For: 0.92.2, 0.94.0, 0.96.0 > > Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, > HBASE-5533-TRUNK-v6.patch, HBASE-5533-TRUNK-v6.patch, > HBASE-5533-v7-0.92.patch, TimingOverhead.java, hbase-5533-0.92.patch, > hbase5533-0.92-v2.patch, hbase5533-0.92-v3.patch, hbase5533-0.92-v5.patch, > histogram_web_ui.png > > > To debug/monitor production clusters, there are some more metrics I wish I > had available. > In particular: > - Although the average FS latencies are useful, a 'histogram' of recent > latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) > would be more useful > - Similar histograms of latencies on common operations (GET, PUT, DELETE) > would be useful > - Counting the number of accesses to each region to detect hotspotting > - Exposing the current number of HLog files -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction
[ https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413674#comment-13413674 ] Hudson commented on HBASE-6377: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #92 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/92/]) HBASE-6377. HBASE-5533 metrics miss all operations submitted via MultiAction Committed 6377-trunk-remove-get-put-delete-histograms.patch (Revision 1361026) Result = FAILURE apurtell : Files : * /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/ServerMetricsTmpl.jamon * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java > HBASE-5533 metrics miss all operations submitted via MultiAction > > > Key: HBASE-6377 > URL: https://issues.apache.org/jira/browse/HBASE-6377 > Project: HBase > Issue Type: Bug > Components: metrics, regionserver >Affects Versions: 0.96.0, 0.94.1 >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 0.96.0, 0.94.1 > > Attachments: 6377-0.94-remove-get-put-delete-histograms.patch, > 6377-0.94.patch, 6377-trunk-remove-get-put-delete-histograms.patch, > 6377-trunk-simple.patch, 6377.patch > > > A client application (LoadTestTool) calls put() on HTables. Internally to the > HBase client those puts are batched into MultiActions. The total number of > put operations shown in the RegionServer's put metrics histogram never > increases from 0 even though millions of such operations are made. Needless > to say the latency for those operations are not measured either. The value of > HBASE-5533 metrics are suspect given the client will batch put and delete ops > like this. > I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction > processing in HRegionServer would distingush between puts and deletes and > dispatch them separately. It was easy to account for the time for them. 
Now > both puts and deletes are submitted in batch together as mutations.
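The accounting problem described in HBASE-6377 can be sketched as follows. This is only an illustration, not what any of the attached patches do: `OpType`, `OpCounters`, and `recordBatch` are hypothetical names, and splitting the batch time evenly across operations is an assumption made here for simplicity. It shows one way per-operation counters could still be updated when puts and deletes arrive together in one mutation batch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class BatchMetricsSketch {
    /** Hypothetical stand-in for the kind of mutation carried in a batch. */
    public enum OpType { PUT, DELETE }

    /** Hypothetical per-operation counters; not the RegionServerMetrics API. */
    public static final class OpCounters {
        public final AtomicLong puts = new AtomicLong();
        public final AtomicLong deletes = new AtomicLong();
        public final AtomicLong putNanos = new AtomicLong();
        public final AtomicLong deleteNanos = new AtomicLong();
    }

    /**
     * Walks a mixed batch and updates the per-operation counters.
     * Assumption: the batch time is apportioned evenly across its operations.
     */
    public static void recordBatch(List<OpType> batch, long batchNanos, OpCounters c) {
        if (batch.isEmpty()) {
            return;
        }
        long perOp = batchNanos / batch.size();
        for (OpType op : batch) {
            if (op == OpType.PUT) {
                c.puts.incrementAndGet();
                c.putNanos.addAndGet(perOp);
            } else {
                c.deletes.incrementAndGet();
                c.deleteNanos.addAndGet(perOp);
            }
        }
    }

    public static void main(String[] args) {
        OpCounters c = new OpCounters();
        recordBatch(Arrays.asList(OpType.PUT, OpType.PUT, OpType.DELETE), 300L, c);
        System.out.println("puts=" + c.puts.get() + " deletes=" + c.deletes.get());
    }
}
```

Even apportioning only approximates the latency of each operation, which is one reason per-operation histograms are hard to keep meaningful once mutations are batched.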
[jira] [Commented] (HBASE-6384) hbck should group together those sidelined regions need to be bulk loaded later
[ https://issues.apache.org/jira/browse/HBASE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413676#comment-13413676 ] Hudson commented on HBASE-6384: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #92 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/92/]) HBASE-6384 hbck should group together those sidelined regions need to be bulk loaded later (Revision 1361034) Result = FAILURE jxiang : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java > hbck should group together those sidelined regions need to be bulk loaded > later > --- > > Key: HBASE-6384 > URL: https://issues.apache.org/jira/browse/HBASE-6384 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang > Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 > > Attachments: 6384-trunk.patch > > > Currently, hbck sidelines some regions to break big overlap groups to avoid > possible compaction and region split. These sidelined regions should be > bulk loaded back later. Information about these regions is in the output. > It will be much easier to group them together under the same sideline rootdir, > for example, /hbase/.hbck/to_be_loaded/. If so, even we lose the output > file, we still know what regions to load back. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6370) Add compression codec test at HMaster when createTable/modifyColumn/modifyTable
[ https://issues.apache.org/jira/browse/HBASE-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413675#comment-13413675 ] Hudson commented on HBASE-6370: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #92 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/92/]) HBASE-6370 Add compression codec test at HMaster when createTable/modifyColumn/modifyTable (Revision 1361058) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > Add compression codec test at HMaster when > createTable/modifyColumn/modifyTable > --- > > Key: HBASE-6370 > URL: https://issues.apache.org/jira/browse/HBASE-6370 > Project: HBase > Issue Type: Improvement >Reporter: ShiXing >Assignee: ShiXing >Priority: Minor > Fix For: 0.96.0 > > Attachments: 6370v3.txt, HBASE-6370-trunk-V1.patch, > HBASE-6370-trunk-V2.patch, runAllTests.out > > > We deployed a cluster that none of the regionserver supports the compression > codec such like "lzo", but the cluster user/client does not know this, and he > specifies the family's compression codec by > HColumnDescripto.setCompressionType(Compresson.Algorithm.LZO); > Because the HBaseAdmin's createTable is async, so the client is waiting all > the regions of the table to be online forever. And client does not know why > the regions are not online until the HBase administrator find this problem. > In deed, all of the regions are assigning by master, but regionserver's > openHRegion always failed. > In my option, we can suppose all the cluster's enviroment are the same, means > if the master is deployed some lib, the regionserver should also be deployed. > Of course above is just a suppose, in real deployment, the hbase dba may just > deploy lib on regionserver or master. 
> So I think this failure can be detected earlier, before the master creates the > CreateTableHandler thread, and we can quickly tell the client that we do not support > this compression codec type. > I will upload the patch later.
[jira] [Commented] (HBASE-6380) bulkload should update the store.storeSize
[ https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413592#comment-13413592 ] Jie Huang commented on HBASE-6380: -- Re-run those 2 test cases locally (on a 64-bit Linux server), Passed. {noformat} --- T E S T S --- Running org.apache.hadoop.hbase.client.TestFromClientSide Tests run: 56, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 172.105 sec Results : Tests run: 56, Failures: 0, Errors: 0, Skipped: 3 --- T E S T S --- Running org.apache.hadoop.hbase.master.TestSplitLogManager Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.548 sec Results : Tests run: 12, Failures: 0, Errors: 0, Skipped: 0 {noformat} > bulkload should update the store.storeSize > -- > > Key: HBASE-6380 > URL: https://issues.apache.org/jira/browse/HBASE-6380 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 0.94.0, 0.96.0 >Reporter: Jie Huang >Assignee: Jie Huang >Priority: Critical > Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch > > > After bulkloading some HFiles into the Table, we found the force-split didn't > work because of the MidKey == NULL. Only if we re-booted the HBase service, > the force-split can work normally. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4364) Filters applied to columns not in the selected column list are ignored
[ https://issues.apache.org/jira/browse/HBASE-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413590#comment-13413590 ] Hadoop QA commented on HBASE-4364: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536340/hbase-4364_trunk.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 10 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestSplitLogManager org.apache.hadoop.hbase.catalog.TestMetaReaderEditor Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2382//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2382//console This message is automatically generated. 
> Filters applied to columns not in the selected column list are ignored > -- > > Key: HBASE-4364 > URL: https://issues.apache.org/jira/browse/HBASE-4364 > Project: HBase > Issue Type: Bug > Components: filters >Affects Versions: 0.90.4, 0.92.0, 0.94.0 >Reporter: Todd Lipcon >Priority: Critical > Attachments: hbase-4364_trunk.patch > > > For a scan, if you select some set of columns using addColumns(), and then > apply a SingleColumnValueFilter that restricts the results based on some > other columns which aren't selected, then those filter conditions are ignored.
[jira] [Commented] (HBASE-6391) Master restart when enabling table will lead to region assignned twice
[ https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413588#comment-13413588 ] rajeshbabu commented on HBASE-6391: --- I feel this is same as HBASE-6317 and we are trying to address the concerns in that. To answer your questions bq.may anyone tell me why not to add region in enabling state to regions in master Consider a case where i had disabled a table. Again try to ENABLE. But in the middle the master restarted. Now if we add the regions to the this.regions map then the EnableTableHandler will see if the regions are available in this.regions and wont call assign. So those regions will remain closed in the RS. bq.in my opinion, we could treat the case as failover rather than clean start. In HBASE-6317 we are making it as a failover only. {code} // store all the enabling state table names and corresponding online servers' regions. // This may be needed to avoid calling assign twice for the regions of the ENABLING table // that could have been assigned through processRIT. Map> enablingTables = new HashMap>(1); {code} In the patch available in HBASE-6317 we are trying to avoid double assignment by making a map of the enabling table regions so that if those regions are already assigned by processRIT we wont assign it now. Also even if roundrobinassignemt is set to true on master restart and if we find some partially enabled tables we go with single assignment. Please review the patch over in HBASE-6317 and let us know if you have some more open points. > Master restart when enabling table will lead to region assignned twice > -- > > Key: HBASE-6391 > URL: https://issues.apache.org/jira/browse/HBASE-6391 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.94.0 >Reporter: zhou wenjian > Fix For: 0.94.1 > > > The Scenario can be reproduce below. > Enabling an table, some region is online on regionserver,some are still being > processed. 
> Then restart the master. > When the master fails over: > // Region is being served and on an active server > // add only if region not in disabled and enabling table > if (false == checkIfRegionBelongsToDisabled(regionInfo) > && false == checkIfRegionsBelongsToEnabling(regionInfo)) { > regions.put(regionInfo, regionLocation); > addToServers(regionLocation, regionInfo); > } > The opened region will not be added to the regions map in the master, > and in the subsequent recoverTableInEnablingState, the region will be assigned > again. > That will leave the cluster inconsistent.
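The double assignment described in HBASE-6391 can be modelled in a few lines. This is a simplified sketch under stated assumptions: region and server names are plain strings, every region belongs to a single table in ENABLING state, and `regionsToAssign` is a hypothetical helper, not an AssignmentManager method. It only shows why a region that is already open gets assigned again when it was never added to the master's in-memory map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EnablingFailoverSketch {
    /**
     * Returns the regions that recovery of an ENABLING table would assign.
     *
     * @param openRegions        region -> server for regions already open on a live server
     * @param tableRegions       all regions of the ENABLING table
     * @param skipEnablingTables true models the quoted check that leaves regions of
     *                           ENABLING tables out of the master's in-memory map
     */
    public static List<String> regionsToAssign(Map<String, String> openRegions,
                                               List<String> tableRegions,
                                               boolean skipEnablingTables) {
        // The master's in-memory view, rebuilt on failover.
        Map<String, String> regions = new HashMap<>();
        if (!skipEnablingTables) {
            regions.putAll(openRegions);
        }
        // Recovery assigns every table region the master does not know to be online.
        List<String> toAssign = new ArrayList<>(tableRegions);
        toAssign.removeAll(regions.keySet());
        return toAssign;
    }

    public static void main(String[] args) {
        Map<String, String> open = new HashMap<>();
        open.put("r1", "rs1"); // r1 was already opened before the master restarted
        List<String> all = Arrays.asList("r1", "r2", "r3");
        System.out.println("with check:    " + regionsToAssign(open, all, true));
        System.out.println("without check: " + regionsToAssign(open, all, false));
    }
}
```

With the check in place, r1 appears in the list to assign even though it is already open on rs1. As noted in the comment above, simply dropping the check is not safe either, because regions of a previously disabled table could then be treated as online while they are still closed on the regionserver.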
[jira] [Commented] (HBASE-6370) Add compression codec test at HMaster when createTable/modifyColumn/modifyTable
[ https://issues.apache.org/jira/browse/HBASE-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413587#comment-13413587 ] Hudson commented on HBASE-6370: --- Integrated in HBase-TRUNK #3124 (See [https://builds.apache.org/job/HBase-TRUNK/3124/]) HBASE-6370 Add compression codec test at HMaster when createTable/modifyColumn/modifyTable (Revision 1361058) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > Add compression codec test at HMaster when > createTable/modifyColumn/modifyTable > --- > > Key: HBASE-6370 > URL: https://issues.apache.org/jira/browse/HBASE-6370 > Project: HBase > Issue Type: Improvement >Reporter: ShiXing >Assignee: ShiXing >Priority: Minor > Fix For: 0.96.0 > > Attachments: 6370v3.txt, HBASE-6370-trunk-V1.patch, > HBASE-6370-trunk-V2.patch, runAllTests.out > > > We deployed a cluster that none of the regionserver supports the compression > codec such like "lzo", but the cluster user/client does not know this, and he > specifies the family's compression codec by > HColumnDescripto.setCompressionType(Compresson.Algorithm.LZO); > Because the HBaseAdmin's createTable is async, so the client is waiting all > the regions of the table to be online forever. And client does not know why > the regions are not online until the HBase administrator find this problem. > In deed, all of the regions are assigning by master, but regionserver's > openHRegion always failed. > In my option, we can suppose all the cluster's enviroment are the same, means > if the master is deployed some lib, the regionserver should also be deployed. > Of course above is just a suppose, in real deployment, the hbase dba may just > deploy lib on regionserver or master. > So I think this failure can be found earlier before master create the > CreateTableHandler thread, and we can tell client quickly we didn't support > this compression codec type. 
> I will upload the patch later.
[jira] [Commented] (HBASE-6380) bulkload should update the store.storeSize
[ https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413583#comment-13413583 ] Hadoop QA commented on HBASE-6380: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536342/6380-trunk.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 8 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestFromClientSide org.apache.hadoop.hbase.master.TestSplitLogManager Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2381//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2381//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2381//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2381//console This message is automatically generated. 
> bulkload should update the store.storeSize > -- > > Key: HBASE-6380 > URL: https://issues.apache.org/jira/browse/HBASE-6380 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 0.94.0, 0.96.0 >Reporter: Jie Huang >Assignee: Jie Huang >Priority: Critical > Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch > > > After bulk loading some HFiles into the table, we found that a forced split didn't > work because MidKey == NULL. Only after we restarted the HBase service > did the forced split work normally.
[jira] [Commented] (HBASE-6391) Master restart when enabling table will lead to region assignned twice
[ https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413576#comment-13413576 ] zhou wenjian commented on HBASE-6391: - In my opinion, we could treat the case as a failover rather than a clean start. > Master restart when enabling table will lead to region assignned twice > -- > > Key: HBASE-6391 > URL: https://issues.apache.org/jira/browse/HBASE-6391 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.94.0 >Reporter: zhou wenjian > Fix For: 0.94.1 > > > The scenario can be reproduced as follows. > While enabling a table, some regions are online on a regionserver and some are still being > processed. > Then restart the master. > When the master fails over: > // Region is being served and on an active server > // add only if region not in disabled and enabling table > if (false == checkIfRegionBelongsToDisabled(regionInfo) > && false == checkIfRegionsBelongsToEnabling(regionInfo)) { > regions.put(regionInfo, regionLocation); > addToServers(regionLocation, regionInfo); > } > The opened region will not be added to the regions map in the master, > and in the subsequent recoverTableInEnablingState, the region will be assigned > again. > That will leave the cluster inconsistent.
[jira] [Commented] (HBASE-6391) Master restart when enabling table will lead to region assignned twice
[ https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413571#comment-13413571 ] zhou wenjian commented on HBASE-6391: - in HBASE-6317 rajeshbabu comments As per the current code two scenarios may cause assignment incosistent. 1)in EnableTableHandler we dont assign regions if they are present in regions map. final List onlineRegions =this.assignmentManager.getRegionsOfTable(tableName); regionsInMeta.removeAll(onlineRegions); But in case of enabling table regions during master start up we are not adding them to regions map in rebuldUseRegions even the regions in/transition to onlineServers. if (false == checkIfRegionBelongsToDisabled(regionInfo) && false == checkIfRegionsBelongsToEnabling(regionInfo)) { synchronized (this.regions) { regions.put(regionInfo, regionLocation); addToServers(regionLocation, regionInfo); } } So we will call assign to all the regions even they are in transition/already assigned to online servers which may cause double assignment. 2) If all the tables are in ENABLING we may consider as clean cluster startup(because regions map is empty) and again call assignment for all the regions.(Which may again cause double assignment) if we romove the check for RegionsBelongsToEnabling, the first scenario will not happen again. and for the other scenario we just need to worry about only one case. that is ,all tables are enabling ,and none of the regions' location are registered in the meta. > Master restart when enabling table will lead to region assignned twice > -- > > Key: HBASE-6391 > URL: https://issues.apache.org/jira/browse/HBASE-6391 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.94.0 >Reporter: zhou wenjian > Fix For: 0.94.1 > > > The Scenario can be reproduce below. > Enabling an table, some region is online on regionserver,some are still being > processed. > And restart the master. 
> When the master fails over: > // Region is being served and on an active server > // add only if region not in disabled and enabling table > if (false == checkIfRegionBelongsToDisabled(regionInfo) > && false == checkIfRegionsBelongsToEnabling(regionInfo)) { > regions.put(regionInfo, regionLocation); > addToServers(regionLocation, regionInfo); > } > The opened region will not be added to the regions map in the master, > and in the subsequent recoverTableInEnablingState, the region will be assigned > again. > That will leave the cluster inconsistent.
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413569#comment-13413569 ] Elliott Clark commented on HBASE-4050: -- bq.I suppose you don't need to set test size annotation on below because annotations are not a dependency when this is built: Correct. The hbase-hadoop-compat module has no hadoop dependency. In addition hbase-hadoop1-compat and hbase-hadoop2-compat currently only have unit tests, so they have the second test pass completely turned off. bq.Does BaseMetricsSource not implement MetricsSource? It does. I guess it's just a little too explicit. I'll fix it in the patch first thing tomorrow morning. bq.These need to be this accessible: Kind of but not 100%; I'm open to either way. In hadoop 1 metrics are pretty hard to test. Opening the maps up will make testing any classes that extend MetricsBaseSourceImpl easier. Those classes that add functionality will need those maps to be public for testing. However with that said this patch doesn't have those classes in it, so if you prefer I could make them protected and change that when needed. bq.The stuff below where we have a static boolean and in constructor we test something already created could be a PITA in minihbase setups? Does it have to be static? Aren't we slinging singletons here anyways? (The singletons are ok in minihbasecontext too?): We are currently slinging a singleton. However when we add in more than just replication metrics we'll have more than one BaseMetricsSourceImpl. The DefaultMetricsSystem.initialize call can be done multiple times as long as it's inited with the same string, however it complains quite loudly in logs. bq.'hasInited' is name of a method that tests 'inited' variable... suggest changing its name. Sure. Something like defaultMetricsInited bq.What about that jmx mess registering metrics in tests? 
The exception saying metrics already registered because we have more than one daemon in the one jvm. We still have that issue here? We'll still have that. A little bit less spam but not completely gone. Basically when all metrics are moved to metrics2 we'll see 4 or 5 log messages (one per dupe of ReplicationMeticsSource et al.) rather than the massive ammount we see now. Maybe on test we should silience the junit messages from those classes ? Probably a good issue to file for the metrics clean up. bq.Do we have to have metrics2 package? Can this new stuff be in the metrics package? Nope. Earlier you were asking to remove it. So everything is in the metrics namespace. That should make things a little nicer if we go the DI route, that's being discussed on the mailing list, and someone wants to go back to the old hadoop metrics. bq.I thought I saw a patch where you'd renamed the properties file to what LarsG suggested? Nope just replied that we could. That file needs some examples and other love (ganglia examples and examples for regionserver/rest). Seems like a good issue for me to file after this. I'll clean up the two javadocs tomorrow morning. > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, > HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, > HBASE-4050-7.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
[ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413559#comment-13413559 ]

nkeywal commented on HBASE-6389:

We could remove the timeout? That would make things a little simpler. Or we could keep it as an error case, and throw an exception if the timeout is reached. The intent would be to stop the master.

> Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
> ---
>
> Key: HBASE-6389
> URL: https://issues.apache.org/jira/browse/HBASE-6389
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Aditya Kishore
> Assignee: Aditya Kishore
> Priority: Critical
> Fix For: 0.96.0, 0.94.1
> Attachments: HBASE-6389_trunk.patch
>
> Continuing from HBASE-6375.
> It seems I was mistaken in my assumption that changing the value of "hbase.master.wait.on.regionservers.mintostart" to a sufficient number (from the default of 1) can help prevent assignment of all regions to one (or a small number of) region server(s).
> While this was the case in 0.90.x and 0.92.x, the behavior has changed from 0.94.0 onwards to address HBASE-4993.
> From 0.94.0 onwards, Master will proceed immediately after the timeout has lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not been reached.
> Reading the current conditions of waitForRegionServers() clarifies it:
> {code:title=ServerManager.java (trunk rev:1360470)}
> /**
>  * Wait for the region servers to report in.
>  * We will wait until one of this condition is met:
>  * - the master is stopped
>  * - the 'hbase.master.wait.on.regionservers.timeout' is reached
>  * - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>  *   region servers is reached
>  * - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>  *   there have been no new region server in for
>  *   'hbase.master.wait.on.regionservers.interval' time
>  *
>  * @throws InterruptedException
>  */
> public void waitForRegionServers(MonitoredTask status)
> throws InterruptedException {
> ..
>   while (
>     !this.master.isStopped() &&
>     slept < timeout &&
>     count < maxToStart &&
>     (lastCountChange+interval > now || count < minToStart)
>   ){
> {code}
> So with the current conditions, the wait ends as soon as the timeout is reached, even if fewer RSes than required have checked in with the Master, and the master will proceed with the region assignment among these RSes alone.
> As mentioned in -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-, and I concur, this could have a disastrous effect in a large cluster, especially now that MSLAB is turned on.
> To enforce the required quorum as specified by "hbase.master.wait.on.regionservers.mintostart" irrespective of the timeout, these conditions need to be modified as follows:
> {code:title=ServerManager.java}
> ..
> /**
>  * Wait for the region servers to report in.
>  * We will wait until one of this condition is met:
>  * - the master is stopped
>  * - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>  *   region servers is reached
>  * - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>  *   there have been no new region server in for
>  *   'hbase.master.wait.on.regionservers.interval' time AND
>  *   the 'hbase.master.wait.on.regionservers.timeout' is reached
>  *
>  * @throws InterruptedException
>  */
> public void waitForRegionServers(MonitoredTask status)
> ..
> ..
>   int minToStart = this.master.getConfiguration().
>     getInt("hbase.master.wait.on.regionservers.mintostart", 1);
>   int maxToStart = this.master.getConfiguration().
>     getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
>   if (maxToStart < minToStart) {
>     maxToStart = minToStart;
>   }
> ..
> ..
>   while (
>     !this.master.isStopped() &&
>     count < maxToStart &&
>     (lastCountChange+interval > now || timeout > slept || count < minToStart)
>   ){
> ..
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
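To make the difference between the two loop conditions in HBASE-6389 concrete, the following is a minimal standalone sketch. It only re-states the two `while` predicates quoted in the issue as pure functions; the class name `WaitConditionSketch` and the method names are hypothetical and are not part of ServerManager.

```java
// Hypothetical illustration of the two wait predicates quoted in HBASE-6389.
// Not HBase code: the real logic lives inside ServerManager.waitForRegionServers().
class WaitConditionSketch {

    // Condition as quoted from trunk: the wait stops as soon as the timeout elapses.
    static boolean keepWaitingCurrent(boolean stopped, long slept, long timeout,
            int count, int minToStart, int maxToStart,
            long lastCountChange, long interval, long now) {
        return !stopped
            && slept < timeout
            && count < maxToStart
            && (lastCountChange + interval > now || count < minToStart);
    }

    // Condition as proposed in the issue: the timeout alone can no longer end the wait;
    // minToStart must also be reached and the count must have been stable for 'interval'.
    static boolean keepWaitingProposed(boolean stopped, long slept, long timeout,
            int count, int minToStart, int maxToStart,
            long lastCountChange, long interval, long now) {
        return !stopped
            && count < maxToStart
            && (lastCountChange + interval > now || timeout > slept || count < minToStart);
    }

    public static void main(String[] args) {
        // Timeout (4500 ms) has elapsed, but only 1 of the 3 required servers checked in.
        boolean current = keepWaitingCurrent(false, 5000L, 4500L, 1, 3,
                Integer.MAX_VALUE, 0L, 1500L, 5000L);
        boolean proposed = keepWaitingProposed(false, 5000L, 4500L, 1, 3,
                Integer.MAX_VALUE, 0L, 1500L, 5000L);
        System.out.println("current keeps waiting:  " + current);   // false
        System.out.println("proposed keeps waiting: " + proposed);  // true
    }
}
```

With the example values, the current condition stops waiting once the timeout has lapsed, while the proposed one keeps waiting until minToStart is satisfied, which is the behaviour change the issue asks for.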
[jira] [Commented] (HBASE-6272) In-memory region state is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413558#comment-13413558 ]

stack commented on HBASE-6272:

High level, Jimmy, how should we proceed with this patch? If we apply it, I think it means that any fixes on stuff like HBASE-6060 will be for trunk only; they won't be backportable, at least not w/o a bunch of work. Maybe that's fine. Raising the question.

> In-memory region state is inconsistent
> ---
>
> Key: HBASE-6272
> URL: https://issues.apache.org/jira/browse/HBASE-6272
> Project: HBase
> Issue Type: Bug
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
>
> AssignmentManager stores region-state-related information in several places: regionsInTransition, regions (region info to server name map), and servers (server name to region info set map). However, access to these places is not coordinated properly, which leads to inconsistent in-memory region state information. Sometimes a region can even be offline and not in transition.
[jira] [Commented] (HBASE-6391) Master restart when enabling table will lead to region assignned twice
[ https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413545#comment-13413545 ]

stack commented on HBASE-6391:

Is this the same as HBASE-6317 "Master clean start up and Partially enabled tables make region assignment inconsistent."?

> Master restart when enabling table will lead to region assignned twice
> ---
>
> Key: HBASE-6391
> URL: https://issues.apache.org/jira/browse/HBASE-6391
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.0
> Reporter: zhou wenjian
> Fix For: 0.94.1
>
> The scenario can be reproduced as follows.
> While a table is being enabled, some regions are already online on a regionserver and some are still being processed.
> Then restart the master.
> When the master fails over:
> // Region is being served and on an active server
> // add only if region not in disabled and enabling table
> if (false == checkIfRegionBelongsToDisabled(regionInfo)
>     && false == checkIfRegionsBelongsToEnabling(regionInfo)) {
>   regions.put(regionInfo, regionLocation);
>   addToServers(regionLocation, regionInfo);
> }
> the already opened region is not added to the regions map in the master,
> and in the following recoverTableInEnablingState the region is assigned again.
> That leaves the cluster inconsistent.
[jira] [Commented] (HBASE-6387) Cache DNS lookups in HServerAddress
[ https://issues.apache.org/jira/browse/HBASE-6387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413552#comment-13413552 ]

stack commented on HBASE-6387:

HServerAddress is deprecated in trunk and has been replaced. On deserialization it was doing a DNS lookup. So this is 0.89fb only, Mikhail?

> Cache DNS lookups in HServerAddress
> ---
>
> Key: HBASE-6387
> URL: https://issues.apache.org/jira/browse/HBASE-6387
> Project: HBase
> Issue Type: Improvement
> Reporter: Mikhail Bautin
>
> We have noticed that we rely on DNS lookups in some critical paths by using HServerAddress, and Java only seems to be caching DNS data for 30 seconds by default. Also, if DNS is down, Java's negative cache of DNS will ensure that many successive attempts fail. However, we cannot just increase networkaddress.cache.ttl to a large value, because e.g. namenode failover may require resolving the same DNS name differently. Therefore I propose that we add a DNS lookup cache in HServerAddress.
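As a rough illustration of what the proposed DNS lookup cache could look like, here is a minimal sketch with a per-entry TTL and explicit invalidation. Everything in it is an assumption made for illustration: the class `DnsLookupCacheSketch`, the `Resolver` interface, the injected clock, and the choice to fall back to a stale entry when resolution fails are not taken from HServerAddress or from any patch on the issue.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical sketch of an application-level DNS cache; not HBase code.
class DnsLookupCacheSketch {

    // Pluggable resolver so the cache can be exercised without real DNS.
    interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    private static final class Entry {
        final InetAddress address;
        final long expiresAtMillis;

        Entry(InetAddress address, long expiresAtMillis) {
            this.address = address;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<String, Entry> cache = new ConcurrentHashMap<>();
    private final Resolver resolver;
    private final LongSupplier clock;
    private final long ttlMillis;

    DnsLookupCacheSketch(Resolver resolver, LongSupplier clock, long ttlMillis) {
        this.resolver = resolver;
        this.clock = clock;
        this.ttlMillis = ttlMillis;
    }

    InetAddress lookup(String host) throws UnknownHostException {
        long now = clock.getAsLong();
        Entry cached = cache.get(host);
        if (cached != null && now < cached.expiresAtMillis) {
            return cached.address; // fresh entry, no DNS round trip
        }
        try {
            InetAddress fresh = resolver.resolve(host);
            cache.put(host, new Entry(fresh, now + ttlMillis));
            return fresh;
        } catch (UnknownHostException e) {
            // Assumption: when DNS is down, serving a stale address beats failing.
            if (cached != null) {
                return cached.address;
            }
            throw e;
        }
    }

    // Lets callers drop an entry early, e.g. after a failover changes the mapping.
    void invalidate(String host) {
        cache.remove(host);
    }

    public static void main(String[] args) throws UnknownHostException {
        DnsLookupCacheSketch dns =
            new DnsLookupCacheSketch(InetAddress::getByName, System::currentTimeMillis, 60_000L);
        System.out.println(dns.lookup("localhost"));
    }
}
```

The explicit `invalidate` is there because the issue notes that a name may need to resolve differently after, for example, a namenode failover; a long TTL without a way to drop entries would reintroduce that problem.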
[jira] [Commented] (HBASE-6391) Master restart when enabling table will lead to region assignned twice
[ https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413550#comment-13413550 ]

zhou wenjian commented on HBASE-6391:

That is different, I think.

> Master restart when enabling table will lead to region assignned twice
> ---
>
> Key: HBASE-6391
> URL: https://issues.apache.org/jira/browse/HBASE-6391
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.0
> Reporter: zhou wenjian
> Fix For: 0.94.1
>
> The scenario can be reproduced as follows.
> While a table is being enabled, some regions are already online on a regionserver and some are still being processed.
> Then restart the master.
> When the master fails over:
> // Region is being served and on an active server
> // add only if region not in disabled and enabling table
> if (false == checkIfRegionBelongsToDisabled(regionInfo)
>     && false == checkIfRegionsBelongsToEnabling(regionInfo)) {
>   regions.put(regionInfo, regionLocation);
>   addToServers(regionLocation, regionInfo);
> }
> the already opened region is not added to the regions map in the master,
> and in the following recoverTableInEnablingState the region is assigned again.
> That leaves the cluster inconsistent.
[jira] [Updated] (HBASE-6338) Cache Method in RPC handler
[ https://issues.apache.org/jira/browse/HBASE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

binlijin updated HBASE-6338:

Attachment: HBASE-6338-trunk-2.patch

> Cache Method in RPC handler
> ---
>
> Key: HBASE-6338
> URL: https://issues.apache.org/jira/browse/HBASE-6338
> Project: HBase
> Issue Type: Improvement
> Reporter: binlijin
> Attachments: HBASE-6338-90-2.patch, HBASE-6338-90.patch, HBASE-6338-92-2.patch, HBASE-6338-92.patch, HBASE-6338-94-2.patch, HBASE-6338-94.patch, HBASE-6338-trunk-2.patch, HBASE-6338-trunk.patch
>
> For every call in the RPC handler a Method is looked up anew; caching the Method improves this a little.
> I tested with 0.90: Class.getMethod(String name, Class... parameterTypes) costs 4780 ns on average; with the Method cached it costs 2620 ns.
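The idea in HBASE-6338 can be sketched as a small lookup cache in front of `Class.getMethod`. This is only an illustration under assumptions: the class `MethodCacheSketch` and its key layout are hypothetical and do not reflect how the attached patches implement the cache.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of caching reflective Method lookups; not the HBASE-6338 patch.
class MethodCacheSketch {

    // Cache key: declaring class, method name, and parameter types.
    private static final class Key {
        private final Class<?> owner;
        private final String name;
        private final Class<?>[] paramTypes;

        Key(Class<?> owner, String name, Class<?>[] paramTypes) {
            this.owner = owner;
            this.name = name;
            this.paramTypes = paramTypes.clone(); // defensive copy of the varargs array
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return owner == other.owner
                && name.equals(other.name)
                && Arrays.equals(paramTypes, other.paramTypes);
        }

        @Override
        public int hashCode() {
            return 31 * (31 * owner.hashCode() + name.hashCode()) + Arrays.hashCode(paramTypes);
        }
    }

    private final ConcurrentHashMap<Key, Method> cache = new ConcurrentHashMap<>();

    Method get(Class<?> owner, String name, Class<?>... paramTypes)
            throws NoSuchMethodException {
        Key key = new Key(owner, name, paramTypes);
        Method method = cache.get(key);
        if (method == null) {
            method = owner.getMethod(name, paramTypes); // slow path, only on a miss
            Method raced = cache.putIfAbsent(key, method);
            if (raced != null) {
                method = raced; // another thread populated the entry first
            }
        }
        return method;
    }

    public static void main(String[] args) throws Exception {
        MethodCacheSketch methods = new MethodCacheSketch();
        Method length = methods.get(String.class, "length");
        System.out.println(length.invoke("abc")); // prints 3
    }
}
```

Repeated calls with the same class, name, and parameter types return the same `Method` instance, so the reflective lookup cost is paid once per distinct signature.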
[jira] [Updated] (HBASE-6338) Cache Method in RPC handler
[ https://issues.apache.org/jira/browse/HBASE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

binlijin updated HBASE-6338:

Attachment: HBASE-6338-94-2.patch
            HBASE-6338-92-2.patch
            HBASE-6338-90-2.patch

> Cache Method in RPC handler
> ---
>
> Key: HBASE-6338
> URL: https://issues.apache.org/jira/browse/HBASE-6338
> Project: HBase
> Issue Type: Improvement
> Reporter: binlijin
> Attachments: HBASE-6338-90-2.patch, HBASE-6338-90.patch, HBASE-6338-92-2.patch, HBASE-6338-92.patch, HBASE-6338-94-2.patch, HBASE-6338-94.patch, HBASE-6338-trunk-2.patch, HBASE-6338-trunk.patch
>
> For every call in the RPC handler a Method is looked up anew; caching the Method improves this a little.
> I tested with 0.90: Class.getMethod(String name, Class... parameterTypes) costs 4780 ns on average; with the Method cached it costs 2620 ns.
[jira] [Updated] (HBASE-6370) Add compression codec test at HMaster when createTable/modifyColumn/modifyTable
[ https://issues.apache.org/jira/browse/HBASE-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6370:

Resolution: Fixed
Fix Version/s: 0.96.0
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Committed to trunk. Thanks for the patch ShiXing.

> Add compression codec test at HMaster when createTable/modifyColumn/modifyTable
> ---
>
> Key: HBASE-6370
> URL: https://issues.apache.org/jira/browse/HBASE-6370
> Project: HBase
> Issue Type: Improvement
> Reporter: ShiXing
> Assignee: ShiXing
> Priority: Minor
> Fix For: 0.96.0
> Attachments: 6370v3.txt, HBASE-6370-trunk-V1.patch, HBASE-6370-trunk-V2.patch, runAllTests.out
>
> We deployed a cluster in which none of the regionservers supports a compression codec such as "lzo", but the cluster user/client does not know this and specifies the family's compression codec with
> HColumnDescriptor.setCompressionType(Compression.Algorithm.LZO);
> Because HBaseAdmin's createTable is async, the client waits forever for all the regions of the table to come online, and does not know why the regions are not online until the HBase administrator finds the problem.
> Indeed, all of the regions are being assigned by the master, but the regionserver's openHRegion always fails.
> In my opinion, we can suppose that the environment is the same across the cluster, meaning that if a lib is deployed on the master, it should also be deployed on the regionservers.
> Of course this is just an assumption; in a real deployment, the HBase DBA may deploy the lib only on the regionservers or only on the master.
> So I think this failure can be detected earlier, before the master creates the CreateTableHandler thread, and we can quickly tell the client that this compression codec type is not supported.
> I will upload the patch later.
[jira] [Updated] (HBASE-6380) bulkload should update the store.storeSize
[ https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6380:

Attachment: 6380-trunk.txt

Retry

> bulkload should update the store.storeSize
> ---
>
> Key: HBASE-6380
> URL: https://issues.apache.org/jira/browse/HBASE-6380
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Jie Huang
> Assignee: Jie Huang
> Priority: Critical
> Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch
>
> After bulkloading some HFiles into the table, we found that force-split didn't work because MidKey == NULL. Force-split only works normally again after the HBase service is restarted.
[jira] [Updated] (HBASE-6380) bulkload should update the store.storeSize
[ https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6380:

Status: Open (was: Patch Available)

> bulkload should update the store.storeSize
> ---
>
> Key: HBASE-6380
> URL: https://issues.apache.org/jira/browse/HBASE-6380
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Jie Huang
> Assignee: Jie Huang
> Priority: Critical
> Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch
>
> After bulkloading some HFiles into the table, we found that force-split didn't work because MidKey == NULL. Force-split only works normally again after the HBase service is restarted.
[jira] [Updated] (HBASE-6380) bulkload should update the store.storeSize
[ https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6380:

Status: Patch Available (was: Open)

> bulkload should update the store.storeSize
> ---
>
> Key: HBASE-6380
> URL: https://issues.apache.org/jira/browse/HBASE-6380
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Jie Huang
> Assignee: Jie Huang
> Priority: Critical
> Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch
>
> After bulkloading some HFiles into the table, we found that force-split didn't work because MidKey == NULL. Force-split only works normally again after the HBase service is restarted.
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413535#comment-13413535 ]

stack commented on HBASE-4050:

I suppose you don't need to set test size annotation on below because annotations are not a dependency when this is built:
{code}
+public class ReplicationMetricsSourceFactoryTest {
{code}
Does BaseMetricsSource not implement MetricsSource?
{code}
+public class BaseMetricsSourceImpl implements BaseMetricsSource, MetricsSource {
{code}
These need to be this accessible:
{code}
+  public ConcurrentMap
+    gauges = new ConcurrentHashMap();
+  public ConcurrentMap counters =
+    new ConcurrentHashMap();
+
+  protected String metricsContext;
+  protected String metricsName;
+  protected String metricsDescription;
{code}
(I see above twice)

The stuff below, where we have a static boolean and in the constructor we test whether something is already created, could be a PITA in minihbase setups? Does it have to be static? Aren't we slinging singletons here anyways? (The singletons are ok in minihbasecontext too?):
{code}
+if (!hasInited) {
+  //Not too worried about mutli-threaded here as all it does is spam the logs.
+  hasInited = true;
+  DefaultMetricsSystem.initialize(HBASE_METRICS_SYSTEM_NAME);
+}
{code}
'hasInited' is the name of a method that tests an 'inited' variable... suggest changing its name.

What about that jmx mess registering metrics in tests? The exception saying metrics already registered because we have more than one daemon in the one jvm. We still have that issue here?

You wanted to complete this: "+/** BaseClass for */"

Another class has no class comments though it has the comment delimiters.

Do we have to have a metrics2 package? Can this new stuff be in the metrics package?

I thought I saw a patch where you'd renamed the properties file to what LarsG suggested?

You seem to have made it so we do not need to have a metrics2 in hbase... that's great... but in the properties file I see:
{code}
+# See package.html for org.apache.hadoop.metrics2 for details
+
+*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
{code}
Is that just old stuff?

Good stuff Elliott. I'd be up for committing this and then doing other stuff in other issues.

> Update HBase metrics framework to metrics2 framework
> ---
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
> Issue Type: New Feature
> Components: metrics
> Affects Versions: 0.90.4
> Environment: Java 6
> Reporter: Eric Yang
> Assignee: Alex Baranau
> Priority: Critical
> Fix For: 0.96.0
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, HBASE-4050-7.patch, HBASE-4050.patch
>
> The Metrics framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, and it might be removed in a future Hadoop release. Hence, HBase needs to revise its dependency on MetricsContext to use the Metrics2 framework.