[jira] [Updated] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.
[ https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maryann Xue updated HBASE-6299:
    Attachment: (was: HBASE-6299-v3.patch)

RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.
    Key: HBASE-6299
    URL: https://issues.apache.org/jira/browse/HBASE-6299
    Project: HBase
    Issue Type: Bug
    Components: master
    Affects Versions: 0.90.6, 0.94.0
    Reporter: Maryann Xue
    Assignee: Maryann Xue
    Priority: Critical
    Fix For: 0.96.0, 0.92.3, 0.94.3
    Attachments: HBASE-6299.patch, HBASE-6299-v2.patch

1. HMaster tries to assign a region to an RS.
2. HMaster creates a RegionState for this region and puts it into regionsInTransition.
3. In the first assign attempt, HMaster calls RS.openRegion(). The RS receives the open region request and proceeds, eventually succeeding. However, due to network problems, HMaster fails to receive the response to the openRegion() call, and the call times out.
4. HMaster attempts to assign a second time, choosing another RS.
5. But since HMaster's OpenedRegionHandler has already been triggered by the region open on the previous RS, and the RegionState has already been removed from regionsInTransition, HMaster treats the unassigned ZK node RS_ZK_REGION_OPENING updated by the second attempt as invalid and ignores it.
6. The unassigned ZK node stays, and a later unassign fails because RS_ZK_REGION_CLOSING cannot be created.
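The sequence in steps 1 to 6 can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `AssignRaceSketch` and its methods are hypothetical names, a plain map stands in for regionsInTransition, and no real HBase or ZooKeeper API is used.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the master-side bookkeeping described in steps 1-6.
// All names are hypothetical; this is not HBase code.
final class AssignRaceSketch {
    // Stand-in for regionsInTransition: region name -> target server.
    private final Map<String, String> regionsInTransition = new HashMap<>();

    // Step 2: the master records the region as being in transition.
    void startAssign(String region, String server) {
        regionsInTransition.put(region, server);
    }

    // The first RS finishes opening; handling OPENED drops the entry.
    void onOpened(String region) {
        regionsInTransition.remove(region);
    }

    // Step 5: an OPENING update is accepted only while an entry exists.
    // Returns false when the update is treated as invalid and ignored.
    // The server argument is unused in this simplified model.
    boolean onOpening(String region, String server) {
        return regionsInTransition.containsKey(region);
    }

    public static void main(String[] args) {
        AssignRaceSketch master = new AssignRaceSketch();
        master.startAssign("region-1", "rs-A"); // steps 1-3; the RPC later times out
        master.onOpened("region-1");            // rs-A opened the region anyway
        // Steps 4-5: the retry on rs-B arrives after the entry is gone.
        System.out.println("second attempt accepted: "
                + master.onOpening("region-1", "rs-B"));
    }
}
```

In this model the second attempt is ignored because the bookkeeping entry was removed by the first, successful open, which mirrors the inconsistency described above without claiming anything about how the attached patches resolve it.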
{code}
2012-06-29 07:03:38,870 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for region CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.; plan=hri=CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568., src=swbss-hadoop-004,60020,1340890123243, dest=swbss-hadoop-006,60020,1340890678078
2012-06-29 07:03:38,870 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. to swbss-hadoop-006,60020,1340890678078
2012-06-29 07:03:38,870 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=swbss-hadoop-002:6, region=b713fd655fa02395496c5a6e39ddf568
2012-06-29 07:06:28,882 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, region=b713fd655fa02395496c5a6e39ddf568
2012-06-29 07:06:32,291 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, region=b713fd655fa02395496c5a6e39ddf568
2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=swbss-hadoop-006,60020,1340890678078, region=b713fd655fa02395496c5a6e39ddf568
2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED event for CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. from serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=518945, regions=575, usedHeap=15282, maxHeap=31301); deleting unassigned node
2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:6-0x2377fee2ae80007 Deleting existing unassigned node for b713fd655fa02395496c5a6e39ddf568 that is in expected state RS_ZK_REGION_OPENED
2012-06-29 07:06:32,301 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:6-0x2377fee2ae80007 Successfully deleted unassigned node for region b713fd655fa02395496c5a6e39ddf568 in expected state RS_ZK_REGION_OPENED
2012-06-29 07:06:32,301 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: The master has opened the region CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. that was online on serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=518945, regions=575, usedHeap=15282, maxHeap=31301)
2012-06-29 07:07:41,140 WARN org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. to serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=0, regions=575, usedHeap=0, maxHeap=0), trying to assign elsewhere instead; retry=0
java.net.SocketTimeoutException: Call to
[jira] [Updated] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.
[ https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maryann Xue updated HBASE-6299:
    Attachment: HBASE-6299-v3.patch

@ramkrishna, I have updated the patch. I had misunderstood the exception handling in HBaseClient. Thank you for pointing this out!
[jira] [Updated] (HBASE-6260) balancer state should be stored in ZK
[ https://issues.apache.org/jira/browse/HBASE-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gregory Chanan updated HBASE-6260:
    Attachment: HBASE-6260-addendum2.patch

Reattaching to try to kick off HadoopQA again.

balancer state should be stored in ZK
    Key: HBASE-6260
    URL: https://issues.apache.org/jira/browse/HBASE-6260
    Project: HBase
    Issue Type: Task
    Components: master, zookeeper
    Affects Versions: 0.96.0
    Reporter: Gregory Chanan
    Assignee: Gregory Chanan
    Priority: Blocker
    Attachments: HBASE-6260-addendum2.patch, HBASE-6260-addendum2.patch, HBASE-6260-addendum.patch, HBASE-6260.patch, HBASE-6260-v2.patch

See: https://issues.apache.org/jira/browse/HBASE-5953?focusedCommentId=13270200&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13270200
And: https://issues.apache.org/jira/browse/HBASE-5630?focusedCommentId=13399225&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13399225

In short, we need to move the balancer state to ZK so that it won't have to be restarted if the master dies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HBASE-6260) balancer state should be stored in ZK
[ https://issues.apache.org/jira/browse/HBASE-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gregory Chanan reopened HBASE-6260:
[jira] [Updated] (HBASE-6260) balancer state should be stored in ZK
[ https://issues.apache.org/jira/browse/HBASE-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gregory Chanan updated HBASE-6260:
    Status: Patch Available (was: Reopened)
[jira] [Commented] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.
[ https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455614#comment-13455614 ]

Hadoop QA commented on HBASE-6299:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12545102/HBASE-6299-v3.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
       Please justify why no new tests are needed for this patch.
       Also please list what manual steps were performed to verify this patch.

    +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    -1 javac. The patch appears to cause mvn compile goal to fail.

    -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2866//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2866//console

This message is automatically generated.
[jira] [Commented] (HBASE-6780) On the master status page the Number of Requests per second is incorrect for RegionServer's
[ https://issues.apache.org/jira/browse/HBASE-6780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455627#comment-13455627 ]

Hudson commented on HBASE-6780:

Integrated in HBase-TRUNK #3330 (See [https://builds.apache.org/job/HBase-TRUNK/3330/])
    HBASE-6780 On the master status page the Number of Requests per second is incorrect for RegionServer's (Revision 1384648)

     Result = FAILURE
stack :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ServerLoad.java

On the master status page the Number of Requests per second is incorrect for RegionServer's
    Key: HBASE-6780
    URL: https://issues.apache.org/jira/browse/HBASE-6780
    Project: HBase
    Issue Type: Bug
    Components: master
    Reporter: Elliott Clark
    Assignee: Elliott Clark
    Fix For: 0.96.0
    Attachments: HBASE-6780-0.patch

The number of requests per second is getting divided when it shouldn't be.
[jira] [Commented] (HBASE-6260) balancer state should be stored in ZK
[ https://issues.apache.org/jira/browse/HBASE-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455644#comment-13455644 ]

Hadoop QA commented on HBASE-6260:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12545107/HBASE-6260-addendum2.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
       Please justify why no new tests are needed for this patch.
       Also please list what manual steps were performed to verify this patch.

    +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    -1 javac. The patch appears to cause mvn compile goal to fail.

    -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:
       org.apache.hadoop.hbase.regionserver.TestAtomicOperation
       org.apache.hadoop.hbase.replication.TestReplication

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2867//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2867//console

This message is automatically generated.
[jira] [Commented] (HBASE-6518) Bytes.toBytesBinary() incorrect trailing backslash escape
[ https://issues.apache.org/jira/browse/HBASE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455654#comment-13455654 ]

Tudor Scurtu commented on HBASE-6518:

Thank you, too.

Bytes.toBytesBinary() incorrect trailing backslash escape
    Key: HBASE-6518
    URL: https://issues.apache.org/jira/browse/HBASE-6518
    Project: HBase
    Issue Type: Bug
    Components: util
    Reporter: Tudor Scurtu
    Assignee: Tudor Scurtu
    Priority: Trivial
    Labels: patch
    Fix For: 0.96.0
    Attachments: HBASE-6518.patch

Bytes.toBytesBinary() converts escaped strings to byte arrays. When encountering a '\' character, it looks at the next one to see if it is an 'x', without checking if it exists.
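To make the trailing-backslash problem concrete, here is a minimal standalone sketch of an escaped-string parser with the bounds check in place. It is not the attached patch and not the HBase implementation: the class name `BinaryEscapeSketch` is hypothetical, and the choice to treat an incomplete or malformed escape as literal characters is an assumption made for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch, not the HBase Bytes class.
final class BinaryEscapeSketch {
    // Converts a string with \xNN escapes into bytes.
    // Characters outside ASCII are truncated to their low byte.
    static byte[] toBytesBinary(String in) {
        byte[] out = new byte[in.length()];
        int size = 0;
        for (int i = 0; i < in.length(); i++) {
            char ch = in.charAt(i);
            // Treat "\xNN" as an escape only when all four characters exist.
            if (ch == '\\' && i + 3 < in.length() && in.charAt(i + 1) == 'x') {
                int hi = Character.digit(in.charAt(i + 2), 16);
                int lo = Character.digit(in.charAt(i + 3), 16);
                if (hi >= 0 && lo >= 0) {
                    out[size++] = (byte) ((hi << 4) | lo);
                    i += 3; // skip 'x' and the two hex digits
                    continue;
                }
            }
            // Assumption: anything else, including a trailing '\', is literal.
            out[size++] = (byte) ch;
        }
        return Arrays.copyOf(out, size);
    }
}
```

With this check, an input that ends in a single backslash is returned with that backslash as its last byte instead of reading past the end of the string.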
[jira] [Commented] (HBASE-6694) Test scanner batching in export job feature HBASE-6372 AND report on improvement HBASE-6372 adds
[ https://issues.apache.org/jira/browse/HBASE-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455664#comment-13455664 ]

Alexander Alten-Lorenz commented on HBASE-6694:

YCSB won't work. I am using a Ruby script now, but it takes some time to create an excessive number of columns.

Test scanner batching in export job feature HBASE-6372 AND report on improvement HBASE-6372 adds
    Key: HBASE-6694
    URL: https://issues.apache.org/jira/browse/HBASE-6694
    Project: HBase
    Issue Type: Task
    Reporter: stack
    Assignee: Alexander Alten-Lorenz
    Attachments: HBASE-6694.patch

From the tail of HBASE-6372: Jon had raised the issue that the test added did not actually test the feature. This issue is about adding a test of HBASE-6372. We should also have numbers for the improvement that HBASE-6372 brings.
[jira] [Commented] (HBASE-6776) Opened region of disabled/enabling table is not added to online region list
[ https://issues.apache.org/jira/browse/HBASE-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455690#comment-13455690 ]

ramkrishna.s.vasudevan commented on HBASE-6776:

- Adding all regions to the online map?
How does EnableTableHandler work then?
{code}
List<HRegionInfo> regionsInMeta;
regionsInMeta = MetaReader.getTableRegions(this.ct, tableName, true);
int countOfRegionsInTable = regionsInMeta.size();
List<HRegionInfo> regions = regionsToAssign(regionsInMeta);
{code}
This piece of code will not allow the assignment to happen, right?
- Changing from DISABLED to DISABLING
Under what condition will this happen? If the state is DISABLED, surely we know that the disabling has completed fully?

Opened region of disabled/enabling table is not added to online region list
    Key: HBASE-6776
    URL: https://issues.apache.org/jira/browse/HBASE-6776
    Project: HBase
    Issue Type: Bug
    Reporter: Jimmy Xiang
    Assignee: Jimmy Xiang
    Attachments: trunk-6776.patch

An opened region of a disabled table should be added to the online region list and then closed. We should not just ignore it.
An opened region of an enabling table should be added to the online region list, so that we don't have to assign it again. Without adding it to the online region list, it could be double assigned when it is assigned again later.
[jira] [Commented] (HBASE-6782) HBase shell's 'status 'detailed'' should escape the printed keys
[ https://issues.apache.org/jira/browse/HBASE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455723#comment-13455723 ]

Viji commented on HBASE-6782:

Please assign this to me.

HBase shell's 'status 'detailed'' should escape the printed keys
    Key: HBASE-6782
    URL: https://issues.apache.org/jira/browse/HBASE-6782
    Project: HBase
    Issue Type: Bug
    Components: shell
    Affects Versions: 0.90.1
    Reporter: Viji
    Priority: Minor

Currently the HBase shell's status command prints unescaped keys on the terminal causing the terminal to print garbage characters. We should escape the printed keys.
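As an illustration of what escaping the printed keys could look like, the sketch below renders printable ASCII bytes as-is and every other byte as \xNN. It is only an assumption about the desired output format: `KeyEscapeSketch` is a hypothetical name, and the issue does not say how the shell should implement the fix.

```java
// Hypothetical sketch; not the HBase shell code.
final class KeyEscapeSketch {
    // Renders a raw key so that it is safe to print on a terminal.
    static String escape(byte[] key) {
        StringBuilder sb = new StringBuilder();
        for (byte b : key) {
            int c = b & 0xFF; // treat the byte as unsigned
            if (c >= 0x20 && c < 0x7F && c != '\\') {
                sb.append((char) c); // printable ASCII passes through
            } else {
                // Control bytes, high bytes and '\' itself become \xNN.
                sb.append(String.format("\\x%02X", c));
            }
        }
        return sb.toString();
    }
}
```

Escaping the backslash itself keeps the output unambiguous, so an escaped key can be told apart from a key that literally contains the characters `\x00`.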
[jira] [Updated] (HBASE-3834) Store ignores checksum errors when opening files
[ https://issues.apache.org/jira/browse/HBASE-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liang xie updated HBASE-3834:
    Attachment: hbase-3834.tar.gz2

Hi all, I ran a manual test today. It turns out this issue should be gone in the latest version, at least in 0.94.0.
Here are my test details:
0) My env: ubuntu 10.10, hbase-0.94.0 release, standalone mode
1) Start HBase
2) From the hbase shell:
hbase(main):002:0> status
1 servers, 0 dead, 2. average load
hbase(main):003:0> version
0.94.0, r1332822, Tue May 1 21:43:54 UTC 2012
hbase(main):004:0> create 'test','cf'
0 row(s) in 1.1520 seconds
hbase(main):005:0> list 'test'
TABLE
test
1 row(s) in 0.0230 seconds
hbase(main):010:0> put 'test','row1','cf:a','value1'
0 row(s) in 0.0160 seconds
hbase(main):011:0> put 'test','row2','cf:b','value2'
0 row(s) in 0.0070 seconds
hbase(main):012:0> put 'test','row3','cf:c','value3'
0 row(s) in 0.0070 seconds
hbase(main):017:0> scan 'test'
ROW COLUMN+CELL
 row1 column=cf:a, timestamp=1347619555652, value=value1
 row2 column=cf:b, timestamp=1347619562943, value=value2
 row3 column=cf:c, timestamp=1347619576704, value=value3
3 row(s) in 0.0310 seconds
hbase(main):027:0> flush 'test'
0 row(s) in 0.0770 seconds
hbase(main):028:0> exit
3) Shut down HBase, then edit the corresponding files with vim
4) Start HBase again
5) From the log file, we can see:
2012-09-14 18:54:15,096 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store null
2012-09-14 18:54:15,118 INFO org.apache.hadoop.fs.FSInputChecker: Found checksum error: b[0, 286]=454e01040012136866696c652e4156475f56414c55455f4c454e010400040d6866696c652e4c4153544b455901130004726f7733026366630139c462db8004a887830f545241424c4b2224010f00a60001001f02f60004000200016f72672e6170616368652e6861646f6f702e68626173652e4b657956616c7565244b6579436f6d70617261746f7201020a
org.apache.hadoop.fs.ChecksumException: Checksum error: file:/tmp/hbase/test/4c9f2cfda63b2e9785815ed2e841d052/cf/269b59832d68465687ebce880026a301 at 512
        at org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:277)
        at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:241)
        at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:176)
        ...
2012-09-14 18:54:15,123 INFO org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of region {NAME => 'test,,1347619334483.4c9f2cfda63b2e9785815ed2e841d052.', STARTKEY => '', ENDKEY => '', ENCODED => 4c9f2cfda63b2e9785815ed2e841d052,} failed, marking as FAILED_OPEN in ZK
2012-09-14 18:54:15,123 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:48394-0x139c46a10d40001 Attempting to transition node 4c9f2cfda63b2e9785815ed2e841d052 from RS_ZK_REGION_OPENING to RS_ZK_REGION_FAILED_OPEN
2012-09-14 18:54:15,137 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:48394-0x139c46a10d40001 Successfully transitioned node 4c9f2cfda63b2e9785815ed2e841d052 from RS_ZK_REGION_OPENING to RS_ZK_REGION_FAILED_OPEN
2012-09-14 18:54:15,138 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_FAILED_OPEN, server=xieliang,48394,1347620050131, region=4c9f2cfda63b2e9785815ed2e841d052
2012-09-14 18:54:15,141 DEBUG org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED event for 4c9f2cfda63b2e9785815ed2e841d052
2012-09-14 18:54:15,141
[jira] [Commented] (HBASE-3834) Store ignores checksum errors when opening files
[ https://issues.apache.org/jira/browse/HBASE-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455731#comment-13455731 ]

liang xie commented on HBASE-3834:

IMHO, this behavior now matches Todd's initial expectation.

Store ignores checksum errors when opening files
    Key: HBASE-3834
    URL: https://issues.apache.org/jira/browse/HBASE-3834
    Project: HBase
    Issue Type: Bug
    Components: regionserver
    Affects Versions: 0.90.2
    Reporter: Todd Lipcon
    Assignee: liang xie
    Priority: Critical
    Fix For: 0.90.8
    Attachments: hbase-3834.tar.gz2

If you corrupt one of the storefiles in a region (e.g. using vim to muck up some bytes), the region will still open, but that storefile will just be ignored with a log message. We should probably not do this in general - better to keep that region unassigned and force an admin to make a decision to remove the bad storefile.
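The behavior asked for here, failing the region open instead of silently skipping a corrupt storefile, can be sketched in isolation. This is a toy model under stated assumptions: `StoreOpenSketch` is a hypothetical name, byte arrays stand in for store files, CRC32 from the JDK stands in for the file system's checksum, and an unchecked exception stands in for whatever HBase actually raises.

```java
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical sketch; not the HBase Store implementation.
final class StoreOpenSketch {
    // Computes a CRC32 over the whole file content.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Opens all store files, or fails the whole open on the first mismatch.
    // Returns the number of files opened.
    static int openStoreFiles(List<byte[]> files, List<Long> expectedCrcs) {
        for (int i = 0; i < files.size(); i++) {
            if (crc32(files.get(i)) != expectedCrcs.get(i)) {
                // Do not skip the file: refuse to open so an admin can decide.
                throw new IllegalStateException(
                        "checksum mismatch in store file " + i + ", refusing to open region");
            }
        }
        return files.size();
    }
}
```

The design point is that a corrupt file aborts the whole open rather than being dropped from the store, which corresponds to the FAILED_OPEN outcome described in this issue.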
[jira] [Commented] (HBASE-6780) On the master status page the Number of Requests per second is incorrect for RegionServer's
[ https://issues.apache.org/jira/browse/HBASE-6780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455740#comment-13455740 ]

Hudson commented on HBASE-6780:

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #173 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/173/])
    HBASE-6780 On the master status page the Number of Requests per second is incorrect for RegionServer's (Revision 1384648)

     Result = FAILURE
stack :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ServerLoad.java
[jira] [Commented] (HBASE-6782) HBase shell's 'status 'detailed'' should escape the printed keys
[ https://issues.apache.org/jira/browse/HBASE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13455745#comment-13455745 ] ramkrishna.s.vasudevan commented on HBASE-6782: --- @Viji Generally, an issue is assigned to a person only once their fix has been committed. You can submit a patch, and if it goes in we can add your name. That is the procedure as far as I know. HBase shell's 'status 'detailed'' should escape the printed keys Key: HBASE-6782 URL: https://issues.apache.org/jira/browse/HBASE-6782 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.90.1 Reporter: Viji Priority: Minor Currently the HBase shell's status command prints unescaped keys on the terminal, causing the terminal to print garbage characters. We should escape the printed keys.
[jira] [Created] (HBASE-6783) Make read short circuit the default
nkeywal created HBASE-6783: -- Summary: Make read short circuit the default Key: HBASE-6783 URL: https://issues.apache.org/jira/browse/HBASE-6783 Project: HBase Issue Type: Improvement Components: test Reporter: nkeywal Assignee: nkeywal Per the mailing list discussion, read short circuit has little or no drawback, hence it should be used by default. As a consequence, we activate it in the default tests. It's possible to launch the tests with -Ddfs.client.read.shortcircuit=false to execute them without short circuit; this will be used for some builds on trunk.
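As a rough illustration of the override mentioned in this issue, a test harness could read the flag as a JVM system property and fall back to "enabled" when it is absent. This is only a sketch under that assumption: the class and method names are hypothetical and it is not the code from HBASE-6783.v1.patch; only the property name dfs.client.read.shortcircuit comes from the issue text.

```java
import java.util.Properties;

// Hypothetical helper, not part of HBase: decides whether tests should
// run with short-circuit reads, defaulting to true when the flag is unset.
final class ShortCircuitFlag {
    static final String KEY = "dfs.client.read.shortcircuit";

    // Returns true unless the property is explicitly set to something
    // other than "true" (for example -Ddfs.client.read.shortcircuit=false).
    static boolean isEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(KEY, "true"));
    }

    public static void main(String[] args) {
        System.out.println(KEY + "=" + isEnabled(System.getProperties()));
    }
}
```

Taking a Properties argument rather than reading System.getProperty directly keeps the default easy to check in isolation.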
[jira] [Updated] (HBASE-6783) Make read short circuit the default
[ https://issues.apache.org/jira/browse/HBASE-6783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-6783: --- Attachment: HBASE-6783.v1.patch Make read short circuit the default --- Key: HBASE-6783 URL: https://issues.apache.org/jira/browse/HBASE-6783 Project: HBase Issue Type: Improvement Components: test Reporter: nkeywal Assignee: nkeywal Attachments: HBASE-6783.v1.patch Per the mailing list discussion, read short circuit has little or no drawback, hence it should be used by default. As a consequence, we activate it in the default tests. It's possible to launch the tests with -Ddfs.client.read.shortcircuit=false to execute them without short circuit; this will be used for some builds on trunk.
[jira] [Updated] (HBASE-6783) Make read short circuit the default
[ https://issues.apache.org/jira/browse/HBASE-6783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-6783: --- Fix Version/s: 0.96.0 Affects Version/s: 0.96.0 Status: Patch Available (was: Open) Make read short circuit the default --- Key: HBASE-6783 URL: https://issues.apache.org/jira/browse/HBASE-6783 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Fix For: 0.96.0 Attachments: HBASE-6783.v1.patch Per the mailing list discussion, read short circuit has little or no drawback, hence it should be used by default. As a consequence, we activate it in the default tests. It's possible to launch the tests with -Ddfs.client.read.shortcircuit=false to execute them without short circuit; this will be used for some builds on trunk.
[jira] [Commented] (HBASE-6416) hbck dies on NPE when a region folder exists but the table does not
[ https://issues.apache.org/jira/browse/HBASE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13455771#comment-13455771 ] Jonathan Hsieh commented on HBASE-6416: --- I think that having to run hbck multiple times to clean up the corruptions, though not ideal, is acceptable. Ideally the tool will tell the user that they need to do that. At the end of the day, if hbase is down, having a slightly inefficient automatic solution to fix it is better than having no solution to automatically fix it. hbck dies on NPE when a region folder exists but the table does not --- Key: HBASE-6416 URL: https://issues.apache.org/jira/browse/HBASE-6416 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Fix For: 0.96.0, 0.94.3 Attachments: hbase-6416.patch, hbase-6416-v1.patch This is what I'm getting for leftover data that has no .regioninfo First: {quote} 12/07/17 23:13:37 WARN util.HBaseFsck: Failed to read .regioninfo file for region null java.io.FileNotFoundException: File does not exist: /hbase/stumble_info_urlid_user/bd5f6cfed674389b4d7b8c1be227cb46/.regioninfo at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1822) at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.init(DFSClient.java:1813) at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:544) at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456) at org.apache.hadoop.hbase.util.HBaseFsck.loadHdfsRegioninfo(HBaseFsck.java:611) at org.apache.hadoop.hbase.util.HBaseFsck.access$2200(HBaseFsck.java:140) at org.apache.hadoop.hbase.util.HBaseFsck$WorkItemHdfsRegionInfo.call(HBaseFsck.java:2882) at org.apache.hadoop.hbase.util.HBaseFsck$WorkItemHdfsRegionInfo.call(HBaseFsck.java:2866) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {quote} Then it hangs on: {quote} 12/07/17 23:13:39 INFO util.HBaseFsck: Attempting to handle orphan hdfs dir: hdfs://sfor3s24:10101/hbase/stumble_info_urlid_user/bd5f6cfed674389b4d7b8c1be227cb46 12/07/17 23:13:39 INFO util.HBaseFsck: checking orphan for table null Exception in thread "main" java.lang.NullPointerException at org.apache.hadoop.hbase.util.HBaseFsck$TableInfo.access$100(HBaseFsck.java:1634) at org.apache.hadoop.hbase.util.HBaseFsck.adoptHdfsOrphan(HBaseFsck.java:435) at org.apache.hadoop.hbase.util.HBaseFsck.adoptHdfsOrphans(HBaseFsck.java:408) at org.apache.hadoop.hbase.util.HBaseFsck.restoreHdfsIntegrity(HBaseFsck.java:529) at org.apache.hadoop.hbase.util.HBaseFsck.offlineHdfsIntegrityRepair(HBaseFsck.java:313) at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:386) at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3227) {quote} The NPE is thrown by: {code} Preconditions.checkNotNull("Table " + tableName + "' not present!", tableInfo); {code} I wonder why the condition checking was added if we don't handle it... In any case hbck dies, but it hangs because there are some non-daemon threads hanging around.
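One possible reading of the trace in this report, offered as an assumption rather than a confirmed diagnosis: Guava's Preconditions.checkNotNull takes the reference first and the error message second, so with the arguments in the order quoted, the message string (which is never null) is what gets checked, and a null tableInfo slips through to fail later inside TableInfo. The sketch below reproduces that effect with java.util.Objects.requireNonNull from the JDK, which uses the same argument order, instead of Guava; the class and method names are hypothetical.

```java
import java.util.Objects;

// Hypothetical demonstration of argument order in a null check.
final class NullCheckOrder {
    // Swapped order: the message is the checked reference, so this never
    // throws, even when tableInfo is null.
    static boolean swappedCheckPasses(Object tableInfo, String tableName) {
        try {
            Objects.requireNonNull("Table " + tableName + "' not present!",
                    String.valueOf(tableInfo));
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    // Intended order: the reference is checked and the message is attached
    // to the exception.
    static boolean correctCheckPasses(Object tableInfo, String tableName) {
        try {
            Objects.requireNonNull(tableInfo, "Table '" + tableName + "' not present!");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("swapped, null tableInfo passes: " + swappedCheckPasses(null, "t1"));
        System.out.println("correct, null tableInfo passes: " + correctCheckPasses(null, "t1"));
    }
}
```

Under this reading, the check would need the reference as its first argument to fail fast with a readable message, and the caller would still have to decide what hbck should do with an orphan directory whose table is unknown.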
[jira] [Commented] (HBASE-6612) Hbase command line improvements
[ https://issues.apache.org/jira/browse/HBASE-6612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13455781#comment-13455781 ] Michael Drzal commented on HBASE-6612: -- [~ionignat] would HBASE-6592 address your issues? If so, it might make sense to just make this a dupe of that jira. Hbase command line improvements --- Key: HBASE-6612 URL: https://issues.apache.org/jira/browse/HBASE-6612 Project: HBase Issue Type: New Feature Components: scripts, shell Affects Versions: 0.94.1 Reporter: Ionut Ignatescu Priority: Minor Currently, if the row key or any column value is something different than a string, when a scan is performed via command line, the value extracted are not decoded to a human-readable format. It would be nice to have support to some standard data types(long,double,etc..) or to specify some custom decoders(this would be extremely useful for tables having composed keys). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6783) Make read short circuit the default
[ https://issues.apache.org/jira/browse/HBASE-6783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13455779#comment-13455779 ] Hadoop QA commented on HBASE-6783: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12545138/HBASE-6783.v1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause mvn compile goal to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2868//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2868//console This message is automatically generated. Make read short circuit the default --- Key: HBASE-6783 URL: https://issues.apache.org/jira/browse/HBASE-6783 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Fix For: 0.96.0 Attachments: HBASE-6783.v1.patch Per the mailing list discussion, read short circuit has little or no drawback, hence it should be used by default. As a consequence, we activate it in the default tests. It's possible to launch the tests with -Ddfs.client.read.shortcircuit=false to execute them without short circuit; this will be used for some builds on trunk.
[jira] [Commented] (HBASE-6624) [Replication]currentNbOperations should set to 0 after update the shippedOpsRate
[ https://issues.apache.org/jira/browse/HBASE-6624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13455796#comment-13455796 ] Michael Drzal commented on HBASE-6624: -- This seems like a reasonable change. You will need to rebase the patch, as HBASE-6623 got committed, so you will need to move that down a line at the very least. Unless I'm misreading the code. [Replication]currentNbOperations should set to 0 after update the shippedOpsRate Key: HBASE-6624 URL: https://issues.apache.org/jira/browse/HBASE-6624 Project: HBase Issue Type: Bug Components: replication Affects Versions: 0.94.0 Reporter: terry zhang Assignee: terry zhang Attachments: jira-6624.patch Currently, currentNbOperations is never reset to 0 and keeps increasing after replication starts. This value is used to calculate shippedOpsRate; if it is not reset to 0, shippedOpsRate is incorrect.
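The effect described in this issue can be sketched with a small counter. This is a simplified model, not the replication source code: the class ShippedOpsMeter and its methods are hypothetical, and only the names currentNbOperations and shippedOpsRate come from the issue. The reset flag exists only to contrast the two behaviours.

```java
// Hypothetical model of a per-interval rate computed from a running counter.
final class ShippedOpsMeter {
    private long currentNbOperations;

    // Record operations shipped since the last refresh.
    void shipped(long ops) {
        currentNbOperations += ops;
    }

    // Compute operations per second for the elapsed interval. When reset is
    // false the counter keeps accumulating, so later rates are inflated.
    double refreshShippedOpsRate(long intervalMillis, boolean reset) {
        double shippedOpsRate = currentNbOperations * 1000.0 / intervalMillis;
        if (reset) {
            currentNbOperations = 0;
        }
        return shippedOpsRate;
    }

    public static void main(String[] args) {
        ShippedOpsMeter meter = new ShippedOpsMeter();
        meter.shipped(100);
        System.out.println(meter.refreshShippedOpsRate(1000, false));
        meter.shipped(100);
        System.out.println(meter.refreshShippedOpsRate(1000, false));
    }
}
```

With the reset in place, two identical one-second intervals of 100 operations each report the same rate; without it, the second interval reports the cumulative total.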
[jira] [Commented] (HBASE-6627) TestMultiVersions.testGetRowVersions is flaky
[ https://issues.apache.org/jira/browse/HBASE-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13455806#comment-13455806 ] Michael Drzal commented on HBASE-6627: -- [~nkeywal] do you want to keep this around or close it and reopen if the issue comes back up? TestMultiVersions.testGetRowVersions is flaky - Key: HBASE-6627 URL: https://issues.apache.org/jira/browse/HBASE-6627 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.96.0 Environment: hadoop-qa mainly, seems to happen tests in parallel; difficult to reproduce on a single test. Reporter: nkeywal Assignee: nkeywal Attachments: 6627.v1.patch org.apache.hadoop.hbase.TestMultiVersions.testGetRowVersions Shutting down Stacktrace java.io.IOException: Shutting down at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:229) at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:92) at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:688) at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:661) at org.apache.hadoop.hbase.TestMultiVersions.testGetRowVersions(TestMultiVersions.java:143) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6627) TestMultiVersions.testGetRowVersions is flaky
[ https://issues.apache.org/jira/browse/HBASE-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13455808#comment-13455808 ] nkeywal commented on HBASE-6627: Hey, you're right, it disappeared from trunk. Let's close this. TestMultiVersions.testGetRowVersions is flaky - Key: HBASE-6627 URL: https://issues.apache.org/jira/browse/HBASE-6627 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.96.0 Environment: hadoop-qa mainly, seems to happen tests in parallel; difficult to reproduce on a single test. Reporter: nkeywal Assignee: nkeywal Attachments: 6627.v1.patch org.apache.hadoop.hbase.TestMultiVersions.testGetRowVersions Shutting down Stacktrace java.io.IOException: Shutting down at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:229) at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:92) at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:688) at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:661) at org.apache.hadoop.hbase.TestMultiVersions.testGetRowVersions(TestMultiVersions.java:143) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6627) TestMultiVersions.testGetRowVersions is flaky
[ https://issues.apache.org/jira/browse/HBASE-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-6627: --- Resolution: Cannot Reproduce Status: Resolved (was: Patch Available) TestMultiVersions.testGetRowVersions is flaky - Key: HBASE-6627 URL: https://issues.apache.org/jira/browse/HBASE-6627 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.96.0 Environment: hadoop-qa mainly, seems to happen tests in parallel; difficult to reproduce on a single test. Reporter: nkeywal Assignee: nkeywal Attachments: 6627.v1.patch org.apache.hadoop.hbase.TestMultiVersions.testGetRowVersions Shutting down Stacktrace java.io.IOException: Shutting down at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:229) at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:92) at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:688) at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:661) at org.apache.hadoop.hbase.TestMultiVersions.testGetRowVersions(TestMultiVersions.java:143) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6723) Make AssignmentManager pluggable
[ https://issues.apache.org/jira/browse/HBASE-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13455810#comment-13455810 ] Jonathan Hsieh commented on HBASE-6723: --- From discussion earlier in the week -- balancers are already pluggable and probably the better place for the functionality you want. Make AssignmentManager pluggable Key: HBASE-6723 URL: https://issues.apache.org/jira/browse/HBASE-6723 Project: HBase Issue Type: Sub-task Reporter: Francis Liu
[jira] [Commented] (HBASE-6441) MasterFS doesn't set scheme for internal FileSystem
[ https://issues.apache.org/jira/browse/HBASE-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13455833#comment-13455833 ] Jonathan Hsieh commented on HBASE-6441: --- i was trying to get the 0.94 and trunk builds to work against hadoop 2.0 a while back and had a problem with TestImportExport. (I believe writes went to one local fs and reads went to hdfs or vice versa.). I had a gross hack that made the test pass but I never got to the point understanding why it worked so I didn't commit it. MasterFS doesn't set scheme for internal FileSystem --- Key: HBASE-6441 URL: https://issues.apache.org/jira/browse/HBASE-6441 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.96.0, 0.94.2 Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.96.0 Attachments: java_HBASE-6441_v0.patch FSUtils.getRootDir() just takes a configuration object, which is used to: 1) Get the name of the root directory 2) Create a filesystem (based on the configured scheme) 3) Qualify the root onto the filesystem However, the FileSystem from the master filesystem won't generate the correctly qualified root directory under hadoop-2.0 (though it works fine on hadoop-1.0). Seems to be an issue with the configuration parameters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6527) Make custom filters plugin
[ https://issues.apache.org/jira/browse/HBASE-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13455844#comment-13455844 ] Jonathan Hsieh commented on HBASE-6527: --- This was in reference to HBASE-6509, a patch that I -0'ed. That filter took advantage of the fast-forwarding mechanism in filtering, but I felt the filter was a bit too custom to be included in hbase core. I wasn't aware of jiras for scripted filters -- I think that would be a fine alternative to having a filter plugin mechanism. [~larsgeorge] can you provide a link to the scripted filter jira? Make custom filters plugin -- Key: HBASE-6527 URL: https://issues.apache.org/jira/browse/HBASE-6527 Project: HBase Issue Type: Bug Reporter: Ted Yu More and more custom Filters are being created. We should provide a plugin mechanism for these custom Filters.
[jira] [Commented] (HBASE-6710) 0.92/0.94 compatibility issues due to HBASE-5206
[ https://issues.apache.org/jira/browse/HBASE-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13455849#comment-13455849 ] Ted Yu commented on HBASE-6710: --- bq. the client may report inaccurate results for is_enabled and is_disabled. How does user verify whether the above commands returned inaccurate results ? Thanks 0.92/0.94 compatibility issues due to HBASE-5206 Key: HBASE-6710 URL: https://issues.apache.org/jira/browse/HBASE-6710 Project: HBase Issue Type: Bug Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Critical Fix For: 0.94.2 Attachments: HBASE-6710-v3.patch HBASE-5206 introduces some compatibility issues between {0.94,0.94.1} and {0.92.0,0.92.1}. The release notes of HBASE-5155 describes the issue (HBASE-5206 is a backport of HBASE-5155). I think we can make 0.94.2 compatible with both {0.94.0,0.94.1} and {0.92.0,0.92.1}, although one of those sets will require configuration changes. The basic problem is that there is a znode for each table zookeeper.znode.tableEnableDisable that is handled differently. On 0.92.0 and 0.92.1 the states for this table are: [ disabled, disabling, enabling ] or deleted if the table is enabled On 0.94.1 and 0.94.2 the states for this table are: [ disabled, disabling, enabling, enabled ] What saves us is that the location of this znode is configurable. So the basic idea is to have the 0.94.2 master write two different znodes, zookeeper.znode.tableEnableDisabled92 and zookeeper.znode.tableEnableDisabled94 where the 92 node is in 92 format, the 94 node is in 94 format. And internally, the master would only use the 94 format in order to solve the original bug HBASE-5155 solves. We can of course make one of these the same default as exists now, so we don't need to make config changes for one of 0.92 or 0.94 clients. I argue that 0.92 clients shouldn't have to make config changes for the same reason I argued above. But that is debatable. 
Then, I think the only question left is the question of how to bring along the {0.94.0, 0.94.1} crew. A {0.94.0, 0.94.1} client would work against a 0.94.2 cluster by just configuring zookeeper.znode.tableEnableDisable in the client to be whatever zookeeper.znode.tableEnableDisabled94 is in the cluster. A 0.94.2 client would work against both a {0.94.0, 0.94.1} and {0.92.0, 0.92.1} cluster if it had HBASE-6268 applied. About rolling upgrade from {0.94.0, 0.94.1} to 0.94.2 -- I'd have to think about that. Do the regionservers ever read the tableEnableDisabled znode? On the mailing list, Lars H suggested the following: The only input I'd have is that format we'll use going forward will not have a version attached to it. So maybe the 92 version would still be called zookeeper.znode.tableEnableDisable and the new node could have a different name zookeeper.znode.tableEnableDisableNew (or something). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
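The two-znode idea in this description can be modelled in a few lines. This is a sketch under loud assumptions: plain in-memory maps stand in for ZooKeeper, the class DualFormatTableStates is hypothetical, and it is not the code from HBASE-6710-v3.patch. It only encodes the two formats as the description states them: the 92 format has no "enabled" state and deletes the node instead, while the 94 format stores "enabled" explicitly.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of a master writing table state in both formats.
final class DualFormatTableStates {
    enum TableState { DISABLED, DISABLING, ENABLING, ENABLED }

    // Stand-ins for the two znode trees, keyed by table name.
    final Map<String, TableState> znode92 = new HashMap<>();
    final Map<String, TableState> znode94 = new HashMap<>();

    void setTableState(String tableName, TableState state) {
        // 94 format: every state, including ENABLED, is written out.
        znode94.put(tableName, state);
        // 92 format: an enabled table is represented by the absence of a node.
        if (state == TableState.ENABLED) {
            znode92.remove(tableName);
        } else {
            znode92.put(tableName, state);
        }
    }

    public static void main(String[] args) {
        DualFormatTableStates states = new DualFormatTableStates();
        states.setTableState("t1", TableState.ENABLING);
        states.setTableState("t1", TableState.ENABLED);
        System.out.println("92 view: " + states.znode92 + ", 94 view: " + states.znode94);
    }
}
```

In a real cluster the two writes are separate ZooKeeper operations and either one can fail, so the two views may diverge in ways this in-memory model cannot show.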
[jira] [Created] (HBASE-6784) TestCoprocessorScanPolicy is sometimes flaky when run locally
ramkrishna.s.vasudevan created HBASE-6784: - Summary: TestCoprocessorScanPolicy is sometimes flaky when run locally Key: HBASE-6784 URL: https://issues.apache.org/jira/browse/HBASE-6784 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Priority: Minor The problem is not seen in the Jenkins build. When we run TestCoprocessorScanPolicy.testBaseCases locally or on our internal Jenkins, we tend to get random failures. The reason is that the two puts done here sometimes get the same timestamp. This leads to improper scan results, as the version check tends to skip one of the rows on seeing that the timestamps are the same. Marking this as minor. As we are trying to solve test-case-related failures, I am just raising this in case we need to resolve it as well. For example, both puts get the time {code} time 1347635287360 time 1347635287360 {code}
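Why identical timestamps can hide a write may be easier to see in a toy model. The sketch below treats the versions of one cell as a map keyed by timestamp, so a second put with the same timestamp leaves only one visible version. This is a deliberate simplification and an assumption about the mechanism, not HBase's actual storage or comparator logic; the class CellVersions is hypothetical.

```java
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical model: versions of a single cell, newest timestamp first.
final class CellVersions {
    private final TreeMap<Long, String> versions =
            new TreeMap<>(Comparator.reverseOrder());

    // A put with an already-used timestamp replaces that version instead of
    // adding a new one.
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    int visibleVersions() {
        return versions.size();
    }

    String latest() {
        return versions.firstEntry().getValue();
    }

    public static void main(String[] args) {
        CellVersions cell = new CellVersions();
        cell.put(1347635287360L, "first");
        cell.put(1347635287360L, "second");
        System.out.println("visible versions: " + cell.visibleVersions());
    }
}
```

A test that expects two versions would typically avoid the collision by supplying distinct explicit timestamps for the two puts rather than relying on the clock.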
[jira] [Updated] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rajeshbabu updated HBASE-6438: -- Attachment: HBASE-6438_94_3.patch Updated patch for 94. RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies -- Key: HBASE-6438 URL: https://issues.apache.org/jira/browse/HBASE-6438 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Assignee: rajeshbabu Fix For: 0.96.0, 0.92.3, 0.94.3 Attachments: HBASE-6438_2.patch, HBASE-6438_94_3.patch, HBASE-6438_94.patch, HBASE-6438_trunk.patch Seeing some of the recent issues in region assignment, RegionAlreadyInTransitionException is one reason after which the region assignment may or may not happen(in the sense we need to wait for the TM to assign). In HBASE-6317 we got one problem due to RegionAlreadyInTransitionException on master restart. Consider the following case, due to some reason like master restart or external assign call, we try to assign a region that is already getting opened in a RS. Now the next call to assign has already changed the state of the znode and so the current assign that is going on the RS is affected and it fails. The second assignment that started also fails getting RAITE exception. Finally both assignments not carrying on. Idea is to find whether any such RAITE exception can be retried or not. Here again we have following cases like where - The znode is yet to transitioned from OFFLINE to OPENING in RS - RS may be in the step of openRegion. - RS may be trying to transition OPENING to OPENED. - RS is yet to add to online regions in the RS side. Here in openRegion() and updateMeta() any failures we are moving the znode to FAILED_OPEN. So in these cases getting an RAITE should be ok. But in other cases the assignment is stopped. 
The idea is to just add the current state of the region assignment in the RIT map in the RS side and using that info we can determine whether the assignment can be retried or not on getting an RAITE. Considering the current work going on in AM, pls do share if this is needed atleast in the 0.92/0.94 versions? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6776) Opened region of disabled/enabling table is not added to online region list
[ https://issues.apache.org/jira/browse/HBASE-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13455915#comment-13455915 ] Jimmy Xiang commented on HBASE-6776: regionsToAssign will remove those already online regions from the regionsInMeta, then assign the rest of regions. That's what we want. This could happen during unit test for sure, some man-made corruptions. Another scenario could be because of racing between disabling a table, and assigning the table without going through ZK. Opened region of disabled/enabling table is not added to online region list --- Key: HBASE-6776 URL: https://issues.apache.org/jira/browse/HBASE-6776 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: trunk-6776.patch For opened region of disabled table, it should be added to online region list, and then closed. We should not just ignore them. For opened region of enabling table, it should be added to online region list, so that we don't have to assign it again. Without adding it to online region list, it could be double assigned when assign it again later. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
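The computation described in the comment above amounts to a set difference. A minimal sketch, assuming regions can be represented by their names as strings; the class RegionsToAssign and the method compute are hypothetical and do not correspond to the AssignmentManager code in trunk-6776.patch.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: regions listed in meta that are not already online.
final class RegionsToAssign {
    static Set<String> compute(Collection<String> regionsInMeta,
                               Collection<String> onlineRegions) {
        // Copy first so the caller's collection is left untouched.
        Set<String> toAssign = new LinkedHashSet<>(regionsInMeta);
        toAssign.removeAll(onlineRegions);
        return toAssign;
    }

    public static void main(String[] args) {
        Set<String> result = compute(
                Arrays.asList("r1", "r2", "r3"),
                Arrays.asList("r2"));
        System.out.println(result);
    }
}
```

This only works if already opened regions have been added to the online list beforehand, which is the point the issue makes about double assignment.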
[jira] [Commented] (HBASE-6710) 0.92/0.94 compatibility issues due to HBASE-5206
[ https://issues.apache.org/jira/browse/HBASE-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455919#comment-13455919 ]

Ted Yu commented on HBASE-6710:

I was trying to wrap up the 0.92.2 release, so pardon me for the late comments.

Looking at the code for void setTableState(final String tableName, final TableState state):
{code}
if (settingToEnabled) {
  ZKUtil.deleteNodeFailSilent(this.watcher, znode92);
}
{code}
Should we fail silently if node deletion fails? Would this increase the chance of inconsistency between znode92 and znode?
Should hbck be enhanced to detect / fix such inconsistencies?

0.92/0.94 compatibility issues due to HBASE-5206

Key: HBASE-6710
URL: https://issues.apache.org/jira/browse/HBASE-6710
Project: HBase
Issue Type: Bug
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
Fix For: 0.94.2
Attachments: HBASE-6710-v3.patch

HBASE-5206 introduces some compatibility issues between {0.94.0, 0.94.1} and {0.92.0, 0.92.1}. The release notes of HBASE-5155 describe the issue (HBASE-5206 is a backport of HBASE-5155).

I think we can make 0.94.2 compatible with both {0.94.0, 0.94.1} and {0.92.0, 0.92.1}, although one of those sets will require configuration changes.

The basic problem is that there is a znode for each table (zookeeper.znode.tableEnableDisable) that is handled differently.

On 0.92.0 and 0.92.1 the states for this table are:
[ disabled, disabling, enabling ] or deleted if the table is enabled

On 0.94.0 and 0.94.1 the states for this table are:
[ disabled, disabling, enabling, enabled ]

What saves us is that the location of this znode is configurable. So the basic idea is to have the 0.94.2 master write two different znodes, zookeeper.znode.tableEnableDisabled92 and zookeeper.znode.tableEnableDisabled94, where the 92 node is in 92 format and the 94 node is in 94 format. Internally, the master would only use the 94 format, in order to solve the original bug that HBASE-5155 solves.

We can of course give one of these the same default as exists now, so that one of the 0.92 or 0.94 client sets needs no config changes. I argue that 0.92 clients shouldn't have to make config changes, for the same reason I argued above. But that is debatable.

Then, I think the only question left is how to bring along the {0.94.0, 0.94.1} crew. A {0.94.0, 0.94.1} client would work against a 0.94.2 cluster by just configuring zookeeper.znode.tableEnableDisable in the client to be whatever zookeeper.znode.tableEnableDisabled94 is in the cluster. A 0.94.2 client would work against both a {0.94.0, 0.94.1} and a {0.92.0, 0.92.1} cluster if it had HBASE-6268 applied.

About rolling upgrade from {0.94.0, 0.94.1} to 0.94.2 -- I'd have to think about that. Do the regionservers ever read the tableEnableDisabled znode?

On the mailing list, Lars H suggested the following:
The only input I'd have is that the format we'll use going forward will not have a version attached to it. So maybe the 92 version would still be called zookeeper.znode.tableEnableDisable and the new node could have a different name, zookeeper.znode.tableEnableDisableNew (or something).
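The dual-znode idea in HBASE-6710 (write every state in the 94 format, and in the 92 format represent "enabled" by deleting the node) can be illustrated with a small in-memory model. This is a sketch only: the map stands in for ZooKeeper, and the parent paths /hbase/table92 and /hbase/table94, the class name and the method names are hypothetical and not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

enum TableState { DISABLED, DISABLING, ENABLING, ENABLED }

final class DualZnodeSketch {
    // Stand-in for ZooKeeper: znode path -> data. Not a real ZK client.
    final Map<String, String> znodes = new HashMap<>();

    // Hypothetical parent paths for the two formats.
    static final String PARENT_92 = "/hbase/table92";
    static final String PARENT_94 = "/hbase/table94";

    void setTableState(String table, TableState state) {
        // 94 format: always write the state, including ENABLED.
        znodes.put(PARENT_94 + "/" + table, state.name());

        String znode92 = PARENT_92 + "/" + table;
        if (state == TableState.ENABLED) {
            // 92 format: an enabled table is represented by a missing znode.
            znodes.remove(znode92);
        } else {
            znodes.put(znode92, state.name());
        }
    }

    // What a 92-format reader would conclude: an absent znode means enabled.
    TableState read92(String table) {
        String data = znodes.get(PARENT_92 + "/" + table);
        return data == null ? TableState.ENABLED : TableState.valueOf(data);
    }

    // What a 94-format reader sees; null when the table was never written.
    TableState read94(String table) {
        String data = znodes.get(PARENT_94 + "/" + table);
        return data == null ? null : TableState.valueOf(data);
    }
}
```

In this model the two writes are not atomic, which is exactly the concern raised in the comment above: if the delete of the 92 node fails silently, the two views can disagree until something repairs them.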
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455921#comment-13455921 ] Hadoop QA commented on HBASE-6438: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12545165/HBASE-6438_94_3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2869//console This message is automatically generated.
[jira] [Commented] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.
[ https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455923#comment-13455923 ]

Ted Yu commented on HBASE-6299:

@Maryann:
Latest patch looks good.
In the future, please increase the revision number for newer patches.
Thanks

RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.

Key: HBASE-6299
URL: https://issues.apache.org/jira/browse/HBASE-6299
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.90.6, 0.94.0
Reporter: Maryann Xue
Assignee: Maryann Xue
Priority: Critical
Fix For: 0.96.0, 0.92.3, 0.94.3
Attachments: HBASE-6299.patch, HBASE-6299-v2.patch, HBASE-6299-v3.patch

1. HMaster tries to assign a region to an RS.
2. HMaster creates a RegionState for this region and puts it into regionsInTransition.
3. In the first assign attempt, HMaster calls RS.openRegion(). The RS receives the open region request and starts processing it, eventually succeeding. However, due to network problems, HMaster fails to receive the response to the openRegion() call, and the call times out.
4. HMaster attempts to assign a second time, choosing another RS.
5. But since the HMaster's OpenedRegionHandler has been triggered by the region open on the previous RS, and the RegionState has already been removed from regionsInTransition, HMaster considers the unassigned ZK node RS_ZK_REGION_OPENING updated by the second attempt invalid and ignores it.
6. The unassigned ZK node stays, and a later unassign fails because RS_ZK_REGION_CLOSING cannot be created.
{code}
2012-06-29 07:03:38,870 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for region CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.; plan=hri=CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568., src=swbss-hadoop-004,60020,1340890123243, dest=swbss-hadoop-006,60020,1340890678078
2012-06-29 07:03:38,870 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. to swbss-hadoop-006,60020,1340890678078
2012-06-29 07:03:38,870 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=swbss-hadoop-002:6, region=b713fd655fa02395496c5a6e39ddf568
2012-06-29 07:06:28,882 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, region=b713fd655fa02395496c5a6e39ddf568
2012-06-29 07:06:32,291 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, region=b713fd655fa02395496c5a6e39ddf568
2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=swbss-hadoop-006,60020,1340890678078, region=b713fd655fa02395496c5a6e39ddf568
2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED event for CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. from serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=518945, regions=575, usedHeap=15282, maxHeap=31301); deleting unassigned node
2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:6-0x2377fee2ae80007 Deleting existing unassigned node for b713fd655fa02395496c5a6e39ddf568 that is in expected state RS_ZK_REGION_OPENED
2012-06-29 07:06:32,301 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:6-0x2377fee2ae80007 Successfully deleted unassigned node for region b713fd655fa02395496c5a6e39ddf568 in expected state RS_ZK_REGION_OPENED
2012-06-29 07:06:32,301 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: The master has opened the region CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. that was online on serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=518945, regions=575, usedHeap=15282, maxHeap=31301)
2012-06-29 07:07:41,140 WARN org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. to serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=0,
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455937#comment-13455937 ]

ramkrishna.s.vasudevan commented on HBASE-6438:

[~jenraj]
Thanks for the patch. This patch addresses only the current JIRA. Maryann's latest patch addresses HBASE-6299.
[jira] [Comment Edited] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455937#comment-13455937 ]

ramkrishna.s.vasudevan edited comment on HBASE-6438 at 9/15/12 3:56 AM:

@Rajesh
Thanks for the patch. This patch addresses only the current JIRA. Maryann's latest patch addresses HBASE-6299.

was (Author: ram_krish):
[~jenraj]
Thanks for the patch. This patch addresses only the current JIRA. Maryann's latest patch addresses HBASE-6299.
[jira] [Updated] (HBASE-6260) balancer state should be stored in ZK
[ https://issues.apache.org/jira/browse/HBASE-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-6260:

Attachment: 6260-addendum-3.txt

Addendum v3 combines the first two addenda.

balancer state should be stored in ZK

Key: HBASE-6260
URL: https://issues.apache.org/jira/browse/HBASE-6260
Project: HBase
Issue Type: Task
Components: master, zookeeper
Affects Versions: 0.96.0
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Blocker
Attachments: 6260-addendum-3.txt, HBASE-6260-addendum2.patch, HBASE-6260-addendum2.patch, HBASE-6260-addendum.patch, HBASE-6260.patch, HBASE-6260-v2.patch

See: https://issues.apache.org/jira/browse/HBASE-5953?focusedCommentId=13270200&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13270200
And: https://issues.apache.org/jira/browse/HBASE-5630?focusedCommentId=13399225&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13399225

In short, we need to move the balancer state to ZK so that it won't have to be restarted if the master dies.
[jira] [Updated] (HBASE-5448) Support for dynamic coprocessor endpoints with PB-based RPC
[ https://issues.apache.org/jira/browse/HBASE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-5448: - Attachment: HBASE-5448_3.patch New patch updating package javadoc for org.apache.hadoop.hbase.coprocessor and org.apache.hadoop.hbase.client.coprocessor, addressing some review comments, and adding an example coprocessor Service implementation (RowCountEndpoint). Support for dynamic coprocessor endpoints with PB-based RPC --- Key: HBASE-5448 URL: https://issues.apache.org/jira/browse/HBASE-5448 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gary Helmling Fix For: 0.96.0 Attachments: HBASE-5448_2.patch, HBASE-5448_3.patch, HBASE-5448.patch
[jira] [Updated] (HBASE-5448) Support for dynamic coprocessor endpoints with PB-based RPC
[ https://issues.apache.org/jira/browse/HBASE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-5448: - Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-5448) Support for dynamic coprocessor endpoints with PB-based RPC
[ https://issues.apache.org/jira/browse/HBASE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-5448: - Status: Patch Available (was: Open)
[jira] [Created] (HBASE-6786) Convert MultiRowMutationProtocol to protocol buffer service
Gary Helmling created HBASE-6786: Summary: Convert MultiRowMutationProtocol to protocol buffer service Key: HBASE-6786 URL: https://issues.apache.org/jira/browse/HBASE-6786 Project: HBase Issue Type: Sub-task Components: coprocessors Reporter: Gary Helmling Fix For: 0.96.0 With coprocessor endpoints now exposed as protobuf defined services, we should convert over all of our built-in endpoints to PB services.
[jira] [Created] (HBASE-6785) Convert AggregateProtocol to protobuf defined coprocessor service
Gary Helmling created HBASE-6785: Summary: Convert AggregateProtocol to protobuf defined coprocessor service Key: HBASE-6785 URL: https://issues.apache.org/jira/browse/HBASE-6785 Project: HBase Issue Type: Sub-task Components: coprocessors Reporter: Gary Helmling Fix For: 0.96.0 With coprocessor endpoints now exposed as protobuf defined services, we should convert over all of our built-in endpoints to PB services.
[jira] [Created] (HBASE-6787) Convert RowProcessorProtocol to protocol buffer service
Gary Helmling created HBASE-6787: Summary: Convert RowProcessorProtocol to protocol buffer service Key: HBASE-6787 URL: https://issues.apache.org/jira/browse/HBASE-6787 Project: HBase Issue Type: Sub-task Components: coprocessors Reporter: Gary Helmling Fix For: 0.96.0 With coprocessor endpoints now exposed as protobuf defined services, we should convert over all of our built-in endpoints to PB services.
[jira] [Created] (HBASE-6788) Convert AuthenticationProtocol to protocol buffer service
Gary Helmling created HBASE-6788: Summary: Convert AuthenticationProtocol to protocol buffer service Key: HBASE-6788 URL: https://issues.apache.org/jira/browse/HBASE-6788 Project: HBase Issue Type: Sub-task Components: coprocessors Reporter: Gary Helmling Fix For: 0.96.0 With coprocessor endpoints now exposed as protobuf defined services, we should convert over all of our built-in endpoints to PB services. AccessControllerProtocol was converted as part of HBASE-5448, but the authentication token provider still needs to be changed.
[jira] [Created] (HBASE-6789) Convert test CoprocessorProtocol implementations to protocol buffer services
Gary Helmling created HBASE-6789:

Summary: Convert test CoprocessorProtocol implementations to protocol buffer services
Key: HBASE-6789
URL: https://issues.apache.org/jira/browse/HBASE-6789
Project: HBase
Issue Type: Sub-task
Components: coprocessors
Reporter: Gary Helmling
Fix For: 0.96.0

With coprocessor endpoints now exposed as protobuf defined services, we should convert over all of our built-in endpoints to PB services.

Several CoprocessorProtocol implementations are defined for tests:
* ColumnAggregationProtocol
* GenericProtocol
* TestServerCustomProtocol.PingProtocol

These should either be converted to PB services or removed if they duplicate other tests/are no longer necessary.
[jira] [Created] (HBASE-6790) Expose protocol buffer based coprocessor services through REST server
Gary Helmling created HBASE-6790: Summary: Expose protocol buffer based coprocessor services through REST server Key: HBASE-6790 URL: https://issues.apache.org/jira/browse/HBASE-6790 Project: HBase Issue Type: Improvement Components: coprocessors, rest Reporter: Gary Helmling With coprocessor endpoints switching over to protocol buffer defined services, it should be a lot easier to support invoking endpoint methods over the REST server, using the REST support for PB serialization.
[jira] [Commented] (HBASE-6776) Opened region of disabled/enabling table is not added to online region list
[ https://issues.apache.org/jira/browse/HBASE-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455965#comment-13455965 ]

ramkrishna.s.vasudevan commented on HBASE-6776:

@Jimmy
If I read the patch correctly, we just add all the regions to the online region map.
Suppose a table has 4 regions: 1 is assigned, 2 are in the process of being assigned and 1 is yet to be assigned, so the table is in ENABLING.
If all the regions are added to the online map, how will EnableTableHandler know that it has to assign the ones that are not yet assigned?
Actually HBASE-6317 tries to solve the problem that you mentioned, double assignment.
Please correct me if I am wrong, Jimmy. Maybe I am missing something in trunk after the refactorings?
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455969#comment-13455969 ]

Ted Yu commented on HBASE-6438:

@Rajesh:
Latest patch looks good.
nit: else keyword is not needed below:
{code}
+return -1;
+ } else {
{code}
Please produce patch for trunk and let Hadoop QA run the tests.
[jira] [Commented] (HBASE-6776) Opened region of disabled/enabling table is not added to online region list
[ https://issues.apache.org/jira/browse/HBASE-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455973#comment-13455973 ]

Jimmy Xiang commented on HBASE-6776:

That's a good question. In rebuilding the user regions, a region is added to the online region map only if:
1. there is server info in META for the region, and
2. that server is online.

Here is the assumption:
For regions that are not assigned yet or are still being assigned, the META entry should not have the server info (assuming there are no region-in-transition znodes; if there are, processing the regions in transition will correct the assigning region's state).

For tables not in ENABLING, this is our assumption too.

So the question is whether the assumption is correct, right? I think it is correct.
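The rule described in the comment above (a region goes into the online map only when META records a server for it and that server is online) could be sketched as follows. This is a simplified illustration, not the patch: the class and field names are hypothetical, region and server names are plain strings, and everything that fails the two conditions is simply collected for assignment, which glosses over how a real master treats regions whose recorded server is dead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RebuildUserRegionsSketch {
    // region name -> hosting server, for regions considered online
    final Map<String, String> onlineRegions = new LinkedHashMap<>();
    // regions that still need to be assigned
    final List<String> regionsToAssign = new ArrayList<>();

    // metaRows: region name -> server recorded in META,
    // or null when the META entry has no server info.
    void rebuild(Map<String, String> metaRows, Set<String> onlineServers) {
        for (Map.Entry<String, String> row : metaRows.entrySet()) {
            String server = row.getValue();
            if (server != null && onlineServers.contains(server)) {
                // Condition 1 (server info present) and 2 (server online) both hold.
                onlineRegions.put(row.getKey(), server);
            } else {
                regionsToAssign.add(row.getKey());
            }
        }
    }
}
```

With the 4-region example from the earlier question, only the region whose META row already names a live server would land in onlineRegions, provided the assumption holds that unassigned and still-assigning regions have no server info in META.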
[jira] [Updated] (HBASE-5547) Don't delete HFiles when in backup mode
[ https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Yates updated HBASE-5547:

Attachment: hbase-5547-0.94-backport-v0.patch

Attaching patch for the 0.94 backport. I dropped support for the ZooKeeperTableArchive client given that it (1) is just an example, (2) has a notoriously flaky test in trunk, and (3) can be found in trunk, as an example, by interested parties.

Don't delete HFiles when in backup mode

Key: HBASE-5547
URL: https://issues.apache.org/jira/browse/HBASE-5547
Project: HBase
Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Jesse Yates
Fix For: 0.96.0, 0.94.3
Attachments: 5547.addendum-v3, 5547-addendum-v4.txt, 5547-v12.txt, 5547-v16.txt, hbase-5447-v8.patch, hbase-5447-v8.patch, hbase-5547-0.94-backport-v0.patch, hbase-5547-v9.patch, java_HBASE-5547.addendum, java_HBASE-5547.addendum-v1, java_HBASE-5547.addendum-v2, java_HBASE-5547_v13.patch, java_HBASE-5547_v14.patch, java_HBASE-5547_v15.patch, java_HBASE-5547_v4.patch, java_HBASE-5547_v5.patch, java_HBASE-5547_v6.patch, java_HBASE-5547_v7.patch

This came up in a discussion I had with Stack.
It would be nice if HBase could be notified that a backup is in progress (via a znode for example) and in that case either:
1. rename HFiles to be deleted to file.bck
2. rename the HFiles into a special directory
3. rename them to a general trash directory (which would not need to be tied to backup mode).

That way it should be possible to get a consistent backup based on HFiles (HDFS snapshots or hard links would be better options here, but we do not have those).
#1 makes cleanup a bit harder.
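Option 2 from the HBASE-5547 description (move the HFile into a special directory instead of deleting it while a backup is in progress) can be illustrated with a minimal sketch. It makes several assumptions that are not in the issue: it uses java.nio.file on a local filesystem rather than HDFS, the backup flag is a plain boolean rather than a znode watch, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class HFileCleanerSketch {
    private final Path archiveDir;              // hypothetical "special directory"
    private volatile boolean backupInProgress;  // would be driven by a znode in the proposal

    HFileCleanerSketch(Path archiveDir) {
        this.archiveDir = archiveDir;
    }

    void setBackupInProgress(boolean inProgress) {
        this.backupInProgress = inProgress;
    }

    // Disposes of an HFile that is no longer needed.
    // Returns the archived location, or null if the file was really deleted.
    Path dispose(Path hfile) throws IOException {
        if (!backupInProgress) {
            Files.delete(hfile);
            return null;
        }
        // Backup mode: keep the bytes around by renaming into the archive directory.
        Files.createDirectories(archiveDir);
        Path target = archiveDir.resolve(hfile.getFileName());
        return Files.move(hfile, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Keeping the archive in its own directory is what makes cleanup simpler than option 1: once the backup finishes, the whole directory can be removed without scanning for renamed files.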
[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode
[ https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455986#comment-13455986 ]

Jesse Yates commented on HBASE-5547:
------------------------------------

Also, I ran the added tests and they passed locally. However, it did take a slight tweak to the pom since jettison couldn't be found: I had to add it to the pom to get things to compile, though I didn't change any of those things.
[jira] [Commented] (HBASE-6260) balancer state should be stored in ZK
[ https://issues.apache.org/jira/browse/HBASE-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455987#comment-13455987 ]

Hadoop QA commented on HBASE-6260:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12545168/6260-addendum-3.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The patch appears to cause mvn compile goal to fail.
-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2870//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2870//console

This message is automatically generated.

balancer state should be stored in ZK
-------------------------------------

Key: HBASE-6260
URL: https://issues.apache.org/jira/browse/HBASE-6260
Project: HBase
Issue Type: Task
Components: master, zookeeper
Affects Versions: 0.96.0
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Blocker
Attachments: 6260-addendum-3.txt, HBASE-6260-addendum2.patch, HBASE-6260-addendum2.patch, HBASE-6260-addendum.patch, HBASE-6260.patch, HBASE-6260-v2.patch

See: https://issues.apache.org/jira/browse/HBASE-5953?focusedCommentId=13270200&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13270200
And: https://issues.apache.org/jira/browse/HBASE-5630?focusedCommentId=13399225&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13399225

In short, we need to move the balancer state to ZK so that it won't have to be restarted if the master dies.
[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode
[ https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455990#comment-13455990 ]

Hadoop QA commented on HBASE-5547:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12545182/hbase-5547-0.94-backport-v0.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 20 new or modified tests.
-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2872//console

This message is automatically generated.
[jira] [Commented] (HBASE-6504) Adding GC details prevents HBase from starting in non-distributed mode
[ https://issues.apache.org/jira/browse/HBASE-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455993#comment-13455993 ]

Michael Drzal commented on HBASE-6504:
--------------------------------------

[~tsuna] does that work for you?

Adding GC details prevents HBase from starting in non-distributed mode
----------------------------------------------------------------------

Key: HBASE-6504
URL: https://issues.apache.org/jira/browse/HBASE-6504
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Benoit Sigoure
Assignee: Michael Drzal
Priority: Trivial
Labels: noob
Attachments: HBASE-6504-output.txt, HBASE-6504.patch

The {{conf/hbase-env.sh}} that ships with HBase contains a few commented out examples of variables that could be useful, such as adding {{-XX:+PrintGCDetails -XX:+PrintGCDateStamps}} to {{HBASE_OPTS}}. This has the annoying side effect that the JVM prints a summary of memory usage when it exits, and it does so on stdout:

{code}
$ ./bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool hbase.cluster.distributed
false
Heap
 par new generation total 19136K, used 4908K [0x00073a20, 0x00073b6c, 0x00075186)
  eden space 17024K, 28% used [0x00073a20, 0x00073a6cb0a8, 0x00073b2a)
  from space 2112K, 0% used [0x00073b2a, 0x00073b2a, 0x00073b4b)
  to space 2112K, 0% used [0x00073b4b, 0x00073b4b, 0x00073b6c)
 concurrent mark-sweep generation total 63872K, used 0K [0x00075186, 0x0007556c, 0x0007f5a0)
 concurrent-mark-sweep perm gen total 21248K, used 6994K [0x0007f5a0, 0x0007f6ec, 0x0008)
$ ./bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool hbase.cluster.distributed >/dev/null
(nothing printed)
{code}

And this confuses {{bin/start-hbase.sh}} when it does {{distMode=`$bin/hbase --config $HBASE_CONF_DIR org.apache.hadoop.hbase.util.HBaseConfTool hbase.cluster.distributed`}}, because then the {{distMode}} variable is not just set to {{false}}, it also contains all this JVM spam.

If you don't pay enough attention to notice that 3 processes are getting started (ZK, HM, RS) instead of just one (HM), then you end up with this confusing error message: {{Could not start ZK at requested port of 2181. ZK was started at port: 2182. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.}}, which is even more puzzling because when you run {{netstat}} to see who owns that port, you won't find any rogue process other than the one you just started.

I'm wondering if the fix is not to just change the {{if [ $distMode == 'false' ]}} to a {{switch $distMode case (false*)}} type of test, to work around this annoying JVM misfeature that pollutes stdout.
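The prefix test suggested above could be sketched roughly as follows. This is only an illustration, not the attached patch: the helper name {{is_standalone}} is hypothetical, and the {{distMode}} value is a hard-coded stand-in for what {{HBaseConfTool}} prints when GC details are enabled.

```shell
#!/bin/sh
# Hypothetical helper (not taken from bin/start-hbase.sh): succeeds when the
# captured tool output starts with "false", whatever follows it.
is_standalone() {
  case "$1" in
    false*) return 0 ;;   # "false", optionally followed by the JVM heap summary
    *)      return 1 ;;
  esac
}

# Simulated capture: the real value followed by the start of the GC summary.
distMode='false
Heap
 par new generation total 19136K, used 4908K'

if is_standalone "$distMode"; then
  echo "standalone: start only the master"
else
  echo "distributed: start ZK, master and region servers"
fi
```

With the simulated value this takes the standalone branch, whereas an exact string comparison against {{false}} would not match.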
[jira] [Created] (HBASE-6791) Create a MetricsRate
Elliott Clark created HBASE-6791: Summary: Create a MetricsRate Key: HBASE-6791 URL: https://issues.apache.org/jira/browse/HBASE-6791 Project: HBase Issue Type: Sub-task Components: metrics Affects Versions: 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark MetricsRate for the hadoop metrics1 system needs to be ported to metrics2. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3834) Store ignores checksum errors when opening files
[ https://issues.apache.org/jira/browse/HBASE-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456000#comment-13456000 ]

Todd Lipcon commented on HBASE-3834:
------------------------------------

Great to see this is fixed in 0.94. Does someone have time to try this on the latest 0.90.x, which is still in production in a lot of places?

Store ignores checksum errors when opening files
------------------------------------------------

Key: HBASE-3834
URL: https://issues.apache.org/jira/browse/HBASE-3834
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.2
Reporter: Todd Lipcon
Assignee: liang xie
Priority: Critical
Fix For: 0.90.8
Attachments: hbase-3834.tar.gz2

If you corrupt one of the storefiles in a region (e.g. using vim to muck up some bytes), the region will still open, but that storefile will just be ignored with a log message. We should probably not do this in general: it is better to keep that region unassigned and force an admin to make a decision to remove the bad storefile.
[jira] [Commented] (HBASE-6783) Make read short circuit the default
[ https://issues.apache.org/jira/browse/HBASE-6783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456003#comment-13456003 ]

Todd Lipcon commented on HBASE-6783:
------------------------------------

To be clear, this is only for the tests, right? We can't make it the default in production because it requires server-side changes.

Make read short circuit the default
-----------------------------------

Key: HBASE-6783
URL: https://issues.apache.org/jira/browse/HBASE-6783
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Fix For: 0.96.0
Attachments: HBASE-6783.v1.patch

Per the mailing list discussion, read short circuit has little or no drawback, hence it should be used by default. As a consequence, we activate it in the default tests. It's possible to launch the tests with -Ddfs.client.read.shortcircuit=false to execute them without the short circuit; this will be used for some builds on trunk.
[jira] [Commented] (HBASE-6783) Make read short circuit the default
[ https://issues.apache.org/jira/browse/HBASE-6783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456008#comment-13456008 ]

Todd Lipcon commented on HBASE-6783:
------------------------------------

+ String readOnConf = conf.get("dfs.client.read.shortcircuit");
+ return (readOnConf == null ? true : Boolean.parseBoolean(readOnConf));

can use conf.getBoolean()

The config/property name should also be clear that it's a setting for tests, e.g. hbase.tests.use.shortcircuit.reads

+ private void readShortCircuit(){
+   if (isReadShortCircuitOn()){
+     String curUser = System.getProperty("user.name");
+     LOG.info("read short circuit is ON for user "+curUser);

style: space before {s, space after '+'
rename to enableReadShortCircuit()

+   if (util.isReadShortCircuitOn()){
+     LOG.info("dfs.client.read.shortcircuit is on, " +
+       "testFullSystemBubblesFSErrors is not executed");
+     return;
+   }

Can use junit Assume here

- there's a spurious whitespace change
[jira] [Commented] (HBASE-5448) Support for dynamic coprocessor endpoints with PB-based RPC
[ https://issues.apache.org/jira/browse/HBASE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456016#comment-13456016 ]

Hadoop QA commented on HBASE-5448:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12545173/HBASE-5448_3.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 16 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The patch appears to cause mvn compile goal to fail.
-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2871//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2871//console

This message is automatically generated.

Support for dynamic coprocessor endpoints with PB-based RPC
-----------------------------------------------------------

Key: HBASE-5448
URL: https://issues.apache.org/jira/browse/HBASE-5448
Project: HBase
Issue Type: Sub-task
Components: ipc, master, migration, regionserver
Reporter: Todd Lipcon
Assignee: Gary Helmling
Fix For: 0.96.0
Attachments: HBASE-5448_2.patch, HBASE-5448_3.patch, HBASE-5448.patch
[jira] [Updated] (HBASE-6409) Create histogram class for metrics 2
[ https://issues.apache.org/jira/browse/HBASE-6409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-6409: - Attachment: HBASE-6409-5.patch Rebase Create histogram class for metrics 2 Key: HBASE-6409 URL: https://issues.apache.org/jira/browse/HBASE-6409 Project: HBase Issue Type: Sub-task Components: metrics Affects Versions: 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Blocker Attachments: HBASE-6409-0.patch, HBASE-6409-1.patch, HBASE-6409-2.patch, HBASE-6409-3.patch, HBASE-6409-4.patch, HBASE-6409-5.patch Create the replacement for MetricsHistogram and PersistantTimeVaryingRate classes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5448) Support for dynamic coprocessor endpoints with PB-based RPC
[ https://issues.apache.org/jira/browse/HBASE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456025#comment-13456025 ]

stack commented on HBASE-5448:
------------------------------

[~ghelmling] Does TestReplication pass for you locally, Gary? I doubt your patch is the problem.

I took a quick look at the patch. The package doc is great. +1 on commit.
[jira] [Commented] (HBASE-6409) Create histogram class for metrics 2
[ https://issues.apache.org/jira/browse/HBASE-6409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456060#comment-13456060 ]

Hadoop QA commented on HBASE-6409:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12545189/HBASE-6409-5.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The patch appears to cause mvn compile goal to fail.
-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
   org.apache.hadoop.hbase.coprocessor.TestMasterObserver

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2873//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2873//console

This message is automatically generated.
[jira] [Updated] (HBASE-5673) The OOM problem of IPC client call cause all handle block
[ https://issues.apache.org/jira/browse/HBASE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anonymous updated HBASE-5673:
----------------------------
Tags: 1
Affects Version/s: (was: 0.90.6)
Release Note: 1
Hadoop Flags: Incompatible change (was: Reviewed)
Status: Patch Available (was: Reopened)

The OOM problem of IPC client call cause all handle block
---------------------------------------------------------

Key: HBASE-5673
URL: https://issues.apache.org/jira/browse/HBASE-5673
Project: HBase
Issue Type: Bug
Environment: 0.90.6
Reporter: xufeng
Assignee: xufeng
Labels:
Fix For: 0.92.3
Attachments: HBASE-5673-90.patch, HBASE-5673-90-V2.patch

If HBaseClient hits an "unable to create new native thread" exception, the call will never complete because it is lost in the calls queue.
[jira] [Commented] (HBASE-5673) The OOM problem of IPC client call cause all handle block
[ https://issues.apache.org/jira/browse/HBASE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456072#comment-13456072 ]

Hadoop QA commented on HBASE-5673:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12520564/HBASE-5673-90-V2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2874//console

This message is automatically generated.
[jira] [Resolved] (HBASE-6781) .archive directory should be added to HConstants.HBASE_NON_USER_TABLE_DIRS
[ https://issues.apache.org/jira/browse/HBASE-6781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Yates resolved HBASE-6781.
-------------------------------
Resolution: Duplicate

Marking this as a duplicate of HBASE-6439. It's a leftover piece from HBASE-5547; Sameer has started working on it.

.archive directory should be added to HConstants.HBASE_NON_USER_TABLE_DIRS
--------------------------------------------------------------------------

Key: HBASE-6781
URL: https://issues.apache.org/jira/browse/HBASE-6781
Project: HBase
Issue Type: Bug
Reporter: Ted Yu
Fix For: 0.96.0

We can see the following in test output:

{code}
2012-09-14 00:50:43,500 DEBUG [IPC Server handler 0 on 51461] util.FSTableDescriptors(175): Exception during readTableDecriptor. Current table name = .archive
org.apache.hadoop.hbase.TableInfoMissingException: No .tableinfo file under hdfs://localhost:35107/user/jenkins/hbase/.archive
  at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptor(FSTableDescriptors.java:417)
  at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptor(FSTableDescriptors.java:408)
  at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:170)
  at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:201)
  at org.apache.hadoop.hbase.master.HMaster.getTableDescriptors(HMaster.java:2199)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.hbase.ipc.ProtobufRpcEngine$Server.call(ProtobufRpcEngine.java:357)
  at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1816)
{code}

.archive directory should be added to HConstants.HBASE_NON_USER_TABLE_DIRS
[jira] [Commented] (HBASE-6504) Adding GC details prevents HBase from starting in non-distributed mode
[ https://issues.apache.org/jira/browse/HBASE-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456124#comment-13456124 ]

Benoit Sigoure commented on HBASE-6504:
---------------------------------------

Yeah. One minor nit though: the form {{head -1}} is deprecated (and has been for years). Better to use {{head -n 1}}.
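For illustration only, the {{head -n 1}} form mentioned in the comment above could be used along these lines. The attached patch is not reproduced in this thread, so this is a sketch under assumptions: {{fake_conf_tool}} is a hypothetical stand-in for the real {{HBaseConfTool}} invocation, printing the value followed by part of the GC summary.

```shell
#!/bin/sh
# Hypothetical stand-in for:
#   $bin/hbase --config $HBASE_CONF_DIR org.apache.hadoop.hbase.util.HBaseConfTool hbase.cluster.distributed
fake_conf_tool() {
  printf '%s\n' 'false' 'Heap' ' par new generation total 19136K, used 4908K'
}

# Keep only the first line of output, using the non-deprecated option syntax.
distMode=$(fake_conf_tool | head -n 1)

if [ "$distMode" = "false" ]; then
  echo "standalone"
else
  echo "distributed"
fi
```

This relies on the configuration value being the first line the tool prints, which matches the output shown in the issue description but is otherwise an assumption.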
[jira] [Commented] (HBASE-6710) 0.92/0.94 compatibility issues due to HBASE-5206
[ https://issues.apache.org/jira/browse/HBASE-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456128#comment-13456128 ]

Gregory Chanan commented on HBASE-6710:
---------------------------------------

bq. How does the user verify whether the above commands returned inaccurate results?

Operations on the table will behave as if the table is not actually in the state returned by is_enabled or is_disabled. For example, is_disabled may return true, but a subsequent call to enabled may throw a TableNotDisabledException.

0.92/0.94 compatibility issues due to HBASE-5206
------------------------------------------------

Key: HBASE-6710
URL: https://issues.apache.org/jira/browse/HBASE-6710
Project: HBase
Issue Type: Bug
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
Fix For: 0.94.2
Attachments: HBASE-6710-v3.patch

HBASE-5206 introduces some compatibility issues between {0.94,0.94.1} and {0.92.0,0.92.1}. The release notes of HBASE-5155 describe the issue (HBASE-5206 is a backport of HBASE-5155).

I think we can make 0.94.2 compatible with both {0.94.0,0.94.1} and {0.92.0,0.92.1}, although one of those sets will require configuration changes.

The basic problem is that there is a znode for each table, zookeeper.znode.tableEnableDisable, that is handled differently.

On 0.92.0 and 0.92.1 the states for this table are:
[ disabled, disabling, enabling ] or deleted if the table is enabled

On 0.94.1 and 0.94.2 the states for this table are:
[ disabled, disabling, enabling, enabled ]

What saves us is that the location of this znode is configurable. So the basic idea is to have the 0.94.2 master write two different znodes, zookeeper.znode.tableEnableDisabled92 and zookeeper.znode.tableEnableDisabled94, where the 92 node is in 92 format and the 94 node is in 94 format. Internally, the master would only use the 94 format in order to solve the original bug that HBASE-5155 solves.

We can of course make one of these the same default as exists now, so we don't need to make config changes for one of 0.92 or 0.94 clients. I argue that 0.92 clients shouldn't have to make config changes for the same reason I argued above. But that is debatable.

Then, I think the only question left is how to bring along the {0.94.0, 0.94.1} crew. A {0.94.0, 0.94.1} client would work against a 0.94.2 cluster by just configuring zookeeper.znode.tableEnableDisable in the client to be whatever zookeeper.znode.tableEnableDisabled94 is in the cluster. A 0.94.2 client would work against both a {0.94.0, 0.94.1} and a {0.92.0, 0.92.1} cluster if it had HBASE-6268 applied. About rolling upgrade from {0.94.0, 0.94.1} to 0.94.2 -- I'd have to think about that. Do the regionservers ever read the tableEnableDisabled znode?

On the mailing list, Lars H suggested the following:
The only input I'd have is that the format we'll use going forward will not have a version attached to it. So maybe the 92 version would still be called zookeeper.znode.tableEnableDisable and the new node could have a different name, zookeeper.znode.tableEnableDisableNew (or something).
[jira] [Commented] (HBASE-6710) 0.92/0.94 compatibility issues due to HBASE-5206
[ https://issues.apache.org/jira/browse/HBASE-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456129#comment-13456129 ]

Gregory Chanan commented on HBASE-6710:
---------------------------------------

Above comment should read "enable", not "enabled".
[jira] [Commented] (HBASE-6791) Create a MetricsRate
[ https://issues.apache.org/jira/browse/HBASE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456135#comment-13456135 ]
Elliott Clark commented on HBASE-6791: -- On looking at it more, the MetricStats provided by metrics2 give enough flexibility that we don't need our own rates.
Create a MetricsRate
Key: HBASE-6791 URL: https://issues.apache.org/jira/browse/HBASE-6791 Project: HBase Issue Type: Sub-task Components: metrics Affects Versions: 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark
MetricsRate for the hadoop metrics1 system needs to be ported to metrics2.
[jira] [Updated] (HBASE-6504) Adding GC details prevents HBase from starting in non-distributed mode
[ https://issues.apache.org/jira/browse/HBASE-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Drzal updated HBASE-6504: - Attachment: HBASE-6504-v2.patch
Adding GC details prevents HBase from starting in non-distributed mode
Key: HBASE-6504 URL: https://issues.apache.org/jira/browse/HBASE-6504 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Benoit Sigoure Assignee: Michael Drzal Priority: Trivial Labels: noob Attachments: HBASE-6504-output.txt, HBASE-6504.patch, HBASE-6504-v2.patch
The {{conf/hbase-env.sh}} that ships with HBase contains a few commented out examples of variables that could be useful, such as adding {{-XX:+PrintGCDetails -XX:+PrintGCDateStamps}} to {{HBASE_OPTS}}. This has the annoying side effect that the JVM prints a summary of memory usage when it exits, and it does so on stdout:
{code}
$ ./bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool hbase.cluster.distributed
false
Heap
 par new generation total 19136K, used 4908K [0x00073a20, 0x00073b6c, 0x00075186)
  eden space 17024K, 28% used [0x00073a20, 0x00073a6cb0a8, 0x00073b2a)
  from space 2112K, 0% used [0x00073b2a, 0x00073b2a, 0x00073b4b)
  to space 2112K, 0% used [0x00073b4b, 0x00073b4b, 0x00073b6c)
 concurrent mark-sweep generation total 63872K, used 0K [0x00075186, 0x0007556c, 0x0007f5a0)
 concurrent-mark-sweep perm gen total 21248K, used 6994K [0x0007f5a0, 0x0007f6ec, 0x0008)
$ ./bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool hbase.cluster.distributed >/dev/null
(nothing printed)
{code}
And this confuses {{bin/start-hbase.sh}} when it does {{distMode=`$bin/hbase --config $HBASE_CONF_DIR org.apache.hadoop.hbase.util.HBaseConfTool hbase.cluster.distributed`}}, because then the {{distMode}} variable is not just set to {{false}}, it also contains all this JVM spam.
If you don't pay enough attention to realize that 3 processes are getting started (ZK, HM, RS) instead of just one (HM), then you end up with this confusing error message: {{Could not start ZK at requested port of 2181. ZK was started at port: 2182. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.}}, which is even more puzzling because when you run {{netstat}} to see who owns that port, you won't find any rogue process other than the one you just started.
I'm wondering if the fix is not to just change the {{if [ $distMode == 'false' ]}} to a {{switch $distMode case (false*)}} type of test, to work around this annoying JVM misfeature that pollutes stdout.
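The prefix-match workaround suggested in the HBASE-6504 description can be sketched in POSIX shell roughly as follows. This is a minimal illustration, not the attached patch: {{conf_tool_output}} is a hypothetical stand-in for the real {{HBaseConfTool}} invocation, and the {{mode}} variable is introduced here only for demonstration.

```shell
#!/bin/sh
# Hypothetical stand-in for:
#   $bin/hbase --config $HBASE_CONF_DIR org.apache.hadoop.hbase.util.HBaseConfTool hbase.cluster.distributed
# when GC logging is enabled: the real value comes first, JVM heap summary follows.
conf_tool_output() {
  printf 'false\nHeap\n par new generation total 19136K, used 4908K\n'
}

distMode=$(conf_tool_output)

# Match on the "false" prefix instead of testing for exact equality,
# so trailing JVM output no longer flips the decision.
case "$distMode" in
  false*) mode="standalone" ;;
  *)      mode="distributed" ;;
esac

echo "$mode"
```

One caveat with this approach is that {{false*}} also accepts any other value that merely begins with "false", so it is looser than an exact comparison.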
[jira] [Commented] (HBASE-6783) Make read short circuit the default
[ https://issues.apache.org/jira/browse/HBASE-6783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456138#comment-13456138 ]
nkeywal commented on HBASE-6783: @todd Yes, tests only. For production, we can only document this as 'should be set except if you have a reason for not doing it'. Thanks for the review, I'm ok with the comments.
Make read short circuit the default
Key: HBASE-6783 URL: https://issues.apache.org/jira/browse/HBASE-6783 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Fix For: 0.96.0 Attachments: HBASE-6783.v1.patch
Per mailing discussion, read short circuit has little or no drawback, hence should be used by default. As a consequence, we activate it on the default tests. It's possible to launch the tests with -Ddfs.client.read.shortcircuit=false to execute them without the short circuit; this will be used for some builds on trunk.
[jira] [Resolved] (HBASE-6791) Create a MetricsRate
[ https://issues.apache.org/jira/browse/HBASE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Elliott Clark resolved HBASE-6791. -- Resolution: Invalid
[jira] [Updated] (HBASE-6504) Adding GC details prevents HBase from starting in non-distributed mode
[ https://issues.apache.org/jira/browse/HBASE-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Drzal updated HBASE-6504: - Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-6504) Adding GC details prevents HBase from starting in non-distributed mode
[ https://issues.apache.org/jira/browse/HBASE-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Drzal updated HBASE-6504: - Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-6504) Adding GC details prevents HBase from starting in non-distributed mode
[ https://issues.apache.org/jira/browse/HBASE-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456141#comment-13456141 ]
Michael Drzal commented on HBASE-6504: -- Changed it to head -n 1 since this is the second time this came up in the review, and I don't care either way.
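The {{head -n 1}} change mentioned in the HBASE-6504 comment above could look roughly like the sketch below. The actual patch is not shown in this thread, so this is an assumption about its shape: {{conf_tool_output}} is a hypothetical stand-in for the {{HBaseConfTool}} call, {{started}} is an illustrative variable, and the sketch assumes the configuration value is always the first line of output.

```shell
#!/bin/sh
# Hypothetical stand-in for the HBaseConfTool call with GC logging enabled.
conf_tool_output() {
  printf 'false\nHeap\n par new generation total 19136K, used 4908K\n'
}

# Keep only the first line, dropping the JVM heap summary printed at exit.
distMode=$(conf_tool_output | head -n 1)

# With the noise removed, an exact comparison is safe again.
if [ "$distMode" = "false" ]; then
  started="HM"
else
  started="ZK HM RS"
fi

echo "$started"
```

Compared with a prefix match, this keeps the exact-equality test but depends on the tool printing the value before anything else.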
[jira] [Commented] (HBASE-6710) 0.92/0.94 compatibility issues due to HBASE-5206
[ https://issues.apache.org/jira/browse/HBASE-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456142#comment-13456142 ]
Ted Yu commented on HBASE-6710: --- I don't expect users to issue the 'enable' command just to verify whether the table is really disabled :-)
[jira] [Commented] (HBASE-6710) 0.92/0.94 compatibility issues due to HBASE-5206
[ https://issues.apache.org/jira/browse/HBASE-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456146#comment-13456146 ]
Gregory Chanan commented on HBASE-6710: ---
bq. Should hbck be enhanced to detect / fix such inconsistencies?
It could, but I don't think it's worth the effort. You only see the issue on 0.92.0/0.92.1 clients, and hbck in those versions can't fix it anyway. So you'd have to go to a 0.94.2 client and run hbck, when you could just retry on the 0.92.0/0.92.1 client.
[jira] [Commented] (HBASE-5448) Support for dynamic coprocessor endpoints with PB-based RPC
[ https://issues.apache.org/jira/browse/HBASE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456147#comment-13456147 ]
Andrew Purtell commented on HBASE-5448: --- Big thanks for adding the rowcounter service-based coprocessor example, Gary. Looks good to me. I see the follow-up JIRAs also, makes sense. +1 for commit.
Support for dynamic coprocessor endpoints with PB-based RPC
Key: HBASE-5448 URL: https://issues.apache.org/jira/browse/HBASE-5448 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gary Helmling Fix For: 0.96.0 Attachments: HBASE-5448_2.patch, HBASE-5448_3.patch, HBASE-5448.patch
[jira] [Commented] (HBASE-6710) 0.92/0.94 compatibility issues due to HBASE-5206
[ https://issues.apache.org/jira/browse/HBASE-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456150#comment-13456150 ]
Gregory Chanan commented on HBASE-6710: ---
bq. I don't expect user to issue 'enable' command just to verify whether the table is really disabled
Right, but what they could do is call disable, which blocks until is_disabled returns true. is_disabled may return true incorrectly, and a subsequent call to enable may throw a TableNotDisabledException.
[jira] [Updated] (HBASE-6241) HBaseCluster interface for interacting with the cluster from system tests
[ https://issues.apache.org/jira/browse/HBASE-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Enis Soztutar updated HBASE-6241: - Attachment: HBASE-6241_v5.patch Attaching a patch which should fix the compilation issue. Thanks Stack for trying it out.
HBaseCluster interface for interacting with the cluster from system tests
Key: HBASE-6241 URL: https://issues.apache.org/jira/browse/HBASE-6241 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Enis Soztutar Attachments: HBASE-6241_v0.2.patch, HBASE-6241_v1.patch, HBASE-6241_v4.patch, HBASE-6241_v5.patch
We need to abstract away the cluster interactions for system tests running on actual clusters. MiniHBaseCluster and RealHBaseCluster should both implement this interface, and system tests should work with both. I'll split Devaraj's patch in HBASE-6053 for the initial version.
[jira] [Commented] (HBASE-6710) 0.92/0.94 compatibility issues due to HBASE-5206
[ https://issues.apache.org/jira/browse/HBASE-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456160#comment-13456160 ]
Gregory Chanan commented on HBASE-6710: ---
bq. Should we fail silently if node deletion fails? Would this increase the chance of inconsistency between znode92 and znode?
The name deleteNodeFailSilent is bad. It's only silent on a NoNodeException, which is fine because that's the state you want anyway.
[jira] [Commented] (HBASE-6710) 0.92/0.94 compatibility issues due to HBASE-5206
[ https://issues.apache.org/jira/browse/HBASE-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456170#comment-13456170 ]

Gregory Chanan commented on HBASE-6710:
---

Here's a second draft of the release note...

This issue introduces a compatibility mode on the HMaster for 0.92.0 and 0.92.1 clients that requires a configuration change on those clients to turn on. Without the compatibility mode, 0.92.0 and 0.92.1 clients will hang on calls to enableTable, and is_enabled will always return false, even for enabled tables.

To use the compatibility mode, 0.92.0 and 0.92.1 clients should make the following configuration change:

    <name>zookeeper.znode.tableEnableDisable</name>
    <value>table92</value>

In rare failure cases, even with the compatibility mode on, the client may report incorrect results for is_enabled and is_disabled. For example, is_enabled may return true even though the table is disabled (the correct value can be checked via the HMaster UI). This issue can be corrected by calling enable or disable to return the table to the desired state.

0.92/0.94 compatibility issues due to HBASE-5206
Key: HBASE-6710 URL: https://issues.apache.org/jira/browse/HBASE-6710 Project: HBase Issue Type: Bug Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Critical Fix For: 0.94.2 Attachments: HBASE-6710-v3.patch

HBASE-5206 introduces some compatibility issues between {0.94.0,0.94.1} and {0.92.0,0.92.1}. The release notes of HBASE-5155 describe the issue (HBASE-5206 is a backport of HBASE-5155). I think we can make 0.94.2 compatible with both {0.94.0,0.94.1} and {0.92.0,0.92.1}, although one of those sets will require configuration changes.

The basic problem is that there is a znode for each table, zookeeper.znode.tableEnableDisable, that is handled differently.

On 0.92.0 and 0.92.1 the states for this table are: [ disabled, disabling, enabling ] or deleted if the table is enabled

On 0.94.1 and 0.94.2 the states for this table are: [ disabled, disabling, enabling, enabled ]

What saves us is that the location of this znode is configurable. So the basic idea is to have the 0.94.2 master write two different znodes, zookeeper.znode.tableEnableDisabled92 and zookeeper.znode.tableEnableDisabled94, where the 92 node is in 92 format and the 94 node is in 94 format. And internally, the master would only use the 94 format in order to solve the original bug HBASE-5155 solves.

We can of course make one of these the same default as exists now, so we don't need to make config changes for one of 0.92 or 0.94 clients. I argue that 0.92 clients shouldn't have to make config changes for the same reason I argued above. But that is debatable.

Then, I think the only question left is the question of how to bring along the {0.94.0, 0.94.1} crew. A {0.94.0, 0.94.1} client would work against a 0.94.2 cluster by just configuring zookeeper.znode.tableEnableDisable in the client to be whatever zookeeper.znode.tableEnableDisabled94 is in the cluster. A 0.94.2 client would work against both a {0.94.0, 0.94.1} and {0.92.0, 0.92.1} cluster if it had HBASE-6268 applied.

About rolling upgrade from {0.94.0, 0.94.1} to 0.94.2 -- I'd have to think about that. Do the regionservers ever read the tableEnableDisabled znode?

On the mailing list, Lars H suggested the following: The only input I'd have is that the format we'll use going forward will not have a version attached to it. So maybe the 92 version would still be called zookeeper.znode.tableEnableDisable and the new node could have a different name, zookeeper.znode.tableEnableDisableNew (or something).
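The two znode formats in the description differ only in how an enabled table is represented: the 92 format deletes the znode, while the 94 format stores an explicit enabled state. A minimal sketch of the translation a dual-writing master might perform follows. All names here (`TableState`, `ZnodeFormatSketch`, `to92Format`, `from92Format`) are hypothetical and do not come from HBase or from the attached patch.

```java
import java.util.Optional;

/** Table states in the 94 znode format, as listed in the issue description. */
enum TableState { DISABLED, DISABLING, ENABLING, ENABLED }

/** Hypothetical helper for illustration only; not part of HBase. */
final class ZnodeFormatSketch {
    private ZnodeFormatSketch() {}

    /**
     * Translates a 94-format state into the 92 format.
     * An empty result means "no znode": the 92 format deletes the znode
     * when the table is enabled.
     */
    static Optional<TableState> to92Format(TableState state94) {
        return state94 == TableState.ENABLED ? Optional.empty() : Optional.of(state94);
    }

    /** Reads a 92-format znode: a missing znode means the table is enabled. */
    static TableState from92Format(Optional<TableState> znode92) {
        return znode92.orElse(TableState.ENABLED);
    }
}
```

Seen this way, a 0.92 client pointed at a 94-format znode would presumably find a znode where it expects none for an enabled table, which is consistent with the is_enabled behaviour the release note describes.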
[jira] [Commented] (HBASE-5448) Support for dynamic coprocessor endpoints with PB-based RPC
[ https://issues.apache.org/jira/browse/HBASE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456177#comment-13456177 ] Gary Helmling commented on HBASE-5448: -- Thanks for the reviews guys. [~saint@gmail.com] TestReplication is timing out for me locally both on current trunk and on my branch, so don't think it's anything specific to this change. [~yuzhih...@gmail.com] I'll fix up your additional comments on commit if that's okay with you. Have local changes for them already. Support for dynamic coprocessor endpoints with PB-based RPC --- Key: HBASE-5448 URL: https://issues.apache.org/jira/browse/HBASE-5448 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gary Helmling Fix For: 0.96.0 Attachments: HBASE-5448_2.patch, HBASE-5448_3.patch, HBASE-5448.patch
[jira] [Commented] (HBASE-6516) hbck cannot detect any IOException while .tableinfo file is missing
[ https://issues.apache.org/jira/browse/HBASE-6516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456179#comment-13456179 ] Elliott Clark commented on HBASE-6516: -- An InterfaceAudience slipped into 0.94 here. It breaks 0.94 for older versions of hadoop. hbck cannot detect any IOException while .tableinfo file is missing - Key: HBASE-6516 URL: https://issues.apache.org/jira/browse/HBASE-6516 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.0, 0.96.0 Reporter: Jie Huang Assignee: Jie Huang Fix For: 0.96.0, 0.92.3, 0.94.2 Attachments: hbase-6516-94-v5.patch, hbase-6516.patch, hbase-6516-v2.patch, hbase-6516-v3.patch, hbase-6516-v4.patch, hbase-6516-v5a.patch, hbase-6516-v5.patch HBaseFsck checks for missing .tableinfo files in the loadHdfsRegionInfos() function. However, no IOException will be caught when .tableinfo is missing, since FSTableDescriptors.getTableDescriptor doesn't throw any IOException.
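The description comes down to a try/catch that can never fire: if the lookup never throws, the caller's catch block is dead code and the missing file goes unreported. A minimal sketch of that pattern and one possible remedy follows. The names (`DescriptorLookupSketch`, `lookupQuietly`, `lookupStrict`, `findMissing`) are hypothetical, an in-memory map stands in for HDFS, and the assumption that the quiet lookup signals a missing file by returning null is mine rather than the issue's. This is not the attached patch.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a table descriptor store; not HBase code. */
final class DescriptorLookupSketch {
    // Stand-in for descriptors on disk: table name -> descriptor text.
    private final Map<String, String> descriptors = new HashMap<>();

    void put(String table, String descriptor) {
        descriptors.put(table, descriptor);
    }

    /** Quiet variant: a missing descriptor yields null, never an exception. */
    String lookupQuietly(String table) {
        return descriptors.get(table);
    }

    /** Strict variant: a missing descriptor is surfaced as an IOException. */
    String lookupStrict(String table) throws IOException {
        String descriptor = descriptors.get(table);
        if (descriptor == null) {
            throw new FileNotFoundException("no .tableinfo for table " + table);
        }
        return descriptor;
    }

    /** Collects the tables whose descriptor is missing, using the strict lookup. */
    List<String> findMissing(List<String> tables) {
        List<String> missing = new ArrayList<>();
        for (String table : tables) {
            try {
                lookupStrict(table);
            } catch (IOException e) {
                // Reachable only because lookupStrict actually throws.
                missing.add(table);
            }
        }
        return missing;
    }
}
```

A caller that wrapped `lookupQuietly` in the same try/catch would compile (given some other checked call in the block) but would never record a missing table, which is the symptom the issue reports.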
[jira] [Commented] (HBASE-6260) balancer state should be stored in ZK
[ https://issues.apache.org/jira/browse/HBASE-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456183#comment-13456183 ]

Gregory Chanan commented on HBASE-6260:
---

HadoopQA hung in a couple tests:
{noformat}
dev-support/findHangingTest.sh https://builds.apache.org/job/PreCommit-HBASE-Build/2870//console
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  159k    0  159k    0     0   563k      0 --:--:-- --:--:-- --:--:-- 1413k
Hanging test: Running org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
Hanging test: Running org.apache.hadoop.hbase.util.TestHBaseFsck
{noformat}
Ran the unit tests locally and they passed:
{noformat}
Running org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.809 sec
{noformat}
{noformat}
Running org.apache.hadoop.hbase.util.TestHBaseFsck
Tests run: 27, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 210.967 sec
{noformat}

balancer state should be stored in ZK
Key: HBASE-6260 URL: https://issues.apache.org/jira/browse/HBASE-6260 Project: HBase Issue Type: Task Components: master, zookeeper Affects Versions: 0.96.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Blocker Attachments: 6260-addendum-3.txt, HBASE-6260-addendum2.patch, HBASE-6260-addendum2.patch, HBASE-6260-addendum.patch, HBASE-6260.patch, HBASE-6260-v2.patch

See: https://issues.apache.org/jira/browse/HBASE-5953?focusedCommentId=13270200&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13270200
And: https://issues.apache.org/jira/browse/HBASE-5630?focusedCommentId=13399225&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13399225
In short, we need to move the balancer state to ZK so that it won't have to be restarted if the master dies.
[jira] [Updated] (HBASE-6571) Generic multi-thread/cross-process error handling framework
[ https://issues.apache.org/jira/browse/HBASE-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6571: -- Fix Version/s: hbase-6055 Generic multi-thread/cross-process error handling framework --- Key: HBASE-6571 URL: https://issues.apache.org/jira/browse/HBASE-6571 Project: HBase Issue Type: Sub-task Reporter: Jesse Yates Assignee: Jesse Yates Fix For: hbase-6055 Attachments: Distributed Error Monitoring.docx, java_HBASE-6571-v0.patch The idea for a generic inter-process error-handling framework came from working on HBASE-6055 (snapshots). Distributed snapshots require tight time constraints in taking a snapshot to minimize offline time in the face of errors. However, we often need to coordinate errors between processes, and the current Abortable framework is not sufficiently flexible to handle the multitude of situations that can occur when coordinating between all region servers, the master and zookeeper. Using this framework, error handling for snapshots was a simple matter, amounting to maybe 200 LOC. This seems to be a generally useful framework and can be used to easily add inter-process error handling in HBase. The most obvious immediate usage is as part of HBASE-5487 when coordinating multiple sub-tasks.
[jira] [Updated] (HBASE-6573) Distributed Three-Phase Commit framework.
[ https://issues.apache.org/jira/browse/HBASE-6573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6573: -- Fix Version/s: (was: 0.96.0) hbase-6055 Distributed Three-Phase Commit framework. - Key: HBASE-6573 URL: https://issues.apache.org/jira/browse/HBASE-6573 Project: HBase Issue Type: Sub-task Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Fix For: hbase-6055 Attachments: java_HBASE-6573-v0.patch For HBASE-6055 (snapshots), we do two-phase commit in several places. This is a generally useful paradigm for a distributed system.
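As an illustration of the two-phase commit paradigm the description mentions, here is a minimal in-process sketch: a coordinator first collects a vote from every participant, then either commits everywhere or rolls back the participants that had already prepared. The names (`Participant`, `TwoPhaseCommitSketch`, `run`) are hypothetical and unrelated to the attached patch, and timeouts, failure detection and cross-process messaging are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical participant contract; not the interface from the attached patch. */
interface Participant {
    /** Phase 1 vote: true means "ready to commit". */
    boolean prepare();

    /** Phase 2, success path. */
    void commit();

    /** Phase 2, failure path: undo whatever prepare() set up. */
    void abort();
}

final class TwoPhaseCommitSketch {
    private TwoPhaseCommitSketch() {}

    /**
     * Phase 1: ask each participant to prepare, stopping at the first "no" vote.
     * Phase 2: commit everyone if all voted yes; otherwise abort the participants
     * that had already prepared. Returns true if the operation committed.
     */
    static boolean run(List<? extends Participant> participants) {
        List<Participant> prepared = new ArrayList<>();
        for (Participant p : participants) {
            if (!p.prepare()) {
                for (Participant q : prepared) {
                    q.abort();
                }
                return false;
            }
            prepared.add(p);
        }
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }
}
```

In this sketch a participant that votes no is assumed to clean up after itself, so only the earlier, successfully prepared participants receive `abort()`.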
[jira] [Updated] (HBASE-6765) 'Take a snapshot' interface
[ https://issues.apache.org/jira/browse/HBASE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6765: -- Fix Version/s: (was: 0.96.0) hbase-6055 'Take a snapshot' interface --- Key: HBASE-6765 URL: https://issues.apache.org/jira/browse/HBASE-6765 Project: HBase Issue Type: Sub-task Components: client, master, snapshots Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Fix For: hbase-6055 Attachments: hbase-6765-v0.patch Add interfaces for taking a snapshot. This is in hopes of cutting down on the overhead involved in reviewing snapshots.
[jira] [Updated] (HBASE-6777) Snapshot Restore interface
[ https://issues.apache.org/jira/browse/HBASE-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6777: -- Fix Version/s: (was: 0.96.0) hbase-6055 Snapshot Restore interface -- Key: HBASE-6777 URL: https://issues.apache.org/jira/browse/HBASE-6777 Project: HBase Issue Type: Sub-task Components: client, master, snapshots Affects Versions: 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: hbase-6055 Add interfaces for restoring a snapshot
[jira] [Updated] (HBASE-6055) Snapshots in HBase 0.96
[ https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6055: -- Fix Version/s: hbase-6055 Snapshots in HBase 0.96 --- Key: HBASE-6055 URL: https://issues.apache.org/jira/browse/HBASE-6055 Project: HBase Issue Type: New Feature Components: client, master, regionserver, snapshots, zookeeper Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.96.0, hbase-6055 Attachments: Snapshots in HBase.docx Continuation of HBASE-50 for the current trunk. Since the implementation has drastically changed, opening as a new ticket.
[jira] [Commented] (HBASE-6516) hbck cannot detect any IOException while .tableinfo file is missing
[ https://issues.apache.org/jira/browse/HBASE-6516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456186#comment-13456186 ] Jonathan Hsieh commented on HBASE-6516: --- Since it's been about 2 weeks, I'll file a new issue and fix it there. Thanks for finding this, Elliott. hbck cannot detect any IOException while .tableinfo file is missing - Key: HBASE-6516 URL: https://issues.apache.org/jira/browse/HBASE-6516 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.0, 0.96.0 Reporter: Jie Huang Assignee: Jie Huang Fix For: 0.96.0, 0.92.3, 0.94.2 Attachments: hbase-6516-94-v5.patch, hbase-6516.patch, hbase-6516-v2.patch, hbase-6516-v3.patch, hbase-6516-v4.patch, hbase-6516-v5a.patch, hbase-6516-v5.patch HBaseFsck checks for missing .tableinfo files in the loadHdfsRegionInfos() function. However, no IOException will be caught when .tableinfo is missing, since FSTableDescriptors.getTableDescriptor doesn't throw any IOException.
[jira] [Created] (HBASE-6792) Remove interface audience annotations in 0.94/0.92 introduced by HBASE-6516
Jonathan Hsieh created HBASE-6792: - Summary: Remove interface audience annotations in 0.94/0.92 introduced by HBASE-6516 Key: HBASE-6792 URL: https://issues.apache.org/jira/browse/HBASE-6792 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh bq. An InterfaceAudience slipped into 0.94 here. It breaks 0.94 for older versions of hadoop.
[jira] [Updated] (HBASE-6792) Remove interface audience annotations in 0.94/0.92 introduced by HBASE-6516
[ https://issues.apache.org/jira/browse/HBASE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6792: -- Attachment: hbase-6792.patch Trivial patch. Thanks for finding this problem, Elliott. Remove interface audience annotations in 0.94/0.92 introduced by HBASE-6516 --- Key: HBASE-6792 URL: https://issues.apache.org/jira/browse/HBASE-6792 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh Attachments: hbase-6792.patch bq. An InterfaceAudience slipped into 0.94 here. It breaks 0.94 for older versions of hadoop.
[jira] [Commented] (HBASE-6504) Adding GC details prevents HBase from starting in non-distributed mode
[ https://issues.apache.org/jira/browse/HBASE-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456190#comment-13456190 ]

Hadoop QA commented on HBASE-6504:
--

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12545213/HBASE-6504-v2.patch against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The patch appears to cause mvn compile goal to fail.
-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2875//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2875//console

This message is automatically generated.

Adding GC details prevents HBase from starting in non-distributed mode
Key: HBASE-6504 URL: https://issues.apache.org/jira/browse/HBASE-6504 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Benoit Sigoure Assignee: Michael Drzal Priority: Trivial Labels: noob Attachments: HBASE-6504-output.txt, HBASE-6504.patch, HBASE-6504-v2.patch

The {{conf/hbase-env.sh}} that ships with HBase contains a few commented out examples of variables that could be useful, such as adding {{-XX:+PrintGCDetails -XX:+PrintGCDateStamps}} to {{HBASE_OPTS}}. This has the annoying side effect that the JVM prints a summary of memory usage when it exits, and it does so on stdout:
{code}
$ ./bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool hbase.cluster.distributed
false
Heap
 par new generation total 19136K, used 4908K [0x00073a20, 0x00073b6c, 0x00075186)
  eden space 17024K, 28% used [0x00073a20, 0x00073a6cb0a8, 0x00073b2a)
  from space 2112K, 0% used [0x00073b2a, 0x00073b2a, 0x00073b4b)
  to space 2112K, 0% used [0x00073b4b, 0x00073b4b, 0x00073b6c)
 concurrent mark-sweep generation total 63872K, used 0K [0x00075186, 0x0007556c, 0x0007f5a0)
 concurrent-mark-sweep perm gen total 21248K, used 6994K [0x0007f5a0, 0x0007f6ec, 0x0008)
$ ./bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool hbase.cluster.distributed >/dev/null
(nothing printed)
{code}
And this confuses {{bin/start-hbase.sh}} when it does {{distMode=`$bin/hbase --config $HBASE_CONF_DIR org.apache.hadoop.hbase.util.HBaseConfTool hbase.cluster.distributed`}}, because then the {{distMode}} variable is not just set to {{false}}, it also contains all this JVM spam.

If you don't pay enough attention to notice that 3 processes are getting started (ZK, HM, RS) instead of just one (HM), then you end up with this confusing error message: {{Could not start ZK at requested port of 2181. ZK was started at port: 2182. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.}}, which is even more puzzling because when you run {{netstat}} to see who owns that port, you won't find any rogue process other than the one you just started.

I'm wondering if the fix is not to just change the {{if [ $distMode == 'false' ]}} to a {{switch $distMode case (false*)}} type of test, to work around this annoying JVM misfeature that pollutes stdout.
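The difference between the current equality test and the proposed prefix-style test can be shown on the captured output itself. The sketch below is written in Java only to stay in the project's main language; the real change would live in {{bin/start-hbase.sh}}. The names (`DistModeSketch`, `isStandaloneExact`, `isStandalonePrefix`) are hypothetical.

```java
/** Hypothetical illustration of exact versus prefix matching on captured output. */
final class DistModeSketch {
    private DistModeSketch() {}

    /** Exact comparison, analogous to the script's current equality test. */
    static boolean isStandaloneExact(String captured) {
        return "false".equals(captured);
    }

    /** Prefix comparison, analogous to a case pattern such as false*. */
    static boolean isStandalonePrefix(String captured) {
        return captured.startsWith("false");
    }

    public static void main(String[] args) {
        // Captured stdout when GC details are enabled: the value, then the heap summary.
        String polluted = "false\nHeap\n par new generation total 19136K, used 4908K";
        System.out.println("exact match:  " + isStandaloneExact(polluted));
        System.out.println("prefix match: " + isStandalonePrefix(polluted));
    }
}
```

A prefix test tolerates the trailing heap summary, but it relies on the configuration value being printed before anything else on stdout; a stricter alternative would be to compare only the first line of the captured output.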
[jira] [Resolved] (HBASE-6792) Remove interface audience annotations in 0.94/0.92 introduced by HBASE-6516
[ https://issues.apache.org/jira/browse/HBASE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh resolved HBASE-6792. --- Resolution: Fixed Fix Version/s: 0.94.3 0.92.3 Assignee: Jonathan Hsieh Remove interface audience annotations in 0.94/0.92 introduced by HBASE-6516 --- Key: HBASE-6792 URL: https://issues.apache.org/jira/browse/HBASE-6792 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Fix For: 0.92.3, 0.94.3 Attachments: hbase-6792.patch bq. An InterfaceAudience slipped into 0.94 here. It breaks 0.94 for older versions of hadoop.
[jira] [Commented] (HBASE-6792) Remove interface audience annotations in 0.94/0.92 introduced by HBASE-6516
[ https://issues.apache.org/jira/browse/HBASE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456195#comment-13456195 ] Elliott Clark commented on HBASE-6792: -- Thanks for getting to this so fast. Remove interface audience annotations in 0.94/0.92 introduced by HBASE-6516 --- Key: HBASE-6792 URL: https://issues.apache.org/jira/browse/HBASE-6792 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Fix For: 0.92.3, 0.94.3 Attachments: hbase-6792.patch bq. An InterfaceAudience slipped into 0.94 here. It breaks 0.94 for older versions of hadoop.
[jira] [Commented] (HBASE-6260) balancer state should be stored in ZK
[ https://issues.apache.org/jira/browse/HBASE-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456197#comment-13456197 ] Ted Yu commented on HBASE-6260: --- Thanks for being careful. I think addendum v3 is ready to go. balancer state should be stored in ZK - Key: HBASE-6260 URL: https://issues.apache.org/jira/browse/HBASE-6260 Project: HBase Issue Type: Task Components: master, zookeeper Affects Versions: 0.96.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Blocker Attachments: 6260-addendum-3.txt, HBASE-6260-addendum2.patch, HBASE-6260-addendum2.patch, HBASE-6260-addendum.patch, HBASE-6260.patch, HBASE-6260-v2.patch See: https://issues.apache.org/jira/browse/HBASE-5953?focusedCommentId=13270200&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13270200 And: https://issues.apache.org/jira/browse/HBASE-5630?focusedCommentId=13399225&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13399225 In short, we need to move the balancer state to ZK so that it won't have to be restarted if the master dies.
[jira] [Commented] (HBASE-5448) Support for dynamic coprocessor endpoints with PB-based RPC
[ https://issues.apache.org/jira/browse/HBASE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456199#comment-13456199 ] Ted Yu commented on HBASE-5448: --- @Gary: Totally fine with integration. It would be nice to see the performance difference between your rowcounter service-based coprocessor example and the map/reduce job example. Support for dynamic coprocessor endpoints with PB-based RPC --- Key: HBASE-5448 URL: https://issues.apache.org/jira/browse/HBASE-5448 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gary Helmling Fix For: 0.96.0 Attachments: HBASE-5448_2.patch, HBASE-5448_3.patch, HBASE-5448.patch