[jira] [Created] (HBASE-6196) MR testcases does not run with hadoop - 2.0.
ramkrishna.s.vasudevan created HBASE-6196:
---------------------------------------------

Summary: MR testcases does not run with hadoop - 2.0.
Key: HBASE-6196
URL: https://issues.apache.org/jira/browse/HBASE-6196
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0, 0.96.0
Reporter: ramkrishna.s.vasudevan
Fix For: 0.96.0

The MR-related testcases are failing with hadoop-2.0. The resource manager scheduler is not getting spawned.
The following fix solves the problem. If you feel it is OK, I can submit it as a patch and commit.
{code}
String rmSchedulerAdress = mrClusterJobConf.get("yarn.resourcemanager.scheduler.address");
if (rmSchedulerAdress != null) {
  conf.set("yarn.resourcemanager.scheduler.address", rmSchedulerAdress);
}
{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
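The proposed fix copies one optional key from the mini MR cluster's job configuration into the test configuration. A minimal, self-contained sketch of that pattern follows. It uses plain java.util.Map objects as stand-ins for Hadoop's Configuration, and the class name, helper name and the address value are hypothetical illustrations, not part of the patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: Map stands in for Hadoop's Configuration (an assumption). */
final class SchedulerAddressCopy {
    static final String RM_SCHEDULER_ADDRESS = "yarn.resourcemanager.scheduler.address";

    /** Copies the key only if the source defines it; returns true if it was copied. */
    static boolean copyIfPresent(Map<String, String> source, Map<String, String> target, String key) {
        String value = source.get(key);
        if (value == null) {
            return false; // leave the target untouched, as the null check in the fix does
        }
        target.put(key, value);
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> mrClusterJobConf = new HashMap<>();
        mrClusterJobConf.put(RM_SCHEDULER_ADDRESS, "localhost:8030"); // hypothetical value
        Map<String, String> conf = new HashMap<>();
        copyIfPresent(mrClusterJobConf, conf, RM_SCHEDULER_ADDRESS);
        System.out.println(conf.get(RM_SCHEDULER_ADDRESS));
    }
}
```

The null check matters because a cluster that does not set the key should not push a null value into the test configuration.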
[jira] [Updated] (HBASE-6012) Handling RegionOpeningState for bulk assign since SSH using
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-6012:
--------------------------------
Attachment: HBASE-6012v6.patch

Handling RegionOpeningState for bulk assign since SSH using
-----------------------------------------------------------
Key: HBASE-6012
URL: https://issues.apache.org/jira/browse/HBASE-6012
Project: HBase
Issue Type: Bug
Affects Versions: 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
Fix For: 0.96.0
Attachments: HBASE-6012.patch, HBASE-6012v2.patch, HBASE-6012v3.patch, HBASE-6012v4.patch, HBASE-6012v5.patch, HBASE-6012v6.patch

Since HBASE-5914, we use bulk assign for SSH. But in the bulk assign case, if we get an ALREADY_OPENED result, there is no one to clear the znode created by bulk assign.
Another thing: when an RS is opening a list of regions and one region is already in transition, it throws RegionAlreadyInTransitionException and stops opening the other regions.
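The second problem in the description (one region already in transition aborts the whole list) can be illustrated with a small model. This is a sketch under stated assumptions, not HBase code: the class, the in-memory sets and the use of IllegalStateException in place of RegionAlreadyInTransitionException are all hypothetical, and how the actual patch reports the failed region is not shown in this message.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a region server's bulk-open loop. */
final class BulkOpenSketch {
    /** Per-region outcome, loosely modelled on the states named in the issue. */
    enum OpeningState { OPENED, ALREADY_OPENED, FAILED_OPENING }

    final Set<String> online = new HashSet<>();
    final Set<String> inTransition = new HashSet<>();

    /** Opens one region, throwing if it is already in transition. */
    OpeningState openRegion(String region) {
        if (inTransition.contains(region)) {
            throw new IllegalStateException(region + " is already in transition");
        }
        if (online.contains(region)) {
            return OpeningState.ALREADY_OPENED; // the caller still has a znode to clean up
        }
        online.add(region);
        return OpeningState.OPENED;
    }

    /** Opens every region in the list; one failure does not stop the rest. */
    List<OpeningState> openRegions(List<String> regions) {
        List<OpeningState> results = new ArrayList<>(regions.size());
        for (String region : regions) {
            try {
                results.add(openRegion(region));
            } catch (IllegalStateException alreadyInTransition) {
                results.add(OpeningState.FAILED_OPENING); // record it and keep going
            }
        }
        return results;
    }
}
```

Returning one state per region lets the caller react to ALREADY_OPENED individually, for example by clearing the znode that bulk assign created for that region.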
[jira] [Commented] (HBASE-6012) Handling RegionOpeningState for bulk assign since SSH using
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292711#comment-13292711 ]

chunhui shen commented on HBASE-6012:
-------------------------------------

{code}
// Can be a socket timeout, EOF, NoRouteToHost, etc
LOG.info("Unable to communicate with the region server in order" +
  " to assign regions", e);
- return false;
+ // Server may already get RPC
+ return true;
{code}
bq. What was the reasoning behind the above change ?
At first I thought that we can't be sure the regionserver did not receive the OPENREGION rpc, so I made this change to prevent a double assign.
Returning false here would also be acceptable, because the single assign path gives protection.

{code}
throw new RegionAlreadyInTransitionException("Received:" + currentAction +
  " for the region:" + region.getRegionNameAsString() +
  " ,which we are already trying to " +
  (openAction ? OPEN : CLOSE) + ".");
{code}
bq. Please add some sentence for the log above.
We have already added a message about it in RegionAlreadyInTransitionException, so I think we needn't add it again.

For the other comments, I have updated the patch in v6.
Thanks.
[jira] [Commented] (HBASE-6012) Handling RegionOpeningState for bulk assign since SSH using
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292715#comment-13292715 ]

chunhui shen commented on HBASE-6012:
-------------------------------------

Created a Review Request with the patch v6.
https://review.cloudera.org/r/2132/
Could you review again?
Thanks.
[jira] [Commented] (HBASE-6012) Handling RegionOpeningState for bulk assign since SSH using
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292718#comment-13292718 ]

chunhui shen commented on HBASE-6012:
-------------------------------------

Hmm... I find I can't upload a diff for the Review Request. Is it a server error?
[jira] [Commented] (HBASE-4064) Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long...
[ https://issues.apache.org/jira/browse/HBASE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292724#comment-13292724 ]

steven zhuang commented on HBASE-4064:
--------------------------------------

Is anyone still working on this issue? Please tell me how to fix this. Thanks!

Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long...
-----------------------------------------------------------------------------------------------------------------------
Key: HBASE-4064
URL: https://issues.apache.org/jira/browse/HBASE-4064
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.90.3
Reporter: Jieshan Bean
Fix For: 0.90.7
Attachments: HBASE-4064-v1.patch, HBASE-4064_branch90V2.patch, disableflow.png

1. If there is a rubbish RegionState object with PENDING_CLOSE in regionsInTransition (the RegionState was left behind by some exception and should have been removed, which is why I call it a rubbish object), but the region is not currently assigned anywhere, TimeoutMonitor will fall into an endless loop:

2011-06-27 10:32:21,326 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301
2011-06-27 10:32:21,326 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
2011-06-27 10:32:21,438 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining)
2011-06-27 10:32:21,441 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is not currently assigned anywhere
2011-06-27 10:32:31,207 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301
2011-06-27 10:32:31,207 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
2011-06-27 10:32:31,215 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining)
2011-06-27 10:32:31,215 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is not currently assigned anywhere
2011-06-27 10:32:41,164 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301
2011-06-27 10:32:41,164 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
2011-06-27 10:32:41,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining)
2011-06-27 10:32:41,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is not currently assigned anywhere

2. In the following scenario, two concurrent unassign calls on the same region may lead to the above problem:
The first unassign call sends its rpc successfully. The master watches the RS_ZK_REGION_CLOSED event and, while processing it, creates a ClosedRegionHandler to remove the state of the region in the master.
E.g., while the ClosedRegionHandler is running in a hbase.master.executor.closeregion.threads thread (A), another unassign call on the same region runs in another thread (B).
When thread B runs if (!regions.containsKey(region)), this.regions still has the region info; now the cpu switches to thread A.
Thread A removes the region from this.regions and regionsInTransition, then the cpu switches back to thread B. Thread B continues and throws an exception with the message "Server null returned java.lang.NullPointerException: Passed server is null for 9a6e26d40293663a79523c58315b930f", but without removing the newly added RegionState from regionsInTransition, so it can never be removed.

public void unassign(HRegionInfo region, boolean force) { LOG.debug(Starting unassignment of region +
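The interleaving described in scenario 2 of HBASE-4064 is a check-then-act race. A minimal sketch of one possible way to avoid the stale entry follows; it is not the attached patch. The class, the string-valued maps and the interleave hook (which lets a caller play the part of thread A between the check and the lookup) are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of the master's bookkeeping; names only loosely mirror the issue text. */
final class UnassignSketch {
    final Map<String, String> regions = new ConcurrentHashMap<>();             // region -> server
    final Map<String, String> regionsInTransition = new ConcurrentHashMap<>(); // region -> state

    /**
     * Tries to unassign a region (thread B in the issue). The interleave hook runs
     * right after the containsKey check so a test can simulate thread A.
     */
    boolean unassign(String region, Runnable interleave) {
        if (!regions.containsKey(region)) {
            return false; // not assigned anywhere, nothing to do
        }
        interleave.run();
        regionsInTransition.put(region, "PENDING_CLOSE");
        String server = regions.get(region);
        if (server == null) {
            // The region vanished after the check: undo our own entry instead of
            // leaving a PENDING_CLOSE state that nothing will ever clear.
            regionsInTransition.remove(region, "PENDING_CLOSE");
            return false;
        }
        return true; // a real master would send the close RPC to the server here
    }
}
```

The key point is that the failure path removes the entry it just added, so the timeout monitor has nothing left to retry. Holding one lock across the check and the update would be another option.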
[jira] [Commented] (HBASE-4064) Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long...
[ https://issues.apache.org/jira/browse/HBASE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292731#comment-13292731 ]

ramkrishna.s.vasudevan commented on HBASE-4064:
-----------------------------------------------

@Steven
Which version of HBase are you using? In the latest versions, HBase 0.94 and above, we are trying to solve these types of issues. Can you try with those versions?
Thanks.
[jira] [Commented] (HBASE-6012) Handling RegionOpeningState for bulk assign since SSH using
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292742#comment-13292742 ]

chunhui shen commented on HBASE-6012:
-------------------------------------

The above Review Request is discarded.
New: https://review.cloudera.org/r/2136/
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292743#comment-13292743 ]

chunhui shen commented on HBASE-6134:
-------------------------------------

Created a new review request: https://review.cloudera.org/r/2137/

Improvement for split-worker to speed up distributed-split-log
--------------------------------------------------------------
Key: HBASE-6134
URL: https://issues.apache.org/jira/browse/HBASE-6134
Project: HBase
Issue Type: Improvement
Components: wal
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
Fix For: 0.96.0
Attachments: HBASE-6134.patch, HBASE-6134v2.patch, HBASE-6134v3.patch

First, we ran a test comparing local-master-splitting and distributed-log-splitting.
Environment: 34 hlog files, 5 regionservers (after killing one, only 4 RSs do the splitting work), 400 regions in one hlog file
local-master-split: 60s+
distributed-log-splitting: 165s+
In fact, in our production environment, distributed-log-splitting also took 60s with 30 regionservers for 34 hlog files (the regionservers may have been under high load).
We found that the split-worker took about 20s to split one log file (30ms~50ms per writer.close(); 10ms per writer creation).
I think we could improve this by parallelizing the creation and closing of writers in threads.
In the patch, the logic for distributed-log-splitting is changed to be the same as local-master-splitting, and the close is parallelized in threads.
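With 400 regions per hlog file and 30ms~50ms per writer.close(), closing the writers one after another accounts for roughly 12s to 20s of the 20s measured per file, which is why closing them in parallel helps. The following is a minimal sketch of that idea using a plain thread pool. It is not the code from the attached patches; the class name, method name and pool size are hypothetical.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper that closes many writers concurrently. */
final class ParallelCloser {
    /** Closes every writer on a fixed-size pool and returns how many closed cleanly. */
    static int closeAll(List<? extends Closeable> writers, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> pending = new ArrayList<>();
            for (Closeable writer : writers) {
                Callable<Void> task = () -> {
                    writer.close();
                    return null;
                };
                pending.add(pool.submit(task));
            }
            int closed = 0;
            IOException firstFailure = null;
            for (Future<Void> f : pending) {
                try {
                    f.get(); // wait for every close so no writer is abandoned
                    closed++;
                } catch (ExecutionException e) {
                    if (firstFailure == null) {
                        firstFailure = new IOException("writer close failed", e.getCause());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while closing writers", e);
                }
            }
            if (firstFailure != null) {
                throw new UncheckedIOException(firstFailure);
            }
            return closed;
        } finally {
            pool.shutdown();
        }
    }
}
```

Waiting on every future before reporting a failure means a single bad writer does not leave the others open.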
[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292746#comment-13292746 ]

ramkrishna.s.vasudevan commented on HBASE-5564:
-----------------------------------------------

We found the problem. The latest patch introduced a space in the testcase: ' = org.apache.hadoop.hbase.mapreduce.TsvImporterCustomTestMapper,'. There should not be any space before or after '='.
Will rebase the patch so that it can be recommitted.

Bulkload is discarding duplicate records
----------------------------------------
Key: HBASE-5564
URL: https://issues.apache.org/jira/browse/HBASE-5564
Project: HBase
Issue Type: Bug
Components: mapreduce
Affects Versions: 0.96.0
Environment: HBase 0.92
Reporter: Laxman
Assignee: Laxman
Labels: bulkloader
Fix For: 0.96.0
Attachments: 5564.lint, 5564v5.txt, HBASE-5564_trunk.1.patch, HBASE-5564_trunk.1.patch, HBASE-5564_trunk.2.patch, HBASE-5564_trunk.3.patch, HBASE-5564_trunk.4_final.patch, HBASE-5564_trunk.patch

Duplicate records are getting discarded when they exist in the same input file, and more specifically when they exist in the same split.
Duplicate records are considered if the records are from different splits.
Version under test: HBase 0.92
[jira] [Updated] (HBASE-6012) Handling RegionOpeningState for bulk assign
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6012:
----------------------------------
Summary: Handling RegionOpeningState for bulk assign (was: Handling RegionOpeningState for bulk assign since SSH using)
[jira] [Updated] (HBASE-6092) Authorize flush, split, compact operations in AccessController
[ https://issues.apache.org/jira/browse/HBASE-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Laxman updated HBASE-6092:
--------------------------
Fix Version/s: 0.94.1, 0.96.0
Affects Version/s: 0.94.1, 0.96.0, 0.94.0
Status: Patch Available (was: Open)

Patch as per the above approach.

Authorize flush, split, compact operations in AccessController
--------------------------------------------------------------
Key: HBASE-6092
URL: https://issues.apache.org/jira/browse/HBASE-6092
Project: HBase
Issue Type: Sub-task
Components: security
Affects Versions: 0.94.0, 0.96.0, 0.94.1
Reporter: Laxman
Assignee: Laxman
Labels: acl, security
Fix For: 0.96.0, 0.94.1
Attachments: HBASE-6092.patch

Currently, flush, split and compaction are not checked for authorization in AccessController. With the current implementation, any unauthorized client can trigger these operations on a table.
[jira] [Updated] (HBASE-6092) Authorize flush, split, compact operations in AccessController
[ https://issues.apache.org/jira/browse/HBASE-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Laxman updated HBASE-6092:
--------------------------
Attachment: HBASE-6092.patch
[jira] [Commented] (HBASE-4064) Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long...
[ https://issues.apache.org/jira/browse/HBASE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292777#comment-13292777 ]

steven zhuang commented on HBASE-4064:
--------------------------------------

Hi Ramkrishna, thanks for your reply. I am now using HBase 0.92, and no, I cannot upgrade the cluster at present.
Actually, my problem is slightly different from this one. I sort of closed the ROOT region under a certain condition, and later the cluster could not be restarted, saying it was unassigning the ROOT region because it's not assigned anywhere.
We fixed it by clearing the ROOT region node from ZooKeeper's unassigned nodes list.
Thanks anyway.
[jira] [Commented] (HBASE-6195) Increment data will lost when the memstore flushed
[ https://issues.apache.org/jira/browse/HBASE-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292825#comment-13292825 ]

Xing Shi commented on HBASE-6195:
---------------------------------

I just found that the append interface also has this problem.
Maybe I should open a new JIRA for it?

Increment data will lost when the memstore flushed
--------------------------------------------------
Key: HBASE-6195
URL: https://issues.apache.org/jira/browse/HBASE-6195
Project: HBase
Issue Type: Bug
Components: regionserver
Reporter: Xing Shi
Attachments: HBASE-6195-trunk-V2.patch, HBASE-6195-trunk.patch

There are two problems in increment() now:
First:
The timestamp (the variable now) in HRegion's Increment() is generated before the rowLock is acquired, so when multiple threads increment the same row, a thread that generated its timestamp earlier may get the lock later. Because increment stores just one version, the result is still right up to this point.
When the region is flushing, an increment reads the kv from the snapshot and from the memstore, takes the one whose timestamp is larger, and writes the result back to the memstore. If the snapshot's timestamp is larger than the memstore's, the increment gets the old data and then does the increment on it, which is wrong.
Secondly:
There is also a risk in increment. Because it writes the memstore first and then the HLog, if the HLog write fails, the client will still read the incremented value.
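The first problem comes down to where the clock is read relative to the row lock. A minimal sketch of reading it only after the lock is held follows. This is an illustration under assumptions, not the attached patch: the class is a hypothetical single-row counter with no memstore or snapshot, and it assumes the supplied clock never goes backwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongSupplier;

/** Hypothetical single-row counter; it only illustrates where the clock is read. */
final class IncrementSketch {
    private final ReentrantLock rowLock = new ReentrantLock();
    private final List<Long> writeTimestamps = new ArrayList<>(); // in lock-acquisition order
    private long value;

    /** Adds amount to the counter, stamping the write only after the row lock is held. */
    long increment(long amount, LongSupplier clock) {
        rowLock.lock();
        try {
            long now = clock.getAsLong(); // read inside the lock: lock order matches timestamp order
            writeTimestamps.add(now);
            value += amount;
            return value;
        } finally {
            rowLock.unlock();
        }
    }

    /** True if no write carries a smaller timestamp than the write before it. */
    boolean timestampsAreOrdered() {
        rowLock.lock();
        try {
            for (int i = 1; i < writeTimestamps.size(); i++) {
                if (writeTimestamps.get(i) < writeTimestamps.get(i - 1)) {
                    return false;
                }
            }
            return true;
        } finally {
            rowLock.unlock();
        }
    }

    long value() {
        rowLock.lock();
        try {
            return value;
        } finally {
            rowLock.unlock();
        }
    }
}
```

If the clock were read before rowLock.lock(), a thread could take an early timestamp, lose the race for the lock, and then write a value stamped older than the one already stored, which is the ordering problem the description points at.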
[jira] [Updated] (HBASE-6195) Increment data will lost when the memstore flushed
[ https://issues.apache.org/jira/browse/HBASE-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6195:
----------------------------------
Status: Patch Available (was: Open)
[jira] [Created] (HBASE-6197) the append operation will lost data
Xing Shi created HBASE-6197: --- Summary: the append operation will lost data Key: HBASE-6197 URL: https://issues.apache.org/jira/browse/HBASE-6197 Project: HBase Issue Type: Bug Components: regionserver Reporter: Xing Shi Like HBASE-6195: when the region is flushing, the append thread can read the old value, because the snapshot holds the larger timestamp and the memstore the smaller one. We should make the thread that enters first generate the smaller timestamp.
[jira] [Updated] (HBASE-6197) HRegion's append operation will lost data
[ https://issues.apache.org/jira/browse/HBASE-6197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xing Shi updated HBASE-6197: Summary: HRegion's append operation will lost data (was: the append operation will lost data) HRegion's append operation will lost data - Key: HBASE-6197 URL: https://issues.apache.org/jira/browse/HBASE-6197 Project: HBase Issue Type: Bug Components: regionserver Reporter: Xing Shi Like HBASE-6195: when the region is flushing, the append thread can read the old value, because the snapshot holds the larger timestamp and the memstore the smaller one. We should make the thread that enters first generate the smaller timestamp.
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292835#comment-13292835 ] stack commented on HBASE-6060: -- What do I have to change in that section of code? It seems fine as is. Or are you thinking we need to allow it being in OFFLINE state? Yes, agree that before setting PENDING_OPEN, need to ensure it has not moved to OPENING. That should be easy enough (in fact, its not possible now I think about it since there a lock on RegionState that surrounds the single-assign -- need to make sure handleRegion respects it). What do you think of this approach Ram? The one where we rely on RegionState rather than on RegionPlan? I won't have the issue you see in Rajesh's patch, right? I don't intend to collect regions from outstanding regionplans. If the RS goes down before sending the RPC, then the single-assign will fail and retry but it won't be included in the SSH set? Regions's in OPENING state from failed regionservers takes a long time to recover - Key: HBASE-6060 URL: https://issues.apache.org/jira/browse/HBASE-6060 Project: HBase Issue Type: Bug Components: master, regionserver Reporter: Enis Soztutar Assignee: rajeshbabu Fix For: 0.96.0, 0.94.1, 0.92.3 Attachments: 6060-94-v3.patch, 6060-94-v4.patch, 6060-94-v4_1.patch, 6060-94-v4_1.patch, 6060-trunk.patch, 6060-trunk.patch, 6060-trunk_2.patch, 6060-trunk_3.patch, 6060_alternative_suggestion.txt, 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, HBASE-6060-92.patch, HBASE-6060-94.patch we have seen a pattern in tests, that the regions are stuck in OPENING state for a very long time when the region server who is opening the region fails. My understanding of the process: - master calls rs to open the region. If rs is offline, a new plan is generated (a new rs is chosen). 
RegionState is set to PENDING_OPEN (only in master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), HMaster.assign() - RegionServer, starts opening a region, changes the state in znode. But that znode is not ephemeral. (see ZkAssign) - Rs transitions zk node from OFFLINE to OPENING. See OpenRegionHandler.process() - rs then opens the region, and changes znode from OPENING to OPENED - when rs is killed between OPENING and OPENED states, then zk shows OPENING state, and the master just waits for rs to change the region state, but since rs is down, that wont happen. - There is a AssignmentManager.TimeoutMonitor, which does exactly guard against these kind of conditions. It periodically checks (every 10 sec by default) the regions in transition to see whether they timedout (hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min, which explains what you and I are seeing. - ServerShutdownHandler in Master does not reassign regions in OPENING state, although it handles other states. Lowering that threshold from the configuration is one option, but still I think we can do better. Will investigate more. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6195) Increment data will lost when the memstore flushed
[ https://issues.apache.org/jira/browse/HBASE-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292849#comment-13292849 ] Zhihong Ted Yu commented on HBASE-6195: --- @Xing: Hadoop QA run wasn't triggered. Can you add a unit test showing this problem and present test suite results? Thanks Increment data will lost when the memstore flushed -- Key: HBASE-6195 URL: https://issues.apache.org/jira/browse/HBASE-6195 Project: HBase Issue Type: Bug Components: regionserver Reporter: Xing Shi Attachments: HBASE-6195-trunk-V2.patch, HBASE-6195-trunk-V3.patch, HBASE-6195-trunk.patch There are two problems in increment() now. First: the timestamp (the variable now) in HRegion's increment() is generated before the row lock is acquired, so when multiple threads increment the same row, a thread that generated its timestamp earlier may get the lock later. Because increment stores only one version, the result is still correct up to this point. But while the region is flushing, an increment reads the KV from both the snapshot and the memstore, takes whichever has the larger timestamp, and writes the result back to the memstore. If the snapshot's timestamp is larger than the memstore's, the increment reads the old data and increments that, which is wrong. Second: there is also a risk in the write order. Increment writes the memstore first and then the HLog, so if the HLog write fails, the client will still read the incremented value.
[jira] [Updated] (HBASE-6195) Increment data will lost when the memstore flushed
[ https://issues.apache.org/jira/browse/HBASE-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xing Shi updated HBASE-6195: Attachment: HBASE-6195-trunk-V3.patch Increment data will lost when the memstore flushed -- Key: HBASE-6195 URL: https://issues.apache.org/jira/browse/HBASE-6195 Project: HBase Issue Type: Bug Components: regionserver Reporter: Xing Shi Attachments: HBASE-6195-trunk-V2.patch, HBASE-6195-trunk-V3.patch, HBASE-6195-trunk.patch There are two problems in increment() now. First: the timestamp (the variable now) in HRegion's increment() is generated before the row lock is acquired, so when multiple threads increment the same row, a thread that generated its timestamp earlier may get the lock later. Because increment stores only one version, the result is still correct up to this point. But while the region is flushing, an increment reads the KV from both the snapshot and the memstore, takes whichever has the larger timestamp, and writes the result back to the memstore. If the snapshot's timestamp is larger than the memstore's, the increment reads the old data and increments that, which is wrong. Second: there is also a risk in the write order. Increment writes the memstore first and then the HLog, so if the HLog write fails, the client will still read the incremented value.
[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96
[ https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292852#comment-13292852 ] Jonathan Hsieh commented on HBASE-6055: --- I'm still a bit confused -- still at the basic admin level. I think it would help if we give the restoring/export parts some more attention and talk about usage as opposed to mechanism first. I'm going to pose some use cases/examples/scenarios which hopefully will be easier to discuss. Let's say I am an admin, and we are pre hdfs hardlinks. I issue a snapshot command at the shell/master. * HBase creates a new .snapshot subdir, and it contains references to HLogs and HFiles. This is a snapshot. ** This step is called: snapshotting, taking a snapshot, and also materializing, right? I currently have a snapshot. I want read-only access to its contents to compare with the current table. * Does HBase know how to interpret the stuff in a .snapshot dir such that it acts like a read-only table? * Do I, as an admin, need to execute some step to make it appear in HBase as a read-only table? (If so, what is this called?) I currently have a snapshot. Oops! I accidentally truncated the table I had snapshotted. I don't want the truncated version of the table anymore and I want to replace the table with the snapshot so I have read/write access. * This is called restoring the snapshot, right? (And I do this by issuing something like a restore command at the shell?) * Does HBase copy or move the data referred to in the snapshot? I currently have a snapshot. I want to keep the current version, but I'd also like a clone of the snapshotted table that provides read/write access to the clone. * Is/should this be supported? * Is this called restoring or exporting the snapshot (to a new name)? * For this to work I need to convert all references into actual copies of the HFiles and HLogs, right? Is this conversion called exporting? 
(FYI, this is what I meant materializing to mean, but let's just stick to your definitions) I currently have a snapshot. I want to send a copy of the snapshot to a remote cluster so that it can provide read/write access to the data. * Is/should this be supported? * Do both HBase instances need to be up at the same time? ** This process would need to dereference the snapshot's references and copy them. What is it called? exporting? Source of confusion bq. Export is taking a snapshot from the .snapshot/ directory and possibly having a special snapshot distcp to somewhere. I would consider materialization as taking the exported snapshot and then 'hooking it back up' to another cluster (or the same) as a new table. You could throw materialization of the exported snapshot, but they are in fact distinct. I think the first materialization is supposed to be restoration yeah? I don't quite get the last sentence. Snapshots in HBase 0.96 --- Key: HBASE-6055 URL: https://issues.apache.org/jira/browse/HBASE-6055 Project: HBase Issue Type: New Feature Components: client, master, regionserver, zookeeper Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.96.0 Attachments: Snapshots in HBase.docx Continuation of HBASE-50 for the current trunk. Since the implementation has drastically changed, opening as a new ticket. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6195) Increment data will lost when the memstore flushed
[ https://issues.apache.org/jira/browse/HBASE-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292862#comment-13292862 ] Zhihong Ted Yu commented on HBASE-6195: --- In patch v3: {code} + long now = EnvironmentEdgeManager.currentTimeMillis(); Integer lid = getLock(lockid, row, true); {code} Variable now isn't actually referenced. Do we need it? {code} + //store the kvs to the tmp memory for write hlog first, then write memory {code} The above should read: 'to temporary memstore before writing HLog' Increment data will lost when the memstore flushed -- Key: HBASE-6195 URL: https://issues.apache.org/jira/browse/HBASE-6195 Project: HBase Issue Type: Bug Components: regionserver Reporter: Xing Shi Attachments: HBASE-6195-trunk-V2.patch, HBASE-6195-trunk-V3.patch, HBASE-6195-trunk.patch There are two problems in increment() now. First: the timestamp (the variable now) in HRegion's increment() is generated before the row lock is acquired, so when multiple threads increment the same row, a thread that generated its timestamp earlier may get the lock later. Because increment stores only one version, the result is still correct up to this point. But while the region is flushing, an increment reads the KV from both the snapshot and the memstore, takes whichever has the larger timestamp, and writes the result back to the memstore. If the snapshot's timestamp is larger than the memstore's, the increment reads the old data and increments that, which is wrong. Second: there is also a risk in the write order. Increment writes the memstore first and then the HLog, so if the HLog write fails, the client will still read the incremented value.
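The first problem in the HBASE-6195 description (the timestamp is read before the row lock is taken) can be illustrated with a minimal sketch. This is not HRegion code and not the attached patch: RowCounterSketch, its fields, and the use of ReentrantLock and System.currentTimeMillis() are all assumptions made for illustration, and the Math.max guard against a clock that steps backwards is an extra precaution that the issue does not mention.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical single-row counter used only for illustration; not HBase's HRegion. */
final class RowCounterSketch {
    private final ReentrantLock rowLock = new ReentrantLock();
    private long value;         // current counter value
    private long lastTimestamp; // timestamp attached to the most recent write

    /** Adds amount and returns the new value; the timestamp is read while the lock is held. */
    long increment(long amount) {
        rowLock.lock();
        try {
            // Reading the clock only after the lock is held means the order in which
            // threads obtain the lock matches the order of their timestamps.
            // Math.max keeps timestamps non-decreasing if the wall clock steps back
            // (an extra guard assumed here, not something the issue asks for).
            long now = Math.max(System.currentTimeMillis(), lastTimestamp);
            value += amount;
            lastTimestamp = now;
            return value;
        } finally {
            rowLock.unlock();
        }
    }

    /** Returns the timestamp of the most recent increment. */
    long lastTimestamp() {
        rowLock.lock();
        try {
            return lastTimestamp;
        } finally {
            rowLock.unlock();
        }
    }
}
```

With the clock read inside the locked section, a thread that wins the lock later can never carry an earlier timestamp than the write it follows, which is the ordering property the description says is missing.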
[jira] [Updated] (HBASE-6195) Increment data will be lost when the memstore is flushed
[ https://issues.apache.org/jira/browse/HBASE-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6195: -- Assignee: ShiXing Hadoop Flags: Reviewed Summary: Increment data will be lost when the memstore is flushed (was: Increment data will lost when the memstore flushed) Increment data will be lost when the memstore is flushed Key: HBASE-6195 URL: https://issues.apache.org/jira/browse/HBASE-6195 Project: HBase Issue Type: Bug Components: regionserver Reporter: Xing Shi Assignee: ShiXing Attachments: HBASE-6195-trunk-V2.patch, HBASE-6195-trunk-V3.patch, HBASE-6195-trunk.patch There are two problems in increment() now. First: the timestamp (the variable now) in HRegion's increment() is generated before the row lock is acquired, so when multiple threads increment the same row, a thread that generated its timestamp earlier may get the lock later. Because increment stores only one version, the result is still correct up to this point. But while the region is flushing, an increment reads the KV from both the snapshot and the memstore, takes whichever has the larger timestamp, and writes the result back to the memstore. If the snapshot's timestamp is larger than the memstore's, the increment reads the old data and increments that, which is wrong. Second: there is also a risk in the write order. Increment writes the memstore first and then the HLog, so if the HLog write fails, the client will still read the incremented value.
[jira] [Commented] (HBASE-6012) Handling RegionOpeningState for bulk assign
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292868#comment-13292868 ] Zhihong Ted Yu commented on HBASE-6012: --- @Chunhui: Hadoop QA is not functioning. Can you run the whole test suite and post the result? Handling RegionOpeningState for bulk assign --- Key: HBASE-6012 URL: https://issues.apache.org/jira/browse/HBASE-6012 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6012.patch, HBASE-6012v2.patch, HBASE-6012v3.patch, HBASE-6012v4.patch, HBASE-6012v5.patch, HBASE-6012v6.patch Since HBASE-5914, we use bulk assign for SSH. But in the bulk assign case, if we get ALREADY_OPENED, there is no one to clear the znode created by bulk assign. Another thing: when an RS is opening a list of regions, if one region is already in transition, it throws RegionAlreadyInTransitionException and stops opening the other regions.
[jira] [Commented] (HBASE-6188) Remove the concept of table owner
[ https://issues.apache.org/jira/browse/HBASE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292878#comment-13292878 ] ramkrishna.s.vasudevan commented on HBASE-6188: --- @Andy Had a small discussion with Laxman regarding the role of CREATE. I think the suggestion given by Laxman makes sense. Even if online schema modification is introduced the role of CREATE with the above said functions will still apply. Just wanted to add my thoughts in this. Thanks Andy and Laxman. Remove the concept of table owner - Key: HBASE-6188 URL: https://issues.apache.org/jira/browse/HBASE-6188 Project: HBase Issue Type: Sub-task Components: security Reporter: Andrew Purtell Assignee: Laxman Labels: security The table owner concept was a design simplification in the initial drop. First, the design changes under review means only a user with GLOBAL CREATE permission can create a table, which will probably be an administrator. Then, granting implicit permissions may lead to oversights and it adds unnecessary conditionals to our code. So instead the administrator with GLOBAL CREATE permission should make the appropriate grants at table create time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292883#comment-13292883 ] ramkrishna.s.vasudevan commented on HBASE-6060: --- bq. I don't intend to collect regions from outstanding regionplans I think we need this (like the one in the earlier patches). Otherwise, if the region has been moved to OPENING and the destination RS goes down, how do we process it? It will not be in META either, since it is not yet opened on the destination RS. So logic similar to Rajesh's patch will be needed, but that, along with your suggestion, needs some minor tweaks. @Rajesh Correct me if I am wrong. This is what we discussed today? @Stack We had a patch but could not upload it from the office due to some other tasks. Maybe we can upload one by tomorrow morning once we reach the office. Please correct me if I am wrong. (We discussed a lot of scenarios today, so I may be missing some things here.) Regions's in OPENING state from failed regionservers takes a long time to recover - Key: HBASE-6060 URL: https://issues.apache.org/jira/browse/HBASE-6060 Project: HBase Issue Type: Bug Components: master, regionserver Reporter: Enis Soztutar Assignee: rajeshbabu Fix For: 0.96.0, 0.94.1, 0.92.3 Attachments: 6060-94-v3.patch, 6060-94-v4.patch, 6060-94-v4_1.patch, 6060-94-v4_1.patch, 6060-trunk.patch, 6060-trunk.patch, 6060-trunk_2.patch, 6060-trunk_3.patch, 6060_alternative_suggestion.txt, 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, HBASE-6060-92.patch, HBASE-6060-94.patch we have seen a pattern in tests, that the regions are stuck in OPENING state for a very long time when the region server who is opening the region fails. My understanding of the process: - master calls rs to open the region. If rs is offline, a new plan is generated (a new rs is chosen). 
RegionState is set to PENDING_OPEN (only in master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), HMaster.assign() - RegionServer, starts opening a region, changes the state in znode. But that znode is not ephemeral. (see ZkAssign) - Rs transitions zk node from OFFLINE to OPENING. See OpenRegionHandler.process() - rs then opens the region, and changes znode from OPENING to OPENED - when rs is killed between OPENING and OPENED states, then zk shows OPENING state, and the master just waits for rs to change the region state, but since rs is down, that wont happen. - There is a AssignmentManager.TimeoutMonitor, which does exactly guard against these kind of conditions. It periodically checks (every 10 sec by default) the regions in transition to see whether they timedout (hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min, which explains what you and I are seeing. - ServerShutdownHandler in Master does not reassign regions in OPENING state, although it handles other states. Lowering that threshold from the configuration is one option, but still I think we can do better. Will investigate more. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292884#comment-13292884 ] stack commented on HBASE-6060: -- bq. Or else if the region has been moved to OPENING and the destination RS goes down how do we try to process them. The RegionState will have the server that reported the region in OPENING. We don't need to get it from RegionPlan (in AM#processServerShutdown, as part of the run through all RITs, we need to look for RegionStates that are against the server that has just died). Looking forward to your patch tomorrow. I'll upload one too (working on tests today). Regions's in OPENING state from failed regionservers takes a long time to recover - Key: HBASE-6060 URL: https://issues.apache.org/jira/browse/HBASE-6060 Project: HBase Issue Type: Bug Components: master, regionserver Reporter: Enis Soztutar Assignee: rajeshbabu Fix For: 0.96.0, 0.94.1, 0.92.3 Attachments: 6060-94-v3.patch, 6060-94-v4.patch, 6060-94-v4_1.patch, 6060-94-v4_1.patch, 6060-trunk.patch, 6060-trunk.patch, 6060-trunk_2.patch, 6060-trunk_3.patch, 6060_alternative_suggestion.txt, 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, HBASE-6060-92.patch, HBASE-6060-94.patch we have seen a pattern in tests, that the regions are stuck in OPENING state for a very long time when the region server who is opening the region fails. My understanding of the process: - master calls rs to open the region. If rs is offline, a new plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), HMaster.assign() - RegionServer, starts opening a region, changes the state in znode. But that znode is not ephemeral. (see ZkAssign) - Rs transitions zk node from OFFLINE to OPENING. 
See OpenRegionHandler.process() - rs then opens the region, and changes znode from OPENING to OPENED - when rs is killed between OPENING and OPENED states, then zk shows OPENING state, and the master just waits for rs to change the region state, but since rs is down, that wont happen. - There is a AssignmentManager.TimeoutMonitor, which does exactly guard against these kind of conditions. It periodically checks (every 10 sec by default) the regions in transition to see whether they timedout (hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min, which explains what you and I are seeing. - ServerShutdownHandler in Master does not reassign regions in OPENING state, although it handles other states. Lowering that threshold from the configuration is one option, but still I think we can do better. Will investigate more. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
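The approach stack describes for HBASE-6060 (while handling a server shutdown, run through the regions in transition and pick out those whose RegionState points at the dead server) might look roughly like the sketch below. Every name in it (DeadServerRitSketch, RegionInTransition, regionsToReassign) is hypothetical and none of this is the AssignmentManager API; it also covers only the OPENING case from the issue description, not the PENDING_OPEN cases debated in the comments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative only: a stand-in for scanning regions in transition after a server dies. */
final class DeadServerRitSketch {
    enum State { OFFLINE, PENDING_OPEN, OPENING, OPENED }

    /** Hypothetical record of one region's in-flight transition as the master sees it. */
    static final class RegionInTransition {
        final State state;
        final String server; // server that last reported this state

        RegionInTransition(State state, String server) {
            this.state = state;
            this.server = server;
        }
    }

    /** Returns the names of regions stuck in OPENING on the server that just died. */
    static List<String> regionsToReassign(Map<String, RegionInTransition> regionsInTransition,
                                          String deadServer) {
        List<String> toReassign = new ArrayList<>();
        for (Map.Entry<String, RegionInTransition> entry : regionsInTransition.entrySet()) {
            RegionInTransition rit = entry.getValue();
            // The dead server can never move the znode from OPENING to OPENED,
            // so the master has to pick these regions up itself.
            if (rit.state == State.OPENING && deadServer.equals(rit.server)) {
                toReassign.add(entry.getKey());
            }
        }
        return toReassign;
    }
}
```

The point of keying on the state record rather than on outstanding plans is that the record already names the server that reported OPENING, so no separate lookup is needed to decide which regions the dead server left behind.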
[jira] [Commented] (HBASE-5947) Check for valid user/table/family/qualifier and acl state
[ https://issues.apache.org/jira/browse/HBASE-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292890#comment-13292890 ] Laxman commented on HBASE-5947: --- @Matt, any update on this issue? Check for valid user/table/family/qualifier and acl state - Key: HBASE-5947 URL: https://issues.apache.org/jira/browse/HBASE-5947 Project: HBase Issue Type: Sub-task Components: security Affects Versions: 0.92.1, 0.94.0, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Labels: acl HBase Shell grant/revoke doesn't check for a valid user or table/family/qualifier, so you can end up having rights for something that doesn't exist. We might also want to ensure, upon table/column creation, that no entries are already stored in the acl table. We might still have residual acl entries if something goes wrong in postDeleteTable() or postDeleteColumn().
[jira] [Commented] (HBASE-5947) Check for valid user/table/family/qualifier and acl state
[ https://issues.apache.org/jira/browse/HBASE-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292906#comment-13292906 ] Matteo Bertozzi commented on HBASE-5947: No news on that... Checking the column qualifier requires a deep scan or keeping ref-counted qualifiers somewhere. Checking the user is not that easy unless we have some LDAP integration or similar. But if you want to take ownership of this, go ahead! Check for valid user/table/family/qualifier and acl state - Key: HBASE-5947 URL: https://issues.apache.org/jira/browse/HBASE-5947 Project: HBase Issue Type: Sub-task Components: security Affects Versions: 0.92.1, 0.94.0, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Labels: acl HBase Shell grant/revoke doesn't check for a valid user or table/family/qualifier, so you can end up having rights for something that doesn't exist. We might also want to ensure, upon table/column creation, that no entries are already stored in the acl table. We might still have residual acl entries if something goes wrong in postDeleteTable() or postDeleteColumn().
[jira] [Commented] (HBASE-6196) MR testcases does not run with hadoop - 2.0.
[ https://issues.apache.org/jira/browse/HBASE-6196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292914#comment-13292914 ] Andrew Purtell commented on HBASE-6196: --- Didn't this go in with HBASE-5966 "MapReduce based tests broken on Hadoop 2.0.0-alpha"? See svn commit r1337448 - /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java MR testcases does not run with hadoop - 2.0. Key: HBASE-6196 URL: https://issues.apache.org/jira/browse/HBASE-6196 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.96.0 Reporter: ramkrishna.s.vasudevan Fix For: 0.96.0 The MR related testcases are failing in hadoop-2.0. The resource manager scheduler is not getting spawned. The following fix solves the problem. If you feel it is ok, I can submit as patch and commit. {code} String rmSchedulerAdress = mrClusterJobConf.get("yarn.resourcemanager.scheduler.address"); if (rmSchedulerAdress != null) { conf.set("yarn.resourcemanager.scheduler.address", rmSchedulerAdress); } {code}
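The fix proposed in HBASE-6196 copies one configuration key from the MR cluster's job configuration into the test configuration when the key is set. A self-contained sketch of that pattern follows. It uses java.util.Properties purely as a stand-in for Hadoop's Configuration so that it runs without Hadoop on the classpath; the class and method names (SchedulerAddressCopySketch, copySchedulerAddress) are hypothetical, and only the key string comes from the issue.

```java
import java.util.Properties;

/** Illustrative only: Properties stands in for Hadoop's Configuration here. */
final class SchedulerAddressCopySketch {
    /** The key named in the issue; note that it must be a quoted string literal. */
    static final String RM_SCHEDULER_ADDRESS = "yarn.resourcemanager.scheduler.address";

    /**
     * Copies the scheduler address from source to target when source defines it.
     * Returns true if a value was copied, false if source had no such key.
     */
    static boolean copySchedulerAddress(Properties source, Properties target) {
        String address = source.getProperty(RM_SCHEDULER_ADDRESS);
        if (address == null) {
            return false; // leave the target untouched, as the null check in the issue does
        }
        target.setProperty(RM_SCHEDULER_ADDRESS, address);
        return true;
    }
}
```

Holding the key in a single constant avoids repeating the string in the get and set calls, where a typo in one of them would silently copy nothing.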
[jira] [Commented] (HBASE-6188) Remove the concept of table owner
[ https://issues.apache.org/jira/browse/HBASE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292922#comment-13292922 ] Andrew Purtell commented on HBASE-6188: --- The trouble here is CREATE loses most of its meaning when there won't be a concept of table owner (initialized to the creator) and it is a large subset of ADMIN permission. A user with CREATE permissions on a table can do everything except assign or move a region? Why does that make sense when disable/enable will move all of the regions around, much more disruptive? What I am after here is a justification for keeping around the legacy permission CREATE. Remove the concept of table owner - Key: HBASE-6188 URL: https://issues.apache.org/jira/browse/HBASE-6188 Project: HBase Issue Type: Sub-task Components: security Reporter: Andrew Purtell Assignee: Laxman Labels: security The table owner concept was a design simplification in the initial drop. First, the design changes under review means only a user with GLOBAL CREATE permission can create a table, which will probably be an administrator. Then, granting implicit permissions may lead to oversights and it adds unnecessary conditionals to our code. So instead the administrator with GLOBAL CREATE permission should make the appropriate grants at table create time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-5564: -- Status: Patch Available (was: Reopened) Bulkload is discarding duplicate records Key: HBASE-5564 URL: https://issues.apache.org/jira/browse/HBASE-5564 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.96.0 Environment: HBase 0.92 Reporter: Laxman Assignee: Laxman Labels: bulkloader Fix For: 0.96.0 Attachments: 5564.lint, 5564v5.txt, HBASE-5564.patch, HBASE-5564_trunk.1.patch, HBASE-5564_trunk.1.patch, HBASE-5564_trunk.2.patch, HBASE-5564_trunk.3.patch, HBASE-5564_trunk.4_final.patch, HBASE-5564_trunk.patch Duplicate records are getting discarded when duplicate records exist in the same input file, and more specifically if they exist in the same split. Duplicate records are considered if the records are from different splits. Version under test: HBase 0.92
[jira] [Updated] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-5564: -- Attachment: HBASE-5564.patch New patch for trunk. This time the test cases should run. Please review and provide your comments.
[jira] [Created] (HBASE-6198) Sometimes we synchronize on RegionState instance updating it, most of the time we don't. Fix
stack created HBASE-6198: Summary: Sometimes we synchronize on RegionState instance updating it, most of the time we don't. Fix Key: HBASE-6198 URL: https://issues.apache.org/jira/browse/HBASE-6198 Project: HBase Issue Type: Bug Reporter: stack We synchronize on the RegionState instance when doing a single assign of a region, but over in the handleRegion zk callback we don't synchronize on the RegionState instance. Makes no sense. Either get rid of all synchronization or put synchronization everywhere. We should probably do the latter since it makes things easier to reason about, and these states are already complicated. There could be a performance issue though.
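The "synchronization everywhere" option can be sketched as below. This is a minimal illustration under stated assumptions, not the HBase code: SketchRegionState, its State enum, and transition() are hypothetical names, and the two threads only stand in for the single-assign path and the handleRegion zk callback.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the master's per-region state holder; not the HBase class. */
final class SketchRegionState {
    enum State { OFFLINE, PENDING_OPEN, OPENING, OPEN }

    private State state = State.OFFLINE;
    private int updates = 0;

    /** Check-then-act under the instance monitor, so two callers cannot both win. */
    synchronized boolean transition(State expected, State next) {
        if (state != expected) {
            return false;
        }
        state = next;
        updates++;
        return true;
    }

    synchronized State getState() { return state; }

    synchronized int getUpdates() { return updates; }
}

final class RegionStateSyncDemo {
    /** Races two threads on the same transition and returns how many of them succeeded. */
    static int raceOnce() {
        SketchRegionState regionState = new SketchRegionState();
        AtomicInteger wins = new AtomicInteger();
        Runnable attempt = () -> {
            if (regionState.transition(SketchRegionState.State.OFFLINE,
                                       SketchRegionState.State.PENDING_OPEN)) {
                wins.incrementAndGet();
            }
        };
        Thread assignPath = new Thread(attempt, "single-assign");  // stands in for the assign path
        Thread callbackPath = new Thread(attempt, "zk-callback");  // stands in for handleRegion
        assignPath.start();
        callbackPath.start();
        try {
            assignPath.join();
            callbackPath.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the race", e);
        }
        return wins.get();
    }

    public static void main(String[] args) {
        System.out.println("successful transitions: " + raceOnce());
    }
}
```

Because every read and write goes through the same monitor, only one of the two racing callers can observe OFFLINE and move the region on. The cost is the lock acquisition on every access, which is the possible performance issue the report mentions.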
[jira] [Commented] (HBASE-6188) Remove the concept of table owner
[ https://issues.apache.org/jira/browse/HBASE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292939#comment-13292939 ] Laxman commented on HBASE-6188: --- Thanks Ram for pitching in. Andy, we definitely agree with your point. Just reiterating my previous comments. {quote} I agree with you Andy. But if we keep DisableTable/EnableTable permission with ADMIN alone, to delete/modify a table a user should have both ADMIN and CREATE permissions. ADMIN access to disable a table and CREATE access to delete/modify the table. Or a user with CREATE-only access has to request the ADMIN user to disable/enable the table before/after DDL. {quote} So, deleting a table requires two different users or one user with both permissions. This is my only concern. Thanks for the clarification. Please provide your opinion on this.
CREATE - (DDL) CreateTable, AddColumn, DeleteColumn, DeleteTable, ModifyColumn, ModifyTable
ADMIN - DisableTable, EnableTable
bq. it is a large subset of ADMIN permission.
Please note that the above are two disjoint sets. That means DDL operations can't be done by ADMIN. Hope that makes them clear.
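The concern about the two disjoint sets can be made concrete with a small model. This is only a sketch of the split as listed in the comment above: the class, enum, and method names are hypothetical and do not come from the AccessController code.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the CREATE/ADMIN split listed above; not the AccessController code. */
final class PermissionSplitSketch {
    enum Permission { CREATE, ADMIN }

    enum Operation {
        CREATE_TABLE, ADD_COLUMN, DELETE_COLUMN, DELETE_TABLE, MODIFY_COLUMN, MODIFY_TABLE,
        DISABLE_TABLE, ENABLE_TABLE
    }

    private static final Map<Operation, Permission> REQUIRED = new EnumMap<>(Operation.class);
    static {
        // The six DDL operations need CREATE.
        for (Operation ddl : EnumSet.range(Operation.CREATE_TABLE, Operation.MODIFY_TABLE)) {
            REQUIRED.put(ddl, Permission.CREATE);
        }
        // Disable and enable need ADMIN.
        REQUIRED.put(Operation.DISABLE_TABLE, Permission.ADMIN);
        REQUIRED.put(Operation.ENABLE_TABLE, Permission.ADMIN);
    }

    /** Returns every permission needed to run the given steps in sequence. */
    static Set<Permission> requiredFor(List<Operation> steps) {
        Set<Permission> needed = EnumSet.noneOf(Permission.class);
        for (Operation step : steps) {
            needed.add(REQUIRED.get(step));
        }
        return needed;
    }

    public static void main(String[] args) {
        // Deleting a table means disabling it first, then deleting it.
        List<Operation> dropTable = Arrays.asList(Operation.DISABLE_TABLE, Operation.DELETE_TABLE);
        System.out.println("drop table needs: " + requiredFor(dropTable));
    }
}
```

With this mapping, the disable-then-delete sequence needs both permissions, which is exactly the "two users or one user with both" situation raised in the comment.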
[jira] [Commented] (HBASE-5947) Check for valid user/table/family/qualifier and acl state
[ https://issues.apache.org/jira/browse/HBASE-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292950#comment-13292950 ] Laxman commented on HBASE-5947: --- :) I just asked as I noticed the parent issue can be closed if this issue is closed. bq. But if you want to take ownership of this, go ahead! I'm not comfortable with LDAP. Will check once and get back to you on this. A brief outline of the approach would help me understand. Check for valid user/table/family/qualifier and acl state - Key: HBASE-5947 URL: https://issues.apache.org/jira/browse/HBASE-5947 Project: HBase Issue Type: Sub-task Components: security Affects Versions: 0.92.1, 0.94.0, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Labels: acl HBase Shell grant/revoke doesn't check for a valid user or table/family/qualifier, so you can end up having rights for something that doesn't exist. We might also want to ensure, upon table/column creation, that no entries are already stored in the ACL table. We might still have residual ACL entries if something goes wrong in postDeleteTable() or postDeleteColumn().
[jira] [Commented] (HBASE-5947) Check for valid user/table/family/qualifier and acl state
[ https://issues.apache.org/jira/browse/HBASE-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292951#comment-13292951 ] Enis Soztutar commented on HBASE-5947: -- bq. No news on that... checking for a column qualifier requires a deep scan or keeping ref-counted qualifiers somewhere. For qualifiers, I think it is fine not to enforce that they exist, but we should check for table / CF. For preCreateTable and postDelete, we have to do the scan on the ACL table, not on the actual table, no?
[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96
[ https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292952#comment-13292952 ] Matteo Bertozzi commented on HBASE-6055: @Jon inline replies

{quote}
I issue a snapshot command at the shell/master.
* HBase creates a new .snapshot subdir, and it contains references to HLogs and HFiles. This is a snapshot
** This step is called: snapshotting, taking a snapshot, and also materializing right?
{quote}
Yes. When you issue a snapshot command, HBase creates a new .snapshot subdir containing references to HLogs and HFiles. This is taking a snapshot, or snapshotting, but not materialization. I think materialization is when you copy the HFiles/HLogs somewhere else.

{quote}
I currently have a snapshot. I want read-only access to its contents to compare with the current table.
* Does HBase know how to interpret the stuff in a .snapshot dir such that it acts like a read-only table?
* Do I, as an admin, need to execute some step to make it appear in HBase as a read-only table? (if so what is this called?)
{quote}
I think the first point is more like a snapshot scan, which scans the HFiles and HLogs in the snapshot directory and shows you the result. The second point seems more like a restore to a different table, with that table marked as read-only.

{quote}
I currently have a snapshot. Oops! I accidentally truncated the table I had snapshotted. I don't want the truncated version of the table anymore and I want to replace the table with the snapshot so I have read/write access.
* This is called restoring the snapshot right? (and I do this by issuing something like a restore command at the shell?)
* Does HBase copy or move the data referred to in the snapshot?
{quote}
Restore is when you replace your current table with the snapshot version, and you do it by issuing restore snapshot-name. Yes, you need to copy the old HFiles to restore the snapshot (though perhaps not every HFile has been removed from the current table).

{quote}
I currently have a snapshot. I want the current version but I'd like a clone of the snapshotted table that provides read/write access to the clone.
* Is/should this be supported?
* Is this called restoring or exporting the snapshot (to a new name)?
* For this to work I need to convert all references into actual copies of the HFiles and HLogs right? Is this conversion called exporting? (FYI, this is what I meant materializing to mean, but let's just stick to your definitions)
{quote}
Yes, this is really easy with HardLink; some more work is needed to keep track of reference files. This is a restore to a different table. Export is when you copy the .snapshot/name folder to another cluster. If you think in terms of HardLink, you don't need to copy the HFiles, you just hard-link them. More code is needed to use reference files, but you can avoid the copy. (Note that the HLogs need to be replayed, so they are the only files that need to be copied.)

{quote}
I currently have a snapshot. I want to send a copy of the snapshot to a remote cluster so that it can provide read/write access to the data. Is/should this be supported?
* Do both HBase instances need to be up at the same time?
* This process would need to dereference the snapshot's references and copy them. What is it called? exporting?
{quote}
Yes, this is Import/Export, which is basically a distcp of the .snapshot/name folder. I think it is enough to have both HDFS instances up at the same time. And yes, in this case you need to physically copy the HFiles.
Snapshots in HBase 0.96 --- Key: HBASE-6055 URL: https://issues.apache.org/jira/browse/HBASE-6055 Project: HBase Issue Type: New Feature Components: client, master, regionserver, zookeeper Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.96.0 Attachments: Snapshots in HBase.docx Continuation of HBASE-50 for the current trunk. Since the implementation has drastically changed, opening as a new ticket. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6199) Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING
stack created HBASE-6199: Summary: Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING Key: HBASE-6199 URL: https://issues.apache.org/jira/browse/HBASE-6199 Project: HBase Issue Type: Improvement Reporter: stack PENDING_OPEN currently is a murky state. It's a master in-memory state with no corresponding znode state that sits between OFFLINE and OPENING states. The OFFLINE state is set by the master when it goes to open a region. OPENING is set by the regionserver after it has assumed control of a region and is moving it through the OPENING process. PENDING_OPEN currently spans the open rpc invocation. This state is in place pre-open-rpc-invocation, during open-rpc-invocation, and post-rpc-invocation until we get the OPENING callback. That PENDING_OPEN covers this many different conditions effectively makes it unactionable. This issue proposes PENDING_OPEN only be in place post-rpc-invocation. Now its meaning is clear as the space between rpc-open-invocation and our receiving the callback which sets RegionState to OPENING. PENDING_OPEN becomes actionable too in that if a regionserver dies post rpc-open-invocation, we know that we can reassign the region. See https://issues.apache.org/jira/browse/HBASE-6060?focusedCommentId=13292646&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13292646 for more discussion.
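The narrowed scope proposed here can be sketched as a tiny state holder. This is an illustration under stated assumptions rather than the attached patch: PendingOpenSketch, OpenRpc, and the method names are hypothetical, the sketch is single-threaded, and it ignores the case where the OPENING callback arrives before the rpc returns.

```java
/** Hypothetical master-side view of one region's open sequence; not the AssignmentManager code. */
final class PendingOpenSketch {
    enum State { OFFLINE, PENDING_OPEN, OPENING }

    /** Stand-in for the open rpc sent to a regionserver. */
    interface OpenRpc {
        void sendOpen(String regionName) throws Exception;
    }

    private State state = State.OFFLINE;

    /** Proposed scope: the state stays OFFLINE until the open rpc has returned. */
    boolean assign(String regionName, OpenRpc rpc) {
        state = State.OFFLINE;
        try {
            rpc.sendOpen(regionName);
        } catch (Exception e) {
            return false; // rpc failed: still OFFLINE, never PENDING_OPEN
        }
        state = State.PENDING_OPEN;
        return true;
    }

    /** Models the callback reporting that the regionserver has taken over the region. */
    void onOpeningCallback() {
        if (state == State.PENDING_OPEN) {
            state = State.OPENING;
        }
    }

    /** True only in the window between the rpc returning and the OPENING callback. */
    boolean openRpcDeliveredButNotAcknowledged() {
        return state == State.PENDING_OPEN;
    }

    State getState() { return state; }

    public static void main(String[] args) {
        PendingOpenSketch region = new PendingOpenSketch();
        region.assign("region-a", name -> { /* pretend the rpc succeeded */ });
        System.out.println("after rpc: " + region.getState());
        region.onOpeningCallback();
        System.out.println("after callback: " + region.getState());
    }
}
```

With the state set only after the rpc returns, seeing PENDING_OPEN when a regionserver dies tells the master the open request had already been handed over, which is the condition the issue describes as safe for reassignment.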
[jira] [Updated] (HBASE-6199) Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING
[ https://issues.apache.org/jira/browse/HBASE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6199: - Attachment: pending_open.txt Let's see how this patch does.
[jira] [Updated] (HBASE-6199) Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING
[ https://issues.apache.org/jira/browse/HBASE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6199: - Attachment: pending_open2.txt The first patch had a mistake in it.
[jira] [Updated] (HBASE-6199) Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING
[ https://issues.apache.org/jira/browse/HBASE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6199: - Assignee: stack Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-5947) Check for valid user/table/family/qualifier and acl state
[ https://issues.apache.org/jira/browse/HBASE-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292959#comment-13292959 ] Enis Soztutar commented on HBASE-5947: -- Are we sure we want to check for users?
[jira] [Updated] (HBASE-6199) Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING
[ https://issues.apache.org/jira/browse/HBASE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6199: - Attachment: pending_open3.txt Fix comments.
[jira] [Commented] (HBASE-5947) Check for valid user/table/family/qualifier and acl state
[ https://issues.apache.org/jira/browse/HBASE-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292970#comment-13292970 ] Matteo Bertozzi commented on HBASE-5947: @Enis maybe not, since unless we have an LDAP server or something similar we have no way to determine which users are available.
[jira] [Commented] (HBASE-5352) ACL improvements
[ https://issues.apache.org/jira/browse/HBASE-5352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292972#comment-13292972 ] Laxman commented on HBASE-5352: --- Request for review for subtask: HBASE-6092 ACL improvements Key: HBASE-5352 URL: https://issues.apache.org/jira/browse/HBASE-5352 Project: HBase Issue Type: Improvement Components: security Affects Versions: 0.92.1, 0.94.0 Reporter: Enis Soztutar Assignee: Enis Soztutar In this issue I would like to open discussion for a few minor ACL related improvements. The proposed changes are as follows:
1. Introduce something like AccessControllerProtocol.checkPermissions(Permission[] permissions) API, so that clients can check access rights before carrying out the operations. We need this kind of operation for HCATALOG-245, which introduces authorization providers for hbase over hcat. We cannot use getUserPermissions() since it requires ADMIN permissions on the global/table level.
2. getUserPermissions(tableName)/grant/revoke and drop/modify table operations should not check for global CREATE/ADMIN rights, but table CREATE/ADMIN rights. The reasoning is that if a user is able to admin or read from a table, she should be able to read the table's permissions. We can choose whether we want only READ or ADMIN permissions for getUserPermission(). Since we check for global permissions first for table permissions, configuring table access using global permissions will continue to work.
3. Grant/Revoke global permissions - HBASE-5342 (included for completeness)
Of the three, we may want to backport the first one to 0.92, since without it Hive/HCatalog cannot use HBase's authorization mechanism effectively. I will create subissues and convert HBASE-5342 to a subtask when we get some feedback and opinions for going further.
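The first proposal, a call that lets a client ask "do I hold all of these?" before starting work, might look roughly like the following. This is a sketch under stated assumptions: only the checkPermissions name and its array-of-permissions argument come from the issue text, while the class, the Action enum, the grant storage, and the boolean return are hypothetical choices made for illustration.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical permission check; shapes are illustrative, not the HBase API. */
final class CheckPermissionsSketch {
    enum Action { READ, WRITE, CREATE, ADMIN }

    /** A (table, action) pair the caller wants verified. */
    static final class Permission {
        final String table;
        final Action action;

        Permission(String table, Action action) {
            this.table = table;
            this.action = action;
        }
    }

    private final Set<Action> globalGrants;
    private final Map<String, Set<Action>> tableGrants = new HashMap<>();

    CheckPermissionsSketch(Set<Action> globalGrants) {
        this.globalGrants = globalGrants;
    }

    void grantOnTable(String table, Action action) {
        tableGrants.computeIfAbsent(table, t -> EnumSet.noneOf(Action.class)).add(action);
    }

    /** Returns true only if every requested permission is held. */
    boolean checkPermissions(Permission... permissions) {
        for (Permission p : permissions) {
            // Global grants are consulted first, then table-level grants.
            boolean allowed = globalGrants.contains(p.action)
                || tableGrants.getOrDefault(p.table, Collections.<Action>emptySet())
                              .contains(p.action);
            if (!allowed) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        CheckPermissionsSketch user = new CheckPermissionsSketch(EnumSet.noneOf(Action.class));
        user.grantOnTable("t1", Action.READ);
        System.out.println("can read t1: " + user.checkPermissions(new Permission("t1", Action.READ)));
    }
}
```

Checking global grants before table grants mirrors the note in item 2 that configuring table access through global permissions continues to work.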
[jira] [Commented] (HBASE-4391) Add ability to start RS as root and call mlockall
[ https://issues.apache.org/jira/browse/HBASE-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292975#comment-13292975 ] Enis Soztutar commented on HBASE-4391: -- I've seen something similar in the Accumulo code base: http://svn.apache.org/viewvc/accumulo/trunk/server/src/main/c%2B%2B/mlock/ http://svn.apache.org/viewvc/accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/MLock.java?view=log Add ability to start RS as root and call mlockall - Key: HBASE-4391 URL: https://issues.apache.org/jira/browse/HBASE-4391 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 0.94.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.96.0 Attachments: HBASE-4391-v0.patch A common issue we've seen in practice is that users oversubscribe their region servers with too many MR tasks, etc. As soon as the machine starts swapping, the RS grinds to a halt, loses ZK session, aborts, etc. This can be combatted by starting the RS as root, calling mlockall(), and then setuid down to the hbase user. We should not require this, but we should provide it as an option.
[jira] [Updated] (HBASE-6199) Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING
[ https://issues.apache.org/jira/browse/HBASE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6199: - Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-6199) Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING
[ https://issues.apache.org/jira/browse/HBASE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6199: - Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-6199) Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING
[ https://issues.apache.org/jira/browse/HBASE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6199: - Attachment: 6199v4.txt Cleaner version (the v3 didn't apply to fresh trunk) Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING --- Key: HBASE-6199 URL: https://issues.apache.org/jira/browse/HBASE-6199 Project: HBase Issue Type: Improvement Reporter: stack Assignee: stack Attachments: 6199v4.txt, pending_open.txt, pending_open2.txt, pending_open3.txt PENDING_OPEN currently is a murky state. It's a master in-memory state, with no corresponding znode state, that sits between the OFFLINE and OPENING states. The OFFLINE state is set by the master when it goes to open a region. OPENING is set by the regionserver after it has assumed control of a region and is moving it through the OPENING process. PENDING_OPEN currently spans the open rpc invocation: the state is in place pre-open-rpc-invocation, during open-rpc-invocation, and post-rpc-invocation until we get the OPENING callback. That PENDING_OPEN covers so many different conditions effectively makes it unactionable. This issue proposes that PENDING_OPEN only be in place post-rpc-invocation. Its meaning then becomes clear: it is the interval between the rpc-open-invocation and our receiving the callback that sets RegionState to OPENING. PENDING_OPEN becomes actionable too, in that if a regionserver dies post rpc-open-invocation, we know that we can reassign the region. See https://issues.apache.org/jira/browse/HBASE-6060?focusedCommentId=13292646&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13292646 for more discussion.
[jira] [Updated] (HBASE-6194) add open time for a region and list recently closed regions in a regionserver UI
[ https://issues.apache.org/jira/browse/HBASE-6194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifei Ji updated HBASE-6194: - Description: The region server currently lists all the regions that it is hosting. It will be useful to report when those regions were opened on this server. It will also be useful to report what and when were the recent regions closed. (was: The region server currently lists all the region servers that it is hosting. It will be useful to report when those regions were opened on this server. It will also be useful to report what and when were the recent regions closed.) add open time for a region and list recently closed regions in a regionserver UI Key: HBASE-6194 URL: https://issues.apache.org/jira/browse/HBASE-6194 Project: HBase Issue Type: Improvement Reporter: Feifei Ji The region server currently lists all the regions that it is hosting. It will be useful to report when those regions were opened on this server. It will also be useful to report what and when were the recent regions closed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5947) Check for valid user/table/family/qualifier and acl state
[ https://issues.apache.org/jira/browse/HBASE-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293075#comment-13293075 ] Enis Soztutar commented on HBASE-5947: -- Then let's reduce the scope for this issue to be: - Check for table / cf existence in grant. Not sure about revoke, since we may end up in an inconsistent state between ACL and table metadata, so revoke can just remove what is available in the ACL table. - Ensure in preCreateTable that no table/cf/qualifier-level permissions are stored in the ACL table. Check for valid user/table/family/qualifier and acl state - Key: HBASE-5947 URL: https://issues.apache.org/jira/browse/HBASE-5947 Project: HBase Issue Type: Sub-task Components: security Affects Versions: 0.92.1, 0.94.0, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Labels: acl HBase Shell grant/revoke doesn't check for a valid user or table/family/qualifier, so you can end up having rights for something that doesn't exist. We might also want to ensure, upon table/column creation, that no entries are already stored in the acl table. We might still have residual acl entries if something goes wrong in postDeleteTable() or postDeleteColumn().
[jira] [Created] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
Jean-Daniel Cryans created HBASE-6200: - Summary: KeyComparator.compareWithoutRow can be wrong when families have the same prefix Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.92.1, 0.90.6 Reporter: Jean-Daniel Cryans Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 As reported by magma on IRC (AKA Desert Rose on the ML), {{Result}} behaves strangely when some families share the same prefix. He posted a link to his code to show how it fails: http://pastebin.com/7TBA1XGh Basically, {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers, so f:a is said to be bigger than f1:, which is false. What happens then is that the KVs are returned in the right order from the RS, but {{Result.binarySearch}} uses {{KeyComparator.compareWithoutRow}}, which sorts differently, so the end result is undetermined. I added some debug and I can see that the data is returned in the right order but {{Arrays.binarySearch}} returns the wrong KV, which is then verified against the passed family and qualifier; that check fails, so null is returned. I don't know how frequent it is for users to have families with the same prefix, but those who do, and who use those families at the same time, will have big correctness issues. This is why I mark this as a blocker.
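The f:a versus f1: example in HBASE-6200 can be reproduced with a small model. This is a sketch, assuming the faulty comparison treats family and qualifier as one concatenated run of bytes; the issue only says the comparator "doesn't differentiate families and qualifiers", so that is an interpretation. The class `FamilyPrefixSketch` and its methods are hypothetical and do not reproduce HBase's `KeyValue` layout or API.

```java
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical model of the ordering problem in HBASE-6200.
 * Not HBase code: it only contrasts two ways of ordering (family, qualifier).
 */
class FamilyPrefixSketch {
    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Assumed faulty behaviour: family and qualifier compared as one byte run. */
    static int compareConcatenated(String fam1, String qual1, String fam2, String qual2) {
        return compareBytes(bytes(fam1 + qual1), bytes(fam2 + qual2));
    }

    /** Family compared first; qualifier only breaks ties between equal families. */
    static int compareFamilyThenQualifier(String fam1, String qual1, String fam2, String qual2) {
        int d = compareBytes(bytes(fam1), bytes(fam2));
        return d != 0 ? d : compareBytes(bytes(qual1), bytes(qual2));
    }

    public static void main(String[] args) {
        // f:a against f1: (family "f1", empty qualifier)
        System.out.println(compareConcatenated("f", "a", "f1", "") > 0);        // prints true
        System.out.println(compareFamilyThenQualifier("f", "a", "f1", "") < 0); // prints true
    }
}
```

In the concatenated view the bytes are "fa" and "f1", and 'a' sorts after '1', so f:a looks bigger. With the family compared on its own, "f" is a proper prefix of "f1" and sorts first. A binary search over data sorted one way but probed with the other ordering can land on the wrong element, which matches the null result described in the report.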
[jira] [Commented] (HBASE-6092) Authorize flush, split, compact operations in AccessController
[ https://issues.apache.org/jira/browse/HBASE-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293122#comment-13293122 ] Andrew Purtell commented on HBASE-6092: --- Let's look at flush as an example. If the call path is HRegionServer.flushRegion -> HRegion.flushcache -> coprocessor.preFlush, then AccessController.getActiveUser will find the user making the request in the RPC request context. Otherwise getActiveUser will substitute the system principal. Thus ADMIN permission must be granted to the system principal or flushes will fail. ADMIN permission is granted to the system principal by AccessController code when it initializes. As long as that doesn't change, this is ok. Please update the comments in the tests you added. TestCompact does not verify that superuser and admin can create tables, etc. Otherwise the patch looks good. Authorize flush, split, compact operations in AccessController -- Key: HBASE-6092 URL: https://issues.apache.org/jira/browse/HBASE-6092 Project: HBase Issue Type: Sub-task Components: security Affects Versions: 0.94.0, 0.96.0, 0.94.1 Reporter: Laxman Assignee: Laxman Labels: acl, security Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6092.patch Currently, flush, split and compaction are not checked for authorization in AccessController. With the current implementation any unauthorized client can trigger these operations on a table.
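The fallback described in the HBASE-6092 comment can be sketched as follows. This is a simplified model under stated assumptions, not the AccessController implementation: the class `ActiveUserSketch`, the principal name `"hbase"`, and the use of a plain set of user names for ADMIN grants are all hypothetical. It only shows why the system principal needs ADMIN once flush is authorized.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical model of the active-user fallback discussed in HBASE-6092.
 * Not HBase code: names and the permission store are assumptions.
 */
class ActiveUserSketch {
    // Assumed name for the system principal; the real value depends on deployment.
    static final String SYSTEM_PRINCIPAL = "hbase";

    /**
     * Returns the user from the RPC request context when there is one,
     * otherwise substitutes the system principal.
     *
     * @param rpcUser the requesting user, or null when no RPC context exists
     */
    static String activeUser(String rpcUser) {
        return rpcUser != null ? rpcUser : SYSTEM_PRINCIPAL;
    }

    /** A flush is allowed only if the active user holds ADMIN. */
    static boolean mayFlush(String rpcUser, Set<String> adminUsers) {
        return adminUsers.contains(activeUser(rpcUser));
    }

    public static void main(String[] args) {
        Set<String> admins = new HashSet<>(Arrays.asList(SYSTEM_PRINCIPAL, "alice"));
        System.out.println(mayFlush(null, admins));  // no RPC context: system principal is checked
        System.out.println(mayFlush("bob", admins)); // RPC user without ADMIN
    }
}
```

If the system principal were ever dropped from the ADMIN set, every flush that runs without an RPC request context would be rejected, which is the dependency the comment points out.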
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-6200: -- Description: As reported by Sergio Esteves on IRC and on the ML, {{Result}} has a weird behavior when some families share the same prefix. He posted a link to his code to show how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers so f:a is said to be bigger than f1:, which is false. Then what happens is that the KVs are returned in the right order from the RS but then doing {{Result.binarySearch}} it uses {{KeyComparator.compareWithoutRow}} which has a different sorting so the end result is undetermined. I added some debug and I can see that the data is returned in the right order but {{Arrays.binarySearch}} returned the wrong KV, which is then verified agains the passed family and qualifier which fails so null is returned. I don't know how frequent it is for users to have families with the same prefix, but those that do have that and that use those families at the same time will have big correctness issues. This is why I mark this as a blocker. was: As reported by magma on IRC (AKA Desert Rose on the ML), Result}} has a weird behavior when some families share the same prefix. He posted a link to his code to show how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers so f:a is said to be bigger than f1:, which is false. Then what happens is that the KVs are returned in the right order from the RS but then doing {{Result.binarySearch}} it uses {{KeyComparator.compareWithoutRow}} which has a different sorting so the end result is undetermined. 
I added some debug and I can see that the data is returned in the right order but {{Arrays.binarySearch}} returned the wrong KV, which is then verified agains the passed family and qualifier which fails so null is returned. I don't know how frequent it is for users to have families with the same prefix, but those that do have that and that use those families at the same time will have big correctness issues. This is why I mark this as a blocker. KeyComparator.compareWithoutRow can be wrong when families have the same prefix --- Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Jean-Daniel Cryans Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 As reported by Sergio Esteves on IRC and on the ML, {{Result}} has a weird behavior when some families share the same prefix. He posted a link to his code to show how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers so f:a is said to be bigger than f1:, which is false. Then what happens is that the KVs are returned in the right order from the RS but then doing {{Result.binarySearch}} it uses {{KeyComparator.compareWithoutRow}} which has a different sorting so the end result is undetermined. I added some debug and I can see that the data is returned in the right order but {{Arrays.binarySearch}} returned the wrong KV, which is then verified agains the passed family and qualifier which fails so null is returned. I don't know how frequent it is for users to have families with the same prefix, but those that do have that and that use those families at the same time will have big correctness issues. This is why I mark this as a blocker. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-6188) Remove the concept of table owner
[ https://issues.apache.org/jira/browse/HBASE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293128#comment-13293128 ] Andrew Purtell commented on HBASE-6188: --- bq. DDL operations can't be done by ADMIN. I'm not sure there is a situation where it would make sense to disallow an administrator from making a DDL operation. You've convinced me of this: CREATE -> (DDL) CreateTable, AddColumn, DeleteColumn, DeleteTable, ModifyColumn, ModifyTable, DisableTable, EnableTable. ADMIN -> All of the above plus Flush, Split, Compact. It's not useful to give add/delete/modify schema privileges without enable/disable to have them take effect. So either we do the above or we get rid of CREATE. I think the above distinction is still useful. Thanks for having the discussion. Remove the concept of table owner - Key: HBASE-6188 URL: https://issues.apache.org/jira/browse/HBASE-6188 Project: HBase Issue Type: Sub-task Components: security Reporter: Andrew Purtell Assignee: Laxman Labels: security The table owner concept was a design simplification in the initial drop. First, the design changes under review mean only a user with GLOBAL CREATE permission can create a table, which will probably be an administrator. Then, granting implicit permissions may lead to oversights, and it adds unnecessary conditionals to our code. So instead the administrator with GLOBAL CREATE permission should make the appropriate grants at table create time.
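The CREATE/ADMIN split agreed in the HBASE-6188 comment amounts to two nested sets of operations. The sketch below encodes that mapping directly; the class `PermissionSketch` and the enum names are hypothetical labels for the operations listed in the comment, not HBase's permission types.

```java
import java.util.EnumSet;

/**
 * Hypothetical encoding of the CREATE vs ADMIN operation sets from HBASE-6188.
 * Not HBase code: enum names simply mirror the operations listed in the comment.
 */
class PermissionSketch {
    enum Op {
        CREATE_TABLE, ADD_COLUMN, DELETE_COLUMN, DELETE_TABLE,
        MODIFY_COLUMN, MODIFY_TABLE, DISABLE_TABLE, ENABLE_TABLE,
        FLUSH, SPLIT, COMPACT
    }

    enum Permission { CREATE, ADMIN }

    /** The DDL operations; EnumSet.range includes both end points. */
    static final EnumSet<Op> DDL = EnumSet.range(Op.CREATE_TABLE, Op.ENABLE_TABLE);

    /** CREATE allows the DDL set; ADMIN allows all of that plus flush, split, compact. */
    static EnumSet<Op> allowed(Permission p) {
        EnumSet<Op> ops = EnumSet.copyOf(DDL);
        if (p == Permission.ADMIN) {
            ops.addAll(EnumSet.of(Op.FLUSH, Op.SPLIT, Op.COMPACT));
        }
        return ops;
    }

    public static void main(String[] args) {
        System.out.println("CREATE: " + allowed(Permission.CREATE));
        System.out.println("ADMIN:  " + allowed(Permission.ADMIN));
    }
}
```

Keeping EnableTable and DisableTable inside the DDL set reflects the reasoning in the comment: schema changes granted without enable/disable could not take effect.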
[jira] [Comment Edited] (HBASE-6188) Remove the concept of table owner
[ https://issues.apache.org/jira/browse/HBASE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293128#comment-13293128 ] Andrew Purtell edited comment on HBASE-6188 at 6/11/12 10:10 PM: - bq. DDL operations can't be done by ADMIN. I'm not sure there is a situation where it would make sense to disallow an administrator from making a DDL operation. You've convinced me of this: CREATE -> (DDL) CreateTable, AddColumn, DeleteColumn, DeleteTable, ModifyColumn, ModifyTable, DisableTable, EnableTable ADMIN -> All of the above plus Flush, Split, Compact It's not useful to give add/delete/modify schema privileges without enable/disable to have them take effect. So either we do the above or we get rid of CREATE. I think the above distinction is still useful. Edit: I don't like that non-ADMIN can do enable/disable table, because it can really affect the cluster if the table is large. However I think on balance it would be more confusing than useful to remove EnableTable and DisableTable from the set of operations CREATE permission allows until online schema update-in-place without disable is always possible. Thanks for having the discussion. was (Author: apurtell): bq. DDL operations can't be done by ADMIN. I'm not sure there is a situation where it would make sense to disallow an administrator from making a DDL operation. You've convinced me of this: CREATE -> (DDL) CreateTable, AddColumn, DeleteColumn, DeleteTable, ModifyColumn, ModifyTable, DisableTable, EnableTable ADMIN -> All of the above plus Flush, Split, Compact It's not useful to give add/delete/modify schema privileges without enable/disable to have them take effect. So either we do the above or we get rid of CREATE. I think the above distinction is still useful. Thanks for having the discussion. 
Remove the concept of table owner - Key: HBASE-6188 URL: https://issues.apache.org/jira/browse/HBASE-6188 Project: HBase Issue Type: Sub-task Components: security Reporter: Andrew Purtell Assignee: Laxman Labels: security The table owner concept was a design simplification in the initial drop. First, the design changes under review means only a user with GLOBAL CREATE permission can create a table, which will probably be an administrator. Then, granting implicit permissions may lead to oversights and it adds unnecessary conditionals to our code. So instead the administrator with GLOBAL CREATE permission should make the appropriate grants at table create time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-6200: -- Description: As reported by Desert Rose on IRC and on the ML, {{Result}} has a weird behavior when some families share the same prefix. He posted a link to his code to show how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers so f:a is said to be bigger than f1:, which is false. Then what happens is that the KVs are returned in the right order from the RS but then doing {{Result.binarySearch}} it uses {{KeyComparator.compareWithoutRow}} which has a different sorting so the end result is undetermined. I added some debug and I can see that the data is returned in the right order but {{Arrays.binarySearch}} returned the wrong KV, which is then verified agains the passed family and qualifier which fails so null is returned. I don't know how frequent it is for users to have families with the same prefix, but those that do have that and that use those families at the same time will have big correctness issues. This is why I mark this as a blocker. was: As reported by Sergio Esteves on IRC and on the ML, {{Result}} has a weird behavior when some families share the same prefix. He posted a link to his code to show how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers so f:a is said to be bigger than f1:, which is false. Then what happens is that the KVs are returned in the right order from the RS but then doing {{Result.binarySearch}} it uses {{KeyComparator.compareWithoutRow}} which has a different sorting so the end result is undetermined. 
I added some debug and I can see that the data is returned in the right order but {{Arrays.binarySearch}} returned the wrong KV, which is then verified agains the passed family and qualifier which fails so null is returned. I don't know how frequent it is for users to have families with the same prefix, but those that do have that and that use those families at the same time will have big correctness issues. This is why I mark this as a blocker. Changing the user's name again at his request. KeyComparator.compareWithoutRow can be wrong when families have the same prefix --- Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Jean-Daniel Cryans Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 As reported by Desert Rose on IRC and on the ML, {{Result}} has a weird behavior when some families share the same prefix. He posted a link to his code to show how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers so f:a is said to be bigger than f1:, which is false. Then what happens is that the KVs are returned in the right order from the RS but then doing {{Result.binarySearch}} it uses {{KeyComparator.compareWithoutRow}} which has a different sorting so the end result is undetermined. I added some debug and I can see that the data is returned in the right order but {{Arrays.binarySearch}} returned the wrong KV, which is then verified agains the passed family and qualifier which fails so null is returned. I don't know how frequent it is for users to have families with the same prefix, but those that do have that and that use those families at the same time will have big correctness issues. This is why I mark this as a blocker. -- This message is automatically generated by JIRA. 
[jira] [Updated] (HBASE-6182) Make HBase works with jdk1.7
[ https://issues.apache.org/jira/browse/HBASE-6182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6182: --- Attachment: large-tests.log medium-tests.log Some medium and large tests still fail. If someone wants to help, please file a sub-task; otherwise, I will work on them one by one, starting with the medium tests. Make HBase works with jdk1.7 Key: HBASE-6182 URL: https://issues.apache.org/jira/browse/HBASE-6182 Project: HBase Issue Type: Task Reporter: Jimmy Xiang Fix For: 0.96.0 Attachments: large-tests.log, medium-tests.log jdk1.7 has been out for a while. HBase should support it.
[jira] [Commented] (HBASE-5874) When 'fs.default.name' not configured, the hbck tool and Merge tool throw IllegalArgumentException.
[ https://issues.apache.org/jira/browse/HBASE-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293235#comment-13293235 ] fulin wang commented on HBASE-5874: --- Thanks Hudson for your attention. When 'fs.default.name' not configured, the hbck tool and Merge tool throw IllegalArgumentException. --- Key: HBASE-5874 URL: https://issues.apache.org/jira/browse/HBASE-5874 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.90.6 Reporter: fulin wang Assignee: fulin wang Fix For: 0.90.7, 0.96.0, 0.94.1, 0.92.3 Attachments: HBASE-5874-0.90-v2.patch, HBASE-5874-0.90.patch, HBASE-5874-trunk-v2.patch, HBASE-5874-trunk-v3.patch, HBASE-5874-trunk.patch When HBase does not configure the 'fs.default.name' attribute, the hbck tool and the Merge tool throw IllegalArgumentException. We should set the 'fs.default.name' attribute in the code of the hbck tool and the Merge tool. hbck exception: Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://160.176.0.101:9000/hbase/.META./1028785192/.regioninfo, expected: file:/// at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:412) at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:59) at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:382) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:285) at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:128) at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:301) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:489) at org.apache.hadoop.hbase.util.HBaseFsck.loadHdfsRegioninfo(HBaseFsck.java:565) at org.apache.hadoop.hbase.util.HBaseFsck.loadHdfsRegionInfos(HBaseFsck.java:596) at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:332) at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:360) at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2907)
Merge exception: [2012-05-05 10:48:24,830] [ERROR] [main] [org.apache.hadoop.hbase.util.Merge 381] exiting due to error java.lang.IllegalArgumentException: Wrong FS: hdfs://160.176.0.101:9000/hbase/.META./1028785192/.regioninfo, expected: file:/// at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:412) at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:59) at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:382) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:285) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:823) at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:415) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2679) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2665) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2634) at org.apache.hadoop.hbase.util.MetaUtils.openMetaRegion(MetaUtils.java:276) at org.apache.hadoop.hbase.util.MetaUtils.scanMetaRegion(MetaUtils.java:261) at org.apache.hadoop.hbase.util.Merge.run(Merge.java:115) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.Merge.main(Merge.java:379)
[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293238#comment-13293238 ] Hadoop QA commented on HBASE-5564: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12531698/HBASE-5564.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2136//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2136//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2136//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2136//console This message is automatically generated. 
Bulkload is discarding duplicate records Key: HBASE-5564 URL: https://issues.apache.org/jira/browse/HBASE-5564 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.96.0 Environment: HBase 0.92 Reporter: Laxman Assignee: Laxman Labels: bulkloader Fix For: 0.96.0 Attachments: 5564.lint, 5564v5.txt, HBASE-5564.patch, HBASE-5564_trunk.1.patch, HBASE-5564_trunk.1.patch, HBASE-5564_trunk.2.patch, HBASE-5564_trunk.3.patch, HBASE-5564_trunk.4_final.patch, HBASE-5564_trunk.patch Duplicate records are getting discarded when they exist in the same input file, and more specifically when they exist in the same split. Duplicate records are retained if the records come from different splits. Version under test: HBase 0.92
[jira] [Commented] (HBASE-6199) Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING
[ https://issues.apache.org/jira/browse/HBASE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293240#comment-13293240 ] Hadoop QA commented on HBASE-6199: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12531715/6199v4.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2135//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2135//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2135//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2135//console This message is automatically generated. 
Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING --- Key: HBASE-6199 URL: https://issues.apache.org/jira/browse/HBASE-6199 Project: HBase Issue Type: Improvement Reporter: stack Assignee: stack Attachments: 6199v4.txt, pending_open.txt, pending_open2.txt, pending_open3.txt PENDING_OPEN currently is a murky state. It's a master in-memory state, with no corresponding znode state, that sits between the OFFLINE and OPENING states. The OFFLINE state is set by the master when it goes to open a region. OPENING is set by the regionserver after it has assumed control of a region and is moving it through the OPENING process. PENDING_OPEN currently spans the open rpc invocation: the state is in place pre-open-rpc-invocation, during open-rpc-invocation, and post-rpc-invocation until we get the OPENING callback. That PENDING_OPEN covers so many different conditions effectively makes it unactionable. This issue proposes that PENDING_OPEN only be in place post-rpc-invocation. Its meaning then becomes clear: it is the interval between the rpc-open-invocation and our receiving the callback that sets RegionState to OPENING. PENDING_OPEN becomes actionable too, in that if a regionserver dies post rpc-open-invocation, we know that we can reassign the region. See https://issues.apache.org/jira/browse/HBASE-6060?focusedCommentId=13292646&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13292646 for more discussion.
[jira] [Commented] (HBASE-6012) Handling RegionOpeningState for bulk assign
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293267#comment-13293267 ] Hadoop QA commented on HBASE-6012: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12531650/HBASE-6012v6.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestAssignmentManager org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2137//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2137//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2137//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2137//console This message is automatically generated. 
Handling RegionOpeningState for bulk assign --- Key: HBASE-6012 URL: https://issues.apache.org/jira/browse/HBASE-6012 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6012.patch, HBASE-6012v2.patch, HBASE-6012v3.patch, HBASE-6012v4.patch, HBASE-6012v5.patch, HBASE-6012v6.patch
Since HBASE-5914, we use bulk assign for SSH. But in the bulk assign case, if we get an ALREADY_OPENED result, there is no one to clear the znode created by bulk assign. Another thing: when an RS is opening a list of regions, if one region is already in transition, it throws RegionAlreadyInTransitionException and stops opening the other regions.
[jira] [Commented] (HBASE-6195) Increment data will be lost when the memstore is flushed
[ https://issues.apache.org/jira/browse/HBASE-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293268#comment-13293268 ] Hadoop QA commented on HBASE-6195: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12531685/HBASE-6195-trunk-V3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2138//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2138//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2138//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2138//console This message is automatically generated. 
Increment data will be lost when the memstore is flushed Key: HBASE-6195 URL: https://issues.apache.org/jira/browse/HBASE-6195 Project: HBase Issue Type: Bug Components: regionserver Reporter: Xing Shi Assignee: ShiXing Attachments: HBASE-6195-trunk-V2.patch, HBASE-6195-trunk-V3.patch, HBASE-6195-trunk.patch
There are two problems in increment() now.
First: the timestamp (the variable now) in HRegion's increment() is generated before the row lock is acquired, so when multiple threads increment the same row, a thread that generated its timestamp earlier may get the lock later. Because increment stores just one version, the result is still right up to this point. When the region is flushing, however, an increment reads the KV from both the snapshot and the memstore, takes the one whose timestamp is larger, and writes the result back to the memstore. If the snapshot's timestamp is larger than the memstore's, the increment reads the old data and increments that, which is wrong.
Second: there is also a risk in increment because it writes the memstore first and then the HLog, so if the HLog write fails, the client will still read the incremented value.
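To make the two ordering problems concrete, here is a minimal sketch, not the HBASE-6195 patch and not HRegion code. The class IncrementSketch, its fields, and the failNextLogWrite() hook are hypothetical stand-ins. It only illustrates the two orderings the description asks for: take the timestamp after the row lock is held, and write the log before updating the in-memory value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongSupplier;

/** Hypothetical single-row counter used only for illustration; not HRegion code. */
final class IncrementSketch {
    private final ReentrantLock rowLock = new ReentrantLock();
    private final List<String> log = new ArrayList<>(); // stand-in for the HLog
    private final LongSupplier clock;
    private long value;                // stand-in for the value held in the memstore
    private long timestamp;            // timestamp of the stored value
    private boolean failNextLogWrite;  // test hook that simulates a failed log write

    IncrementSketch(LongSupplier clock) {
        this.clock = clock;
    }

    long increment(long amount) {
        rowLock.lock();
        try {
            // Timestamp is taken only after the lock is held, so lock order
            // and timestamp order cannot disagree.
            long now = clock.getAsLong();
            long updated = value + amount;
            appendToLog(updated, now); // log first; throws if the write fails
            value = updated;           // memory is updated only after the log write
            timestamp = now;
            return updated;
        } finally {
            rowLock.unlock();
        }
    }

    long read() {
        rowLock.lock();
        try {
            return value;
        } finally {
            rowLock.unlock();
        }
    }

    long lastTimestamp() {
        rowLock.lock();
        try {
            return timestamp;
        } finally {
            rowLock.unlock();
        }
    }

    void failNextLogWrite() {
        failNextLogWrite = true;
    }

    private void appendToLog(long updated, long now) {
        if (failNextLogWrite) {
            failNextLogWrite = false;
            throw new IllegalStateException("simulated log write failure");
        }
        log.add(now + ":" + updated);
    }
}
```

With this ordering, a failed log write leaves the readable value untouched, which is the behaviour the second point asks for.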
[jira] [Commented] (HBASE-6012) Handling RegionOpeningState for bulk assign
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293270#comment-13293270 ] Zhihong Ted Yu commented on HBASE-6012: --- There was an NPE in TestAssignmentManager#testSSHWhenSplitRegionInProgress. Please fix. Thanks
Handling RegionOpeningState for bulk assign --- Key: HBASE-6012 URL: https://issues.apache.org/jira/browse/HBASE-6012 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6012.patch, HBASE-6012v2.patch, HBASE-6012v3.patch, HBASE-6012v4.patch, HBASE-6012v5.patch, HBASE-6012v6.patch
Since HBASE-5914, we use bulk assign for SSH. But in the bulk assign case, if we get an ALREADY_OPENED result, there is no one to clear the znode created by bulk assign. Another thing: when an RS is opening a list of regions, if one region is already in transition, it throws RegionAlreadyInTransitionException and stops opening the other regions.
[jira] [Created] (HBASE-6201) HBase integration/system tests
Enis Soztutar created HBASE-6201: Summary: HBase integration/system tests Key: HBASE-6201 URL: https://issues.apache.org/jira/browse/HBASE-6201 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.96.0 Reporter: Enis Soztutar Assignee: Enis Soztutar
Integration and general system tests have been discussed previously, and the conclusion is that we need to unify how we do release candidate testing (HBASE-6091). In this issue, I would like to discuss and agree on a general plan, and open subtickets for execution so that we can carry out most of the tests in HBASE-6091 automatically. Initially, here is what I have in mind:
1. Create hbase-it (or hbase-tests) containing a forward port of HBASE-4454 (without any tests). This will allow integration tests to be run with {code} mvn verify {code}
2. Add the ability to run all integration/system tests on a given cluster. Something like: {code} mvn verify -Dconf=/etc/hbase/conf/ {code} should run the test suite on the given cluster. (Right now we can launch some of the tests (TestAcidGuarantees) from the command line.) Most of the system tests will be client side, and interface with the cluster through public APIs. We need a tool on top of MiniHBaseCluster, or to improve HBaseTestingUtility, so that tests can interface with the mini cluster or the actual cluster uniformly.
3. Port candidate unit tests to the integration tests module. Some of the candidates are:
 - TestAcidGuarantees / TestAtomicOperation
 - TestRegionBalancing (HBASE-6053)
 - TestFullLogReconstruction
 - TestMasterFailover
 - TestImportExport
 - TestMultiVersions / TestKeepDeletes
 - TestFromClientSide
 - TestShell and src/test/ruby
 - TestRollingRestart
 - Test**OnCluster
 - Balancer tests
 These tests should continue to be run as unit tests w/o any change in semantics. However, given an actual cluster, they should use that, instead of spinning up a mini cluster.
4. Add more tests, especially long running ingestion tests (goraci, BigTop's TestLoadAndVerify, LoadTestTool), and chaos monkey style fault tests.
All suggestions welcome.
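Step 2 asks for a way to let tests talk to a mini cluster or an actual cluster uniformly. A rough sketch of that idea follows. ClusterHandle, MiniClusterHandle, DistributedClusterHandle and ClusterHandles are all hypothetical names, and nothing here is an existing HBase API. The only detail taken from the issue is the conf property from the mvn verify -Dconf=... example. A real implementation would presumably wrap MiniHBaseCluster or HBaseTestingUtility.

```java
/** Hypothetical abstraction; none of these names exist in HBase. */
interface ClusterHandle {
    boolean isDistributed();
    String describe();
}

/** Stand-in for a handle backed by an in-process mini cluster. */
final class MiniClusterHandle implements ClusterHandle {
    @Override
    public boolean isDistributed() {
        return false;
    }

    @Override
    public String describe() {
        return "in-process mini cluster";
    }
}

/** Stand-in for a handle backed by an actual cluster described by a conf directory. */
final class DistributedClusterHandle implements ClusterHandle {
    private final String confDir;

    DistributedClusterHandle(String confDir) {
        this.confDir = confDir;
    }

    @Override
    public boolean isDistributed() {
        return true;
    }

    @Override
    public String describe() {
        return "cluster configured in " + confDir;
    }
}

final class ClusterHandles {
    private ClusterHandles() {
    }

    /** No conf directory means "spin up a mini cluster"; otherwise use the given cluster. */
    static ClusterHandle forConfDir(String confDir) {
        if (confDir == null || confDir.isEmpty()) {
            return new MiniClusterHandle();
        }
        return new DistributedClusterHandle(confDir);
    }

    /** Reads the "conf" system property, as in the -Dconf=... example. */
    static ClusterHandle fromSystemProperty() {
        return forConfDir(System.getProperty("conf"));
    }
}
```

A test written against ClusterHandle would then keep running as a unit test by default and use the actual cluster when one is given, which is the behaviour step 3 describes.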
[jira] [Commented] (HBASE-6199) Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING
[ https://issues.apache.org/jira/browse/HBASE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293301#comment-13293301 ] Hadoop QA commented on HBASE-6199: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12531715/6199v4.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2140//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2140//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2140//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2140//console This message is automatically generated. Change PENDING_OPEN scope from pre-rpc open to OPENING to just post-rpc open to OPENING --- Key: HBASE-6199 URL: https://issues.apache.org/jira/browse/HBASE-6199 Project: HBase Issue Type: Improvement Reporter: stack Assignee: stack Attachments: 6199v4.txt, pending_open.txt, pending_open2.txt, pending_open3.txt PENDING_OPEN currently is a murky state. 
It's a master in-memory state, with no corresponding znode state, that sits between the OFFLINE and OPENING states. The OFFLINE state is set by the master when it goes to open a region. OPENING is set by the regionserver after it has assumed control of a region and is moving it through the OPENING process. PENDING_OPEN currently spans the open rpc invocation: this state is in place pre-open-rpc-invocation, during open-rpc-invocation, and post-rpc-invocation until we get the OPENING callback. That PENDING_OPEN covers so many different conditions effectively makes it unactionable. This issue proposes that PENDING_OPEN only be in place post-rpc-invocation. Its meaning then becomes clear: the space between rpc-open-invocation and our receiving the callback which sets RegionState to OPENING. PENDING_OPEN becomes actionable too, in that if a regionserver dies post rpc-open-invocation, we know that we can reassign the region. See https://issues.apache.org/jira/browse/HBASE-6060?focusedCommentId=13292646&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13292646 for more discussion.
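The narrowed scope can be pictured as a tiny state machine. The sketch below is an illustration under assumptions, not AssignmentManager code. RegionTransitionSketch and its method names are hypothetical, and only the three states named in the description are modelled.

```java
/** Hypothetical, simplified set of states; the real master tracks more. */
enum RegionState { OFFLINE, PENDING_OPEN, OPENING }

/** Sketch of one region's transitions as seen by the master under the proposal. */
final class RegionTransitionSketch {
    private RegionState state = RegionState.OFFLINE;

    RegionState state() {
        return state;
    }

    /** The master has sent the open rpc; only now does PENDING_OPEN begin. */
    void openRpcSent() {
        expect(RegionState.OFFLINE);
        state = RegionState.PENDING_OPEN;
    }

    /** Callback from the regionserver that has taken control of the region. */
    void openingCallbackReceived() {
        expect(RegionState.PENDING_OPEN);
        state = RegionState.OPENING;
    }

    /**
     * With the narrowed scope, PENDING_OPEN always means "rpc sent, no OPENING
     * callback yet", so a regionserver death in this state can be acted on.
     * Returns false for other states only because this sketch does not model
     * how they are handled.
     */
    boolean pendingOpenAndReassignable() {
        return state == RegionState.PENDING_OPEN;
    }

    private void expect(RegionState expected) {
        if (state != expected) {
            throw new IllegalStateException("expected " + expected + " but was " + state);
        }
    }
}
```

The point of the design is that the state alone tells the master what happened: there is no longer a PENDING_OPEN in which the rpc may or may not have been sent.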
[jira] [Updated] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chunhui shen updated HBASE-6134: Attachment: HBASE-6134v3-92.patch Uploading patch v3 for 0.92
Improvement for split-worker to speed up distributed-split-log -- Key: HBASE-6134 URL: https://issues.apache.org/jira/browse/HBASE-6134 Project: HBase Issue Type: Improvement Components: wal Reporter: chunhui shen Assignee: chunhui shen Priority: Critical Fix For: 0.96.0 Attachments: HBASE-6134.patch, HBASE-6134v2.patch, HBASE-6134v3-92.patch, HBASE-6134v3.patch
First, we tested local-master-splitting against distributed-log-splitting.
Environment: 34 hlog files, 5 regionservers (after killing one, only 4 regionservers do the splitting work), 400 regions in one hlog file.
local-master-split: 60s+
distributed-log-splitting: 165s+
In fact, in our production environment, distributed-log-splitting also took 60s with 30 regionservers for 34 hlog files (the regionservers may have been under high load).
We found that a split-worker took about 20s to split one log file (30ms~50ms per writer.close(); 10ms per writer creation).
I think we could improve this by parallelizing the creation and closing of writers in threads.
The patch changes the logic of distributed-log-splitting to be the same as local-master-splitting and parallelizes the close in threads.
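As a rough check on the numbers above, assuming one writer per region (the message does not say so explicitly): 400 writers closed one after another at 30ms~50ms each come to 12s~20s, which is in line with the reported 20s per log file. The following is a minimal sketch of closing writers in a thread pool. It is not the attached patch. ParallelCloseSketch and closeAll are hypothetical names, and java.io.Closeable stands in for the real writer type.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: closes many writers on a fixed-size thread pool. */
final class ParallelCloseSketch {
    private ParallelCloseSketch() {
    }

    /** Closes every writer using the given number of threads; returns how many were closed. */
    static int closeAll(List<? extends Closeable> writers, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one close task per writer so slow closes overlap.
            List<Future<?>> pending = new ArrayList<>();
            for (Closeable writer : writers) {
                pending.add(pool.submit(() -> {
                    writer.close();
                    return null;
                }));
            }
            // Wait for all of them and surface the first failure.
            int closed = 0;
            for (Future<?> f : pending) {
                try {
                    f.get();
                    closed++;
                } catch (ExecutionException e) {
                    throw new UncheckedIOException(new IOException("close failed", e.getCause()));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while closing writers", e);
                }
            }
            return closed;
        } finally {
            pool.shutdown();
        }
    }
}
```

With N threads the close phase would take roughly 1/N of the sequential time, provided the underlying filesystem can serve the closes concurrently.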
[jira] [Commented] (HBASE-6092) Authorize flush, split, compact operations in AccessController
[ https://issues.apache.org/jira/browse/HBASE-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293309#comment-13293309 ] Hadoop QA commented on HBASE-6092: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12531662/HBASE-6092.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.coprocessor.TestRowProcessorEndpoint org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster org.apache.hadoop.hbase.regionserver.TestSplitLogWorker org.apache.hadoop.hbase.replication.TestReplication org.apache.hadoop.hbase.master.TestSplitLogManager Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2139//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2139//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2139//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2139//console This message is automatically generated. 
Authorize flush, split, compact operations in AccessController -- Key: HBASE-6092 URL: https://issues.apache.org/jira/browse/HBASE-6092 Project: HBase Issue Type: Sub-task Components: security Affects Versions: 0.94.0, 0.96.0, 0.94.1 Reporter: Laxman Assignee: Laxman Labels: acl, security Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6092.patch Currently, flush, split and compaction are not checked for authorization in AccessController. With the current implementation any unauthorized client can trigger these operations on a table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6012) Handling RegionOpeningState for bulk assign
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chunhui shen updated HBASE-6012: Attachment: HBASE-6012v7.patch Uploading patch v7, which fixes the two failed test cases
Handling RegionOpeningState for bulk assign --- Key: HBASE-6012 URL: https://issues.apache.org/jira/browse/HBASE-6012 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6012.patch, HBASE-6012v2.patch, HBASE-6012v3.patch, HBASE-6012v4.patch, HBASE-6012v5.patch, HBASE-6012v6.patch, HBASE-6012v7.patch
Since HBASE-5914, we use bulk assign for SSH. But in the bulk assign case, if we get an ALREADY_OPENED result, there is no one to clear the znode created by bulk assign. Another thing: when an RS is opening a list of regions, if one region is already in transition, it throws RegionAlreadyInTransitionException and stops opening the other regions.
[jira] [Updated] (HBASE-6195) Increment data will be lost when the memstore is flushed
[ https://issues.apache.org/jira/browse/HBASE-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xing Shi updated HBASE-6195: Attachment: HBASE-6195-trunk-V4.patch
// In patch v3:
// + long now = EnvironmentEdgeManager.currentTimeMillis();
//   Integer lid = getLock(lockid, row, true);
// Variable now isn't actually referenced. Do we need it?
The variable now is used to generate the new KV. V3 also generated the variable before taking the lock; V4 fixes that.
// + //store the kvs to the tmp memory for write hlog first, then write memory
// The above should read: 'to temporary memstore before writing HLog'
Let me think about how to show the problem in a unit test.
Increment data will be lost when the memstore is flushed Key: HBASE-6195 URL: https://issues.apache.org/jira/browse/HBASE-6195 Project: HBase Issue Type: Bug Components: regionserver Reporter: Xing Shi Assignee: ShiXing Attachments: HBASE-6195-trunk-V2.patch, HBASE-6195-trunk-V3.patch, HBASE-6195-trunk-V4.patch, HBASE-6195-trunk.patch
There are two problems in increment() now.
First: the timestamp (the variable now) in HRegion's increment() is generated before the row lock is acquired, so when multiple threads increment the same row, a thread that generated its timestamp earlier may get the lock later. Because increment stores just one version, the result is still right up to this point. When the region is flushing, however, an increment reads the KV from both the snapshot and the memstore, takes the one whose timestamp is larger, and writes the result back to the memstore. If the snapshot's timestamp is larger than the memstore's, the increment reads the old data and increments that, which is wrong.
Second: there is also a risk in increment because it writes the memstore first and then the HLog, so if the HLog write fails, the client will still read the incremented value.
[jira] [Updated] (HBASE-6012) Handling RegionOpeningState for bulk assign
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chunhui shen updated HBASE-6012: Attachment: HBASE-6012v8.patch Uploading patch v8, incorporating Ted's review board comments
Handling RegionOpeningState for bulk assign --- Key: HBASE-6012 URL: https://issues.apache.org/jira/browse/HBASE-6012 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6012.patch, HBASE-6012v2.patch, HBASE-6012v3.patch, HBASE-6012v4.patch, HBASE-6012v5.patch, HBASE-6012v6.patch, HBASE-6012v7.patch, HBASE-6012v8.patch
Since HBASE-5914, we use bulk assign for SSH. But in the bulk assign case, if we get an ALREADY_OPENED result, there is no one to clear the znode created by bulk assign. Another thing: when an RS is opening a list of regions, if one region is already in transition, it throws RegionAlreadyInTransitionException and stops opening the other regions.
[jira] [Commented] (HBASE-6188) Remove the concept of table owner
[ https://issues.apache.org/jira/browse/HBASE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293324#comment-13293324 ] ramkrishna.s.vasudevan commented on HBASE-6188: --- bq.You've convinced me of this: bq.CREATE -(DDL) CreateTable, AddColumn, DeleteColumn, DeleteTable, ModifyColumn, ModifyTable, DisableTable, EnableTable bq. ADMIN - All of the above plus Flush, Split, Compact Thanks Andy. Remove the concept of table owner - Key: HBASE-6188 URL: https://issues.apache.org/jira/browse/HBASE-6188 Project: HBase Issue Type: Sub-task Components: security Reporter: Andrew Purtell Assignee: Laxman Labels: security The table owner concept was a design simplification in the initial drop. First, the design changes under review means only a user with GLOBAL CREATE permission can create a table, which will probably be an administrator. Then, granting implicit permissions may lead to oversights and it adds unnecessary conditionals to our code. So instead the administrator with GLOBAL CREATE permission should make the appropriate grants at table create time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4064) Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long...
[ https://issues.apache.org/jira/browse/HBASE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293328#comment-13293328 ] ramkrishna.s.vasudevan commented on HBASE-4064: --- @Steven Can you provide us some logs to see the problem. Because its important we solve such issues. May be it exists in other versions also? Thanks Steve. Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long... Key: HBASE-4064 URL: https://issues.apache.org/jira/browse/HBASE-4064 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.3 Reporter: Jieshan Bean Fix For: 0.90.7 Attachments: HBASE-4064-v1.patch, HBASE-4064_branch90V2.patch, disableflow.png 1. If there is a rubbish RegionState object with PENDING_CLOSE in regionsInTransition(The RegionState was remained by some exception which should be removed, that's why I called it as rubbish object), but the region is not currently assigned anywhere, TimeoutMonitor will fall into an endless loop: 2011-06-27 10:32:21,326 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301 2011-06-27 10:32:21,326 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 2011-06-27 10:32:21,438 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining) 2011-06-27 10:32:21,441 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
but it is not currently assigned anywhere 2011-06-27 10:32:31,207 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301 2011-06-27 10:32:31,207 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 2011-06-27 10:32:31,215 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining) 2011-06-27 10:32:31,215 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is not currently assigned anywhere 2011-06-27 10:32:41,164 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301 2011-06-27 10:32:41,164 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 2011-06-27 10:32:41,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining) 2011-06-27 10:32:41,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is not currently assigned anywhere . 
2. In the following scenario, two concurrent unassign calls on the same region may lead to the above problem: the first unassign call sends its rpc successfully; the master sees the RS_ZK_REGION_CLOSED event, processes it, and creates a ClosedRegionHandler to remove the region's state in the master. E.g., while ClosedRegionHandler is running in an hbase.master.executor.closeregion.threads thread (A), another unassign call on the same region runs in another thread (B). When thread B evaluates if (!regions.containsKey(region)), this.regions still has the region info; now the CPU switches to thread A. Thread A removes the region from this.regions and regionsInTransition, then the CPU switches back to thread B. Thread B continues and throws an exception with the message "Server null returned java.lang.NullPointerException: Passed server is null for 9a6e26d40293663a79523c58315b930f", but without removing the newly added RegionState from regionsInTransition, so it can never be removed. public
[jira] [Commented] (HBASE-6188) Remove the concept of table owner
[ https://issues.apache.org/jira/browse/HBASE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293332#comment-13293332 ] Laxman commented on HBASE-6188: ---
bq. CREATE -(DDL) CreateTable, AddColumn, DeleteColumn, DeleteTable, ModifyColumn, ModifyTable, DisableTable, EnableTable
bq. ADMIN - All of the above plus Flush, Split, Compact
Thanks a lot Andy. I will consider this. I didn't consider flush, split and compact as the context is DDL.
Remove the concept of table owner - Key: HBASE-6188 URL: https://issues.apache.org/jira/browse/HBASE-6188 Project: HBase Issue Type: Sub-task Components: security Reporter: Andrew Purtell Assignee: Laxman Labels: security
The table owner concept was a design simplification in the initial drop. First, the design changes under review means only a user with GLOBAL CREATE permission can create a table, which will probably be an administrator. Then, granting implicit permissions may lead to oversights and it adds unnecessary conditionals to our code. So instead the administrator with GLOBAL CREATE permission should make the appropriate grants at table create time.
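The CREATE/ADMIN split quoted above can be written down as a small lookup. This is only a sketch of the mapping as stated in the comment, not AccessController code. TableAction, Grant and PermissionSketch are hypothetical names, and real checks would also involve users, tables and scopes.

```java
import java.util.EnumSet;
import java.util.Set;

/** Operations named in the quoted comment; DDL operations come first, in order. */
enum TableAction {
    CREATE_TABLE, ADD_COLUMN, DELETE_COLUMN, DELETE_TABLE,
    MODIFY_COLUMN, MODIFY_TABLE, DISABLE_TABLE, ENABLE_TABLE,
    FLUSH, SPLIT, COMPACT
}

/** Hypothetical stand-in for the two permissions being discussed. */
enum Grant { CREATE, ADMIN }

final class PermissionSketch {
    private PermissionSketch() {
    }

    /** CREATE covers the DDL operations; ADMIN covers those plus flush, split and compact. */
    static Set<TableAction> actionsFor(Grant grant) {
        switch (grant) {
            case CREATE:
                return EnumSet.range(TableAction.CREATE_TABLE, TableAction.ENABLE_TABLE);
            case ADMIN:
                return EnumSet.allOf(TableAction.class);
            default:
                throw new IllegalArgumentException("unknown grant: " + grant);
        }
    }

    /** True if any of the caller's grants covers the requested action. */
    static boolean isAuthorized(Set<Grant> granted, TableAction action) {
        for (Grant g : granted) {
            if (actionsFor(g).contains(action)) {
                return true;
            }
        }
        return false;
    }
}
```

Under this mapping a caller holding only CREATE would be refused a flush, split or compact, while a caller with no grants is refused everything.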
[jira] [Commented] (HBASE-6012) Handling RegionOpeningState for bulk assign
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293343#comment-13293343 ] Hadoop QA commented on HBASE-6012: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12531768/HBASE-6012v8.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestAtomicOperation org.apache.hadoop.hbase.replication.TestReplication Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2141//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2141//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2141//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2141//console This message is automatically generated. 
Handling RegionOpeningState for bulk assign --- Key: HBASE-6012 URL: https://issues.apache.org/jira/browse/HBASE-6012 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6012.patch, HBASE-6012v2.patch, HBASE-6012v3.patch, HBASE-6012v4.patch, HBASE-6012v5.patch, HBASE-6012v6.patch, HBASE-6012v7.patch, HBASE-6012v8.patch
Since HBASE-5914, we use bulk assign for SSH. But in the bulk assign case, if we get an ALREADY_OPENED result, there is no one to clear the znode created by bulk assign. Another thing: when an RS is opening a list of regions, if one region is already in transition, it throws RegionAlreadyInTransitionException and stops opening the other regions.
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293345#comment-13293345 ] Hadoop QA commented on HBASE-6134: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12531760/HBASE-6134v3-92.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2142//console This message is automatically generated. Improvement for split-worker to speed up distributed-split-log -- Key: HBASE-6134 URL: https://issues.apache.org/jira/browse/HBASE-6134 Project: HBase Issue Type: Improvement Components: wal Reporter: chunhui shen Assignee: chunhui shen Priority: Critical Fix For: 0.96.0 Attachments: HBASE-6134.patch, HBASE-6134v2.patch, HBASE-6134v3-92.patch, HBASE-6134v3.patch First, we compared local-master-splitting with distributed-log-splitting. Environment: 34 hlog files, 5 regionservers (after killing one, only 4 RSs do the splitting work), 400 regions in one hlog file. local-master-split: 60s+ distributed-log-splitting: 165s+ In fact, in our production environment, distributed-log-splitting also took 60s with 30 regionservers for 34 hlog files (the regionservers may have been under high load). We found that a split-worker took about 20s to split one log file (30ms~50ms per writer.close(); 10ms per writer creation). I think we could improve this by parallelizing the creation and closing of writers in threads. In the patch, the logic for distributed-log-splitting is changed to match local-master-splitting, and the close is parallelized in threads.
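As a rough illustration of the idea in the description (closing many writers in parallel rather than one after another, at 30ms~50ms per close), here is a minimal sketch using a plain java.util.concurrent thread pool. It is not the code from the attached patches: ParallelWriterClose, closeAll and the threads parameter are hypothetical names, and the writers are modelled as generic java.io.Closeable objects rather than HBase log writers.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelWriterClose {
    /**
     * Closes every writer on a fixed-size thread pool (threads must be positive).
     * All closes are attempted; if any of them fails, the first failure
     * (in list order) is rethrown after the others have finished.
     */
    public static void closeAll(List<? extends Closeable> writers, int threads)
            throws IOException {
        if (writers.isEmpty()) {
            return;
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> pending = new ArrayList<>(writers.size());
            for (Closeable writer : writers) {
                Callable<Void> task = () -> {
                    writer.close();
                    return null;
                };
                pending.add(pool.submit(task));
            }
            IOException first = null;
            for (Future<Void> result : pending) {
                try {
                    result.get(); // wait for this close to finish
                } catch (ExecutionException e) {
                    if (first == null) {
                        Throwable cause = e.getCause();
                        first = (cause instanceof IOException)
                                ? (IOException) cause
                                : new IOException(cause);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IOException("interrupted while closing writers", e);
                }
            }
            if (first != null) {
                throw first;
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

With 400 regions in one hlog file and tens of milliseconds per close, a sequential loop spends on the order of the 20s reported above; a pool spreads that wait across threads. The right pool size is a tuning question this sketch does not answer.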
[jira] [Commented] (HBASE-6012) Handling RegionOpeningState for bulk assign
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293356#comment-13293356 ] Zhihong Ted Yu commented on HBASE-6012: --- I ran the two failed tests manually and they passed. Will integrate tomorrow if there is no objection. Handling RegionOpeningState for bulk assign --- Key: HBASE-6012 URL: https://issues.apache.org/jira/browse/HBASE-6012 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6012.patch, HBASE-6012v2.patch, HBASE-6012v3.patch, HBASE-6012v4.patch, HBASE-6012v5.patch, HBASE-6012v6.patch, HBASE-6012v7.patch, HBASE-6012v8.patch Since HBASE-5914, we use bulk assign for SSH (ServerShutdownHandler). But in the bulk assign case, if we get ALREADY_OPENED, nothing clears the znode created by bulk assign. Also, when an RS is opening a list of regions and one region is already in transition, it throws RegionAlreadyInTransitionException and stops opening the other regions.
[jira] [Updated] (HBASE-5914) Bulk assign regions in the process of ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-5914: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Bulk assign regions in the process of ServerShutdownHandler --- Key: HBASE-5914 URL: https://issues.apache.org/jira/browse/HBASE-5914 Project: HBase Issue Type: Improvement Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-5914.patch, HBASE-5914v2.patch, HBASE-5914v3.patch In the process of ServerShutdownHandler, we currently assign regions one at a time. In a large cluster, where one regionserver often carries many regions, this is quite slow. What about bulk assigning regions, as at cluster startup? In the current logic, if we fail to assign many regions to one destination server, we wait until timeout; in the process of ServerShutdownHandler, however, we should retry on another server.
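The retry behaviour asked for in the description (if a bulk assign to one destination server fails, try another server instead of waiting for a timeout) might look roughly like the sketch below. It is only an illustration under stated assumptions: BulkAssignRetrySketch, assignWithFallback and the sendBatch callback are hypothetical, servers and regions are plain strings, and nothing here is taken from the HBASE-5914 patches.

```java
import java.util.List;
import java.util.function.BiPredicate;

public class BulkAssignRetrySketch {
    /**
     * Offers the whole batch of regions to each candidate server in turn.
     * sendBatch is a hypothetical callback that returns true when the server
     * accepted the batch. Returns the accepting server, or null if every
     * candidate refused.
     */
    public static String assignWithFallback(List<String> regions, List<String> candidates,
            BiPredicate<String, List<String>> sendBatch) {
        for (String server : candidates) {
            if (sendBatch.test(server, regions)) {
                return server;
            }
            // This server failed: move on to the next candidate immediately
            // rather than waiting for an assignment timeout.
        }
        return null;
    }
}
```

A caller would pass the dead server's regions, a list of live servers, and a function that performs the bulk-assign request.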
[jira] [Updated] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chunhui shen updated HBASE-6134: Attachment: HBASE-6134v4.patch Updating patch v4 with Ted's comments in review board: Keep the config param: conf.getInt("hbase.splitlog.report.openedfiles", 3); Improvement for split-worker to speed up distributed-split-log -- Key: HBASE-6134 URL: https://issues.apache.org/jira/browse/HBASE-6134 Project: HBase Issue Type: Improvement Components: wal Reporter: chunhui shen Assignee: chunhui shen Priority: Critical Fix For: 0.96.0 Attachments: HBASE-6134.patch, HBASE-6134v2.patch, HBASE-6134v3-92.patch, HBASE-6134v3.patch, HBASE-6134v4.patch First, we compared local-master-splitting with distributed-log-splitting. Environment: 34 hlog files, 5 regionservers (after killing one, only 4 RSs do the splitting work), 400 regions in one hlog file. local-master-split: 60s+ distributed-log-splitting: 165s+ In fact, in our production environment, distributed-log-splitting also took 60s with 30 regionservers for 34 hlog files (the regionservers may have been under high load). We found that a split-worker took about 20s to split one log file (30ms~50ms per writer.close(); 10ms per writer creation). I think we could improve this by parallelizing the creation and closing of writers in threads. In the patch, the logic for distributed-log-splitting is changed to match local-master-splitting, and the close is parallelized in threads.
[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293367#comment-13293367 ] ramkrishna.s.vasudevan commented on HBASE-5564: --- All the tests are passing. Will integrate tomorrow if there are no objections. Bulkload is discarding duplicate records Key: HBASE-5564 URL: https://issues.apache.org/jira/browse/HBASE-5564 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.96.0 Environment: HBase 0.92 Reporter: Laxman Assignee: Laxman Labels: bulkloader Fix For: 0.96.0 Attachments: 5564.lint, 5564v5.txt, HBASE-5564.patch, HBASE-5564_trunk.1.patch, HBASE-5564_trunk.1.patch, HBASE-5564_trunk.2.patch, HBASE-5564_trunk.3.patch, HBASE-5564_trunk.4_final.patch, HBASE-5564_trunk.patch Duplicate records are discarded when they exist in the same input file and, more specifically, in the same split. Duplicate records are considered if the records come from different splits. Version under test: HBase 0.92
[jira] [Commented] (HBASE-6012) Handling RegionOpeningState for bulk assign
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293371#comment-13293371 ] Hadoop QA commented on HBASE-6012: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12531768/HBASE-6012v8.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2143//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2143//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2143//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2143//console This message is automatically generated.
Handling RegionOpeningState for bulk assign --- Key: HBASE-6012 URL: https://issues.apache.org/jira/browse/HBASE-6012 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6012.patch, HBASE-6012v2.patch, HBASE-6012v3.patch, HBASE-6012v4.patch, HBASE-6012v5.patch, HBASE-6012v6.patch, HBASE-6012v7.patch, HBASE-6012v8.patch Since HBASE-5914, we use bulk assign for SSH (ServerShutdownHandler). But in the bulk assign case, if we get ALREADY_OPENED, nothing clears the znode created by bulk assign. Also, when an RS is opening a list of regions and one region is already in transition, it throws RegionAlreadyInTransitionException and stops opening the other regions.
[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293375#comment-13293375 ] Zhihong Ted Yu commented on HBASE-5564: --- Minor comment: {code} + throw new BadTsvLineException("Invalid timestamp"); {code} Can the timestamp string be included? Bulkload is discarding duplicate records Key: HBASE-5564 URL: https://issues.apache.org/jira/browse/HBASE-5564 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.96.0 Environment: HBase 0.92 Reporter: Laxman Assignee: Laxman Labels: bulkloader Fix For: 0.96.0 Attachments: 5564.lint, 5564v5.txt, HBASE-5564.patch, HBASE-5564_trunk.1.patch, HBASE-5564_trunk.1.patch, HBASE-5564_trunk.2.patch, HBASE-5564_trunk.3.patch, HBASE-5564_trunk.4_final.patch, HBASE-5564_trunk.patch Duplicate records are discarded when they exist in the same input file and, more specifically, in the same split. Duplicate records are considered if the records come from different splits. Version under test: HBase 0.92
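The review comment above asks for the offending timestamp string to appear in the exception message. One way that could look is sketched below. This is an assumption-laden illustration, not the patch itself: TimestampParsing and parseTimestamp are hypothetical names, and BadTsvLineException here is a local stand-in for the class of the same name mentioned in the comment.

```java
public class TimestampParsing {
    /** Local stand-in for the BadTsvLineException referenced in the review comment. */
    public static class BadTsvLineException extends Exception {
        public BadTsvLineException(String message) {
            super(message);
        }
    }

    /**
     * Parses a timestamp column as a long. On failure, the raw column text is
     * included in the message so the bad input line is easier to locate.
     */
    public static long parseTimestamp(String raw) throws BadTsvLineException {
        try {
            return Long.parseLong(raw);
        } catch (NumberFormatException nfe) {
            throw new BadTsvLineException("Invalid timestamp " + raw);
        }
    }
}
```

Including the raw value costs one string concatenation on the error path only, and it saves a second pass over the input to find which line was malformed.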