[ https://issues.apache.org/jira/browse/HBASE-15056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-15056:
---------------------------
    Status: Open  (was: Patch Available)

> Split fails with KeeperException$NoNodeException when namespace quota is enabled
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-15056
>                 URL: https://issues.apache.org/jira/browse/HBASE-15056
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Ted Yu
>         Attachments: 15056-branch-1-v1.txt, split-fails-when-exceeding-quota-with-znode-loss.test
>
>
> When trying to port HBASE-15044 to branch-1, I found that region split fails with KeeperException$NoNodeException when namespace quota is enabled and the split would exceed the allocated quota.
> Here is the related test output:
> {code}
> 2015-12-30 09:50:16,764 WARN [RS:0;10.22.24.71:65256-splits-1451497816754] zookeeper.ZKAssign(885): regionserver:65256-0x151f402c21c0001, quorum=localhost:57662, baseZNode=/hbase Attempt to transition the unassigned node for 17fc99c04a8027b653e9d5ef5d578461 from RS_ZK_REQUEST_REGION_SPLIT to RS_ZK_REQUEST_REGION_SPLIT failed, the node existed and was in the expected state but then when setting data it no longer existed
> 2015-12-30 09:50:16,866 DEBUG [RS:0;10.22.24.71:65256-splits-1451497816754] zookeeper.ZKUtil(718): regionserver:65256-0x151f402c21c0001, quorum=localhost:57662, baseZNode=/hbase Unable to get data of znode /hbase/region-in-transition/17fc99c04a8027b653e9d5ef5d578461 because node does not exist (not necessarily an error)
> 2015-12-30 09:50:16,866 INFO [RS:0;10.22.24.71:65256-splits-1451497816754] regionserver.SplitRequest(97): Running rollback/cleanup of failed split of np2:testRegionNormalizationSplitOnCluster,zzzzz,1451497806295.17fc99c04a8027b653e9d5ef5d578461.; Failed getting SPLITTING znode on np2:testRegionNormalizationSplitOnCluster,zzzzz,1451497806295.17fc99c04a8027b653e9d5ef5d578461.
> java.io.IOException: Failed getting SPLITTING znode on np2:testRegionNormalizationSplitOnCluster,zzzzz,1451497806295.17fc99c04a8027b653e9d5ef5d578461.
>   at org.apache.hadoop.hbase.coordination.ZKSplitTransactionCoordination.waitForSplitTransaction(ZKSplitTransactionCoordination.java:200)
>   at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:381)
>   at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:277)
>   at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:560)
>   at org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82)
>   at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Data is null, splitting node 17fc99c04a8027b653e9d5ef5d578461 no longer exists
>   at org.apache.hadoop.hbase.coordination.ZKSplitTransactionCoordination.waitForSplitTransaction(ZKSplitTransactionCoordination.java:166)
>   ... 8 more
> 2015-12-30 09:50:16,869 DEBUG [RS:0;10.22.24.71:65256-splits-1451497816754] zookeeper.ZKUtil(718): regionserver:65256-0x151f402c21c0001, quorum=localhost:57662, baseZNode=/hbase Unable to get data of znode /hbase/region-in-transition/17fc99c04a8027b653e9d5ef5d578461 because node does not exist (not necessarily an error)
> 2015-12-30 09:50:16,869 INFO [RS:0;10.22.24.71:65256-splits-1451497816754] coordination.ZKSplitTransactionCoordination(268): Failed cleanup zk node of np2:testRegionNormalizationSplitOnCluster,zzzzz,1451497806295.17fc99c04a8027b653e9d5ef5d578461.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>   at org.apache.hadoop.hbase.zookeeper.ZKAssign.deleteNode(ZKAssign.java:452)
>   at org.apache.hadoop.hbase.zookeeper.ZKAssign.deleteNode(ZKAssign.java:381)
>   at org.apache.hadoop.hbase.coordination.ZKSplitTransactionCoordination.clean(ZKSplitTransactionCoordination.java:261)
>   at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.rollback(SplitTransactionImpl.java:948)
>   at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.rollback(SplitTransactionImpl.java:900)
>   at org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:99)
>   at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> {code}
> Strangely, no QuotaExceededException is thrown.
> In the master branch, the quota check is done in response to TransitionCode.READY_TO_SPLIT.
> In branch-1, that code path is not executed when useZKForAssignment is true (the default case):
> {code}
>     } else if (services != null && !useZKForAssignment) {
>       if (!services.reportRegionStateTransition(TransitionCode.READY_TO_SPLIT,
>           parent.getRegionInfo(), hri_a, hri_b)) {
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)