[ 
https://issues.apache.org/jira/browse/HBASE-29282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-29282.
-------------------------------
    Fix Version/s: 2.6.3
                   2.5.12
     Hadoop Flags: Reviewed
       Resolution: Fixed

Pushed to all active branches.

Thanks [~reidchan] for reviewing!

> Regions are left in CLOSED state after merging
> ----------------------------------------------
>
>                 Key: HBASE-29282
>                 URL: https://issues.apache.org/jira/browse/HBASE-29282
>             Project: HBase
>          Issue Type: Bug
>          Components: proc-v2, Region Assignment
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 2.7.0, 3.0.0-beta-2, 2.6.3, 2.5.12
>
>
> When running ITBLL, some regions were left in CLOSED state for a long time 
> before CatalogJanitor finally cleaned them up.
> After checking, these regions had been merged, so they should have been 
> removed from hbase:meta, but they were still present in the hbase:meta table 
> in CLOSED state.
> Need to dig more.
> {noformat}
> 2025-05-01T00:08:32,903 INFO  [PEWorker-15] procedure2.ProcedureExecutor: 
> Finished pid=3512, state=SUCCESS, hasLock=false; MergeTableRegionsProcedure 
> table=IntegrationTestBigLinkedList, 
> regions=[6a98dc86a491041b8d3ac584ac73c0a0, c9f07f77792feb0d8a845d6d9751f048], 
> force=false in 734 msec
> 2025-05-01T00:11:26,333 WARN  [master/meta02:16000.Chore.1] 
> janitor.CatalogJanitor: 
> overlap=IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626716.6a98dc86a491041b8d3ac584ac73c0a0./IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626717.24435a6eefc045cf36ddff9a30409ff1.,
>  
> overlap=IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626717.24435a6eefc045cf36ddff9a30409ff1./IntegrationTestBigLinkedList,\xA2!RV,1746028626716.c9f07f77792feb0d8a845d6d9751f048.
> 2025-05-01T00:41:40,856 WARN  [master/meta02:16000.Chore.1] 
> janitor.CatalogJanitor: 283c738f170f361157b470868f6ad89., 
> overlap=IntegrationTestBigLinkedList,\x91\x10\xA3\x07\x03\xAC\xC7\xC3\xCCY\xAE\xE4!1\xD1i,1746029042178.815020ca73a2679bc0c0a298e4dddfda./IntegrationTestBigLinkedList,\x91\x10\xA3\x07\x03\xAC\xC7\xC3\xCCY\xAE\xE4!1\xD1i,1746029042179.278a2eeee359488f859ac5334ee3cde0.,
>  
> overlap=IntegrationTestBigLinkedList,\x91\x10\xA3\x07\x03\xAC\xC7\xC3\xCCY\xAE\xE4!1\xD1i,1746029042179.278a2eeee359488f859ac5334ee3cde0./IntegrationTestBigLinkedList,\x95U\x0D9}\xAB\xE1\x98\x80w\xED\xA7+\xF9\xA4\xED,1746029042178.b64120d20856552cd7d154b63bd2ce81.,
>  
> overlap=IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626716.6a98dc86a491041b8d3ac584ac73c0a0./IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626717.24435a6eefc045cf36ddff9a30409ff1.,
>  
> overlap=IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626717.24435a6eefc045cf36ddff9a30409ff1./IntegrationTestBigLinkedList,\xA2!RV,1746028626716.c9f07f77792feb0d8a845d6d9751f048.
> 2025-05-01T00:42:00,853 INFO  [PEWorker-12] procedure.FlushRegionProcedure: 
> State of region {ENCODED => 6a98dc86a491041b8d3ac584ac73c0a0, NAME => 
> 'IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626716.6a98dc86a491041b8d3ac584ac73c0a0.',
>  STARTKEY => '\x99\x99\x99\x99\x99\x99\x99\x99', ENDKEY => '\xA2!RV'} is not 
> OPEN or in transition. Skip pid=5810, ppid=5789, state=RUNNABLE, 
> hasLock=true; org.apache.hadoop.hbase.master.procedure.FlushRegionProcedure 
> ...
> 2025-05-01T00:44:32,339 INFO  [PEWorker-3] 
> procedure.MasterProcedureScheduler: Took xlock for pid=5964, ppid=5943, 
> state=RUNNABLE, hasLock=false; SnapshotRegionProcedure 
> 6a98dc86a491041b8d3ac584ac73c0a0
> 2025-05-01T00:44:32,340 WARN  [PEWorker-3] procedure.SnapshotRegionProcedure: 
> pid=5964, ppid=5943, state=RUNNABLE, hasLock=true; SnapshotRegionProcedure 
> 6a98dc86a491041b8d3ac584ac73c0a0 can not run currently because region state 
> of 
> IntegrationTestBigLinkedList,\x99\x99\x99\x99\x99\x99\x99\x99,1746028626716.6a98dc86a491041b8d3ac584ac73c0a0.
>  is CLOSED, wait 1000 ms to retry
> {noformat}
> {noformat}
> 2025-05-01 00:27:59,824 WARN [RPCClient-NioEventLoopGroup-1-2] 
> org.apache.hadoop.hbase.client.AsyncNonMetaRegionLocator: Failed to locate 
> region in 'IntegrationTestBigLinkedList', 
> row='\xA6\x8B\x9E\xC1\xA98&K}g+7N/\xA1\x05', locateType=CURRENT
> org.apache.hadoop.hbase.HBaseIOException: No location found for 
> 'IntegrationTestBigLinkedList', row='\xA6\x8B\x9E\xC1\xA98&K}g+7N/\xA1\x05', 
> locateType=CURRENT
>       at 
> org.apache.hadoop.hbase.client.AsyncNonMetaRegionLocator.onScanNext(AsyncNonMetaRegionLocator.java:322)
>       at 
> org.apache.hadoop.hbase.client.AsyncNonMetaRegionLocator$1.onNext(AsyncNonMetaRegionLocator.java:437)
>       at 
> org.apache.hadoop.hbase.client.AsyncScanSingleRegionRpcRetryingCaller.onComplete(AsyncScanSingleRegionRpcRetryingCaller.java:535)
>       at 
> org.apache.hadoop.hbase.client.AsyncScanSingleRegionRpcRetryingCaller.start(AsyncScanSingleRegionRpcRetryingCaller.java:636)
>       at 
> org.apache.hadoop.hbase.client.AsyncRpcRetryingCallerFactory$ScanSingleRegionCallerBuilder.start(AsyncRpcRetryingCallerFactory.java:322)
>       at 
> org.apache.hadoop.hbase.client.AsyncClientScanner.startScan(AsyncClientScanner.java:208)
>       at 
> org.apache.hadoop.hbase.client.AsyncClientScanner.lambda$openScanner$2(AsyncClientScanner.java:268)
>       at 
> org.apache.hadoop.hbase.util.FutureUtils.lambda$addListener$0(FutureUtils.java:71)
>       at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
>       at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
>       at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
>       at 
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
>       at 
> org.apache.hadoop.hbase.client.AsyncSingleRequestRpcRetryingCaller.lambda$call$4(AsyncSingleRequestRpcRetryingCaller.java:92)
>       at 
> org.apache.hadoop.hbase.util.FutureUtils.lambda$addListener$0(FutureUtils.java:71)
>       at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
>       at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
>       at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
>       at 
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
>       at 
> org.apache.hadoop.hbase.client.AsyncClientScanner.lambda$callOpenScanner$0(AsyncClientScanner.java:187)
>       at 
> org.apache.hbase.thirdparty.com.google.protobuf.RpcUtil$1.run(RpcUtil.java:56)
>       at 
> org.apache.hbase.thirdparty.com.google.protobuf.RpcUtil$1.run(RpcUtil.java:47)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:400)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:430)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:425)
>       at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:117)
>       at org.apache.hadoop.hbase.ipc.Call.setResponse(Call.java:149)
>       at 
> org.apache.hadoop.hbase.ipc.RpcConnection.finishCall(RpcConnection.java:396)
>       at 
> org.apache.hadoop.hbase.ipc.RpcConnection.readResponse(RpcConnection.java:461)
>       at 
> org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.readResponse(NettyRpcDuplexHandler.java:125)
>       at 
> org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.channelRead(NettyRpcDuplexHandler.java:140)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
>       at 
> org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
>       at 
> org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
>       at 
> org.apache.hbase.thirdparty.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:289)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1357)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
>       at 
> org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
>       at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
>       at 
> org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>       at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>       at java.base/java.lang.Thread.run(Thread.java:840)
> {noformat}


