Hello.

Occasionally, when closing a region, the RS_CLOSE_REGION thread cannot
acquire the region's write lock and stays in the WAITING state.
(The cluster load has increased recently.)
As a result, the region remains in the PENDING_CLOSE state.
The thread holding the lock is an RPC handler.

If you have any good tips on moving regions, please share them.
It would be nice if a timeout could be set for acquiring the lock during close.
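
To illustrate the kind of timeout meant here: in the dump below, the close path calls WriteLock.lock(), which parks with no time limit, while the RPC handler uses a timed ReadLock.tryLock(). The following is only a standalone sketch of a timed write-lock attempt on a plain ReentrantReadWriteLock. It is not HBase code, and the class name CloseLockSketch and the helper tryCloseWithTimeout are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class CloseLockSketch {

    // Hypothetical helper, not part of HBase: try to take the write lock,
    // but give up after the timeout instead of parking indefinitely.
    static boolean tryCloseWithTimeout(ReentrantReadWriteLock lock, long timeout, TimeUnit unit)
            throws InterruptedException {
        if (!lock.writeLock().tryLock(timeout, unit)) {
            return false; // a reader still holds the lock; caller may retry or give up
        }
        try {
            // ... the actual close work would go here ...
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch readerHoldsLock = new CountDownLatch(1);
        CountDownLatch releaseReader = new CountDownLatch(1);

        // Stand-in for a long-running request that holds the read lock.
        Thread reader = new Thread(() -> {
            lock.readLock().lock();
            try {
                readerHoldsLock.countDown();
                releaseReader.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        }, "reader");
        reader.start();
        readerHoldsLock.await();

        // Times out because the reader still holds the read lock.
        System.out.println("closed while reader active: "
                + tryCloseWithTimeout(lock, 100, TimeUnit.MILLISECONDS));

        releaseReader.countDown();
        reader.join();

        // Succeeds once the read lock has been released.
        System.out.println("closed after reader finished: "
                + tryCloseWithTimeout(lock, 100, TimeUnit.MILLISECONDS));
    }
}
```

With an untimed lock() in place of tryLock(), the first attempt would stay parked until the reader released the lock, which matches the WAITING state of RS_CLOSE_REGION in the dump.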

The HBase version is 1.2.6.

Best regards,
Minwoo Kang

----

[thread dump]
"RS_CLOSE_REGION" waiting on condition [abc]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <abcd> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
        at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
        at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1426)
        at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1372)
        - locked <e> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:138)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
   Locked ownable synchronizers:
        - <f> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"RpcServer.handler" waiting on condition [bcd]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <abcd> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
        at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:871)
        at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:8177)
        at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:8164)
        at org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:8073)
        at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2547)
        at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2541)
        at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6830)
        at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6809)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2049)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33644)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2196)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
        at java.lang.Thread.run(Thread.java:748)
   Locked ownable synchronizers:
        - None

