[ 
https://issues.apache.org/jira/browse/HBASE-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799126#comment-13799126
 ] 

Samir Ahmic commented on HBASE-8912:
------------------------------------

It looks like there are multiple scenarios that can trigger the AssignmentManager 
IllegalStateException from PENDING_OPEN to OFFLINE. Here is my case 
(hbase-0.94.6.1): we updated the configuration on two RSs (5 total in the cluster) 
with a wrong value for hbase.client.keyvalue.maxsize (it was set to 1 instead of 
-1). After the restart, the regionservers with the wrong value started throwing 
exceptions like this:
{code}
2013-10-14 06:53:22,267 WARN 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler:
Exception running postOpenDeployTasks; region=59badb0e2a41e7831162654227d32049
java.lang.IllegalArgumentException: KeyValue size too large
{code}
This led to the region being closed:
{code}
2013-10-14 06:53:22,271 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Closed third_party_storages,,1342430453242.59badb0e2a41e7831162654227d32049.
{code}
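For context on the first exception: per the hbase-default.xml description, a value of zero or less for hbase.client.keyvalue.maxsize disables the size check, so with a limit of 1 every KeyValue is rejected. Below is a minimal, HBase-free sketch of that kind of check. The class and method names are hypothetical and this is not the actual HBase code, only a model of the behaviour described above.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of a client-side KeyValue size check.
// KeyValueSizeCheck and validate() are hypothetical names, not HBase APIs.
final class KeyValueSizeCheck {

    // A limit of zero or less disables the check (the documented meaning of -1).
    static void validate(List<byte[]> serializedKeyValues, int maxKeyValueSize) {
        if (maxKeyValueSize <= 0) {
            return;
        }
        for (byte[] kv : serializedKeyValues) {
            if (kv.length > maxKeyValueSize) {
                throw new IllegalArgumentException("KeyValue size too large");
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for one serialized KeyValue of 64 bytes.
        List<byte[]> kvs = Arrays.asList(new byte[64]);

        validate(kvs, -1); // check disabled, passes

        try {
            validate(kvs, 1); // limit of 1 byte, always fails
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With the limit set to 1, any write going through such a check fails, which would be consistent with every region open on the misconfigured servers failing in the same way.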
At the same time, AssignmentManager tried to reassign the problematic region to 
other regionservers and, after one more failed attempt, finally hit a server with 
the correct value of hbase.client.keyvalue.maxsize. Here is the relevant log from 
that RS:

{code}
2013-10-14 06:53:26,390 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x13e31052be39645 Attempting to transition node 
59badb0e2a41e7831162654227d32049 from M_ZK_REGION_OFFLINE to 
RS_ZK_REGION_OPENING
2013-10-14 06:53:26,397 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x13e31052be39645 Successfully transitioned node 
59badb0e2a41e7831162654227d32049 from M_ZK_REGION_OFFLINE to 
RS_ZK_REGION_OPENING
........................

2013-10-14 06:53:26,459 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x13e31052be39645 Successfully transitioned node 
59badb0e2a41e7831162654227d32049 from RS_ZK_REGION_OPENING to 
RS_ZK_REGION_OPENED
{code}

At the same time, AssignmentManager threw this exception and the master aborted:
{code}
2013-10-14 06:53:26,400 FATAL org.apache.hadoop.hbase.master.HMaster: 
Unexpected state : 
third_party_storages,,1342430453242.59badb0e2a41e7831162654227d32049. 
state=PENDING_OPEN, ts=1381748006399, 
server=rsdfw-10-177-161-197,60020,1381747996145.. Cannot transit it to OFFLINE.
java.lang.IllegalStateException: Unexpected state : 
third_party_storages,,1342430453242.59badb0e2a41e7831162654227d32049. 
state=PENDING_OPEN, ts=1381748006399, 
server=rsdfw-10-177-161-197.internal.personal.com,60020,1381747996145 .. Cannot 
transit it to OFFLINE.
        at 
org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1820)
        at 
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1659)
        at 
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
        at 
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
        at 
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
        at 
org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
        at 
org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
2013-10-14 06:53:26,400 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
{code}
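My reading of the trace, as a sketch only: ClosedRegionHandler calls assign(), which tries to force the region to OFFLINE, but by then another assignment has already moved it to PENDING_OPEN, and the guard in setOfflineInZooKeeper rejects that transition. The model below is a simplified illustration under that assumption. The state names follow the logs; the class, the method and the exact set of allowed source states are hypothetical and not taken from the HBase source.

```java
// Simplified model of a "force offline" guard on region state.
// RegionStateGuard and forceOffline() are hypothetical names.
final class RegionStateGuard {

    enum State { OFFLINE, PENDING_OPEN, OPENING, OPEN, PENDING_CLOSE, CLOSING, CLOSED }

    // Assumption: only CLOSED or OFFLINE regions may be forced to OFFLINE.
    static State forceOffline(String region, State current) {
        if (current != State.CLOSED && current != State.OFFLINE) {
            throw new IllegalStateException("Unexpected state : " + region
                + " state=" + current + " .. Cannot transit it to OFFLINE.");
        }
        return State.OFFLINE;
    }

    public static void main(String[] args) {
        String region = "59badb0e2a41e7831162654227d32049";

        // Normal path: the region is CLOSED when the handler runs.
        System.out.println(forceOffline(region, State.CLOSED));

        // Race: a concurrent assign already moved the region to PENDING_OPEN.
        try {
            forceOffline(region, State.PENDING_OPEN);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, the abort does not depend on the bad configuration itself, only on two assignments of the same region overlapping.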

Hope this will help.



> [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to 
> OFFLINE
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-8912
>                 URL: https://issues.apache.org/jira/browse/HBASE-8912
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>             Fix For: 0.94.13
>
>         Attachments: HBase-0.94 #1036 test - testRetrying [Jenkins].html
>
>
> AM throws this exception which subsequently causes the master to abort: 
> {code}
> java.lang.IllegalStateException: Unexpected state : 
> testRetrying,jjj,1372891751115.9b828792311001062a5ff4b1038fe33b. 
> state=PENDING_OPEN, ts=1372891751912, 
> server=hemera.apache.org,39064,1372891746132 .. Cannot transit it to OFFLINE.
>       at 
> org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
>       at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
>       at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
>       at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
>       at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
>       at 
> org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
>       at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>       at java.lang.Thread.run(Thread.java:662)
> {code}
> This exception trace is from the failing test TestMetaReaderEditor which is 
> failing pretty frequently, but looking at the test code, I think this is not 
> a test-only issue, but affects the main code path. 
> https://builds.apache.org/job/HBase-0.94/1036/testReport/junit/org.apache.hadoop.hbase.catalog/TestMetaReaderEditor/testRetrying/



--
This message was sent by Atlassian JIRA
(v6.1#6144)
