[ https://issues.apache.org/jira/browse/HBASE-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861999#comment-13861999 ]
Lars Hofhansl commented on HBASE-8912:
--------------------------------------

Yeah, would be nice if the AM could retain a history of assignments of a region and avoid retrying the same RS over and over, it should also do per region rate limiting. Too risky to add this to 0.94, though.

As for the warning... You are probably right. The warning might still be an indication for double assignments (i.e. the region was OPEN already as far as the AM was concerned and yet it got another OPENED message from ZK). I think in 0.94 we should leave the warning in, in case we see more issues here in the future. In 0.96+ it's not an issue anyway.

> [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to OFFLINE
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-8912
>                 URL: https://issues.apache.org/jira/browse/HBASE-8912
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Lars Hofhansl
>            Priority: Critical
>             Fix For: 0.94.16
>
>         Attachments: 8912-0.94-alt2.txt, 8912-0.94.txt, 8912-fix-race.txt, HBASE-8912.patch, HBase-0.94 #1036 test - testRetrying [Jenkins].html, log.txt, org.apache.hadoop.hbase.catalog.TestMetaReaderEditor-output.txt
>
>
> AM throws this exception which subsequently causes the master to abort:
> {code}
> java.lang.IllegalStateException: Unexpected state : testRetrying,jjj,1372891751115.9b828792311001062a5ff4b1038fe33b. state=PENDING_OPEN, ts=1372891751912, server=hemera.apache.org,39064,1372891746132 .. Cannot transit it to OFFLINE.
>       at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
>       at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
>       at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
>       at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
>       at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
>       at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
>       at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>       at java.lang.Thread.run(Thread.java:662)
> {code}
> This exception trace is from the failing test TestMetaReaderEditor which is failing pretty frequently, but looking at the test code, I think this is not a test-only issue, but affects the main code path.
> https://builds.apache.org/job/HBase-0.94/1036/testReport/junit/org.apache.hadoop.hbase.catalog/TestMetaReaderEditor/testRetrying/
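
To illustrate the idea floated in the comment (remembering which region servers already failed to open a region, and rate limiting retries per region), here is a rough sketch. It is not HBase code: the class `RegionAssignmentHistory`, its methods, and the plain string region/server names are all hypothetical, and the fallback policy when every candidate has failed is just one possible choice.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper, not part of HBase: per-region assignment history plus rate limiting. */
final class RegionAssignmentHistory {
  private final long minIntervalMs;
  // Servers on which opening the region has already failed.
  private final Map<String, Set<String>> failedServers = new HashMap<>();
  // Timestamp of the most recent assignment attempt per region.
  private final Map<String, Long> lastAttemptMs = new HashMap<>();

  RegionAssignmentHistory(long minIntervalMs) {
    this.minIntervalMs = minIntervalMs;
  }

  /** Remembers that the region could not be opened on the given server. */
  void recordFailure(String region, String server) {
    failedServers.computeIfAbsent(region, r -> new HashSet<>()).add(server);
  }

  /** Forgets the history once the region is open somewhere. */
  void recordSuccess(String region) {
    failedServers.remove(region);
    lastAttemptMs.remove(region);
  }

  /**
   * Picks a server for the next attempt, or returns empty if the region was
   * attempted less than minIntervalMs ago or there are no candidates.
   */
  Optional<String> pickServer(String region, List<String> candidates, long nowMs) {
    Long last = lastAttemptMs.get(region);
    if (last != null && nowMs - last < minIntervalMs) {
      return Optional.empty(); // rate limited
    }
    if (candidates.isEmpty()) {
      return Optional.empty();
    }
    Set<String> failed = failedServers.getOrDefault(region, Collections.<String>emptySet());
    String choice = candidates.get(0); // assumed fallback if every candidate has failed
    for (String server : candidates) {
      if (!failed.contains(server)) {
        choice = server; // prefer a server that has not failed for this region
        break;
      }
    }
    lastAttemptMs.put(region, nowMs);
    return Optional.of(choice);
  }
}
```

A caller would invoke `recordFailure` when an open attempt fails and consult `pickServer` before retrying. Passing the clock value in as `nowMs` keeps the sketch deterministic and easy to test.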